In this post we’re going to explore how to build a DSL (Domain Specific Language) with a user-friendly syntax while maintaining as much type-safety as possible. We want that any operations that is not allowed by the business rules fail at compile time. This would be really nice as it makes sure that no one writes such forbidden logic (even by mistake).
More over Scala provides really nice syntactic sugar that can make a DSL syntax pretty neat.
Custom-DSL are nice as they provide all the type-safety you need against your data schema. However in this post I will focus only on the Java driver. Why? Because it’s both a simple and decent solution in my opinion.
The bad thing is that you lose any type-safety as all the queries are just plain strings. On the other hand you don’t have to learn a new DSL because your queries are just CQL. Add a thorough test coverage and you have a viable solution.
Moreover the Java driver provides an async API backed by Guava’s futures and it’s not that difficult to turn these futures into Scala futures – which makes a quite natural API in Scala.
We recently deployed in production a distributed system that uses Cassandra as its persistent storage.
Not long after we noticed that there were many warnings about tombstones in Cassandra logs.
WARN [SharedPool-Worker-2] 2017-01-20 16:14:45,153 ReadCommand.java:508 -
Read 5000 live rows and 4771 tombstone cells for query
SELECT * FROM warehouse.locations WHERE token(address) >= token(D3-DJ-21-B-02) LIMIT 5000
We found it quite surprising at first because we’ve only inserted data so far and didn’t expect to see that many tombstones in our database. After asking some people around no one seemed to have a clear explanation on what was going on in Cassandra.
The actor model allows us to write complex distributed applications by containing the mutable state inside an actor boundary. However with Akka this state is not persistent. If the actor dies and then restarts all its state is lost.
To address this problem Akka provides the Akka Persistence framework. Akka Persistence is an effective way to persist an actor state but it’s integration needs to be well thought as it can greatly impact your application design. It fits nicely with the actor model and distributed system design – but is quite different from what a “more classic” application looks like.
In this post I am going to gloss over the different components of Akka Persistence and see how they influence the design choices. I’ll also try to cover some of the common pitfalls to avoid when building a distributed application with Akka Persistence.
Now that we know about Markov chain, let’s focus on a slightly different process: the Markov Decision Process.
This process is quite similar to a Markov chain but adds more concept into it: Actions and Rewards. Having a reward means that it’s possible to learn which action yield the best rewards. This type of learning is also known as reinforcement learning.
Last time we talk about what is a Markov chain. However there is one big limitation:
A Markov chain implies that we can directly observe the state of the process. (e.g the number of people in the queue).
Many times we can only access an indirect representation or noisy measure of the state of the system. (e.g. we know the noisy GPS coordinates of a robot but we want to know it’s real position).
In this post we’re going to focus on the second point and see how to deal with HMMs. In fact HMM can be useful every time that we don’t have direct access to the system state. Let’s take some motivational examples first before we dig into the maths. Continue reading “Hidden Markov Model”
I’ve heard the term “Markov chain” a couple of times but never had a look at what it actually is. Probably because it sounded too impressive and complicated.
It turns out it’s not as bad as it sounds. In fact the whole idea is pretty intuitive and once you get a feeling of how things work it’s much easier to get your head around the mathematics.
Andrey Markov was a Russian mathematician who studied stochastic processes (a stochastic process is a random process) and specially systems that follow a suite of linked events. Markov found interesting results about discrete processes that form a chain. Continue reading “Markov chain”
The free monad is really neat for creating DSL as it allows to completely separate the business logic from the implementation.
It leaves a great freedom for the implementation choices and should you change your implementation you just need to rewrite your interpreter without changing any of the business logic.
That’s great! However after playing a little around with the free monad I was a bit skeptical about the amount of boilerplate required to plug everything together. Writing smart constructors and injectors is not the most trivial thing (depending on how your team is familiar with functional programming).
Now that we have a fairly good understanding of what the free monad is and how it works. Let’s see what it looks like to write a real program based on this concept. This time we won’t implement Free ourselves but use the cats library for that. If you don’t know all the theory behind the free monad and just want to see how to use it then read on as this post builds an application from scratch.
In this post we’ll write simple DSLs, lift them into the free monad and even combine them into one program. There is some boiler plate around but I want to get a feeling of what it looks like to combine more DSLs and see how we can architecture a whole application around the free monad. Continue reading “The free monad in (almost) real life”
Remember in part 1 we implemented a little program to interact with the user. We nicely separated the business logic from the implementation. And as good as it is we now need to add more functionality to our application and this exactly what we’re going to cover here.