The Cassandra Java Driver

Cassandra drivers are not just dumb pieces of software that send CQL strings to a Cassandra node and wait for responses.

They are actually quite smart and are designed in a way that should make your life easier while still getting the most performance out of Cassandra.

In this post I am going to focus on the Java driver and take a quick look at its architecture and at some of the features it offers.

Querying Cassandra from Scala

When it comes to accessing Cassandra from Scala there are two possible approaches:

  • The official Java driver
  • A custom DSL like Quill or Phantom

Custom DSLs are nice as they provide all the type-safety you need against your data schema. However, in this post I will focus only on the Java driver. Why? Because it’s both a simple and decent solution in my opinion.

The downside is that you lose any type-safety as all the queries are just plain strings. On the other hand, you don’t have to learn a new DSL because your queries are just CQL. Add thorough test coverage and you have a viable solution.

Moreover, the Java driver provides an async API backed by Guava’s futures, and it’s not that difficult to turn these futures into Scala futures, which makes for quite a natural API in Scala.
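To give an idea of what that conversion can look like, here is a minimal sketch. It assumes a Guava version that provides MoreExecutors.directExecutor() and the three-argument Futures.addCallback. The names GuavaFutures and toScala are made up for this illustration and are not part of the driver.

```scala
import com.google.common.util.concurrent.{FutureCallback, Futures, ListenableFuture, MoreExecutors}
import scala.concurrent.{Future, Promise}

// Hypothetical helper, not part of the Java driver or of Guava.
object GuavaFutures {

  // Bridges a Guava ListenableFuture to a Scala Future through a Promise.
  def toScala[A](guavaFuture: ListenableFuture[A]): Future[A] = {
    val promise = Promise[A]()
    Futures.addCallback(
      guavaFuture,
      new FutureCallback[A] {
        // Complete the promise with the value once the Guava future succeeds.
        override def onSuccess(result: A): Unit = promise.success(result)
        // Propagate the failure so that the Scala future fails as well.
        override def onFailure(t: Throwable): Unit = promise.failure(t)
      },
      // Assumption: the callback is cheap, so it runs on whichever thread completes the future.
      MoreExecutors.directExecutor()
    )
    promise.future
  }
}
```

With the 3.x driver, where session.executeAsync returns a ResultSetFuture (a ListenableFuture of ResultSet), the result could then be wrapped as GuavaFutures.toScala(session.executeAsync(query)) and composed with map and flatMap like any other Scala future.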

There are still some shortcomings that you should be aware of when consuming a result set, but overall I think that it’s still a simple solution that is worth considering.

Understanding Cassandra tombstones

We recently deployed to production a distributed system that uses Cassandra as its persistent storage.

Not long after, we noticed that there were many warnings about tombstones in the Cassandra logs.

WARN  [SharedPool-Worker-2] 2017-01-20 16:14:45,153 ReadCommand.java:508 - 
Read 5000 live rows and 4771 tombstone cells for query 
SELECT * FROM warehouse.locations WHERE token(address) >= token(D3-DJ-21-B-02) LIMIT 5000 
(see tombstone_warn_threshold)

We found it quite surprising at first because we had only inserted data so far and didn’t expect to see that many tombstones in our database. When we asked around, no one seemed to have a clear explanation of what was going on in Cassandra.

In fact, the main misconception about tombstones is that people associate them only with delete operations. While it’s true that tombstones are generated when data is deleted, it is not the only case, as we shall see.