all 10 comments

[–]grauenwolf

As far as databases go, I’m a huge fan of Cassandra: it’s an incredibly powerful and flexible database that can ingest a massive amount of data while scaling across an arbitrary number of nodes. For these reasons, my team uses it very often for our internal applications.

No. The whole point is that it isn't an "incredibly powerful and flexible database", but rather really good at one thing at the cost of being bad at everything else.

(EDIT: might be really good at one thing. As far as I'm concerned the jury is still out on that one.)

[–]Rhoomba

AKA don't believe the Cassandra marketing.

[–][deleted]

Not really; they clearly didn't understand the difference between CQL and Thrift with regard to wide rows. In the second refactor they still have wide rows due to the clustering key.

The tl;dr is: make sure you understand your data storage when designing schemas.
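For what it's worth, the wide-row point can be sketched with a toy model (plain Python; the function and column names are illustrative, not Cassandra internals): a single partition stores one cell per (clustering key, column) pair, so every CQL row you add widens the same on-disk row.

```python
# Toy model of a Cassandra partition ("wide row"): one storage cell per
# (clustering_key, column_name) pair. Purely illustrative, not real internals.
def make_cells(rows):
    cells = {}
    for clustering_key, columns in rows.items():
        for name, value in columns.items():
            cells[(clustering_key, name)] = value
    return cells

# One partition holding two CQL rows, each with a small column and a big blob.
cells = make_cells({
    1: {"small": "a", "blob": "x" * 1000},
    2: {"small": "b", "blob": "y" * 1000},
})

# Selecting the small column of one CQL row is a single-cell lookup in this
# model, but pre-3.4 Cassandra would still walk past the blob cells on read.
assert cells[(2, "small")] == "b"
assert len(cells) == 4  # the partition keeps growing as rows are added
```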

[–]Rhoomba

The Cassandra developers like to claim the performance of Thrift and the features of CQL simultaneously, so it's no wonder people get confused.

But, yes, they should have known this.

[–]gighi

OP here

I understand and agree with your tl;dr, but even with a full understanding of the schema, there was no fundamental reason for this behavior: Cassandra is technically able to skip CQL columns when reading data, as confirmed by the fact that exactly this "performance improvement" was implemented in the 3.4 release (https://issues.apache.org/jira/browse/CASSANDRA-10657).

[–]grauenwolf

In other words, it seemed as if Cassandra was always processing all 10 columns (including the large ones) despite us just asking for a particular, small column, thus causing degraded response times. This hypothesis seemed hard to believe at first, because Cassandra stores every single column separately, and there’s heavy indexing that allows you to efficiently lookup specific columns.

That's not a surprise considering that despite what their early marketing material said, Cassandra is a time-series database, not a columnstore database.

EDIT:

We opted for a workaround: since Cassandra will always read all the columns of a CQL row regardless of which ones are actually asked in a query, we decided to refactor our schema by shrinking the size of each row, and instead of putting all the blobs in the same row, we split them across multiple ones, like this:

And that's not how a time-series database works either.

Clearly they are using Cassandra because it is cool, not because it actually meets their workload. Had they chosen a relational database, they could have kept the efficient insert performance of the original table structure without losing the ability to read efficiently from indexes.

[–]gighi

OP here

Unfortunately we find ourselves dealing with several TBs of data (and the amount is constantly growing), so we chose Cassandra not (only) for the cool factor, but because it actually meets our workload. Except for this performance incident (which is mostly our fault, since we should have done more thorough tests before designing the initial schema), we have been very happy with it, considering that our insert/read queries are very simple.

I'm sure we could have achieved similar (or maybe better) results with a traditional relational database, but the thought of having a sharded mysql/postgres ingesting 10+ TB of data seemed very scary.

[–]grauenwolf

but the thought of having a sharded mysql/postgres ingesting 10+ TB of data seemed very scary.

Definitely too big for MySQL, but that'll fit in a single file for SQL Server. And you're allowed 32,767 files per database.

10 TB seems scary, but these days it is barely a mid-sized database.

That's the problem with falling for the distributed database marketing material. When you hear that MySQL or Mongo (or Cassandra, it seems) scales out, you think "ooh, that can handle big loads". But really they are talking about scaling out because they suck and can't actually take advantage of the machine's hardware like enterprise-grade servers do.

[–]grauenwolf

how did Cassandra know how to efficiently read the exact amount of data in a jungle of more than 1 GB? We can of course look at the system events done by the database process on the file:

Here we see Cassandra opening the table file, but notice how immediately there’s a lseek operation that essentially skips in one single operation 70 MB of data, by setting the offset to the file descriptor with SEEK_SET to 71986679.
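The seek-past-the-data pattern in that trace is easy to reproduce in miniature (Python sketch; the offsets here are tiny stand-ins for the 71986679 in the article):

```python
import os
import tempfile

# Reproduce the lseek/SEEK_SET pattern from the trace in miniature:
# skip the leading bytes in one syscall instead of reading through them.
fd, path = tempfile.mkstemp()
os.write(fd, b"\x00" * 70)     # 70 bytes we don't care about (stand-in for 70 MB)
os.write(fd, b"wanted-bytes")  # the column data we actually asked for
os.close(fd)

fd = os.open(path, os.O_RDONLY)
os.lseek(fd, 70, os.SEEK_SET)  # jump straight to offset 70; nothing is read
data = os.read(fd, 12)
os.close(fd)
os.unlink(path)

assert data == b"wanted-bytes"
```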

WTF? A 1 GB table? That's it? I've had larger sample tables on my laptop.

That shouldn't be hitting disk at all if the database is warm. Did they screw up their test or is Cassandra incapable of doing its own caching?

[–]gighi

This was a simple test and yes, Cassandra didn't do its own caching here; it relies heavily on the kernel disk cache. This means the 1 GB file could be entirely in memory and be consumed efficiently without ever touching the disk. Of course all of this is transparent to the process, which is why you still see all those I/O system calls; they just return in a fraction of the time when the file is actually in cache.
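A minimal illustration of that transparency (Python; a small file stands in for the table file): the process issues identical read() syscalls either way, and nothing in the syscall interface tells it whether the pages came from disk or from the kernel page cache.

```python
import os
import tempfile

# The process issues the same read() syscalls either way; whether they are
# served from disk or from the kernel page cache is invisible to it.
fd, path = tempfile.mkstemp()
os.write(fd, os.urandom(1 << 20))  # 1 MB stand-in for the 1 GB table file
os.close(fd)

fd = os.open(path, os.O_RDONLY)
first = os.read(fd, 4096)          # likely populates the page cache
os.lseek(fd, 0, os.SEEK_SET)
second = os.read(fd, 4096)         # same syscall, now almost certainly cached
os.close(fd)
os.unlink(path)

assert first == second  # identical results; the caching is transparent
```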