Cassandra : Row Cache & Memtable Q&A

Below are two sets of questions and answers about Cassandra’s (now v0.7) row cache and memtable. They were answered by Matthew Dennis at Riptano, a company that is actively developing both Cassandra and a management suite known as RipCord.
Questions
Hi!
I have a super column family. Writes either modify a column within a super column or add a super column to the super column family. I wonder how the row cache works.
A. How do writes interact with the Row Cache entries?
1. Do you update the row cache entry in place in order to keep it consistent with the SSTable?
2. Do you update it when the memtable is updated or when the memtable is flushed to the SSTable?
B. Also, is the Memtable appended to an SSTable when it is flushed or does it become its own SSTable?
Answers
The row cache is kept up to date with the memtable in the sense that if the row cache has an entry for the row, that entry is modified to contain the newly written value. If the row cache does not already have the row, it is not added. Getting a row into the row cache requires that the row be read.
The row cache entry is updated in memory to keep it in sync with the memtable, not the SSTable. If the row is in the row cache, the SSTable(s) are not checked. If the row cache lacks an entry for the row, the row is read from the SSTables (possibly by way of the key cache) and added to the row cache if the row cache is enabled.
It is updated when the memtable is updated, not when it is flushed. The memtable and row cache essentially store a reference to the same row object.
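The write and read behaviour described above can be sketched as a toy model. This is only an illustration, not Cassandra's code: the class `Store` and its method names are hypothetical, rows are plain dicts, and the cache holds its own copy of each row rather than sharing a reference with the memtable.

```python
class Store:
    """Toy model of one column family (hypothetical, not Cassandra's API)."""

    def __init__(self, row_cache_enabled=True):
        self.memtable = {}    # row key -> {column: value}
        self.sstables = []    # flushed tables as dicts, oldest first
        self.row_cache = {}   # row key -> cached copy of the row
        self.row_cache_enabled = row_cache_enabled

    def write(self, key, column, value):
        # Every write goes to the memtable.
        self.memtable.setdefault(key, {})[column] = value
        # The cache is touched only if the row is already cached;
        # a write never adds a new row to the cache.
        if key in self.row_cache:
            self.row_cache[key][column] = value

    def read(self, key):
        # A cache hit is answered without consulting any SSTable.
        if key in self.row_cache:
            return dict(self.row_cache[key])
        # A cache miss assembles the row: older SSTables first, newer
        # data overwriting older, memtable contents applied last.
        row = {}
        for sstable in self.sstables:
            row.update(sstable.get(key, {}))
        row.update(self.memtable.get(key, {}))
        # Reading is the only way a row enters the cache.
        if self.row_cache_enabled:
            self.row_cache[key] = dict(row)
        return row
```

In this sketch, writing to a row that has never been read leaves `row_cache` empty; after one `read`, later writes to that row update the cached copy at write time, not at flush time.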
A memtable is not appended to any SSTable (they are immutable). Each memtable flush creates a new SSTable. Compaction joins multiple SSTables into one larger SSTable.
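A minimal sketch of the flush and compaction steps, under simplifying assumptions: the functions `flush` and `compact` are hypothetical names, SSTables are modelled as read-only dict views, and "newer table wins" stands in for the timestamp-based reconciliation a real system would use.

```python
from types import MappingProxyType


def flush(memtable, sstables):
    """Turn the memtable into a brand-new SSTable and empty the memtable.

    Existing SSTables are never modified; a new one is appended to the list.
    """
    snapshot = {key: dict(cols) for key, cols in memtable.items()}
    sstables.append(MappingProxyType(snapshot))  # read-only view of the snapshot
    memtable.clear()


def compact(sstables):
    """Merge several SSTables into a single new SSTable.

    Tables are processed oldest first, so columns from newer tables
    overwrite columns with the same name from older ones.
    """
    merged = {}
    for sstable in sstables:
        for key, cols in sstable.items():
            merged.setdefault(key, {}).update(cols)
    return [MappingProxyType(merged)]
```

Two flushes of the same row produce two SSTables that each hold a fragment of it; `compact` returns one table holding the whole row.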
On a related note, I generally recommend that people run with row cache disabled.
1) The OS page cache stores the rows in a more compact way.
2) Using the OS page cache greatly reduces the GC pressure in the JVM.
3) Disabling the row cache lowers the load on the server during writes where none of the written rows are in the row cache (a very common scenario), because it avoids the whole path of checking the cache, reading the row, inserting it into the cache, and evicting something old. Not to mention that constantly evicting entries because new ones are added only adds to the GC pressure.
4) Large rows can kill your performance by forcing the JVM to keep them in memory; in extreme cases, multiple large rows and/or *really* large rows can OOM your JVM.
5) The best-case performance improvement you can hope for is generally < 5%, and it easily turns negative in most cases.
The main downside of turning it off entirely shows up with rows that are appended to, and/or have columns overwritten, slowly enough that pieces of the row keep landing in different SSTables. In that case it’s important that compaction keeps up.
Questions
Thanks a lot for this great explanation. We will disable it for now.
BTW, regarding the row cache eviction scheme, does it use LRU-by-access eviction both
1. When the row cache is full (i.e. it evicts the least recently accessed entry when the cache is full)
2. And by evicting based on an age threshold in a separate reaper thread?
FYI, by “LRU-by-access” I mean that the least recently accessed row is evicted. Alternatively, “LRU-by-creation” refers to the row least recently added to the cache.
Answers
It uses the SECOND_CHANCE eviction policy, which is based on insertion order but an entry gets “a second chance” if it was recently accessed.
That being said, you’re likely interested in https://issues.apache.org/jira/browse/CASSANDRA-975
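For readers unfamiliar with the idea, here is a minimal sketch of a generic second-chance policy as described in the answer: eviction follows insertion order, but a recently accessed entry is spared once. The class `SecondChanceCache` is hypothetical and is not the cache implementation Cassandra uses.

```python
from collections import OrderedDict


class SecondChanceCache:
    """Generic second-chance eviction (illustrative, not Cassandra's code)."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        # key -> [value, accessed_flag], kept in insertion order
        self.entries = OrderedDict()

    def get(self, key):
        entry = self.entries.get(key)
        if entry is None:
            return None
        entry[1] = True  # mark as recently accessed
        return entry[0]

    def put(self, key, value):
        if key in self.entries:
            self.entries[key][0] = value  # overwrite, keep queue position
            return
        while len(self.entries) >= self.capacity:
            self._evict_one()
        self.entries[key] = [value, False]

    def _evict_one(self):
        while True:
            key, entry = self.entries.popitem(last=False)  # oldest entry
            if entry[1]:
                # Accessed since insertion: clear the flag and move the
                # entry to the back of the queue instead of evicting it.
                entry[1] = False
                self.entries[key] = entry
            else:
                return key  # not accessed recently: evicted
```

With a capacity of 2, inserting `a` and `b`, reading `a`, then inserting `c` evicts `b`: `a` is older but was accessed, so it gets its second chance.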
-
rooksfury posted this