NoSQL @ Netflix : Part 1

Hi Folks!
I had a great time this past Thursday (Feb 17, 2011) speaking at the Silicon Valley Cloud Computing Group Meetup at Facebook. The talk covered Netflix’s move from RDBMS to NoSQL, specifically SimpleDB. Subsequent parts will provide our experiences with Cassandra, HBase, and other technologies.
Video is now available. The first 10 minutes are from sponsors, VMWare, RackSpace, and Scalr!
-s
Netflix’s Transition to High-Availability Storage Systems (QCon SF 2010) Slides
I presented at QCon SF (2010) yesterday on Netlflix’s transition to high-availability storage. The slides are on slideshare.
I expect that the presentation will be available on the QCon SF site in a few days. It’s (loosely) based on my previous white paper of the same title - see this post.
Netflix’s Transition to High-Availability Storage Systems
I just published a white paper titled Netflix’s Transition to High-Availability Storage Systems
Feel free to email me your thoughts at siddharthanand@yahoo.com
To download this paper as PDF, click on this .
SimpleDB Essentials for High Performance Users : Part 3

This is Part 3 of SimpleDB Essentials for High Performance Users. Check out Part 2
- Work around Attribute Value Length Limits
- If you need to store data that is vastly larger than 1024 bytes in a SimpleDB attribute, consider storing that data in S3 and putting a pointer (i.e. bucket name + object key) to the data in the simpleDB attribute. However, the drawback from this approach is that you will require 2 round-trips (i.e. one to SimpleDB and one to S3) to compose one logical row. Beyond the obvious performance hit, this approach is not transactionally sound.
- A better approach is to split that data over several SimpleDB attributes. You will need to control the splitting and joining logic of these SimpleDB attributes, but you will only need one roundtrip and you can leverage conditional puts for concurrency control. This approach is ideal if your data can fit in 10 or fewer attributes.
- Just remember that subsequent updates to these split attributes might be of different length
- Getting tripped up by the Default Select Query pagination limit of 100
- You must be aware that the SDB Select query supports the “limit N” expression. This allows the developer to specify N up to a max of 2500. If the developer chooses N=200 for example and 1000 items match the WHERE clause conditions, then the results would be returned in chunks of 200 at a time. 5 subsequent round trips would be required to fetch the 1000 items. For customer facing functionality, you are risking end-user timeouts. To avoid this, always specify “limit 2500”. Note: if you don’t specify it, the default value of 100 is assumed by SimpleDB
- Avoid any client code that auto-follows tokens returned by SimpleDB. SimpleDB Query timeouts could result in an unpredictably long-cycle of next-pointers. Auto-following these can not only result in an infinite loop on your servers, but customer-browser timeouts as well. Instead, follow these next pointers judiciously.
- Avoid carrying multi-table relationships into the cloud in the form of multi-domain relationships. Try to denormalize these relationships into single items. Doing joins in the application tier might require multiple round-trips to SDB and open customer-facing functionality to time-outs
- Remember that there are no sequences, locks, constraints (except for the uniqueness constraint on the item name), triggers, etc.. in SimpleDB. Don’t expect them
SimpleDB Essentials for High Performance Users : Part 2

This Part 2 of SimpleDB Essentials for High Performance Users. Check out Part 1
Read More
SimpleDB Essentials for High Performance Users : Part 1

Preamble
I’ve been a heavy-user of SimpleDB since January 2009, storing, writing, and reading billions of items. Based on my experience, I’ve compiled a list of best practices and conventions to simplify working with SimpleDB. I’ve divided this into multiple parts to ease readability.

Read More
A Java Out-of-Memory Error involving GZIP, Typica, and SimpleDB
UPDATE
I am providing an update here to the root cause.
Overview
I ran into an interesting Out of Memory bug this week. It occurs if you use gzip to send/receive data and under-utilize your Java Heap memory. This land-mine has existed since 2004, though hopefully you will not be bitten by it.
Problem Stack
A Java process was throwing the following Out-of-Memory Error.
JVMDUMP013I Processed Dump Event "uncaught", detail "java/lang/OutOfMemoryError".
Exception in thread "SDB WriterPool_4_rentalusers_incremental-thread-1" java.lang.OutOfMemoryError: ZIP004:OutOfMemoryError, MEM_ERROR in inflateInit2
at java.util.zip.Inflater.init(Native Method)
at java.util.zip.Inflater.<init>(Inflater.java:105)
at java.util.zip.ZipFile.getInflater(ZipFile.java:416)
at java.util.zip.ZipFile.getInputStream(ZipFile.java:359)
at java.util.zip.ZipFile.getInputStream(ZipFile.java:324)
at java.util.jar.JarFile.getInputStream(JarFile.java:467)
at sun.net.www.protocol.jar.JarURLConnection.getInputStream(JarURLConnection.java:165)
at java.net.URL.openStream(URL.java:1041)
at java.lang.ClassLoader.getResourceAsStream(ClassLoader.java:455)
at com.xerox.amazonws.common.AWSQueryConnection.<init>(AWSQueryConnection.java:102)
at com.xerox.amazonws.sdb.Domain.<init>(Domain.java:72)
at com.xerox.amazonws.sdb.SimpleDB.getDomain(SimpleDB.java:202)
....
at java.lang.Thread.run(Thread.java:803)
Read More
SimpleDB Performance : 5 Steps to Achieving High Write Throughput
I was recently tasked with fork-lifting ~1 billion rows from Oracle into SimpleDB. I completed this forklift in November 2009 after many attempts. To make this as efficient as possible, I worked closely with Amazon’s SimpleDB folks to troubleshoot performance problems and create new APIs. I’d like to share some recommendations and observations.
Although I have covered these recommendations in depth in a previous post (i.e. link above), I’d like present a more succinct list of recommendations and observations here to maximize knowledge transfer.
Architecture
The architecture consists of a daemon (i.e. IR, for Item Replicator) that reads records out of Oracle and puts them into multiple SimpleDB domains. I’ve actually shown a second IR process that reads data out of SimpleDB for insertion into Oracle, but you should ignore it for the purpose of this discussion. When I refer to IR in this article, I mean the process replicating from Oracle to SimpleDB.

Recommendations
- Shard your data
- You can achieve much higher data access rates to multiple domains than to a single domain. Hence, rather than using a single domain, use multiple. This is because write traffic acts as if throttled or rate-limited at a domain level.
- Use slow-ramp up for writing
- AWS (SimpleDB) doesn’t like bursty writes and will often respond by throttling IR. When your data uploader starts up, have it slowly increase the write rate
- Use some sort of back-off strategy
- I’ve adopted Amazon recommendation for retry intervals (i.e. 250ms, 500ms, 1s, 2s). Essentially, wait 250 milliseconds on first failure before retrying, 500 milliseconds on second failure before retrying, and so on. After the 3rd retry attempt, stick to 2 second idle intervals.
- Use BatchPutAttributes instead of the singleton PutAttributes
- This will get you an order-of-magnitude improvement in throughput
- Set replace=false on puts
- This is the default. If you know that you are strictly always inserting unique records, puts with replace=false will run much faster than replace=true
- Also, since this is the default, Amazon recommends that users not set replace=false at all
Feel free to follow me on Twitter (@r39132).
The Oracle-SimpleDB Hybrid Part 3 : Defining the SimpleDB-Oracle translation
Preamble : See Part 2 : Solving the Eventual-Consistency Problem
When building a SimpleDB-Oracle (i.e. any key_value_store-RDBMS) hybrid system, translating between two very different data models presents a challenge. The challenge expands beyond the obvious ACID vs. BASE differences.
Most RDBMSs support the following features:
- Triggers
- Stored Procedures
- Constraints (e.g. integrity, foreign key, unique, etc…)
- Sequences
- Sequences used as Primary Keys
- Locks
- Tables without Primary Keys or Unique Keys or both
- Relationships between tables
Read More