SimpleDB Performance: 5 Steps to Achieving High Write Throughput
I was recently tasked with fork-lifting ~1 billion rows from Oracle into SimpleDB. I completed this forklift in November 2009 after many attempts. To make this as efficient as possible, I worked closely with Amazon’s SimpleDB folks to troubleshoot performance problems and create new APIs. I’d like to share some recommendations and observations.
Although I have covered these recommendations in depth in a previous post, I’d like to present a more succinct list here to maximize knowledge transfer.
The architecture consists of a daemon (IR, for Item Replicator) that reads records out of Oracle and puts them into multiple SimpleDB domains. I’ve also shown a second IR process that reads data out of SimpleDB for insertion into Oracle, but you should ignore it for the purposes of this discussion. When I refer to IR in this article, I mean the process replicating from Oracle to SimpleDB.
- Shard your data
  - You can achieve much higher data access rates across multiple domains than against a single domain, because write traffic behaves as if it is throttled (rate-limited) at the domain level. Hence, rather than using a single domain, spread your data over several.
- Use a slow ramp-up for writing
  - SimpleDB doesn’t like bursty writes and will often respond by throttling the writer (IR, in my case). When your data uploader starts up, have it increase the write rate slowly.
- Use some sort of back-off strategy
  - I’ve adopted Amazon’s recommendation for retry intervals (250ms, 500ms, 1s, 2s). Essentially, wait 250 milliseconds after the first failure before retrying, 500 milliseconds after the second failure, and so on. After the third retry, stick to 2-second waits between attempts.
- Use BatchPutAttributes instead of the singleton PutAttributes
  - This will get you an order-of-magnitude improvement in throughput.
- Use replace=false on puts
  - If you know that you are strictly always inserting unique records, puts with replace=false will run much faster than puts with replace=true.
  - replace=false is the default, so Amazon recommends that you simply leave the replace parameter unset rather than passing replace=false explicitly.
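To make the first four recommendations concrete, here is a minimal Python sketch of the client-side logic an uploader might use. It is not IR's actual code. All function names (`domain_for`, `backoff_delay`, `ramp_rate`, `batches`, `put_with_retry`) are hypothetical, the hash-based shard choice and the linear ramp are assumptions for illustration, and `send` stands in for whatever call your SimpleDB client library uses to issue a BatchPutAttributes request. The batch size of 25 reflects the per-call item limit of BatchPutAttributes as I understand it; only the retry delays come directly from the list above.

```python
import hashlib
import time
from typing import Callable, Iterable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")

# Retry schedule from the list above: 250 ms, 500 ms, 1 s, then 2 s thereafter.
RETRY_DELAYS_S = (0.25, 0.5, 1.0, 2.0)


def domain_for(item_name: str, domains: Sequence[str]) -> str:
    """Pick a domain (shard) for an item from a stable hash of its name."""
    digest = hashlib.md5(item_name.encode("utf-8")).digest()
    return domains[int.from_bytes(digest[:8], "big") % len(domains)]


def backoff_delay(failure_count: int) -> float:
    """Seconds to wait after the Nth consecutive failure (1-based)."""
    if failure_count < 1:
        raise ValueError("failure_count must be >= 1")
    return RETRY_DELAYS_S[min(failure_count, len(RETRY_DELAYS_S)) - 1]


def ramp_rate(elapsed_s: float, start_rate: float, max_rate: float,
              ramp_s: float) -> float:
    """Target writes/sec, rising linearly from start_rate to max_rate
    over the first ramp_s seconds after startup (an assumed ramp shape)."""
    if ramp_s <= 0 or elapsed_s >= ramp_s:
        return max_rate
    fraction = max(elapsed_s, 0.0) / ramp_s
    return start_rate + (max_rate - start_rate) * fraction


def batches(items: Iterable[T], size: int = 25) -> Iterator[List[T]]:
    """Group items into lists of at most `size` for one batch call each."""
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def put_with_retry(send: Callable[[List[T]], object], batch: List[T],
                   sleep: Callable[[float], None] = time.sleep,
                   max_attempts: int = 10) -> object:
    """Call the hypothetical send(batch), backing off between failures.

    Gives up and re-raises once max_attempts calls have failed.
    """
    failures = 0
    while True:
        try:
            return send(batch)
        except Exception:  # a real uploader should retry only throttling errors
            failures += 1
            if failures >= max_attempts:
                raise
            sleep(backoff_delay(failures))
```

In a real uploader you would group records by `domain_for(...)`, feed each group through `batches(...)`, pace the calls according to `ramp_rate(...)`, and pass each batch to `put_with_retry` with a `send` function that wraps your client's batch-put call for that domain.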
Feel free to follow me on Twitter (@r39132).