SimpleDB Essentials for High Performance Users : Part 1

Preamble

I’ve been a heavy user of SimpleDB since January 2009, storing, writing, and reading billions of items. Based on that experience, I’ve compiled a list of best practices and conventions that simplify working with SimpleDB. I’ve split the list into multiple parts for readability.

Details

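  1. SimpleDB compares all values as strings, so sorting is lexicographical. If you plan on sorting by certain attributes, then
    1. zero-pad logically numeric attributes to a fixed width (e.g. store 42 as 0000000042), since otherwise “10” sorts before “9”
    2. store logical dates in the ISO 8601 format that Joda-Time emits by default, in the Zulu time zone (e.g. 2009-01-15T08:30:00.000Z), as this format is both human-readable and lexicographically sortable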
  2. When storing dates, store all of them in the same ISO 8601 format and in a single time zone, because timestamps with mixed offsets no longer sort chronologically as strings. I recommend the Zulu time zone (i.e. UTC/GMT)
  3. Use a naturally-occurring (sometimes composite) unique key for the item name
    1. This can speed up, by 2 orders of magnitude (i.e. tens of milliseconds vs. seconds), any query that would otherwise need to “AND” conditions on several attributes. With 2 (indexed) attributes, the most selective index is applied first, and the second index then acts as a filter on that result. If the combination is unique and you use it as the item name, your lookup achieves an index selectivity of 1: it matches exactly one item.
      1. As a contrived example, instead of select * from MY_DOMAIN where FIRST_NAME = 'Sid' and LAST_NAME = 'Anand', use select * from MY_DOMAIN where itemName() = 'Sid:Anand'
  4. If you don’t have a naturally-occurring unique key, then use a GUID or UUID for the item name. In the RDBMS world, people use DB sequences. In the cloud, UUID or GUID is the way to go. It can be computed locally in your SimpleDB client, and local is always better.
  5. Favor Composite-Value Attributes to speed up Selects
    1. Basically, “AND”ing conditions in the WHERE clause is slow (see the earlier entry). This is a major bug IMHO and needs to be fixed. For small data sets, the workaround is to create composite-value attributes that combine the values you would otherwise “AND”, which avoids the costly set intersection.
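
To make the conventions above concrete, here is a minimal Python sketch of the client-side helpers they imply. Everything in it is my own illustration rather than part of any SimpleDB client library: the function names, the pad width of 10, and the “:” separator are assumptions, and I use Python’s standard datetime module in place of Joda time. It only builds strings; it does not talk to SimpleDB.

```python
import uuid
from datetime import datetime, timezone

PAD_WIDTH = 10  # assumed width; pick one that fits your largest value


def zero_pad(n: int, width: int = PAD_WIDTH) -> str:
    """Encode a non-negative integer so that string order equals numeric order."""
    if n < 0:
        raise ValueError("negative values need an offset before padding")
    text = str(n)
    if len(text) > width:
        raise ValueError(f"{n} does not fit in {width} digits")
    return text.zfill(width)


def iso_utc(dt: datetime) -> str:
    """Format a timezone-aware datetime as ISO 8601 in UTC with a 'Z' suffix."""
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    utc = dt.astimezone(timezone.utc)
    return utc.isoformat(timespec="milliseconds").replace("+00:00", "Z")


def composite_key(*parts: str, sep: str = ":") -> str:
    """Join key parts into one item name (or one composite-value attribute)."""
    for part in parts:
        if sep in part:
            raise ValueError(f"{part!r} contains the separator {sep!r}")
    return sep.join(parts)


def new_item_name() -> str:
    """Fallback item name when no natural key exists; computed locally."""
    return str(uuid.uuid4())


def select_by_item_name(domain: str, item_name: str) -> str:
    """Build a select on itemName(); a quote in the value is escaped by doubling it."""
    escaped = item_name.replace("'", "''")
    return f"select * from {domain} where itemName() = '{escaped}'"


print(sorted(["9", "10", "100"]))                 # unpadded: string order, not numeric order
print(sorted(zero_pad(n) for n in (100, 9, 10)))  # padded: string order matches numeric order
print(iso_utc(datetime(2009, 1, 15, 8, 30, tzinfo=timezone.utc)))
print(select_by_item_name("MY_DOMAIN", composite_key("Sid", "Anand")))
```

The same composite_key value can also be written to a composite-value attribute (say, a hypothetical FULL_NAME attribute) when the combination is not unique enough to serve as the item name. Rejecting parts that contain the separator is a design choice of this sketch: it keeps two different combinations from collapsing into the same key.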

 Be sure to check out Part 2 

  1. rooksfury posted this
About Me
A blog describing my work in building websites that hundreds of millions of people visit. I'm Chief Architect at ClipMine, an innovative video mining and search company. I previously held technical and leadership roles at LinkedIn, Netflix, Etsy, eBay & Siebel Systems. In addition to the nerdy stuff, I've included some stunning photography for your pure enjoyment!