SimpleDB Essentials for High Performance Users : Part 2

This is Part 2 of SimpleDB Essentials for High Performance Users. Check out Part 1.

 

  1. Beware of Case-sensitivity
    1. Since domain names and attribute names are case-sensitive, for all domain and attribute names, use uppercase lettering and separate words with “_”
    2. When sharding domains, adopt zero-based index numbering and separate it from the root name with “_”
      1. e.g. MY_DOMAIN_0, MY_DOMAIN_1, … , MY_DOMAIN_99
  2. Shard Domains
    1. Since SimpleDB throttles writes per domain, shard your domains to scale write traffic if you expect to do more than 70 singleton puts/second at any point in the future
    2. Also, if you plan to ever need more than 1 billion attributes or 10GB of space, shard your domains — use a rule of 10x for estimating customer-related growth
  3. Avoid Non-Indexed Queries
    1. Avoid queries like select * from MY_DOMAIN where SOME_ATTRIBUTE is NULL. These queries are not indexed and can result in an unpredictably long cycle of empty result pages and next-pointer following. If you need this query in a customer-facing application, write a pseudo-null value to the attribute instead of leaving it unset. This will allow you to do indexed queries like select * from MY_DOMAIN where SOME_ATTRIBUTE = 'my-pseudo-null'
  4. Be Aware of Eventual Consistency
    1. If you require transactional guarantees for certain data, use Conditional Puts. You can optionally also use Consistent Reads, but they are not required.
  5. Use Batch Puts when possible for optimal write performance
    1. If you find that you are doing multiple item puts to the same domain, consider using the Batch Put API instead. You can submit up to 25 items (or 256 attributes or 1MB request size) to a single domain using the Batch Put API
    2. The only additional constraint is that the item names must be unique in the request, but this makes sense as there is no implied order of the items in the request
    3. The Batch Put is an all-or-nothing API. If it fails, then none of the items have been written. If it succeeds, then all of the items have been written. Partial success is not possible.
    4. When writing 25 items at a time, I have seen a 20-25x improvement in write-throughput when using Batch Put over Singleton Put
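
To make the sharding and batching advice above concrete, here is a minimal Python sketch of how items might be routed to shards and grouped into Batch Put-sized requests. It is an illustration under assumptions, not how any particular client does it: the function names shard_domain and plan_batches are hypothetical, the choice of an MD5-based hash for routing is mine, and the shard count of 100 just mirrors the MY_DOMAIN_0 … MY_DOMAIN_99 example. It only enforces the 25-item limit; the attribute-count and 1MB request-size limits would need their own checks.

```python
import hashlib
from collections import defaultdict

ROOT_NAME = "MY_DOMAIN"   # root domain name, as in the naming example above
NUM_SHARDS = 100          # assumption: MY_DOMAIN_0 .. MY_DOMAIN_99
MAX_BATCH_ITEMS = 25      # Batch Put item limit quoted above


def shard_domain(item_name, root=ROOT_NAME, num_shards=NUM_SHARDS):
    """Map an item name to a zero-based shard name such as MY_DOMAIN_42."""
    # MD5 is used only as a stable hash. Python's built-in hash() is
    # randomized per process for strings, so it would not route consistently.
    digest = hashlib.md5(item_name.encode("utf-8")).hexdigest()
    return f"{root}_{int(digest, 16) % num_shards}"


def plan_batches(items, max_items=MAX_BATCH_ITEMS):
    """Group {item_name: attributes} by shard, then split into batches.

    Returns a list of (domain_name, {item_name: attributes}) pairs, each
    holding at most max_items items. Item names are dict keys, so they are
    unique within every batch, as Batch Put requires.
    """
    by_domain = defaultdict(dict)
    for name, attrs in items.items():
        by_domain[shard_domain(name)][name] = attrs

    batches = []
    for domain in sorted(by_domain):
        names = sorted(by_domain[domain])
        for start in range(0, len(names), max_items):
            chunk = names[start:start + max_items]
            batches.append((domain, {n: by_domain[domain][n] for n in chunk}))
    return batches
```

Each (domain, batch) pair would then be handed to whatever Batch Put call your SimpleDB client exposes. Because a batch targets exactly one domain, grouping by shard before chunking keeps every request valid, and a failed request can be retried as a whole since Batch Put is all-or-nothing.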

 Be sure to check out Part 3

  1. rooksfury posted this
About Me
A blog describing my work in building websites that hundreds of millions of people visit. I'm a senior member of LinkedIn Search Infrastructure. I previously held technical and leadership roles at Netflix, Etsy, eBay & Siebel Systems. In addition to the nerdy stuff, I've included some stunning photography for your pure enjoyment!