SimpleDB Essentials for High Performance Users: Part 1

Preamble
I’ve been a heavy user of SimpleDB since January 2009, writing, storing, and reading billions of items. Based on that experience, I’ve compiled a list of best practices and conventions that simplify working with SimpleDB. I’ve split the list into multiple parts to make it easier to read.

Introducing the Oracle-SimpleDB Hybrid
My company would like to migrate its systems to the cloud. As this will take several months, the engineering team needs to support data access in both the cloud and its data center in the interim. Also, the RDBMS might be maintained until some functionality (e.g. Backup-Restore) is created in SimpleDB.
To this end, for the past 9 months, I have been building an eventually-consistent, multi-master data store. This system comprises an Oracle replica and several SimpleDB replicas. As I near completion of this system, I’d like to share its design.
Here’s the system:

We plan to accept reads and writes both in our data center (Oracle) and in our AWS region (SimpleDB). Two Incremental Replicators (IRs) transmit the changes between Oracle and SimpleDB: one replicates data from Oracle to SimpleDB, and the other replicates data back from SimpleDB to Oracle.
Cloud Tips: How to Efficiently Forklift 1 Billion Rows into SimpleDB

About 9 months ago, I was tasked with forklifting a massive amount of data into Amazon’s SimpleDB in a short amount of time. I achieved it. Here’s what you need to know.
If you read on, I’ll show you how to achieve data upload rates of around 10K items/second.
SimpleDB Basics
First of all, if you have 1 billion rows to upload, you will need more than 1 domain. This is because Amazon SDB imposes limits on how much data you can store in one domain (see the SimpleDB limits).
Without digressing too much, figure out the optimal domain sharding scheme for your data growth by keeping the following formula in mind:
Storage Usage = ItemNamesSizeBytes + AttributeValuesSizeBytes + AttributeNamesSizeBytes
This is how Amazon computes your Storage Usage vis-a-vis their 10GB limit.
Note: You might need to ask them to raise your domains per account beyond 100 if you find 100 domains is too few for your data growth.
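The formula above can be turned into a rough capacity estimate. The following is a minimal sketch, not the method used in the original forklift: the function names, the headroom factor, the "items" domain prefix, and the CRC32-based shard choice are all hypothetical, and the 10GB limit is assumed to mean 10 * 1024^3 bytes.

```python
import math
import zlib

# Assumption: the 10GB per-domain limit is interpreted as 10 * 1024^3 bytes.
DOMAIN_LIMIT_BYTES = 10 * 1024 ** 3


def storage_usage(item_names_size_bytes, attribute_values_size_bytes,
                  attribute_names_size_bytes):
    """Storage usage of one domain, per the formula above."""
    return (item_names_size_bytes
            + attribute_values_size_bytes
            + attribute_names_size_bytes)


def domains_needed(total_bytes, headroom=0.5):
    """Estimate how many domains are needed for total_bytes of data.

    headroom is a hypothetical safety factor: the fraction of the
    per-domain limit you are willing to fill, leaving room for growth.
    """
    usable_bytes = DOMAIN_LIMIT_BYTES * headroom
    return max(1, math.ceil(total_bytes / usable_bytes))


def domain_for(item_name, domain_count, prefix="items"):
    """Map an item name to a domain with a stable hash (CRC32).

    The same item name always lands in the same domain, as long as
    domain_count does not change.
    """
    shard = zlib.crc32(item_name.encode("utf-8")) % domain_count
    return f"{prefix}_{shard:03d}"


if __name__ == "__main__":
    # Hypothetical workload: 1 billion items at roughly 200 bytes each.
    total = 1_000_000_000 * 200
    count = domains_needed(total)
    print(count, domain_for("user-12345", count))
```

Note that a modulo-based scheme like this one remaps most items if the domain count changes later, so it is worth sizing the domain count for expected growth up front.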