Denial of Service (DoS): Some Thoughts
About a year ago, I had the opportunity to solve a class of Denial-of-Service attacks that were compromising our availability and scalability. During that investigation, I happened upon a revelation. That revelation led to a solution. I’ve since seen that learning applied to other systems, including Amazon’s SimpleDB, so I wanted to share it here.
Consider the following scenario (also depicted below):
- A web client issues an HTTP request to a web site
- The web site, upon receiving the request, attempts to determine whether the current request is part of a larger DoS attack
- If so, a defense is executed
- If not, the web request follows a normal execution of business logic
- The web server returns a response to the web client
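The flow above can be sketched roughly as follows. This is only an illustration, not the solution the post goes on to describe: the detection rule (a per-client sliding-window request count), the constants `WINDOW_SECONDS` and `MAX_REQUESTS`, the 429 response used as the "defense", and all function names are assumptions made for the example.

```python
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 10   # hypothetical look-back window
MAX_REQUESTS = 100    # hypothetical per-client budget within the window

_recent = defaultdict(deque)  # client id -> timestamps of recent requests


def is_part_of_attack(client_id, now=None):
    """Record this request and report whether the client exceeded its budget."""
    now = time.monotonic() if now is None else now
    stamps = _recent[client_id]
    stamps.append(now)
    # Drop timestamps that have aged out of the window.
    while stamps and now - stamps[0] > WINDOW_SECONDS:
        stamps.popleft()
    return len(stamps) > MAX_REQUESTS


def run_business_logic(request):
    """Stand-in for the site's normal request handling."""
    return 200, "OK"


def handle_request(client_id, request, now=None):
    """Check for an attack first, then either defend or serve the request."""
    if is_part_of_attack(client_id, now):
        return 429, "Too Many Requests"  # the "defense" branch (assumed)
    return run_business_logic(request)
```

Note that in this sketch the attack check itself runs on every request, so its cost is paid even for legitimate traffic.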

Introducing the Oracle-SimpleDB Hybrid
My company would like to migrate its systems to the cloud. As this will take several months, the engineering team needs to support data access in both the cloud and its data center in the interim. Also, the RDBMS system might be maintained until some functionality (e.g. Backup-Restore) is created in SimpleDB.
To this end, I have spent the past 9 months building an eventually-consistent, multi-master data store. This system comprises an Oracle replica and several SimpleDB replicas. As I near completion of this system, I’d like to share its design.
Here’s the system:

We plan on accepting reads and writes in our data center (Oracle) and in our AWS region (SimpleDB). There are 2 Incremental Replicators (IRs) that transmit the changes between Oracle and SimpleDB. One replicates data from Oracle to SimpleDB, the other replicates data back from SimpleDB to Oracle.
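As a rough illustration of two one-directional replicators, here is a minimal in-memory sketch. It is not the actual IR design: the `Replica` class, the change log, and the "highest version wins" conflict rule are all assumptions made for the example, since the text above does not say how conflicting writes are resolved.

```python
class Replica:
    """In-memory stand-in for one data store (Oracle or SimpleDB)."""

    def __init__(self, name):
        self.name = name
        self.rows = {}      # key -> (version, value)
        self.changes = []   # local writes not yet shipped to the other side

    def write(self, key, value, version):
        """Accept a local client write and log it for replication."""
        self.rows[key] = (version, value)
        self.changes.append((key, version, value))

    def apply_remote(self, key, version, value):
        """Apply a replicated change only if it is newer than what we hold.

        Assumed conflict rule: the higher version wins. Remote changes are
        not re-logged, so they do not bounce back to their origin.
        """
        current = self.rows.get(key)
        if current is None or version > current[0]:
            self.rows[key] = (version, value)


def replicate(source, target):
    """One pass of a one-directional incremental replicator."""
    pending, source.changes = source.changes, []
    for key, version, value in pending:
        target.apply_remote(key, version, value)


oracle, simpledb = Replica("oracle"), Replica("simpledb")
oracle.write("user:1", "alice", version=1)
simpledb.write("user:1", "alice-renamed", version=2)
replicate(oracle, simpledb)   # Oracle -> SimpleDB
replicate(simpledb, oracle)   # SimpleDB -> Oracle
```

After both passes, the two replicas hold the same row for `user:1`, which is the "eventually consistent" property the system is aiming for.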
Cloud Tips: How to Efficiently Forklift 1 Billion Rows into SimpleDB

About 9 months ago, I was tasked with fork-lifting a massive amount of data into Amazon’s SimpleDB in a short amount of time. I achieved it. Here’s what you need to know.
If you read on, I’ll show you how to achieve data upload rates of around 10K items/second.
SimpleDB Basics
First of all, if you have 1 billion rows to upload, you will need more than 1 domain. This is because Amazon SDB imposes certain limits on how much data you can store in one domain (see limits).
Without digressing too much, figure out the optimal domain sharding scheme for your data growth by keeping the following formula in mind:
Storage Usage = (ItemNamesSizeBytes + AttributeValuesSizeBytes + AttributeNamesSizeBytes)
This is how Amazon computes your Storage Usage vis-a-vis their 10GB limit.
Note: If 100 domains are too few for your data growth, you might need to ask Amazon to raise your domains-per-account limit beyond 100.
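To make the sizing arithmetic concrete, here is a small sketch. The formula and the 10GB limit come from the text above; treating 10GB as 10 × 1024³ bytes, the `fill_ratio` headroom parameter, the CRC32-based shard choice, and the `mydata` domain prefix are all assumptions for illustration, not a recommended scheme.

```python
import math
import zlib

# The 10GB per-domain limit mentioned above (assumed here to be binary GB).
DOMAIN_LIMIT_BYTES = 10 * 1024**3


def storage_usage(item_names_bytes, attribute_values_bytes, attribute_names_bytes):
    """Storage Usage as defined by the formula above."""
    return item_names_bytes + attribute_values_bytes + attribute_names_bytes


def domains_needed(total_usage_bytes, fill_ratio=0.5):
    """Domains required if each may fill up to fill_ratio of the limit."""
    return math.ceil(total_usage_bytes / (DOMAIN_LIMIT_BYTES * fill_ratio))


def domain_for(item_name, domain_count, prefix="mydata"):
    """Pick a domain by hashing the item name (one possible sharding scheme)."""
    shard = zlib.crc32(item_name.encode("utf-8")) % domain_count
    return f"{prefix}_{shard:03d}"
```

For example, `domains_needed(100 * 1024**3)` asks how many domains are needed for 100GB of projected usage when each domain is kept at most half full, and `domain_for("item-42", 20)` maps an item name to one of 20 domains deterministically.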
Eventual Consistency Explained for Non-techies
If you work in the computer industry, especially the Internet industry, chances are good that you have encountered an eventually-consistent system.
For example, when managing an Internet or IT business, you might have considered one or all of the following DB architectures:
1. Use a single DB host
   - e.g. MyHost
2. Use a single DB host for your writes, but several for your reads
   - e.g. MyWriteHost & MyReadHost1, MyReadHost2, MyReadHost3, etc.
3. Use multiple DB hosts
   - e.g. MyHost1, MyHost2, etc.
4. Use multiple DB hosts for your writes and your reads
   - e.g. MyWriteHost1, MyWriteHost2, etc. & MyReadHost1, MyReadHost2, etc.
These 4 choices represent an increasing degree of data traffic partitioning, with 1 having no partitioning and 4 having the most partitioning.