<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"><channel><atom:link rel="hub" href="http://tumblr.superfeedr.com/" xmlns:atom="http://www.w3.org/2005/Atom"/><description>
A technical blog describing my work in Cloud Computing. I am a member of Netflix’s Cloud Infrastructure team. Prior to joining Netflix, I served as the VP of Engineering at Etsy, worked as a search engineer and researcher at eBay, and solved performance issues at Siebel Systems. I earned my B.S. and M.Eng degrees from Cornell University. My graduate work focused on distributed (cloud) computing.

Feel free to follow me on Twitter (@r39132) or Linked In (http://www.linkedin.com/in/siddharthanand)</description><title>Practical Cloud Computing</title><generator>Tumblr (3.0; @rooksfury)</generator><link>http://practicalcloudcomputing.com/</link><item><title>Senior Software Engineer – Cloud Performance - Netflix</title><description>&lt;p&gt;&lt;strong&gt;Senior Software Engineer – Cloud Performance&lt;/strong&gt; &lt;strong&gt;– Netflix&lt;/strong&gt;&lt;br/&gt;Los Gatos, CA&lt;br/&gt;&lt;br/&gt;&lt;strong&gt;The Culture&lt;/strong&gt;&lt;br/&gt;Netflix hires extraordinary performers and gives them the freedom to make an impact. You may be aware of our booming streaming business, but are you aware of our advanced usage of cloud computing? In anticipation of our business’ rapid growth, Netflix, an early cloud adopter, now ranks among the top users of cloud-based infrastructure-as-a-service (a.k.a. IAAS). &lt;br/&gt;&lt;br/&gt;&lt;strong&gt;The Position&lt;/strong&gt;&lt;br/&gt;We are looking for best-of-breed, performance-minded software engineers with a passion and talent for scaling high-traffic distributed systems. You should have experience building similarly-trafficked systems and a track record of improving them. Your improvements should be represented by hard-numbers and grounded in engineering principles. 
&lt;br/&gt;&lt;br/&gt;&lt;strong&gt;Responsibilities include&lt;/strong&gt;&lt;br/&gt;• Drive cloud performance &amp; scalability optimization at Netflix and at our cloud partners&lt;br/&gt;• Proactively define and expose metrics that can improve our services’ performance, scalability, and availability&lt;br/&gt;• As a member of this team, you will also work on parts of our core software&lt;br/&gt;• Define and evangelize best-practices at Netflix for Cloud usage&lt;br/&gt;&lt;br/&gt;&lt;strong&gt;Minimum Job Qualifications&lt;/strong&gt;&lt;br/&gt;• 10 years of relevant software engineering experience - 6 years of experience with high-traffic, large-scale distributed systems and client-server architectures&lt;br/&gt;• Experience with Cloud Computing platforms (e.g. Amazon AWS, Microsoft Azure, Google App Engine) &lt;br/&gt;• Object-oriented programming experience with Java&lt;br/&gt;• BS/MS in computer science (or equivalent)&lt;br/&gt;&lt;br/&gt;&lt;strong&gt;Winning Qualities&lt;/strong&gt;&lt;br/&gt;• Understands complex systems from a performance perspective&lt;br/&gt;• Works well in teams&lt;br/&gt;• Shows leadership &lt;br/&gt;• Is meticulous and numbers-driven&lt;br/&gt;• Employs unambiguous, crystal-clear communication&lt;/p&gt;
&lt;p&gt;Contact me at siddharthanand@yahoo.com with your resume. Principals only (i.e. no recruiters). Locals preferred. Also, implement a method in Java that, given input like AAABBaaccC, writes A3B2a2c2C1. Attach the source code to your email. What are the runtime and memory complexities of your solution in Big O notation?&lt;/p&gt;</description><link>http://practicalcloudcomputing.com/post/855127733</link><guid>http://practicalcloudcomputing.com/post/855127733</guid><pubDate>Sat, 24 Jul 2010 20:57:00 -0700</pubDate><category>Netflix</category><category>Cloud</category><category>Jobs</category></item><item><title>SimpleDB Essentials for High Performance Users : Part 3 </title><description>&lt;p&gt;&lt;img width="192" height="232" src="http://2.bp.blogspot.com/_gd8ewmnjidE/RdpbH4CubAI/AAAAAAAAAJY/4OjTmoibgLY/s400/calvin-y-hobbes%2Bswift%2Bkick.gif"/&gt;&lt;/p&gt;
&lt;p&gt;This is Part 3 of SimpleDB Essentials for High Performance Users. Check out &lt;a href="http://bit.ly/bPT5JP"&gt;Part 2&lt;/a&gt;&lt;/p&gt;
&lt;ol start="1"&gt;
&lt;li&gt;Work around Attribute Value Length Limits&lt;br/&gt;&lt;ol start="1"&gt;
&lt;li&gt;If you need to store data that is vastly larger than 1024 bytes in a SimpleDB attribute, consider storing that data in S3 and putting a pointer (i.e. bucket name + object key) to the data in the SimpleDB attribute. The drawback of this approach is that you need 2 round-trips (i.e. one to SimpleDB and one to S3) to compose one logical row. Beyond the obvious performance hit, this approach is not transactionally sound.&lt;/li&gt;
&lt;li&gt;A better approach is to split that data over several SimpleDB attributes. You will need to control the splitting and joining logic of these SimpleDB attributes, but you will only need one round-trip and you can leverage conditional puts for concurrency control. This approach is ideal if your data can fit in 10 or fewer attributes.&lt;br/&gt;&lt;ol start="1"&gt;
&lt;li&gt;Just remember that subsequent updates may change the length of the data, and therefore the number of split attributes, so an update may need to remove attributes left over from a longer previous value&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;/ol&gt;
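&lt;p&gt;The splitting and joining logic above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name, the DATA_00-style attribute naming, and the ASCII-only restriction are made up for the example, and the actual SimpleDB put and get calls are left out.&lt;/p&gt;

```java
import java.nio.charset.StandardCharsets;

/** Sketch: split a long ASCII value into chunks that each fit one attribute. */
final class AttributeSplitter {
    // Assumption: 1024 bytes is the per-attribute value limit quoted in the text.
    static final int MAX_BYTES = 1024;

    /**
     * Splits an ASCII string into pieces of at most maxBytes bytes each.
     * Assumption: the value is pure ASCII, so one character is one byte.
     */
    static String[] split(String value, int maxBytes) {
        byte[] bytes = value.getBytes(StandardCharsets.US_ASCII);
        int count = (bytes.length + maxBytes - 1) / maxBytes;
        String[] chunks = new String[count];
        for (int i = 0; i != count; i++) {
            int from = i * maxBytes;
            int len = Math.min(maxBytes, bytes.length - from);
            chunks[i] = new String(bytes, from, len, StandardCharsets.US_ASCII);
        }
        return chunks;
    }

    /** Hypothetical naming scheme: DATA_00, DATA_01, ... keeps the chunks ordered. */
    static String attributeName(int index) {
        return String.format("DATA_%02d", index);
    }

    /** Rejoins the chunks in index order. */
    static String join(String[] chunks) {
        StringBuilder sb = new StringBuilder();
        for (String chunk : chunks) {
            sb.append(chunk);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        StringBuilder big = new StringBuilder();
        for (int i = 0; i != 2500; i++) {
            big.append((char) ('a' + (i % 26)));
        }
        String[] chunks = split(big.toString(), MAX_BYTES);
        System.out.println(chunks.length + " chunks, first attribute " + attributeName(0));
        System.out.println("round trip ok: " + join(chunks).equals(big.toString()));
    }
}
```

&lt;p&gt;Zero-padding the chunk index keeps the attribute names in order when they are sorted as strings, which makes the join step simple.&lt;/p&gt;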
&lt;/li&gt;
&lt;li&gt;Avoid getting tripped up by the default Select query pagination limit of 100&lt;ol start="1"&gt;
&lt;li&gt;Be aware that the SDB Select query supports the “&lt;strong&gt;limit N&lt;/strong&gt;” expression. This allows the developer to specify N up to a max of 2500. If the developer chooses N=200, for example, and 1000 items match the WHERE clause conditions, then the results are returned in chunks of 200 at a time, so 5 round trips are required to fetch the 1000 items. For customer-facing functionality, you are risking end-user timeouts. To avoid this, always specify “limit 2500”. Note: if you don’t specify a limit, SimpleDB assumes the default value of 100.&lt;/li&gt;
&lt;li&gt;Avoid any client code that auto-follows tokens returned by SimpleDB. SimpleDB query timeouts could result in an unpredictably long cycle of next-pointers. Auto-following these can result not only in an infinite loop on your servers, but in customer-browser timeouts as well. Instead, follow these next-pointers judiciously.&lt;/li&gt;
&lt;/ol&gt;
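&lt;p&gt;One way to follow next-pointers judiciously is to give each request a fixed page budget. The sketch below assumes a hypothetical PageFetcher hook in place of a real SimpleDB client; Page, PageFetcher, and countItems are names invented for this example, not part of any AWS library.&lt;/p&gt;

```java
/** Sketch: follow SimpleDB-style next tokens, but only up to a page budget. */
final class BoundedPager {
    /** Hypothetical page of results: item names plus an optional next token. */
    static final class Page {
        final String[] items;
        final String nextToken; // null when there are no more pages

        Page(String[] items, String nextToken) {
            this.items = items;
            this.nextToken = nextToken;
        }
    }

    /** Hypothetical query hook; a real client would issue the Select call here. */
    interface PageFetcher {
        Page fetch(String nextToken);
    }

    /** Counts items across at most maxPages pages, then stops following tokens. */
    static int countItems(PageFetcher fetcher, int maxPages) {
        int total = 0;
        String token = null;
        for (int page = 0; maxPages > page; page++) {
            Page result = fetcher.fetch(token);
            total += result.items.length;
            token = result.nextToken;
            if (token == null) {
                break; // reached the last page
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Fake fetcher that never runs out of pages, mimicking a runaway token chain.
        PageFetcher endless = token -> new Page(new String[] {"item"}, "more");
        System.out.println("items seen within budget: " + countItems(endless, 3));
    }
}
```

&lt;p&gt;A wall-clock deadline could be used in place of, or in addition to, the page count; the point is only that the loop has a bound that does not depend on what the service returns.&lt;/p&gt;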
&lt;/li&gt;
&lt;li&gt;Avoid carrying multi-table relationships into the cloud in the form of multi-domain relationships. Try to denormalize these relationships into single items. Doing joins in the application tier might require multiple round-trips to SDB and expose customer-facing functionality to timeouts.&lt;/li&gt;
&lt;li&gt;Remember that there are no sequences, locks, constraints (except for the uniqueness constraint on the item name), triggers, etc. in SimpleDB. Don’t expect them.&lt;/li&gt;
&lt;/ol&gt;</description><link>http://practicalcloudcomputing.com/post/722637844</link><guid>http://practicalcloudcomputing.com/post/722637844</guid><pubDate>Mon, 21 Jun 2010 11:40:00 -0700</pubDate><category>SimpleDB</category><category>Amazon</category><category>AWS</category></item><item><title>SimpleDB Essentials for High Performance Users : Part 2</title><description>&lt;p&gt;&lt;img width="689" height="290" src="http://espritnoir.files.wordpress.com/2007/03/first_ch.gif"/&gt;&lt;/p&gt;
&lt;p&gt;This is Part 2 of SimpleDB Essentials for High Performance Users. Check out &lt;a title="Part 1" href="http://bit.ly/cpqaXC"&gt;Part 1&lt;/a&gt;&lt;/p&gt;
&lt;!-- more --&gt;
&lt;ol start="1"&gt;
&lt;li&gt;Beware of Case-sensitivity&lt;ol start="1"&gt;
&lt;li&gt;Since domain names and attribute names are case-sensitive, for all domain and attribute names, use uppercase lettering and separate words with “_”&lt;/li&gt;
&lt;li&gt;When sharding domains, adopt zero-based index numbering and separate it from the root name with “_”&lt;ol start="1"&gt;
&lt;li&gt;&lt;span mce_style="color: #0000ff;"&gt;e.g. MY_DOMAIN_0, MY_DOMAIN_1, … , MY_DOMAIN_99&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;li&gt;Shard Domains&lt;br/&gt;&lt;ol start="1"&gt;
&lt;li&gt;Since writes to a SimpleDB domain are throttled, shard your domains to scale write traffic if you expect to do more than 70 singleton puts/second at any point in the future&lt;/li&gt;
&lt;li&gt;Also, if you plan to ever need more than 1 billion attributes or 10GB of space, shard your domains — use a rule of 10x for estimating customer-related growth &lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;li&gt;Avoid Non-Indexed Queries&lt;ol start="1"&gt;
&lt;li&gt;Avoid queries like &lt;span mce_style="color: #0000ff;"&gt;select * from MY_DOMAIN where SOME_ATTRIBUTE is NULL&lt;/span&gt;. These queries are not indexed and can result in an unpredictably long cycle of null results and next-pointer following. If you plan to do this query for customer-facing applications, you must use a pseudo-null, i.e. a placeholder value written in place of a missing attribute. This will allow you to do queries like &lt;span mce_style="color: #0000ff;"&gt;select * from MY_DOMAIN where SOME_ATTRIBUTE = 'my-pseudo-null'&lt;/span&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;li&gt;Be Aware of Eventual Consistency&lt;ol start="1"&gt;
&lt;li&gt;If you require transactional guarantees for certain data, use Conditional Puts. You can optionally also use Consistent Reads, but it is not required.&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;li&gt;Use Batch Puts when possible for optimal write performance&lt;br/&gt;&lt;ol start="1"&gt;
&lt;li&gt;If you find that you are doing multiple item puts to the same domain, consider using the Batch Put API instead. You can submit up to 25 items (or 256 attributes or 1MB request size) to a single domain using the Batch Put API&lt;/li&gt;
&lt;li&gt;The only additional constraint is that the item names must be unique in the request, but this makes sense as there is no implied order of the items in the request&lt;/li&gt;
&lt;li&gt;The Batch Put is an all-or-nothing API. If it fails, then none of the items have been written. If it succeeds, then all of the items have been written. Partial success is not possible.&lt;/li&gt;
&lt;li&gt;When writing 25 items at a time, I have seen a 20-25x improvement in write-throughput when using Batch Put over Singleton Put&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;/ol&gt;
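&lt;p&gt;The sharding and batching conventions above can be sketched as below. This is a minimal illustration, not a SimpleDB client: ShardRouter and its methods are hypothetical names, the hash-based routing is one possible choice, and the batch size of 25 is taken from the limit quoted in the list. Items would first be grouped by target domain, since a Batch Put goes to a single domain.&lt;/p&gt;

```java
import java.util.Arrays;

/** Sketch: route items to sharded domains and group puts into batches. */
final class ShardRouter {
    // Assumption: 25 is the Batch Put item limit quoted in the text.
    static final int BATCH_SIZE = 25;

    /** Maps an item name to a shard such as MY_DOMAIN_0 .. MY_DOMAIN_99. */
    static String domainFor(String rootName, String itemName, int shardCount) {
        // floorMod keeps the index non-negative even for negative hash codes.
        int shard = Math.floorMod(itemName.hashCode(), shardCount);
        return rootName + "_" + shard;
    }

    /** Number of Batch Put requests needed for itemCount items in one domain. */
    static int batchCount(int itemCount) {
        return (itemCount + BATCH_SIZE - 1) / BATCH_SIZE;
    }

    /** Returns the item names that belong in batch number batchIndex (zero-based). */
    static String[] batch(String[] itemNames, int batchIndex) {
        int from = batchIndex * BATCH_SIZE;
        int to = Math.min(from + BATCH_SIZE, itemNames.length);
        return Arrays.copyOfRange(itemNames, from, to);
    }

    public static void main(String[] args) {
        String[] names = new String[60];
        for (int i = 0; i != names.length; i++) {
            names[i] = "item-" + i;
        }
        System.out.println("item-0 goes to " + domainFor("MY_DOMAIN", names[0], 100));
        for (int b = 0; b != batchCount(names.length); b++) {
            System.out.println("batch " + b + " holds " + batch(names, b).length + " items");
        }
    }
}
```

&lt;p&gt;Because the shard is derived from the item name, a reader can compute the same domain later without a lookup table. Changing the shard count later moves items between domains, so it is worth picking the count with the 10x growth rule in mind.&lt;/p&gt;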
&lt;p&gt;&lt;span class="status-body"&gt;&lt;span class="status-content"&gt;&lt;span class="entry-content"&gt; Be sure to check out &lt;a href="http://bit.ly/bdGGWZ"&gt;Part 3&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;</description><link>http://practicalcloudcomputing.com/post/722621724</link><guid>http://practicalcloudcomputing.com/post/722621724</guid><pubDate>Mon, 21 Jun 2010 11:33:00 -0700</pubDate><category>SimpleDB</category><category>Amazon</category><category>AWS</category></item><item><title>SimpleDB Essentials for High Performance Users : Part 1</title><description>&lt;p&gt;&lt;strong&gt; &lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;
&lt;p&gt;&lt;strong&gt;&lt;img width="300" height="300" align="middle" src="http://www.everypicture.com/shop/books/038aa2dd606d94b1929e642d4757a7d4/calvin-and-hobbes.jpg"/&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;span mce_style="color: #3366ff;"&gt;Preamble&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I’ve been a heavy user of SimpleDB since January 2009, storing, writing, and reading billions of items. Based on my experience, I’ve compiled a list of best practices and conventions to simplify working with SimpleDB. I’ve divided this into multiple parts to ease readability.&lt;/p&gt;
&lt;!-- more --&gt;
&lt;p&gt;&lt;strong&gt;
&lt;p&gt;&lt;strong&gt;&lt;span mce_style="color: #0000ff;"&gt;Details&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;/strong&gt;&lt;/p&gt;
&lt;ol start="1"&gt;
&lt;li&gt;Since sorting is lexicographical, if you plan on sorting by certain attributes, then &lt;ol start="1"&gt;
&lt;li&gt;zero-pad logically-numeric attributes&lt;/li&gt;
&lt;li&gt;use Joda time (&lt;span mce_style="color: #008000;"&gt;i.e. ISO8601 format &amp; Zulu time zone&lt;/span&gt;) to store logical dates as Joda time formats are human readable and lexicographically sortable&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;li&gt;When storing dates, store all of them in Joda time and use a single time zone. I recommend the Zulu time zone (&lt;span mce_style="color: #008000;"&gt;i.e. UTC, a.k.a. GMT&lt;/span&gt;)&lt;/li&gt;
&lt;li&gt;Use a naturally-occurring (&lt;span mce_style="color: #008000;"&gt;sometimes composite&lt;/span&gt;) unique key for the item name&lt;ol start="1"&gt;
&lt;li&gt;This can speed up by 2 orders of magnitude (&lt;span mce_style="color: #008000;"&gt;i.e. tens of milliseconds vs. seconds&lt;/span&gt;) any queries that would otherwise need to “AND” conditions on several attributes. This is because set comparisons (&lt;span mce_style="color: #008000;"&gt;i.e. those involved in multi-attribute WHERE clauses&lt;/span&gt;) are costly in SimpleDB&lt;br/&gt;&lt;ol start="1"&gt;
&lt;li&gt;As a contrived example, instead of &lt;strong&gt;select * from MY_DOMAIN where FIRST_NAME='Sid' and LAST_NAME='Anand'&lt;/strong&gt; use &lt;strong&gt;select * from MY_DOMAIN where itemName() = 'Sid:Anand'&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;li&gt;If you don’t have a naturally-occurring unique key, then use a GUID or UUID for the item name. In the RDBMS world, people use DB sequences. In the cloud, UUID or GUID is the way to go. It can be computed local to your SimpleDB client, and local is always better.&lt;/li&gt;
&lt;li&gt;Favor Composite-Value Attributes to speed up Selects&lt;br/&gt;&lt;ol start="1"&gt;
&lt;li&gt;Basically, “AND”ing in the WHERE clause is slow (&lt;span mce_style="color: #008000;"&gt;i.e. see the earlier entry&lt;/span&gt;). This is a major bug IMHO and needs to be fixed. For small data sets, the work-around is to create composite-value attribute columns to avoid the costly set intersection &lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;/ol&gt;
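&lt;p&gt;As a rough illustration of the sorting and naming conventions above, the sketch below zero-pads numbers, formats timestamps as ISO 8601 in UTC, and builds a composite item name. It uses the JDK’s java.time classes as a stand-in for Joda time, and the class name, method names, and the “:” separator are assumptions made for the example.&lt;/p&gt;

```java
import java.time.Instant;

/** Sketch: encode values so that lexicographic order matches logical order. */
final class SortableValues {
    /** Zero-pads a non-negative number to a fixed width so strings sort numerically. */
    static String padNumber(long value, int width) {
        return String.format("%0" + width + "d", value);
    }

    /**
     * ISO 8601 timestamp in UTC (Zulu) for a whole number of epoch seconds.
     * Whole seconds keep every timestamp the same length, which keeps them sortable.
     */
    static String utcTimestamp(long epochSeconds) {
        return Instant.ofEpochSecond(epochSeconds).toString();
    }

    /**
     * Composite item name built from natural key parts, e.g. Sid:Anand.
     * Assumption: the parts themselves never contain the ":" separator.
     */
    static String itemName(String first, String last) {
        return first + ":" + last;
    }

    public static void main(String[] args) {
        // Unpadded, "9" sorts after "10"; padded, the order is numeric again.
        System.out.println("unpadded compare: " + Integer.signum("9".compareTo("10")));
        System.out.println("padded compare: "
                + Integer.signum(padNumber(9, 5).compareTo(padNumber(10, 5))));
        System.out.println(utcTimestamp(0));
        System.out.println(itemName("Sid", "Anand"));
    }
}
```

&lt;p&gt;The padding width has to be chosen up front and applied to every write, since a value stored with a different width will sort incorrectly against the rest.&lt;/p&gt;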
&lt;p&gt; Be sure to check out &lt;a href="http://bit.ly/bPT5JP"&gt;Part 2&lt;/a&gt; &lt;/p&gt;</description><link>http://practicalcloudcomputing.com/post/712653349</link><guid>http://practicalcloudcomputing.com/post/712653349</guid><pubDate>Fri, 18 Jun 2010 14:16:00 -0700</pubDate><category>Amazon</category><category>SimpleDB</category><category>AWS</category><category>Cloud</category></item><item><title>A Java Out-of-Memory Error involving GZIP, Typica, and SimpleDB</title><description>&lt;p&gt;&lt;strong&gt;UPDATE&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;I am providing an update here to the root cause.&lt;/em&gt;&lt;strong&gt;&lt;br/&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Overview&lt;/strong&gt;&lt;br/&gt;I ran into an interesting Out of Memory bug this week. It occurs if you use gzip to send/receive data and under-utilize your Java Heap memory. This land-mine has existed since 2004, though hopefully you will not be bitten by it.&lt;br/&gt;&lt;br/&gt;&lt;strong&gt;Problem Stack&lt;/strong&gt;&lt;br/&gt;A Java process was throwing the following Out-of-Memory Error.&lt;/p&gt;
&lt;pre&gt;JVMDUMP013I Processed Dump Event "uncaught", detail "java/lang/OutOfMemoryError".&lt;br/&gt;Exception in thread "SDB WriterPool_4_rentalusers_incremental-thread-1" java.lang.OutOfMemoryError: ZIP004:OutOfMemoryError, MEM_ERROR in inflateInit2&lt;br/&gt;at java.util.zip.Inflater.init(Native Method)&lt;br/&gt;at java.util.zip.Inflater.&amp;lt;init&amp;gt;(Inflater.java:105) &lt;br/&gt;at java.util.zip.ZipFile.getInflater(ZipFile.java:416) &lt;br/&gt;at java.util.zip.ZipFile.getInputStream(ZipFile.java:359) &lt;br/&gt;at java.util.zip.ZipFile.getInputStream(ZipFile.java:324) &lt;br/&gt;at java.util.jar.JarFile.getInputStream(JarFile.java:467) &lt;br/&gt;at sun.net.www.protocol.jar.JarURLConnection.getInputStream(JarURLConnection.java:165) &lt;br/&gt;at java.net.URL.openStream(URL.java:1041) &lt;br/&gt;at java.lang.ClassLoader.getResourceAsStream(ClassLoader.java:455) &lt;br/&gt;at com.xerox.amazonws.common.AWSQueryConnection.&amp;lt;init&amp;gt;(AWSQueryConnection.java:102) &lt;br/&gt;at com.xerox.amazonws.sdb.Domain.&amp;lt;init&amp;gt;(Domain.java:72)&lt;br/&gt;at com.xerox.amazonws.sdb.SimpleDB.getDomain(SimpleDB.java:202)&lt;br/&gt;....&lt;br/&gt;at java.lang.Thread.run(Thread.java:803)&lt;br/&gt;&lt;br/&gt;&lt;/pre&gt;
&lt;!-- more --&gt;
&lt;p&gt;&lt;strong&gt;Figuring it Out&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;This &lt;strong&gt;OutOfMemoryError &lt;/strong&gt;is not &lt;strong&gt;java.lang.OutOfMemoryError: Java heap space&lt;/strong&gt;. The latter is thrown when the memory occupied by all of your reachable Java objects exceeds the max heap -Xmx setting&lt;/li&gt;
&lt;li&gt;Inflater is a JNI class used for gunzipping compressed files. Deflater is its counterpart for gzipping&lt;/li&gt;
&lt;li&gt;In the example above, the Typica library is using the Inflater to get to a version file inside the Typica.jar. Every time a batch put or some other call is made and you create a new Domain object, Typica is mindlessly unjarring the Typica jar to get this file. (Why doesn’t Typica cache this version file once?).&lt;/li&gt;
&lt;li&gt;There is a limited amount of native heap memory on the system. It must be shared by &lt;ol&gt;
&lt;li&gt;Your Java program (JVM included)&lt;/li&gt;
&lt;li&gt;Other user programs &lt;/li&gt;
&lt;li&gt;Native libraries called by your Java program&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;li&gt;The Inflater object takes little memory inside the JVM and quite a lot of memory outside the JVM in the native heap&lt;/li&gt;
&lt;li&gt;The Inflater’s finalize() cleans up the memory outside the JVM in the native heap&lt;/li&gt;
&lt;li&gt;finalize() runs only after a garbage collection (minor or major) finds the object unreachable. If no collection happens for a while, the finalizers won’t run and you risk running out of native heap memory&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;br/&gt;This is what was happening! Since we were GCing only every 40-60 minutes, Inflater’s finalizers were not run often enough to free the native JNI memory, and we ran out of native heap memory. &lt;br/&gt;&lt;br/&gt;The bug has been around since 2004. Here it is: &lt;a href="http://bugs.sun.com/view_bug.do?bug_id=4797189"&gt;http://bugs.sun.com/view_bug.do?bug_id=4797189&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Fix/Workaround&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;If you must use Typica, cache the Domain objects. Every time a Domain object is created, you are unjarring the typica.jar to get the version file. As a work-around, you can also reduce your Heap Memory so that finalizers run more often. As for me, I stopped using Typica.&lt;/p&gt;
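&lt;p&gt;The sketch below is not a fix for Typica itself. It only illustrates, for code that creates its own Inflater or Deflater objects, how calling end() in a finally block releases the native memory immediately instead of waiting for a finalizer. The class and method names are made up for the example.&lt;/p&gt;

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Sketch: release zlib's native memory explicitly rather than via finalize(). */
final class ExplicitZipCleanup {
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[256];
            while (!deflater.finished()) {
                int n = deflater.deflate(buffer);
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // frees the native buffers right away
        }
    }

    static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[256];
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0) {
                    if (!inflater.finished()) {
                        // No progress: the input is truncated or needs a dictionary.
                        throw new IllegalArgumentException("incomplete compressed input");
                    }
                }
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt compressed input", e);
        } finally {
            inflater.end(); // frees the native buffers right away
        }
    }

    public static void main(String[] args) {
        byte[] original = "hello hello hello hello".getBytes(StandardCharsets.UTF_8);
        byte[] restored = decompress(compress(original));
        System.out.println("round trip ok: " + Arrays.equals(original, restored));
    }
}
```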
&lt;p&gt;&lt;br/&gt;&lt;strong&gt;Other links&lt;/strong&gt;&lt;br/&gt;Some links that I skimmed and found useful, though they are very detailed:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="http://www.ibm.com/developerworks/java/library/j-nativememory-aix"&gt;http://www.ibm.com/developerworks/java/library/j-nativememory-aix&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.ibm.com/developerworks/java/library/j-jni/#resources"&gt;http://www.ibm.com/developerworks/java/library/j-jni/#resources&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description><link>http://practicalcloudcomputing.com/post/444939181</link><guid>http://practicalcloudcomputing.com/post/444939181</guid><pubDate>Fri, 12 Mar 2010 22:58:00 -0800</pubDate><category>Bug</category><category>GZIP</category><category>JVM</category><category>Java</category><category>SimpleDB</category><category>Typica</category></item><item><title>SimpleDB Performance : 5 Steps to Achieving High Write Throughput </title><description>&lt;p&gt;I was recently tasked with &lt;a target="_blank" href="http://practicalcloudcomputing.com/post/284222088/forklift-1b-records"&gt;fork-lifting ~1 billion rows from Oracle into SimpleDB&lt;/a&gt;. I completed this forklift in November 2009 after many attempts. To make this as efficient as possible, I worked closely with Amazon’s SimpleDB folks to troubleshoot performance problems and create new APIs. I’d like to share some recommendations and observations.&lt;/p&gt;
&lt;p&gt;Although I have covered these recommendations in depth in a previous post (i.e. link above), I’d like to present a more succinct list of recommendations and observations here to maximize knowledge transfer.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Architecture&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;The architecture consists of a daemon (i.e. IR, for Item Replicator) that reads records out of Oracle and puts them into multiple SimpleDB domains. I’ve actually shown a second IR process that reads data out of SimpleDB for insertion into Oracle, but you should ignore it for the purpose of this discussion. When I refer to IR in this article, I mean the process replicating from Oracle to SimpleDB.&lt;/p&gt;
&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_kvni6zCf8O1qa94o2.png"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Recommendations&lt;/b&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Shard your data&lt;ol&gt;
&lt;li&gt;You can achieve much higher data access rates to multiple domains than to a single domain. Hence, rather than using a single domain, use multiple. This is because write traffic acts as if throttled or rate-limited at a domain level.&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;li&gt;Use slow-ramp up for writing&lt;ol&gt;
&lt;li&gt;AWS (SimpleDB) doesn’t like bursty writes and will often respond by throttling IR. When your data uploader starts up, have it slowly increase the write rate&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;li&gt;Use some sort of back-off strategy&lt;ol&gt;
&lt;li&gt;I’ve adopted Amazon’s recommendation for retry intervals (i.e. 250ms, 500ms, 1s, 2s). Essentially, wait 250 milliseconds after the first failure before retrying, 500 milliseconds after the second failure, and so on. After the 3rd retry attempt, stick to 2-second idle intervals.&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;li&gt;Use BatchPutAttributes instead of the singleton PutAttributes&lt;ol&gt;
&lt;li&gt;This will get you an order-of-magnitude improvement in throughput&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;li&gt;Set replace=false on puts &lt;ol&gt;
&lt;li&gt;This is the default. If you know that you are strictly always inserting unique records, puts with replace=false will run much faster than replace=true&lt;/li&gt;
&lt;li&gt;Also, since this is the default, Amazon recommends that users not set replace=false at all&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;/ol&gt;
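&lt;p&gt;The back-off schedule in recommendation 3 can be sketched as below. The Backoff class, the Attempt interface, and the retry helper are hypothetical names for this example; the actual write call is whatever your SimpleDB client provides.&lt;/p&gt;

```java
/** Sketch: capped exponential back-off using the 250ms, 500ms, 1s, 2s schedule. */
final class Backoff {
    static final long FIRST_DELAY_MS = 250;
    static final long MAX_DELAY_MS = 2000;

    /** Delay to wait after the given number of consecutive failures (1-based). */
    static long delayMillis(int failures) {
        long delay = FIRST_DELAY_MS;
        for (int i = 1; failures > i; i++) {
            delay = Math.min(delay * 2, MAX_DELAY_MS);
        }
        return delay;
    }

    /** Hypothetical unit of work, e.g. one batch put; returns true on success. */
    interface Attempt {
        boolean run();
    }

    /** Runs the attempt up to maxAttempts times, sleeping between failures. */
    static boolean retry(Attempt attempt, int maxAttempts) {
        for (int failures = 1; maxAttempts >= failures; failures++) {
            if (attempt.run()) {
                return true;
            }
            if (failures == maxAttempts) {
                break; // no point sleeping after the final failure
            }
            try {
                Thread.sleep(delayMillis(failures));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        for (int failures = 1; 6 >= failures; failures++) {
            System.out.println("after failure " + failures + ": wait "
                    + delayMillis(failures) + " ms");
        }
    }
}
```

&lt;p&gt;Capping the delay at 2 seconds keeps a long outage from stretching the wait indefinitely, while the doubling at the start gives a throttled domain room to recover.&lt;/p&gt;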
&lt;p&gt;Feel free to follow me on Twitter (&lt;a target="_blank" href="http://twitter.com/r39132"&gt;@r39132&lt;/a&gt;).&lt;/p&gt;</description><link>http://practicalcloudcomputing.com/post/313922691</link><guid>http://practicalcloudcomputing.com/post/313922691</guid><pubDate>Sat, 02 Jan 2010 19:49:00 -0800</pubDate><category>SimpleDB</category><category>Performance</category></item><item><title>"The cloud" Explained for Normal People</title><description>&lt;p&gt;If you are like most people in software, you have heard the term “The cloud” but have no idea what it means. If you are industrious enough to buy a book, google the term, or troll twitter for related tweets, you are likely exasperated by the sheer marketing buzzword blast you encounter.&lt;/p&gt;
&lt;p&gt;To make it easier on you, I am going to tell you what it means to me with a very specific example:&lt;/p&gt;
&lt;p&gt;I am helping my company move into the cloud. Specifically, we are going to use most of Amazon’s AWS services.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Definitions&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;First, some abbreviations and definitions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;b&gt;AWS&lt;/b&gt; 
&lt;ul&gt;
&lt;li&gt;Amazon Web Services, a division of Amazon focusing on hosting our applications and data&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;b&gt;SimpleDB&lt;/b&gt; 
&lt;ul&gt;
&lt;li&gt;AWS’s always-available replacement for RDBMSs. Specifically SimpleDB is their hosted, replicated key-value store that is always available and accessible as a web service&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;b&gt;S3&lt;/b&gt; 
&lt;ul&gt;
&lt;li&gt;(a.k.a Simple Storage Service) AWS’s always-available file storage solution accessible as a web service&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;b&gt;SQS&lt;/b&gt; 
&lt;ul&gt;
&lt;li&gt;(a.k.a Simple Queue Service) AWS’s always-available queueing service accessible as a web service&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;b&gt;ELB&lt;/b&gt; 
&lt;ul&gt;
&lt;li&gt;(a.k.a Elastic Load Balancer) AWS’s always-available load balancing service accessible as a web service&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;b&gt;EC2&lt;/b&gt; 
&lt;ul&gt;
&lt;li&gt;(a.k.a Elastic Compute Cloud) AWS’s on-demand server offering accessible as a web service&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;b&gt;CloudFront&lt;/b&gt; 
&lt;ul&gt;
&lt;li&gt;AWS’s CDN (a.k.a Content Delivery Network) offering accessible as a web service&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;All of the services above are pay-as-you-go (and are very reasonably priced at that) and are accessible as web services. They are also always-available.&lt;/p&gt;
&lt;p&gt;So why would one use these services?&lt;/p&gt;
&lt;!-- more --&gt;
&lt;p&gt;&lt;b&gt;Why use these AWS services?&lt;/b&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;b&gt;SimpleDB&lt;/b&gt; 
&lt;ul&gt;
&lt;li&gt;Instead of storing your data in Oracle, Postgres, or MySQL, put it in SimpleDB. You will have to give up strong-data consistency and ACID transactionality, but you will gain high availability and throughput.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;b&gt;S3&lt;/b&gt; 
&lt;ul&gt;
&lt;li&gt;Want to store and access files and not worry about losing access to them temporarily or permanently due to hardware or software failures? Store them in S3&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;b&gt;SQS&lt;/b&gt; 
&lt;ul&gt;
&lt;li&gt;Want to use a messaging solution a la MQ series or WebLogic’s message queue servers? Worried about how much the hardware and licensing costs will run? Never mind! Use SQS and pay as you go.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;b&gt;EC2 and ELB&lt;/b&gt; 
&lt;ul&gt;
&lt;li&gt;Deciding how much money to earmark for data center maintenance and expansion over the next year? If you are running a start-up, you have no idea how your traffic will grow. Also, if you get a TechCrunch mention, your traffic can skyrocket, only to fall back to slightly-higher-than-earlier levels. Do you spend vast amounts of money on hardware to cover your peak traffic needs? Or do you just live with denying users access to your site during peaks? EC2 and ELB will spin up hardware when the traffic peaks and spin it down when traffic subsides. You are charged only for what you use.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;b&gt;CloudFront&lt;/b&gt; 
&lt;ul&gt;
&lt;li&gt;Are CDNs charging an arm and a leg? Do they require accurate estimates of your proposed usage over the next few months? Use CloudFront and pay as you go.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Basically, by moving these services out of your data center and into the cloud, you:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;No longer need to run and maintain a data center. You no longer need the associated staff&lt;/li&gt;
&lt;li&gt;Free yourself from some of the tax hassles associated with managing capital expenses (e.g. defining depreciation)&lt;/li&gt;
&lt;li&gt;Get a more scalable and available infrastructure without trying to build it yourself&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;b&gt;IAAS vs PAAS vs SAAS&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Amazon is an example of an IAAS (Infrastructure as a Service) provider. They provide a means to define and deploy your application and environment (i.e. OS and supporting packages) on demand to any number of hardware instances that you need. They do this via ELB + EC2. They also provide various cloud services (e.g. SQS, S3, SimpleDB, CloudFront, etc.)&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Definition of IAAS, PAAS,  &amp; SAAS 
&lt;ul&gt;
&lt;li&gt;&lt;a target="_blank" href="http://www.keithpij.com/Home/tabid/36/EntryID/27/Default.aspx"&gt;http://www.keithpij.com/Home/tabid/36/EntryID/27/Default.aspx&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;For a full taxonomy of the dizzying array of providers of IAAS, SAAS, PAAS, and Cloud Services, see the following link 
&lt;ul&gt;
&lt;li&gt; &lt;a target="_blank" href="http://peterlaird.blogspot.com/2008/09/visual-map-of-cloud-computingsaaspaas.html"&gt;http://peterlaird.blogspot.com/2008/09/visual-map-of-cloud-computingsaaspaas.html&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I hope you found this article useful. Feel free to follow me &lt;a href="http://twitter.com/r39132"&gt;@r39132&lt;/a&gt;&lt;/p&gt;</description><link>http://practicalcloudcomputing.com/post/311045887</link><guid>http://practicalcloudcomputing.com/post/311045887</guid><pubDate>Fri, 01 Jan 2010 01:29:00 -0800</pubDate><category>AWS</category><category>Amazon</category></item><item><title>HTML 5's Web Sockets explained</title><description>&lt;p&gt;&lt;img height="372" width="403" alt="CandH_balancing" src="http://i119.photobucket.com/albums/o149/Akiddbk/CalvinAndHobbs.jpg"/&gt;&lt;/p&gt;
&lt;p&gt;I’ve been reading a bit about HTML 5’s WebSockets lately.&lt;/p&gt;
&lt;p&gt;First, here are some definitions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;b&gt;Comet&lt;/b&gt; - an umbrella term referring to techniques that provide “server push” using standard browser functionality (i.e. without the aid of specialty browser plug-ins). In practice, Comet is most often implemented via Ajax with long polling.&lt;/li&gt;
&lt;li&gt;
&lt;b&gt;Long polling&lt;/b&gt; - (from Wikipedia) “With long polling, the client requests information from the server in a similar way to a normal poll. However, if the server does not have any information available for the client, instead of sending an empty response, the server holds the request and waits for some information to be available. Once the information becomes available (or after a suitable timeout), a complete response is sent to the client. ”&lt;/li&gt;
&lt;li&gt;
&lt;b&gt;Gateway&lt;/b&gt; - like a proxy, except that a gateway doesn’t alter the requests or responses that it ferries between browser and server.&lt;/li&gt;
&lt;/ul&gt;
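&lt;p&gt;To make the long-polling definition concrete, here is a minimal in-process sketch in Python. It is not a real HTTP server: a queue stands in for the data the server is waiting on, and the function name long_poll and the timeout values are assumptions chosen purely for illustration.&lt;/p&gt;
&lt;pre&gt;
```python
import queue
import threading

def long_poll(pending, timeout_seconds):
    """Hold the 'request' until data arrives or the timeout expires."""
    try:
        # Block instead of returning an empty response right away.
        return pending.get(timeout=timeout_seconds)
    except queue.Empty:
        # Timeout: send a complete (empty) response; the client re-polls.
        return None

pending = queue.Queue()

# Nothing is available yet, so this poll is held and then times out.
first = long_poll(pending, timeout_seconds=0.05)

# A message arrives while the next poll is being held.
threading.Timer(0.05, pending.put, args=["price update"]).start()
second = long_poll(pending, timeout_seconds=2.0)
print(first, second)
```
&lt;/pre&gt;
&lt;p&gt;The first poll returns nothing after its timeout, while the second returns as soon as the message is queued, which is the behavior the Wikipedia definition describes.&lt;/p&gt;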
&lt;p&gt;With these definitions out of the way, what are WebSockets?&lt;/p&gt;
&lt;p&gt;WebSockets is a new proposal under HTML 5 to provide full-duplex, bi-directional client-server interaction over a single TCP connection. The proposal has two main goals:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;b&gt;Increasing web server connection scalability&lt;/b&gt; - web applications that leverage WebSockets use half as many web server connections as Comet-based applications do, because Comet requires separate upstream and downstream connections.&lt;/li&gt;
&lt;li&gt;
&lt;b&gt;Simplifying the coding task&lt;/b&gt; - the WebSocket API is much simpler to code against than XMLHttpRequest.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Sounds great! So, how do we use them?&lt;/p&gt;
&lt;p&gt;Unfortunately, WebSockets require browser support, and currently only Chrome supports them (since &lt;a target="_blank" href="http://blog.chromium.org/2009/12/web-sockets-now-available-in-google.html"&gt;version 4.0.249.0&lt;/a&gt;). As a stopgap, a company named &lt;a target="_blank" href="http://www.kaazing.com/"&gt;Kaazing&lt;/a&gt; provides a gateway for your existing browsers.&lt;/p&gt;
&lt;p&gt;To learn more about WebSockets, check out the links below:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A good how-to using WebSockets from Tornado’s developer:  
&lt;ul&gt;
&lt;li&gt;&lt;a target="_blank" href="http://bret.appspot.com/entry/web-sockets-in-tornado"&gt;&lt;a href="http://bret.appspot.com/entry/web-sockets-in-tornado"&gt;http://bret.appspot.com/entry/web-sockets-in-tornado&lt;/a&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;A short introduction to Web Sockets from Kaazing:  
&lt;ul&gt;
&lt;li&gt;&lt;a target="_blank" href="http://www.kaazing.org/confluence/display/KAAZING/What+is+an+HTML+5+WebSocket"&gt;&lt;a href="http://www.kaazing.org/confluence/display/KAAZING/What+is+an+HTML+5+WebSocket"&gt;http://www.kaazing.org/confluence/display/KAAZING/What+is+an+HTML+5+WebSocket&lt;/a&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;</description><link>http://practicalcloudcomputing.com/post/310994137</link><guid>http://practicalcloudcomputing.com/post/310994137</guid><pubDate>Fri, 01 Jan 2010 00:35:00 -0800</pubDate><category>Html5</category><category>Html 5</category><category>WebSockets</category><category>long-polling</category></item><item><title>Website Performance - Why you should care and what you can do!</title><description>&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_kvl97gMEFa1qa94o2.gif"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Why Does Performance Matter?&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Oftentimes, people speak interchangeably about web site performance, scalability, and availability. Although these 3 terms are related, they are distinct and unique. Here are their definitions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;b&gt;availability&lt;/b&gt; - what is the total length of time that [some part of] a web site is available during an hour/day/year?&lt;/li&gt;
&lt;li&gt;
&lt;b&gt;scalability&lt;/b&gt; - what is the largest number of concurrent users that a system can handle?&lt;/li&gt;
&lt;li&gt;
&lt;b&gt;performance&lt;/b&gt; - what is the [worst] perceived response time for a single user?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It should be pretty clear that for web sites that earn money (either through ads or via subscription-based services), lower availability translates into lower revenue. In the same way that you can lose revenue via an outage, you can lose revenue during traffic peaks if your website cannot handle those peaks.&lt;/p&gt;
&lt;p&gt;Hence, just as poor scalability and availability can reduce your top-line (revenue), can poor performance have a similar effect?&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;In 2006 Google’s tests showed that increasing load time by 0.5 seconds resulted in a 20% drop in traffic.&lt;/li&gt;
&lt;li&gt;In 2007 Amazon’s tests showed that for every 100ms increase in load time, sales would decrease 1%.&lt;/li&gt;
&lt;li&gt;This year (2009) Akamai (a CDN leader) revealed in a study that 2 seconds is the new threshold for eCommerce web page response times.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Hence, it does.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Where Should You Look for Performance Issues?&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Ask anyone who has worked on a web site or an enterprise system and they will say “Look at your database!”. Although that is true, the mistake people often make is stopping there. After tuning queries to their satisfaction, engineers seem to ignore 5-7 second website page load times. 80-90% of this time really comes from assembling the web page.&lt;/p&gt;
&lt;!-- more --&gt;
&lt;p&gt;&lt;b&gt;Front-end Engineering Best Practices&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;About 2 years ago, while analyzing website performance numbers, I noticed that 99% of Tomcat server response times were under 1 second. However, web pages would sometimes take 3 seconds to display. Where was this extra time spent? Luckily, I happened to attend a talk by &lt;a href="http://stevesouders.com/"&gt;Steve Souders&lt;/a&gt; a few weeks later -  he filled in the gaps. Here is a link to his front-end engineering best-practices. It’s a very long read, so use it as a reference:&lt;/p&gt;
&lt;p&gt;&lt;a target="_blank" href="http://developer.yahoo.com/performance/rules.html"&gt;&lt;a href="http://developer.yahoo.com/performance/rules.html"&gt;http://developer.yahoo.com/performance/rules.html&lt;/a&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Content Delivery Networks (a.k.a. CDNs)&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Most engineers are familiar with the benefits of caching, deploying caches in their data centers to speed up back-end response times (e.g. Memcached, Squid, RamDisk, etc…). However, few engineers recognize that a breed of caches outside of your data center can deliver equivalent performance gains. These are CDNs. Here’s a concise summary of how CDNs help you and a quick how-to on using Amazon’s CDN for the masses, dubbed CloudFront.&lt;/p&gt;
&lt;p&gt;&lt;a&gt;&lt;a href="http://www.labnol.org/internet/setup-content-delivery-network-with-amazon-s3-cloudfront/5446/"&gt;http://www.labnol.org/internet/setup-content-delivery-network-with-amazon-s3-cloudfront/5446/&lt;/a&gt;&lt;/a&gt;&lt;/p&gt;</description><link>http://practicalcloudcomputing.com/post/308687443</link><guid>http://practicalcloudcomputing.com/post/308687443</guid><pubDate>Wed, 30 Dec 2009 15:57:00 -0800</pubDate><category>Amazon</category><category>CDNs</category><category>Content Delivery Networks</category><category>CloudFront</category></item><item><title>Denial of Service (DoS) : Some Thoughts</title><description>&lt;p&gt;About a year ago, I had the opportunity to solve a class of Denial-of-Service attacks that were compromising our availability and scalability. During that investigation, I happened upon a revelation. That revelation led to a solution. I’ve since seen that learning applied to other systems, including Amazon’s SimpleDB, so I wanted to share it here.&lt;/p&gt;
&lt;p&gt;Consider the following scenario (also depicted below):&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;A web client issues an HTTP request to a web site&lt;/li&gt;
&lt;li&gt;The web site, upon receiving the request, attempts to determine if the current request is part of a larger DoS attack&lt;ol&gt;
&lt;li&gt;If so, a defense is executed&lt;/li&gt;
&lt;li&gt;If not, the web request follows a normal execution of business logic&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;li&gt;The web server returns a response to the web client&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_kvcln4Qkw01qa94o2.png"/&gt;&lt;/p&gt;
&lt;!-- more --&gt;
&lt;p&gt;It is relatively simple and inexpensive for a client to issue an HTTP request. The server on the other hand needs to do a lot more work, which may include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Reading data from databases&lt;/li&gt;
&lt;li&gt;Running algorithms and computations on that data&lt;/li&gt;
&lt;li&gt;Filling caches&lt;/li&gt;
&lt;li&gt;Creating an HTTP response&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These are more expensive than simply issuing an HTTP request - i.e. what the client does. While a server is fulfilling a client’s request, any of the following could be considered as reserved for that request:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A front-end server connection&lt;/li&gt;
&lt;li&gt;A fraction of a back-end (e.g. DB) connection if connections are pooled&lt;/li&gt;
&lt;li&gt;A fraction of a middle-tier (e.g. an internal web service) connection if connections are pooled&lt;/li&gt;
&lt;li&gt;CPU and memory resources on the front-end, middle-tier, and back-end servers&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;All of the resources above are in limited supply, so at any point of time, a web site can support a finite number of client requests.&lt;/p&gt;
&lt;p&gt;To DoS a site, all a client needs to do is to overrun that limit.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;How to DoS a Site?&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Find an operation on a website that typically takes a lot of time and bang on it with curl or wget. For example, search or login might be slow operations as they require a lot of work on the back-end. (Disclaimer: I am not suggesting that anyone DoS any site.)&lt;/p&gt;
&lt;p&gt;The principle behind DoS is very simple:&lt;/p&gt;
&lt;p&gt;Have the client do much less work than the server. A laptop in Starbucks can easily issue 4K HTTP requests per second. If those requests are expensive for a website to serve, 4K requests per second may overwhelm it.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;DoS Defense&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;How do you defend against DoS? Follow these rules:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Ensure that your DoS detection logic is run as early on as possible in your server’s request processing stack&lt;/li&gt;
&lt;li&gt;Ensure that all resources (except for front-end server connections) are O(1) with respect to request traffic under a DoS attack. &lt;ol&gt;
&lt;li&gt;This means that under attack conditions, web site resource consumption does not increase — not memory, not CPU, not back-end server connections, nothing. Your inbound client connections a.k.a. your front-end server connections can increase — that’s OK.&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;li&gt;If you detect a DoS attack, return the cheapest response possible as soon as possible: e.g. HTTP 503 - Service Unavailable.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The basic principle behind fending off a DoS attack is to make responding to attacking requests as cheap as possible so that the server or server farm can keep up with the client.&lt;/p&gt;
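&lt;p&gt;The three rules above can be sketched in a few lines of Python. This is only one possible shape, not a description of any production system: the fixed-size counter table, the bucket count, the per-window budget, and the function name handle_request are all assumptions made for illustration. The point is that the check runs first, touches only local fixed-size state, and answers with a 503 before any expensive work starts.&lt;/p&gt;
&lt;pre&gt;
```python
import zlib

NUM_BUCKETS = 1024        # fixed table size: memory does not grow with traffic
MAX_PER_WINDOW = 100      # hypothetical per-bucket request budget
WINDOW_SECONDS = 1        # hypothetical window length

counts = [0] * NUM_BUCKETS
window_start = [0] * NUM_BUCKETS

def handle_request(client_ip, now, business_logic):
    """Run the cheap DoS check before any expensive work."""
    bucket = zlib.crc32(client_ip.encode()) % NUM_BUCKETS
    current_window = int(now) // WINDOW_SECONDS
    if window_start[bucket] != current_window:
        # A new time window started: reset this bucket's counter.
        window_start[bucket] = current_window
        counts[bucket] = 0
    counts[bucket] += 1
    if counts[bucket] > MAX_PER_WINDOW:
        # Cheapest possible response: no database, no cache fill.
        return 503, "Service Unavailable"
    return 200, business_logic()

# One client sends 150 requests inside the same one-second window.
statuses = [handle_request("10.0.0.7", 0.5, lambda: "search results")[0]
            for _ in range(150)]
print(statuses.count(200), statuses.count(503))
```
&lt;/pre&gt;
&lt;p&gt;In this run the first 100 requests are served and the remaining 50 are rejected cheaply. Because clients are hashed into a fixed number of buckets, two clients can share a counter; that is the price of keeping memory constant in this sketch.&lt;/p&gt;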
&lt;p&gt;&lt;b&gt;My DoS Defense&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;To solve this class of DoS attacks, I ran a simple algorithm to detect the DoS attack. I made sure that the detection algorithm did not create connections or eat up memory or CPU — in effect, these resources were O(1) or constant. This meant relying on local data.&lt;/p&gt;
&lt;p&gt;If I detected a DoS attack, I cached some client information in a local cache. This caused subsequent requests to be locked out — a fast failure response was sent to the client.&lt;/p&gt;</description><link>http://practicalcloudcomputing.com/post/303981107</link><guid>http://practicalcloudcomputing.com/post/303981107</guid><pubDate>Sun, 27 Dec 2009 22:43:00 -0800</pubDate><category>DoS</category><category>Denial-of-Service</category><category>Amazon SimpleDB</category></item><item><title>SimpleDB Recommended Reading List (12/23/09)</title><description>&lt;p&gt;Below is a list of recommended reading to understand SimpleDB and other cloud-related topics. The reading list starts with distributed computing basics and ends with in-depth SimpleDB best-practices.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;a title="The CAP Theorem Distilled" target="_blank" href="http://bit.ly/5721Zc"&gt;The CAP Theorem Distilled&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a title="CNA Principle" target="_blank" href="http://bit.ly/4osfrv"&gt;The “Consistency, Not Accuracy” Principle&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a title="Eventual Consistency Non-techies" target="_blank" href="http://bit.ly/5WWlNG"&gt;Eventual Consistency Explained for Non-techies&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a title="Eventual Consistency Techies" target="_blank" href="http://bit.ly/4yZe9A"&gt;Eventual Consistency Explained for Techies&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a title="RDBMvsSDB Overview" target="_blank" href="http://bit.ly/7Kg5gH"&gt;RDBMS vs. SimpleDB Overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a title="Forklift 1 Billion Rows into SDB" target="_blank" href="http://bit.ly/7ymQuz"&gt;Cloud Tips : How to Efficiently Forklift 1 Billion Rows into SimpleDB&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a title="Oracle-SDB Hybrid P0" target="_blank" href="http://bit.ly/8kHGAb"&gt;Introducing the Oracle-SimpleDB Hybrid&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a title="Oracle-SDB P1" target="_blank" href="http://bit.ly/4vv5s6"&gt;The Oracle-SimpleDB Hybrid Part 1 : Pulling Data out of Oracle Efficiently&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a title="Oracle-SDB P2" target="_blank" href="http://bit.ly/7IZRSA"&gt;The Oracle-SimpleDB Hybrid Part 2 : Solving Eventual Consistency Problem&lt;/a&gt; &lt;/li&gt;
&lt;li&gt;&lt;a title="Oracle-SimpleDB P3" target="_blank" href="http://bit.ly/8Vc2JJ"&gt;The Oracle-SimpleDB Hybrid Part 3 : Defining the Oracle-SimpleDB Translation&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;</description><link>http://practicalcloudcomputing.com/post/298081186</link><guid>http://practicalcloudcomputing.com/post/298081186</guid><pubDate>Wed, 23 Dec 2009 22:34:00 -0800</pubDate><category>simpledb</category><category>Oracle</category><category>RDBMS</category><category>cloud computing</category><category>nosql</category></item><item><title>The Oracle-SimpleDB Hybrid Part 3 : Defining the SimpleDB-Oracle translation</title><description>&lt;p&gt;Preamble : See &lt;a target="_blank" href="http://practicalcloudcomputing.com/post/297236978/oraclesdbhybridp2"&gt;Part 2 : Solving the Eventual-Consistency Problem&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;When building a SimpleDB-Oracle (i.e. any key_value_store-RDBMS) hybrid system, translating between two very different data models presents a challenge. The challenge expands beyond the obvious ACID vs. BASE differences.&lt;/p&gt;
&lt;p&gt;Most RDBMSs support the following features:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Triggers&lt;/li&gt;
&lt;li&gt;Stored Procedures&lt;/li&gt;
&lt;li&gt;Constraints (e.g. integrity, foreign key, unique, etc…)&lt;/li&gt;
&lt;li&gt;Sequences &lt;/li&gt;
&lt;li&gt;Sequences used as Primary Keys&lt;/li&gt;
&lt;li&gt;Locks&lt;/li&gt;
&lt;li&gt;Tables without Primary Keys or Unique Keys or both&lt;/li&gt;
&lt;li&gt;Relationships between tables&lt;/li&gt;
&lt;/ul&gt;
&lt;!-- more --&gt;&lt;p&gt;These do not currently exist in SimpleDB. As we transition our applications to the cloud, the absence of some of these features is more problematic than that of others.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Unique Constraints and Primary Keys&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Sometimes, a relational table might not have a Primary Key, but might have a Unique Key — the Unique Key might be a composite of several columns. In such a model, when translating the RDBMS data to SimpleDB, specify the Item Key to be the RDBMS Unique Key.&lt;/p&gt;
&lt;p&gt;If an RDBMS table depends on a sequence, especially if the Unique Key or Primary Key contains a sequence, that can be a real problem in the cloud. A separate distributed sequence generator (e.g. one built on a consensus protocol such as Paxos) will be needed in the cloud as there is no way to generate sequences in SimpleDB. However, a Paxos-based generator can potentially be a bottleneck.&lt;/p&gt;
&lt;p&gt;Another option is to replace the Primary Key sequence (i.e. a number) with a GUID (i.e. a varchar2) in the RDBMS. We can generate GUIDs in the cloud easily as there is no need to gather consensus. This makes it easy to insert new records in both the cloud (i.e. SimpleDB) and the data center (i.e. Oracle). It also makes the translation between the two by the Incremental Replicator (IR) processes trivial.&lt;/p&gt;
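&lt;p&gt;As a small illustration of why GUIDs need no coordination, the Python sketch below generates keys independently on two writers. The choice of random (version 4) UUIDs and the function name new_record_key are assumptions; the post does not say which GUID flavor was used.&lt;/p&gt;
&lt;pre&gt;
```python
import uuid

def new_record_key():
    """Return a 36-character GUID string usable as a varchar2 key or an item name."""
    return str(uuid.uuid4())

cloud_key = new_record_key()        # e.g. generated by a writer in the cloud
datacenter_key = new_record_key()   # e.g. generated by a writer in the data center
print(cloud_key, datacenter_key)
```
&lt;/pre&gt;
&lt;p&gt;Neither writer consults the other or a shared sequence, yet a collision between the two keys is vanishingly unlikely.&lt;/p&gt;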
&lt;p&gt;&lt;b&gt;Relationships between Tables&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;SimpleDB does not recognize relationships between domains. You have 2 options:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Option 1: Keep separate tables as separate domains and do joins in the cloud application&lt;/li&gt;
&lt;li&gt;Option 2: Collapse multiple-table relationships into a single domain&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The second option is called denormalizing. SimpleDB, like most key-value stores, does not support join semantics, so one of these 2 options will be needed to overcome this deficit.&lt;/p&gt;
&lt;p&gt;The right option for a particular set of data really comes down to replication. If data in the 2 RDBMS tables is altered as part of the same DB transaction, then denormalize (i.e. Option 2). If not, then keep the separate RDBMS tables in separate domains and do an application-level join (i.e. Option 1).&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Model All Deletes as Soft-deletes&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;This has more to do with multi-master replication than with the translation between Oracle and SimpleDB, but keep it in mind. Hard-deletes cause confusion when a create, a delete, and a recreate of the same record happen in quick succession and the events are received out of order: the receiver cannot tell whether a missing record was deleted or has not been created yet. A soft-delete, which keeps the record and marks it as deleted, is one way to clear up this ambiguity.&lt;/p&gt;
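&lt;p&gt;Here is a minimal Python sketch of the soft-delete idea. It assumes that every event carries a version number and that the highest version wins; both the event layout and the function name apply_event are hypothetical and only serve the illustration.&lt;/p&gt;
&lt;pre&gt;
```python
def apply_event(store, event):
    """Apply a create or delete event; a delete only sets a flag."""
    current = store.get(event["id"])
    if current is not None and current["version"] >= event["version"]:
        return  # stale event that arrived late: ignore it
    store[event["id"]] = {
        "version": event["version"],
        "value": event.get("value"),
        "deleted": event["op"] == "delete",
    }

events = [
    {"id": 1, "op": "create", "version": 1, "value": "foo"},
    {"id": 1, "op": "delete", "version": 2},
    {"id": 1, "op": "create", "version": 3, "value": "bar"},
]

in_order, out_of_order, delete_first = {}, {}, {}
for event in events:
    apply_event(in_order, event)
for event in [events[2], events[0], events[1]]:
    apply_event(out_of_order, event)
for event in [events[1], events[0]]:
    apply_event(delete_first, event)
print(in_order == out_of_order, delete_first[1]["deleted"])
```
&lt;/pre&gt;
&lt;p&gt;Because the delete leaves a marked record behind, a create that arrives after its own delete is recognized as stale instead of resurrecting the row, and the in-order and out-of-order replicas end up identical.&lt;/p&gt;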
&lt;p&gt;&lt;b&gt;Discussion&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;These were the main pain points I encountered. I could work around many of the others. If you have any questions, feel free to note them here or DM me at @r39132 (Twitter).&lt;/p&gt;</description><link>http://practicalcloudcomputing.com/post/298018874</link><guid>http://practicalcloudcomputing.com/post/298018874</guid><pubDate>Wed, 23 Dec 2009 21:51:00 -0800</pubDate><category>SQL</category><category>NoSQL</category><category>SimpleDB</category><category>RDBMS</category><category>Oracle</category><category>cloud computing</category></item><item><title>The Oracle-SimpleDB Hybrid Part 2 : Solving the Eventual Consistency Problem</title><description>&lt;p&gt;Preamble: Read &lt;a target="_blank" href="http://practicalcloudcomputing.com/post/296371796/oraclesdbhybridp1"&gt;Part 1: Pulling Data out of Oracle Efficiently&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Creating the Oracle-SimpleDB Hybrid system is a challenge. For one, it is a multi-homed system, accepting writes in both the cloud (i.e. SimpleDB) and in our data center (i.e. Oracle). Secondly, the data center and the AWS region (i.e. US-east) are on opposite coasts of the US, with a network latency of ~50-100ms. Thirdly, clocks cannot be tightly synchronized across WANs via network protocols like NTP: NTP across a WAN introduces tens of milliseconds of inaccuracy, which may not be good enough to resolve all forms of conflict.&lt;/p&gt;
&lt;p&gt;To build a multi-homed system, we needed to keep our Oracle DB in our data center in sync with our SimpleDB domains in the East Coast region.&lt;/p&gt;
&lt;p&gt;This is a tough problem to solve. How consistent do we want the data? What if we shoot for strong-consistency?&lt;/p&gt;
&lt;p&gt;In order to build a strongly-consistent link between Oracle and SimpleDB, we could use dual-writes via a 2 Phase Commit (a.k.a. 2PC) protocol. However, 2PC over a 50-100ms link would be an availability bottleneck, and hence 2PC is not a viable option. Any consensus protocol would suffer from the same shortcoming.&lt;/p&gt;
&lt;p&gt;Since we cannot achieve Strong Consistency between Oracle and SimpleDB, can we achieve Eventual Consistency?&lt;/p&gt;
&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_kv4dx4m6iT1qa94o2.png"/&gt;&lt;/p&gt;
&lt;!-- more --&gt;
&lt;p&gt;In an Eventually-consistent system, writes that occur in one place (i.e. SimpleDB or Oracle) are replicated asynchronously by IR processes. This is unlike the synchronous writes that occur in a 2PC system.&lt;/p&gt;
&lt;p&gt;The biggest problem then becomes resolving data conflicts. In other words, when parallel writes are occurring to the same row in both Oracle and SimpleDB, how do you settle on a winner?&lt;/p&gt;
&lt;p&gt;Our business rules allowed us to adopt the &lt;a target="_blank" href="http://practicalcloudcomputing.com/post/279565694/consistency-not-accuracy-p1"&gt;Consistency, Not Accuracy Principle&lt;/a&gt;. For most data (i.e. non-financial data and other business critical data), we don’t need to be accurate in terms of a global clock or vector clocks. In effect, we don’t need to ensure that the most recent write wins! Clocks are not synchronized, hence we cannot rely on local times to determine when the most recent change occurred.&lt;/p&gt;
&lt;p&gt;All we need to do is to pick the same winner in both Oracle and SimpleDB. We need the 2 data sources to be in sync.&lt;/p&gt;
&lt;p&gt;If there is a network partition such that Oracle and SimpleDB cannot communicate with one another, then the 2 data sources will diverge as long as the partition exists. This is unavoidable. However, as soon as the communication link is reestablished, the system is expected to heal and the two data sources are expected to reach agreement.&lt;/p&gt;
&lt;p&gt;One night a few weeks ago, I came up with a solution to this problem in the form of an invariant — a condition that, if maintained, would ensure that SimpleDB and Oracle remained in sync.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Introducing the Consistency Invariant&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Consistency Invariant: candidate_row.version &gt; stored_row.version&lt;/p&gt;
&lt;p&gt;Any candidate_row being written to either SimpleDB or an RDBMS must have a version number greater than that of the existing stored record. If the candidate row does not have a higher version number, the writer must take exactly one of the following actions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Abandon the write&lt;/li&gt;
&lt;li&gt;Pick a higher version and retry the write&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;As long as this invariant is maintained in the system, the Oracle and SimpleDB components of this multi-homed system will not diverge. One special condition is imposed on replication : replication must abandon writes if the data it wants to replicate (i.e. candidate_row) has a lower version than the targeted stored_row in the replication destination.&lt;/p&gt;
&lt;p&gt;Example:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Start with a record in both Oracle and SimpleDB: ID=1, Some_field=foo, version=01/01/09 12:00:00:000&lt;/li&gt;
&lt;li&gt;Imagine a write to Oracle that results in: ID=1, Some_field=bar, version=01/01/09 12:10:00:000&lt;/li&gt;
&lt;li&gt;Imagine a write to SimpleDB that results in: ID=1, Some_field=kar, version=01/01/09 12:10:00:001&lt;/li&gt;
&lt;li&gt;Oracle to SimpleDB replication is abandoned as the candidate_row.version is smaller than the stored_row.version (i.e. 01/01/09 12:10:00:000 &lt; 01/01/09 12:10:00:001 )&lt;/li&gt;
&lt;li&gt;SimpleDB to Oracle replication succeeds so that Oracle now has: ID=1, Some_field=kar, version=01/01/09 12:10:00:001&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;At the end of step 5, both SimpleDB and Oracle have the same data, with the winner chosen based on local clock values. Due to clock-skew, the winner may not have been the most recent write in the eyes of an omniscient observer.&lt;/p&gt;
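&lt;p&gt;The five-step example can be replayed with a short Python sketch. Two dictionaries stand in for Oracle and SimpleDB, the function name try_write is hypothetical, and the version timestamps are rewritten into a fixed-width, sortable string format so that plain string comparison orders them correctly. A real store would also have to perform the check and the write atomically, which this sketch ignores.&lt;/p&gt;
&lt;pre&gt;
```python
def try_write(store, candidate):
    """Apply candidate only if candidate.version > stored.version."""
    stored = store.get(candidate["id"])
    if stored is None or candidate["version"] > stored["version"]:
        store[candidate["id"]] = dict(candidate)
        return True
    return False  # invariant not met: abandon the write

# Step 1: both stores start with the same record.
initial = {"id": 1, "some_field": "foo", "version": "2009-01-01 12:00:00.000"}
oracle = {1: dict(initial)}
simpledb = {1: dict(initial)}

# Steps 2 and 3: concurrent local writes on each side.
try_write(oracle, {"id": 1, "some_field": "bar", "version": "2009-01-01 12:10:00.000"})
try_write(simpledb, {"id": 1, "some_field": "kar", "version": "2009-01-01 12:10:00.001"})

# Steps 4 and 5: replicate in both directions.
oracle_to_sdb = try_write(simpledb, oracle[1])
sdb_to_oracle = try_write(oracle, simpledb[1])
print(oracle_to_sdb, sdb_to_oracle, oracle == simpledb)
```
&lt;/pre&gt;
&lt;p&gt;The Oracle to SimpleDB replication is abandoned, the SimpleDB to Oracle replication succeeds, and both stores end up holding the “kar” record.&lt;/p&gt;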
&lt;p&gt;&lt;b&gt;Implementation&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;In order to implement this system, we need some functionality in SimpleDB. Amazon’s SimpleDB team is working on making this a reality. Once it comes out, we will be able to build our Eventually Consistent system.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Discussion&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Without the anticipated Amazon API, we cannot build an eventually-consistent Hybrid system optimized for AP (in the CAP theorem sense). We would have to rely on dual-writes, defeating our goal to be highly-available. This pattern and Invariant will likely be the standard solution in time to come as more companies try to move existing RDBMS-based applications to the cloud.&lt;/p&gt;</description><link>http://practicalcloudcomputing.com/post/297236978</link><guid>http://practicalcloudcomputing.com/post/297236978</guid><pubDate>Wed, 23 Dec 2009 12:15:00 -0800</pubDate><category>AWS</category><category>Amazon</category><category>Cloud Computing</category><category>Oracle</category><category>RDBMS</category><category>SimpleDB</category></item><item><title>The Oracle-SimpleDB Hybrid Part 1 : Pulling data out of Oracle Efficiently</title><description>&lt;p&gt;Preamble : See my previous post titled &lt;a target="_blank" href="http://rooksfury.tumblr.com/post/285434697/multimasterreplicationbetweenoracleandsdb"&gt;“Introducing the Oracle-SimpleDB Hybrid”&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;In my previous post, I provided an overview of an Oracle-SimpleDB Hybrid system that I am building. It supports writes to multiple masters, replicates data between masters in single-digit seconds (i.e. in the absence of long-term network partitions), is eventually-consistent, and is designed for optimal AP — it survives Network Partitions and is highly available.&lt;/p&gt;
&lt;p&gt;My company already relies on Oracle databases. In order to transition to SimpleDB, we will need to move one application at a time into the cloud while keeping our service running. As this cannot happen overnight, we need to keep both SimpleDB and Oracle in sync.&lt;/p&gt;
&lt;p&gt;In &lt;b&gt;Part 1 : Pulling data out of Oracle Efficiently&lt;/b&gt;, I’m going to discuss one of 3 methods we have devised to replicate data out of an RDBMS. This method is called Trigger-oriented Incremental Replication (a.k.a TIR) and is depicted below in the bottom gray-box.&lt;/p&gt;
&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_kv44l4EiUL1qa94o2.png"/&gt;&lt;/p&gt;
&lt;p&gt;Before the Oracle-SimpleDB Hybrid system could go live, we needed to copy a lot of data from Oracle to SimpleDB. There were 2 distinct goals:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Copy historical data from Oracle to SimpleDB - i.e. a one-time data fork-lift&lt;/li&gt;
&lt;li&gt;Replicate incremental changes as they occur in the live system &lt;/li&gt;
&lt;/ol&gt;
&lt;!-- more --&gt;&lt;p&gt;&lt;b&gt;One-time Fork-lift : Phase 1&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;When you run a production OLTP Oracle table that has millions or billions of rows, a one-time fork-lift could mean capturing a large snapshot of the data and then using SimpleDB’s BatchPutAttributes API to load that large snapshot into SimpleDB. The upload could take several weeks. Also, you might need to involve DBAs and SAs to set the snapshot up on a separate host machine so as to not impact your OLTP machine.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Incremental Replication : Phase 2&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Once the fork-lift phase of the data replication is complete, you can start streaming new changes to SimpleDB as they happen in Oracle.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Introducing Trickle-lifting, a novel architecture&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Our team used a single architecture to solve both of these goals concurrently. In other words, rather than have 2 distinct phases, our architecture met the fork-lift and incremental replication requirements in a single phase with a single architecture.&lt;/p&gt;
&lt;p&gt;Essentially, the gray box on the bottom supported incremental replication (i.e. our second requirement). When we piggy-backed the Trickle-lift Architecture (i.e. the gray box on the top) on top of the TIR architecture, we were able to meet our first requirement as well. The Trickle-lift Architecture is a novel architectural alternative to traditional fork-lifting architectures.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Pros and Cons of Entire Architecture&lt;/b&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;b&gt;Pros&lt;/b&gt; 
&lt;ul&gt;
&lt;li&gt;No need for DBAs and SAs to set up a snapshot on a separate host&lt;/li&gt;
&lt;li&gt;Less development work is needed&lt;/li&gt;
&lt;li&gt;Ability to easily tune the data uploading rate&lt;/li&gt;
&lt;li&gt;No need to switch off between separate fork-lift and incremental replication phases&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;b&gt;Cons&lt;/b&gt; 
&lt;ul&gt;
&lt;li&gt;Use of triggers for all CRUD operations means that writes to the main table will now take longer and be more expensive in terms of system resources like CPU&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Trickle-lifting was used in the billion record upload task (see &lt;a target="_blank" href="http://rooksfury.tumblr.com/post/284222088/forklift-1b-records"&gt;earlier post&lt;/a&gt;)&lt;/p&gt;</description><link>http://practicalcloudcomputing.com/post/296371796</link><guid>http://practicalcloudcomputing.com/post/296371796</guid><pubDate>Tue, 22 Dec 2009 22:31:00 -0800</pubDate><category>Amazon</category><category>Fork-lift</category><category>Oracle</category><category>SimpleDB</category><category>cloud computing</category></item><item><title>Introducing the Oracle-SimpleDB Hybrid</title><description>&lt;p&gt;My company would like to migrate its systems to the cloud. As this will take several months, the engineering team needs to support data access in both the cloud and its data center in the interim. Also, the RDBMS system might be maintained until some functionality (e.g. Backup-Restore) is created in SimpleDB.&lt;/p&gt;
&lt;p&gt;To this end, for the past 9 months, I have been building an eventually-consistent, multi-master data store. This system consists of an Oracle replica and several SimpleDB replicas. As I near completion of this system, I’d like to share its design.&lt;/p&gt;
&lt;p&gt;Here’s the system:&lt;/p&gt;
&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_kv45cd1aJL1qa94o2.png"/&gt;&lt;/p&gt;
&lt;p&gt;We plan on accepting reads and writes in our data center (Oracle) and in our AWS region (SimpleDB). There are 2 Incremental Replicators (IRs) that transmit the changes between Oracle and SimpleDB. One replicates data from Oracle to SimpleDB, the other replicates data back from SimpleDB to Oracle.&lt;/p&gt;
&lt;!-- more --&gt;
&lt;p&gt;There were many tough challenges in building this system. Not only are we trying to translate data between a key-value store and an RDBMS (2 totally disjoint concepts), we are also trying to keep data in sync in the absence of a global or vector clock. Additionally, pulling large amounts of data out of your work-horse OLTP system is no easy task.&lt;/p&gt;
&lt;p&gt;Each of these topics was deep and often led to lengthy debates.&lt;/p&gt;
&lt;p&gt;The challenges can be summarized in several parts:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Part 1 : Pulling data out of Oracle Efficiently&lt;/li&gt;
&lt;li&gt;Part 2 : Solving the Oracle-SimpleDB Eventual Consistency Problem&lt;/li&gt;
&lt;li&gt;Part 3 : Defining the SimpleDB-Oracle translation&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These challenges will be discussed in the near future. Feel free to follow me or retweet me (@r39132) on Twitter. Also DM me if you have any questions.&lt;/p&gt;
&lt;p&gt;Cheers!&lt;/p&gt;
&lt;p&gt;- s -&lt;/p&gt;</description><link>http://practicalcloudcomputing.com/post/285434697</link><guid>http://practicalcloudcomputing.com/post/285434697</guid><pubDate>Tue, 15 Dec 2009 17:57:00 -0800</pubDate><category>Amazon</category><category>Amazon SimpleDB</category><category>Oracle</category><category>cloud</category></item><item><title>Cloud Tips: How to Efficiently Forklift 1 Billion Rows into SimpleDB</title><description>&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_kuofzvpIy51qa94o2.jpg"/&gt;&lt;/p&gt;
&lt;p&gt;About 9 months ago, I was tasked with fork-lifting a massive amount of data into Amazon’s SimpleDB in a short amount of time. I achieved it. Here’s what you need to know.&lt;/p&gt;
&lt;p&gt;If you read on, I’ll show you how to achieve data upload rates of around 10K items/second.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;i&gt;SimpleDB Basics&lt;/i&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;First of all, if you have 1 billion rows to upload, you will need more than 1 domain. This is because Amazon SDB imposes certain limits on how much data you can store in one domain : see &lt;a title="limits" target="_blank" href="http://docs.amazonwebservices.com/AmazonSimpleDB/latest/DeveloperGuide/index.html?SDBLimits.html"&gt;limits&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Without digressing too much, figure out the optimal domain sharding scheme for your data growth by keeping the following formula in mind:&lt;/p&gt;
&lt;p&gt;&lt;i&gt;Storage Usage = (ItemNamesSizeBytes + AttributeValuesSizeBytes + AttributeNamesSizeBytes)&lt;/i&gt;&lt;/p&gt;
&lt;p&gt;This is how Amazon computes your Storage Usage vis-a-vis their 10GB limit.&lt;/p&gt;
&lt;p&gt;Note: You might need to ask them to raise your domains per account beyond 100 if you find 100 domains is too few for your data growth.&lt;/p&gt;
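&lt;p&gt;As a rough sizing sketch in Python, the formula above can be applied to the data shape used later in this post (20-byte item names, 10 attributes of about 10 bytes each). The 10-byte attribute name length and the function names are assumptions for illustration, attribute names are counted once per item to stay conservative, and Amazon’s real accounting may add overhead that this ignores:&lt;/p&gt;
&lt;pre&gt;
```python
import math

DOMAIN_LIMIT_BYTES = 10 * 1024 ** 3  # the 10GB per-domain limit cited above


def item_size_bytes(item_name_bytes, attr_count, attr_name_bytes, attr_value_bytes):
    """Storage for one item: item name + attribute names + attribute values."""
    return item_name_bytes + attr_count * (attr_name_bytes + attr_value_bytes)


def domains_needed(item_count, per_item_bytes, limit_bytes=DOMAIN_LIMIT_BYTES):
    """Smallest number of domains that keeps every domain under the limit."""
    return math.ceil(item_count * per_item_bytes / limit_bytes)


per_item = item_size_bytes(20, 10, 10, 10)  # attribute name size is assumed
print(per_item, domains_needed(1_000_000_000, per_item))  # 220 21
```
&lt;/pre&gt;
&lt;p&gt;Storage alone would call for about 21 domains under these assumptions. Spreading the data over more domains than that is mainly a way to buy write throughput.&lt;/p&gt;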
&lt;!-- more --&gt;
&lt;p&gt;&lt;b&gt;&lt;i&gt;SimpleDB Throttling&lt;/i&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Now that you have created the requisite number of domains (say 100), you might think that you can just pump all this data into SimpleDB in a few hours.&lt;/p&gt;
&lt;p&gt;Wrong!&lt;/p&gt;
&lt;p&gt;Amazon imposes a concept of fairness on all users of the system. If you try to execute too many writes, Amazon starts returning &lt;b&gt;503 - Service Unavailable&lt;/b&gt; fast-fail responses to you. SDB throttles you.&lt;/p&gt;
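&lt;p&gt;A common client-side response to throttling is to retry with exponential backoff. The sketch below is a generic illustration, not the code I ran: ServiceUnavailable, put_with_backoff and fake_put are hypothetical names, and the delays are arbitrary.&lt;/p&gt;
&lt;pre&gt;
```python
import random
import time


class ServiceUnavailable(Exception):
    """Hypothetical stand-in for a 503 - Service Unavailable response."""


def put_with_backoff(put, max_attempts=5, base_delay=0.05, sleep=time.sleep):
    """Call put() until it succeeds, backing off exponentially after each 503."""
    for attempt in range(max_attempts):
        try:
            return put()
        except ServiceUnavailable:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            # Full jitter: wait a random time up to base_delay * 2^attempt.
            sleep(random.uniform(0, base_delay * 2 ** attempt))


# Usage with a fake put that is throttled twice before it succeeds.
responses = iter([ServiceUnavailable(), ServiceUnavailable(), "OK"])


def fake_put():
    result = next(responses)
    if isinstance(result, Exception):
        raise result
    return result


print(put_with_backoff(fake_put, sleep=lambda seconds: None))  # OK
```
&lt;/pre&gt;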
&lt;p&gt;How many puts can you execute per second per domain?&lt;/p&gt;
&lt;p&gt;My experience has been: &lt;b&gt;70 singleton puts/domain/sec&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;If you had 100 domains and could achieve 70 singleton puts/domain/sec, how long would it take you to forklift 1 Billion records?  &lt;b&gt;It would take about 1.6 days.&lt;/b&gt;&lt;/p&gt;
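&lt;p&gt;The arithmetic behind that estimate, as a small Python sketch (the function name is just for illustration, and it assumes the rate holds steady throughout):&lt;/p&gt;
&lt;pre&gt;
```python
def forklift_days(records, domains, puts_per_domain_per_sec):
    """Days needed to upload all records at a steady, unthrottled rate."""
    items_per_sec = domains * puts_per_domain_per_sec
    return records / items_per_sec / 86_400  # 86,400 seconds per day


print(round(forklift_days(1_000_000_000, 100, 70), 2))  # 1.65
```
&lt;/pre&gt;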
&lt;p&gt;In reality, the singleton put rate slows down as your domains fill up, so you don’t get this rate throughout.&lt;/p&gt;
&lt;p&gt;Earlier this year, Amazon unveiled &lt;a title="Batch Put" target="_blank" href="http://aws.amazon.com/about-aws/whats-new/2009/03/24/write-your-simpledb-data-faster-with-batch-put/"&gt;Batch Put&lt;/a&gt;. This gave me a 20x throughput improvement (I was inserting 20 items per batch put call) on nearly-empty domains. I did not trend how the batch put performed as my domains filled up, but my impression was that it was still much faster than the singleton put.&lt;/p&gt;
&lt;p&gt;Now, if you have used PutAttributes or BatchPutAttributes, you will know that you can specify “replace=true” or leave the default “replace=false” option on each item. Leaving the default is much, much more efficient. If you know that you are fork-lifting brand new data, then this is the way to go.&lt;/p&gt;
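&lt;p&gt;Here is a minimal Python sketch of how records might be spread across domains and grouped into 20-item batches before each BatchPutAttributes call. The hash-based sharding, the forklift_ domain prefix and the helper names are assumptions for illustration. The sketch only builds the batches and makes no AWS calls.&lt;/p&gt;
&lt;pre&gt;
```python
import zlib
from collections import defaultdict

BATCH_SIZE = 20  # items per BatchPutAttributes call, as in the text


def domain_for(item_name, domain_count, prefix="forklift_"):
    """Pick a domain from a stable checksum of the item name (hypothetical naming)."""
    return prefix + str(zlib.crc32(item_name.encode("utf-8")) % domain_count)


def batches_by_domain(items, domain_count, batch_size=BATCH_SIZE):
    """Yield (domain, batch) pairs, each batch holding at most batch_size items."""
    pending = defaultdict(list)
    for name, attributes in items:
        domain = domain_for(name, domain_count)
        pending[domain].append((name, attributes))
        if len(pending[domain]) == batch_size:
            yield domain, pending.pop(domain)
    for domain, batch in pending.items():  # flush the partially filled batches
        yield domain, batch


items = [("item-" + str(i), {"color": "red"}) for i in range(1000)]
batches = list(batches_by_domain(items, domain_count=100))
print(len(batches), max(len(batch) for _, batch in batches))
```
&lt;/pre&gt;
&lt;p&gt;Each (domain, batch) pair would then become one batch put request, sent with replace left at its default when the data is brand new.&lt;/p&gt;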
&lt;p&gt;&lt;b&gt;&lt;i&gt;SimpleDB Forklift Experiment&lt;/i&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;I ran the following experiment:&lt;/p&gt;
&lt;p&gt;&lt;i&gt;&lt;b&gt;Data Shape and Domain Count&lt;/b&gt;&lt;/i&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;i&gt;~100 domains&lt;/i&gt;&lt;/li&gt;
&lt;li&gt;&lt;i&gt;~10 attributes per item&lt;/i&gt;&lt;/li&gt;
&lt;li&gt;&lt;i&gt;~10 bytes per attribute&lt;/i&gt;&lt;/li&gt;
&lt;li&gt;&lt;i&gt;~20 byte key (20 byte item id)&lt;/i&gt;&lt;/li&gt;
&lt;li&gt;&lt;i&gt;~35 Million records&lt;/i&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;i&gt;&lt;b&gt;Other Info&lt;/b&gt;&lt;/i&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;i&gt;Default Replace=false on all items&lt;/i&gt;&lt;/li&gt;
&lt;li&gt;&lt;i&gt;Domains were empty to start with&lt;/i&gt;&lt;/li&gt;
&lt;li&gt;&lt;i&gt;20 items per BatchPutAttributes call&lt;/i&gt;&lt;/li&gt;
&lt;li&gt;&lt;i&gt;Source data from Oracle instance on west coast, SDB target on east coast &lt;/i&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;b&gt;&lt;i&gt;Resulting Performance&lt;/i&gt;&lt;/b&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;b&gt;All data forklifted in 55 minutes (i.e. Average Rate = 10-11K items/sec)&lt;/b&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;b&gt;Conclusion: &lt;/b&gt;&lt;/p&gt;
&lt;p&gt;You can achieve a pretty good rate of data upload into SimpleDB — 35M records in 55 minutes across the US (west coast to east coast).&lt;/p&gt;
&lt;p&gt;For my 1 Billion record task, I ran into a series of issues that kept me from sustaining the 10K items/sec upload rate. These include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Not being able to use replace=false (I had to use replace=true)&lt;/li&gt;
&lt;li&gt;The source Oracle DB was a bottleneck&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If you have any questions feel free to DM me at @r39132 (Twitter)&lt;/p&gt;</description><link>http://practicalcloudcomputing.com/post/284222088</link><guid>http://practicalcloudcomputing.com/post/284222088</guid><pubDate>Mon, 14 Dec 2009 21:04:00 -0800</pubDate><category>SimpleDB</category><category>Amazon SimpleDB</category><category>NoSQL</category><category>Forklift</category><category>Cloud</category><category>Tips</category><category>distributed computing</category></item><item><title>RDBMS vs SimpleDB Overview</title><description>&lt;p&gt;&lt;b&gt;&lt;i&gt;Enter the key-value store, exit the RDBMS&lt;/i&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Anyone who has worked directly or indirectly with a relational database will tell you that it would be foolish to build a business that didn’t use one to store your business’s data.&lt;/p&gt;
&lt;p&gt;One may argue whether MySQL or Oracle is the better choice, but would someone actually argue that an RDBMS (a.k.a. relational database) was not the best choice for storing your data?&lt;/p&gt;
&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_kv4be8I1MB1qa94o2.png"/&gt;&lt;/p&gt;
&lt;p&gt;Yes! There is a movement, the NoSQL movement, that is challenging the supremacy of RDBMSs for storing your data!&lt;/p&gt;
&lt;p&gt;Some of the NoSQL data stores are listed &lt;a title="here" target="_blank" href="http://en.wikipedia.org/wiki/NoSQL"&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Now 10 years after Eric Brewer’s game-changing introduction of the CAP theorem, a mass exodus is starting towards AP (availability + partition tolerance) and away from CP (strong consistency + partition tolerance).&lt;/p&gt;
&lt;p&gt;&lt;i&gt;&lt;img src="http://media.tumblr.com/tumblr_kumk1ag1js1qa94o2.jpg"/&gt;&lt;/i&gt;&lt;/p&gt;
&lt;p&gt;Amazon’s SimpleDB is one such alternative to an RDBMS. Simply put, it is a distributed, replicated, eventually-consistent, always-available, key-value store owned and operated (i.e. hosted) by Amazon’s Web Services division.&lt;/p&gt;
&lt;!-- more --&gt;
&lt;p&gt;In SimpleDB-speak, a domain is a table, attributes are columns, and items are records. So, you can speak about a domain named customer_addresses containing 1000 items (i.e. 1000 customer addresses), such that each item contains attributes like Street Name and Zip.&lt;/p&gt;
&lt;p&gt;SDB domains are sparse (i.e. schema-less) and SDB supports the following APIs:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;b&gt;select &lt;/b&gt;— a subset of the SQL 92 standard offering order by, but no group by or joins, etc…&lt;/li&gt;
&lt;li&gt;
&lt;b&gt;getAttributes&lt;/b&gt;(key, attributeNameList)&lt;/li&gt;
&lt;li&gt;
&lt;b&gt;putAttributes&lt;/b&gt;(key, attribute_name1=attribute_value1, attribute_name2=attribute_value2, ..)&lt;/li&gt;
&lt;li&gt;
&lt;b&gt;removeAttributes&lt;/b&gt;(key, attributeNameList)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;With the exception of the putAttributes() call, omission of the attributeNameList is allowed.&lt;/p&gt;
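&lt;p&gt;To make the data model concrete, here is a toy in-memory model of a single domain in Python. It is a sketch for illustration only: ToyDomain and its method names are hypothetical, it is not the SimpleDB client API, and it ignores select, limits and eventual consistency.&lt;/p&gt;
&lt;pre&gt;
```python
from collections import defaultdict


class ToyDomain:
    """In-memory stand-in for one SimpleDB domain (not the real service API)."""

    def __init__(self):
        # item name to attribute name to list of values; items need not share attributes
        self.items = defaultdict(lambda: defaultdict(list))

    def put_attributes(self, key, replace=False, **attributes):
        for name, value in attributes.items():
            if replace:
                self.items[key][name] = [value]      # overwrite existing values
            else:
                self.items[key][name].append(value)  # add another value

    def get_attributes(self, key, names=None):
        stored = self.items.get(key, {})
        wanted = list(stored) if names is None else names
        return {name: list(stored[name]) for name in wanted if name in stored}

    def remove_attributes(self, key, names=None):
        if names is None:
            self.items.pop(key, None)  # no list given: drop the whole item
            return
        item = self.items.get(key)
        if item is not None:
            for name in names:
                item.pop(name, None)


addresses = ToyDomain()
addresses.put_attributes("cust-1", street="1 Main St", zip="95032")
addresses.put_attributes("cust-2", zip="10001")  # sparse: no street attribute
print(addresses.get_attributes("cust-1", ["zip"]))  # {'zip': ['95032']}
```
&lt;/pre&gt;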
&lt;p&gt;I’ve been transitioning several tables to SimpleDB from Oracle over the past year. Here are some issues I have with it.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;b&gt;Backup-recovery &lt;/b&gt; 
&lt;ul&gt;
&lt;li&gt;Amazon doesn’t give users a way to back up their domains. If you should corrupt your data, God forbid, you cannot roll back to a previous good checkpoint.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;b&gt; High-availability&lt;/b&gt; 
&lt;ul&gt;
&lt;li&gt;When a write is committed, you should be able to read your data within seconds. WHEN A WRITE IS COMMITTED. This is the catch. If the node that you are writing to is being hammered (by you or someone else), you will receive ‘503 - Service Unavailable’ responses. In other words, you won’t be able to commit your writes.&lt;/li&gt;
&lt;li&gt;You may have to make several attempts to commit, during which your own site might become less-available.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;b&gt;Inter-region replication&lt;/b&gt; 
&lt;ul&gt;
&lt;li&gt;Currently, SimpleDB is shared between all availability-zones within a region but is not shared across regions. I don’t have a major qualm with this missing functionality, but you might. Essentially, you choose regions in order to achieve fast access to local data. In other words, your east coast customers should access SDB-US-East, while your west coast customers should access SDB-US-West. You could partition your data according to where your customers live (i.e. where they registered), or you could keep your data in sync between the East and West regions and route traffic based on IP to the nearest coast. Whatever you choose, you will need to solve this problem.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;b&gt;Benefits of SimpleDB over RDBMS &lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Even with these deficits, I am quite pleased with SimpleDB. If you are starting a new company and need to decide on an OLTP system, use SimpleDB. You won’t need to hire System Admins or DBAs. Your engineers won’t need to understand connection pool management, SQL injection, prepared statements, query optimization, or other DB-related performance handicaps.&lt;/p&gt;
&lt;p&gt;Your developers will simply make web service calls. There are some best-practices to follow, but they are few and simple to understand.&lt;/p&gt;
&lt;p&gt;If you are transitioning from an RDBMS to a key-value store, your decision is more complex. I’ll write on that at a later time.&lt;/p&gt;</description><link>http://practicalcloudcomputing.com/post/282783123</link><guid>http://practicalcloudcomputing.com/post/282783123</guid><pubDate>Sun, 13 Dec 2009 21:27:00 -0800</pubDate><category>SimpleDB</category><category>RDBMS</category><category>Amazon</category><category>Cloud computing</category><category>distributed computing</category></item><item><title>Eventual Consistency Explained for Techies</title><description>&lt;p&gt;&lt;b&gt;Preamble:&lt;/b&gt; Have a look at my previous article titled “Eventual Consistency Explained for Non-techies”&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Eventual and Weak Consistency &lt;/b&gt;&lt;/p&gt;
&lt;p&gt;SimpleDB is eventually consistent. Eventual consistency is a version of weak consistency — you may not see the latest writes committed to the system.&lt;/p&gt;
&lt;p&gt;Imagine that you have a system of N nodes. Of these, W nodes are involved in any write sent to the system and R nodes are contacted on any read from the system. Strong consistency can be achieved if R+W &gt; N. In other words, if the read sets and write sets overlap, the read can discover the most recent write to the system.&lt;/p&gt;
&lt;!-- more --&gt;
&lt;p&gt;Hence, there are 2 sides to this equation: to achieve consistent reads, you can increase either W or R until the sets overlap. There are 2 extreme cases:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;b&gt;Fastest Reads:&lt;/b&gt; R=1, W=N   
&lt;ul&gt;
&lt;li&gt;To read the latest write, only 1 node is contacted. However, writes need to be confirmed at all nodes before they are ACKed back to the client. Write performance suffers.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;b&gt;Fastest Writes:&lt;/b&gt; R=N, W=1   
&lt;ul&gt;
&lt;li&gt;Only one node is ever involved in a write. All nodes are contacted for the read and a quorum needs to be reached among all nodes. Read performance suffers.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
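&lt;p&gt;The overlap argument can be checked by brute force. The Python sketch below (the function name is hypothetical) enumerates every possible read set and write set on a small cluster and reports whether they are guaranteed to share at least one node:&lt;/p&gt;
&lt;pre&gt;
```python
from itertools import combinations


def sets_always_overlap(n, r, w):
    """True if every R-node read set shares a node with every W-node write set."""
    nodes = range(n)
    return all(
        not set(read_set).isdisjoint(write_set)
        for read_set in combinations(nodes, r)
        for write_set in combinations(nodes, w)
    )


for r, w in [(1, 3), (3, 1), (2, 2), (1, 1), (1, 2)]:
    print("N=3 R=%d W=%d overlap guaranteed: %s" % (r, w, sets_always_overlap(3, r, w)))
```
&lt;/pre&gt;
&lt;p&gt;For N=3, the first three combinations print True and the last two print False. Note that R=1, W=2 fails even though R+W equals N: the sum has to exceed N for the overlap to be guaranteed.&lt;/p&gt;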
&lt;p&gt;In an eventually consistent system, R+W &lt;= N. Read sets and Write sets are not guaranteed to overlap, so a read may not see the latest write. However, a gossip-style, lazy data propagation mechanism replicates writes to all nodes. Eventually all nodes will be consistent. In cases where there are conflicting versions for a datum, the system will choose one.&lt;/p&gt;</description><link>http://practicalcloudcomputing.com/post/282603471</link><guid>http://practicalcloudcomputing.com/post/282603471</guid><pubDate>Sun, 13 Dec 2009 19:12:00 -0800</pubDate><category>SimpleDB</category><category>Eventual Consistency</category><category>Amazon</category><category>Tech</category><category>distributed computing</category><category>Cloud Computing</category></item><item><title>Eventual Consistency Explained for Non-techies</title><description>&lt;p&gt;If you work in the Computer industry, especially the Internet industry, chances are good that you have encountered an eventually-consistent system.&lt;/p&gt;
&lt;p&gt;For example, when managing an internet or IT business, you might have considered one or all of the following DB architectures:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;b&gt;Use a single DB host&lt;/b&gt; &lt;ol&gt;
&lt;li&gt;e.g. MyHost&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;b&gt;Use a single DB host for your writes, but several for your reads &lt;/b&gt;&lt;ol&gt;
&lt;li&gt;e.g. MyWriteHost &amp; MyReadHost1, MyReadHost2, MyReadHost3, etc …&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;b&gt;Use multiple DB hosts &lt;/b&gt;&lt;ol&gt;
&lt;li&gt;e.g. MyHost1, MyHost2, etc….&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;b&gt;Use multiple DB hosts for your writes and your reads &lt;/b&gt;&lt;ol&gt;
&lt;li&gt;e.g. MyWriteHost1, MyWriteHost2, etc… &amp; MyReadHost1, MyReadHost2, etc, ….&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;These 4 choices represent an increasing degree of data traffic partitioning, with 1 having no partitioning and 4 having the most partitioning.&lt;/p&gt;
&lt;!-- more --&gt;
&lt;p&gt;In most eCommerce sites, the DB read traffic dwarfs the DB write traffic, typically 10-to-1. In such cases, architecture 2 makes a lot of sense. For example, a user might change his phone number from 555-5555 to 666-6666. 666-6666 will be written to MyWriteHost. An asynchronous background task will copy data from MyWriteHost to MyReadHost1 (for example) after a brief delay, also known as the Inconsistency Window. If the website tries to read the latest phone number from MyReadHost1, it will still see 555-5555 until the Inconsistency (time) Window expires. This is an eventually-consistent system.&lt;/p&gt;
&lt;p&gt;This is known as a read-write split architecture - companies like eBay heavily leverage such an architecture. Read-write split architectures are eventually-consistent.&lt;/p&gt;
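&lt;p&gt;The phone number example can be simulated in a few lines of Python. This is a toy sketch under simplifying assumptions: one write host, one read host, a fixed Inconsistency Window measured in whole ticks (at least 1), and a hypothetical ReadWriteSplit class.&lt;/p&gt;
&lt;pre&gt;
```python
class ReadWriteSplit:
    """Toy model: writes reach the read host only after a fixed number of ticks."""

    def __init__(self, window_ticks):
        self.window_ticks = window_ticks  # the Inconsistency Window, 1 tick or more
        self.write_host = {}
        self.read_host = {}
        self.in_flight = []  # [ticks remaining, key, value], in write order

    def write(self, key, value):
        self.write_host[key] = value
        self.in_flight.append([self.window_ticks, key, value])

    def tick(self):
        """Advance time by one tick and apply the copies whose delay has elapsed."""
        for copy in self.in_flight:
            copy[0] -= 1
        for remaining, key, value in self.in_flight:
            if remaining == 0:
                self.read_host[key] = value
        self.in_flight = [copy for copy in self.in_flight if copy[0] != 0]

    def read(self, key):
        return self.read_host.get(key)


db = ReadWriteSplit(window_ticks=2)
db.read_host["phone"] = db.write_host["phone"] = "555-5555"
db.write("phone", "666-6666")
print(db.read("phone"))  # 555-5555 (still inside the window)
db.tick()
db.tick()
print(db.read("phone"))  # 666-6666 (the copy has landed)
```
&lt;/pre&gt;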
&lt;p&gt;Another use-case that exhibits eventual-consistency is seen in Amazon’s SimpleDB. Such a system is actually an Active-Active HA storage system. Each node can fail-over to any other node and nodes are kept in sync by an asynchronous background thread.&lt;/p&gt;</description><link>http://practicalcloudcomputing.com/post/282570055</link><guid>http://practicalcloudcomputing.com/post/282570055</guid><pubDate>Sun, 13 Dec 2009 18:46:00 -0800</pubDate><category>Eventual Consistency</category><category>eBay</category><category>Amazon SimpleDB</category></item><item><title>The "Consistency, Not Accuracy" Principle</title><description>&lt;p&gt;Preamble: Read my post “The CAP Theorem distilled”&lt;/p&gt;
&lt;p&gt;In my previous post, I started talking about the “Consistency, Not Accuracy” Principle (a.k.a. The CNA Principle)&lt;/p&gt;
&lt;p&gt;Essentially, in order to scale your web site and to keep running amidst unpredictable network and system outages, you need to have a replicated, fault-tolerant data store that accepts reads and writes in multiple locations. One replica might be in California and another might be in Virginia. If California were to fall into the great Pacific, your web site should still work and your users should be none the wiser.&lt;/p&gt;
&lt;!-- more --&gt;
&lt;p&gt;However, this distributed data store design now presents the following predicament: How do you keep your replicas in Virginia and California in sync? Also, what does “in sync” actually mean?&lt;/p&gt;
&lt;p&gt;As described in my earlier post, throughput and latency don’t scale for 2-phase commit (a.k.a. 2PC) or other forms of pessimistic synchronous replication (e.g. fault tolerant causal broadcasts). In order to have a highly-available system, some portion of writes needs to be executed asynchronously. Hence some number of data inconsistencies will arise after writes commit.&lt;/p&gt;
&lt;p&gt;Resolving these inconsistencies, a complex task, can be made simpler if we accept the CNA principle. Essentially, it is more important for the data to be consistent between these two data store replicas (i.e. Virginia and California) than for the data to reflect the most recent update to the system (i.e. be accurate).&lt;/p&gt;
&lt;p&gt;For example, say you changed a blog entry at noon and then again at 12:01 pm. Assume that the modifications were different. According to this principle, it is possible for one of your changes to be lost. Also, the one that is lost might be the most recent edit.&lt;/p&gt;
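&lt;p&gt;One simple way to get this behavior is a last-writer-wins rule based on each replica’s local clock. The Python sketch below is an illustration of the principle only, not a description of any particular system: the resolve function, the tuple layout and the 2-minute clock skew are all assumptions.&lt;/p&gt;
&lt;pre&gt;
```python
def resolve(versions):
    """Pick one winner from conflicting versions, regardless of arrival order.

    Each version is (local_timestamp, replica_name, value); the largest tuple wins.
    """
    return max(versions)


# Two edits to the same blog entry. The Virginia clock is assumed to run 2 minutes
# slow, so the edit made at 12:01 real time is stamped 11:59.
noon_edit = ("12:00", "california", "first edit")
later_edit = ("11:59", "virginia", "second edit")

california_view = resolve([noon_edit, later_edit])  # edits arrive in one order
virginia_view = resolve([later_edit, noon_edit])    # and in the opposite order
print(california_view == virginia_view, california_view[2])  # True first edit
```
&lt;/pre&gt;
&lt;p&gt;Both replicas converge on the same value (consistent), yet the value they keep is the older edit (not accurate).&lt;/p&gt;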
&lt;p&gt;A system in which accuracy is required, such as a financial application, is vastly more complex and can require causal ordering (e.g. vector clocks), read repair, quorums, etc….&lt;/p&gt;
&lt;p&gt;Building a system that follows CNA is vastly simpler to understand and implement. If your site’s business model supports it (which it probably will), then I recommend it.&lt;/p&gt;
&lt;p&gt;In a future post, I’ll go into my specific experience with the CNA Principle.&lt;/p&gt;</description><link>http://practicalcloudcomputing.com/post/279565694</link><guid>http://practicalcloudcomputing.com/post/279565694</guid><pubDate>Fri, 11 Dec 2009 16:56:00 -0800</pubDate><category>consistency</category><category>SimpleDB</category><category>Amazon</category><category>Multi-homing</category><category>CAP theorem</category></item></channel></rss>
