Talks, Articles, White Papers, Videos, Patents

I thought that I would put up a list of talks that I have given or will be giving. These talks tend to focus on my work in Distributed “Cloud” Computing and NoSQL.

Recent & Upcoming Speaking Appearances

White papers

Magazine Articles

Patents

Videos (in reverse-chronological order)

Big Data @ LinkedIn Interview (QCon London 2012) (video + slides)

Data Infrastructure @ LinkedIn (QCon London 2012) (video + slides)

Keeping Movies Running Amid Thunderstorms (video + slides)

NoSQL @ Netflix (QCon London 2011) (video + slides)

Netflix Cloud Data Architecture (QCon London 2011) (video + slides)

NoSQL @ Netflix talk at Facebook (Feb 2011)

Structure 2011 Guru Panel (June 2011)

NoSQL @ Netflix : Part 1 

Hi Folks!

I had a great time this past Thursday (Feb 17, 2011) speaking at the Silicon Valley Cloud Computing Group Meetup at Facebook. The talk covered Netflix’s move from RDBMS to NoSQL, specifically SimpleDB. Subsequent parts will provide our experiences with Cassandra, HBase, and other technologies.

Video is now available. The first 10 minutes are from the sponsors: VMware, Rackspace, and Scalr!

-s

Silicon Valley Meetup & QCon London

Hi Folks!

I’ll be giving a 2-part lecture on NoSQL @ Netflix at the Silicon Valley Cloud Computing Meetup in Mountain View, starting on Feb 17. The two parts will be a month apart.

I will detail the challenges involved in going from an RDBMS in our Data Center to AWS’s SimpleDB and S3 in the Cloud. I was intimately involved in this transition. Now, my team at Netflix is investing in Cassandra (and to a lesser extent in HBase). How are we using these different storage options? Come to the meetup to find out!

As a separate note, I am excited to be heading to London in early March to meet engineers there and to deliver 2 talks.

Both talks are at QCon London: one in Floyd Marinescu’s Architectures You’ve Always Wondered About track and one in Alex Popescu’s NoSQL : Where and How track. In the first lecture, I’ll describe Netflix’s Cloud-based data infrastructure. In the second (i.e. the NoSQL track), I will dive into our use of NoSQL.

If you happen to be in town, please drop by!

-s

How Process Priority Inversion Can Burn CPU via “Waiting” Processes

For the past few weeks, we have been wrestling with an interesting bug in Oracle 11g at Netflix.

We are seeing high CPU usage associated with a high number of wait events for the following:

  • cursor: mutex S
  • latch: shared pool

I was perplexed as to why waits would result in high CPU usage, so I hopped on to Google. I didn’t find an answer for 11g, but I did find something reported in 10.2 and reportedly fixed in 11g.

In Oracle 10.2, on Operating Systems (e.g. AIX, HPUX, etc.) that support process priority decay (e.g. fair round robin, default on AIX 5), priority inversion can cause “blocked threads” to burn CPU. The post below mentions “cursor pin s” waits instead of “cursor mutex s” waits, but the pattern is the same.

http://blog.tanelpoder.com/2010/04/21/cursor-pin-s-waits-sporadic-cpu-spikes-and-systematic-troubleshooting/

In Oracle 10.2, latches cause waits, while mutexes cause CPU spins.
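To make the "waiting but burning CPU" idea concrete, here is a minimal Python sketch. It is not Oracle's implementation and it does not model process priority decay; it only contrasts a spinning waiter with a blocking one. The function names and the 0.2 second hold time are made up for illustration.

```python
import threading
import time


def cpu_while_spinning(flag: threading.Event) -> float:
    """Poll the flag in a tight loop and return the CPU seconds consumed."""
    start = time.process_time()
    while not flag.is_set():
        pass  # busy-wait: logically "waiting", but the thread stays on the CPU
    return time.process_time() - start


def cpu_while_blocking(flag: threading.Event) -> float:
    """Sleep until the flag is set and return the CPU seconds consumed."""
    start = time.process_time()
    flag.wait()  # the thread is descheduled until another thread sets the flag
    return time.process_time() - start


def measure(waiter, hold_seconds: float = 0.2) -> float:
    """Run `waiter` while a timer thread 'holds the resource' for hold_seconds."""
    flag = threading.Event()
    timer = threading.Timer(hold_seconds, flag.set)
    timer.start()
    used = waiter(flag)
    timer.join()
    return used


if __name__ == "__main__":
    print("spinning waiter CPU seconds:", measure(cpu_while_spinning))
    print("blocking waiter CPU seconds:", measure(cpu_while_blocking))
```

The spinning waiter should report CPU time close to the hold time, while the blocking waiter should report close to zero. If the holder of the resource is also starved of CPU by the spinners, as in the priority inversion case, the spinning simply lasts longer.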


Cassandra : Row Cache & Memtable Q&A

Below, you will find 2 sets of questions and answers regarding Cassandra’s (now v0.7) Row Cache and Memtable. These were answered by Matthew Dennis at Riptano, a company that is actively developing both Cassandra and a management suite known as RipCord.


Questions


      Hi!

      I have a super column family. Writes either modify a column within a super column or add a super column to the super column family. I wonder how the row cache works.

      A. How do writes interact with the Row Cache entries?
      1. Do you update the row cache entry in place in order to keep it consistent with the SSTable?
      2. Do you update it when the memtable is updated or when the memtable is flushed to the SSTable?

      B. Also, is the Memtable appended to an SSTable when it is flushed or does it become its own SSTable?


Netflix’s Transition to High-Availability Storage Systems (QCon SF 2010) Slides

I presented at QCon SF (2010) yesterday on Netflix’s transition to high-availability storage. The slides are on SlideShare.

I expect that the presentation will be available on the QCon SF site in a few days. It’s (loosely) based on my previous white paper of the same title - see this post.  

Netflix’s Transition to High-Availability Storage Systems

I just published a white paper titled Netflix’s Transition to High-Availability Storage Systems.

Feel free to email me your thoughts at siddharthanand@yahoo.com.

To download this paper as a PDF, click on this link.

SimpleDB Essentials for High Performance Users : Part 3 

This is Part 3 of SimpleDB Essentials for High Performance Users. Check out Part 2.

  1. Work around Attribute Value Length Limits
    1. If you need to store data that is vastly larger than 1024 bytes in a SimpleDB attribute, consider storing that data in S3 and putting a pointer (i.e. bucket name + object key) to the data in the SimpleDB attribute. However, the drawback of this approach is that you will require 2 round trips (i.e. one to SimpleDB and one to S3) to compose one logical row. Beyond the obvious performance hit, this approach is not transactionally sound.
    2. A better approach is to split that data over several SimpleDB attributes. You will need to control the splitting and joining logic of these SimpleDB attributes, but you will only need one round trip and you can leverage conditional puts for concurrency control. This approach is ideal if your data can fit in 10 or fewer attributes.
      1. Just remember that subsequent updates to these split attributes might be of different lengths, so a shorter update can leave stale attributes behind.
  2. Avoid getting tripped up by the default Select query pagination limit of 100
    1. You must be aware that the SDB Select query supports the “limit N” expression. This allows the developer to specify N up to a max of 2500. If the developer chooses N=200, for example, and 1000 items match the WHERE clause conditions, then the results would be returned in chunks of 200 at a time, and 5 round trips would be required to fetch the 1000 items. For customer-facing functionality, you are risking end-user timeouts. To avoid this, always specify “limit 2500”. Note: if you don’t specify it, the default value of 100 is assumed by SimpleDB.
    2. Avoid any client code that auto-follows tokens returned by SimpleDB. SimpleDB query timeouts could result in an unpredictably long cycle of next-pointers. Auto-following these can result not only in an infinite loop on your servers, but in customer-browser timeouts as well. Instead, follow these next-pointers judiciously.
  3. Avoid carrying multi-table relationships into the cloud in the form of multi-domain relationships. Try to denormalize these relationships into single items. Doing joins in the application tier might require multiple round trips to SDB and expose customer-facing functionality to timeouts.
  4. Remember that there are no sequences, locks, constraints (except for the uniqueness constraint on the item name), triggers, etc. in SimpleDB. Don’t expect them.
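
Items 1 and 2 above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a SimpleDB client. The `name_000` attribute naming scheme, the helper names, and the `fetch_page` callable (a stand-in for one Select round trip that returns a page of items plus a next token) are all hypothetical. Values are treated as raw bytes for simplicity.

```python
from typing import Callable, Dict, List, Optional, Tuple

MAX_ATTR_BYTES = 1024  # per-attribute value limit cited in item 1


def split_value(name: str, data: bytes, chunk: int = MAX_ATTR_BYTES) -> Dict[str, bytes]:
    """Split data into numbered attributes: name_000, name_001, ..."""
    pieces = [data[i:i + chunk] for i in range(0, len(data), chunk)] or [b""]
    return {f"{name}_{i:03d}": piece for i, piece in enumerate(pieces)}


def chunk_keys(name: str, attrs: Dict[str, bytes]) -> List[str]:
    """Return the chunk attribute names belonging to `name`, in order."""
    prefix = f"{name}_"
    return sorted(k for k in attrs
                  if k.startswith(prefix) and k[len(prefix):].isdigit())


def join_value(name: str, attrs: Dict[str, bytes]) -> bytes:
    """Reassemble the original value from its chunk attributes."""
    return b"".join(attrs[k] for k in chunk_keys(name, attrs))


def stale_keys(name: str, old: Dict[str, bytes], new: Dict[str, bytes]) -> List[str]:
    """Chunks left over from a longer previous value; these need deleting."""
    return [k for k in chunk_keys(name, old) if k not in new]


# One page of results: (items, next_token). A next_token of None means done.
Page = Tuple[List[str], Optional[str]]


def select_bounded(fetch_page: Callable[[Optional[str]], Page],
                   max_pages: int = 3) -> Page:
    """Follow next tokens for at most max_pages round trips, then stop.

    Returns the items gathered so far plus the unconsumed token, so the
    caller decides whether to continue instead of looping blindly.
    """
    items: List[str] = []
    token: Optional[str] = None
    for _ in range(max_pages):
        page, token = fetch_page(token)
        items.extend(page)
        if token is None:
            break
    return items, token
```

In real use, the dictionary from `split_value` would be written in a single put, the keys from `stale_keys` removed, and `fetch_page` would wrap whatever Select call your client library provides. The page cap of 3 is arbitrary; pick one that fits your request timeout budget.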

SimpleDB Essentials for High Performance Users : Part 2

This is Part 2 of SimpleDB Essentials for High Performance Users. Check out Part 1.


About Me

A blog describing my work in building websites that millions of people visit. I'm a senior member of LinkedIn's Distributed Data Systems team. I previously held technical and leadership roles at Netflix, Etsy, eBay & Siebel Systems.