Wednesday, October 30, 2013

Building distributed systems out of crap

Pat Helland gave the opening keynote at Basho's conference, Ricon West, yesterday. The general topic was building distributed systems with enterprise guarantees and web scalability on crap. His argument is that enterprise-grade SLAs with lots of nines can be supported on cheap hardware using a strategy of expecting failure and recovering quickly.

Helland, who previously did time on the Bing team and at Amazon, is building a distributed data storage system for Salesforce.com. Its design involves a catalog stored in a relational DB and files stored on clusters of storage servers, a technique Helland calls blobs-by-reference.

The files are stored in fragments distributed across a cluster. There was another concept called an “extent”. I wasn't sure if that meant an aggregation of related fragments or just a bucket to dump them in.

SSDs are used as a new layer of the memory hierarchy. Helland argues for using the cheapest and crappiest available. This entails a couple of engineering tweaks. Because flash cells wear out as they are repeatedly written and erased, the software has to manage write cycles. To detect data corruption, each fragment is packaged with a CRC (cyclic redundancy check).

“By surrounding the data with aggressive error checking, we can be extremely confident of detecting an error and fetching the desired data from one of the other places it has been stored.”
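
To make the idea concrete, here is a minimal sketch of how a fragment might be wrapped in a CRC and read back with a fallback to other copies. This is my own illustration, not code from the talk: the function names, the 4-byte trailer layout, and the choice of CRC-32 via Python's zlib are all assumptions.

```python
import zlib
from typing import Iterable, Optional

CRC_LEN = 4  # assumed layout: payload followed by a 4-byte CRC-32 trailer


def pack_fragment(payload: bytes) -> bytes:
    """Append a CRC-32 of the payload so corruption can be detected later."""
    return payload + zlib.crc32(payload).to_bytes(CRC_LEN, "big")


def unpack_fragment(stored: bytes) -> Optional[bytes]:
    """Return the payload if its CRC matches, or None if the copy is corrupt."""
    if len(stored) < CRC_LEN:
        return None
    payload, trailer = stored[:-CRC_LEN], stored[-CRC_LEN:]
    if zlib.crc32(payload).to_bytes(CRC_LEN, "big") != trailer:
        return None
    return payload


def read_with_fallback(replicas: Iterable[bytes]) -> bytes:
    """Try each stored copy in turn and return the first one that checks out."""
    for stored in replicas:
        payload = unpack_fragment(stored)
        if payload is not None:
            return payload
    raise IOError("every replica failed its CRC check")
```

A reader that gets a bad copy from one cheap SSD simply moves on to the next replica, which is the "expect failure and recover quickly" strategy in miniature.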

Helland emphasized the importance of immutable data, which goes a long way towards mitigating the inconsistency and race conditions that come with distributed computing. In the proposed storage system, fragments are immutable, which greatly reduces the opportunity for the storage nodes to get out of sync with the catalog.
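
Here is a rough sketch of why immutability helps, assuming a toy in-memory catalog and fragment store. None of this comes from the talk: the class names are hypothetical, and naming fragments by a SHA-256 of their contents is my own assumption, chosen because it makes write-once storage easy to show.

```python
import hashlib
from typing import Dict, Iterable, Tuple


class FragmentStore:
    """Toy write-once store: a fragment is never modified after it is written."""

    def __init__(self) -> None:
        self._fragments: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        # Assumption: the ID is derived from the content, so the same ID
        # can never refer to two different byte strings.
        frag_id = hashlib.sha256(data).hexdigest()
        self._fragments.setdefault(frag_id, data)
        return frag_id

    def get(self, frag_id: str) -> bytes:
        return self._fragments[frag_id]


class Catalog:
    """Toy stand-in for the relational catalog: file name -> fragment IDs."""

    def __init__(self) -> None:
        self._files: Dict[str, Tuple[str, ...]] = {}

    def commit(self, name: str, frag_ids: Iterable[str]) -> None:
        # Updating a file swaps the reference list; old fragments are untouched.
        self._files[name] = tuple(frag_ids)

    def lookup(self, name: str) -> Tuple[str, ...]:
        return self._files[name]
```

Because a new version of a file is written as new fragments and then a single catalog update, a reader holding the old fragment IDs still sees a consistent old version rather than a half-updated one.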

Aside from this talk, Ricon is loaded with good content, including a talk by Jeff Dean coming up this afternoon. Send me next year!
