SCADS: Scale-Independent Storage for Social Computing Applications

Michael Armbrust, Armando Fox, David A. Patterson, Nick Lanham, Beth Trushkowsky, Jesse Trutna, Haruki Oh, “SCADS: Scale-Independent Storage for Social Computing Applications,” CIDR, 2009. [PDF]

Summary

SCADS (Scalable Consistency Adjustable Data Storage) is a proposal for a collection of components leveraging database, control theory, and machine learning techniques to achieve data scale independence for rapidly growing (or shrinking) Web 2.0 services. It has three key components:

  1. A performance-insightful query language (PIQL) that provides strict scalability guarantees and predictable performance;
  2. A declarative way for developers to explicitly define their performance-consistency tradeoff requirements; and
  3. Machine learning models to add and remove capacity to meet SLA requirements.
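
To make the first component more concrete, here is a minimal Python sketch of what "bounded" queries could mean in practice. It is not PIQL syntax and not the authors' implementation. It assumes every lookup step in a query plan carries a declared upper bound on the rows it may return, and it rejects any plan whose worst-case work is unbounded or exceeds a budget. The names `Step`, `worst_case_rows`, and `admit`, as well as the numbers used, are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    """One lookup in a hypothetical query plan."""
    name: str
    max_rows: Optional[int]  # declared bound per input row; None means unbounded


def worst_case_rows(plan: List[Step]) -> Optional[int]:
    """Upper bound on rows touched by the plan, or None if any step is unbounded."""
    total = 0
    fanout = 1  # number of input rows feeding the current step
    for step in plan:
        if step.max_rows is None:
            return None
        fanout *= step.max_rows
        total += fanout
    return total


def admit(plan: List[Step], budget: int) -> bool:
    """Accept a query only if its worst case is known and fits the budget."""
    bound = worst_case_rows(plan)
    return bound is not None and bound <= budget


# Assumed example: at most 5000 friends per user, 10 recent posts per friend.
bounded_plan = [Step("friends_of_user", 5000), Step("recent_posts_of_friend", 10)]
# A lookup with no declared bound grows with the data set and is rejected.
unbounded_plan = [Step("all_followers", None)]
```

Under these assumptions, `admit(bounded_plan, 100_000)` accepts the query because its worst case does not depend on the total size of the data set, while `admit(unbounded_plan, 100_000)` rejects it no matter how large the budget is.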

Critique

I believe the key contributions of this proposal are restricting queries so that their performance is bounded and letting developers explicitly specify their performance and consistency requirements. Several other interesting concepts are embedded in different parts of the proposal; it is not clear, however, how influential the architecture as a whole will be.

The key tradeoff in SCADS is that it gives up whatever is required in exchange for predictable performance. The authors are willing to restrict queries, spend CPU and disk resources on building additional indices, take developer input and give feedback in return, and do several other things to ensure predictability.

Since this is a position paper, the authors provide only high-level ideas without a concrete solution. Many of the proposed components have since been developed, and there is a good chance that the overall architecture will see the light of day at some point, in some form.
