Reviews
Memory Management in the Cloud
Stanford, “The Case for RAMClouds: Scalable High-Performance Storage Entirely in DRAM,” SIGOPS Operating Systems Review, Vol. 43, No. 4, December 2009, pp. 92-105. [PDF] AMP Lab, “PACMan: Coordinated Memory Caching for Parallel Jobs,” Secret Draft. Update: PACMan has been accepted at NSDI’2012. Secret draft won’t remain secret anymore :) Summary Cloud applications require storage systems ...
Confidentiality and Security in the Cloud
Raluca Ada Popa, Catherine M. S. Redfield, Nickolai Zeldovich, Hari Balakrishnan, “CryptDB: Protecting Confidentiality with Encrypted Query Processing,” SOSP, 2011. [PDF] Thomas Ristenpart, Eran Tromer, Hovav Shacham, Stefan Savage, “Hey, You, Get Off of My Cloud: Exploring Information Leakage in Third-Party Compute Clouds,” CCS, 2009. [PDF] Summary With the increase in popularity of cloud computing ...
Graph-parallel frameworks
Google, “Pregel: A System for Large-Scale Graph Processing,” SIGMOD, 2010. [PDF] Carnegie Mellon, “GraphLab: A New Framework for Parallel Machine Learning,” arXiv:1006.4990, 2010. [PDF] Summary Data-parallel frameworks such as MapReduce and Dryad are good at performing embarrassingly parallel jobs. These frameworks are not ideal for iterative jobs and for jobs where data-dependencies across stages are ...
Datacenter transport layer protocols
Stanford and Microsoft, “DCTCP: Efficient Packet Transport for the Commoditized Data Center,” SIGCOMM, 2010. [PDF] Raiciu et al., “Improving Datacenter Performance and Robustness with Multipath TCP,” SIGCOMM, 2011. [PDF] MSR Asia, “ICTCP: Incast Congestion Control for TCP in Data Center Networks,” CoNEXT, 2010. [PDF] Summary Datacenters pose a different set of challenges than the Internet, ...
Multi-framework resource managers for datacenters
AMPLab, “Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center,” NSDI, 2011. [PDF] Apache Software Foundation, “Hadoop NextGen”, 2011. [LINK] Summary Traditional cluster resource schedulers fall into two broad categories: some do fine-grained management of resources for individual frameworks (e.g., in Hadoop), but this requires multiple frameworks to run on multiple isolated clusters. ...
Distributed in-memory datasets
AMPLab, “Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing,” UCB/EECS-2011-82, 2011. [PDF] Russell Power, Jinyang Li, “Piccolo: Building Fast, Distributed Programs with Partitioned Tables,” OSDI, 2010. [PDF] Summary MapReduce and similar frameworks, while widely applicable, are limited to directed acyclic data flow models, do not expose global state, and are generally slow due to ...
Cloud databases
MIT, “Relational Cloud: A Database-as-a-Service for the Cloud,” CIDR, 2011. [PDF] Divyakant Agrawal, Amr El Abbadi, Sudipto Das, Aaron J. Elmore, “Database Scalability, Elasticity, and Autonomy in the Cloud,” DASFAA, 2011. [PDF] Relational Cloud The key idea of the Relational Cloud project is to define the concept of transactional Database-as-a-Service (DBaaS), identify the key challenges toward materializing ...
Declarative and finite state machine approaches to Cloud programming
Peter Alvaro, Tyson Condie, Neil Conway, Khaled Elmeleegy, Joseph M. Hellerstein, Russell Sears, “BOOM Analytics: Exploring Data-Centric, Declarative Programming for the Cloud,” EuroSys, 2010. [PDF] Joe Armstrong, “Erlang: A Survey of the Language and Its Industrial Applications,” Ninth Exhibition and Symposium on Industrial Applications of Prolog, 1996. [PDF] BOOM BOOM or Berkeley Orders-Of-Magnitude adopts a data-centric ...
Don’t Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS
Wyatt Lloyd, Michael J. Freedman, Michael Kaminsky, and David G. Andersen, “Don’t Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS,” SOSP, 2011. [PDF] Summary This paper introduces a new consistency model, causal+, that extends the causal consistency model and lies between sequential and causal consistency models. The authors claim that causal+ is the ...
PNUTS: Yahoo!’s Hosted Data Serving Platform
Yahoo! Research, “PNUTS: Yahoo!’s Hosted Data Serving Platform,” PVLDB, 2008. [PDF] Summary PNUTS is a scalable, highly available, and geographically distributed (but low latency) data store used by most Yahoo! online properties. To achieve both availability and partition tolerance, it uses a novel notion of consistency called per-record timeline consistency; under this model, all replicas of ...