<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Mosharaf Chowdhury &#187; Reviews</title>
	<atom:link href="http://www.mosharaf.com/blog/category/reviews/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.mosharaf.com</link>
	<description>UC Berkeley</description>
	<lastBuildDate>Fri, 03 Feb 2012 04:49:11 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Memory Management in the Cloud</title>
		<link>http://www.mosharaf.com/blog/2011/12/05/memory-management-in-the-cloud/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=memory-management-in-the-cloud</link>
		<comments>http://www.mosharaf.com/blog/2011/12/05/memory-management-in-the-cloud/#comments</comments>
		<pubDate>Mon, 05 Dec 2011 18:13:50 +0000</pubDate>
		<dc:creator>Mosharaf</dc:creator>
				<category><![CDATA[Reviews]]></category>
		<category><![CDATA[PACMan]]></category>
		<category><![CDATA[RAMCloud]]></category>
		<category><![CDATA[UCB Cloud Computing Course F11]]></category>

		<guid isPermaLink="false">http://www.mosharaf.com/?p=1985</guid>
		<description><![CDATA[Stanford, &#8220;The Case for RAMClouds: Scalable High-Performance Storage Entirely in DRAM,&#8221; SIGOPS Operating Systems Review, Vol. 43, No. 4, December 2009, pp. 92-105. [PDF] AMP Lab, &#8220;PACMan: Coordinated Memory Caching for Parallel Jobs,&#8221; Secret Draft. Update: PACMan has been accepted at NSDI&#8217;2012. Secret draft won&#8217;t remain secret anymore :) Summary Cloud applications require storage systems [...]]]></description>
			<content:encoded><![CDATA[<p class="alert">Stanford, &#8220;The Case for RAMClouds: Scalable High-Performance Storage Entirely in DRAM,&#8221; <em>SIGOPS Operating Systems Review</em>, Vol. 43, No. 4, December 2009, pp. 92-105. [<a href="http://www.stanford.edu/~ouster/cgi-bin/papers/ramcloud.pdf">PDF</a>]</p>
<p class="alert">AMP Lab, &#8220;PACMan: Coordinated Memory Caching for Parallel Jobs,&#8221; <em>Secret Draft</em>.</p>
<p class="alert"><strong>Update:</strong> PACMan has been accepted at NSDI&#8217;2012. Secret draft won&#8217;t remain secret anymore :)</p>
<h2>Summary</h2>
<p>Cloud applications require storage systems that provide low latency and high throughput for large amounts of data. While traditional disks cannot meet such requirements, given the trend in DRAM price and capacity, it is possible to envision a future where most storage needs can be fulfilled by DRAM; RAMCloud is such a system. PACMan, on the other hand, suggests that even today, most workloads can be kept in DRAM using better caching mechanisms.</p>
<h3>RAMCloud</h3>
<p>The core idea in RAMCloud is to keep everything in DRAM, with disks used only as backups. The biggest challenge is to make sure that the storage system can be recovered quickly upon failure. RAMCloud uses buffered logging. The authors claim that replication is not necessary to achieve high performance; instead, replicas are used only for parallel recovery. In steady state, there is a single copy of the data present in DRAM. Recovery is performed using a massively parallel read of data from disks.</p>
<h3>PACMan</h3>
<p>PACMan is a caching mechanism and corresponding system for HDFS and similar distributed file systems. The key idea is that current clusters have a large amount of unused memory that can be used to cache frequently used data blocks, and that traditional caching strategies like LRU or LFU do not work well for cluster jobs. The authors propose the all-or-nothing property: either all data blocks of a given job are cached across the cluster, or none of them are.</p>
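<p>To see why caching only part of a job's input may buy little, here is a toy sketch. It assumes a job runs all of its tasks in parallel in a single wave and therefore finishes when its slowest task does. The read times, job names, and helper functions are made up for illustration and are not taken from the PACMan paper.</p>

```python
# Toy model with assumed numbers: a job runs its tasks in one parallel wave,
# so it finishes only when its slowest task finishes.
DISK_READ_S = 10.0  # hypothetical time to read one input block from disk
MEM_READ_S = 1.0    # hypothetical time to read one input block from memory


def job_completion_time(num_tasks, cached_blocks):
    """Completion time of a single-wave job with one input block per task."""
    task_times = [
        MEM_READ_S if block in cached_blocks else DISK_READ_S
        for block in range(num_tasks)
    ]
    return max(task_times)


def total_time(jobs, cache_plan):
    """Sum of job completion times; cache_plan maps job name to cached block ids."""
    return sum(
        job_completion_time(num_tasks, cache_plan.get(name, set()))
        for name, num_tasks in jobs.items()
    )


jobs = {"A": 4, "B": 4}  # two jobs with four tasks each

# Both plans use four cache slots in total.
partial = {"A": {0, 1}, "B": {0, 1}}   # half of each job's input is cached
all_or_nothing = {"A": {0, 1, 2, 3}}   # all of job A, nothing of job B

print(total_time(jobs, partial))         # both jobs still wait on a disk read
print(total_time(jobs, all_or_nothing))  # job A runs entirely from memory
```

<p>With the same number of cached blocks, the partial plan speeds up no job in this model, while the all-or-nothing plan lets one job finish at memory speed.</p>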
<h2>Comments</h2>
<p>RAMCloud is a more general system than PACMan, but clearly, it is more expensive as well. RAMCloud trades off price for speed, but it is likely to be used in many future systems if prices of DRAM and high-speed network equipment keep going down. PACMan, at a high level, may seem to be a more short-term fix for existing clusters. However, the insight of all-or-nothing is important and will be useful even in the future. Also, PACMan can have a quicker impact because it does not ask for any investment to reap the possible gains.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mosharaf.com/blog/2011/12/05/memory-management-in-the-cloud/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Confidentiality and Security in the Cloud</title>
		<link>http://www.mosharaf.com/blog/2011/11/27/confidentiality-and-security-in-the-cloud/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=confidentiality-and-security-in-the-cloud</link>
		<comments>http://www.mosharaf.com/blog/2011/11/27/confidentiality-and-security-in-the-cloud/#comments</comments>
		<pubDate>Mon, 28 Nov 2011 01:09:33 +0000</pubDate>
		<dc:creator>Mosharaf</dc:creator>
				<category><![CDATA[Reviews]]></category>
		<category><![CDATA[Balakrishnan]]></category>
		<category><![CDATA[CryptDB]]></category>
		<category><![CDATA[Popa]]></category>
		<category><![CDATA[Redfield]]></category>
		<category><![CDATA[Ristenpart]]></category>
		<category><![CDATA[Savage]]></category>
		<category><![CDATA[Shacham]]></category>
		<category><![CDATA[Tromer]]></category>
		<category><![CDATA[UCB Cloud Computing Course F11]]></category>
		<category><![CDATA[Zeldovich]]></category>

		<guid isPermaLink="false">http://www.mosharaf.com/?p=1975</guid>
		<description><![CDATA[Raluca Ada Popa, Catherine M. S. Redfield, Nickolai Zeldovich, Hari Balakrishnan, &#8220;CryptDB: Protecting Confidentiality with Encrypted Query Processing,&#8221; SOSP, 2011. [PDF] Thomas Ristenpart, Eran Tromer, Hovav Shacham, Stefan Savage, &#8220;Hey, You, Get Off of My Cloud: Exploring Information Leakage in Third-Party Compute Clouds,&#8221; CCS, 2009. [PDF] Summary With the increase in popularity of cloud computing [...]]]></description>
			<content:encoded><![CDATA[<p class="alert">Raluca Ada Popa, Catherine M. S. Redfield, Nickolai Zeldovich, Hari Balakrishnan, &#8220;CryptDB: Protecting Confidentiality with Encrypted Query Processing,&#8221; <em>SOSP</em>, 2011. [<a href="http://people.csail.mit.edu/nickolai/papers/raluca-cryptdb.pdf">PDF</a>]</p>
<p class="alert">Thomas Ristenpart, Eran Tromer, Hovav Shacham, Stefan Savage, &#8220;Hey, You, Get Off of My Cloud: Exploring Information Leakage in Third-Party Compute Clouds,&#8221; <em>CCS</em>, 2009. [<a href="http://cseweb.ucsd.edu/~hovav/dist/cloudsec.pdf">PDF</a>]</p>
<h2>Summary</h2>
<p>With the increase in popularity of cloud computing as a scalable, elastic, and cost-effective infrastructure solution, concerns about the security, privacy, and confidentiality of user data hosted on public clouds are also increasing. Curious administrators might breach trust, malicious entities can try to restrict/deny services, and adversaries might gain access to confidential data.</p>
<h3>CryptDB</h3>
<p>CryptDB stores user data in an SQL-aware encrypted form with multi-layered encryption onions. Each layer provides a different level of security and restricts execution of SQL queries to a limited set. Depending on user queries, layers are dynamically peeled off one after another. Eventually, the database reaches a steady state that strikes a balance between confidentiality of data and usability of the database. Encryption keys are chained together with user passwords to survive security breaches of both database and application servers.</p>
<h3>Hey You, Get Off of My Cloud!</h3>
<p>This paper discusses the risks of shared public clouds by demonstrating how an attacker can find the network topology of a cloud provider (e.g., Amazon EC2) to get a VM that co-resides with a victim VM and extract information from the victim. The goal is more to show that these risks existed in 2009 (it is questionable how big of a risk they are, and how hard it is to avoid them) than to show how to address them.</p>
<h2>Comments</h2>
<p>CryptDB is undoubtedly the more practical of the two papers, with a usable solution to a real problem. However, it has its weaknesses: CryptDB should require N times more space for N layers of the onion; creating or updating onions whenever user passwords and the corresponding encryption key chains change will be expensive; and for databases with mostly long-running, persistent connections, information of most users will be exposed when database and application servers are compromised.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mosharaf.com/blog/2011/11/27/confidentiality-and-security-in-the-cloud/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Graph-parallel frameworks</title>
		<link>http://www.mosharaf.com/blog/2011/11/18/graph-parallel-frameworks/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=graph-parallel-frameworks</link>
		<comments>http://www.mosharaf.com/blog/2011/11/18/graph-parallel-frameworks/#comments</comments>
		<pubDate>Sat, 19 Nov 2011 03:59:29 +0000</pubDate>
		<dc:creator>Mosharaf</dc:creator>
				<category><![CDATA[Reviews]]></category>
		<category><![CDATA[GraphLab]]></category>
		<category><![CDATA[Pregel]]></category>
		<category><![CDATA[UCB Cloud Computing Course F11]]></category>

		<guid isPermaLink="false">http://www.mosharaf.com/?p=1966</guid>
		<description><![CDATA[Google, &#8220;Pregel: A System for Large-Scale Graph Processing,&#8221; SIGMOD, 2010. [PDF] Carnegie Mellon, &#8220;GraphLab: A New Framework for Parallel Machine Learning,&#8221; arXiv:1006.4990, 2010. [PDF] Summary Data-parallel frameworks such as MapReduce and Dryad are good at performing embarrassingly parallel jobs. These frameworks are not ideal for iterative jobs and for jobs where data-dependencies across stages are [...]]]></description>
			<content:encoded><![CDATA[<p class="alert">Google, &#8220;Pregel: A System for Large-Scale Graph Processing,&#8221; <em>SIGMOD</em>, 2010. [<a href="http://www-bd.lip6.fr/ens/grbd2011/extra/SIGMOD10_pregel.pdf" class="broken_link">PDF</a>]</p>
<p class="alert">Carnegie Mellon, &#8220;GraphLab: A New Framework for Parallel Machine Learning,&#8221; <em>arXiv:1006.4990</em>, 2010. [<a href="http://arxiv.org/pdf/1006.4990v1">PDF</a>]</p>
<h2>Summary</h2>
<p>Data-parallel frameworks such as MapReduce and Dryad are good at performing embarrassingly parallel jobs. These frameworks are not ideal for iterative jobs and for jobs where data-dependencies across stages are sparse (e.g., in MapReduce, each reducer is likely to depend on each mapper). However, there are many problems, especially in machine learning, that can be intuitively expressed using graphs with sparse computational dependencies, require multiple iterations to converge, and have variable convergence rates for different parameters. Pregel and GraphLab are two frameworks optimized for this type of graph-based problem.</p>
<p>A typical graph-parallel problem is expressed using a graph with vertices and edges, where each vertex and edge has associated data. In every iteration, vertex and edge data are updated and a bunch of messages are exchanged between neighboring entities. This update function is typically the same for every vertex, and it is written by the user. There may or may not be a synchronization step at the end of every iteration. In a distributed setting, the graph is cut and divided across multiple nodes, and updates from a collection of vertices in one node are communicated to another using message passing.</p>
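<p>As a rough, single-machine illustration of this model with a synchronization step after every iteration, the sketch below propagates the maximum vertex value through a small graph. The function name, the graph, and the dictionary-based message queues are assumptions made for this example; they are not the API of Pregel or GraphLab.</p>

```python
def run_supersteps(values, edges, max_steps=50):
    """Propagate the largest vertex value using barrier-separated iterations.

    values: dict mapping vertex -> initial number
    edges:  dict mapping vertex -> list of out-neighbors
    """
    values = dict(values)  # do not mutate the caller's data

    # Iteration 0: every vertex sends its value to its neighbors.
    inbox = {v: [] for v in values}
    for v, val in values.items():
        for n in edges.get(v, []):
            inbox[n].append(val)

    for _ in range(max_steps):
        outbox = {v: [] for v in values}
        changed = False
        # The same user-defined update runs at every vertex.
        for v, messages in inbox.items():
            best = max(messages, default=values[v])
            if best > values[v]:
                values[v] = best
                changed = True
                for n in edges.get(v, []):
                    outbox[n].append(best)
        if not changed:
            break  # no vertex updated, so nothing more will be sent
        inbox = outbox  # barrier: messages become visible in the next iteration
    return values


values = {"a": 3, "b": 6, "c": 2, "d": 1}
edges = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
result = run_supersteps(values, edges)
print(result)
```

<p>In a real distributed deployment the vertices would be partitioned across machines and the inbox and outbox would be network messages; here they are plain dictionaries.</p>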
<h3>Pregel vs GraphLab</h3>
<p>The key difference between Pregel and GraphLab is that Pregel has a barrier at the end of every iteration, whereas GraphLab is completely asynchronous. Asynchrony in GraphLab allows it to prioritize more complex vertices over others, but it also calls for consistency models to maintain sanity of results. GraphLab proposes three consistency models: full, edge, and vertex consistency, to allow different levels of parallelism. Another difference is that Pregel allows dynamic modifications to the graph structure, whereas GraphLab does not.</p>
<h2>Comments</h2>
<p>Pregel and GraphLab sit at two ends of the &#8220;power of framework&#8221; vs &#8220;ease of use&#8221; tradeoff space. Allowing asynchrony makes GraphLab more general and powerful than Pregel, but it is more complex and requires users to understand which consistency model is suitable for them. Pregel is simpler (common for most frameworks in Google&#8217;s arsenal), but still capable of handling a wide variety of problems. Given its origin at Google and open-source clones like Giraph, Pregel&#8217;s model is more likely to succeed in the near future.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mosharaf.com/blog/2011/11/18/graph-parallel-frameworks/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Datacenter transport layer protocols</title>
		<link>http://www.mosharaf.com/blog/2011/11/14/datacenter-transport-layer-protocols/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=datacenter-transport-layer-protocols</link>
		<comments>http://www.mosharaf.com/blog/2011/11/14/datacenter-transport-layer-protocols/#comments</comments>
		<pubDate>Tue, 15 Nov 2011 04:03:15 +0000</pubDate>
		<dc:creator>Mosharaf</dc:creator>
				<category><![CDATA[Reviews]]></category>
		<category><![CDATA[DCTCP]]></category>
		<category><![CDATA[ICTCP]]></category>
		<category><![CDATA[incast]]></category>
		<category><![CDATA[MPTCP]]></category>
		<category><![CDATA[TCP]]></category>
		<category><![CDATA[UCB Cloud Computing Course F11]]></category>

		<guid isPermaLink="false">http://www.mosharaf.com/?p=1952</guid>
		<description><![CDATA[Stanford and Microsoft, &#8220;DCTCP: Efficient Packet Transport for the Commoditized Data Center,&#8221; SIGCOMM, 2010. [PDF] Raiciu et al., &#8220;Improving Datacenter Performance and Robustness with Multipath TCP,&#8221; SIGCOMM, 2011. [PDF] MSR Asia, &#8220;ICTCP: Incast Congestion Control for TCP in Data Center Networks,&#8221; CoNEXT, 2010. [PDF] Summary Datacenters pose a different set of challenges than the Internet, [...]]]></description>
			<content:encoded><![CDATA[<p class="alert">Stanford and Microsoft, &#8220;DCTCP: Efficient Packet Transport for the Commoditized Data Center,&#8221; <em>SIGCOMM</em>, 2010. [<a href="http://research.microsoft.com/pubs/121386/dctcp-public.pdf">PDF</a>]</p>
<p class="alert">Raiciu et al., &#8220;Improving Datacenter Performance and Robustness with Multipath TCP,&#8221; <em>SIGCOMM</em>, 2011. [<a href="http://conferences.sigcomm.org/sigcomm/2011/papers/sigcomm/p266.pdf">PDF</a>]</p>
<p class="alert">MSR Asia, &#8220;ICTCP: Incast Congestion Control for TCP in Data Center Networks,&#8221; <em>CoNEXT</em>, 2010. [<a href="http://conferences.sigcomm.org/co-next/2010/CoNEXT_papers/13-Wu.pdf">PDF</a>]</p>
<h2>Summary</h2>
<p>Datacenters pose a different set of challenges than the Internet, such as microsecond RTTs, synchronized workloads that cause <a href="http://www.mosharaf.com/blog/2009/09/28/safe-and-effective-fine-grained-tcp-retransmissions-for-datacenter-communication/">incast</a>, and a decreased level of multiplexing. TCP, as we know it with millisecond feedback loops and dependence on packet drops to detect congestion, works mostly alright but leaves one wondering whether we could design a better transport protocol. DCTCP, MPTCP, and ICTCP are three recent proposals that try to address this question. The proliferation of such proposals stems from the unique opportunities that only a datacenter network can provide, e.g., complete knowledge of the topology and workloads, a single administrative domain that allows enforcing changes to the network elements, and uniform network behavior almost all over the network. Each of the three protocols summarized below exploits one or more datacenter-specific network characteristics.</p>
<h3>DCTCP</h3>
<p>DCTCP aims for smaller occupancy in switch buffers through explicit rate throttling at end hosts in order to ensure low latency for short flows and high throughput for long flows. Switches set ECN bits to signal the senders to cut back their window sizes, while the senders estimate the level of congestion and reduce their window sizes in proportion to it (as opposed to TCP, which always halves its window).</p>
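<p>The sender-side reaction can be sketched in a few lines. The update rule follows the form given in the DCTCP paper: a running estimate alpha of the fraction of ECN-marked packets, and a window cut of alpha/2. The function name, the per-window calling convention, and the floor of one packet are assumptions made for this sketch.</p>

```python
def dctcp_update(cwnd, alpha, marked, total, g=1.0 / 16):
    """Apply one window's worth of DCTCP-style sender bookkeeping.

    cwnd:   current congestion window, in packets
    alpha:  running estimate of the fraction of marked packets
    marked: ECN-marked packets acknowledged in the last window
    total:  all packets acknowledged in the last window
    g:      weight given to the newest sample
    Returns (new_cwnd, new_alpha).
    """
    fraction = marked / total if total else 0.0
    # Exponentially weighted moving average of the marked fraction.
    alpha = (1 - g) * alpha + g * fraction
    if marked:
        # Cut the window in proportion to the estimated congestion.
        # alpha == 1 gives the usual halving; a small alpha gives a gentle cut.
        cwnd = max(1.0, cwnd * (1 - alpha / 2))
    return cwnd, alpha


print(dctcp_update(100.0, 1.0, 10, 10))  # persistent heavy marking
print(dctcp_update(100.0, 0.0, 1, 16))   # a single mark after a quiet period
print(dctcp_update(100.0, 0.5, 0, 10))   # no marks: window is left alone
```

<p>Window growth between congestion signals is omitted here; only the reaction to marks is shown.</p>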
<h3>ICTCP</h3>
<p>ICTCP is a specialized TCP variation to solve the incast problem in the last hop. The key idea is to adjust the receiving window of each connection by estimating the available bandwidth.</p>
<h3>MPTCP</h3>
<p>MPTCP is kind of orthogonal to DCTCP and ICTCP in that it tries to address the problem of underutilization of bisection bandwidth and relevant unfairness when flows follow only a single path. Their solution is, unsurprisingly, using multiple paths. Transparent to the applications, MPTCP divides each source-destination flow into several sub-flows and employs a congestion control mechanism that pushes toward using up as much available bandwidth as possible.</p>
<h2>Comments</h2>
<p>Out of the three, I found DCTCP and MPTCP more interesting because of the breadth of problems they try to solve. ICTCP is geared toward solving the incast problem only; however, one thing I found interesting about it is that it takes a flow control approach to the problem instead of the more common congestion control approach. In general, all three suffer from the this-may-not-be-real syndrome: DCTCP and ICTCP are possibly too biased by Microsoft workloads, while MPTCP has no evaluation on real workloads. It would be nice to see more general evaluations of all three. Also, my personal opinion on the order of long-run impacts of these papers is MPTCP&gt;DCTCP&gt;&gt;ICTCP.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mosharaf.com/blog/2011/11/14/datacenter-transport-layer-protocols/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Multi-framework resource managers for datacenters</title>
		<link>http://www.mosharaf.com/blog/2011/11/01/multi-framework-resource-managers-for-datacenters/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=multi-framework-resource-managers-for-datacenters</link>
		<comments>http://www.mosharaf.com/blog/2011/11/01/multi-framework-resource-managers-for-datacenters/#comments</comments>
		<pubDate>Wed, 02 Nov 2011 02:49:41 +0000</pubDate>
		<dc:creator>Mosharaf</dc:creator>
				<category><![CDATA[Reviews]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Hadoop NextGen]]></category>
		<category><![CDATA[Mesos]]></category>
		<category><![CDATA[UCB Cloud Computing Course F11]]></category>

		<guid isPermaLink="false">http://www.mosharaf.com/?p=1933</guid>
		<description><![CDATA[AMPLab, &#8220;Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center,&#8221; NSDI, 2011. [PDF] Apache Software Foundation, &#8220;Hadoop NextGen&#8221;, 2011. [LINK] Summary Traditional cluster resource schedulers fall into two broad categories: some do fine-grained management of resources for individual frameworks (e.g., in Hadoop), but this requires multiple frameworks to run on multiple isolated clusters. [...]]]></description>
			<content:encoded><![CDATA[<p class="alert">AMPLab, &#8220;Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center,&#8221; <em>NSDI</em>, 2011. [<a href="http://www.mesosproject.org/papers/nsdi_mesos.pdf">PDF</a>]</p>
<p class="alert">Apache Software Foundation, &#8220;Hadoop NextGen&#8221;, 2011. [<a href="http://developer.yahoo.com/blogs/hadoop/posts/2011/02/mapreduce-nextgen/">LINK</a>]</p>
<h2>Summary</h2>
<p>Traditional cluster resource schedulers fall into two broad categories: some do fine-grained management of resources for individual frameworks (e.g., in Hadoop), but this requires multiple frameworks to run on multiple isolated clusters. Others perform coarse-grained resource management across multiple frameworks at the cost of underutilization (e.g., MPI schedulers). However, fine-grained sharing of cluster resources across multiple, possibly diverse, data- and compute-intensive frameworks is important for several reasons: better utilization and multiplexing of resources, ease of cluster management, and faster innovation without worrying about underlying physical resources. Mesos and Hadoop NextGen aim to achieve just that.</p>
<p>Without subscribing to either approach&#8217;s terminology, a typical resource manager has a central coordinator that keeps track of all the resources in the cluster by periodically communicating with its daemons on individual machines. Instead of interfacing with actual physical resources, frameworks now use a library provided by the resource manager to interact with the coordinator. Once a framework expresses its requirements and later accepts some resources, it&#8217;s on its own to schedule those resources among its workers.</p>
<h2>Mesos vs Hadoop NextGen</h2>
<p>The primary and possibly the only major difference between Mesos (which came earlier) and Hadoop NextGen (which spun out from the basic Hadoop framework) is the way the coordinator and frameworks interact while expressing and accepting (or rejecting) resources. Mesos provides resource offers to individual frameworks that can then accept or reject them. Consequently, resource allocation becomes a distributed problem, and Mesos itself remains minimal. Hadoop NextGen, on the contrary, requires each framework to explicitly express its requirements and then runs a centralized algorithm to allocate resources.</p>
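<p>The contrast can be made concrete with a deliberately simplified sketch in which the only resource is a count of free CPUs. The function names, the framework names, and the first-come ordering are all hypothetical; neither function reflects the actual interfaces or allocation policies of Mesos or Hadoop NextGen.</p>

```python
def offer_based(free_cpus, frameworks):
    """Offer-style allocation: the coordinator offers, each framework decides.

    frameworks maps a name to a callback that receives the offered CPU count
    and returns how many CPUs it accepts (0 means the offer is rejected).
    """
    allocation = {}
    for name, accept in frameworks.items():
        taken = max(0, min(accept(free_cpus), free_cpus))  # framework-side choice
        allocation[name] = taken
        free_cpus -= taken
    return allocation


def request_based(free_cpus, requests):
    """Request-style allocation: frameworks state demands, coordinator decides."""
    allocation = {}
    for name, wanted in requests.items():
        granted = max(0, min(wanted, free_cpus))  # coordinator-side choice
        allocation[name] = granted
        free_cpus -= granted
    return allocation


# A hypothetical MPI-like framework is only useful with at least 8 CPUs,
# so it rejects smaller offers instead of holding resources it cannot use.
frameworks = {
    "spark": lambda offered: 4,
    "mpi": lambda offered: offered if offered >= 8 else 0,
}
offers = offer_based(10, frameworks)
grants = request_based(10, {"spark": 4, "mpi": 8})
print(offers)
print(grants)
```

<p>In the offer-style version the placement logic lives in the framework callbacks, which is why the coordinator can stay small; in the request-style version the same knowledge would have to be encoded in the requests and handled by the central algorithm.</p>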
<h2>Comments</h2>
<p>Both resource managers are pretty much the same. Maybe I am biased as an AMPLab member, but it seems that the Hadoop NextGen design was highly influenced by Mesos. In either case, the central coordinator can become the bottleneck. But with increasing cluster size, Mesos&#8217; approach is likely to scale better than that of Hadoop NextGen because allocation decisions are distributed across frameworks. Given Hadoop&#8217;s popularity, however, Hadoop NextGen is likely to become more widespread than Mesos.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mosharaf.com/blog/2011/11/01/multi-framework-resource-managers-for-datacenters/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Distributed in-memory datasets</title>
		<link>http://www.mosharaf.com/blog/2011/10/30/distributed-in-memory-datasets/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=distributed-in-memory-datasets</link>
		<comments>http://www.mosharaf.com/blog/2011/10/30/distributed-in-memory-datasets/#comments</comments>
		<pubDate>Mon, 31 Oct 2011 03:36:54 +0000</pubDate>
		<dc:creator>Mosharaf</dc:creator>
				<category><![CDATA[Research]]></category>
		<category><![CDATA[Reviews]]></category>
		<category><![CDATA[Li]]></category>
		<category><![CDATA[Piccolo]]></category>
		<category><![CDATA[Power]]></category>
		<category><![CDATA[RDD]]></category>
		<category><![CDATA[Spark]]></category>
		<category><![CDATA[UCB Cloud Computing Course F11]]></category>

		<guid isPermaLink="false">http://www.mosharaf.com/?p=1915</guid>
		<description><![CDATA[AMPLab, &#8220;Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing,&#8221; UCB/EECS-2011-82, 2011. [PDF] Russell Power, Jinyang Li, &#8220;Piccolo: Building Fast, Distributed Programs with Partitioned Tables,&#8221; OSDI, 2010. [PDF] Summary MapReduce and similar frameworks, while widely applicable, are limited to directed acyclic data flow models, do not expose global states, and generally slow due to [...]]]></description>
			<content:encoded><![CDATA[<p class="alert">AMPLab, &#8220;Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing,&#8221; UCB/EECS-2011-82, 2011. [<a href="http://www.eecs.berkeley.edu/Pubs/TechRpts/2011/EECS-2011-82.pdf">PDF</a>]</p>
<p class="alert">Russell Power, Jinyang Li, &#8220;Piccolo: Building Fast, Distributed Programs with Partitioned Tables,&#8221; OSDI, 2010. [<a href="http://www.news.cs.nyu.edu/~jinyang/pub/power-piccolo.pdf">PDF</a>]</p>
<h2>Summary</h2>
<p>MapReduce and similar frameworks, while widely applicable, are limited to directed acyclic data flow models, do not expose global state, and are generally slow due to the lack of support for in-memory computations. MPI, while extremely powerful, is hard to use for non-experts. An ideal solution would be a compromise between the two approaches. Spark and Piccolo try to approximate that ideal within the MapReduce-to-MPI spectrum using in-memory data abstractions.</p>
<h3>Piccolo</h3>
<p>Piccolo provides a distributed key-value store-like abstraction, where applications/tasks can read from and write to a shared storage. Users write partition functions to divide the data across multiple machines, control functions to decide the workflow, kernel functions for performing distributed operations on mutable states, and conflict resolution functions to resolve write-write conflicts. Piccolo uses the Chandy-Lamport snapshot algorithm for periodic checkpointing and rolls back all tasks of a failed job from the last checkpoint when required.</p>
<h3>Spark</h3>
<p>Spark is a distributed programming model based on a distributed in-memory data abstraction called <em>Resilient Distributed Datasets (RDDs)</em>. RDDs are immutable, support coarse-grained transformations, and keep track of which transformations have been applied to them so far using lineages that can be used for RDD reconstruction. As a result, checkpointing requirements/overheads are low in Spark.</p>
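<p>The lineage idea can be illustrated with a toy, single-process class. This is only a sketch: <code>ToyRDD</code>, <code>lose_cache</code>, and the two supported transformations are invented for this example and are not Spark&#8217;s API.</p>

```python
class ToyRDD:
    """Immutable dataset that remembers how it was derived (its lineage)."""

    def __init__(self, source=None, parent=None, op=None, fn=None):
        self._source = source  # base data; only set on the root dataset
        self._parent = parent  # dataset this one was derived from
        self._op = op          # "map" or "filter"; None for the root
        self._fn = fn          # user function applied by the transformation
        self._cache = None     # in-memory copy, which may be lost

    def map(self, fn):
        return ToyRDD(parent=self, op="map", fn=fn)

    def filter(self, fn):
        return ToyRDD(parent=self, op="filter", fn=fn)

    def lineage(self):
        """List the transformations that lead to this dataset, root first."""
        chain = [] if self._parent is None else self._parent.lineage()
        return chain + [self._op or "source"]

    def collect(self):
        if self._cache is None:
            self._cache = self._compute()  # rebuild from lineage on demand
        return list(self._cache)

    def _compute(self):
        if self._parent is None:
            return list(self._source)
        data = self._parent.collect()
        if self._op == "map":
            return [self._fn(x) for x in data]
        return [x for x in data if self._fn(x)]

    def lose_cache(self):
        """Simulate losing the in-memory copy, e.g. on a failed node."""
        self._cache = None


base = ToyRDD(source=range(6))
evens = base.filter(lambda x: x % 2 == 0)
squares = evens.map(lambda x: x * x)

first = squares.collect()
evens.lose_cache()
squares.lose_cache()
second = squares.collect()  # recomputed by replaying the recorded lineage
print(squares.lineage(), first, second)
```

<p>Because only the short list of transformations has to be remembered, lost data can be rebuilt without having checkpointed the data itself.</p>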
<h2>Spark vs Piccolo</h2>
<p>There are two key differences between Spark and Piccolo.</p>
<ol>
<li>RDDs only support coarse-grained writes (transformations) as opposed to the finer-grained writes supported by the distributed tables used by Piccolo. This allows efficient storage of lineage information, which reduces checkpointing overhead and enables fast fault recovery. However, this makes Spark unsuitable for applications that depend on fine-grained updates.</li>
<li>RDDs are immutable, which enables straggler mitigation by speculative execution in Spark.</li>
</ol>
<h2>Comments</h2>
<p>Piccolo is closer to MPI, while Spark is closer to MapReduce on the MapReduce-to-MPI spectrum. The key tradeoff in both cases, however, is between framework usability and its applicability/power (framework complexity follows power). Both frameworks are much faster than Hadoop (but remember that Hadoop is not the best implementation of MapReduce), and a large fraction of that speedup comes from the use of memory. Maybe I am biased as a member of the Spark project, but Spark should be good enough for most applications unless they absolutely require fine-grained updates.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mosharaf.com/blog/2011/10/30/distributed-in-memory-datasets/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Cloud databases</title>
		<link>http://www.mosharaf.com/blog/2011/10/25/cloud-databases/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=cloud-databases</link>
		<comments>http://www.mosharaf.com/blog/2011/10/25/cloud-databases/#comments</comments>
		<pubDate>Wed, 26 Oct 2011 04:44:41 +0000</pubDate>
		<dc:creator>Mosharaf</dc:creator>
				<category><![CDATA[Reviews]]></category>
		<category><![CDATA[Abbadi]]></category>
		<category><![CDATA[Agrawal]]></category>
		<category><![CDATA[Das]]></category>
		<category><![CDATA[DBaaS]]></category>
		<category><![CDATA[Elmore]]></category>
		<category><![CDATA[Relational Cloud]]></category>
		<category><![CDATA[UCB Cloud Computing Course F11]]></category>

		<guid isPermaLink="false">http://www.mosharaf.com/?p=1902</guid>
		<description><![CDATA[MIT, &#8220;Relational Cloud: A Database-as-a-Service for the Cloud,&#8221; CIDR, 2011. [PDF] Divyakant Agrawal, Amr El Abbadi, Sudipto Das, Aaron J. Elmore, &#8220;Database Scalability, Elasticity, and Autonomy in the Cloud,&#8221; DASFAA, 2011. [PDF] Relational Cloud The key idea of the Relational Cloud project is to define the concept of transactional Database-as-a-Service (DBaaS), identify the key challenges toward materializing [...]]]></description>
			<content:encoded><![CDATA[<p class="alert">MIT, &#8220;Relational Cloud: A Database-as-a-Service for the Cloud,&#8221; CIDR, 2011. [<a href="http://people.csail.mit.edu/nickolai/papers/curino-relcloud-cidr.pdf">PDF</a>]</p>
<p class="alert">Divyakant Agrawal, Amr El Abbadi, Sudipto Das, Aaron J. Elmore, &#8220;Database Scalability, Elasticity, and Autonomy in the Cloud,&#8221; DASFAA, 2011. [<a href="http://www.cs.ucsb.edu/~sudipto/papers/dasfaa.pdf">PDF</a>]</p>
<h2>Relational Cloud</h2>
<p>The key idea of the Relational Cloud project is to define the concept of transactional <em>Database-as-a-Service (DBaaS)</em>, identify the key challenges toward materializing it, and finally to address each one individually (in separate papers). The authors identify workload awareness as the key ingredient in addressing these challenges. Since this is only an overview paper, they do not go into the details, but they do identify three high-level goals:</p>
<ol>
<li><em>Efficient Multi-tenancy</em>: This deals with packing databases from different tenants in a single machine while maintaining individual SLAs. The paper suggests that just creating VMs for each database is not the ideal solution due to the lack of proper isolation. Instead they <a href="http://db.csail.mit.edu/pubs/sigmod457-curino.pdf">propose</a> using accurate resource models for colocation.</li>
<li><em>Elastic Scalability</em>: The goal here is to dynamically scale up/down databases based on current load from one machine to more and vice versa. The challenge is partitioning the data while avoiding cross-machine dependency as much as possible.</li>
<li><em>Database Privacy</em>: The last key challenge, according to the authors, is adjustable security. <a href="http://people.csail.mit.edu/nickolai/papers/raluca-cryptdb.pdf">This</a> will allow querying encrypted data without decryption on the cloud. Btw, this IS cool!</li>
</ol>
<h2>Database Scalability, Elasticity, and Autonomy in the Cloud</h2>
<p>Instead of rooting for one particular solution, this paper more or less surveys the design space of DBaaS in terms of scalability, elasticity, and autonomy. The authors believe that there can be two different approaches to simultaneously achieve scalability and atomicity in databases: one is to add some level of atomicity to existing key-value stores (<em>Data Fusion</em>) and the other is to scale traditional transactional databases by intelligent partitioning (<em>Data Fission</em>). For elasticity, they argue for live migration of instances during runtime. However, it is unclear how this technique will affect SLAs. The paper also self-cites some of the systems the authors have built at different points of the design space.</p>
<h2>Comments</h2>
<p>I like the proposed techniques presented in both papers, especially the Relational Cloud ones, from a technical perspective. However, I&#8217;m not sure whether DBaaS will be a successful business model. Yes, Amazon and Microsoft have DBaaS or similar products, but who is using them? If the customers are big enough to care about performance and security isolation, they might not be comfortable sharing DBs with random entities. If they are not, maybe they can do without all the complicated solutions anyway.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mosharaf.com/blog/2011/10/25/cloud-databases/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Declarative and finite state machine approaches to Cloud programming</title>
		<link>http://www.mosharaf.com/blog/2011/10/22/declarative-and-finite-state-machine-approaches-to-cloud-programming/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=declarative-and-finite-state-machine-approaches-to-cloud-programming</link>
		<comments>http://www.mosharaf.com/blog/2011/10/22/declarative-and-finite-state-machine-approaches-to-cloud-programming/#comments</comments>
		<pubDate>Sat, 22 Oct 2011 07:39:15 +0000</pubDate>
		<dc:creator>Mosharaf</dc:creator>
				<category><![CDATA[Reviews]]></category>
		<category><![CDATA[Alvaro]]></category>
		<category><![CDATA[Armstrong]]></category>
		<category><![CDATA[BOOM]]></category>
		<category><![CDATA[Condie]]></category>
		<category><![CDATA[Conway]]></category>
		<category><![CDATA[Elmeleegy]]></category>
		<category><![CDATA[Erlang]]></category>
		<category><![CDATA[Hellerstein]]></category>
		<category><![CDATA[Overlog]]></category>
		<category><![CDATA[Sears]]></category>

		<guid isPermaLink="false">http://www.mosharaf.com/?p=1887</guid>
		<description><![CDATA[Peter Alvaro, Tyson Condie, Neil Conway, Khaled Elmeleegy, Joseph M. Hellerstein, Russell Sears, &#8220;BOOM Analytics: Exploring Data-Centric, Declarative Programming for the Cloud,&#8221; EuroSys, 2010. [PDF] Joe Armstrong, &#8220;Erlang: A Survey of the Language and Its Industrial Applications,&#8221; Ninth Exhibition and Symposium on Industrial Applications of Prolog, 1996. [PDF] BOOM BOOM or Berkeley Orders-Of-Magnitude adopts a data-centric [...]]]></description>
			<content:encoded><![CDATA[<p class="alert">Peter Alvaro, Tyson Condie, Neil Conway, Khaled Elmeleegy, Joseph M. Hellerstein, Russell Sears, &#8220;BOOM Analytics: Exploring Data-Centric, Declarative Programming for the Cloud,&#8221; <em>EuroSys</em>, 2010. [<a href="http://db.cs.berkeley.edu/papers/eurosys10-boom.pdf">PDF</a>]</p>
<p class="alert">Joe Armstrong, &#8220;Erlang: A Survey of the Language and Its Industrial Applications,&#8221; <em>Ninth Exhibition and Symposium on Industrial Applications of Prolog</em>, 1996. [<a href="http://www.erlang.se/publications/inap96.pdf">PDF</a>]</p>
<h2>BOOM</h2>
<p>BOOM, or Berkeley Orders-Of-Magnitude, adopts a data-centric approach to system design and pairs it with a declarative programming language (Overlog) to allow designing compact distributed systems that are easy to build and debug. This paper focuses on reimplementing the Hadoop MapReduce engine and HDFS using BOOM to show that Overlog programs can be really tiny. How BOOM can help in making distributed systems verifiable or easier to debug is not that clear, though.</p>
<h2>Erlang</h2>
<p>Erlang is a declarative functional programming language designed specifically for concurrent, distributed, real-time systems. It takes a finite state machine (FSM) approach in the sense that each node runs individual Erlang processes (using the Actor model), and these processes communicate with each other through message passing without any shared state. This makes Erlang suitable for implementing complex distributed algorithms and protocols. The sequential code in Erlang borrows a lot from Prolog due to its logic programming roots. Erlang has many OS-like features, e.g., concurrent processes, scheduling, and garbage collection.</p>
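<p>To make the message-passing idea concrete, here is a minimal sketch in Python rather than Erlang. The <code>Counter</code> class, the <code>ask</code> helper, and the message names are all hypothetical; the only point is that the actor's state is private and can be changed or read only by sending messages to its mailbox.</p>

```python
import queue
import threading


class Counter(threading.Thread):
    """Hypothetical actor: private state, reachable only through its mailbox."""

    def __init__(self):
        super().__init__()
        self.mailbox = queue.Queue()
        self._count = 0  # only the actor's own thread reads or writes this

    def run(self):
        # Handle one message at a time, like a small finite state machine.
        while True:
            message, reply_to = self.mailbox.get()
            if message == "increment":
                self._count += 1
            elif message == "get":
                reply_to.put(self._count)
            elif message == "stop":
                break


def ask(actor, message):
    """Send a message and wait for the actor's reply on a private queue."""
    reply = queue.Queue()
    actor.mailbox.put((message, reply))
    return reply.get(timeout=5)


counter = Counter()
counter.start()
for _ in range(3):
    counter.mailbox.put(("increment", None))
total = ask(counter, "get")  # the mailbox is FIFO, so all increments come first
counter.mailbox.put(("stop", None))
counter.join()
```

<p>Real Erlang processes are far lighter than OS threads and use pattern matching on received messages; this sketch only mirrors the no-shared-state discipline.</p>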
<h2>Comments</h2>
<p>In both BOOM and Erlang, the high-level goal is to simplify the development of verifiable distributed systems without requiring a large number of SLOCs. The tiny programs, however, are very complex to understand, IMO. Instead of trying to impose one specific paradigm, I think it is better to allow programmers to code whichever way they want. My personal choice would be Scala.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mosharaf.com/blog/2011/10/22/declarative-and-finite-state-machine-approaches-to-cloud-programming/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Don&#8217;t Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS</title>
		<link>http://www.mosharaf.com/blog/2011/10/17/dont-settle-for-eventual-scalable-causal-consistency-for-wide-area-storage-with-cops/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=dont-settle-for-eventual-scalable-causal-consistency-for-wide-area-storage-with-cops</link>
		<comments>http://www.mosharaf.com/blog/2011/10/17/dont-settle-for-eventual-scalable-causal-consistency-for-wide-area-storage-with-cops/#comments</comments>
		<pubDate>Tue, 18 Oct 2011 04:16:57 +0000</pubDate>
		<dc:creator>Mosharaf</dc:creator>
				<category><![CDATA[Reviews]]></category>
		<category><![CDATA[Andersen]]></category>
		<category><![CDATA[causal consistency]]></category>
		<category><![CDATA[COPS]]></category>
		<category><![CDATA[FAWN]]></category>
		<category><![CDATA[Freedman]]></category>
		<category><![CDATA[Kaminsky]]></category>
		<category><![CDATA[key-value]]></category>
		<category><![CDATA[Lloyd]]></category>
		<category><![CDATA[UCB Cloud Computing Course F11]]></category>

		<guid isPermaLink="false">http://www.mosharaf.com/?p=1872</guid>
		<description><![CDATA[Wyatt Lloyd, Michael J. Freedman, Michael Kaminsky, and David G. Andersen, &#8220;Don’t Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS,&#8221; SOSP, 2011. [PDF] Summary This paper introduces a new consistency model, causal+, that extends the causal consistency model and lies between sequential and causal consistency models. The authors claim that causal+ is the [...]]]></description>
			<content:encoded><![CDATA[<p class="alert">Wyatt Lloyd, Michael J. Freedman, Michael Kaminsky, and David G. Andersen, &#8220;Don’t Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS,&#8221; <em>SOSP</em>, 2011. [<a href="http://www.cs.princeton.edu/~mfreed/docs/cops-sosp11.pdf">PDF</a>]</p>
<h2>Summary</h2>
<p>This paper introduces a new consistency model, <em>causal+</em>, that extends the causal consistency model and lies between sequential and causal consistency models. The authors claim that causal+ is the <em>strongest</em> consistency model achievable for ALPS systems (i.e., systems that require availability, low latency, partition-tolerance, and high scalability), but they do not prove why something stronger cannot be achieved.</p>
<p>The claimed novelty that made the authors define a new consistency model instead of using causal consistency itself is <em>convergent conflict handling</em>. Essentially, it requires all conflicting puts to be handled in the same manner at all replicas using the same handler function, <strong>h</strong>, which must be associative and commutative. The authors propose COPS (Clusters of Order-Preserving Servers), which executes all puts/gets in the underlying key-value store in a linearizable fashion within the same datacenter and replicates across datacenters in causal+ consistent order. To support lock-free, non-blocking <em>get transactions</em> that return a consistent view <em>across multiple keys</em>, they also propose COPS-GT. Understandably, COPS-GT is more expensive than COPS; however, it is not more expensive than comparable systems.</p>
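<p>A small sketch may help show why the handler has to be associative and commutative. The code below is an illustration, not COPS code: the <code>Version</code> tuple, the tie-breaking by node id, and the last-writer-wins rule are assumptions chosen as one simple handler with those properties.</p>

```python
from functools import reduce
from itertools import permutations
from typing import NamedTuple


class Version(NamedTuple):
    """Hypothetical versioned value; node_id breaks timestamp ties."""
    timestamp: int
    node_id: int
    value: str


def last_writer_wins(a, b):
    """Handler h: keep the version with the larger (timestamp, node_id) pair."""
    return max(a, b, key=lambda v: (v.timestamp, v.node_id))


def resolve(puts):
    """Fold conflicting puts with h in whatever order a replica saw them."""
    return reduce(last_writer_wins, puts)


conflicting_puts = [
    Version(3, 1, "blue"),
    Version(3, 2, "green"),
    Version(2, 7, "red"),
]

# Try every possible arrival order and collect the distinct winners.
outcomes = {resolve(order) for order in permutations(conflicting_puts)}
```

<p>Because the handler is associative and commutative, <code>outcomes</code> holds a single version no matter which arrival order a replica experiences, which is what lets replicas converge without coordinating.</p>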
<h2>Comments</h2>
<p>The convergent conflict handling mechanism is related to the way <a href="http://www.mosharaf.com/blog/2011/10/07/dynamo-amazons-highly-available-key-value-store/">Dynamo</a> allows handling conflicts using user-defined functions. The cross-datacenter replication mechanism is similar to that in <a href="http://www.mosharaf.com/blog/2011/10/17/pnuts-yahoos-hosted-data-serving-platform-2/">PNUTS</a>. While COPS-GT provides transaction guarantees across multiple keys without much application logic, it is quite complicated. Maybe it&#8217;s better to leave this to application developers, as PNUTS argued from real-world experience (i.e., online services do not require this functionality often enough to make it a part of the system).</p>
<p>Btw, I liked the summary of consistency models in Section 3.2, where the authors put causal+ in a partial order of different consistency models.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mosharaf.com/blog/2011/10/17/dont-settle-for-eventual-scalable-causal-consistency-for-wide-area-storage-with-cops/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>PNUTS: Yahoo!&#8217;s Hosted Data Serving Platform</title>
		<link>http://www.mosharaf.com/blog/2011/10/17/pnuts-yahoos-hosted-data-serving-platform-2/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=pnuts-yahoos-hosted-data-serving-platform-2</link>
		<comments>http://www.mosharaf.com/blog/2011/10/17/pnuts-yahoos-hosted-data-serving-platform-2/#comments</comments>
		<pubDate>Tue, 18 Oct 2011 04:09:17 +0000</pubDate>
		<dc:creator>Mosharaf</dc:creator>
				<category><![CDATA[Reviews]]></category>
		<category><![CDATA[per-record timeline consistency]]></category>
		<category><![CDATA[PNUTS]]></category>
		<category><![CDATA[UCB Cloud Computing Course F11]]></category>

		<guid isPermaLink="false">http://www.mosharaf.com/?p=1879</guid>
		<description><![CDATA[Yahoo! Research, &#8220;PNUTS: Yahoo!’s Hosted Data Serving Platform,&#8221; PVLDB, 2008. [PDF] Summary PNUTS is a scalable, highly available, and geographically distributed (but low latency) data store used by most Yahoo! online properties. To achieve both availability and partition tolerance, it uses a novel notion of consistency called per-record timeline consistency; under this model, all replicas of [...]]]></description>
			<content:encoded><![CDATA[<p class="alert">Yahoo! Research, &#8220;PNUTS: Yahoo!’s Hosted Data Serving Platform,&#8221; <em>PVLDB</em>, 2008. [<a href="http://research.yahoo.com/files/pnuts.pdf">PDF</a>]</p>
<h2>Summary</h2>
<p>PNUTS is a scalable, highly available, and geographically distributed (but low latency) data store used by most Yahoo! online properties. To achieve both availability and partition tolerance, it uses a novel notion of consistency called <em>per-record timeline consistency</em>; under this model, all replicas of a given record apply all updates to the record in the same order. However, it is applicable to only a single record, and hence, it is not suitable for transactions involving multiple records without considerable application logic. PNUTS also allows its users to switch to eventual consistency, which is easier to maintain and acceptable for many online services.</p>
<p>The PNUTS system is divided into regions, where each region contains a full complement of system components and a complete copy of each table. Data tables can be (horizontally) ordered or hash partitioned into groups of records called tablets. Each tablet is stored in a single server within a region. To read or update a record, the router first determines which tablet contains the record and which server hosts that tablet. Routers contain only a cached copy of the mapping, which is owned by the tablet controller. Routers periodically poll the tablet controller to get any changes to the mapping.</p>
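<p>The lookup a router performs can be sketched roughly as follows. This is not the PNUTS implementation: the <code>Router</code> class, its <code>refresh</code> method, the <code>hash_key</code> helper, and the server names are hypothetical, and the sketch assumes that both ordered and hash-partitioned tables reduce to a binary search over sorted tablet boundaries.</p>

```python
import bisect
import hashlib


class Router:
    """Hypothetical router holding a cached copy of the tablet mapping."""

    def __init__(self, boundaries, servers):
        self.refresh(boundaries, servers)

    def refresh(self, boundaries, servers):
        """Install a new mapping, as after polling the tablet controller."""
        # boundaries[i] is the smallest key that belongs to tablet i + 1,
        # so a table with n boundaries has n + 1 tablets.
        if len(servers) != len(boundaries) + 1:
            raise ValueError("need exactly one server per tablet")
        self.boundaries = list(boundaries)
        self.servers = list(servers)

    def lookup(self, key):
        """Return (tablet index, server) for a key via binary search."""
        tablet = bisect.bisect_right(self.boundaries, key)
        return tablet, self.servers[tablet]


def hash_key(key, space=1000):
    """Map a key into a fixed hash space for hash-partitioned tables."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % space


# Ordered table: tablets split the key range at "grape" and "peach".
ordered = Router(["grape", "peach"], ["server-a", "server-b", "server-c"])

# Hash-partitioned table: tablets split the hash space at 250, 500, 750.
hashed = Router([250, 500, 750], ["s0", "s1", "s2", "s3"])
```

<p>For example, <code>ordered.lookup("apple")</code> lands in the first tablet, while a hash-partitioned lookup first passes the key through <code>hash_key</code>. Keeping keys sorted within tablets is what makes range scans over an ordered table cheap.</p>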
<p>Underneath it all, the Yahoo! Message Broker, or YMB, is a pub/sub system that simultaneously acts as a write-ahead log to allow committing of updates and asynchronously replicates committed data to subscribers in other regions. A message is not purged from the YMB log until PNUTS has verified that the update has been applied to all replicas of the database. YMB provides partial ordering of published messages, meaning that messages published to a particular YMB cluster will be delivered to all subscribers in the order they were published; however, messages published to different YMB clusters may be delivered in any order. In order to provide record-level timeline consistency, PNUTS implements <em>record-level mastering</em>: each record has a master copy in some cluster, from which message propagation must start. Different records of the same table can have masters in different regions.</p>
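<p>The interaction between per-cluster ordering and record-level mastering can be illustrated with a toy model. Everything here is an assumption made for illustration: the <code>Table</code> class, the <code>replay</code> function, the region names, and the simplification that a replica consumes each region's log in one piece.</p>

```python
from collections import defaultdict


class Table:
    """Hypothetical table with one master region per record."""

    def __init__(self, masters):
        self.masters = dict(masters)   # record key -> master region
        self.logs = defaultdict(list)  # region -> FIFO log of (key, value)

    def write(self, origin_region, key, value):
        """Forward the write to the record's master, which publishes it.

        origin_region is where the client issued the write; it does not
        affect ordering, because only the master publishes the update.
        """
        master = self.masters[key]
        self.logs[master].append((key, value))
        return master


def replay(logs, region_order):
    """Apply the per-region logs on one replica in the given region order.

    Returns the per-record update history that the replica observed.
    """
    history = defaultdict(list)
    for region in region_order:
        for key, value in logs[region]:
            history[key].append(value)
    return dict(history)


table = Table({"alice": "west", "bob": "east"})
table.write("east", "alice", "v1")
table.write("west", "bob", "x1")
table.write("west", "alice", "v2")
table.write("east", "bob", "x2")

# Two replicas see the regions' logs in opposite orders.
replica_1 = replay(table.logs, ["west", "east"])
replica_2 = replay(table.logs, ["east", "west"])
```

<p>Both replicas end up with the same history for each record, because all updates to a record flow through a single ordered log. Nothing is guaranteed about the relative order of updates to different records, which is why multi-record transactions are out of scope.</p>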
<h2>Comments</h2>
<p>PNUTS trades off power for simplicity of design by offloading more complex tasks to application designers. Providing support for ordered tables is one of the key contributions of PNUTS, which allows it to implement efficient range operations. The notion of per-record timeline consistency is interesting, but it cannot support real transactions (the authors argue, however, that transactions involving multiple records are not that common). PNUTS also does not support complex queries.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mosharaf.com/blog/2011/10/17/pnuts-yahoos-hosted-data-serving-platform-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

