
Spark has been accepted at NSDI’2012

Our paper “Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing” has been accepted at NSDI’2012. This is Matei’s brainchild and joint work with many people including, but not limited to, TD, Ankur, Justin, Murphy, and professors Ion Stoica, Scott Shenker, and Michael Franklin. Unlike the systems in many other papers, Spark is actively developed and used by many people. You can also download it and use it in no time to solve all your problems; well, at least the ones that require analyzing big data in little time. In this paper, we focus on the concept of resilient distributed datasets, or RDDs, and show how they enable fast, in-memory iterative and interactive jobs with low-overhead fault tolerance.

We present Resilient Distributed Datasets (RDDs), a distributed memory abstraction that lets programmers perform in-memory computations on large clusters in a fault-tolerant manner. RDDs are motivated by two types of applications that current computing frameworks handle inefficiently: iterative algorithms and interactive data mining tools. In both cases, keeping data in memory can improve performance by an order of magnitude. To achieve fault tolerance efficiently, RDDs provide a restricted form of shared memory, based on coarse-grained transformations rather than fine-grained updates to shared state. However, we show that RDDs are expressive enough to capture a wide class of computations, including current specialized programming models for iterative jobs like Pregel. We have implemented RDDs in a system called Spark, which we evaluate through a variety of benchmarks and user applications.
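For a taste of what this looks like in practice, here is a minimal sketch of RDD code in Spark’s Scala API, in the spirit of the log-mining example the paper walks through (the HDFS path is a placeholder, and `sc` stands in for a SparkContext):

```scala
// Load error lines from a log file into a fault-tolerant in-memory dataset.
val lines  = sc.textFile("hdfs://...")            // RDD backed by a file in HDFS
val errors = lines.filter(_.startsWith("ERROR"))  // coarse-grained transformation
errors.cache()                                    // keep the result in cluster memory

// Subsequent actions reuse the cached dataset instead of rereading from disk;
// lost partitions can be recomputed from the lineage of transformations above
// rather than restored from checkpoints.
errors.count()
errors.filter(_.contains("MySQL")).count()
```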

The NSDI’2012 PC accepted 30 out of 169 papers. In other news, Berkeley will have a big presence at NSDI this time with several other papers. Go Bears!!!

Presented Orchestra at SIGCOMM’2011

I’m attending my second SIGCOMM and had the privilege of giving my first talk at the flagship networking conference. I presented Orchestra, and the talk was very well attended even though it was the last one of the day at 6PM. I’d like to thank everyone for showing up and for the lively Q&A session at the end. Now that the talk is over, I can enjoy the rest of the conference in a more relaxed fashion.

The slides for the talk are available here.

Technical report on Spark is available online

A technical report describing the key concepts behind Spark is now available online. The abstract follows:

We present Resilient Distributed Datasets (RDDs), a distributed memory abstraction that allows programmers to perform in-memory computations on large clusters while retaining the fault tolerance of data flow models like MapReduce. RDDs are motivated by two types of applications that current data flow systems handle inefficiently: iterative algorithms, which are common in graph applications and machine learning, and interactive data mining tools. In both cases, keeping data in memory can improve performance by an order of magnitude. To achieve fault tolerance efficiently, RDDs provide a highly restricted form of shared memory: they are read-only datasets that can only be constructed through bulk operations on other RDDs. However, we show that RDDs are expressive enough to capture a wide class of computations, including MapReduce and specialized programming models for iterative jobs such as Pregel. Our implementation of RDDs can outperform Hadoop by 20x for iterative jobs and can be used interactively to search a 1 TB dataset with latencies of 5-7 seconds.
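To make the “iterative algorithms” case concrete, here is a sketch of logistic regression in Spark’s Scala API, adapted from the kind of example our papers use; `sc` stands in for a SparkContext, and the line format, `parsePoint` helper, and constants are assumptions for illustration:

```scala
case class Point(x: Array[Double], y: Double)

// Hypothetical parser, assuming one "label f1 f2 ... fD" record per line.
def parsePoint(line: String): Point = {
  val parts = line.split(' ').map(_.toDouble)
  Point(parts.tail, parts.head)
}

val D = 10           // number of features (assumed)
val ITERATIONS = 10

// The dataset is parsed once and cached in memory across all iterations.
val points = sc.textFile("hdfs://.../points").map(parsePoint).cache()

var w = Array.fill(D)(math.random)   // random initial weights
for (i <- 1 to ITERATIONS) {
  // One coarse-grained pass over the cached RDD computes the full gradient.
  val gradient = points.map { p =>
    val margin = (w, p.x).zipped.map(_ * _).sum
    val scale  = (1.0 / (1.0 + math.exp(-p.y * margin)) - 1.0) * p.y
    p.x.map(_ * scale)
  }.reduce((a, b) => (a, b).zipped.map(_ + _))
  w = (w, gradient).zipped.map(_ - _)              // gradient descent step
}
```

Because `points` stays in memory, each iteration avoids rereading and reparsing the input, which is where the order-of-magnitude speedups over on-disk data flow systems come from.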

You can also download the 0.3 release of Spark and read the corresponding release notes on the Spark download page.

Spark’s in the wild

We have been working on the Spark cluster computing framework for the last couple of years. It has always been open source under the BSD license on GitHub. But yesterday, during the AMPLab summer retreat at Chaminade, Santa Cruz, Matei announced the official launch of the Spark website (spark-project.org) and mailing lists, along with its 0.2 release.

Head over to the website to learn more, download it, and solve real-world problems (e.g., spam filtering, natural language processing, and traffic estimation) lightning fast! It will only get better as more people use it and contribute back.

Presented Orchestra at LBNL

Today I presented Orchestra for the first time in front of a crowd outside our lab. Taghrid Samak kindly invited me to LBNL’s Computing Sciences Seminar after we caught up over lunch last week for the first time in a year. She is currently a postdoctoral fellow with the Advanced Computing for Science group.

Overall, the talk went very well, with some interesting questions. We might even end up collaborating on future extensions to some pieces of Orchestra. Hot stuff!

Orchestra has been accepted at SIGCOMM’2011

Update: The camera-ready version of the paper will be available on the publications page very soon!

Our paper “Managing Data Transfers in Computer Clusters with Orchestra” has been accepted at SIGCOMM’2011. This is joint work with Matei, Justin, and professors Mike Jordan and Ion Stoica. The project started as part of Spark and is now quickly expanding to stand on its own and support other data-intensive frameworks (e.g., Hadoop, Dryad). We also believe that interfacing Orchestra with Mesos will enable better network sharing between concurrently running frameworks in data centers.

Cluster computing applications like MapReduce and Dryad transfer massive amounts of data between their computation stages. These transfers can have a significant impact on job performance, accounting for more than 50% of job completion times. Despite this impact, there has been relatively little work on optimizing the performance of these data transfers. In this paper, we propose a global management architecture and a set of algorithms that improve the transfer times of common communication patterns, such as broadcast and shuffle, and allow one to prioritize a transfer over other transfers belonging to the same application or different applications. Using a prototype implementation, we show that our solution improves broadcast completion times by up to 4.5x compared to the status quo implemented by Hadoop. Furthermore, we show that transfer-level scheduling can reduce the completion time of high-priority transfers by 1.7x.
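As a toy illustration of the transfer-prioritization idea (this is not Orchestra’s actual code; the names and the simple proportional-share policy are a sketch), consider dividing a shared link’s bandwidth among active transfers in proportion to per-transfer weights:

```scala
// Each transfer gets a share of the link proportional to its weight, so a
// high-priority transfer finishes sooner than under plain fair sharing.
case class Transfer(name: String, weight: Double, remainingMB: Double)

def allocate(linkMBps: Double, transfers: Seq[Transfer]): Map[String, Double] = {
  val active = transfers.filter(_.remainingMB > 0)
  val totalWeight = active.map(_.weight).sum
  active.map(t => t.name -> linkMBps * t.weight / totalWeight).toMap
}

// A weight-3 high-priority shuffle gets 75 MB/s of a 100 MB/s link,
// while a weight-1 background broadcast gets the remaining 25 MB/s.
val rates = allocate(100.0, Seq(
  Transfer("highPriorityShuffle", 3.0, 500),
  Transfer("backgroundBroadcast", 1.0, 500)
))
println(rates)
```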

The paper has been well received so far, and we got great feedback from the anonymous reviewers that will further strengthen it. Hopefully, you will like it too :)

For those interested in stats: SIGCOMM accepted 32 out of 223 submissions this year.

Anyway, it’s Friday and we so excited!

Spark short paper has been accepted at HotCloud’10

An initial overview of our ongoing work on Spark, an iterative and interactive framework for cluster computing, has been accepted at HotCloud’10. I joined the project last February, while Matei has been working on it since last Fall. I will upload the paper to the publications page once we have taken care of the reviewer comments and suggestions; meanwhile, you can read the technical report version.

This year HotCloud accepted 18 papers (24% of those submitted), and the PC is thinking about extending the workshop to a second day starting next year.