Update 2: Varys Developer Alpha released!
Update 1: Latest version
will be is available here soon now!
We introduced the coflow abstraction back in 2012 at HotNets-XI, and now we have something solid to show what coflows are good for. Our paper on efficient and deadline-sensitive coflow scheduling has been accepted to appear at this year’s SIGCOMM. By exploiting a little bit of application-level information, coflows enable significantly better performance for distributed data-intensive jobs. In the process, we have discovered, characterized, and proposed heuristics for a novel scheduling problem: “Concurrent Open Shop Scheduling with Coupled Resources.” Yeah, it’s a mouthful; but hey, how often does one stumble upon a new scheduling problem?
Communication in data-parallel applications often involves a collection of parallel flows. Traditional techniques to optimize flow-level metrics do not perform well in optimizing such collections, because the network is largely agnostic to application-level requirements. The recently proposed coflow abstraction bridges this gap and creates new opportunities for network scheduling. In this paper, we address inter-coflow scheduling for two different objectives: decreasing communication time of data-intensive jobs and guaranteeing predictable communication time. We introduce the concurrent open shop scheduling with coupled resources problem, analyze its complexity, and propose effective heuristics to optimize either objective. We present Varys, a system that enables data-intensive frameworks to use coflows and the proposed algorithms while maintaining high network utilization and guaranteeing starvation freedom. EC2 deployments and trace-driven simulations show that communication stages complete up to 3.16X faster on average and up to 2X more coflows meet their deadlines using Varys in comparison to per-flow mechanisms. Moreover, Varys outperforms non-preemptive coflow schedulers by more than 5X.
This is a joint work with Yuan Zhong and Ion Stoica. But as always, it took a village! I’d like to thank Mohammad Alizadeh, Justine Sherry, Peter Bailis, Rachit Agarwal, Ganesh Ananthanarayanan, Tathagata Das, Ali Ghodsi, Gautam Kumar, David Zats, Matei Zaharia, the NSDI and SIGCOMM reviewers, and everyone else who patiently read many drafts or listened to my spiel over the years. It’s been in the oven for many years, and I’m glad that it was so well received.
I believe coflows are the future, and they are coming for the good of the
realm network. Varys is only the tip of the iceberg, and there are way too many interesting problems to solve (both for network and theory folks!). We are in the process of open sourcing Varys in coming weeks and look forward to seeing it used with major data-intensive frameworks. Personally, I’m happy how my dissertation research is shaping up :)
This year the SIGCOMM PC accepted 45 papers out of 237 submissions with a 18.99% acceptance rate. This is the highest acceptance rate since 1995 and the highest number of papers accepted ever (tying with SIGCOMM’86)! After years of everyone complaining about the acceptance rate, this year’s PC have taken some action; probably a healthy step for the community.