Oort Accepted to Appear at OSDI’2021

Oort’s working title was Kuiper.

With the wide deployment of AI/ML in our daily lives, the need for data privacy has received growing attention in recent years. Federated Learning (FL) is an emerging sub-field of machine learning that focuses on in-situ processing of data wherever it is generated. This is only going to become more important as regulations around data movement (e.g., GDPR, CCPA) become even more restrictive. Although the ML community has already produced a large number of FL algorithms and large companies have some FL deployments, systems support for FL is largely non-existent. Oort is our effort to build the first open-source FL system that allows FL developers to select participants for their training in an informed manner instead of selecting them at random. In the process, we have also collected the largest public dataset for FL, which we plan to open source in the near future.

Federated Learning (FL) is an emerging direction in distributed machine learning (ML) that enables in-situ model training and testing on edge data. Despite having the same end goals as traditional ML, FL executions differ significantly in scale, spanning thousands to millions of participating devices. As a result, data characteristics and device capabilities vary widely across clients. Yet, existing efforts randomly select FL participants, which leads to poor model and system efficiency.

In this paper, we propose Oort to improve the performance of federated training and testing with guided participant selection. With an aim to improve time-to-accuracy performance in model training, Oort prioritizes the use of those clients who have both data that offers the greatest utility in improving model accuracy and the capability to run training quickly. To enable FL developers to interpret their results in model testing, Oort enforces their requirements on the distribution of participant data while improving the duration of federated testing by cherry-picking clients. Our evaluation shows that, compared to existing participant selection mechanisms, Oort improves time-to-accuracy performance by 1.2X-14.1X and final model accuracy by 1.3%-9.8%, while efficiently enforcing developer-specified model testing criteria at the scale of millions of clients.
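To make the idea of guided participant selection for training concrete, here is a minimal Python sketch. It is not Oort's implementation: the `Client` fields, the `client_score` formula, the `alpha` penalty exponent, and the `preferred_seconds` threshold are all assumptions chosen for illustration. It only shows the general shape of ranking clients by a score that rewards useful data and penalizes slow devices, rather than sampling at random.

```python
from dataclasses import dataclass


@dataclass
class Client:
    client_id: str
    data_utility: float   # hypothetical score: higher means the data helps accuracy more
    round_seconds: float  # hypothetical estimate of how long one training round takes


def client_score(client, preferred_seconds, alpha=2.0):
    """Combine data utility with a penalty for slow clients (illustrative assumption)."""
    if client.round_seconds <= preferred_seconds:
        return client.data_utility
    # Clients slower than the preferred duration keep only a fraction of their utility.
    return client.data_utility * (preferred_seconds / client.round_seconds) ** alpha


def select_participants(clients, k, preferred_seconds, alpha=2.0):
    """Pick the k highest-scoring clients instead of sampling at random."""
    ranked = sorted(
        clients,
        key=lambda c: client_score(c, preferred_seconds, alpha),
        reverse=True,
    )
    return ranked[:k]


clients = [
    Client("a", data_utility=10.0, round_seconds=5.0),
    Client("b", data_utility=12.0, round_seconds=20.0),
    Client("c", data_utility=4.0, round_seconds=2.0),
    Client("d", data_utility=9.0, round_seconds=8.0),
]
chosen = select_participants(clients, k=2, preferred_seconds=10.0)
print([c.client_id for c in chosen])  # prints ['a', 'd']
```

In this toy example, client "b" has the most useful data but is too slow, so its score drops below those of "a" and "d". A practical selector would also have to estimate both quantities online and occasionally try clients it has not yet observed, which this sketch leaves out.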

Fan and Allen have been working on Oort since the summer of 2019, and it’s been a great learning experience for me. As always, it’s been a pleasure to collaborate with Harsha, and I look forward to many more collaborations in the future. Over the past two years, many others have joined Fan in our efforts toward providing systems support for federated learning and analytics, with many exciting results at different stages of the pipeline, focusing on cloud/edge/WAN challenges. It’s only going to become more exciting!

This is the first OSDI in an odd year as OSDI moves to a yearly cadence. Although the number of submissions is lower than in past years, that is likely due only to the late announcement; having served on the OSDI PC for the first time, I think the quality of the submitted and accepted papers remains as high as ever. Overall, the OSDI PC accepted 31 out of 165 submissions.