Tag Archives: Federated Learning

FedScale Accepted to Appear at ICML’2022

Although theoretical federated learning (FL) research is growing exponentially, we are far from putting those theories into practice. Over the course of the last few years, SymbioticLab has made significant progress in building deployable FL systems, with Oort being the most prominent example. As I discussed in the past, while evaluating Oort, we observed the weaknesses of the existing FL workloads/benchmarks: they are too small and sometimes too homogeneous to highlight the uncertainties that FL deployments would face in the real world. FedScale was born out of the necessity to evaluate Oort. As we worked on it, we added more and more datasets to create a diverse benchmark that not only contains workloads to evaluate FL but also traces to emulate real-world end-device characteristics. Eventually, we also started building a runtime that one can use to implement any FL algorithm within FedScale. For example, Oort can be implemented with a few lines in FedScale, as can more recent work like PyramidFL (MobiCom’22), which builds on Oort. This ICML paper gives an overview of the benchmarking aspects of FedScale for ML/FL researchers, while providing a quick intro to the systems runtime that we are continuously working on and plan to publish later this year.
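To make the "few lines" point concrete, here is a minimal sketch of what a pluggable participant-selection policy could look like. The class and method names are my illustrative assumptions, not FedScale's actual API; see fedscale.ai for the real interfaces.

```python
import random

# Hypothetical selector interface; FedScale's real API may differ.
class ClientSelector:
    def select(self, online_clients, num_participants):
        """Pick participants for the next training round."""
        raise NotImplementedError

class RandomSelector(ClientSelector):
    """Baseline: uniformly random selection."""
    def select(self, online_clients, num_participants):
        return random.sample(online_clients, num_participants)

class UtilitySelector(ClientSelector):
    """Oort-style idea: prefer clients whose data is most useful
    (e.g., high training loss) but penalize slow devices."""
    def __init__(self, utilities, durations, deadline):
        self.utilities = utilities  # client_id -> loss-based utility
        self.durations = durations  # client_id -> expected round duration (s)
        self.deadline = deadline    # developer-preferred round duration (s)

    def _score(self, client_id):
        # Clients slower than the deadline get proportionally down-weighted.
        penalty = min(1.0, self.deadline / self.durations[client_id])
        return self.utilities[client_id] * penalty

    def select(self, online_clients, num_participants):
        ranked = sorted(online_clients, key=self._score, reverse=True)
        return ranked[:num_participants]
```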

We present FedScale, a diverse set of challenging and realistic benchmark datasets to facilitate scalable, comprehensive, and reproducible federated learning (FL) research. FedScale datasets are large-scale, encompassing a wide range of important FL tasks, such as image classification, object detection, word prediction, speech recognition, and sequence prediction in video streaming. For each dataset, we provide a unified evaluation protocol using realistic data splits and evaluation metrics. To meet the pressing need for reproducing realistic FL at scale, we build an efficient evaluation platform to simplify and standardize the process of FL experimental setup and model evaluation. Our evaluation platform provides flexible APIs to implement new FL algorithms and to include new execution backends (e.g., mobile backends) with minimal developer effort. Finally, we perform systematic benchmark experiments on these datasets. Our experiments suggest fruitful opportunities in heterogeneity-aware co-optimization of system and statistical efficiency under realistic FL characteristics. FedScale will be open-source and actively maintained, and we welcome feedback and contributions from the community.

Fan and Yinwei had been working on FedScale for more than two years, with some help from Xiangfeng toward the end of Oort. During this time, Jiachen and Sanjay joined, first as users of FedScale and later as its contributors. Of course, Harsha is with us, as in all our past FL projects. Including this summer, close to 20 undergrads and master’s students have worked on/with/around it. At this point, FedScale has become the largest project in SymbioticLab, with interest from academic and industry users within and outside Michigan, and there is an active Slack channel where users from many different institutions collaborate. We are also organizing the first FedScale Summer School this year. Overall, FedScale reminds me of another small project called Spark that I was part of many years ago!

This is my/our first paper at ICML, or any ML conference for that matter, even though it’s not necessarily a core ML paper. This year, ICML received 5630 submissions. Among these, 1117 were accepted for short presentations and 118 for long ones, for a 21.94% acceptance rate; FedScale is one of the former. These numbers are mind-boggling for someone from the systems community like me!

Join us in making FedScale even bigger, better, and more useful, as a member of SymbioticLab or as a FedScale user/contributor. Now that we have the research vehicle, the possibilities are limitless. We are exploring maybe fewer than 10 such ideas, but hundreds are waiting for you.

Visit http://fedscale.ai/ to learn more.

Presented a Keynote Talk on FL Systems at DistributedML’21

This week, I presented a keynote talk at the DistributedML’21 workshop on our recent work on building software systems support for practical federated computation (Sol, Oort, and FedScale). This is a longer version of my talk at the Google FL workshop last month, and the updated slides go into more detail on our research on cross-silo and cross-device federated learning and analytics.

Presented a Keynote at 2021 Workshop on Federated Learning and Analytics

Recently, I presented a keynote talk at the 2021 federated learning and analytics workshop organized by Google on our recent work on building software systems support for practical federated computation (Sol, Oort, and FedScale).

I based my talk on the similarities and differences between software stacks for cloud systems and federated systems for learning and analytics. While we still want to perform similar computation (to some extent), the underlying network emerges as one of the biggest challenges for the latter. Because the wide-area network (WAN) has significantly lower bandwidth and higher latency than a datacenter network, federated software stacks have to be rethought with those constraints in mind. Small tweaks here and there are not enough.
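A back-of-the-envelope comparison makes the gap concrete. The numbers below are illustrative assumptions, not measurements from the talk:

```python
# Time to ship one model update (illustrative numbers, not measurements).
model_update_mb = 100       # size of one model update, in MB
wan_uplink_mbps = 10        # typical home uplink bandwidth
datacenter_gbps = 10        # typical intra-datacenter link

wan_seconds = model_update_mb * 8 / wan_uplink_mbps           # 80.0 s
dc_seconds = model_update_mb * 8 / (datacenter_gbps * 1000)   # 0.08 s
print(f"WAN: {wan_seconds:.2f} s, datacenter: {dc_seconds:.2f} s "
      f"({wan_seconds / dc_seconds:.0f}x slower)")
```

Three orders of magnitude, and that is before accounting for latency and variability.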

Federated learning and analytics systems also come in two broad flavors: cross-silo and cross-device. In the former, a few computationally powerful and reliable facilities, each with several or many powerful computation devices, are connected by the WAN. In the latter, a massive number of weak and unreliable devices (e.g., smartphones) take part in the computation. Naturally, cross-device solutions have to deal with additional challenges beyond the network: the devices have resource and battery constraints, and their owners may not always be connected and may have unique behavioral/charging patterns. How do we reason about learning and analytics under such uncertainty?
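As a toy illustration of that uncertainty (entirely my own sketch, not from the talk): if each selected device independently drops out before finishing a round, the system must over-select to hit its target number of updates per round.

```python
import random

def completed_updates(num_selected, dropout_prob, rounds=1000, seed=0):
    """Average number of devices that finish a round, under independent dropout."""
    rng = random.Random(seed)
    total = 0
    for _ in range(rounds):
        total += sum(1 for _ in range(num_selected) if rng.random() > dropout_prob)
    return total / rounds

# Selecting 100 devices with a 30% dropout rate yields ~70 usable updates,
# so hitting a target of 100 requires selecting ~143 devices up front.
print(completed_updates(num_selected=100, dropout_prob=0.3))
```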

While the first two topics focused on systems research, the third piece of my talk was about providing a service to my machine learning (ML) colleagues so that they can easily implement and evaluate large-scale federated systems. I believe that systems researchers will fail their ML counterparts if an ML person has to spend considerable time building and tinkering with systems instead of spending that time developing new ideas and algorithms. To this end, I talked about the challenges in building such a benchmarking dataset and experimental harness.

I want to thank Peter Kairouz and Marco Gruteser from Google for inviting me to the workshop. My slides are available here and have more details.

Oort Wins the Distinguished Artifact Award at OSDI’2021. Congrats Fan and Xiangfeng!

Oort, our federated learning system for scalable machine learning over millions of edge devices, has received the Distinguished Artifact Award at this year’s USENIX OSDI conference!

This is a testament to a lot of hard work put in by Fan and Xiangfeng over the course of the last couple of years. Oort is our first foray into federated learning, but it certainly is not the last.

Oort and its workloads (FedScale) are both open-source at https://github.com/symbioticlab.

FedScale Released on GitHub

Anyone working on federated learning (FL) has faced this problem at least once: you are reading two papers, and they either use very different datasets for performance evaluation or are unclear about their experimental assumptions about the runtime environment, or both. They often deal with very small datasets as well. There have been attempts at solutions too, resulting in many FL benchmarks. While working on Oort, we faced the same problems. Unfortunately, none of the existing benchmarks fit our requirements, so we had to create our own.

We present FedScale, a diverse set of challenging and realistic benchmark datasets to facilitate scalable, comprehensive, and reproducible federated learning (FL) research. FedScale datasets are large-scale, encompassing a diverse range of important FL tasks, such as image classification, object detection, language modeling, speech recognition, and reinforcement learning. For each dataset, we provide a unified evaluation protocol using realistic data splits and evaluation metrics. To meet the pressing need for reproducing realistic FL at scale, we have also built an efficient evaluation platform to simplify and standardize the process of FL experimental setup and model evaluation. Our evaluation platform provides flexible APIs to implement new FL algorithms and to include new execution backends with minimal developer effort. Finally, we perform in-depth benchmark experiments on these datasets. Our experiments suggest that FedScale presents significant challenges in heterogeneity-aware co-optimization of system and statistical efficiency under realistic FL characteristics, indicating fruitful opportunities for future research. FedScale is open-source with permissive licenses and actively maintained, and we welcome feedback and contributions from the community.
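As an aside on what "realistic data splits" means here: instead of randomly shuffling a dataset across simulated clients, examples are grouped by the real user or device that generated them, which naturally produces non-IID partitions. A minimal sketch of the idea (my own illustration, not FedScale code):

```python
from collections import defaultdict

def partition_by_client(examples):
    """examples: iterable of (client_id, example) pairs from raw metadata."""
    clients = defaultdict(list)
    for client_id, example in examples:
        clients[client_id].append(example)
    # Naturally non-IID: per-client sizes and label mixes vary widely.
    return clients

data = [("u1", "x1"), ("u2", "x2"), ("u1", "x3"), ("u3", "x4")]
print({cid: len(exs) for cid, exs in partition_by_client(data).items()})
# {'u1': 2, 'u2': 1, 'u3': 1}
```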

You can read up on the details in our paper and check out the code on GitHub. Do try it out and contribute, so that together we can build a large-scale benchmark that considers both data and system heterogeneity across a variety of application domains.

Fan, Yinwei, and Xiangfeng have put in a tremendous amount of work over almost two years to get to this point, and I’m super excited about its future.

Oort Accepted to Appear at OSDI’2021

Oort’s working title was Kuiper.

With the wide deployment of AI/ML in our daily lives, the need for data privacy has been receiving more attention in recent years. Federated Learning (FL) is an emerging subfield of machine learning that focuses on in-situ processing of data wherever it is generated. This is only going to become more important as regulations around data movement (e.g., GDPR, CCPA) become even more restrictive. Although there have already been a large number of FL algorithms from the ML community and some FL deployments by large companies, systems support for FL is somewhat non-existent. Oort is our effort to build the first open-source FL system that allows FL developers to select participants for their training in an informed manner instead of selecting them at random. In the process, we have also collected the largest public dataset for FL, which we plan to open source in the near future.

Federated Learning (FL) is an emerging direction in distributed machine learning (ML) that enables in-situ model training and testing on edge data. Despite having the same end goals as traditional ML, FL executions differ significantly in scale, spanning thousands to millions of participating devices. As a result, data characteristics and device capabilities vary widely across clients. Yet, existing efforts randomly select FL participants, which leads to poor model and system efficiency.

In this paper, we propose Oort to improve the performance of federated training and testing with guided participant selection. With an aim to improve time-to-accuracy performance in model training, Oort prioritizes the use of those clients who have both data that offers the greatest utility in improving model accuracy and the capability to run training quickly. To enable FL developers to interpret their results in model testing, Oort enforces their requirements on the distribution of participant data while improving the duration of federated testing by cherry-picking clients. Our evaluation shows that, compared to existing participant selection mechanisms, Oort improves time-to-accuracy performance by 1.2X-14.1X and final model accuracy by 1.3%-9.8%, while efficiently enforcing developer-specified model testing criteria at the scale of millions of clients.
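To illustrate the testing side of that abstract, here is a greedy sketch of cherry-picking fast clients while meeting a developer-specified per-category data requirement. The strategy and all names below are my simplification, not Oort's actual algorithm; see the paper for the real one.

```python
def select_testers(clients, required):
    """clients: list of (client_id, speed, {category: num_samples}).
    required: {category: min_total_samples} the developer demands."""
    remaining = dict(required)
    chosen = []
    # Prefer fast clients so the federated testing round finishes sooner.
    for client_id, speed, samples in sorted(clients, key=lambda c: -c[1]):
        if all(need <= 0 for need in remaining.values()):
            break  # every category requirement is already satisfied
        if any(samples.get(cat, 0) > 0 and need > 0
               for cat, need in remaining.items()):
            chosen.append(client_id)
            for cat, n in samples.items():
                if cat in remaining:
                    remaining[cat] -= n
    return chosen

clients = [("a", 9.0, {"cat": 50, "dog": 10}),
           ("b", 5.0, {"dog": 80}),
           ("c", 1.0, {"cat": 100})]
print(select_testers(clients, {"cat": 60, "dog": 60}))  # ['a', 'b', 'c']
```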

Fan and Allen had been working on Oort since the summer of 2019, and it’s been a great learning experience for me. As always, it’s been a pleasure to collaborate with Harsha, and I look forward to many more collaborations in the future. Over the past two years, many others have joined Fan in our efforts toward providing systems support for federated learning and analytics, with many exciting results at different stages of the pipeline, focusing on cloud/edge/WAN challenges. It’s only going to become more exciting!

This is the first OSDI in an odd year, as OSDI moves to a yearly cadence. Although the number of submissions is lower than in the past, that is likely only due to the late announcement; serving on my first OSDI PC, I think the quality of the submitted and accepted papers remains as high as ever. Overall, the OSDI PC accepted 31 out of 165 submissions.