Tag Archives: Locking

NetLock Accepted to Appear at SIGCOMM’2020

High-throughput, low-latency lock managers are useful for building a variety of distributed applications. Traditionally, a key tradeoff in this context had been expressed in terms of the amount of knowledge available to the lock manager. On the one hand, a decentralized lock manager can increase throughput by parallelization, but it can starve certain categories of applications. On the other hand, a centralized lock manager can avoid starvation and impose resource sharing policies, but it can be limited in throughput. In SIGMOD’18, we presented DSLR that attempted to mitigate this tradeoff in clusters with fast RDMA networks by adapting Lamport’s bakery algorithm in the context of RDMA’s fetch-and-add (FA) operations to design a decentralized solution. The downside is that we couldn’t implement complex policies that need centralized information.

What if we could have a high-speed centralized point that all remote traffic must go through anyway? NetLock is our attempt at doing just that by implementing a centralized lock manager in a programmable switch by working at tandem with the servers. The co-design is important to go around the resource limitations of the switch. By carefully caching hot locks and moving warm and cold ones to the servers, we can meet both the performance and policy goals of a lock manager without significant compromise in either.

Lock managers are widely used by distributed systems. Traditional centralized lock managers can easily support policies between multiple users using global knowledge, but they suffer from low performance. In contrast, emerging decentralized approaches are faster but cannot provide flexible policy support. Furthermore, performance in both cases is limited by the server capability.

We present NetLock, a new centralized lock manager that co-designs servers and network switches to achieve high performance without sacrificing flexibility in policy support. The key idea of NetLock is to exploit the capability of emerging programmable switches to directly process lock requests in the switch data plane. Due to the limited switch memory, we design a memory management mechanism to seamlessly integrate the switch and server memory. To realize the locking functionality in the switch, we design a custom data plane module that efficiently pools multiple register arrays together to maximize memory utilization We have implemented a NetLock prototype with a Barefoot Tofino switch and a cluster of commodity servers. Evaluation results show that NetLock improves the throughput by 14.0–18.4×, and reduces the average and 99% latency by 4.7–20.3× and 10.4–18.7× over DSLR, a state-of-the-art RDMA-based solution, while providing flexible policy support.

Xin and I came up with the idea of this project over a couple meals in San Diego at OSDI’18, and later Zhuolong and Yiwen expanded and successfully executed our ideas that lead to NetLock. Similar to DSLR, NetLock explores a different design point in our larger memory disaggregation vision.

This year’s SIGCOMM probably has the highest acceptance rate in 25 years, if not more. After a long successful run at SIGCOMM and a small break doing many other exciting things, it’s great to be back to some networking research! But going forward, I’m hoping for much more along these lines both inside the network and at the edge.

DSLR Accepted to Appear at SIGMOD’2018

High-throughput, low-latency lock managers are useful for building a variety of distributed applications. A key tradeoff in this context can be expressed in terms of the amount of knowledge available to the lock manager. On the one hand, a decentralized lock manager can increase throughput by parallelization, but it can starve certain categories of applications. On the other hand, a centralized lock manager can avoid starvation and impose resource sharing policies, but it can be limited in throughput. DSLR is our attempt at mitigating this tradeoff in clusters with fast RDMA networks. Specifically, we adapt Lamport’s bakery algorithm in the context of RDMA’s fetch-and-add (FA) operations, which provides higher throughput, lower latency, and avoids starvation in comparison to the state-of-the-art.

Lock managers are a crucial component of modern distributed systems. However, with the increasing popularity of fast RDMA-enabled networks, traditional lock managers can no longer keep up with the latency and throughput requirements of modern systems. Centralized lock managers can ensure fairness and prevent starvation using global knowledge of the system, but are themselves single points of contention and failure. Consequently, they fall short in leveraging the full potential of RDMA networks. On the other hand, decentralized (RDMA-based) lock managers either completely sacrifice global knowledge to achieve higher throughput at the risk of starvation and higher tail latencies, or they resort to costly communications to maintain global knowledge, which can result in significantly lower throughput.

In this paper, we show that it is possible for a lock manager to be fully decentralized and yet exchange the partial knowledge necessary for preventing starvation and thereby reducing tail latencies. Our main observation is that we can design a lock manager using RDMA’s fetch-and-add (FA) operation, which always succeeds, rather than compare-and-swap (CAS), which only succeeds if a given condition is satisfied. While this requires us to rethink the locking mechanism from the ground up, it enables us to sidestep the performance drawbacks of the previous CAS-based proposals that relied solely on blind retries upon lock conflicts.

Specifically, we present DSLR (Decentralized and Starvation-free Lock management with RDMA), a decentralized lock manager that targets distributed systems with RDMA-enabled networks. We demonstrate that, despite being fully decentralized, DSLR prevents starvation and blind retries by providing first-come-first-serve (FCFS) scheduling without maintaining explicit queues. We adapt Lamport’s bakery algorithm [34] to an RDMA-enabled environment with multiple bakers, utilizing only one-sided READ and atomic FA operations. Our experiments show that DSLR delivers up to 2.8X higher throughput than all existing RDMA-based lock managers, while reducing their average and 99.9% latencies by up to 2.5X and 47X, respectively.

Barzan and I started this project with Dong Young in 2016 right after I joined Michigan, as our interests matched in terms of new and interesting applications of RDMA primitives. It’s exciting to see our work turn into my first SIGMOD paper. As we work on rack-scale/resource disaggregation over RDMA, we are seeing more exciting use cases of RDMA, going beyond key-value stores and designing new RDMA networking protocols. Stay tuned!