A. Qureshi, R. Weber, H. Balakrishnan, J. Guttag, B. Maggs, “Cutting the Electric Bill for Internet-Scale Systems,” ACM SIGCOMM Conference, (August 2009).
Large organizations like Google, Microsoft, and Yahoo! each consume tens of millions of dollars' worth of electricity annually. The traditional approach of cutting energy costs by cutting energy consumption has had limited success. The authors of this paper propose a new method based on two key observations: first, electricity prices exhibit both temporal and geographical variation; and second, large distributed systems already incorporate request routing and replication. Based on these observations, they posit that cost-aware routing of computation to regions with low electricity prices can significantly reduce annual electricity expenses without increasing other costs (e.g., bandwidth). It should be noted that the paper deals with reducing energy cost, not energy consumption.
Electricity is generally produced far from where it is consumed and then transmitted to consumer locations. The whole process is overseen by Regional Transmission Organizations, or RTOs (there are eight reliability regions in the US). RTOs use auctioning mechanisms to match buyers with sellers in multiple parallel wholesale markets such as the day-ahead, hour-ahead, and real-time markets. This paper is concerned only with the real-time market and relies on its price variation over time and location. Through an empirical study of average prices from January 2006 to April 2009, the authors found significant uncorrelated variation of prices across time, geographic locations, and different wholesale markets. These variations are due to differences in the source of electricity and to time differences between regions (so that peak hours fall at different absolute times).
Routing computation to different, possibly more distant, locations can increase both client-perceived latency and bandwidth cost. Based on their study of Akamai data, the authors found that for Akamai-like large systems these problems are manageable, i.e., they do have some impact, but the total operating cost still decreases. In the end, they provide a simple strategy that reduces energy cost by moving computation to the lowest-priced location within a certain radius of the original location.
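To make the strategy concrete, here is a minimal sketch of price-aware routing with a distance cap. It is my own illustration, not the authors' code: the `Cluster` type, the `pick_cluster` function, the coordinates, and the prices are all made up, and it ignores capacity limits and bandwidth billing entirely. It also falls back to the nearest cluster when nothing lies within the radius, which is an assumption on my part.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0


@dataclass(frozen=True)
class Cluster:
    name: str
    lat: float
    lon: float
    price: float  # hypothetical real-time price in $/MWh


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two (lat, lon) points, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def pick_cluster(client_lat, client_lon, clusters, radius_km=1500.0):
    """Return the cheapest cluster within radius_km of the client.

    If no cluster is within the radius, fall back to the nearest one
    (an assumption of this sketch).
    """
    by_distance = [
        (great_circle_km(client_lat, client_lon, c.lat, c.lon), c) for c in clusters
    ]
    in_range = [c for d, c in by_distance if d <= radius_km]
    if in_range:
        return min(in_range, key=lambda c: c.price)
    return min(by_distance, key=lambda pair: pair[0])[1]


# Made-up prices, for illustration only.
clusters = [
    Cluster("Boston", 42.36, -71.06, 60.0),
    Cluster("Chicago", 41.88, -87.63, 35.0),
    Cluster("Palo Alto", 37.44, -122.14, 20.0),
]

# A client near New York City with the 1500 km radius.
print(pick_cluster(40.71, -74.01, clusters).name)
```

With these invented numbers, the New York client is sent to Chicago: Palo Alto is cheaper but far outside the radius, and Boston is closer but more expensive. Shrinking the radius to 500 km would send the same client to Boston.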
The results presented in this paper depend heavily on the energy proportionality of data centers/clusters, i.e., the property that energy consumption is proportional to load. Unfortunately, that is not the case in today’s hardware, but the authors hope for something like that in the future. Still, they did find significant temporal and geographic variation in prices to exploit, and through trace-based simulation they showed that 40% savings can be achieved in an ideal energy-proportional scenario. On real hardware, however, total savings drop to only ~5-6%.
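Why energy proportionality matters so much can be seen from a simple linear power model. The sketch below is my own illustration, not the model used in the paper: the function names and the 60% idle figure are hypothetical. The point is that only the load-dependent part of the power draw follows the requests when they are rerouted; the idle part is paid wherever the machines sit.

```python
def power_draw(utilization, peak_watts, idle_fraction):
    """Linear power model: idle power plus a load-proportional part.

    idle_fraction is the share of peak power drawn at zero load
    (0.0 would be perfectly energy-proportional hardware).
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be in [0, 1]")
    if not 0.0 <= idle_fraction <= 1.0:
        raise ValueError("idle_fraction must be in [0, 1]")
    idle_watts = idle_fraction * peak_watts
    return idle_watts + (peak_watts - idle_watts) * utilization


def load_dependent_share(utilization, idle_fraction):
    """Fraction of the current power draw that actually tracks load.

    Only this share can be shifted to a cheaper region by rerouting.
    """
    total = power_draw(utilization, 1.0, idle_fraction)
    if total == 0.0:
        return 0.0
    return (total - idle_fraction) / total


# Hypothetical numbers: a cluster at 50% utilization.
print(load_dependent_share(0.5, 0.0))  # perfectly proportional hardware
print(load_dependent_share(0.5, 0.6))  # hardware that idles at 60% of peak
```

Under this toy model, all of the power draw is movable on perfectly proportional hardware, while only a quarter of it is movable when the machines idle at 60% of peak. That gap is in the same spirit as the drop in savings the authors report for real hardware, though their numbers come from their own model and traces, not from this sketch.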
The authors have presented an excellent overview of the contemporary electricity economy, and they have obviously done a great job in shedding new light on a (sort of) well-established problem.
While the paper is excellent in its observations and measurements, the solution is straightforward and rests on several simplifying assumptions. One can imagine a multi-variable optimization problem here that considers energy, bandwidth, and other costs together, subject to multiple constraints on latency, bandwidth, etc. On the other hand, such optimization problems are really hard to solve in real time, and people may well end up using the straightforward approach anyway.
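One way such a problem might be written down is sketched here. This is purely my own formulation, not something from the paper, and every symbol is an assumption: x_{ij} is the fraction of demand from client region i sent to cluster j, \lambda_i is that region's demand, p_j(t) is the electricity price at cluster j at time t, E_j and B_j are the energy and bandwidth cost functions of cluster j, C_j is its capacity, d_{ij} is the client-to-cluster distance (as a stand-in for latency), and D is the distance cap.

```latex
\begin{align*}
\min_{x}\quad
  & \sum_{j} p_j(t)\, E_j\!\Big(\sum_{i} \lambda_i x_{ij}\Big)
    + \sum_{j} B_j\!\Big(\sum_{i} \lambda_i x_{ij}\Big) \\
\text{subject to}\quad
  & \sum_{j} x_{ij} = 1 && \forall i \\
  & \sum_{i} \lambda_i x_{ij} \le C_j && \forall j \\
  & x_{ij} = 0 && \text{whenever } d_{ij} > D \\
  & x_{ij} \ge 0 && \forall i, j
\end{align*}
```

If E_j and B_j were linear this would be a linear program, but percentile-based bandwidth billing and non-proportional power curves would make both terms awkward, and the whole thing has to be re-solved as p_j(t) changes.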
The choice of 1500km as the radius seemed pretty arbitrary. The authors did try to justify the number by some significant jumps in cost and distance at that value, but I did not find it very convincing.
Also, one might ask: if such measures do save cost, why is no one using them? A possible reason is that the overhead of implementing something like this in practice outweighs its savings. But this is only a first step in this direction; there might be ways to find a practical solution in the future.
Also, it seemed to me that the paper does not have much networking or communication content per se. Anyhow, it was an interesting read.