Tag Archives: Floyd

Scalable Reliable Multicast

S. Floyd, V. Jacobson, S. McCanne, C-G Liu, L. Zhang, “A Reliable Multicast Framework for Light-weight Sessions and Application Level Framing,” ACM SIGCOMM Conference, (August 1995). [PDF]


Unlike unicast transmissions where requirements for different reliable, sequential data delivery are fairly general, different multicast applications have different requirements for reliability. The authors argue that designing a one-size-fits-all multicast protocol – that can simultaneously meet the functionality, scalability and efficiency requirements of all applications – is not possible. In this paper, they propose Scalable Reliable Multicast (SRM), a framework for supporting diverse multicast requirements in different applications by integrating application semantics (similar to Application Level Framing (ALF)) and session semantics (similar to Light-Weight Sessions (LWS)).

In order to ensure reliability, some party must take responsibility for loss detection and recovery. In TCP, the sender is that responsible party. However, in a multicast scenario, the sender will face ACK explosion from all the receivers in the multicast domain. In addition, the sender will have to keep track of the changing set of active receivers, corresponding windows. RTT estimation algorithms as well as concepts like congestion window also require rethinking. The authors argue that receiver-based reliability, where responsibilities are distributed among many receivers instead of being piled up on the single sender , is a better solution in the multicast domain. Another design choice is to promote the use of naming in application data units (ADUs) instead of generic protocol-specific naming; ADUs ensure that application semantics are intact and provides a level of indirection that can be exploited for anonymity purposes.

SRM is discussed using a distributed whiteboard (wb) application. The whiteboard separates drawing into pages. Each member can create a page and any member can manipulate any page. Each member has a globally unique identifier, so does each page. Names and identifiers are persistent and refers to the same entity over time. The authors also assume the presence of IP multicast datagram delivery and do not separate senders from receivers. Also, wb has no requirement for ordered data delivery because the operations are idempotent and they can be performed in any order.

Whenever new data is generated by wb, it is multicast to the group. Each member of the group individually keep track of their losses (detected by gaps in sequence numbers) and retransmissions. Each member sends low-rate, periodic, session messages to the group announcing the highest sequence number they received. The same messages also carry timestamps that are used to estimate distances between members. When receiver(s) detects missing data, they wait for a random amount of time (using a request timer) before sending a repair request. The wait time is a function of their distance from the source. Repair requests and retransmissions are multicast to the whole group, which suppress repair request implosion (from other receivers who also didn’t receive the same data). If a host receives a request for the same data before its request timer goes off, it resets the request timer after doing an exponential backoff. Any host with the requested data can reply back, but it also waits a random amount of time (using a repair timer) based on its distance from the requester. Similar to the previous case, this prevents response implosion. If a host receives the repair before its repair timer goes off, it cancels the timer.

The authors have analyzed the framework using three example topologies: chains, which explain deterministic suppression; stars, which explore probabilistic suppression; and bounded degree trees, which combine both the suppression mechanisms. Based on the analysis, they have come up with some parameters and observations that are best described in the paper. In short, by tuning those parameters the framework can be tuned to different application requirements.

Finally, SRM is extended to support adaptive adjustment of random timer algorithms based on the past behavior of request/repair algorithms. Simulations show that the adaptive algorithms quickly reduce the average number of repairs with little penalty in additional delay, and they are effective in controlling the number of duplicates over a range of scenarios.


The ideas presented in the paper are simple but intuitively effective. The best part would be the analysis of the framework using three different example topologies. The biggest complain about this paper, as well as about some others around the same time period, would be the absence of any axis or legends whatsoever in the graphs, which is very very annoying for the readers. Another observation is the extensive use of simulations.

There seems to be a lot of similarities between SRM and opportunistic routing protocols in the wireless networks. I wonder whether and which aspects of the opportunistic protocols can be used in a multicast scenario similar to SRM or vice versa.

Modeling Wireless Links for Transport Protocols

Andrei Gurtov, Sally Floyd, “Modeling Wireless Links for Transport Protocols,” ACM SIGCOMM Computer Communications Review, Volume 34, Number 2, (April 2004). [PDF]


Intrinsic characteristics of wireless links (e.g., variable bandwidth, bit corruption, channel allocation delays, and asymmetry) significantly affect performance of transport protocols. In addition, different types of wireless links show noticeable variance in different characteristics. This paper summarizes the authors’ experiences in creating realistic simulation models for three different types of wireless links (cellular, WLAN, and satellite links) to evaluate the effect of link level mechanisms on end-to-end transport protocols.

The authors point out that most of the existing models are either not realistic enough to capture the characteristics of different types of wireless links; or realistic but fails to cover large parameter space required to better understand different characteristics; or overly realistic and ignore the underlying dynamics; or lack reproducibility. Their objective in this paper is to find a balance between reality, generality, and detail.

This paper discusses six different link characteristics to consider for simulating  wireless links and provides plausible models for each of them:

Link Characteristics How to model?
Error losses and corruption Drop packets according to a per-packet, per-bit, or time-based probability
Delay variation Suspend data transmission on a simulated link
Packet reordering Either swap two packets in a queue, or delay one packet and let others pass
On-demand resource allocation Include additional delay when a packet arrives to a queue that has been empty longer than the channel hold time
Bandwidth variation Change bandwidth of a link in simulation
Asymmetry in bandwidth and latency Configure up- and down-link with different values

In addition, the paper also discusses modeling of queue management and effects of mobility. For cellular and current WLAN links, queue management can be modeled by using a Drop-Tail buffer with configurable maximum size in packets; and for satellite and future cellular and WLAN links, using RED is advised to be more effective. Mobility modeling require adequate definitions of the underlying mechanisms and depend on specific scenarios (intersystem vs intrasystem).


This paper presents a good survey of possible issues to take care of while modeling wireless links (of different types) and proposes several plausible solutions for each of the issues. The authors take no side on whether to change link layer or to change transport layer; rather, they propose a combination of the two based on suitability.

Random Early Detection Gateways for Congestion Avoidance

S. Floyd, V. Jacobson, “Random Early Detection Gateways for Congestion Avoidance,” IEEE/ACM Transactions on Networking, (August 1993). [PDF]


Random Early Detection or RED detects impending congestion based on average queue size and notifies connections of congestion through binary feedback by dropping packets or by marking bits in headers. To this end, RED uses two thresholds: once the average queue size exceeds the minimum threshold, RED starts marking/dropping packets with a probability that is a function of the average queue size; when it exceeds the maximum threshold RED marks all the incoming packets. The authors argue against packet dropping as long as there is responsiveness to notifications from the connections, and they resort to dropping as an extreme measure. They also promote use of FCFS/FIFO scheduling over FQ citing the lower overhead for not having to save per-flow states.

The RED gateway consists of two separate algorithms: one for determining the average queue size that determines the maximum allowable burstiness, and the other calculates the probability of marking packets when the average queue size is between the minimum and maximum threshold. The authors leave the decision of determining the thresholds to the system designers’ choice of desired average queue size; however, as rules of thumb they advise to set the minimum threshold sufficiently high to maximize network power (the ratio of throughput to delay) and to set the maximum threshold twice as high as the minimum to avoid global synchronization and oscillation. Through observation and probabilistic calculation, it is shown that packet dropping probability should be calculated as a uniformly distributed random variable. The reason behind this selection is that in order to have a constant average queue size, packets should be marked in uniform intervals not in clusters.

Interesting features of RED include TCP-compatibility, avoidance of global synchronization through randomization, and ability to maintain an upper-bound on average queue size (which also allows RED to better handle bursty traffic). Moreover, RED does not discriminate between different classes of connections, and there is no reward for being within the fair share. RED does not provide explicit action mechanism against misbehaving users; instead the authors argue that misbehaving connections face higher probability for their packets being dropped/marked since they have more packets in the queue than the rest.

The authors back up their claims through simulation and provide mathematical bounds on the average queue size as well as the two thresholds. They also provide a so-called fast version of their control algorithm, which is supposed to useful for in-router implementation due to reduced operations.


The main contribution of this paper is the use of randomization/probabilistic analysis in a clever way (i.e., in conjunction with two different thresholds) that by itself obviates some of the nagging problems of other solutions (e.g., bias against bursty traffic, global synchronization etc.). The paper itself could be shorter though.

This paper promotes FCFS saying that it shares delay among connections and reduces delay for particular connection, which is completely different from the FQ thought process. However, the method described for identifying misconducting connections end-up being stateful, even though they shunned FQ due to its use of states (which they use to find out offenders and to ensure min-max fairness). The definition of fairness is vague, and it is not clear whether they strive for min-max fairness (no explicit mentioning; but since they are using FCFS underneath, its unlikely).

RED leaves too many parameters to the system administrators, and it is quite sensitive to those parameters. Tuning these parameters not only have made their simulation results not-so-reliable (you can check only that many variations!), but also they can potentially create problems in its realistic deployment (too many options to chose from; on the other hand, multiple knobs allow fine-tuning when in good hands).