ACMCrossroads / Xrds7-5 / Active Queue Management

Este artículo tambíen está en Español.


Active Queue Management

by Kostas Pentikousis

The previous Connector column ( Principles of Network System Design) discussed end-to-end protocols and the end-to-end argument. This month's focus will be queuing methods used by routers in the network layer. Routers are network layer devices used to interconnect different networks [9]. Their primary role is to switch packets from input links to output links. In order to do so a router must be able to determine the path that every incoming packet needs to follow, and decide which outgoing link it should be switched to [6].

In an earlier Connector column (Can TCP be the transport protocol of the 21st century?) it was noted that packet losses are often assumed to result from congestion, rather than from packet corruption. But what exactly is congestion and why does it result in packet loss? Congestion is the phenomena that occurs at a router when incoming packets arrive at a faster rate than the router can switch (or forward) them to an outgoing link. However, it is important to distinguish contention and congestion. According to Peterson, "Contention occurs when multiple packets have to be queued at a switch (or a router) because they are competing for the same output link, whereas congestion means that the switch has so many packets queued that it runs out of buffer space and has to start dropping packets". [9] Hence, packet drops are a form of implicit notification of congestion.

Consider the case of a university campus gateway, a router connecting the campus network to the rest of the Internet (Figure 1). All traffic leaving the campus network must pass through a single router outgoing port. In general, the campus network provides larger bandwidth than the link connecting the campus with the rest of the Internet. The gateway router can easily become congested.


fig1.gifFigure 1
A simplified diagram of a university campus network. Congestion can build up when the outgoing aggregate traffic exceeds the capacity of the link connecting the campus with the rest of the Internet.


Any protocol that runs on top of IP (Internet Protocol), such as TCP (Transmission Control Protocol) can detect packet drops and in a wired IP world it can, more or less, safely interpret them as indications of congestion along the path from the sender to the receiver. In particular, a TCP sender will translate these packet drops into a slower sending rate. Slowing down the sending rate will decrease the incoming packet rate at the router and the queue size will decrease after a while.

Queues are used to smooth spikes in incoming packet rates and to allow the router sufficient time for packet transmission. When the incoming packet rate is higher than the a router's outgoing packet rate, the queue size will increase, eventually exceeding available buffer space. When the buffer is full, some packets will have to be dropped -- but which ones? A straightforward solution is to drop the packets that are just arriving at the input port; that is, if a packet arrives and finds the queue full it will be dropped. This policy is known as drop tail or tail-drop (DT). Other solutions include dropping the first packet in the queue ("drop front on full") [7] or dropping a random packet already stored in the queue. Each of these solutions has certain advantages and disadvantages (which we will not elaborate on due to space limitations), but DT is the most widespread dropping policy.

Braden et al. distinguish between two classes of router algorithms that are related to congestion control, i.e., queue management and scheduling [1]: "To a rough approximation queue management algorithms manage the length of packet queues by dropping packets when necessary or appropriate, while scheduling algorithms determine which packet to send next and are used primarily to manage the allocation of bandwidth among flows". The most common scheduling algorithm is FIFO (First-In First-Out), which means that the packet that first enters the buffer will be the one to leave the buffer first.

Although DT is simple to implement, has been tested and used for many years, it was shown to lead to poor performance under congestion. For example, studies have shown that DT can lead to global synchronization, lockouts, and full queues. Global synchronization occurs when a router drops many consecutive packets in a short period of time forcing a number of TCP senders to slow down at the same time and leads to an underutilization of the bottleneck link, especially if TCP Tahoe [9] is used. Recall that TCP Tahoe always resorts to Slow Start when dropped packets are detected. The Slow Start phase ends when the TCP sender sends more segments than the network can handle. With many senders observing losses during the same period of time, a vicious cycle starts with periods of relatively low network utilization followed by heavy congestion. At the same time, lockouts can happen where a small fraction of flows receive a large proportion of the bottleneck bandwidth, hindering fair allocation and sharing of network resources.

Floyd and Jacobson [3] proposed Random Early Detection (RED) gateways back in 1993 in order to solve the above-mentioned problems with DT. RED attempts to solve both the lockout effect and the full queue problems in an efficient manner, without requiring any changes at the end hosts. Active queue management (AQM) is the pro-active approach of dropping packets before a buffer overflow happens (cf. DT) so that senders are informed early on. RED is an AQM mechanism that uses randomization to solve these problems. The following section provides an overview of how RED works.

Random Early Detection

RED is not designed to operate with any specific protocol, however, it performs better with protocols like TCP that perceive drops as indications of congestion. RED gateways require the user to specify five parameters: the maximum buffer size or queue limit (QL), the minimum (minth) and maximum (maxth) thresholds of the "RED region", the maximum dropping probability (maxp), and the weight factor used to calculate the average queue size (wq). QL can be defined in terms of packets or bytes. Note that when DT is implemented at a router, QL is the only parameter that needs to be set. In order to avoid buffer overflows a RED router starts dropping packets early on so that it can control the congestion level. Early packet dropping starts when the average queue size exceeds minth. RED was specifically designed to use the average queue size (avg), instead of the current queue size, as a measure of incipient congestion, because the latter proves to be rather intolerant of packet bursts [3]. If the average queue size does not exceed minth a RED router will not drop any packets. avg is calculated as an exponentially weighted moving average using the following formula:

avg i  =  (1 - wq) × avgi-1  + wq × q

where the weight wq is commonly set to 0.02 [4], and q is the instantaneous queue size. This weighted moving average captures the notion of long-lived congestion better than the instantaneous queue size [9]. Had the instantaneous queue size been used as the metric to determine whether the router is congested, short-lived traffic spikes would lead to early packet drops. So a rather underutilized router that receives a burst of packets can be deemed "congested" if one uses the instantaneous queue size. The average queue size, on the other hand, acts as a low pass filter that allows spikes to go through the router without any packets being dropped (unless, of course, the burst is larger than the queue limit). The user can configure wq and minth RED routers do not allow short-lived congestion to continue uninterrupted for a specific amount of time. This functionality allows RED to keep the per-packet delay low and throughput high.


For every packet arrival {
  Calculate avg
  if (avg ≥" maxth) { 
      Drop the packet
  }
  else if (avg > minth) {
      Calculate the dropping probability pa
      Drop the packet with probability pa, otherwise forward it
 }
 else {
      Forward the packet
 }
}

Listing 1 Pseudocode of the RED algorithm.


It is interesting that the RED algorithm (Listing 1) separates the decision processes of when to drop a packet from which packet to drop. As noted earlier, RED uses randomization to decide which packet to drop and consequently which connection will be notified to slow down. This is accomplished by making Bernoulli trials (e.g. coin flips) using a probability pa, which is calculated according to the following formulae:

pb = maxp  × (avg - minth) /  (maxth - minth)

and

pa = pb /  (1 - count × pb)

where maxp is a user-defined parameter, usually set to 2% or 10% [4], and count is the number of packets since the last packet drop (but see also [2]). Count is used so that consecutive drops are spaced out over time. Notice that pb varies linearly between 0 and maxp, while pa, i.e., the actual packet dropping probability increases with count. Originally, maxth was defined as the upper threshold; when the average queue size exceeds this limit, all packets have to be dropped (Figure 2(a)). Later on, however, Floyd proposed a modification to the dropping algorithm (the "gentle option"), under which packets are dropped with a linearly increasing probability until avg exceeds 2×maxth; after that all packets are dropped (Figure 2(b)). Although maxth can be set to any value, a rule of thumb is to set it to three times minth, and less than QL [4].


Figure2(a) RED Figure2(b) RED with the gentle option
(a) RED (b) RED with the "gentle option"

Figure 2 The packet dropping probability (pb) in RED as a function of the average queue size (maxp =10%).


By dropping packets before the buffer overflows, RED attempts to notify some connections of incipient congestion. The responsive ones will limit their sending rates and eventually the network load will decrease. The unresponsive connections will not slow down, but will continue at the same pace or even increase their sending rates. In this case, the unresponsive flows will have more packets reaching the router, effectively providing more candidates for dropping than responsive ones, because the number of packets RED drops from a single flow is roughly proportional to its share of the bandwidth [3].

The fairness feature of RED is important in the current Internet incarnation where all connections are treated equally. One would like to notify those connections that presently use the most resources. In other words, the router should drop packets that belong to flows currently using most of the bandwidth (and buffer space). RED is not absolutely accurate in dropping packets according to the exact proportion of the resources that a connection is using -- after all RED does not keep per-connection state, i.e., exact statistics of resource utilization (see also [8]). Nevertheless, its inventors show that RED can be effective in notifying the connections with the higher percentage of bandwidth utilization [3].

To be continued...

RED can easily be deployed in the Internet because it is supported by router vendors and is implemented in a number of products [5]. However, RED is not currently widely used, despite the eight years that have passed since its introduction, the subsequent studies of its efficiency in controlling congestion, and the IETF (Internet Engineering Task Force) recommendation for the wide deployment of RED some years ago [1]. The next Connector column will discuss the advantages and pitfalls of RED, some alternatives, and the potential for better Internet services that RED encompasses.

References

1
Braden, B., D. Clark, J. Crowcroft, B. Davie, S. Deering, D. Estrin, S. Floyd, V. Jacobson, G. Minshall, C. Partridge, L. Peterson, K. Ramakrishnan, S. Shenker, J. Wroclawski, and L. Zhang, Recommendations on Queue Management and Congestion Avoidance, RFC 2309, April 1998.
2
De Cnodder, S., Elloumi, O., and Pauwels, K, "RED Behavior with Different Packet Sizes", Proceedings of the Fifth IEEE Symposium on Computers and Communications, Antibes-Juan les Pins, France, July 2000.
3
Floyd, S., and Jacobson, V., "Random Early Detection Gateways for Congestion Avoidance". In ACM/IEEE Transactions on Networkingg, 3(1), August 1993.
4
Floyd, S., RED: Discussions of Setting Parameters, November 1997, http://www.aciri.org/floyd/REDparameters.txt (June 2001).
5
Floyd, S., RED Implementations, http://www.aciri.org/floyd/red.html#implementations, (June 2001).
6
Kurose, J., and Ross, K., Computer Networking: A Top-down Approach Featuring the Internet, Addison Wesley, 2001.
7
Lakshman, T. V., Neidhardt, A., and Ott, T., "The Drop From Front Strategy in TCP Over ATM and its Interworking with Other Control Features", Proceedings of IEEE INFOCOM 1996, San Francisco, California, April 1996.
8
Lin, D. and Morris, R., "Dynamics of Random Early Detection", Proceedings of ACM SIGCOMM '97, pages 127-137, Cannes, France, October 1997
9
Peterson, L. L., and Davie, B. S., Computer Networks, 2nd edition, Morgan-Kaufmann, 2000.

      

Copyright 2009, The Association for Computing Machinery, Inc.