EXECUTIVE SUMMARY.
The Random Early Detection (RED) algorithm was first proposed by Sally Floyd and Van Jacobson in 1) for Active Queue Management (AQM) and was later standardized as an IETF recommendation in 2).
It is claimed that RED is able to avoid global synchronization of TCP flows, maintain high throughput
as well as a low delay and achieve fairness over multiple TCP connections, etc. The introduction of
RED has stirred considerable research interest in understanding its fundamental mechanisms,
analyzing its performance and configuring its parameters to fit in various working environments. This
report first describes the RED algorithm in section I and then explains several analytical models in sections
II and IV. Specifically, section II discusses an analytic evaluation of RED performance,
based upon the paper 3). Section IV examines a feedback control model for RED, which was
first introduced in 4). In section V, the parameter tuning for RED is discussed. The report
ends with a further discussion of selected topics and possible future work. Note that this report only
focuses on the original RED algorithm, although numerous variants of RED have been proposed.
I. THE RED ALGORITHM.
When the average queue size is between the minimum and maximum thresholds, each arriving packet
is marked with probability pa, where pa is a function of the average queue size avg. Each time a packet
is marked, the probability that a packet is marked from a particular connection is roughly proportional
to that connection’s share of the bandwidth at the router. The detailed algorithm for RED is given in
Figure 1.
Essentially, RED algorithm has two separate parts. One is for computing the average queue size,
which determines the degree of burstiness that will be allowed in the router queue. It takes into
account the period when the queue is empty (the idle period) by estimating the number m of small
packets that could have been transmitted by the router during the idle period. After the idle period, the
router computes the average queue size as if m packets had arrived to an empty queue during that
period.
The other is used to calculate the packet-marking probability and then determine how frequently the
router marks packets, given the current level of congestion. The goal is for the router to mark packets
at fairly evenly spaced intervals, in order to avoid biases and avoid global synchronization, and to mark
packets sufficiently frequently to control the average queue size.
Initialization:
  avg ← 0
  count ← -1
For each packet arrival:
  if the queue is non-empty:
    avg ← (1 - ωq) × avg + ωq × q
  else:
    m ← f(time - q_time)
    avg ← (1 - ωq)^m × avg
  if minth ≤ avg < maxth:
    increment count
    pb ← maxp × (avg - minth) / (maxth - minth)
    pa ← pb / (1 - count × pb)
    with probability pa:
      mark the arriving packet
      count ← 0
  else if maxth ≤ avg:
    mark the arriving packet
    count ← 0
  else:
    count ← -1
When the queue becomes empty:
  q_time ← time
Notations:
[1] Saved Variables:
avg: average queue size
q_time: start of the queue idle time
count: packets since last marked packet
[2] Fixed Parameters:
ωq : queue weight
minth: minimum threshold for queue
maxth: maximum threshold for queue
maxp: maximum value for pb
[3] Other:
pa: current packet-marking probability
q: current queue size
time: current time
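The per-packet logic of Figure 1 can be sketched in Python. This is a minimal illustrative sketch, not a reference implementation: the parameter values below are the example settings used later in this report, and the idle-period estimate m is left as a plain attribute rather than a function of elapsed time.

```python
import random

# Illustrative parameter values (the example settings used later in
# this report); real deployments would tune these.
WQ, MINTH, MAXTH, MAXP = 0.002, 10, 30, 0.1

class RedQueue:
    """Minimal sketch of the per-packet logic of Figure 1."""

    def __init__(self):
        self.avg = 0.0         # average queue size
        self.count = -1        # packets since last marked packet
        self.q = 0             # current (instantaneous) queue size
        self.idle_packets = 0  # stand-in for m = f(time - q_time)

    def on_arrival(self):
        """Update avg and return True if the arriving packet is marked."""
        if self.q > 0:
            self.avg = (1 - WQ) * self.avg + WQ * self.q
        else:
            # After an idle period, proceed as if m small packets had
            # arrived to an empty queue.
            self.avg *= (1 - WQ) ** self.idle_packets
        if MINTH <= self.avg < MAXTH:
            self.count += 1
            pb = MAXP * (self.avg - MINTH) / (MAXTH - MINTH)
            # Equation 2; guard against the denominator reaching zero.
            pa = 1.0 if self.count * pb >= 1 else pb / (1 - self.count * pb)
            if random.random() < pa:
                self.count = 0
                return True
        elif self.avg >= MAXTH:
            self.count = 0
            return True
        else:
            self.count = -1
        return False
```

A caller would set `q` from the real queue before each arrival. Only the two threshold regions are deterministic; between the thresholds the marking decision is probabilistic.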
As avg varies from minth to maxth, the packet-marking probability pb varies linearly from 0 to maxp:

pb ← maxp × (avg − minth) / (maxth − minth)    Equation 1
The final packet-marking probability pa increases slowly as the count increases since the last marked packet:

pa ← pb / (1 − count × pb)    Equation 2
As discussed in Section V, this ensures that the gateway does not wait too long before marking a
packet. The gateway marks each packet that arrives at the gateway when the average queue size avg
exceeds maxth.
II. ANALYTIC EVALUATION OF RED PERFORMANCE.
Thomas Bonald et al. 3) use classic queueing theory to evaluate RED performance and quantify the
benefits (or lack thereof) brought about by RED. Three major aspects of the RED scheme,
namely the bias against bursty traffic, the synchronization of TCP flows, and queueing delays, are studied
in detail and compared with those of the Tail Drop scheme.
We consider a router with a buffer of K packets. Typical drop functions for the RED and Tail
Drop schemes are given below; the corresponding curves are shown in Figure 2: Drop functions of the
RED and Tail Drop schemes.
Drop function for the RED scheme:

d(avg) = maxp × (avg − minth) / (maxth − minth)   if minth < avg < maxth    Equation 3
Drop function for the Tail Drop scheme:

d(q) = 0   if q < maximum buffer size
d(q) = 1   if q ≥ maximum buffer size    Equation 4
A. A RED router with bursty input traffic
We first derive a model of a RED router with a single input stream of bursty traffic. Assume that the
arrival process is a batched Poisson process with rate λ and bursts (or batches) of B packets. Let the
service time be exponentially distributed with mean 1/µ. The offered load is defined as ρ = Bλ/µ.
Hence, the number of packets buffered in the queue forms a Markov chain with stationary distribution
π. This model is depicted in Figure 3: Model of RED router with bursty input traffic.
It is worthwhile to note that this model does not really match empirically derived models of TCP and
other bursty traffic patterns. However, it is analytically tractable; furthermore, our purpose here is to
compare the relative impact of RED on bursty and less bursty traffic. We can imagine that the
difference between a smooth input traffic and a batch Poisson process (as examined here) would be a
lower bound to that observed between a smooth input and an input process with long range
dependence.
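Under these assumptions, the drop probability of the batch-Poisson/Tail Drop model is easy to estimate by simulation. The sketch below is an illustrative discrete-event simulation (not the authors' code); it exploits the memorylessness of exponential service to drain the queue between bursts.

```python
import random

def sim_drop_prob(lam, mu, B, K, n_bursts=50_000, seed=1):
    """Estimate the packet drop probability for batch Poisson arrivals
    (bursts of B packets at burst rate lam) at a Tail Drop buffer of
    K packets with exponential service at rate mu. Illustrative only."""
    rng = random.Random(seed)
    q = dropped = arrived = 0
    for _ in range(n_bursts):
        gap = rng.expovariate(lam)
        # Drain departures during the inter-burst gap; discarding an
        # unfinished service is valid because it is memoryless.
        while q > 0:
            s = rng.expovariate(mu)
            if s > gap:
                break
            gap -= s
            q -= 1
        for _ in range(B):  # the burst arrives; Tail Drop at K
            arrived += 1
            if q < K:
                q += 1
            else:
                dropped += 1
    return dropped / arrived
```

For a heavy offered load ρ = Bλ/µ = 4, the estimate comes out close to 1 − 1/ρ = 0.75, the heavy-load behavior discussed in this section.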
Approximation 1: The RED router uses the same drop probability d(q) on all packets in the same burst,
where q is the instantaneous queue size at the time the first packet in the burst arrives at the router.
Furthermore, we choose maxth = K. Using the PASTA property, we get the drop probability seen by a
new arrival:
Tail Drop router:

PTD = π(K) + π(K−1)((B−1)/B) + … + π(K−B+1)(1/B)    Equation 5

RED router:

PRED = π(K) + π(K−1)d(K−1) + … + π(1)d(1)    Equation 6
Note: The stationary distribution π for the RED router is different from that for the Tail Drop router.
Figure 4: Drop probability vs. offered load for different values of the burst size.
Figure 4 shows the drop probability of an incoming packet as a function of offered load for different
burst sizes, obtained by previous analysis (with Approximation 1) and by simulation (without
Approximation 1). The figure clearly shows that the approximation is very accurate, even for large
values of the burst size.
In addition, for large offered load, the drop probability is very close to that suffered by a smooth
Poisson traffic in a Tail Drop router, which is given by the loss probability for the M/M/1/k queue.
PM/M/1/K = 1 − (1 − ρ^K) / (1 − ρ^(K+1))    Equation 7
Noting that the drop probability is always higher with RED than with Tail Drop, we conclude that
whatever the burst size,
PRED ≈ PTD = 1 − 1/ρ + o(1/ρ)    Equation 8

for ρ >> 1.
B. A RED router with bursty and smooth input traffic
Now we consider a router with two input flows, one bursty with batch Poisson arrivals as discussed in
A., the other a smoother (non batch) Poisson stream. We denote by ρ(b) and ρ(s) the load of the bursty
and the smooth traffic, and by ρ = ρ(b) + ρ(s) the total offered load. The model is depicted in Figure 5.
Figure 5: Model of RED router with a mix of bursty and smooth traffic
Again, the total number of packets buffered in the queue defines a Markov chain with stationary
distribution π. Using the PASTA property, we obtain the drop probabilities of a packet from the bursty
flow and from the smooth flow in a Tail Drop router:
PTD(b) = π(K) + π(K−1)((B−1)/B) + … + π(K−B+1)(1/B)    Equation 9
and
PTD(s) = π(K)    Equation 10
Since PTD(b) > PTD(s), there is a bias against bursty traffic with a Tail Drop router.
Note that for the Tail Drop scheme, the overall drop probability is:

PTD = (ρ(b)/ρ) PTD(b) + (ρ(s)/ρ) PTD(s)    Equation 12
C. Including queue size averaging in the model.
So far, we have assumed that the drop probability for the RED scheme depends only on the instantaneous
queue size. Once queue size averaging is taken into consideration, the complexity of the model
increases dramatically. However, note that when the weight ωq of the moving average is
small, as recommended in 1), the estimated average queue size avg varies slowly, so that
consecutive packets belonging to the same burst are likely to experience the same drop probability
d(avg). Hence, Approximation 1 used in the previous analysis remains valid in this case.
As an example, consider a buffer of size K = 40 and RED parameters minth = 10, maxth = 30, maxp
= 0.1 and ωq = 0.002. Figure 6 shows the drop probability as a function of the fraction of bursty
traffic in the input traffic, obtained using the analytic expressions above (continuous line for RED,
dashed for Tail Drop), and using simulations (done with queue size averaging and without
Approximation 1).
Several key observations can be made from Figure 6. First, the simulation results support the conclusion
that the Tail Drop scheme is biased against bursty traffic. For the RED scheme, however, the drop
probability for bursty traffic and smooth traffic is the same. Moreover, the average drop probability of
a mix of bursty and smooth traffic for the Tail Drop scheme remains constant and equals the
drop probability of the RED scheme with the same traffic mix. This can be expressed as
PRED(b) = PRED(s) ≈ (ρ(b)/ρ) PTD(b) + (ρ(s)/ρ) PTD(s)    Equation 13
When the fraction of bursty traffic is large, Figure 6 indicates that the RED scheme avoids bias against
bursty traffic by increasing the drop probability of smooth traffic, without improving the drop
probability of bursty traffic.
In all cases, the drop rate of a flow going through a RED router does not depend on the burstiness of
this flow, but only on the load it generates (refer to Equation 13 and Figure 6).
Figure 6: Drop probability vs. fraction of bursty traffic for an offered load of ρ=2.
D. An important observation about PASTA.
It is important to note that the analysis above heavily relies on the PASTA property of Poisson
processes. In general, it is not true that the stationary distribution of the number of packets k buffered
in the queue immediately before the arrival of a burst of packets coincides with π, the continuous-time
stationary distribution of k.
For instance, Figure 7 shows the drop probabilities obtained in a RED router with both a bursty input
traffic with Pareto inter-arrival times between bursts and a Poisson input traffic. The Pareto coefficient
in the figure is 1.4 and the RED parameters are those of Figure 6. Unlike what we saw earlier in the
case of the batch Poisson arrival process, the drop probability for the Pareto traffic is different from the
drop probability for smooth traffic even for the RED router. A further discussion of how the traffic
model impacts RED performance is provided in section VI.
SYNCHRONIZATION OF TCP FLOWS
As shown in the previous section, the Tail Drop scheme is biased against bursty
traffic. If the dropped packets belong to different TCP connections, these connections experience losses
at about the same time, decrease their rates/windows synchronously, and then tend to stay in
synchronization. This phenomenon is referred to as the synchronization of multiple TCP connections.
It is claimed that the RED algorithm is able to avoid the TCP synchronization problem. We investigate this
claim in this section.
Figure 7: Drop probability for RED and Tail Drop vs. offered load for bursty (batch arrivals and Pareto distributed
inter-arrivals) and smooth (Poisson) traffic, and a high fraction of bursty traffic (90%).
A. Tail Drop
Assume that a drop occurs at time t = 0 in a Tail Drop router. Due to the memoryless property of the
exponential distribution, the next incoming packet is dropped if and only if its inter-arrival time is smaller
than the service time of a packet. Thus when a packet is dropped, the next packet is dropped with
probability p, where

p = ∫0∞ µ e^(−µx) (1 − e^(−λx)) dx = λ/(λ + µ) = ρ/(ρ + 1)    Equation 14
Hence

E(NTD) = ρ + 1    Equation 16
Var(NTD) = ρ(ρ + 1)    Equation 17
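These moments are easy to check by Monte Carlo: given a drop, each subsequent packet is dropped with probability p = ρ/(ρ + 1) (Equation 14), so NTD is 1 plus a geometric variable. A small sketch:

```python
import random

def consecutive_drops(rho, trials=100_000, seed=2):
    """Monte Carlo check of Equations 16-17: after a drop, each
    subsequent packet is dropped with probability p = rho/(rho + 1),
    so N_TD = 1 + a geometric variable."""
    rng = random.Random(seed)
    p = rho / (rho + 1)
    samples = []
    for _ in range(trials):
        n = 1                    # the initial drop
        while rng.random() < p:  # next packet dropped w.p. p
            n += 1
        samples.append(n)
    mean = sum(samples) / trials
    var = sum((x - mean) ** 2 for x in samples) / trials
    return mean, var
```

For ρ = 2 the sample mean lands near ρ + 1 = 3 and the sample variance near ρ(ρ + 1) = 6.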
Let π(i | drop) denote the stationary distribution of the number of packets in the queue, conditional on the fact that a drop occurred. Then

E(NRED) = 1 + [ Σ_{k=0}^{K−1} π(k) d(k)² / (1 − d(k)) ] / [ Σ_{k=0}^{K−1} π(k) d(k) ]    Equation 20

Var(NRED) = [ Σ_{k=0}^{K−1} π(k) ( d(k) / (1 − d(k)) )² ] / [ Σ_{k=0}^{K−1} π(k) d(k) ]    Equation 21
Figure 8 compares the analytic result with simulation for an offered load of ρ=2 and RED parameters
as in Figure 4.
Figure 8: Distribution of the number of consecutive drops for an offered load of ρ=2.
Since the lower bound of P(NRED > n) in this case does not depend on n, the number of consecutive
drops becomes infinite with positive probability! This phenomenon is illustrated by the simulation results of
Figure 9 and can be explained as follows.
At high load, avg oscillates slowly around maxth. This results in long (infinite as ωq → 0) periods of
consecutive drops (when avg > maxth) and long periods of random drops (when avg < maxth).
The results show that RED significantly increases the mean and variance of the number of consecutive
drops, especially when ωq is close to its recommended value of 0.002 1). This suggests that deploying
RED may in fact contribute to the synchronization of TCP flows.
Note that the conclusion drawn above is based upon a different definition of the drop probability for the
RED algorithm than the one originally proposed in 1). More discussion on this topic is provided at the end
of this section.
QUEUEING DELAY
To compare the delay through a router under the RED and Tail Drop management schemes, the same
model described above is used, where the input traffic is a Poisson process of intensity λ.
A. Tail Drop
Using the M/M/1/K model, the stationary distribution of the queue size in a Tail Drop router is given
by:
πTD(k) = ρ^k (1 − ρ) / (1 − ρ^(K+1)),   for all k = 0, …, K    Equation 23
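Equation 23 can be evaluated directly. The helper below also computes the mean queue size; dividing that by the accepted throughput would give the mean delay via Little's law.

```python
def pi_td(rho, K):
    """Stationary queue-size distribution of an M/M/1/K Tail Drop
    router (Equation 23)."""
    if rho == 1.0:
        return [1.0 / (K + 1)] * (K + 1)  # uniform limiting case
    norm = (1 - rho) / (1 - rho**(K + 1))
    return [norm * rho**k for k in range(K + 1)]

def mean_queue(rho, K):
    """Mean number of packets in the Tail Drop queue."""
    return sum(k * p for k, p in enumerate(pi_td(rho, K)))
```

At heavy load (ρ = 2, K = 40) the distribution concentrates at the top of the buffer and the mean queue sits just below K, which is the long-delay regime RED is meant to avoid.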
Figure 10 shows that RED reduces the mean delay, but increases the delay variance.
IV. A FEEDBACK CONTROL MODEL FOR RED.
This section is organized as follows. First, we introduce a model of the average queue size when TCP
flows pass through a queueing system with a fixed drop probability. This model is then combined with
RED's control element, and the steady-state behavior of the resulting feedback control system is
derived. Finally, the stability of the RED control system is analyzed.
Consider a system of n TCP flows traversing the same link l with capacity c, as shown in Figure 11.
A unidirectional TCP (Reno) flow fi (1 ≤ i ≤ n) is established from Ai to Di. B – C is the only
bottleneck link for any flow fi. The number of flows is assumed to remain constant for a long time.
The throughput of each TCP flow can be expressed in a closed form, based upon the steady state
model derived by D. Towsley et al. in 6). Only a brief qualitative explanation is offered here, because
the exact form of this equation is not used in our discussion.
Figure 11: An n-flow feedback control system.
Basically, the throughput T of a particular TCP connection depends on the packet drop probability p, the
average round trip time Ri, the average packet size M (in bits), the average number of packets
acknowledged by an ACK, b (usually 2), the maximum congestion window size advertised by the flow's
TCP receiver, Wmax (in packets), and the duration of a basic (non-backed-off) TCP timeout, To
(typically 5R). For simplicity, we express this fairly complex relationship with the following equation,
where rt,i() is the sending rate of flow i.
rt,i(p, Ri) = T(p, Ri)    Equation 25
The purpose of the controlling element is to bring and keep the cumulative throughput (of all flows)
below (or equal to) the link's capacity c:

Σ_{j=1}^{n} rt,j ≤ c    Equation 26
In the following, we further simplify the model, based upon the assumptions made below. Basically, we
assume that the TCP flows are homogeneous. That is, all TCP (Reno) flows fi (1 ≤ i ≤ n) have the
same average round trip time (RTT), Ri = R, and the
same average packet size, Mi = M,
so that rt,i(p, R) = rt,j(p, R) for 1 ≤ i, j ≤ n, and the objective function becomes n × rt(p, R) ≤ c.
To determine the steady state of this feedback system (i.e., the average values of rt, q̄ and p when the
system is in equilibrium), we need to determine the queue function (or queue "law") q̄ = G(p) and the
control function p = H(q̄). The control function H is given by the architecture of the drop module,
which is RED in our case.
Here B is the maximum buffer size. The authors conducted extensive simulations, and the results
support Equation 31 developed above. The relationship between the average queue length and the drop
probability is illustrated in Figure 14.
Now, let us return to the feedback control system in Figure 12. An expression for the long-term
(steady-state) average queue size as a function of the packet drop probability, q̄(p) = G(p) (Equation 32),
was developed above and validated via simulation. However, Equation 32 was developed under an
open-loop scenario. Assume that the drop module has a feedback control function p = H(q̄e), where q̄e
is an estimate of the long-term average queue size. If the system of equations

q̄ = G(p)
p = H(q̄)    Equation 33

has a unique solution (ps, q̄s), then the feedback system in Figure 12 has an equilibrium state (ps, q̄s).
Moreover, the system operates on average at (ps, q̄s). If we use RED for queue management, the H
function becomes:
p = H(q̄):
  H(q̄) = 0                                       if q̄ < minth
  H(q̄) = maxp × (q̄ − minth) / (maxth − minth)    if minth < q̄ < maxth
  H(q̄) = 1                                       if q̄ > maxth    Equation 34
For different combinations of the H and G functions, the whole system may operate in a stable
equilibrium state or in an unstable state, depending on how the two curves intersect. In one case, for
example, the G and H functions are the two curves illustrated in Figure 15. It can easily be seen that
the system approaches the equilibrium point (ps, q̄s). That is, the equilibrium point is an attractor for all
states around it, and once the system reaches the equilibrium state, it stays there with only small
statistical fluctuations, given that the number of flows n does not change.
Figure 15: RED operating point converges. Figure 16: RED operating point oscillates.
In the other scenario, the G and H functions are as in Figure 16. In this case, the equilibrium
point is situated beyond maxth, where the drop rate jumps from maxp = 0.1 to 1, as given by Equation 34.
Careful analysis shows that the system, although attracted by this point, cannot stay there, since the
value of p is not defined at the jump. So the RED operating point oscillates in this case.
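The fixed-point computation can be illustrated numerically. In the sketch below, G is a hypothetical decreasing queue law chosen purely for illustration (it is not Equation 32), H is the RED control function of Equation 34 with the example parameters used in this report, and the equilibrium (ps, q̄s) is found by bisection on H(G(p)) − p.

```python
def H(qbar, minth=10, maxth=30, maxp=0.1):
    """RED control function (Equation 34)."""
    if qbar < minth:
        return 0.0
    if qbar > maxth:
        return 1.0
    return maxp * (qbar - minth) / (maxth - minth)

def G(p):
    """Hypothetical queue law, decreasing in p: a stand-in for
    Equation 32, chosen only so the curves cross between the thresholds."""
    return 40.0 / (1.0 + 50.0 * p)

def equilibrium(lo=1e-6, hi=0.1, iters=60):
    """Bisection on f(p) = H(G(p)) - p, assuming one sign change."""
    f = lambda p: H(G(p)) - p
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2
```

With these illustrative curves the fixed point is ps = 0.03 and q̄s = G(ps) = 16, and H(16) = 0.03 confirms the equilibrium; this is the converging case of Figure 15.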
V. PARAMETER TUNING FOR RED.
In this section, we explain how the parameters impact the performance of the RED algorithm and how
to configure them. Topics such as the definition of the packet drop probability, the average queue
length, and the threshold values are covered in the following discussion. We also discuss how to use the
feedback control model to tune the parameters.
PACKET-MARKING PROBABILITY.
Let X be a random variable representing the number of packets that arrive, after a marked packet, until
the next packet is marked. Two methods of computing the marking probability are compared below;
both assume the average queue size, and hence pb, is constant. With Method 1, each packet is marked
with probability pa = pb (Equation 35), and X is geometric:

Prob[X = n] = (1 − pb)^(n−1) pb    Equation 36
We intend to mark (drop) packets at fairly regular intervals. It is undesirable to have too many marked
(dropped) packets close together, and it is also undesirable to have too long an interval between
marked packets. Both of these events can occur when X is a geometric random variable, which can
result in global synchronization, with several connections reducing their windows at the same time.
With Method 2, the marking probability is

pa = pb / (1 − count × pb)    Equation 37

where count is the number of unmarked packets that have arrived since the last marked packet. Then

Prob[X = n] = [pb / (1 − (n−1)pb)] × Π_{i=0}^{n−2} (1 − pb / (1 − i·pb)) = pb    for 1 ≤ n ≤ 1/pb
Prob[X = n] = 0    for n > 1/pb    Equation 38

That is, X is uniformly distributed over {1, 2, …, 1/pb}.
Method 2 has an obvious advantage over Method 1, because it spreads packet drops out as evenly as
possible. Neither clustering nor overly long inter-drop intervals are desirable.
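The two methods can be compared by simulation. The sketch below samples the inter-mark gap X under each method with an illustrative pb = 0.02: Method 1 yields a geometric gap (mean 1/pb = 50 with large spread), while Method 2 yields a gap that is uniform over {1, …, 1/pb} (mean 25.5, never exceeding 50).

```python
import random

def gaps(method, pb=0.02, n=50_000, seed=3):
    """Sample inter-mark gaps X. Method 1 uses pa = pb (Equation 36);
    Method 2 uses pa = pb / (1 - count*pb) (Equation 37)."""
    rng = random.Random(seed)
    out, count = [], 0
    while len(out) < n:
        count += 1  # this packet is the count-th since the last mark
        if method == 1:
            pa = pb
        else:
            unmarked = count - 1  # unmarked packets since last mark
            pa = 1.0 if unmarked * pb >= 1 else pb / (1 - unmarked * pb)
        if rng.random() < pa:
            out.append(count)
            count = 0
    return out
```

Comparing the two samples shows Method 2's gaps tightly bounded while Method 1 produces both very short and very long gaps, matching the discussion above.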
AVERAGE QUEUE LENGTH AND ωq.
Assume that the queue is initially empty, with an average queue size of 0, and that the queue then
increases from 0 to L packets over L packet arrivals. After the Lth packet arrives at the router, the
average queue size avgL is:

avgL = Σ_{i=1}^{L} i ωq (1 − ωq)^(L−i) = ωq (1 − ωq)^L Σ_{i=1}^{L} i (1/(1 − ωq))^i    Equation 40

Using the identity Σ_{i=1}^{L} i x^i = [x + (Lx − L − 1) x^(L+1)] / (1 − x)², we obtain:

avgL = L + 1 + ((1 − ωq)^(L+1) − 1) / ωq    Equation 41
Given a minimum threshold minth and an acceptable burst of L packets arriving at the router, ωq
should be chosen to satisfy the following inequality so that avgL < minth, accommodating this burstiness:

avgL = L + 1 + ((1 − ωq)^(L+1) − 1) / ωq < minth    Equation 42
THRESHOLD VALUES.
If the typical traffic is bursty, minth must be correspondingly large to allow the link utilization to be
maintained at an acceptably high level. maxth depends partly on the maximum average delay that can
be allowed by the router. The rule of thumb is:

maxth = 3 × minth    Equation 43
In section IV., a feedback control model for RED scheme has been developed. One of the main
applications of this model is to configure the parameters for RED.
As shown in Figure 17, different H and G functions result in different operating points and hence
different system characteristics (e.g., stability). For example, an H function with a high slope results in
a state with a low drop rate but a large average queue size. Conversely, an H function with a small
slope gives rise to a lower average queue size but a larger drop rate.
VI. FURTHER DISCUSSION.
Usually, TCP connections/flows can be modeled as bursty traffic, while UDP-based applications can be
considered smooth traffic. Since TCP implements congestion control at the end host, a TCP
connection responds to packet drops after a round trip time (RTT). A UDP host, by contrast, ignores
packet loss and keeps pumping data into the network, leaving the upper-layer application to handle
congestion and perhaps retransmission. However, this does not necessarily mean that the RED
algorithm has no impact on UDP applications. In fact, since the RED algorithm is implemented in
routers rather than end hosts, it affects all kinds of Internet traffic, including both TCP and UDP
connections.
So it makes sense to compare the influence of the RED algorithm on TCP flows with its influence on
UDP-based applications. The key observations are listed below.
First, the overall loss rate suffered by TCP connections changes little when going from Tail Drop to
RED, but the loss rate suffered by UDP/IP telephony applications (whether rate-adaptive or not)
increases significantly.
Second, the average delay suffered by UDP packets is much lower with RED than with Tail Drop, which
is a key benefit for telephony applications. However, the delay variance is such that the end-to-end
delay, including the playout delay at the destination, does not reflect the gain RED brings to the mean
delay. We can expect the audio quality perceived at the destination to be mediocre at best.
PARETO DISTRIBUTION AND RED PERFORMANCE
As discussed in section II, the RED scheme performs worse when the inter-arrival times of the input
traffic follow a Pareto distribution instead of an exponential distribution, as shown in Figure 7. The
simulation indicates that under the same traffic load, RED drops packets of the bursty traffic (whose
inter-arrival times follow a Pareto distribution) with higher probability than those of the smooth traffic
(whose inter-arrival times are exponentially distributed).
A Pareto distribution is heavy-tailed: very long inter-arrival times occur with non-negligible
probability, far more often than under an exponential distribution. As a result, traffic with Pareto-
distributed inter-arrival times is more clustered than traffic with exponentially distributed inter-arrival
times, and is therefore more likely to fill the buffer and cause packets to be dropped.
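This heavy-tail effect is easy to demonstrate: sampling Pareto inter-arrival times (α = 1.4, as in Figure 7) and exponential inter-arrival times shows far more probability mass at very long gaps under the Pareto model. The sketch below is illustrative; the scale parameter xm = 1 is an arbitrary choice.

```python
import random

def pareto_interarrival(rng, alpha=1.4, xm=1.0):
    """Sample a Pareto(alpha) inter-arrival time by inverse CDF;
    alpha = 1.4 matches Figure 7, xm = 1 is an arbitrary scale."""
    return xm / (1.0 - rng.random()) ** (1.0 / alpha)

rng = random.Random(4)
n = 100_000
par = [pareto_interarrival(rng) for _ in range(n)]
exp = [rng.expovariate(1.0) for _ in range(n)]

# Fraction of gaps longer than ten times the mean gap of each stream.
par_mean = sum(par) / n
tail_par = sum(x > 10 * par_mean for x in par) / n
tail_exp = sum(x > 10 * 1.0 for x in exp) / n
```

The Pareto stream puts orders of magnitude more mass on very long gaps than the exponential stream, which is exactly the clustering that fills the buffer and causes drops.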
G(p) is derived from a model 6) that characterizes the behavior of end-to-end TCP connections with
multiple routers in between. When the drop probability at the routers decreases, packet loss decreases,
and hence the sending rate at the end hosts increases. A higher sending rate, if high enough, results in
higher buffer occupancy and a larger average queue size at the router. If the drop probability increases,
more packets are lost and the sending rate slows down; the buffer occupancy is then lowered
accordingly.
Meanwhile, H(q̄) describes the relationship between the average queue size and the drop probability
under the RED algorithm, which runs at the intermediate routers. In this case, the feedback controller
tends to increase the drop probability as buffer occupancy increases.
Clearly, G(p) and H(q̄) are not inverses of each other. In fact, since both describe how the system
behaves, the point where the two corresponding curves intersect is where the system enters its
equilibrium steady state.
VII. FUTURE WORK.
Although much research effort has been devoted to understanding and utilizing the RED algorithm to
improve current networks, some interesting research topics remain to be investigated in more detail.
For example, since it is widely accepted that the Poisson model is not sufficient to characterize traffic
in the current Internet, it is important to understand how RED and similar Active Queue Management
(AQM) algorithms behave under self-similar network traffic. Further studies may produce a more
meaningful characterization of RED performance in real-world networks.
Also note that T. Bonald et al. conclude in 3) that the RED algorithm does not avoid TCP
synchronization, using Equation 35 instead of Equation 37 as the definition of the drop probability.
However, S. Floyd et al. have already shown in 1) that Equation 37 yields better performance than
Equation 35 in terms of avoiding TCP synchronization. Hence, the argument made by T. Bonald et al.
in 3) may not be valid. One of the main reasons Equation 37 was considered undesirable is that it
makes the mathematical analysis considerably more difficult. A simulation approach may therefore be
more appropriate for further examination of this problem.
VIII. REFERENCES
1) S. Floyd and V. Jacobson. Random Early Detection Gateways for Congestion Avoidance. IEEE/ACM Transactions on Networking, August 1993.
2) B. Braden et al. Recommendations on Queue Management and Congestion Avoidance in the Internet. IETF RFC 2309, April 1998.
3) T. Bonald, M. May, and J. C. Bolot. Analytic Evaluation of RED Performance. IEEE INFOCOM 2000.
4) V. Firoiu and M. Borden. A Study of Active Queue Management for Congestion Control. IEEE INFOCOM 2000.
5) J. Padhye, V. Firoiu, D. Towsley, and J. Kurose. Modeling TCP Throughput: A Simple Model and its Empirical Validation. ACM SIGCOMM '98.
6) J. Padhye, V. Firoiu, D. Towsley, and J. Kurose. A Stochastic Model of TCP Reno Congestion Avoidance and Control. Technical Report CMPSCI TR 99-02, Univ. of Massachusetts, Amherst, 1999.