

IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 9, NO. 2, FEBRUARY 2007

Joint Design of Source Rate Control and QoS-Aware Congestion Control for Video Streaming Over the Internet
Peng Zhu, Wenjun Zeng, Senior Member, IEEE, and Chunwen Li

Abstract: Multimedia streaming over the Internet has been a
very challenging issue due to the dynamic uncertain nature of the
channels. This paper proposes an algorithm for the joint design
of source rate control and congestion control for video streaming
over the Internet. With the incorporation of a virtual network
buffer management mechanism (VB), the quality of service
(QoS) requirements of the application can be translated into the
constraints of the source rate and the sending rate. Then at the
application layer, the source rate control is implemented based on
the derived constraints, and at the transport layer, a QoS-aware
congestion control mechanism is proposed that strives to meet
the send rate constraint derived from VB, by allowing temporary
violation of transport control protocol (TCP)-friendliness when
necessary. Long-term TCP-friendliness, nevertheless, is preserved
by introducing a rate-compensation algorithm. Simulation results show that compared with traditional source rate/congestion
control algorithms, this cross-layer design approach can better
support the QoS requirements of the application, and significantly
improve the playback quality by reducing the overflow and underflow of the decoder buffer, and improving quality smoothness,
while maintaining good long-term TCP-friendliness.
Index Terms: Congestion control, cross-layer design, Internet, QoS, video streaming.

I. INTRODUCTION
MULTIMEDIA streaming over the Internet has been a
very challenging issue due to the dynamic uncertain
nature (e.g., variable available bandwidth and random packet
loss) of the channels. To address this problem, many solutions
have been proposed based on the layered design principle of
the Internet, all following the architecture of congestion control
for streaming multimedia at the transport layer and source
rate control at the application layer. At the transport layer,
congestion control for streaming multimedia has been proposed
to make the users fairly share the network resources. Because
many commercial products of streaming media adopt the user

Manuscript received August 27, 2005; revised June 15, 2006. This work was
supported in part by National Science Foundation under Grant CNS-0423386
and in part by a grant from the University of Missouri System Research Board.
The associate editor coordinating the review of this manuscript and approving
it for publication was Dr. Deepak S. Turaga.
P. Zhu is with Hitachi (China) R&D Corporation, Beijing, China (e-mail:
pzhu@hitachi.cn).
C. Li is with Department of Automation, Tsinghua University, Beijing, China
(e-mail: lcw@mail.tsinghua.edu.cn).
W. Zeng is with the Department of Computer Science, University of Missouri-Columbia, Columbia, MO 65211-2060 USA (e-mail: zengw@missouri.edu).
A color version of Fig. 8 is available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TMM.2006.886284

datagram protocol (UDP), which has no congestion control mechanism, as their transport protocol, several congestion control mechanisms built upon UDP have been proposed. Source
rate control is typically adopted at the application layer to optimize the playback quality, subject to the bandwidth constraint
imposed by the congestion control mechanism and the quality
of service (QoS) requirements of the multimedia application
(e.g., the end-to-end delay constraint)1. However, with the
layered design principle, source rate control and congestion
control are traditionally designed separately without sufficient
communication with each other, which imposes a limitation
on the overall system performance. For example, traditional
congestion control mechanisms for streaming multimedia
usually need to smooth their send rate variation to help the
application achieve smooth playback quality. But this does not
work all the time, because the coding complexity of the video
frames may change abruptly. Moreover, unlike traditional data
applications, multimedia applications cannot send data at any
rate, and usually have a minimum bandwidth requirement,
which is ignored by most congestion control mechanisms. The
end-to-end delay constraint of multimedia applications also
imposes constraints on the sending rate, because source rate
control alone cannot guarantee the end-to-end delay constraint
due to the minimum bandwidth requirement and the quality
smoothness constraint of the video source.
The cross-layer design approach, which allows layers to have
more interaction with each other, on the other hand, can achieve
better overall system performance [1]. We therefore propose a
cross-layer design algorithm for video streaming over the Internet. Our main contributions are as follows:
1) Based on transport control protocol (TCP) friendly
rate control (TFRC) [2], we first propose a QoS-aware
congestion control mechanism, called TCP-friendly rate
control with compensation (TFRCC), to better support
QoS requirements of multimedia applications.
2) We then combine the strength of TFRCC and the virtual network buffer management algorithm described
in [3], and propose a joint design algorithm of source
rate control and QoS-aware congestion control for video
streaming over the Internet.
Compared to traditional solutions, our approach is unique in
that it provides a more flexible framework to allow a joint decision of the source rate and sending rate to meet the QoS requirements of multimedia applications and the TCP-friendliness
1Note that the source rate and the channel rate do not necessarily have to be
a perfect match in the presence of sending buffer and receiving buffer.

1520-9210/$25.00 © 2007 IEEE

constraint. Simulation results show that our algorithm can significantly improve the playback quality while maintaining good
long-term TCP-friendliness.
The remainder of this paper is organized as follows. Section II
discusses the related work. The architecture of our joint design
algorithm is presented in Section III. In Section IV, we introduce
the QoS-aware congestion control algorithm. The virtual buffer
management mechanism is explained briefly in Section V. Section VI describes the proposed joint design algorithm in detail.
Simulation results are presented in Section VII. In Section VIII,
we summarize this paper and point out some future research directions.
II. RELATED WORK
A. TCP-Based Streaming
Strictly speaking, it is not suitable to use TCP as the transport protocol for streaming multimedia because of its lack of
control on the delay (due to reliable transmission) and its frequent deep fluctuation of the sending rate [4], especially when
the required end-to-end delay of the multimedia application is
small. It is also difficult for TCP-based streaming to work in a
multicast scenario. In addition, throughput efficiency is a concern for TCP-based streaming over wireless networks. Nevertheless, when the multimedia application has a large end-to-end
delay (e.g., several seconds) and the available bandwidth is also
sufficiently large (e.g., twice as much as the video bitrate), it
has been shown that TCP-based unicast streaming can achieve
satisfactory performance [5]. As a result, many schemes based
on TCP streaming have also been proposed [6], [7]. However, in
this paper, we mainly focus on UDP-based streaming, which can
work under a wide range of end-to-end delay requirements and
different network scenarios.
B. UDP-Based Streaming
For UDP-based streaming, congestion control at the transport layer and source rate control at the application layer are
two important components. Congestion control for streaming
multimedia has to take care of not only the fairness and responsiveness of the protocol, but also the rate smoothness to
achieve better playback quality of the multimedia applications
[4]. A number of TCP-friendly congestion control schemes
for streaming media have been proposed to provide a smoother
sending rate. These include the window-based schemes [8]–[12]
and the rate-based schemes, which can be further classified into
the probe-based [12]–[14] and equation-based schemes [2],
[15]. The equation-based congestion control mechanisms can
achieve good TCP-friendliness by adapting the sending rate according to the throughput equation of the TCP flows under the
same condition (packet size, packet loss ratio, and round-trip
time (RTT), etc.). A well-known equation-based mechanism,
named TFRC, is proposed for unicast flows with constant
packet sizes in [2]. To support multimedia flows with variable
packet sizes, some variants of TFRC have been proposed
[16]–[18].
Source rate control at the application layer aims to make the
source rate match the channel condition to achieve better video
quality. At the sender side, adaptive adjustment of the source

rate is proposed based on the channel condition and the QoS


requirements of the application. A proportional plus derivative
(PD) controller is used to determine the source rate according to
the encoder buffer state in [19]. It, however, does not take into
account the end-to-end delay constraint of the multimedia applications. Both the encoder buffer state and the end-to-end delay
constraint are considered in [3] using a virtual network buffer
management algorithm for bitstream switching applications. In
[18], a global rate control model is adopted to take into account
the encoder buffer state as well as the end-to-end delay constraint of the application. An adaptive media playout mechanism
is proposed in [20] to make sure the end-to-end delay constraint
is met by adaptively varying the playout speed at the receiver.
Unlike traditional solutions, our work focuses on the joint design of source rate control and congestion control, and strives to
improve the overall system performance while ensuring TCP-friendliness. Some of our ideas are similar, in some sense, to the
work on joint selection of source and channel rates for video over
QoS-guaranteed ATM networks [21], but we address a different
and more challenging scenario, i.e., streaming over the best-effort Internet, in which we have to take into account packet loss,
potentially unpredictable bandwidth, and potentially long delay.
Note that our work does not directly combat packet losses
through error resilience or related schemes, but rather controls the
source and sending rates to avoid the losses caused by
decoder buffer underflow and overflow.
C. Smoothing Techniques
A VBR-encoded video sequence typically shows very strong
burstiness in bitrate, so its bandwidth requirement is highly variable, which will make it difficult for the application to achieve
smooth playback quality. To address this problem, bandwidth
smoothing techniques have been proposed to reduce the overall
bandwidth burstiness. The main idea is to prefetch the video
data to the decoder buffer ahead of bursts of video data. Bandwidth smoothing techniques are first proposed for QoS-guaranteed networks, and can be classified into offline smoothing for
stored video [22] and online smoothing for live video [23]. Later
on, similar smoothing techniques have also been adopted for
layered video streaming over the Internet [24], [25], which first
derive the optimal transmission policy under the assumption of
complete knowledge of bandwidth evolution, then develop the
real-time heuristic based on the ideal optimal policy.
Note that smoothing techniques are beyond the scope of this
paper. However they can be incorporated into our architecture
to enhance the overall performance of our solution.
III. THE SYSTEM ARCHITECTURE
The proposed system architecture is illustrated in Fig. 1. At
the transport layer, TFRCC is proposed as the congestion control mechanism of the transport protocol. At the application
layer, a virtual network buffer management mechanism (VB) is
used to derive the constraint of the source rate and sending rate
according to the QoS requirements of the application. There is
a middleware component located between the application layer
and the transport layer. At the receiver, the middleware will collect information from the application (e.g., the amount of received video data), then feed it back to the sender together with

Fig. 2. Virtual network buffer model.

Fig. 1. System architecture.

the feedback of TFRCC. At the sender, the joint decision of the
source rate and the sending rate is made within the middleware
by considering the constraints of the source rate and sending rate
(provided by VB) and the TCP-friendliness constraint provided
by TFRCC.
IV. THE QOS-AWARE CONGESTION CONTROL MECHANISM
The design goal of TFRCC is to better support the QoS
requirements of multimedia applications without violating
the network fairness constraint. It can take into account the
minimum bandwidth requirement of the video source, and
help the application to meet the end-to-end delay constraint.
It is built upon TFRC, which means that TFRCC calculates
the TCP-friendly sending rate in the same way as TFRC.
However, unlike TFRC, if the calculated TCP-friendly rate is in
conflict with the sending rate constraints imposed by the QoS
requirements of the application, TFRCC will adjust the actual
sending rate (thus temporarily violating TCP-friendliness) to
support the urgent QoS requirements of the application. Then
a rate-compensation algorithm is proposed to maintain good
long-term TCP-friendliness. We describe the main idea of
TFRCC briefly in the following. Details of the algorithm will
be presented in Section VI.
At the end of one feedback interval, the receiver uses the same
algorithm as TFRC to calculate the current packet loss event rate
p from the received packets, and derives the TCP-friendly
sending rate using the following TCP throughput equation:

T = \frac{s}{R\sqrt{2p/3} + t_{RTO}\left(3\sqrt{3p/8}\right)p\left(1 + 32p^{2}\right)}   (1)

where T is the sending rate in bytes/s, s is the packet size,
R is the RTT, p is the loss event rate, and t_{RTO} is the retransmit timeout value. Then the
receiver feeds this TCP-friendly rate back to the sender. To support the urgent QoS requirements of the application, the actual sending rate can temporarily deviate from the TCP-friendly value if necessary, and the corresponding deviation will
be recorded and compensated for later using a rate-compensation
algorithm to maintain good long-term TCP-friendliness.
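For concreteness, the throughput computation in (1) can be sketched in Python as follows (an illustrative sketch only; the function and variable names are ours, and t_RTO is approximated by 4R, a common simplification, when not supplied):

```python
from math import sqrt

def tfrc_rate(s, rtt, p, t_rto=None):
    """TCP-friendly sending rate in bytes/s for packet size s (bytes),
    round-trip time rtt (s), and loss event rate p (0 < p <= 1).
    t_rto defaults to 4*rtt, a common simplification."""
    if t_rto is None:
        t_rto = 4.0 * rtt
    denom = rtt * sqrt(2.0 * p / 3.0) \
          + t_rto * (3.0 * sqrt(3.0 * p / 8.0)) * p * (1.0 + 32.0 * p * p)
    return s / denom
```

For example, with 1000-byte packets, a 100-ms RTT, and a 1% loss event rate, the sketch yields a rate on the order of 110 kB/s, and the rate falls steeply as the loss event rate grows.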
Considering that multimedia protocols (e.g., the real-time
transport protocol (RTP) [26]) may have special requirements
for the feedback intervals, unlike TFRC, the feedback interval
of TFRCC is flexible and depends on the application. However,
it becomes difficult to have an accurate estimation of the RTT
it becomes difficult to have an accurate estimation of RTT
when the feedback intervals are large. To address this problem,
we adopt the RTT measurement algorithm of TFMCC (TCP
Friendly Multicast Congestion Control), which includes the
basic RTT measurement using the feedback, and RTT adjustments within the feedback interval using one-way delay [27].
To support multimedia flows with variable packet size, the
packet size in TFRCC is not assumed to be a constant. We adopt
the suggestion given in [2] to calculate the mean packet size of
multimedia flows, and use the mean packet size to calculate the
TCP-friendly rate according to (1). However, it should be noted
that any work that extends TFRC to support flows with variable
packet sizes (e.g., [16]–[18]) can be incorporated into TFRCC as
a building block.
V. THE VIRTUAL NETWORK BUFFER
MANAGEMENT ALGORITHM
We use the virtual network buffer management algorithm proposed in [3] to translate the QoS requirements of the application into the constraints of the source rate and the sending
rate. From the application layer perspective, let us assume a virtual network buffer located between the sender and the receiver
that abstracts the potentially complex network topology, and accounts for the delay and loss of packets introduced in the network (see Fig. 2). It can be conceptually treated as an aggregated effect of the behaviors of the buffers located at intermediate network nodes and the transmission links. Denote B_e(k),
B_d(k), and B_n(k), respectively, as the encoder buffer, the decoder buffer, and the virtual network buffer occupancies at time
t_k (when frame k is placed into the encoder buffer). Let S(k),
u(k), and v(k), respectively, be the k-th video frame size, the
cumulative amount of data sent by the sender, and the cumulative amount of
data actually received by the receiver at time t_k. Without loss of generality, let us first assume there is no packet loss in the network.
It should be noted that any acknowledged/detected lost packets
can be easily accounted for/subtracted in the calculation of the virtual network buffer occupancy [3], as illustrated in (11) later.
Then we have

B_e(k) = \sum_{i=1}^{k} S(i) - u(k), \quad B_n(k) = u(k) - v(k), \quad B_d(k) = v(k) - \sum_{i=1}^{k-\Delta} S(i)   (2)

where \Delta is the end-to-end startup delay (in terms of frame
number).

Then it can be easily derived that

B_e(k) + B_n(k) + B_d(k) = \sum_{i=k-\Delta+1}^{k} S(i)   (3)

B_d(k) = \sum_{i=k-\Delta+1}^{k} S(i) - B_e(k) - B_n(k)   (4)

Denote B_e^{max} and B_d^{max}, respectively, as the encoder and decoder
buffer sizes. To avoid underflow and overflow of the decoder
buffer, from (3), we have

\sum_{i=k-\Delta+1}^{k} S(i) - B_n(k) - B_d^{max} \le B_e(k) \le \sum_{i=k-\Delta+1}^{k} S(i) - B_n(k)   (5)

Combining (5) and the encoder buffer constraint 0 \le B_e(k) \le B_e^{max}, we have

\max\left(0, \sum_{i=k-\Delta+1}^{k} S(i) - B_n(k) - B_d^{max}\right) \le B_e(k) \le \min\left(B_e^{max}, \sum_{i=k-\Delta+1}^{k} S(i) - B_n(k)\right)   (6)

So if we can maintain the encoder buffer within the constraints of (6) by selecting appropriate source and sending
rates, the overflow and underflow of both the encoder and decoder buffers can be avoided.

There also exists a maximum admissible sending rate constraint,
which is imposed by the buffer sizes of the encoder and decoder.
From (3), it can be easily seen that the admissible sending rate
should make sure the following constraint is met:

B_e(k) + B_n(k) \ge \sum_{i=k-\Delta+1}^{k} S(i) - B_d^{max}   (7)

Too large a sending rate will make the decoder buffer overflow.
So if the available bandwidth is so large that the constraint of
(7) is about to be violated, we have to limit the sending rate.

Note that there exist some other works (e.g., [21], [28]) that
discuss similar buffer constraints, but they are used for different
scenarios such as constant-bit-rate channels or other non-best-effort networks with deterministic channel rates and constant
transmission delays. In the best-effort network scenario, the challenge is that the available bandwidth is quite dynamic, and packet
loss and variable delay are introduced in the network. We
will discuss in the next section how to estimate these parameters, detect potential estimation errors, and compensate for them
soon enough to assure good performance.

VI. THE PROPOSED JOINT DESIGN ALGORITHM

Let us index the feedback intervals of TFRCC by j. At time
t_k, by using the sending rate R(j) (bytes/frame) of the current feedback interval
to estimate the receive rates of the future \Delta-frame period in (6) (as a result, B_n(k) remains a constant,
denoted as \hat{B}_n(j), during the j-th feedback interval), we can
get the following two bounds for B_e(k) according to (6):

B_L(k) = \max\left(0, \sum_{i=k-\Delta+1}^{k} S(i) - \hat{B}_n(j) - B_d^{max}\right) + \gamma_1
B_H(k) = \min\left(B_e^{max}, \sum_{i=k-\Delta+1}^{k} S(i) - \hat{B}_n(j)\right) - \gamma_2   (8)

where \gamma_1 and \gamma_2 are two nonnegative safety margins that are
used to guard against the potential estimation errors; both
are set to 1 in the simulations. Then we first try to maintain
the encoder buffer fullness within the bounds by adjusting the
source rate, subject to the video quality smoothness constraint,
while maintaining the sending rate TCP-friendly.
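The bound computation of (8) can be sketched as follows (a minimal sketch under our notation; `sum_S` stands for the frame-size sum over the startup-delay window, and all names are ours):

```python
def encoder_buffer_bounds(sum_S, B_n_hat, B_e_max, B_d_max, g1=1, g2=1):
    """Bounds on the encoder-buffer occupancy that keep both the
    encoder and decoder buffers from under-/overflowing, with
    nonnegative safety margins g1 and g2."""
    lo = max(0, sum_S - B_n_hat - B_d_max) + g1
    hi = min(B_e_max, sum_S - B_n_hat) - g2
    return lo, hi
```

When `lo` exceeds `hi`, the bounds cross and the sending rate itself must be limited, which is handled in Section VI-B.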
Suppose at time t_k the sender receives a new feedback from
the receiver; the sending rate is then updated as R(j+1).
Consequently, at times t_{k+1}, ..., t_{k+\Delta}, the estimation of the
future receive rates using R(j) might not have been accurate,
and the constraints of (6) might not actually be met. This may
lead to the overflow or underflow of the decoder buffer from
time t_{k+1} to t_{k+\Delta}. To address this issue, we will revisit
(6) to take advantage of knowledge of the updated TCP-friendly
sending rate R_{tcp}(j+1), and, if necessary, the readjustment of the
sizes of the encoded frames (if still available in
the encoder buffer), subject to the quality smoothness constraint,
is used to ensure that the decoder buffer will neither underflow nor
overflow. If this still cannot prevent the decoder buffer from
underflow or overflow, we will have to adjust the sending rate to
pull the decoder buffer fullness back to within the safety region.
Note that this may temporarily violate TCP-friendliness, but a
rate-compensation algorithm is introduced to assure long-term
TCP-friendliness.

In general, the proposed joint design algorithm is composed
of the algorithm performed in TFRCC, the virtual network buffer
management mechanism at the application layer, and the algorithm for the joint decision of the source and sending rates performed
in the middleware. Next, we introduce the details of these components.
A. TFRCC

1) The Receiver: At the end of the j-th feedback interval, the next interval length \Delta t(j+1) is determined, which
can be fixed or randomly selected with a constant average.
Then the receiver feeds the calculated TCP-friendly rate
R_{tcp}(j+1), \Delta t(j+1), and v back to the sender, where
v is the actual amount of received data (including
the amount of data of the detected lost packets) since the
beginning of the transmission.

2) The Sender: When the receiver's feedback is received,
the sending rate R(j+1) is decided by using a rate-compensation algorithm to
maintain long-term TCP-friendliness.

We denote the accumulated difference between the amount
of actually sent data and that of the ideal TCP-friendly sent data
as D. A large deviation of D from 0 means long-term
un-TCP-friendliness, which should be avoided. A positive (negative) D means that the amount of actually sent data is more
(less) than the TCP-friendly value, and we need to make the future sending rate smaller (larger) than the TCP-friendly rate to
reduce the magnitude of D.

First, if we would like to pull D back to zero at the end of
the j-th feedback interval, we need to set R(j) to be

R_c(j) = R_{tcp}(j) - D \, T_f / \Delta t(j)

where T_f is one frame period. However, we hope that the compensation of D can also help to reduce the sending rate variation
to achieve smooth video quality. So we only do the compensation under the following two situations; for other situations, we
directly set R(j) to R_{tcp}(j).

1) R_{tcp}(j) \ge R(j-1) and D > 0, which means that
the available bandwidth increases, but the sender needs
to set the sending rate to a value smaller than R_{tcp}(j)
to do the rate compensation. In this case, the compensation can reduce the sending rate variation. Considering the responsiveness of the transport protocol, we set
a lower bound

R_b(j) = R(j-1) + \rho \,(R_{tcp}(j) - R(j-1))   (9)

where \rho \in [0, 1] is a response factor. If R_c(j) \ge R_b(j), we directly set
R(j) to be R_c(j), and D to be zero. Otherwise we can only set
R(j) to be R_b(j), and D is updated as follows:

D \leftarrow D + (R(j) - R_{tcp}(j)) \,\Delta t(j) / T_f   (10)

2) R_{tcp}(j) \le R(j-1) and D < 0: Similar to the
previous case, we set an upper bound R_b(j) as in (9).
So if R_c(j) \le R_b(j), we set R(j) to be R_c(j), and D
to be zero. Otherwise, we set R(j) to be R_b(j),
and D is updated as in (10).

Obviously, with a larger \rho, better responsiveness can be
achieved, while a longer time may be needed to achieve
long-term TCP-friendliness. The case of \rho = 1 has as good
responsiveness as TFRC, but long-term TCP-friendliness is
not taken into account, while the case of \rho = 0 has slow
responsiveness and will not change the sending rate unless
the long-term TCP-friendliness has been met. We will investigate the influence of \rho on the system performance through
simulations in Section VII.
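One feedback-interval step of the rate-compensation logic described above can be sketched as follows (a minimal sketch under our naming; rates are in bytes per frame as in the text, and `dt_frames` is the feedback-interval length in frame periods):

```python
def compensated_rate(r_tcp, r_prev, D, dt_frames, rho):
    """One feedback-interval step of rate compensation.
    D is the accumulated surplus of actually-sent data over the ideal
    TCP-friendly amount; returns (sending rate, updated D)."""
    r_c = r_tcp - D / dt_frames            # rate that pulls D back to zero
    r_b = r_prev + rho * (r_tcp - r_prev)  # responsiveness bound
    if r_tcp >= r_prev and D > 0:          # bandwidth up, surplus to repay
        if r_c >= r_b:
            return r_c, 0.0
        return r_b, D + (r_b - r_tcp) * dt_frames
    if r_tcp <= r_prev and D < 0:          # bandwidth down, deficit to repay
        if r_c <= r_b:
            return r_c, 0.0
        return r_b, D + (r_b - r_tcp) * dt_frames
    return r_tcp, D                        # otherwise follow TFRC directly
```

With the response factor at 1, the bound equals the TCP-friendly rate and the sketch behaves like plain TFRC; with it at 0, the rate is held until the accumulated difference is repaid.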
B. The Virtual Network Buffer Management Algorithm

When the feedback from the receiver is received, the virtual
network buffer fullness is updated as follows:

B_n = u(t - d_u) - v   (11)

where u(t - d_u) is the amount of data sent from the beginning of the transmission up to time t - d_u, v is the fed-back
amount of received data (including that of the detected lost
packets), and d_u is the uplink delay, which
is assumed to be half of the RTT in this paper. Note that any
algorithm for accurate estimation of the uplink delay can be
used. Then the bounds are updated according to (8). However, if
\sum_{i=k-\Delta+1}^{k} S(i) is too large so that
\sum_{i=k-\Delta+1}^{k} S(i) - B_n(k) - B_d^{max} > B_e^{max},
we can find that this will lead to B_L(k) > B_H(k) and the violation of the maximum admissible sending rate constraint in (7).
In this case, we need to decrease the sending rate to make sure
the constraint of (7) is met (i.e., making B_L(k) \le B_H(k)). Note that
we cannot directly make B_L(k) equal to B_H(k), because in that case the source rate would have to precisely
match the sending rate, and consequently the video quality variation might be very large. So we decrease R(j) to the value
which results in B_H(k) - B_L(k) = W,
where W is the maximum rate adaptation range. Then the
bounds are updated according to (8) and the new estimate of B_n. Note
that there is some tradeoff in selecting W. With a larger W, there
is a larger adaptation range between B_L(k) and B_H(k), which may
lead to smoother video quality. However, a large W will lead
to a low sending rate, which may result in low video quality.
Simulation results show that it is a good choice to set W to
40 kB when the video source is in QCIF format.

This adjustment of R(j) will result in temporary un-TCP-friendliness, so D is updated as follows:

D \leftarrow D + (R(j) - R_{tcp}(j)) \,\Delta t(j) / T_f   (12)
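The update in (11) amounts to simple bookkeeping, sketched below (an illustrative sketch; `sent_by` is a hypothetical function returning the cumulative bytes sent up to a given time, and the uplink delay is approximated by RTT/2 as in the text):

```python
def virtual_buffer_fullness(sent_by, now, rtt, v_reported):
    """Virtual network buffer occupancy: data sent up to (now - uplink
    delay) that the receiver has not yet reported as received, with the
    uplink delay approximated by RTT/2."""
    d_u = rtt / 2.0
    return sent_by(now - d_u) - v_reported
```

Detected lost packets are already included in the reported received amount, so in-flight data and data lost without detection are what remain in the virtual buffer.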

C. Joint Decision of Source Rate and Sending Rate in the
Middleware

1) Decision of the Source Rate and the Sending Rate: The
sending rate R(j) is usually set to R_{tcp}(j) for good TCP-friendliness. For the source rate, to maintain the encoder buffer within
the bounds of B_L(k) and B_H(k), we first set

S(k) = R(j) + \beta \left( \frac{B_L(k) + B_H(k)}{2} - B_e(k-1) \right)   (13)

where \beta is a constant. Note that a large value of \beta will typically
result in large PSNR variation, but will maintain the buffer occupancy at around the middle of the buffer (i.e., (B_L(k) + B_H(k))/2).
Simulation results show that it is a good choice to set \beta to
0.04.

Let us denote the peak signal-to-noise ratio (PSNR) of the
previous frame as P(k-1), and denote S^-(k) and S^+(k), respectively,
as the frame sizes that make the PSNR of the k-th frame equal
to P(k-1) - \delta and P(k-1) + \delta, where \delta is a constant that
controls the PSNR variation. If S(k) is not between S^-(k) and
S^+(k), we set S(k) to be the corresponding boundary frame size
in order to achieve smooth video quality.

Note that the above two steps do not necessarily result in
B_e(k) being within the bounds. If B_e(k) is not between B_L(k)
and B_H(k), we have to set the sending rate to the value which can pull B_e(k)
back just within the bounds. For example, if B_e(k) is larger
than B_H(k), we will set the amount of sent data according
to (2). We assume there exists a minimum acceptable and a maximum necessary PSNR for the video source. If we find that the
PSNR of frame k is out of this range, we set S(k) to be the size
that makes the PSNR equal to the corresponding boundary value
according to the rate-distortion relationship of frame k.
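The first two steps of the source-rate decision can be sketched as follows (a sketch under our assumptions: the mid-buffer tracking form of (13) is our reading, the names are ours, and the rate-distortion mapping from PSNR targets to the clamp sizes `s_minus`/`s_plus` is left abstract):

```python
def decide_source_rate(r_send, b_e_prev, b_lo, b_hi, s_minus, s_plus, beta=0.04):
    """Pick the next frame size: track the middle of the encoder-buffer
    bounds, then clamp to the frame sizes that keep the PSNR within
    +/- delta of the previous frame."""
    mid = (b_lo + b_hi) / 2.0
    s = r_send + beta * (mid - b_e_prev)  # buffer tracking step
    return min(max(s, s_minus), s_plus)   # PSNR smoothness clamp
```

The clamp is what trades buffer tracking against quality smoothness: a frame size demanded by the buffer target is overridden whenever it would move the PSNR by more than the allowed step.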
2) Adaptation at the Beginning of a New Feedback Interval: Suppose at time t_k a new feedback interval begins.
So we have new feedback information and have to check if
the decoder buffer will underflow or overflow based
on the updated virtual network buffer occupancy and sending rate.²

The underflow case: We will first try reducing the sizes of the
frames still in the encoder buffer (we assume this is possible,
e.g., given a scalable bitstream), with the constraint of a maximum
allowable PSNR variation of \delta dB between two adjacent
frames, to avoid the underflow of the decoder buffer.
If this cannot prevent the decoder buffer underflow, we have to
consider adjusting the sending rate. First, we need to consider
the minimum bandwidth requirement of the video source. If we
find that the previous M (a constant which is set to 12 in the
simulations) frames all have the minimum quality, it suggests
that R_{tcp}(j) might be smaller than the minimum bandwidth requirement of the video source. Here we use the average size of
the previous M minimum-quality frames to indicate the
minimum bandwidth requirement of the video source,³ so we
update R(j) as

R(j) = \frac{1}{M} \sum_{i=k-M}^{k-1} S(i)

Then D is updated according to (12).

If R_{tcp}(j) is larger than the minimum bandwidth requirement,
or the decoder buffer will still underflow even with the above
updated R(j), we have to adjust R(j),
which is set to R_u(j), the minimum value which can prevent the
decoder buffer underflow; R_u(j) is given by

R_u(j) = \max_{1 \le n \le N} \frac{\sum_{i=1}^{k-\Delta+n} S(i) - v}{n}   (14)

(see the proof in the Appendix), where N is the number of frame
periods in the feedback interval. Then D is updated as follows:

D \leftarrow D + (R(j) - R_{tcp}(j)) \,\Delta t(j) / T_f   (15)

The overflow case: We need to reduce the sending rate
to prevent the decoder buffer overflow. So we set R(j)
to R_o(j), the maximum value which
can prevent the decoder buffer overflow; R_o(j) is given by

R_o(j) = \min_{1 \le n \le N} \frac{\sum_{i=1}^{k-\Delta+n} S(i) + B_d^{max} - v}{n}   (16)

(see the proof given in the Appendix). Then D is updated according to (15).

²Note that B_d(k-1) can be calculated using (4); then B_d(k), ..., B_d(k+N-1) can be estimated using (2).

³To have an effective estimation, there should be at least one I frame in the
previous M frames, i.e., M should be larger than the GOP size.
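Under one plausible reading of (14) and (16), in which the cumulative received data must cover the cumulative decoded data at every future frame time while staying within the decoder buffer size, the two rate bounds can be sketched as follows (names and indexing are our assumptions):

```python
def underflow_overflow_rates(cum_S, k, delta, v, b_d_max, n_frames):
    """cum_S[i] holds the total bytes of frames 1..i (cum_S[0] == 0);
    v is the cumulative bytes received so far. Returns (r_min, r_max):
    per-frame sending rates that just prevent decoder-buffer underflow
    and overflow over the next n_frames frame periods."""
    r_min = max((cum_S[k - delta + n] - v) / n
                for n in range(1, n_frames + 1))
    r_max = min((cum_S[k - delta + n] + b_d_max - v) / n
                for n in range(1, n_frames + 1))
    return r_min, r_max
```

A feasible operating point requires r_min not to exceed r_max; otherwise the frame sizes themselves must be reduced first, as in the underflow case above.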

Fig. 3. Simulation topology.

VII. SIMULATION RESULTS


We use the network simulator of [29] to investigate
the performance of our proposed algorithm. The simulation
topology is depicted in Fig. 3, where there are three links
(R1-R2, R2-R3, and R3-R4). Each link has a fixed transmission delay
and a variable capacity which depends on the simulation
scenario. To simulate the real Internet environment, various
cross-traffic (FTP flows and WWW flows) combinations have
been used. The FTP flows are modeled as greedy sources which
can always send data at the available bandwidth. The WWW
flows are modeled as on-off processes.
Different video sequences are used as the video source, and
encoded using an MPEG-4 fine granularity scalable (FGS)
coder [30]. The encoder uses interframe coding (with the GOP
size of 10 and the frame type of I and P) and quantization
stepsize of 31 to generate the base layer, which provides the
minimum video quality. Then the FGS coder generates the
embedded enhancement layer bitstream, which can be cut off at
any bit to adapt the source rate with fine granularity. The frame
rate is set to 25 frames per second (fps). The allowable
maximum PSNR variation between two adjacent frames is set
to 0.5 dB, and the maximum necessary PSNR is set to 40 dB.
We packetize the base layer and enhancement layer separately,
and the maximum segment size (MSS) is set to 1000 bytes. In
this paper, we use a simple error resilience algorithm. If the
base layer of some frame is lost or late, the base layer of the
previous frame will be used in decoding. If there is a packet
loss in the enhancement layer, all less important packets in that
frame will be discarded, as they all depend on the lost, more
important packet. Note that in MPEG-4 FGS, the loss of enhancement-layer packets does not cause distortion propagation
to subsequent frames.
We compare the performance of three source rate/congestion control algorithms. One is the direct buffer-state feedback
control algorithm in [19] with TFRC as the congestion control mechanism, denoted as DB-TFRC. One uses the global rate
control model in [18] and TFRC, denoted as GM-TFRC. The
third one is our proposed joint design algorithm, denoted as
VB-TFRCC. For fair comparisons, TFRCC and TFRC have the
same feedback interval in the simulations. Note that
both GM-TFRC and DB-TFRC belong to traditional separate
design approaches.
We use the average PSNR and PSNR deviation to evaluate
the video quality, where the average PSNR deviation of one
video sequence is calculated by averaging the PSNR difference
between every two adjacent frames. To evaluate the long-term
TCP-friendliness and internal fairness (i.e., the fairness among

Fig. 4. Encoder and decoder buffer occupancies of one DB-TFRC flow, one GM-TFRC flow, and one VB-TFRCC flow in Scenario 1. (a) Encoder buffer. (b)
Decoder buffer. Top: DB-TFRC. Middle: GM-TFRC. Bottom: VB-TFRCC.

the flows using the same congestion control mechanism) of the
protocol, we adopt the metrics defined in [31, ch. 4], where a
value close to 1 indicates a good TCP-friendliness or internal
fairness. However, note that in the current form of TFRC and
TFRCC implemented in this paper, we use the mean packet size
to calculate the sending rate in order to support multimedia flows
with variable packet size. So if the sending rate in bits per second
is used in the fairness evaluation, the fairness of one flow will be
related to the flow's mean packet size [10]. To remove the bias
of packet size, we use the sending rate in packets per second to
calculate the fairness of TFRC and TFRCC [17]. To make sure
that the probability that one packet is dropped is independent of
its size, the DropTail queue measured in packets and the random
early detection (RED) queue in packet mode are used in the simulations [16].
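The quality-smoothness metric used above (the average PSNR difference between every two adjacent frames) can be computed as, for example:

```python
def avg_psnr_deviation(psnrs):
    """Average absolute PSNR difference between every two adjacent
    frames; lower means smoother playback quality."""
    diffs = [abs(a - b) for a, b in zip(psnrs, psnrs[1:])]
    return sum(diffs) / len(diffs)
```

A sequence with a high average PSNR can still score poorly on this metric if its quality oscillates from frame to frame.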
Next, we will present the simulation results for different scenarios. In each scenario, there are six TFRC flows (including
three DB-TFRC and three GM-TFRC flows), four TFRCC (i.e.,
VB-TFRCC) flows, and four FTP flows running throughout the
entire simulation. We add different background flows to simulate different network conditions.

A. Scenario 1: Low Bandwidth Situation


In this scenario, we would like to simulate a network condition
with low available bandwidth. All three links have
a capacity of 10 Mbps and a RED queue with a maximum
threshold of 200 packets. The simulation lasts 600 s. There are
70 FTP flows joining at 30 s and departing at 300 s. Both the
encoder buffer size and the decoder buffer size are set to 80 kB.
The start-up delay is set to 15 frames. The standard video sequence
foreman (300 frames) in QCIF format is used repeatedly as
the video source. The response factor of TFRCC is first set to
0.15.
Fig. 4 shows the encoder and decoder buffer occupancies for
different flows. Within the first 30 s and the last 300 s, the available bandwidth is relatively low. DB-TFRC only considers the
encoder buffer state, and maintains the encoder buffer occupancy at around half of the buffer size. So the data in the encoder
buffer cannot be sent out in a timely manner because of the TFRC sending
rate constraint, and consequently decoder buffer underflow
occurs. GM-TFRC and VB-TFRCC, on the other hand, take into
account the end-to-end delay constraint, and lower the encoder
buffer occupancy to avoid decoder buffer underflow. The available bandwidth is lowered further between 30 s and 300 s with
the joining of 70 FTP flows, which makes the available bandwidth smaller than the minimum bandwidth requirement of the
video source occasionally. In this case, source rate control alone
cannot ensure the end-to-end delay constraint being met due
to the minimum bandwidth requirement and quality smoothness constraint of the application, and the decoder buffers of
DB-TFRC and GM-TFRC underflow. However, TFRCC can
meet the end-to-end delay constraint by temporarily making the
sending rate larger than the TCP-friendly value when necessary
(see Fig. 5). As a result, VB-TFRCC almost completely avoids decoder buffer underflow. From Fig. 5, we find that VB-TFRCC
maintains a slightly lower sending rate than TFRC after 300 s to
achieve good long-term TCP-friendliness.
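The interplay just described, temporarily exceeding the TCP-friendly rate to meet an urgent delay constraint and then sending below it until the excess is repaid, can be sketched as follows. This is our own simplified accounting, not the paper's exact TFRCC algorithm; `beta` plays the role of the response factor.

```python
def tfrcc_step(tfrc_rate, required_rate, debt, beta=0.15):
    """One control interval of a QoS-aware sender with rate compensation.

    If the application's minimum required rate exceeds the TCP-friendly
    rate, send at the required rate and record the excess as debt;
    otherwise repay the debt by sending below the TCP-friendly rate,
    but never below beta * tfrc_rate.  A larger beta keeps the sender
    more responsive during compensation at the cost of a longer
    recovery.  Returns (sending_rate, remaining_debt)."""
    if required_rate > tfrc_rate:
        # Urgent QoS need: temporarily violate TCP-friendliness.
        return required_rate, debt + (required_rate - tfrc_rate)
    rate = max(beta * tfrc_rate, tfrc_rate - debt)
    return rate, debt - (tfrc_rate - rate)
```

With `beta = 0` the sending rate may be held near zero until the debt is fully repaid, mirroring the slower responsiveness but faster recovery observed in the simulation below.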
To show the effect of the response factor on the system performance,
we set it to 0 and repeat the simulation. From Fig. 5, we can see
that TFRCC then has worse responsiveness: it holds its
sending rate, even though the available bandwidth has increased,
until the accumulated compensation is pulled back to zero. On the
other hand, it takes less time for TFRCC to recover from the
compensation. In the following simulations, the response factor
is set to 0.15.

ZHU et al.: JOINT DESIGN OF SOURCE RATE CONTROL

373

Fig. 5. Sending rate of one GM-TFRC flow and one VB-TFRCC flow in Scenario 1 when the response factor is set to 0.15 and 0, respectively. Top: response factor set to 0.15. Bottom: response factor set to 0.

Fig. 6. Sending rate of one GM-TFRC flow and one VB-TFRCC flow in Scenario 2.

B. Scenario 2: High Bandwidth Situation


In this scenario, we simulate a network condition where the
available bandwidth is very high. All three links have a capacity of 17 Mbps and a RED queue with the same configuration as in Scenario 1. We use the same video source as in the previous scenario. The simulation lasts 700 s. As the background
flows, 20 WWW flows join at 300 s. The start-up delay is set to
25 frames. Both the encoder and decoder buffer sizes are set to
75 kB.
Within the first 300 s, the available bandwidth is high,
and may sometimes be higher than the maximum admissible
sending rate constrained by the buffer sizes. Consequently, the
decoder buffers of both DB-TFRC and GM-TFRC overflow
[see Fig. 7(a)]. In contrast, VB-TFRCC takes into account
the sending rate constraint imposed by the buffer sizes, and
ensures that the sending rate does not exceed the maximum admissible
value (see Fig. 6). So VB-TFRCC successfully avoids decoder buffer
overflow. After 300 s, it has to maintain a slightly
higher sending rate than TFRC to achieve good long-term TCP-friendliness.
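The overflow cap that VB-TFRCC enforces can be illustrated with simple buffer accounting (variable names are hypothetical): over the next control interval the decoder buffer gains whatever is sent and drains at the playback rate, so the free buffer space bounds the sending rate from above.

```python
def max_admissible_rate(buf_size, occupancy, playout_rate, dt):
    """Largest sending rate (bytes/s) over the next dt seconds that
    cannot overflow a decoder buffer of buf_size bytes currently
    holding `occupancy` bytes and draining at playout_rate bytes/s."""
    return (buf_size - occupancy) / dt + playout_rate

# 80 kB buffer, 60 kB occupied, draining at 40 kB/s, 0.5 s interval
cap = max_admissible_rate(80_000, 60_000, 40_000, 0.5)  # 80,000 bytes/s
```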
C. Scenario 3: Large End-to-End Delay Situation
In the previous simulations, the end-to-end delay is set to
a relatively small value, and only the RED queue is used. In
this scenario, we compare the performance of the
three algorithms with a large start-up delay, which is set to 125
frames (5 s), and with the DropTail queue used in the simulation
topology. The simulation lasts 700 s. As the background flows,
76 FTP flows join at 200 s and depart at 400 s; then
15 WWW flows join at 400 s. All the links have a capacity
of 22 Mbps and a DropTail queue with a size of 300 packets.
Both the encoder buffer size and the decoder buffer size are set to
400 kB. In this scenario, the standard video sequence coastguard (300 frames) in QCIF format is used repeatedly as the
video source.
Setting the end-to-end delay sufficiently large is usually considered an effective way of protecting the continuous playback
of multimedia applications. However, from Fig. 7(b), we can
see that even with a very large end-to-end delay, DB-TFRC
and GM-TFRC still cannot achieve satisfactory performance.
Within the first 200 s, the available bandwidth is larger than
the maximum admissible value constrained by the buffer sizes,
so their decoder buffers overflow. Between 200 and 400 s,
the available bandwidth may sometimes be smaller than the
minimum bandwidth requirement of the video source, so the
data in the encoder buffers of DB-TFRC and GM-TFRC cannot
be sent out in a timely manner, which causes their decoder buffers to underflow. VB-TFRCC, on the other hand, successfully
avoids both the underflow and the overflow of the decoder buffer by
temporarily adapting its sending rate to support the urgent QoS
requirements of the application.
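Symmetric reasoning gives the underflow bound that source rate control alone could not always satisfy here: the buffer must not drain to zero within the interval, which puts a floor on the sending rate (again a sketch with hypothetical names):

```python
def min_required_rate(occupancy, playout_rate, dt):
    """Smallest sending rate (bytes/s) over the next dt seconds that
    keeps a decoder buffer holding `occupancy` bytes from underflowing
    while it drains at playout_rate bytes/s."""
    return max(0.0, playout_rate - occupancy / dt)

# 10 kB buffered, draining at 40 kB/s, 0.5 s interval
need = min_required_rate(10_000, 40_000, 0.5)  # 20,000 bytes/s
```

When the available bandwidth falls below this floor, no TCP-friendly sender can prevent underflow, which is exactly the situation in which VB-TFRCC temporarily exceeds its TCP-friendly share.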
D. Summary of the Simulation Results
In all the scenarios, VB-TFRCC better supports the QoS
requirements of the application, and avoids or reduces the underflow and overflow of the decoder buffer. As a result, VB-TFRCC
significantly reduces the video quality degradation due
to lost or late packets between the sender and the receiver, and
achieves a higher average PSNR and smoother video quality than
DB-TFRC and GM-TFRC (see Tables I–III, where all the
entries are averaged over all the flows using the same
source rate/congestion control algorithm). This can also be
easily seen from Fig. 8. Note that very low PSNR values (e.g.,
less than 25 dB) in the figure typically indicate an effective loss
of base-layer packets for a frame, which introduces significant
quality loss for that frame and the subsequent frames.
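The two quality metrics reported in Tables I–III can be computed directly from a per-frame PSNR trace; the deviation below is the mean absolute PSNR difference between adjacent frames, as defined earlier (function name is ours):

```python
def psnr_stats(psnr_trace):
    """Average PSNR and average PSNR deviation (mean absolute
    difference between every two adjacent frames) of a sequence."""
    avg = sum(psnr_trace) / len(psnr_trace)
    dev = (sum(abs(a - b) for a, b in zip(psnr_trace, psnr_trace[1:]))
           / (len(psnr_trace) - 1))
    return avg, dev

avg, dev = psnr_stats([36.0, 35.0, 37.0, 36.0])  # avg = 36.0
```

A lower deviation means smoother quality across frames, which is why it is reported alongside the average PSNR.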
Although TFRCC may sometimes let its sending rate temporarily violate TCP-friendliness, Table IV shows that it can still achieve good long-term TCP-friendliness and internal fairness by using the rate-compensation algorithm. One
may be concerned about how the overall network performance is affected when short-term TCP-friendliness is broken. To address


Fig. 7. Decoder buffer occupancies of one DB-TFRC flow, one GM-TFRC flow, and one VB-TFRCC flow in Scenarios 2 and 3. (a) Scenario 2. (b) Scenario 3.
Top: DB-TFRC. Middle: GM-TFRC. Bottom: VB-TFRCC.

TABLE I
PSNR OF DB-TFRC, GM-TFRC, AND VB-TFRCC IN SCENARIO 1
(RESPONSE FACTOR = 0.15)

TABLE II
PSNR OF DB-TFRC, GM-TFRC, AND VB-TFRCC IN SCENARIO 2

TABLE III
PSNR OF DB-TFRC, GM-TFRC, AND VB-TFRCC IN SCENARIO 3

TABLE IV
TCP-FRIENDLINESS (TF) AND INTERNAL FAIRNESS (IF) OF TFRC AND TFRCC

Fig. 8. PSNR curves of the sequences decoded at the decoder of one
GM-TFRC flow and one VB-TFRCC flow in Scenario 3.

this question, we replace the VB-TFRCC flows with GM-TFRC
flows (so that only TCP and TFRC flows remain in the simulations)
and repeat the simulations of the above three scenarios. We then
compare the overall network performance (including the overall
packet loss ratio introduced in the network and the utilization
ratio of the bottleneck bandwidth), and find almost no difference
between using TFRC and TFRCC (see Table V). So our proposed
algorithm does not deteriorate the overall network performance,
although it sometimes exhibits temporary TCP-unfriendly behavior.
TABLE V
THE OVERALL NETWORK PERFORMANCE (INCLUDING THE OVERALL
PACKET LOSS RATIO IN THE NETWORK AND THE UTILIZATION RATIO OF THE
BOTTLENECK BANDWIDTH) WHEN USING TFRC/TFRCC

VIII. CONCLUSION AND FUTURE WORK

This paper proposes an algorithm for the joint design of source
rate control and QoS-aware congestion control for video
streaming over the Internet. Simulation results show that,
compared with traditional separate design algorithms, this
cross-layer design approach can provide better QoS support
for multimedia applications and significantly improve the
playback quality, while still meeting the fairness constraint of
the network.

In this paper, we only address the problem of streaming video
over the wired Internet. Our ongoing work is to extend this work
to the wireless Internet, and to provide a feasible solution to better
support the deployment of multimedia applications over wireless
channels. Some initial results are presented in [32].

APPENDIX
PROOF OF (14) AND (16)

Given the past frame sizes and the sending rate, according to
(2), we have (17). We then use the sending rate to estimate the
receive rates, and obtain (18).

To avoid decoder buffer underflow, it is needed that (19) holds.
Combining (18) and (19), it can be easily derived that the
minimum bandwidth which can prevent the decoder buffer
underflow over the interval is the bound given in (14).

To avoid decoder buffer overflow, we require (20). Combining
(18) and (20), the maximum bandwidth which can prevent the
decoder buffer overflow over the interval is the bound given
in (16).

ACKNOWLEDGMENT

The authors would like to thank G.-M. Su and M. Wu of the
University of Maryland for providing the MPEG-4 FGS
software.

REFERENCES
[1] S. Shakkottai, T. S. Rappaport, and P. C. Karlsson, "Cross-layer design for wireless networks," IEEE Commun. Mag., vol. 41, no. 10, pp. 74–80, Oct. 2003.
[2] M. Handley, S. Floyd, J. Padhye, and J. Widmer, "TCP friendly rate control (TFRC): Protocol specification," IETF RFC 3448, Jan. 2003.
[3] B. Xie and W. Zeng, "Rate distortion optimized dynamic bitstream switching for scalable video streaming," in Proc. IEEE Int. Conf. Multimedia and Expo (ICME), Taipei, Taiwan, Jun. 2004.
[4] J. Widmer, R. Denda, and M. Mauve, "A survey on TCP-friendly congestion control," IEEE Network, vol. 15, no. 3, pp. 28–37, May 2001.
[5] B. Wang, J. F. Kurose, P. J. Shenoy, and D. F. Towsley, "Multimedia streaming via TCP: An analytic performance study," in Proc. ACM Multimedia '04, Oct. 2004, pp. 908–915.
[6] A. Sehgal, O. Verscheure, and P. Frossard, "Distortion-buffer optimized TCP video streaming," in Proc. IEEE Int. Conf. Image Processing (ICIP), Oct. 2004, pp. 2083–2086.
[7] P. de Cuetos, P. Guillotel, K. W. Ross, and D. Thoreau, "Implementation of adaptive streaming of stored MPEG-4 FGS video over TCP," in Proc. IEEE Int. Conf. Multimedia and Expo (ICME), Aug. 2002, pp. 405–408.
[8] D. Bansal and H. Balakrishnan, "Binomial congestion control algorithms," in Proc. IEEE INFOCOM, Apr. 2001, pp. 631–640.
[9] Y. R. Yang and S. S. Lam, "General AIMD congestion control," in Proc. 8th IEEE Int. Conf. Network Protocols, Nov. 2000, pp. 187–198.
[10] L. Cai, X. Shen, J. Pan, and J. W. Mark, "Performance analysis of TCP-friendly AIMD algorithms for multimedia applications," IEEE Trans. Multimedia, vol. 7, no. 2, pp. 339–355, Apr. 2005.
[11] N. R. Sastry and S. S. Lam, "CYRF: A theory of window-based unicast congestion control," IEEE/ACM Trans. Netw., vol. 13, no. 2, pp. 330–342, Apr. 2005.
[12] R. Rejaie, M. Handley, and D. Estrin, "RAP: An end-to-end rate-based congestion control mechanism for real-time streams in the Internet," in Proc. IEEE INFOCOM '99, Mar. 1999, pp. 1337–1345.
[13] D. Sisalem, "TCP-friendly congestion control for multimedia communication in the Internet," Ph.D. dissertation, Tech. Univ. Berlin, Berlin, Germany, 2000.
[14] T. Kim, S. Lu, and V. Bharghavan, "Improving congestion control performance through loss differentiation," in Proc. IEEE Int. Conf. Computers and Communication Networks, Oct. 1999, pp. 412–418.
[15] Y.-G. Kim, J. Kim, and C.-C. Jay Kuo, "TCP-friendly Internet video with smooth and fast rate adaptation and network-aware error control," IEEE Trans. Circuits Syst. Video Technol., vol. 14, no. 2, pp. 256–268, Feb. 2004.
[16] J. Widmer, C. Boutremans, and J.-Y. Le Boudec, "End-to-end congestion control for TCP-friendly flows with variable packet size," ACM SIGCOMM Comput. Commun. Rev., vol. 34, no. 2, pp. 137–151, Apr. 2004.
[17] S. Floyd and E. Kohler, "TCP friendly rate control (TFRC) for voice: VoIP variant and faster restart," Internet Draft, IETF, Feb. 2005.
[18] J. Vieron and C. Guillemot, "Real-time constrained TCP-compatible rate control for video over the Internet," IEEE Trans. Multimedia, vol. 6, no. 3, pp. 634–646, Aug. 2004.
[19] S. Jacobs and A. Eleftheriadis, "Streaming video using TCP flow control and dynamic rate shaping," J. Vis. Commun. Image Represent., vol. 9, no. 21, pp. 12–22, 1998.
[20] M. Kalman, E. Steinbach, and B. Girod, "Adaptive media playout for low-delay video streaming over error-prone channels," IEEE Trans. Circuits Syst. Video Technol., vol. 14, no. 6, pp. 841–851, Jun. 2004.
[21] C.-Y. Hsu, A. Ortega, and A. R. Reibman, "Joint selection of source and channel rate for VBR video transmission under ATM policing constraints," IEEE J. Sel. Areas Commun., vol. 15, no. 6, pp. 1016–1028, Aug. 1997.
[22] W.-C. Feng and J. Rexford, "Performance evaluation of smoothing algorithms for transmitting prerecorded variable-bit-rate video," IEEE Trans. Multimedia, vol. 1, no. 3, pp. 302–313, Sep. 1999.
[23] S. Sen, J. L. Rexford, J. K. Dey, J. F. Kurose, and D. F. Towsley, "Online smoothing of variable-bit-rate streaming video," IEEE Trans. Multimedia, vol. 2, no. 1, pp. 37–48, Mar. 2000.
[24] P. de Cuetos and K. W. Ross, "Adaptive rate control for streaming stored fine grained scalable video," in Proc. NOSSDAV, May 2002, pp. 3–12.
[25] T. Kim and M. H. Ammar, "Optimal quality adaptation for scalable encoded video," IEEE J. Sel. Areas Commun., vol. 23, no. 2, pp. 344–356, Feb. 2005.
[26] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, "RTP: A transport protocol for real-time applications," IETF RFC 3550, Jul. 2003.
[27] J. Widmer and M. Handley, "TCP-friendly multicast congestion control (TFMCC): Protocol specification," Internet Draft, IETF, Oct. 2004.
[28] J. Ribas-Corbera, P. A. Chou, and S. L. Regunathan, "A generalized hypothetical reference decoder for H.264/AVC," IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 7, pp. 674–687, Jul. 2003.
[29] S. Floyd and S. McCanne, Network Simulator, LBNL public domain software [Online]. Available: ftp://ftp.ee.lbl.gov; http://www.isi.edu/nsnam/ns/.
[30] H. M. Radha, M. van der Schaar, and Y. Chen, "The MPEG-4 fine-grained scalable video coding method for multimedia streaming over IP," IEEE Trans. Multimedia, vol. 3, no. 1, pp. 53–68, Mar. 2001.
[31] J. Padhye, "Towards a comprehensive congestion control framework for continuous media flows in best effort networks," Ph.D. dissertation, Univ. Massachusetts, Amherst, 2000.
[32] P. Zhu, W. Zeng, and C. Li, "Cross-layer design of source rate control and QoS-aware congestion control for wireless video streaming," in Proc. IEEE Int. Conf. Multimedia and Expo (ICME), Jul. 2006.

Peng Zhu received the B.S. degree in electrical engineering and the M.S. and Ph.D. degrees in automation from Tsinghua University, Beijing, China,
in 2000, 2003, and 2006, respectively.
Currently, he is an Associate Senior Researcher
with Hitachi (China) R&D Corp. His research interests include multimedia networking, optical access
network, IPTV, control theory and applications.

Wenjun Zeng (S'94–M'97–SM'03) received the
B.E. degree from Tsinghua University, Beijing, China, in 1990, the M.S. degree from the University of
Notre Dame, Notre Dame, IN, in 1993, and the Ph.D. degree from Princeton
University, Princeton, NJ, in 1997, all
in electrical engineering.
He has been an Associate Professor with the
Computer Science Department, University of Missouri, Columbia, since August 2003. Prior to that,
he was with PacketVideo Corporation, Sharp Labs
of America, Bell Laboratories, and Matsushita
Information Technology Lab, Panasonic Technologies Inc. From 1998 to 2002,
he was an active contributor to the MPEG-4 Intellectual Property Management
and Protection (IPMP) standard and the JPEG 2000 image-coding standard,
where four of his proposals were adopted. He has been awarded 12 patents. His
current research interests include multimedia communications and networking,
content and network security, wireless multimedia, and distributed source and
channel coding.
Dr. Zeng has served as an Organizing Committee Member and Technical
Program Committee (TPC) Chair/Member for a number of IEEE international conferences. He is an Associate Editor of the IEEE TRANSACTIONS ON
MULTIMEDIA and is on the Editorial Board of IEEE Multimedia Magazine. He
is currently serving as the TPC Chair for the 2007 IEEE Consumer Communications and Networking Conference (CCNC), and is Guest Editing the Special
Issue on Recent Advances in Distributed Multimedia Communications for the
PROCEEDINGS OF THE IEEE. He has previously served as the TPC Vice-Chair
for CCNC 2006, the TPC Co-Chair for the Multimedia Communications
and Home Networking Symposium, 2005 IEEE International Conference on
Communication. He was the Lead Guest Editor of the IEEE TRANSACTIONS
ON MULTIMEDIA Special Issue on Streaming Media published in April 2004.
He is a member of the IEEE Signal Processing Society's Multimedia Signal
Processing Technical Committee and the IEEE ComSoc's Multimedia Communications Technical Committee.

Chunwen Li received the B.E. degree, the M.E. degree in control theory, and the Ph.D. degree in control engineering from Tsinghua University, Beijing,
China, in 1982, 1985, and 1989, respectively.
He joined the Teaching Staff at the Department of
Automation, Tsinghua University, in 1989, where he
is currently a Professor of Information Science and
Technology. His current research interests center
around nonlinear control, network control, and
power system control.
