Distortion-Aware Scalable Video Streaming To Multinetwork Clients

IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 21, NO. 2, APRIL 2013
Nikolaos M. Freris, Member, IEEE, Cheng-Hsin Hsu, Member, IEEE, ACM, Jatinder Pal Singh, Member, IEEE,
and Xiaoqing Zhu, Member, IEEE
Abstract: We consider the problem of scalable video streaming
from a server to multinetwork clients over heterogeneous access
networks, with the goal of minimizing the distortion of the received
videos. This problem has numerous applications including: 1) mobile devices connecting to multiple licensed and ISM bands, and
2) cognitive multiradio devices employing spectrum bonding. In
this paper, we ascertain how to optimally determine which video
packets to transmit over each access network. We present models
to capture the network conditions and video characteristics and
develop an integer program for deterministic packet scheduling.
Solving the integer program exactly is typically not computationally tractable, so we develop heuristic algorithms for deterministic
packet scheduling, as well as convex optimization problems for
randomized packet scheduling. We carry out a thorough study of
the tradeoff between performance and computational complexity
and propose a convex programming-based algorithm that yields
good performance while being suitable for real-time applications.
We conduct extensive trace-driven simulations to evaluate the
proposed algorithms using real network conditions and scalable
video streams. The simulation results show that the proposed
convex programming-based algorithm: 1) outperforms the rate
control algorithms defined in the Datagram Congestion Control
Protocol (DCCP) by about 10-15 dB in video quality; 2) reduces average delivery delay by over 90% compared to DCCP;
3) achieves average video quality that is 4.47 and 1.92 dB higher than
that of the two developed heuristics; 4) runs efficiently, up to six times
faster than the best-performing heuristic; and 5) does indeed
provide service differentiation among users.
Index Terms: Quality optimization, rate control, stream adaptation, video streaming.

Manuscript received September 05, 2011; revised February 22, 2012
and April 10, 2012; accepted May 19, 2012; approved by IEEE/ACM
TRANSACTIONS ON NETWORKING Editor M. Reisslein. Date of publication
June 20, 2012; date of current version April 12, 2013. The work of C.-H. Hsu
was supported in part by the National Science Council (NSC) of Taiwan under
Grant #100-2218-E-007-015-MY2.
N. Freris is with IBM Research, 8803 Rüschlikon, Switzerland (e-mail:
nif@zurich.ibm.com).
C.-H. Hsu is with National Tsing Hua University, Hsin Chu, Taiwan (e-mail:
chsu@cs.nthu.edu.tw).
J. P. Singh is with the Department of Electrical Engineering, Stanford University, Stanford, CA 94305 USA (e-mail: jatinder@stanford.edu).
X. Zhu is with Cisco Systems, Inc., San Jose, CA 95134 USA (e-mail:
xiaoqzhu@cisco.com).
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TNET.2012.2203608

I. INTRODUCTION
Market research indicates that mobile data traffic will
increase 39 times over a span of five years, and 66% of
the increase will be due to mobile videos [4]. In fact, cellular
service providers are having a hard time coping with the huge
increase in mobile data traffic [1], [3] and will have to carefully engineer their systems to support high-quality real-time
video streaming. In wireless networks, one way to achieve the
best possible streaming quality is to leverage all available wireless spectra by connecting the streaming server to each client
via multiple access networks. We refer to the clients capable
of connecting to multiple access networks as multinetwork or
multihomed clients. Potential application scenarios of multinetwork clients include streaming videos to: 1) multiradio wireless
devices connected to different Industrial, Scientific, and Medical (ISM) bands [37]; 2) cognitive multiradio clients employing
spectrum bonding [34]; and 3) multiradio clients connected to
both licensed band (such as 3G cellular network) and ISM band
(such as IEEE 802.11 networks) [14]. A streaming server may
transmit a video concurrently over multiple access networks
to a multinetwork client, thus aggregating the various wireless
spectra to achieve better streaming quality. We call this setup
multinetwork (multihomed) video streaming, which is particularly challenging because access networks are diverse and dynamic. We note that concurrently activating multiple network
interfaces may lead to higher energy consumption on mobile
devices. While energy conservation is out of the scope of this
paper, several prior studies [16], [19], [33] propose mechanisms
to achieve burst traffic delivery to conserve energy, which can
be used in multinetwork video systems. Lastly, multihoming can
also be viewed as an alternative to multipath video streaming.
Multipath streaming, although studied in the literature, e.g., [10],
is not widely deployed. This is partially due to the additional
requirements on designated network equipment. In contrast to
multipath video streaming, multihomed video streaming works
on the current Internet infrastructure: For example, cellular service providers can adopt multihomed video streaming to maximize the overall streaming quality without overloading the networks. Multihomed video streaming, however, is challenging because of: 1) the heterogeneity and dynamics of access networks,
and 2) complicated interdependency among video packets.
Arbitrarily splitting a video stream into multiple substreams and sending each substream over an access network may lead to degraded video quality and playout glitches.
This is because transmitting a substream at a low rate may underutilize the network resources, while transmitting at a rate
close to the available bit rate may lead to network congestion,
which in turn causes packet drops and late packet delivery. To
this end, rate control based on measurements of available bit
rate (ABR) and round-trip time (RTT) needs to be performed
to achieve a good tradeoff between throughput and delay. In
nonscalable video streaming, once the bit rate of each substream
is determined, the video stream must be adapted into the right

format so that it can be delivered to the client in a timely fashion.
We refer to this conversion as stream adaptation, which is typically implemented by means of a computationally demanding
operation called transcoding [29], [36]. In contrast, scalable
video coding, such as the H.264/SVC standard [29], supports efficient stream adaptation and allows service providers to save
expenses on deploying streaming servers and transcoders. Despite a small penalty in coding efficiency, modern H.264/SVC
coders are reported to significantly outperform several scalable
coding schemes and even outperform some nonscalable coders
such as MPEG-4 Advanced Simple Profile (ASP) [35]. Scalable video streams feature complex interdependencies among
video packets, which stream adaptation must account for. The
rate control and stream adaptation problems must be simultaneously considered for optimal video streaming quality.
In this paper, we present a mathematical formulation of the
joint rate control and scalable stream adaptation problem for
multiple clients^1 concurrently competing for the same access
networks (cf. Fig. 1^2). We abstract the problem of streaming
videos to multinetwork clients and formulate an optimization
problem to determine, for each client: 1) the streaming rate over
each access network; 2) the video packets to be transmitted; and
3) the access network over which each transmitted video packet
is sent. Due to the discrete nature of the considered optimization
problem, and its NP-completeness, we formulate it as an integer
program in order to derive the globally optimal solutions. Our
contributions can be summarized as follows.
• We formulate the joint rate control and packet scheduling
problem as an integer program where the objective is to
minimize a cost function of the expected video distortion.
We suggest different cost functions in order to provide service differentiation and address fairness among users.
• We propose heuristic algorithms for packet scheduling,
analyze their complexity, and study their performance
through trace-driven simulations.
• We consider randomized packet scheduling by relaxing the
integer program into a real-valued optimization problem.
We derive convex programming approximations to this
problem.
• We analyze, both analytically and experimentally, the performance versus computational complexity tradeoff of the
proposed optimization programs and recommend one that
yields good performance while being suitable for real-time
applications.
• Simulation results show that the proposed algorithm:
1) outperforms the rate control algorithms defined in the
Datagram Congestion Control Protocol (DCCP) standard [22] by about 10-15 dB in terms of video quality;
2) achieves better balance between performance and
runtime; 3) reduces average delivery delay by over 90%
compared to DCCP; 4) results in better performance than
the proposed heuristics, under diverse background traffic
load; and 5) indeed provides service differentiation among
users.
1 Throughout the paper, we use the terms client and user interchangeably.
2 This figure shows a sample architecture, in which our algorithm runs on the
streaming server. Before a streaming session starts, each client sends a CONNECT UDP message to the server using different IP addresses associated with
each client network interface. The different IP addresses allow the streaming
server to direct the video packets over the chosen access networks to the client.
Fig. 1. Sample system architecture of a scalable video streaming system with
multiple clients and access networks.
The rest of this paper is organized as follows. We present related work in Section II. In Section III, we present the problem
formulation. In Section IV, we develop deterministic and randomized packet scheduling algorithms. We present trace-driven
simulations to evaluate the proposed algorithms in Section V.
In Section VI, we discuss some limitations of our approach and
propose future work, while Section VII concludes the paper.
II. RELATED WORK
Rate control for nonscalable video streams has been extensively investigated [6], [10], [11], [21], [30], [32], [39], [41].
Chakareski and Girod [10] propose an algorithm to enable a
streaming server to decide which packets to transmit over which
network paths so as to meet the predefined bandwidth constraints. Szwabe et al. [32] propose an architecture to monitor
network conditions and control the streaming rate over a single
access network. Jurca and Frossard [21] study the problem
of rate control for video streaming over a multihop network,
assuming known packet loss rates and available bandwidths for
each network link. Chou and Miao [11] propose a video-aware
framework to schedule video packets based on their importance
so as to maximize the video quality under given rate constraints.
Zhu et al. [41] propose joint routing and rate control algorithms
for ad-hoc wireless networks, while rate control for clients with
multiple interfaces has been studied in [6], [30], and [39]. There
has been a wide range of methodologies employed to address
the resource allocation problem of video streaming: for example, Singh et al. [30] propose a solution based on stochastic
control of Markov decision processes, Alpcan et al. [6] propose
a solution based on H∞-optimal control, and Zhu et al. [39]
propose a solution based on convex optimization.
Efficient stream adaptation for scalable streams has been
studied in [7], [12], [17], [25], and [31]. Hefeeda and Hsu [17]
consider the stream adaptation problem for fine-grained scalable (FGS) video streaming from multiple senders to a single
client; they employ a rate-distortion (R-D) function designed
for FGS streams and consider stream adaptation to maximize
the overall video quality. Amonou et al. [7] study the problem
of prioritizing video packets of H.264/SVC streams; they empirically calculate the distortion impact of dropping each video
packet and give higher priorities to video packets with higher
impact values. Sun et al. [31] propose an R-D model for FGS
streams coded by H.264/SVC, based on a generalized Gaussian
distribution source model that captures the drifting error caused
by truncating video packets. Mansour et al. [25] study stream
adaptation between one base station and multiple clients in a
single-hop wireless network; the clients share a given network
capacity for receiving FGS streams from the base station.
In [12], the authors propose a streaming platform to support
multihoming, which was tested to reduce video interruptions
and achieve higher and more stable received video quality.
This paper builds upon the preliminary results reported
in [13] and [18] and provides more detailed analysis and
elaborate evaluation results. To the best of our knowledge, our
work is the first that simultaneously considers the end-to-end
rate control and scalable stream adaptation for multinetwork
clients. Previous works either consider nonscalable video
streaming [6], [30], [39] or concentrate on scalable stream
adaptation without accounting for heterogeneous and dynamic
network conditions [7], [17], [25], [31].
III. PROBLEM FORMULATION
A. System Architecture
Table I summarizes the symbols used in the paper. A
multinetwork scalable streaming system consists of a scalable
streaming server containing a database of scalable videos, and
multinetwork clients, each one having access to several
heterogeneous networks (cf. Fig. 1). When requested by a client, a
video stream is divided into substreams (each transmitted
over a distinct network) by a video splitter that further controls
the rate of each substream to ensure timely delivery of video
packets. For each client, the server sets up a connection over
each access network and transmits the corresponding substream
over it. Each client has a dejitter buffer and a
video assembler, which combine the received substreams into
a single scalable video stream. The video stream is then fed to
a video decoder.
TABLE I
LIST OF SYMBOLS USED IN THE PAPER
Access networks are heterogeneous and time-varying, so
periodic measurements of the ABR, as well as the RTT,
need to be carried out for each access network. In our
implementation, we have opted to use a light-weight measurement
tool [5], although our algorithms are independent
of the choice of measuring tool. This measurement tool runs on
both server and client sides and monitors end-to-end network
conditions. We develop an algorithm to determine, on the
server side, the streaming rates of individual access networks
along with the video packets to be included in each substream,
given information about the network conditions and video
characteristics.
B. Network Model
For a given user, we consider the substream rate over each access
network; the total streaming rate for an access network aggregates
the substream rates of all users over that network. For each access
network, we define a packet loss probability, which accounts
for losses due to packets missing their playout deadlines. We
assume that access networks are statistically independent and
write the loss probability of an access network as a function of its
total streaming rate and its ABR, which is increasing in the former
and decreasing in the latter. While our analysis can accommodate
various queueing models [15] in defining this function, we adopt
the M/M/1 model that was shown to yield a good approximation in typical streaming applications per previous measurement
studies [39], [40]. We consider a playout deadline^3 and the average
one-way delay of each access network. The one-way
delay can be related to the residual bandwidth, i.e., the ABR minus
the total streaming rate, through a parameter that is estimated from
past observations via linear regression [39]. Finally, we periodically
measure the ABR and RTT values and compute the streaming
rate in each time window, which in turn allows us to estimate
the packet loss probability via
(1)
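As an illustration of the network model, the following minimal sketch computes a loss probability under an M/M/1-style assumption. It assumes that the average one-way delay equals a fitted parameter divided by the residual bandwidth, and that the delay is exponentially distributed, so that the probability of missing the playout deadline is an exponential tail. The names `avg_one_way_delay`, `loss_probability`, and `alpha`, the numbers, and the exact functional form are illustrative assumptions and not a transcription of (1).

```python
import math


def avg_one_way_delay(abr, rate, alpha):
    """Average one-way delay, assumed here to be alpha / (abr - rate)."""
    residual = abr - rate  # residual bandwidth of the access network
    if residual <= 0:
        return math.inf  # saturated network: delay grows without bound
    return alpha / residual


def loss_probability(abr, rate, alpha, deadline):
    """Probability that an exponentially distributed delay exceeds the deadline."""
    mean_delay = avg_one_way_delay(abr, rate, alpha)
    if math.isinf(mean_delay):
        return 1.0
    return math.exp(-deadline / mean_delay)


if __name__ == "__main__":
    # Hypothetical numbers: ABR of 1000 kb/s, deadline of 0.3 s, alpha of 50 kb.
    for rate in (200.0, 500.0, 900.0):
        print(rate, round(loss_probability(1000.0, rate, 50.0, 0.3), 4))
```

Under these assumptions the loss probability increases with the streaming rate and decreases with the ABR, which matches the qualitative behavior required of the loss function in the text.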
C. Video Model
We consider H.264/SVC [29] video streams coded with
medium-grained quality scalability (MGS). The stream of each user
is divided into multiple Network Abstraction Layer
Units (NALUs). For each user, each NALU is identified
by its frame number and its quality layer; the NALU in the lowest
quality layer corresponds to the base layer
of its frame, while the NALUs in higher layers are quality enhancement
layers. The H.264/SVC standard imposes decoding dependencies among NALUs:
an NALU depends on all NALUs in lower quality layers of the same frame, while
a frame depends on its parent frames as
determined by the hierarchical prediction structure (cf. Fig. 2).
We collect the parent frames of each frame in a vector^4 and record
the size of each NALU.
We introduce, for each NALU and each access network, a boolean decision
variable that is equal to 1 if the NALU is sent over that access network,
and 0 otherwise. We allow
for a packet to be sent over at most one access network; this
is because efficient link-layer error control mechanisms, such
as Forward Error Correction (FEC) and Automatic Repeat Request (ARQ), are widely applied in wireless networks to reduce packet losses [38]. Hence, sending an NALU over multiple
access networks does not lead to significant improvements in
video quality, while it increases the network load.
3 For simplicity, we assume in the sequel that the playout deadline is the same
for all users and video packets, whence it is a system parameter determined by
the service provider. The general case can be handled by considering playout
deadlines to depend on users and video packets; in such a case, the loss probability is defined separately for each user and packet via (1).
4 In this paper, we use bold symbols to represent vectors.
Fig. 2. Dependency among NALUs of H.264/SVC streams. Each square represents an NALU belonging to an MGS layer, and each rounded rectangle represents a video frame.
We measure video distortion using mean square error (MSE).
The total distortion of a frame comprises the truncation distortion and
the drifting distortion. Truncation distortion refers to the quality
degradation of a frame due to dropping NALUs of that frame.
We consider the full-quality distortion of a frame, achieved
when all its NALUs are received, and
the additional distortion introduced by dropping each NALU.
In order to decode an NALU, all NALUs in lower quality layers of the
same frame must have
been decoded, thus we have
(2)
Drifting distortion refers to the distortion caused by imperfect
reconstruction of the parent frames
used for interframe prediction. In principle,
the parent set may include all parent frames of
a frame. Doing so, however, may result in excessive overhead with questionable performance gain. Therefore, in practice, we can either constrain interframe predictions within individual groups of pictures (GoPs) or heuristically choose a bound
on the size of the parent set. Following the discussion in [18] and [24], we propose an affine model
(3)
whose parameters are estimated from real data,
with each parameter constrained to be nonnegative.
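The two distortion components described above can be sketched as follows. This is a minimal illustration under stated assumptions: a quality layer is decodable only if all lower layers of the same frame were received, each undecodable layer adds its own distortion increment to the full-quality distortion, and the drifting distortion is an affine function of the parent-frame distortions with nonnegative coefficients. The function names, the `offset` term, and the numbers in the checks are hypothetical and not taken from (2) or (3).

```python
def truncation_distortion(full_quality, extra, received):
    """Distortion of one frame given which of its NALUs were received.

    extra[q] is the additional distortion incurred when layer q cannot be
    decoded; layer q is decodable only if layers 0..q were all received.
    """
    total = full_quality
    decodable = True
    for delta, got in zip(extra, received):
        decodable = decodable and got  # a missing layer blocks all higher ones
        if not decodable:
            total += delta
    return total


def drifting_distortion(parent_distortions, weights, offset=0.0):
    """Affine function of the parent-frame distortions (nonnegative parameters)."""
    if offset < 0 or any(w < 0 for w in weights):
        raise ValueError("model parameters must be nonnegative")
    return offset + sum(w * d for w, d in zip(weights, parent_distortions))
```

For instance, with three layers where only the middle one is lost, the top layer is received but cannot be decoded, so both of their increments are added to the full-quality distortion.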
Although our video model is designed for H.264/SVC
streams coded with MGS layers, the model is general and can
work with other types of scalable or nonscalable streams. For
example, H.264/AVC coders can generate temporal scalable
streams using hierarchical B-frames. The proposed video model
abstracts such H.264/AVC streams as videos with a base layer
and no enhancement layer, i.e.,
, and captures the error
propagation due to error concealment as the drifting distortion.
Such flexibility is due to the fact that nonscalable streams are
essentially scalable streams with one quality layer, and thus
they can be captured by our video model.
In order to optimize the overall streaming quality, we need
to specify the parameters for the model introduced above.
We have implemented least-squares parameter estimation in
MATLAB, which runs offline as a preprocessing step. We do
not assume that the parameter estimation is performed online, and
its computational complexity is not reported in this
paper. Estimating the parameters offline limits the application
scope mostly to video-on-demand services, in which the parameters can be computed offline and stored as metadata. Such
services are fairly popular nowadays, e.g., YouTube, Hulu, and
Netflix streaming.
D. Optimization Problem
We denote, by some abuse of notation, the expected
distortion, after accounting for random packet losses, of
each frame of each user by the same symbol as its distortion, and
collect these values into one vector per user. We formulate
the multinetwork scalable video streaming problem as one of
finding the values of the decision variables that minimize a convex
cost function of the expected distortions, which is nondecreasing in each
argument. One special case of interest is a cost function that is a sum
of per-user cost functions,
where each per-user cost function is convex and nondecreasing in each
argument. We can provide service differentiation among
users and frames by considering different cost functions.
We can also address
fairness among users, e.g., weighted max-min fairness, through
the choice of cost function.
For each user, let the frame rate be measured in frames per
second (fps). Then, the average transport stream rate for each network is given by
(4)
Using the network model (1), the delivery probability of
each NALU is given by
(5)
The expected truncation distortion is still given by (2), and the
expected drifting distortion by (3) if we further assume that
packet losses are statistically independent.
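The quantities behind (4) and (5) can be illustrated with a short sketch. It assumes that the average rate on a network is the total size of the NALUs assigned to it divided by the duration of the scheduled frames, and that an NALU is delivered if it is sent over some network and not lost there. Both forms, as well as the names `network_rates` and `delivery_probability`, are assumptions made for illustration rather than the paper's exact expressions.

```python
def network_rates(assignment, sizes, fps, num_frames, num_networks):
    """Average rate per access network for one user, in bits per second.

    assignment maps an NALU key to a network index (or None if not sent);
    sizes maps the same keys to NALU sizes in bits.
    """
    window_seconds = num_frames / fps  # playback duration of the scheduled frames
    rates = [0.0] * num_networks
    for nalu, network in assignment.items():
        if network is not None:
            rates[network] += sizes[nalu] / window_seconds
    return rates


def delivery_probability(send_probs, loss_probs):
    """Probability that an NALU is sent over some network and not lost there.

    send_probs[n] is the (binary or relaxed) decision value for network n and
    loss_probs[n] is that network's packet loss probability.
    """
    return sum(x * (1.0 - p) for x, p in zip(send_probs, loss_probs))
```

With binary decisions the delivery probability reduces to one minus the loss probability of the chosen network, and to zero if the NALU is not sent at all.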
Since NALUs have different sizes, some NALUs may
comprise multiple packets. Typically, the number of packets is a
function of the NALU size; for example, for a path with a given maximum
payload length, the streaming server may divide an NALU
into a number of packets equal to its size divided by the maximum payload
length, rounded up. We can handle this case by
defining the decision variable per packet: it is 1 if a given packet of an
NALU is sent over a given access network, and 0 otherwise. Then, we may replace (5) and
(2) with
(6)
(7)
respectively. In the sequel, we assume that each NALU comprises a single
packet for notational simplicity; extending the optimization program
and the proposed algorithms to handle the general case is
straightforward.
The joint rate control and stream adaptation problem, featuring optimization over
a given number of frames for each client, is given by
the integer program
(8a)
s.t. (8b)-(8i)
We consider a recurring scheduling window of fixed duration,
which implies that we have to solve the above optimization
problem once per window. In our implementation, we set
the window duration as constant and consider variable numbers of packets to be
scheduled within each window. Using
such an approach, the system can adapt to dynamic changes such
as variable network conditions, or users arriving/leaving,
by solving different instances of the optimization problem for
each scheduling window. Finally, note that rate control is performed through (8c); this is a form of proactive congestion control, in the sense that it seeks to avoid causing network congestion, as opposed to the responsive nature of the widely used
Transmission Control Protocol (TCP).
We also consider relaxing the decision variables to take real values in the
unit interval. This is a soft decision problem, which can be implemented via randomized packet scheduling. Let us define a
family of independent Bernoulli random variables, one for each NALU and
access network, each of which takes the value 1 with probability equal to
the corresponding relaxed decision variable,
and the value 0 otherwise. Given a realization of the Bernoulli random variables, an NALU
is sent over a network only when the corresponding random variable equals 1;
this event has probability equal to the relaxed decision variable
and is independent from the scheduling of other
packets. The expected truncation distortion is given by (2) if
we assume that packet losses of access networks are statistically independent from the decision variables. This assumption is a gross approximation that can be made fairly accurate by considering a two-timescale separation approach: Suppose that the optimization window size is large enough for the
stochastic process (such as a Markov chain) characterizing the
network losses to converge to the stationary distribution. Then,
the approximation error in (8c) is negligible in both theory and
practice [20].
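Randomized packet scheduling from relaxed decision values can be sketched as below. Note that this sketch is a variation on the construction in the text: instead of drawing one independent Bernoulli variable per network, it makes a single categorical draw per NALU, so that an NALU is sent over at most one network in every realization while each network is still chosen with its prescribed probability. The function name `sample_network` and the tolerance used in the validity check are assumptions.

```python
import random


def sample_network(probs, rng):
    """Pick the access network for one NALU, or None if it is not sent.

    probs[n] is the probability of sending the NALU over network n; the
    entries must be nonnegative and sum to at most 1.
    """
    if any(p < 0 for p in probs) or sum(probs) > 1.0 + 1e-9:
        raise ValueError("probabilities must be nonnegative and sum to at most 1")
    u = rng.random()  # uniform draw in [0, 1)
    cumulative = 0.0
    for network, p in enumerate(probs):
        cumulative += p
        if u < cumulative:
            return network
    return None  # remaining probability mass: the NALU is dropped


if __name__ == "__main__":
    rng = random.Random(2013)  # seeded for reproducibility
    print([sample_network([0.5, 0.3], rng) for _ in range(10)])
```

A seeded generator is passed in explicitly so that a schedule can be reproduced when debugging; a deployment would draw fresh randomness in each scheduling window.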
E. Properties of the Optimization Problem

The actual decision variables of the optimization problem (8)
are the scheduling variables; these are only constrained to be either binary (hard decisions: deterministic scheduling) or to lie in
the unit interval (soft decisions: randomized scheduling). The distortion
is a function of the decision variables; however, its analytical expression is too complicated to write in
closed form. Instead, we have introduced auxiliary variables
and imposed the equations that
relate them to the decision variables as constraints in the optimization problem (8).
This is a technique in optimization usually referred to as uplifting [8], in which the decision space is enlarged to yield a
simpler objective function accompanied by a set of constraints.
The objective function of (8) is monotone (increasing or decreasing) in each
of the auxiliary variables. Based on these properties, we
can replace the equality constraints in (8c)-(8e) with
inequality constraints. This yields an equivalent
formulation with no nonlinear equality constraints. The above
monotonicity properties guarantee that an optimal integer solution
satisfies the property that an NALU is sent over some
network only if all NALUs in lower quality layers of the same frame are
sent over some network
as well. The randomized optimization problem, in which the integrality
constraint (8i) is
replaced by its relaxation to the unit interval, is not convex due to the multilinear
terms in (8d) and (8e). The problem can neither be converted
into an equivalent convex program by means of exponential
transformations of the variables, nor can it be rendered in
the format of geometric programming [9]. In Section IV-C, we
present convex approximations to the randomized packet scheduling problem.
IV. OPTIMIZATION ALGORITHMS
In this section, we propose several deterministic and randomized packet scheduling algorithms. In the sequel, we let
,
, for all
, just for the sake
of notational simplicity but without any loss of generality.
A. Exhaustive Search
The integer program (8) can be solved by means of exhaustive search. The complexity of a naive exhaustive search is
exponential in the number of decision variables, and it can be reduced in
the light of the fact that each packet is sent over at most one
network [cf. (8h)]. If we further exploit the optimality property
of Section III-E, that an NALU is sent over some network only
if all NALUs in lower quality layers of the same frame are sent over some network, then
the number of candidate values for the decision variables of each frame
at optimality is smaller still. Therefore, the complexity of
an exhaustive search can be further reduced.
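The effect of the two reductions can be illustrated by counting candidate schedules for a single user. The counts below are derived here under simplifying assumptions (every frame has the same number of quality layers, each NALU is one packet, and inter-frame dependencies are ignored); they are not the complexity expressions of the paper, and the function name `search_space_sizes` is hypothetical.

```python
def search_space_sizes(num_networks, num_frames, num_layers):
    """Count candidate schedules for one user under three levels of pruning.

    Returns (naive, one_network, prefix):
      naive       - every NALU/network pair is an independent binary choice
      one_network - each NALU is sent over at most one network
      prefix      - additionally, a layer is sent only if all lower layers
                    of the same frame are sent
    """
    nalus = num_frames * num_layers
    naive = 2 ** (num_networks * nalus)
    one_network = (num_networks + 1) ** nalus
    # Per frame: send the lowest k layers (k = 0..num_layers), each of them
    # over one of the networks.
    per_frame = sum(num_networks ** k for k in range(num_layers + 1))
    prefix = per_frame ** num_frames
    return naive, one_network, prefix


if __name__ == "__main__":
    print(search_space_sizes(num_networks=2, num_frames=2, num_layers=2))
```

Even in this tiny instance the pruned search space is several times smaller than the naive one, although all three counts still grow exponentially with the number of frames.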
B. Heuristic Algorithms
We present heuristic algorithms for deterministic packet
scheduling. The algorithms do not explicitly address service
differentiation, i.e., we consider
.
Simple Rate-Distortion Optimization: The Simple Rate-Distortion Optimization (SRDO) algorithm takes a maximal
allowed packet loss probability as the input and sorts
NALUs in descending order of a per-NALU priority metric. It sequentially assigns
NALUs to the access network with the smallest packet loss probability
until all
access networks are fully loaded, i.e., right before the smallest
packet loss probability exceeds the maximal allowed value.
The pseudocode is presented in Fig. 3.
Progressive Rate-Distortion Optimization: The Progressive
Rate-Distortion Optimization (PRDO) algorithm considers
the net distortion gain of assigning an NALU to an access
network, based on the distortion model (cf.
Section III-C). Following the video prediction structure, PRDO
sequentially schedules the immediately decodable NALU
with the largest nonnegative net distortion gain to the corresponding access network. The algorithm stops when all packets
have been scheduled or when all unscheduled NALUs have
nonpositive net distortion gains. PRDO is a greedy algorithm
that relies on repeatedly evaluating the net distortion gain function.
The pseudocode is presented in Fig. 4.
Hybrid Rate-Distortion Optimization: The Hybrid Rate-Distortion Optimization (HRDO) algorithm uses SRDO to bootstrap a solution, which is subsequently used as the initial value
for PRDO. It has been shown to yield a good tradeoff of performance versus runtime in simulations [18].
Fig. 3. Simple Rate Distortion Optimization algorithm (SRDO).
Fig. 4. Progressive Rate-Distortion Optimization algorithm (PRDO).
C. Convex Approximations
In this section, we derive approximate convex programs
for the randomized packet scheduling problem. The goal
is to approximate the nonconvex constraint set of (8) by
a convex superset, by means of convex approximations to
the multilinear product terms in (8d) and (8e). Even though
a solution to an approximate problem might be infeasible
for the original problem when considering the augmented space
of decision and auxiliary variables, this is not an issue because
we are only interested in obtaining values for the actual decision variables, i.e., the transmission probabilities;
these are only constrained to lie in the unit interval. We also note here that
our approach can be directly used to handle the case that some
NALUs comprise multiple packets,
since (6) and (7) also feature multilinear product terms.
In the next lemma, we present a convex programming formulation that approximates multilinear functions in (8d) and (8e)
in a term-by-term fashion.
Lemma 1 [Term-by-Term Convex Approximation (TTC)]:
The optimization problem
(9a)
s.t. (9b)-(9i)
is a convex program whose optimal value is an underestimate
of the optimal value of (8). It
can be written as an equivalent smooth convex program by substituting the
minimum operators in (9d) and (9e) with inequality constraints. If
we assume that the cost function is continuous on its domain, then the convex
program admits an optimal solution and has the strong duality
property.^5
Proof: The concave envelope of a product of variables
on the unit hypercube is given by the minimum of those variables
[28]. Applying this to each multilinear
term in (8d) and (8e), we get, by exploiting the monotonicity
properties of Section III-E and the fact that the minimum of
affine functions is a concave function, convex program (9).
The program has a nonempty and compact set of optimal
solutions since the domain of the decision variables is
the compact unit hypercube and since all
inequality constraints along with the objective function involve
continuous functions. The convex program (9) has the strong
duality property as well as a nonempty and bounded set of dual
optimal solutions because it satisfies Slater's condition [9]: there
exists a feasible solution for which all inequality constraints
are strictly satisfied.
5 Strong duality is important for the performance of numerical methods [9].
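The envelope fact used in the proof, namely that the minimum of the variables overestimates their product on the unit hypercube and coincides with it at the binary vertices, can be checked numerically. The sketch below evaluates both functions on a regular grid; the grid resolution and the function names are arbitrary choices made for illustration.

```python
import itertools
import math


def product_term(xs):
    """Multilinear product term of the relaxed decision values."""
    return math.prod(xs)


def min_envelope(xs):
    """Concave overestimate of the product on the unit hypercube."""
    return min(xs)


def max_gap(dim, steps=10):
    """Largest value of min(x) - prod(x) over a regular grid in [0, 1]^dim."""
    grid = [i / steps for i in range(steps + 1)]
    worst = 0.0
    for xs in itertools.product(grid, repeat=dim):
        gap = min_envelope(xs) - product_term(xs)
        assert gap >= -1e-12  # the minimum never falls below the product
        worst = max(worst, gap)
    return worst


if __name__ == "__main__":
    for dim in (2, 3):
        print(dim, max_gap(dim))
```

The gap vanishes at the vertices, which is why the relaxation is exact for deterministic schedules, and is largest in the interior of the hypercube, which is where the approximation error of a term-by-term relaxation arises.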
Remark 1 (Practical Considerations): The approximation
error in TTC may be nonnegligible. The approximation in
(9d) does not sufficiently capture the impact of packet losses,
which is especially true when the loss probability is small (say
5%-10%). In addition, the
gap in approximating the probability that an NALU is not received
with the minimum of the probabilities that the individual NALUs involved are
not received [cf. (9e)] might not be negligible either.
We present another method of approximating the nonconvex
multilinear inequalities (8d) and (8e) by means of their convex
envelopes. This yields the optimal convex approximation of the
nonconvex constraint set of (8), but comes at the cost of high
computational complexity.
Lemma 2 [Multilinear Convex Approximation (MC)]: The
optimization problem in (10)
is a convex program whose optimal value is an underestimate of
the optimal value of (8). If we assume that the cost function is continuous
on its domain, then the convex program admits an optimal solution
and has the strong duality property.
Proof: Consider a multilinear function
on the unit hypercube, i.e., a function that is linear in each
argument alone. The convex envelope of such a function is characterized
in [28]. We use this to calculate the concave envelope of the multilinear functions on the right of (8d) and (8e)
to obtain convex inequality constraints. The rest of the proof
follows along the same lines as in Lemma 1.
Remark 2 [Hybrid Convex Approximation (HC)]: We can replace (9d) in TTC with (10d) for a balance between performance
and computational complexity; we call the resulting problem
HC. Note that (10d) is always a better approximation of (8d), as
it is the tightest convex approximation of the multilinear function. Therefore, HC is expected to outperform TTC, and we
observed in our experiments that the improvement is significant in all cases, not only for lightly loaded networks, as suggested by Remark 1. At the same time, there is no substantial increase
in runtime for the case when the number of available networks
is low. For this reason, we report results using HC
in the remainder of this paper. In contrast, the performance
gain of MC over HC was measured to be negligible, particularly
in the face of a runtime that was prohibitive for real-time applications, even for a small number of networks.
(10a)
s.t. (10b)-(10i)
Remark 3 (Computational Complexity): The TTC optimization program contains a polynomial number of constraints in
, , , . MC requires computing the convex envelope in
(10d) and (10e). The convex envelopes in (10d) can be computed offline with
tests. This is because the calculation
of the convex envelope of
on
does not depend on the problem parameters;
that is why
does not depend on
in (10d). However,
the convex envelopes in (10e) depend on the problem parameters; the computation takes
tests, which might take
a prohibitively long time. HC contains an exponential number
of constraints in . For fixed , and given
, the concave envelope in (10d) is given by the minimum of a fixed
number, say
of affine functions of
(for example
for
,
), while the total number of constraints is
polynomial in
. Therefore, for small values of , we
propose using HC for a good tradeoff between performance and
runtime.
Remark 4: We can improve the convex approximation in (9e) by replacing parameter
with
,
where
are chosen to satisfy
for a near-optimal solution computed
by some heuristic algorithm, like the ones in Section IV-B.
Remark 5 (Multiple-Server Extension): We can generalize
the optimization programs to handle the scenario where each
video sequence may be streamed from multiple servers, say
, with rate constraints, to the clients
. Let
be 1 if the NALU
is streamed from server
to client over network , and 0 otherwise. We can consider
(8)–(10) with the additional constraints
(11)
(12)
where
denotes the streaming capacity of server .
In this case, server-side scheduling is, by nature, centralized;
different servers need to exchange information to avoid sending
multiple copies of the same packet.
We have implemented the packet scheduling algorithms
based on the aforementioned convex approximations in
MATLAB using CVX [2], which is a numerical solver for
convex optimization. As discussed in Remark 2, the
HC algorithm offers a good tradeoff between complexity and
performance, so we report only the performance of HC in
Section V.
V. EVALUATION
A. Setup
We use Abing [5] to periodically measure ABR and RTT
values between Deutsche Telekom Laboratories (Berlin,
Germany) and Stanford University (Stanford, CA). We collect
the network traces on weekdays with dozens of hosts on each
network generating background traffic. At Deutsche Telekom
Laboratories, Abing was run over three access networks: Ethernet, 802.11b, and 802.11g. Parts of the network traces were
used in [30] and [39], and further details can be found therein.
Fig. 5. Rate increase for different numbers of MGS layers.
Fig. 6. R-D curves of the scalable video streams.
We consider four 10-s 4CIF (704 × 576) video sequences: City,
Soccer, Crew, and Harbour, encoded as H.264/SVC scalable
streams using H.264/SVC baseline profile of JSVM Reference
Software. We tested different numbers of MGS layers
and
found that
does not affect coding efficiency substantially.
Fig. 5 illustrates that
only results in 5%–7.5% higher
bit rate than
. This shows that additional MGS layers do
not lead to severe coding inefficiency. Therefore, each video sequence is encoded into a scalable stream with eight MGS layers
for higher flexibility. To illustrate the video characteristics of
individual videos, we plot the R-D curves in Fig. 6.
We estimate the video model parameters by extracting and
decoding 32 random substreams from each scalable stream and
measuring the video quality. Knowing which video packets
were successfully delivered as well as truncation distortion and
drifting distortion, we estimate the model parameters of the
video model using standard least-squares fitting in MATLAB.
To evaluate the accuracy of the video model, we randomly
extracted another 32 substreams from each video stream, computed the empirical per-frame video quality, and compared it to
the video quality estimated by the video model. Fig. 7 shows
the actual and estimated video quality of Soccer and Crew: The
proposed video model is quite accurate, and the average absolute errors for City, Soccer, Crew, and Harbour were measured
to be 2.82%, 1.38%, 0.74%, and 1.65%, respectively.
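The fit-then-validate procedure described above can be sketched as follows. The video model itself is not reproduced in this section, so the sketch assumes a hypothetical linear model, quality = a + b * rate, purely for illustration; the data values and function names are likewise made up and do not come from the experiments.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def mean_abs_pct_error(actual, predicted):
    """Average absolute error, as a percentage of the actual value."""
    errors = [abs(p - a) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical training substreams: rate (Mb/s) and measured PSNR (dB).
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [30.0, 32.0, 34.0, 36.0]
a, b = fit_line(train_x, train_y)

# Hypothetical held-out substreams, used only to validate the fitted model.
test_x = [1.5, 3.5]
test_y = [31.0, 35.5]
pred_y = [a + b * x for x in test_x]
error_pct = mean_abs_pct_error(test_y, pred_y)
```

Keeping the validation substreams separate from the ones used for fitting mirrors the text: the reported errors measure how well the model generalizes, not how well it fits the training data.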
For comparison, we also encode the same video sequences
into nonscalable streams using the H.264/AVC baseline profile of JM Reference Software. We configure the H.264/AVC
encoder as close to the H.264/SVC configuration as possible,
e.g., we use the prediction structure of IPPPPPPP in both cases.
Schwarz et al. [29] claim that, compared to nonscalable streams,
10%–50% bit rate increases of scalable streams are acceptable.
To be conservative, we use a rate control mechanism to generate
H.264/AVC streams with 20% lower bit rates than H.264/SVC
streams; we found that the H.264/AVC streams still achieve
0.11 dB better quality than H.264/SVC streams.
Fig. 7. Proposed video model closely follows measured quality. Sample results
from (a) Soccer and (b) Crew.
Fig. 8. Simulation setup for multiple user scenarios.
We implemented a multinetwork streaming server in
NS-2 [27], which supports the SRDO, PRDO, and HC algorithms, implemented as MATLAB subroutines. The HC
algorithm uses CVX [2] to numerically solve the convex
program given in Remark 2. We report runtime values corresponding to a 2.8-GHz PC with MATLAB R2010a. For
comparison, we also implemented a multinetwork DCCP [22]
streaming server based on an open-source DCCP implementation [26] that supports two standard rate control algorithms:
TCP-like and TCP-Friendly Rate Control (TFRC). The DCCP
streaming server sets up a connection over each access network and assigns NALUs to each connection from lower- to
higher-quality layers until reaching the rate limit computed by
the rate control algorithms. The DCCP streaming servers with
TCP-like and TFRC rate control algorithms are referred to as
DCCP-TCP and DCCP-TFRC, respectively.
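The layer-by-layer assignment used by the DCCP baseline can be sketched as below. This is a minimal illustration, not the simulated server: it assumes that each NALU is described by a (layer, size) pair, that every connection has a bit budget for the current interval, and that connections are tried in a fixed order. None of these details are specified in the text.

```python
def assign_nalus(nalus, rate_limits):
    """Greedily assign NALUs to connections, lowest layer first.

    nalus: list of (layer, size_bits) tuples.
    rate_limits: per-connection budgets in bits for the current interval.
    Returns one list of NALUs per connection. Assignment stops at the
    first NALU that fits on no connection, so a higher layer is never
    sent without the layers below it.
    """
    remaining = list(rate_limits)
    schedule = [[] for _ in rate_limits]
    for nalu in sorted(nalus, key=lambda item: item[0]):
        size = nalu[1]
        for conn, budget in enumerate(remaining):
            if size <= budget:
                schedule[conn].append(nalu)
                remaining[conn] -= size
                break
        else:
            break  # rate limits reached: drop this and all higher layers
    return schedule


# Hypothetical example: four layers, two connections.
example = assign_nalus([(2, 300), (0, 400), (1, 500), (3, 600)], [800, 500])
```

In this example the three lowest layers are sent and the top layer is dropped. The sketch makes no use of loss rates, deadlines, or distortion, which is the main difference from the proposed schedulers.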
We simulate multinetwork video streaming sessions using
the four videos with random start times in the network traces.
We repeat the 10-s video clips throughout the simulations. We
inject background traffic over each network at a rate between
20% and 90% of its available bit rate. We chose
,
,
s,
s, and
. The maximum UDP
packet size is set to 1000 B. If not otherwise specified, we report results with 40% background traffic, using average distortion as the cost function. We conduct simulations with a single
user
and compare the performance of the proposed algorithms and the rate control algorithms defined in the DCCP standard. We also run the HC algorithm for three users
of
different videos, which is illustrated in Fig. 8. For each setup,
we test the algorithms 300 times and consider five performance
metrics: video quality in PSNR, streaming rate, packet delivery
delay, delivery ratio, and runtime.
B. Simulation Results
Nonscalable Versus Scalable Streams: Streaming nonscalable videos over a bandwidth-limited channel may lead to undecodable frames [23], while scalable video streaming systems
have the choice to at least stream the base layer and attain basic
quality. To quantify the benefits of streaming scalable videos,
we use delivery ratio as the performance metric, which is defined as the fraction of timely delivered frames for nonscalable
videos and the fraction of frames with base layers delivered on
time for scalable videos. We configure a DCCP streaming server
to transmit City (both nonscalable and scalable) over one, two,
and three access networks. We plot the resulting delivery ratio
with 95% confidence intervals in Fig. 9. This figure illustrates
that streaming scalable videos results in a higher delivery ratio,
and therefore fewer rebuffering instances and an overall better user
experience. Hence, we only consider scalable streams in the rest
of this section.
Fig. 9. Delivery ratios with nonscalable and scalable streams: (a) DCCP-TCP
and (b) DCCP-TFRC. Sample results from City sequence.
Fig. 10. Video quality achieved by different numbers of access networks:
(a) DCCP-TCP and (b) DCCP-TFRC. Sample results from City sequence.
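The delivery-ratio metric defined above can be computed as in the following sketch. The input representation (one arrival time per frame, with None for data that never arrives) and the sample values are assumptions made for illustration.

```python
def delivery_ratio(arrival_times, deadlines):
    """Fraction of frames whose required data arrived by its deadline.

    arrival_times[i] is the arrival time of frame i for a nonscalable
    stream, or of its base layer for a scalable stream; None marks data
    that never arrived.
    """
    on_time = sum(
        1
        for arrival, deadline in zip(arrival_times, deadlines)
        if arrival is not None and arrival <= deadline
    )
    return on_time / len(deadlines)


# Hypothetical playout deadlines (s) for four frames.
deadlines = [1.0, 2.0, 3.0, 4.0]
# Nonscalable stream: whole frames; one is late and one is lost.
nonscalable_ratio = delivery_ratio([0.9, 2.4, None, 3.8], deadlines)
# Scalable stream: only the (smaller) base layers need to be on time.
scalable_ratio = delivery_ratio([0.5, 1.8, 2.9, 3.7], deadlines)
```

The same function serves both stream types; only the definition of "required data" changes, which is why a scalable stream can score higher under the same network conditions.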
Benefits of Multihoming: We instruct a DCCP streaming
server to transmit City over one, two, and three access networks
and compute the video quality achieved under 40% background traffic. We plot sample results for a 60-s period using
DCCP-TCP and DCCP-TFRC in Fig. 10; notice that multihoming can significantly increase video quality and reduce the
number of quality fluctuations.
Video Quality: We compare the video quality achieved by the
proposed algorithms against the DCCP rate control algorithms
under 40% background traffic. In Fig. 11(a), we plot the video
quality achieved using each algorithm for a 60-s sample period.
We observe that both DCCP-TCP and DCCP-TFRC suffer
from sudden quality drops, unlike the proposed algorithms that
additionally achieve higher video streaming quality. We report
the aggregate video quality for different video sequences in
Fig. 11(b). The proposed algorithms outperform the DCCP rate
control algorithms by at least 10 dB in terms of video quality.
Fig. 11. Video quality achieved by different algorithms: (a) 60-s sample period
from Crew and (b) overall results.
Fig. 12. Streaming rate achieved by the different algorithms: (a) 60-s sample
results from City and (b) overall results.
Streaming Rate: While the proposed algorithms achieve
better video quality, we need to make sure that they do not saturate the network resources. We plot the streaming rates achieved
by different algorithms in Fig. 12. Fig. 12(a) shows a sample
time period, which reveals that the DCCP rate control algorithms (see upper subfigure) result in higher rate fluctuations
while the proposed algorithms lead to smoother streaming rates
(see lower subfigure). This can be attributed to the proactive
rate control employed by the proposed algorithms, compared to
the responsive rate control used by DCCP. Fig. 12(b) plots the
average streaming rates for all videos. This figure indicates that
the proposed algorithms are conservative in terms of network
resources; they lead to streaming rates comparable to (if not
lower than) the DCCP rate control algorithms.
Packet Delivery Delay: We calculate the average packet delivery delay for the different algorithms. Fig. 13 reveals that, for
all videos, DCCP-TCP and DCCP-TFRC lead to average delay
of 1.7 and 2.5 s, respectively, while the proposed algorithms result in less than 0.2 s delay, over 90% reduction on average.
This shows that schedules produced by the proposed algorithms
deliver more packets on time, which in turn justifies the better
video quality compared to DCCP.
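The "over 90% on average" figure follows directly from the delays quoted above, as the short calculation below shows. It uses 0.2 s as the delay of the proposed algorithms; since the text reports less than 0.2 s, the computed reductions are lower bounds.

```python
def reduction_pct(baseline_s, proposed_s):
    """Relative delay reduction of the proposed scheme, in percent."""
    return 100.0 * (1.0 - proposed_s / baseline_s)


vs_dccp_tcp = reduction_pct(1.7, 0.2)   # about 88.2%
vs_dccp_tfrc = reduction_pct(2.5, 0.2)  # 92%
average_reduction = (vs_dccp_tcp + vs_dccp_tfrc) / 2  # about 90.1%
```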
Adaptation to Network Dynamics: The proposed algorithms
employ a short scheduling window (on the order of seconds) to
adjust to network dynamics. To show the effectiveness of this
approach, we conduct simulations in which the client gradually gains access to more access networks. More specifically,
the video streaming session starts with a single access network,
and two additional access networks are activated after 15 and
30 s of simulation time, respectively. We plot sample streaming
rates over individual networks in Fig. 14. This figure shows
that our algorithms can quickly adapt to network dynamics by
capitalizing on the new access networks shortly after they become
available. We note that short scheduling windows also help to
adapt to access network outages by rescheduling the more
important video packets over more reliable access networks.
Furthermore, the scheduling window size is a control knob for
the tradeoff between responsiveness and flexibility, in the sense
that shorter scheduling windows result in faster recovery from
network outages but limit the room for redistributing network
resources among frames in the same scheduling window.
Fig. 13. Packet delivery delay incurred by different algorithms.
Fig. 14. Streaming rate of individual networks: Network 1 is available for the
entire simulation run, while Networks 2 and 3 become available only after 15
and 20 s, respectively. Sample results from (a) SRDO and (b) HC.
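The effect of the recurring scheduling window on adaptation speed can be illustrated with the sketch below. It assumes a hypothetical 5-s window and the activation times of 0, 15, and 30 s from the experiment description; the scheduler itself is replaced by a stub that only records which networks are visible at each window boundary.

```python
def active_networks(t, activation_times):
    """Indices of the networks that are already available at time t."""
    return [i for i, start in enumerate(activation_times) if start <= t]


def window_plan(duration_s, window_s, activation_times):
    """List (window start, usable networks) for each scheduling window."""
    plan = []
    num_windows = int(duration_s // window_s)
    for k in range(num_windows):
        start = k * window_s
        plan.append((start, active_networks(start, activation_times)))
    return plan


# Network 0 is always on; networks 1 and 2 appear after 15 s and 30 s.
plan = window_plan(duration_s=40.0, window_s=5.0,
                   activation_times=[0.0, 15.0, 30.0])
```

A newly activated network is picked up at the next window boundary, so the adaptation delay is at most one window length; a shorter window reacts faster but leaves fewer frames per window over which resources can be redistributed.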
Performance Versus Computational Complexity: We compare the performance of the proposed algorithms under different
background traffic load, from 20% to 90%. Fig. 15 presents
the achieved video quality from Harbour and Crew. This figure
shows that the HC algorithm outperforms the PRDO algorithm,
which in turn outperforms the SRDO algorithm. We also plot
the quality improvement achieved by HC over SRDO and PRDO
in Fig. 16; the HC algorithm almost always leads to quality
improvement, which is more pronounced in highly loaded networks. Specifically, among all videos, the maximum, mean, and
minimum quality improvements over SRDO are 7.36, 4.33, and
1.19 dB. The maximum, mean, and minimum quality improvements over PRDO are 4.71, 1.84, and 0.33 dB.
Fig. 17 presents the runtime of the proposed algorithms for
Harbour and Crew; the HC algorithm has an up to 10-fold lower
runtime as compared to PRDO. SRDO runs fast, less than 200
ms on average, but it results in lower video quality as illustrated
in Figs. 15 and 16. Therefore, we propose using the HC algorithm for good performance as well as reasonable runtime. Note
that the runtime of HC is independent of background
traffic, since it is a convex program that takes the same time to
solve numerically irrespective of the traffic load. The runtime of PRDO decreases substantially as background traffic increases,
since there are far fewer packets that can be sent before capacity is reached. The same is true for SRDO, but it is not as
apparent because SRDO does not perform the time-costly function evaluation (which PRDO does).
Fig. 15. Video quality under different background traffic loads. Sample results
from (a) Harbour and (b) Crew.
Fig. 16. Video quality improvement achieved by HC over (a) SRDO and
(b) PRDO under different background traffic loads.
Fig. 17. Runtime under different background traffic loads. Sample results from
(a) Harbour and (b) Crew.
Fig. 18. Sample video quality with cost function
TABLE II
AVERAGE VIDEO QUALITY AND STREAMING RATE WITH DIFFERENT COST
FUNCTIONS
Lastly, although our proposed proactive algorithms outperform responsive DCCP-TCP and DCCP-TFRC, we need to
point out that DCCP algorithms still have several advantages
over the proposed algorithms. First, DCCP algorithms are
simple and easy to deploy. Second, DCCP algorithms have
very low computational complexity. Third, DCCP works in the
considered system architecture (Fig. 1) as well as others, while
our proposed algorithms only run on streaming servers. We
will discuss the last limitation more in Section VI.
Multiple Clients and Service Differentiation: We use the
HC algorithm to stream different videos to three clients
under 40% background traffic load. Three cost functions
,
, and
are considered, where
6. We plot the video

quality of individual clients with
in Fig. 18, which
shows that the HC algorithm achieves service differentiation:
Client 3 (Harbour) has the lowest video quality among all
clients.
Table II presents the overall video quality under different
weights: Service differentiation can be achieved by a proper selection of the cost functions. For example, with
, the
video quality of client 3 (Harbour) is 10–15 dB lower than that
of client 1 (Crew), while the gap is reduced to 3 dB with
.
Table II gives the average streaming rate under different cost
functions. For
, client 1 (Crew) achieves higher video
quality than other clients (cf. Table II), despite receiving lower
rate (cf. Table II); this is because Crew has a steeper R-D curve
(cf. Fig. 6).
6 The cost functions are application-dependent. A meticulous design of the
cost function to meet some desirable specifications is out of the scope of this
paper.
VI. LIMITATIONS AND FUTURE WORK
This paper considers multihomed, multiple-client video
streaming from a server's perspective (Fig. 1). This results in
server-driven adaptation solutions, which may incur too much
overhead on the server for many clients. Therefore, each server
might only be able to serve a small number of clients. While
the streaming service providers may deploy multiple streaming
servers in a server farm, as well as exploit increased computational power via grid or cloud computing, the centralized nature
of our solution could still render the proposed algorithms less
efficient in such deployments. For example, probing traffic
from multiple servers to infer available bit rate and round-trip
time could interfere with each other. To tackle this, we can
control the number of decision variables by simplifying the
scalable stream structures, hence trading streaming optimality
for shorter running time for the case of many users. Transforming our current architecture toward client-driven solutions
is one of our future tasks. Techniques such as Lagrangian decomposition could be used to develop distributed algorithms for
more scalable solutions. The resulting distributed algorithms
will be more suitable to client-driven HTTP streaming, such as
3GPP/MPEG DASH, which is getting more and more popular
nowadays.
VII. CONCLUSION
In this paper, we have addressed various usage scenarios of
video streaming from a server to multinetwork clients over heterogeneous access networks. More precisely, we have formally
abstracted the problem of joint rate control and stream adaptation as an optimization problem of minimizing the expected distortion of the received videos subject to constraints based on network conditions. We have formulated this problem as an integer
program for joint rate control and stream adaptation in order to
determine, for each client: 1) the streaming rates over individual
access networks; 2) the video packets to be transmitted; and 3)
the access network each transmitted video packet is sent over, so
as to minimize a cost function of the expected distortion at the
receiver side. We have proposed using different cost functions
to account for service differentiation and fairness among users.
We have proposed two heuristic algorithms for packet scheduling, namely SRDO and PRDO. In addition, we have derived
convex programming approximations to the randomized packet
scheduling problem and have studied the tradeoff between performance and runtime; one of our randomized algorithms (TTC)
has a better runtime at the cost of lower performance, while the
other one (MC) has better performance at the cost of exponential complexity. We have proposed a hybrid algorithm (HC) that
yields good performance for a low number of access networks
while being suitable for real-time applications.
We have conducted extensive simulations to compare the performance of HC against SRDO, PRDO, and the rate control
algorithms defined in the DCCP standard. The simulation results have shown that the HC algorithm: 1) outperforms the rate
control algorithms in the DCCP standard by about 10–15 dB in
video quality; 2) reduces average delivery delay by over 90%
compared to DCCP; 3) results in an average quality improvement of 4.33 dB versus SRDO, and 1.84 dB versus PRDO, under
different background traffic loads; 4) runs efficiently, up to six
times faster than PRDO; and 5) indeed provides service differentiation among users.
REFERENCES
[1] M. Megna, "AT&T faces 5,000 percent surge in traffic," Oct. 2009 [Online]. Available: http://www.internetnews.com/mobility/article.php/3843001
[2] "CVX: Matlab software for disciplined convex programming," 2009 [Online]. Available: http://www.stanford.edu/~boyd/cvx/
[3] K. Fitchard, "T-Mobile's growth focusing on 3G," 2009 [Online]. Available: http://connectedplanetonline.com/wireless/news/t-mobile-3g-growth-0130
[4] Cisco Systems, San Jose, CA, "Cisco Visual Networking Index forecast Web site," 2010 [Online]. Available: http://www.cisco.com/go/vni
[5] Stanford University, Stanford, CA, "Abing project page," 2004 [Online]. Available: http://www-iepm.slac.stanford.edu/tools/abing/
[6] T. Alpcan, J. Singh, and T. Basar, "Robust rate control for heterogeneous network access in multihomed environments," IEEE Trans. Mobile Comput., vol. 8, no. 1, pp. 41–51, Jan. 2009.
[7] I. Amonou, N. Cammas, S. Kervadec, and S. Pateux, "Optimized rate-distortion extraction with quality layers in the scalable extension of H.264/AVC," IEEE Trans. Circuits Syst. Video Technol., vol. 17, no. 9, pp. 1186–1193, Sep. 2007.
[8] D. Bertsekas, Convex Optimization Theory. Belmont, MA: Athena Scientific, 2009.
[9] S. Boyd and L. Vandenberghe, Convex Optimization, 1st ed. Cambridge, U.K.: Cambridge Univ. Press, 2004.
[10] J. Chakareski and B. Girod, "Rate-distortion optimized packet scheduling and routing for media streaming with path diversity," in Proc. DCC, Snowbird, UT, Mar. 2003, pp. 203–212.
[11] P. Chou and Z. Miao, "Rate-distortion optimized streaming of packetized media," IEEE Trans. Multimedia, vol. 8, no. 2, pp. 390–404, Apr. 2006.
[12] K. Evensen, T. Kupka, D. Kaspar, P. Halvorsen, and C. Griwodz, "Quality-adaptive scheduling for live streaming over multiple access networks," in Proc. ACM NOSSDAV, Amsterdam, The Netherlands, Jun. 2010, pp. 21–26.
[13] N. Freris, C. Hsu, X. Zhu, and J. Singh, "Resource allocation for multihomed scalable video streaming to multiple clients," in Proc. IEEE ISM, Taichung, Taiwan, Dec. 2010, pp. 9–16.
[14] P. Fuxjager, H. Fischer, I. Gojmerac, and P. Reichl, "Radio resource allocation in urban femto-WiFi convergence scenarios," in Proc. Euro-NF NGI, Paris, France, Jun. 2010, pp. 1–8.
[15] D. Gross, J. Shortle, J. Thompson, and C. Harris, Fundamentals of Queueing Theory, 4th ed. Hoboken, NJ: Wiley-Interscience, 2008.
[16] Y. He and R. Yuan, "A novel scheduled power saving mechanism for 802.11 wireless LANs," IEEE Trans. Mobile Comput., vol. 8, no. 10, pp. 1368–1383, Oct. 2009.
[17] M. Hefeeda and C. Hsu, "Rate-distortion optimized streaming of fine-grained scalable video sequences," Trans. Multimedia Comput., Commun., Appl., vol. 4, no. 1, pp. 2:1–2:28, Jan. 2008.
[18] C. Hsu, N. Freris, J. Singh, and X. Zhu, "Rate control and stream adaptation for scalable video streaming over multiple access networks," in Proc. IEEE PV, Hong Kong, Dec. 2010, pp. 33–40.
[19] C. Hsu and M. Hefeeda, "Broadcasting video streams encoded with arbitrary bit rates in energy-constrained mobile TV networks," IEEE/ACM Trans. Netw., vol. 18, no. 3, pp. 681–694, Jun. 2010.
[20] L. Jiang and J. Walrand, "A distributed CSMA algorithm for throughput and utility maximization in wireless networks," IEEE/ACM Trans. Netw., vol. 18, no. 3, pp. 960–972, Jun. 2010.
[21] D. Jurca and P. Frossard, "Media-specific rate allocation in heterogeneous wireless networks," in Proc. IEEE PV, Hangzhou, China, May 2006, pp. 713–726.
[22] E. Kohler, M. Handley, and S. Floyd, "Datagram congestion control protocol (DCCP)," RFC 4340, 2006.
[23] W. Li, "Overview of fine granularity scalability in MPEG-4 video standard," IEEE Trans. Circuits Syst. Video Technol., vol. 11, no. 3, pp. 301–317, Mar. 2001.
[24] Y. Liang, J. Apostolopoulos, and B. Girod, "Analysis of packet loss for compressed video: Effect of burst losses and correlation between error frames," IEEE Trans. Circuits Syst. Video Technol., vol. 18, no. 7, pp. 861–874, Jul. 2008.
[25] H. Mansour, V. Krishnamurthy, and P. Nasiopoulos, "Channel aware multiuser scalable video streaming over lossy under-provisioned channels: Modeling and analysis," IEEE Trans. Multimedia, vol. 10, no. 7, pp. 1366–1381, Nov. 2008.
[26] N. Mattsson, "A DCCP module for NS-2," Master's thesis, Dept. Comput. Sci. Elect. Eng., Lulea Tekniska University, Lulea, Sweden, 2004.
[27] "The network simulator," 2012 [Online]. Available: http://www.isi.edu/nsnam/ns/
[28] A. D. Rikun, "A convex envelope formula for multilinear functions," J. Global Optimiz., vol. 10, no. 4, pp. 425–437, 1997.
[29] H. Schwarz, D. Marpe, and T. Wiegand, "Overview of the scalable video coding extension of the H.264/AVC standard," IEEE Trans. Circuits Syst. Video Technol., vol. 17, no. 9, pp. 1103–1120, Sep. 2007.
[30] J. Singh, T. Alpcan, P. Agrawal, and V. Sharma, "An optimal flow assignment framework for heterogeneous network access," in Proc. IEEE WoWMoM, Helsinki, Finland, Jun. 2007, pp. 1–12.
[31] J. Sun, W. Gao, D. Zhao, and W. Li, "On rate-distortion modeling and extraction of H.264/SVC fine-granular scalable video," IEEE Trans. Circuits Syst. Video Technol., vol. 19, no. 3, pp. 323–336, Mar. 2009.
[32] A. Szwabe, A. Schorr, F. Hauck, and A. Kassler, "Dynamic multimedia stream adaptation and rate control for heterogeneous networks," in Proc. IEEE PV, Hangzhou, China, May 2006, pp. 63–69.
[33] E. Tan, L. Guo, S. Chen, and X. Zhang, "PSM-throttling: Minimizing energy consumption for bulk data communications in WLANs," in Proc. IEEE ICNP, Beijing, China, Oct. 2007, pp. 123–132.
[34] J. Wang, M. Ghosh, and K. Challapali, "Emerging cognitive radio applications: A survey," IEEE Commun. Mag., vol. 49, no. 3, pp. 74–81, Mar. 2011.
[35] M. Wien, H. Schwarz, and T. Oelbaum, "Performance analysis of SVC," IEEE Trans. Circuits Syst. Video Technol., vol. 17, no. 9, pp. 1194–1203, Sep. 2007.
[36] J. Xin, C. Lin, and M. Sun, "Digital video transcoding," Proc. IEEE, vol. 93, no. 1, pp. 84–97, Jan. 2005.
[37] X. Xing, S. Mishra, and X. Liu, "ARBOR: Hang together rather than hang separately in 802.11 WiFi networks," in Proc. IEEE INFOCOM, San Diego, CA, Mar. 2010, pp. 1352–1360.
[38] Q. Zhang, W. Zhu, and Y. Zhang, "End-to-end QoS for video delivery over wireless Internet," Proc. IEEE, vol. 93, no. 1, pp. 123–134, Jan. 2005.
[39] X. Zhu, P. Agrawal, J. Singh, T. Alpcan, and B. Girod, "Distributed rate allocation policies for multihomed video streaming over heterogeneous access networks," IEEE Trans. Multimedia, vol. 11, no. 4, pp. 752–764, Jun. 2009.
[40] X. Zhu, E. Setton, and B. Girod, "Congestion-distortion optimized video transmission over ad hoc networks," Signal Process., Image Commun., vol. 20, no. 8, pp. 773–783, Sep. 2005.
[41] X. Zhu, J. Singh, and B. Girod, "Joint routing and rate allocation for multiple video streams in ad-hoc wireless networks," in Proc. IEEE PV, Hangzhou, China, May 2006, pp. 727–736.
Nikolaos M. Freris (M'05) received the Diploma
in electrical and computer engineering from the
National Technical University of Athens, Athens,
Greece, in 2005, and the M.S. degree in electrical and computer engineering, M.S. degree in
mathematics, and Ph.D. degree in electrical and
computer engineering from the University of Illinois
at Urbana-Champaign in 2007, 2008, and 2010,
respectively.
Since 2010, he has been working as a Researcher
with IBM Research-Zürich, Zurich, Switzerland.
His research interests lie in wireless and sensor networks as well as data mining
with provable guarantees.
Dr. Freris is a member of SIAM and the Technical Chamber of Greece.
Cheng-Hsin Hsu (S'09–M'10) received the B.Sc.
degree in mathematics and M.Sc. degree in computer
science and information engineering from National
Chung-Cheng University, Taiwan, in 1996 and
2000, respectively, the M.Eng. degree in electrical
and computer engineering from the University of
Maryland, College Park, in 2003 and the Ph.D.
degree in computing science from Simon Fraser
University, Burnaby, BC, Canada, in 2009.
He is an Assistant Professor with National Tsing
Hua University, Hsin Chu, Taiwan. His research interests are in the area of multimedia networking and distributed systems.
Dr. Hsu is a member of the Association for Computing Machinery (ACM).
Jatinder Pal Singh (M'05) received the B.S. degree
from the Indian Institute of Technology, Delhi, India,
in 2000, and the M.S. and Ph.D. degrees from Stanford University, Stanford, CA, in 2002 and 2005, respectively, all in electrical engineering.
He is the Director of Mobile Innovation Strategy
with the Palo Alto Research Center and Consulting Associate Professor with the Department of
Electrical Engineering, Stanford University. He was
previously Vice President of Research with Deutsche
Telekom, Los Altos, CA, one of the world's largest
ISPs and parent company of T-Mobile.
Dr. Singh graduated at the top of his class with the Institute Silver Medal at
the Indian Institute of Technology, Delhi, and was awarded a Stanford Graduate
Fellowship and Deutsche Telekom Fellowship.
Xiaoqing Zhu (M'09) received the B.Eng. degree
in electronics engineering from Tsinghua University,
Beijing, China, in 2001, and the M.S. and Ph.D. degrees in electrical engineering from Stanford University, Stanford, CA, in 2002 and 2009, respectively.
She is currently a member of the Advanced Architecture and Research Group, Cisco Systems, Inc.,
San Jose, CA. She interned with the IBM Almaden
Research Center, San Jose, CA, in 2003, and was
at Sharp Labs of America, Camas, WA, during the
summer of 2006. Her research interests lie at the
intersection of multimedia signal processing, wireless communications, and
networking.
Dr. Zhu has served as a reviewer for many journals and magazines,
including the IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS,
IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, IEEE TRANSACTIONS
ON MULTIMEDIA, IEEE Communications Magazine, and IEEE Network. She
has also helped organize various conferences and workshops, such as IEEE
GLOBECOM, IEEE International Conference on Computing, Networking
and Communication (ICNC), and SPIE Visual Communications and Image
Processing (VCIP). She served as Guest Editor for the IEEE Technical Committee on Multimedia Communications (MMTC) E-Letter, IEEE JOURNAL
ON SELECTED AREAS IN COMMUNICATIONS, and IEEE TRANSACTIONS ON
MULTIMEDIA. She was awarded the Stanford Graduate Fellowship from 2001
to 2005. She was the recipient of the Best Student Paper Award in ACM
Multimedia 2007.


Nikolaos M. Freris (M05) received the Diploma

Cheng-Hsin Hsu (S09M10) received the B.Sc.

Jatinder Pal Singh (M05) received the B.S. degree

Xiaoqing Zhu (M09) received the B.Eng. degree

S-ar putea să vă placă și