
1. Video coding standards

a) Describe two features incorporated in the H.263 video coding standard that helped to
improve the coding efficiency over the earlier H.261 video coding standard. (2.5pt for each
feature)

i) Using half-pel accuracy motion estimation instead of integer-pel. ii) Variable block size
for motion compensation (allowing a 16x16 block to be divided into four 8x8 blocks, with a
motion vector estimated for each 8x8 block separately; this is helpful when the 16x16
block contains two objects moving differently).

If a student lists another legitimate difference, that is acceptable too.
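As a hedged illustration of feature (i), here is a small Python sketch of half-pel motion refinement: the reference frame is bilinearly interpolated at half-pel positions and the offset with the smallest SAD is kept. The block size, search logic, and function names are assumptions for this example, not part of the H.263 standard text.

```python
import numpy as np

# Sketch: refine an integer-pel motion vector to half-pel accuracy by
# bilinearly interpolating the reference frame and picking the half-pel
# offset with the lowest SAD. Positions are assumed to stay inside the frame.
def interpolate(ref, y, x):
    """Bilinear interpolation of reference frame ref at fractional (y, x)."""
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    dy, dx = y - y0, x - x0
    return ((1 - dy) * (1 - dx) * ref[y0,     x0    ] +
            (1 - dy) * dx       * ref[y0,     x0 + 1] +
            dy       * (1 - dx) * ref[y0 + 1, x0    ] +
            dy       * dx       * ref[y0 + 1, x0 + 1])

def half_pel_refine(cur_block, ref, top, left, mv_y, mv_x, N=16):
    """Try the 9 half-pel offsets around integer MV (mv_y, mv_x), keep the best."""
    best = (float("inf"), mv_y, mv_x)
    for oy in (-0.5, 0.0, 0.5):
        for ox in (-0.5, 0.0, 0.5):
            pred = np.array([[interpolate(ref, top + mv_y + oy + i,
                                               left + mv_x + ox + j)
                              for j in range(N)] for i in range(N)])
            sad = np.abs(cur_block - pred).sum()
            best = min(best, (sad, mv_y + oy, mv_x + ox))
    return best[1], best[2]
```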

b) Why do MPEG-1 and MPEG-2 use the GOP structure with periodic I-frames? (2 pt) For
video conferencing or video phone applications, can the encoder insert I-frames
periodically? What may be the problem?

The GOP structure enables random access, which is important for video broadcasting,
video streaming, and DVD playback, the applications targeted by MPEG-1/2. Inserting
I-frames periodically generally causes bit-rate spikes at the I-frames. When the bit stream
is sent over a constant-rate channel, an I-frame takes longer to transmit, which causes
variable delay at the receiver. To display the video at a constant frame rate, a large
smoothing buffer is needed at the receiver. This significantly increases the delay between
when a frame is sent and when it is decoded and displayed; the delay may reach several
seconds. For the video distribution applications targeted by MPEG-1/2, this delay is
typically acceptable. However, for video conferencing/telephony applications, the
acceptable delay is between 150 ms and 400 ms. Therefore, inserting I-frames periodically
is not advisable for video conferencing/telephony applications.
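As a rough illustration of the I-frame spike argument, a short Python sketch follows; the frame sizes, frame rate, and channel rate are assumptions chosen for the example, not values from the problem.

```python
# Sketch: per-frame transmission time over a constant-rate channel,
# using assumed (illustrative) frame sizes and rates.
CHANNEL_RATE = 1_000_000        # 1 Mbit/s constant-rate channel (assumed)
FRAME_RATE   = 30               # frames per second (assumed)
I_FRAME_BITS = 200_000          # assumed I-frame size
P_FRAME_BITS = 20_000           # assumed P-frame size
GOP = ['I'] + ['P'] * 14        # one I-frame every 15 frames

tx_times = [(I_FRAME_BITS if f == 'I' else P_FRAME_BITS) / CHANNEL_RATE
            for f in GOP]
frame_period = 1.0 / FRAME_RATE

# The receiver must buffer enough to absorb the slowest (I-frame) arrival,
# so the extra smoothing delay is at least the excess of the I-frame's
# transmission time over one frame period.
extra_delay = max(tx_times) - frame_period
print(f"I-frame tx time: {max(tx_times)*1000:.0f} ms, "
      f"P-frame tx time: {min(tx_times)*1000:.0f} ms, "
      f"extra smoothing delay >= {extra_delay*1000:.0f} ms")
```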

c) What is scalable coding? (2 pt) Why is it beneficial for video streaming applications?
Scalable coding generates, for each group of video frames, a bit stream that can be truncated
either at any point or at several defined points. When a user receives a truncated bit stream,
he/she sees a correspondingly lower video quality (in spatial resolution, temporal frame
rate, color accuracy, or a combination of these). In video streaming applications, the same
video is often requested by users with different access bandwidths or decoding/display
capabilities. Without scalable coding, multiple versions of the video have to be encoded at
different bit rates and with different spatial/temporal resolutions. With scalable coding,
only a single scalable bit stream needs to be generated. Based on each user's available
bandwidth and decoding/display capability, only a partial set of the bit stream needs to be
delivered. In a broadcast/multicast application, the complete bit stream is sent to a
multicast tree, but different nodes of the tree may choose to forward only parts of the bit
stream, depending on the available bandwidth below that node. (Note that discussion of
multicast/broadcast is not required.)
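A minimal sketch of the delivery decision described above, i.e. how a sender or intermediate multicast node might pick which layers of a scalable stream to forward for a given downstream bandwidth. The layer names and rates are assumptions for illustration, not part of any standard.

```python
# Sketch: choose which layers of a scalable bit stream to deliver,
# given a user's available bandwidth. Layer names/rates are assumed.
LAYERS = [
    ("base",           500_000),   # base layer: lowest quality, always sent
    ("enhancement-1",  500_000),   # adds spatial resolution
    ("enhancement-2", 1_000_000),  # adds temporal frame rate
]

def select_layers(available_bps):
    """Return the prefix of layers whose cumulative rate fits the bandwidth."""
    chosen, used = [], 0
    for name, rate in LAYERS:
        if used + rate > available_bps and chosen:
            break
        chosen.append(name)
        used += rate
    return chosen

print(select_layers(600_000))    # -> ['base'] (only the base layer fits)
print(select_layers(2_500_000))  # -> all three layers
```
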
d) What is object-based coding? What are the three types of information specified for each
object? Which standard uses object-based coding?

Object-based coding refers to coding the different moving objects that may exist in a video
separately, so that the decoder can access the bits for different objects independently. The
decoder can choose to decode certain objects but not others, or display decoded objects
from different view angles. The three types of information specified for each object are:
shape, motion, and texture (color variation). MPEG-4 uses object-based coding.
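Purely as an illustration of the three kinds of per-object information, a tiny Python sketch follows; the class and field names are assumptions, not actual MPEG-4 bitstream syntax.

```python
from dataclasses import dataclass
import numpy as np

# Illustrative only: the three kinds of information coded for each object.
# Names are assumptions, not MPEG-4 syntax elements.
@dataclass
class VideoObjectPlane:
    shape:   np.ndarray  # binary (or gray-scale alpha) mask giving the object's outline
    motion:  np.ndarray  # motion vectors for the blocks covering the object
    texture: np.ndarray  # luminance/chrominance samples (the "texture") of the object
```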

2. Digital TV systems
a) Describe the major components in the US ATSC system and the method used for each
component.

The US ATSC system includes audio coding, video coding, data multiplexing, channel
coding, and modulation. Audio coding uses the Dolby AC-3 standard. Video coding
follows the MPEG-2 video standard, using either mp@hl or mp@ml. Multiplexing is done
following the MPEG-2 systems standard. Channel coding is realized by concatenating an
outer Reed-Solomon code with an inner trellis code, with a data interleaver in between.
Modulation is accomplished using 8-VSB, which uses 8-ASK to map digital bits to analog
waveforms and vestigial sideband (VSB) filtering to reduce the bandwidth to 6 MHz total.

b) Repeat the same for the Europe DVB system.


The European DVB system also consists of five components. Video coding and multiplexing
follow the MPEG-2 video and MPEG-2 systems standards, as in the US ATSC system.
For audio, stereo sound is the standard format, coded using the MPEG-2 audio format (but
only the MPEG-2 BC mode is required, which is equivalent to MPEG-1 Layer 2). Channel
coding is quite similar, but the inner code is a punctured convolutional code. DVB uses a
very different modulation technique: it combines QAM with OFDM.

3. Consider the transmission of digital signals over a channel with 5 MHz bandwidth.
a) If we use 8 ASK for modulating the digital bits into an analog waveform, what is
the maximum bit rate the channel can support?
A channel of bandwidth W can carry at most 2W symbols/s (the Nyquist rate), so a 5 MHz
channel can support at most 10 Msymbols/s. With 8-ASK, each symbol carries log2(8) = 3
bits, so the maximum bit rate is 30 Mbit/s.

b) Now, suppose we further use a channel code with rate 2/3 to protect the payload
information; what is the maximum bit rate at which the information can be sent?
The maximum information bit rate is 30 × 2/3 = 20 Mbit/s.
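A quick Python sketch of the arithmetic in parts a) and b):

```python
import math

# Sketch of the arithmetic in parts a) and b).
bandwidth_hz = 5e6                         # channel bandwidth
symbol_rate = 2 * bandwidth_hz             # Nyquist: at most 2W symbols/s
bits_per_symbol = math.log2(8)             # 8-ASK carries 3 bits per symbol
raw_bit_rate = symbol_rate * bits_per_symbol
info_bit_rate = raw_bit_rate * 2 / 3       # rate-2/3 channel code

print(f"raw bit rate:  {raw_bit_rate/1e6:.0f} Mbit/s")   # 30 Mbit/s
print(f"info bit rate: {info_bit_rate/1e6:.0f} Mbit/s")  # 20 Mbit/s
```
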

4. Describe how to map a digital signal to an analog waveform using 4-QAM. For the
following sequence of bits, 01101100, sketch the resulting analog signal.
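As a hedged illustration, a minimal Python sketch of one common 4-QAM (QPSK) mapping: bit pairs are Gray-coded onto in-phase/quadrature amplitudes of ±1. The exact constellation labeling used in the course may differ from the one assumed here.

```python
# Sketch: map a bit string to 4-QAM (QPSK) symbols, assuming a Gray-coded
# constellation with bit pairs -> (I, Q) amplitudes of +/-1. The actual
# labeling convention may differ.
CONSTELLATION = {
    "00": ( 1,  1),
    "01": ( 1, -1),
    "11": (-1, -1),
    "10": (-1,  1),
}

def qam4_symbols(bits):
    """Split the bit string into pairs and look up the (I, Q) amplitudes."""
    assert len(bits) % 2 == 0
    return [CONSTELLATION[bits[i:i + 2]] for i in range(0, len(bits), 2)]

# The analog waveform over each symbol interval is then
#   s(t) = I * cos(2*pi*f_c*t) - Q * sin(2*pi*f_c*t).
print(qam4_symbols("01101100"))
# -> [(1, -1), (-1, 1), (-1, -1), (1, 1)]
```
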
5. Why does the receiver of a video streaming session need a buffer? What are the
advantages of using a large buffer? What is the disadvantage? What is the
application-layer protocol developed for Internet video streaming? What does
the protocol govern? What protocols can be used for the data transport?
The packets of a video streaming session often arrive with different delays due to
variability in the network load (and different packets may be delivered over different
paths). This delay variation is called delay jitter. For discussion purposes, assume each
packet contains the data for one video frame. If the decoder immediately decodes and
displays each received packet, the displayed video will have a variable frame rate due to
delay jitter. In order to display the decoded video at a constant frame rate, a buffer is needed
to store the received packets as they arrive. The decoder takes packets out of the buffer,
decodes them, and displays them at a constant rate. A packet may arrive later than its
scheduled decoding time; such a packet is considered lost even though it eventually
arrives. (If a student answers "To overcome delay jitter of arrived packets", that is fine.)

By using a large buffer, longer delays can be tolerated for all packets, and fewer packets
will be dropped because they are late, so the displayed video will have better quality.
However, this also means that a user has to wait longer after issuing a request before the
video starts playing.
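A minimal sketch of the playout-buffer logic described above; the arrival times, frame rate, and startup delay are assumed values chosen for illustration.

```python
# Sketch: constant-rate playout from a jitter buffer. Arrival times are
# assumed for illustration; a packet arriving after its scheduled decode
# time is treated as lost even though it eventually arrives.
FRAME_PERIOD  = 1 / 30          # display one frame every ~33.3 ms
STARTUP_DELAY = 0.20            # buffer for 200 ms before playback starts

arrival_times = [0.00, 0.04, 0.06, 0.15, 0.40, 0.16, 0.21]  # assumed, with jitter

played, lost = 0, 0
for i, arrival in enumerate(arrival_times):
    decode_deadline = STARTUP_DELAY + i * FRAME_PERIOD
    if arrival <= decode_deadline:
        played += 1
    else:
        lost += 1               # too late: counted as lost

print(f"played {played} frames, dropped {lost} late frames")
# -> played 6 frames, dropped 1 late frame
# A larger STARTUP_DELAY (bigger buffer) drops fewer frames but delays playback.
```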

RTSP is the application-layer protocol for video streaming. It governs the interaction
between the streaming server and the client, and enables session setup, pause,
rewind, stop, etc. It allows TCP, UDP, or RTP/UDP at the transport layer.

6. What is the acceptable delay for effective audio/video phone/conferencing
applications? At what layer does the SIP protocol reside? What are the primary
functions of the SIP protocol? What transport layer protocols can it work
with?

150 ms is definitely acceptable. Up to 400 ms can be tolerated.


SIP sits at the application layer. It performs call setup, including finding the current IP
address of the callee. It also allows the two ends to exchange information about the
audio/video codecs each can accept. It can work with TCP, UDP, or RTP/UDP at the
transport layer.
