1. INTRODUCTION
1.1 PREAMBLE
High Efficiency Video Coding (HEVC) is the newest video coding standard of the
ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts
Group. The main goal of the HEVC standardization effort is to enable
significantly improved compression performance relative to existing standards—
in the range of 50% bit-rate reduction for equal perceptual video quality.
1.4 SUMMARY
The HEVC standard is designed to achieve multiple goals, including coding
efficiency, ease of transport system integration and data loss resilience, as well as
implementability using parallel processing architectures. The goal of the HEVC
standardization effort is to enable significantly improved compression, in the range of
a 50% bit-rate reduction for equal perceptual video quality.
Video coding standards have evolved primarily through the development of the
well-known ITU-T and ISO/IEC standards. The ITU-T produced H.261 and H.263,
ISO/IEC produced MPEG-1 and MPEG-4 Visual, and the two organizations jointly
produced the H.262/MPEG-2 Video and H.264/MPEG-4 Advanced Video Coding
(AVC) standards. The two standards that were jointly produced have had a particularly
strong impact and have found their way into a wide variety of products that are
increasingly prevalent in our daily lives. Throughout this evolution, continued efforts
have been made to maximize compression capability and improve other characteristics
such as data loss robustness, while considering the computational resources that were
practical for use in products at the time of anticipated deployment of each standard. The
major video coding standard directly preceding the
HEVC project was H.264/MPEG-4 AVC, which was initially developed in the period
between 1999 and 2003, and then was extended in several important ways from 2003–
2009. H.264/MPEG-4 AVC has been an enabling technology for digital video in almost
every area that was not previously covered by H.262/MPEG-2 Video and has
substantially displaced the older standard within its existing application domains. It is
widely used for many applications, including broadcast of high definition (HD) TV
signals over satellite, cable, and terrestrial transmission systems, video content
acquisition and editing systems, camcorders, security applications, Internet and mobile
network video, Blu-ray Discs, and real-time conversational applications such as video
chat, video conferencing, and telepresence systems.
The video coding layer of HEVC employs the same hybrid approach (inter-
/intrapicture prediction and 2-D transform coding) used in all video compression
standards since H.261. Fig. 1 depicts the block diagram of a hybrid video encoder,
which could create a bitstream conforming to the HEVC standard.
Between random access points, interpicture temporally predictive coding modes are
typically used for most blocks. The encoding process for interpicture prediction
consists of choosing motion data comprising the selected reference picture and
motion vector (MV) to be applied for predicting the samples of each block. The
encoder and decoder generate identical interpicture prediction signals by applying
motion compensation (MC) using the MV and mode decision data, which are
transmitted as side information. The residual signal of the intra- or interpicture
prediction, which is the difference between the original block and its prediction, is
transformed by a linear spatial transform. The transform coefficients are then scaled,
quantized, entropy coded, and transmitted together with the prediction information.
The encoder duplicates the decoder processing loop (see gray-shaded boxes in Fig.
1) such that both will generate identical predictions for subsequent data. Therefore,
the quantized transform coefficients are reconstructed by inverse scaling and are then
inverse transformed to duplicate the decoded approximation of the residual signal.
The residual is then added to the prediction, and the result of that addition may then
be fed into one or two loop filters to smooth out artifacts induced by block-wise
processing and quantization. The final picture representation (that is a duplicate of
the output of the decoder) is stored in a decoded picture buffer to be used for the
prediction of subsequent pictures. In general, the order of encoding or decoding
processing of pictures often differs from the order in which they arrive from the
source, necessitating a distinction between the decoding order (i.e., bitstream order)
and the output order (i.e., display order) for a decoder.
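The reconstruction loop described above can be sketched in a few lines of Python. This is an illustrative model only, not HEVC-conformant code: the spatial transform is omitted, and a plain scalar quantizer with a made-up step size stands in for HEVC's actual scaling and quantization. The point it demonstrates is that the encoder stores the decoded approximation, not the original samples.

```python
def quantize(residual, q):
    """Scale and quantize a residual block (the lossy step)."""
    return [round(r / q) for r in residual]

def dequantize(levels, q):
    """Inverse scaling, performed identically by encoder and decoder."""
    return [lvl * q for lvl in levels]

def reconstruct(prediction, original, q):
    """Duplicate the decoder processing loop: code the residual, then
    rebuild the block from the prediction plus the decoded residual."""
    residual = [o - p for o, p in zip(original, prediction)]
    levels = quantize(residual, q)       # transmitted as entropy-coded data
    approx = dequantize(levels, q)       # decoded approximation of residual
    return [p + a for p, a in zip(prediction, approx)]

prediction = [100, 102, 104, 106]   # intra-/interpicture prediction signal
original   = [103, 101, 110, 106]   # source samples (illustrative values)
recon = reconstruct(prediction, original, q=4)
# 'recon', not 'original', is what both sides keep for predicting later blocks
```

Because both sides run `reconstruct` on the same transmitted levels, their stored pictures stay bit-identical, which is exactly why the gray-shaded decoder boxes appear inside the encoder of Fig. 1.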
Given such a video signal, the HEVC syntax partitions the pictures further as
follows.
At picture boundaries, splitting is performed as necessary to reduce the CB size to
the point where the entire CB will fit into the picture.
The luma and chroma PBs, together with the associated prediction syntax, form
the PU.
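The boundary-forced splitting mentioned above can be sketched as a recursive quadtree. The sizes and the picture dimensions below are illustrative assumptions (HEVC luma CTUs range from 16x16 to 64x64); the sketch only shows the quartering-until-it-fits idea, not the full CTU syntax.

```python
def split_to_fit(x, y, size, width, height, min_size=8):
    """Recursively quarter a block until every resulting CB lies entirely
    inside the picture; returns a list of (x, y, size) coding blocks."""
    if x + size <= width and y + size <= height:
        return [(x, y, size)]            # block fits: no further split needed
    if x >= width or y >= height or size <= min_size:
        return []                        # entirely outside, or cannot split
    half = size // 2
    blocks = []
    for dy in (0, half):                 # visit the four quadrants
        for dx in (0, half):
            blocks += split_to_fit(x + dx, y + dy, half, width, height, min_size)
    return blocks

# A 64x64 CTU at the right edge of a 96x64 picture: only 32-wide CBs fit.
cbs = split_to_fit(64, 0, 64, 96, 64)
```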
Except for the effects of in-loop filtering near the edges of the slice, a slice can be
decoded without the use of any data from other slices in the same picture. This
means that prediction within the picture (e.g., intrapicture spatial signal prediction
or prediction of motion vectors) is not performed across slice boundaries.
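The slice-independence rule amounts to an availability check before any in-picture prediction. The sketch below is a toy model: the `slice_id` map assigning blocks to slices is a made-up example, not HEVC syntax.

```python
def neighbor_available(slice_id, cur, neighbor):
    """A neighboring block may serve as a source for intrapicture or
    motion-vector prediction only if it exists and lies in the same
    slice as the current block."""
    return neighbor in slice_id and slice_id[neighbor] == slice_id[cur]

# Two slices covering a 2x2 grid of blocks (illustrative assignment).
slice_id = {(0, 0): 0, (0, 1): 0, (1, 0): 1, (1, 1): 1}

neighbor_available(slice_id, (0, 1), (0, 0))   # same slice: usable
neighbor_available(slice_id, (1, 0), (0, 0))   # crosses a slice boundary: not usable
```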
Fig. 4. Subdivision of a picture into (a) slices and (b) tiles. (c) Illustration
of wavefront parallel processing.
Multiple tiles may share header information by being contained in the same slice.
Alternatively, a single tile may contain multiple slices. A tile consists of a
rectangular group of CTUs (typically, but not necessarily, with all tiles containing
about the same number of CTUs), as shown in Fig. 4(b). To assist with the
granularity of data packetization, dependent slices are additionally defined. Finally,
with WPP, a slice is divided into rows of CTUs. The decoding of each row can begin
as soon as a few decisions that are needed for prediction and adaptation of the
entropy coder have been made in the preceding row. This supports parallel
processing of rows of CTUs
by using several processing threads in the encoder or decoder (or both). An example
is shown in Fig. 4(c).
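The staggered starts of Fig. 4(c) follow from the WPP dependency: a row may start once the first two CTUs of the row above are done, since the entropy-coder state is propagated from after the second CTU of the preceding row. The model below is deliberately simplified (one thread per row, a uniform per-CTU decode cost) to show only the dependency structure.

```python
def wpp_start_times(rows, cost=1):
    """Earliest start time of each CTU row under WPP, assuming one
    processing thread per row and a uniform per-CTU cost."""
    starts = [0]
    for r in range(1, rows):
        # Row r waits for the first two CTUs of row r-1 to finish.
        starts.append(starts[r - 1] + 2 * cost)
    return starts

print(wpp_start_times(4))  # staggered starts: [0, 2, 4, 6]
```

With enough rows and threads, all rows are eventually active at once, which is the parallelism the text describes.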
The high-level syntax of HEVC contains numerous elements that have been inherited
from the NAL of H.264/MPEG-4 AVC. The NAL provides the ability to map the video
coding layer (VCL) data that represent the content of the pictures onto various transport
layers, including RTP/IP, ISO MP4, and H.222.0/MPEG-2 Systems, and provides a
framework for packet loss resilience.
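To make the NAL concept concrete, the sketch below parses the two-byte HEVC NAL unit header, whose fields (1-bit forbidden_zero_bit, 6-bit nal_unit_type, 6-bit nuh_layer_id, 3-bit nuh_temporal_id_plus1) are defined in the HEVC specification; treat this as an illustrative reading of the spec, not production parsing code.

```python
def parse_hevc_nal_header(b0, b1):
    """Unpack the two-byte HEVC NAL unit header into its four fields."""
    return {
        "forbidden_zero_bit": b0 >> 7,                      # must be 0
        "nal_unit_type": (b0 >> 1) & 0x3F,                  # e.g., 32 = VPS
        "nuh_layer_id": ((b0 & 0x01) << 5) | (b1 >> 3),     # scalable layer
        "nuh_temporal_id_plus1": b1 & 0x07,                 # temporal sublayer
    }

# 0x40 0x01 begins a video parameter set: type 32, layer 0, temporal id 0.
hdr = parse_hevc_nal_header(0x40, 0x01)
```

It is this small, transport-independent header that lets systems such as RTP/IP or MPEG-2 Systems route and discard NAL units without parsing the VCL payload.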
As in H.264/MPEG-4 AVC, there are two lists that are constructed as lists of
pictures in the DPB, and these are called reference picture list 0 and list 1. An index
called a reference picture index is used to identify a particular picture in one of these
lists. For uniprediction, a picture can be selected from either of these lists. For
biprediction, two pictures are selected—one from each list. When a list contains only
one picture, the reference picture index implicitly has the value 0 and does not need
to be transmitted in the bitstream.
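The list mechanics above can be sketched as follows. The picture order counts, the `predict` stand-in, and the plain averaging are all illustrative assumptions; real HEVC motion compensation fetches interpolated samples and also supports weighted prediction.

```python
# Toy reference picture lists, identified by picture order count (POC).
list0 = [8, 4, 0]    # reference picture list 0
list1 = [16, 12]     # reference picture list 1

def predict(block_pos, ref_poc):
    """Stand-in for motion-compensated prediction from one reference
    picture; a real MC step would fetch MV-shifted samples."""
    return ref_poc  # placeholder value so the combination is visible

# Uniprediction: one picture from either list, chosen by a reference index.
uni = predict((0, 0), list0[1])          # reference index 1 selects POC 4

# Biprediction: one picture from each list, predictions combined
# (a plain average here).
bi = (predict((0, 0), list0[0]) + predict((0, 0), list1[0])) / 2
```

Note that if a list held only one picture, no index would need to be coded at all, matching the implicit-zero rule in the text.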
The high-level syntax for identifying the RPS and establishing the reference
picture lists for interpicture prediction is more robust to data losses than in the prior
H.264/MPEG-4 AVC design, and is more amenable to such operations as random
access and trick mode operation (e.g., fast-forward, smooth rewind, seeking, and
adaptive bitstream switching).
A key aspect of this improvement is that the syntax is more explicit, rather than
depending on inferences from the stored internal state of the decoding process as it
decodes the bitstream picture by picture. Moreover, the associated syntax for these
aspects of the design is actually simpler than it had been for H.264/MPEG-4 AVC.
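The explicit style of signaling can be illustrated with a small sketch of the reference picture set idea: each picture lists, as POC differences, exactly which pictures must remain in the DPB, rather than the decoder inferring that state picture by picture. The delta values here are made up for illustration.

```python
def reference_pocs(current_poc, delta_pocs):
    """Resolve explicitly signaled POC deltas to absolute POC values."""
    return [current_poc + d for d in delta_pocs]

# The set is re-signaled with each picture, so a loss cannot silently
# desynchronize the decoder's reference state: a missing POC is detectable.
rps = reference_pocs(current_poc=8, delta_pocs=[-1, -2, -4])
print(rps)  # [7, 6, 4]
```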
7. CONCLUSION
The emerging HEVC standard has been developed and standardized
collaboratively by both the ITU-T VCEG and ISO/IEC MPEG organizations. HEVC
represents a number of advances in video coding technology. Its video coding layer
design is based on conventional block-based motion compensated hybrid video
coding concepts, but with some important differences relative to prior standards.
When used well together, the features of the new design provide approximately a
50% bit-rate savings for equivalent perceptual quality relative to the performance of
prior standards (especially for high-resolution video).