High Efficiency Video Coding
CS010-709 Seminar Report
Department of Computer Science & Engineering
Rajagiri School of Engineering and Technology, Kochi – 39

1. INTRODUCTION

1.1 PREAMBLE
High Efficiency Video Coding (HEVC) is the newest video coding standard of the
ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts
Group. The main goal of the HEVC standardization effort is to enable
significantly improved compression performance relative to existing standards—
in the range of 50% bit-rate reduction for equal perceptual video quality.

1.2 SCOPE AND OBJECTIVE


HEVC standard is the most recent joint video project of the ITU-T Video Coding
Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG)
standardization organizations, working together in a partnership known as the
Joint Collaborative Team on Video Coding (JCT-VC).
An increasing diversity of services, the growing popularity of HD video
and the emergence of beyond-HD formats (e.g., 4k×2k or 8k×4k resolution) are
creating even stronger needs for coding efficiency superior to H.264/MPEG-4
AVC’s capabilities. The need is even stronger when higher resolution is
accompanied by stereo or multiview capture and display. Moreover, the traffic
caused by video applications targeting mobile devices and tablet PCs, as well as
the transmission needs for video-on-demand services, are imposing severe
challenges on today’s networks. An increased desire for higher quality and
resolutions is also arising in mobile applications. HEVC has been designed to
address essentially all existing applications of H.264/MPEG-4 AVC and to
particularly focus on two key issues: increased video resolution and increased use
of parallel processing architectures. The syntax of HEVC is generic and should
also be generally suited for other applications that are not specifically mentioned
above.


1.3 ORGANIZATION OF THE REPORT


The seminar is on the topic of HEVC. The first chapter introduces High
Efficiency Video Coding. The following chapters survey the current state of the
art, describe the coding techniques used in HEVC, such as intrapicture and
interpicture prediction, and explain the high-level syntax of HEVC.

1.4 SUMMARY
The HEVC standard is designed to achieve multiple goals, including coding
efficiency, ease of transport system integration and data loss resilience, as well as
implementability using parallel processing architectures. HEVC standardization
effort is to enable significantly improved compression in the range of 50% bit-rate
reduction for equal perceptual video quality.


2. LITERATURE SURVEY AND CURRENT STATE OF THE ART

Video coding standards have evolved primarily through the development of the
well-known ITU-T and ISO/IEC standards. The ITU-T produced H.261 and H.263,
ISO/IEC produced MPEG-1 and MPEG-4 Visual, and the two organizations jointly
produced the H.262/MPEG-2 Video and H.264/MPEG-4 Advanced Video Coding
(AVC) standards. The two standards that were jointly produced have had a particularly
strong impact and have found their way into a wide variety of products that are
increasingly prevalent in our daily lives. Throughout this evolution, continued efforts
have been made to maximize compression capability and improve other characteristics
such as data loss robustness, while considering the computational resources that were
practical for use in products at the time of anticipated deployment of each standard. The
major video coding standard directly preceding the
HEVC project was H.264/MPEG-4 AVC, which was initially developed in the period
between 1999 and 2003, and then was extended in several important ways from 2003–
2009. H.264/MPEG-4 AVC has been an enabling technology for digital video in almost
every area that was not previously covered by H.262/MPEG-2 Video and has
substantially displaced the older standard within its existing application domains. It is
widely used for many applications, including broadcast of high definition (HD) TV
signals over satellite, cable, and terrestrial transmission systems, video content
acquisition and editing systems, camcorders, security applications, Internet and mobile
network video, Blu-ray Discs, and real-time conversational applications such as video
chat, video conferencing, and telepresence systems.

HEVC has been designed to address essentially all existing applications of
H.264/MPEG-4 AVC and to particularly focus on two key issues: increased video
resolution and increased use of parallel processing architectures.


3. High Efficiency Video Coding


High Efficiency Video Coding (HEVC) is the newest video coding standard of the
ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts
Group. As has been the case for all past ITU-T and ISO/IEC video coding standards,
in HEVC only the bitstream structure and syntax is standardized, as well as
constraints on the bitstream and its mapping for the generation of decoded pictures.
The mapping is given by defining the semantic meaning of syntax elements and a
decoding process such that every decoder conforming to the standard will produce
the same output when given a bitstream that conforms to the constraints of the
standard. This limitation of the scope of the standard permits maximal freedom to
optimize implementations in a manner appropriate to specific applications (balancing
compression quality, implementation cost, time to market, and other considerations).
However, it provides no guarantees of end-to-end reproduction quality, as it allows
even crude encoding techniques to be considered conforming.

3.1 HEVC Coding Design

The video coding layer of HEVC employs the same hybrid approach (inter-
/intrapicture prediction and 2-D transform coding) used in all video compression
standards since H.261. Fig. 1 depicts the block diagram of a hybrid video encoder,
which could create a bitstream conforming to the HEVC standard.

Fig. 1. Typical HEVC video encoder

An encoding algorithm producing an HEVC-compliant bitstream would typically
proceed as follows. Each picture is split into block-shaped regions, with the exact
block partitioning being conveyed to the decoder. The first picture of a video
sequence (and the first picture at each clean random access point into a video
sequence) is coded using only intrapicture prediction (that uses some prediction of
data spatially from region-to-region within the same picture, but has no dependence
on other pictures). For all remaining pictures of a sequence or between random
access points, interpicture temporally predictive coding modes are
typically used for most blocks. The encoding process for interpicture prediction
consists of choosing motion data comprising the selected reference picture and
motion vector (MV) to be applied for predicting the samples of each block. The
encoder and decoder generate identical interpicture prediction signals by applying
motion compensation (MC) using the MV and mode decision data, which are
transmitted as side information. The residual signal of the intra- or interpicture
prediction, which is the difference between the original block and its prediction, is
transformed by a linear spatial transform. The transform coefficients are then scaled,
quantized, entropy coded, and transmitted together with the prediction information.
The encoder duplicates the decoder processing loop (see gray-shaded boxes in Fig.
1) such that both will generate identical predictions for subsequent data. Therefore,
the quantized transform coefficients are reconstructed by inverse scaling and are then
inverse transformed to duplicate the decoded approximation of the residual signal.
The residual is then added to the prediction, and the result of that addition may then
be fed into one or two loop filters to smooth out artifacts induced by block-wise
processing and quantization. The final picture representation (that is a duplicate of
the output of the decoder) is stored in a decoded picture buffer to be used for the
prediction of subsequent pictures. In general, the order of encoding or decoding
processing of pictures often differs from the order in which they arrive from the
source, necessitating a distinction between the decoding order (i.e., bitstream order)
and the output order (i.e., display order) for a decoder.
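
To make this loop concrete, here is a minimal Python sketch of one hybrid-coding
step. It uses SciPy's floating-point DCT as a stand-in for the integer transform of a
real codec, and encode_block and its parameters are hypothetical names, not part of
any HEVC API:

    import numpy as np
    from scipy.fft import dctn, idctn  # float DCT as a stand-in transform

    def encode_block(original, prediction, q_step=16.0):
        """Toy hybrid-coding step: residual -> transform -> quantize,
        followed by the encoder-side reconstruction that duplicates
        what a decoder would compute."""
        residual = original.astype(np.float64) - prediction
        coeffs = dctn(residual, norm="ortho")   # linear spatial transform
        levels = np.round(coeffs / q_step)      # uniform quantization
        # Decoder-duplicating loop: inverse scale, inverse transform,
        # add the prediction back, clip to the 8-bit sample range.
        recon = idctn(levels * q_step, norm="ortho") + prediction
        recon = np.clip(np.round(recon), 0, 255).astype(np.uint8)
        return levels, recon

    # Usage: a flat (DC) prediction for a random 8x8 block.
    block = np.random.default_rng(0).integers(0, 256, (8, 8))
    pred = np.full((8, 8), block.mean())
    levels, recon = encode_block(block, pred)

The point mirrored from the text is that the encoder runs the same inverse scaling
and inverse transform as the decoder, so both sides predict subsequent data from
identical reconstructions.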

Video material to be encoded by HEVC is generally expected to be input as
progressive scan imagery (either due to the source video originating in that format or
resulting from deinterlacing prior to encoding). No explicit coding features are
present in the HEVC design to support the use of interlaced scanning, as interlaced
scanning is no longer used for displays and is becoming substantially less common
for distribution. However, a metadata syntax has been provided in HEVC to allow an
encoder to indicate that interlace-scanned video has been sent by coding each field
(i.e., the even or odd numbered lines of each video frame) of interlaced video as a
separate picture or that it has been sent by coding each interlaced frame as an HEVC
coded picture. This provides an efficient method of coding interlaced video without
burdening decoders with a need to support a special decoding process for it.


4. HEVC Video Coding Techniques


As in all prior ITU-T and ISO/IEC JTC 1 video coding standards since H.261, the
HEVC design follows the classic block-based hybrid video coding approach (as
depicted in Fig. 1). The basic source-coding algorithm is a hybrid of interpicture
prediction to exploit temporal statistical dependences, intrapicture prediction to exploit
spatial statistical dependences, and transform coding of the prediction residual signals
to further exploit spatial statistical dependences. There is no single coding element in
the HEVC design that provides the majority of its significant improvement in
compression efficiency in relation to prior video coding standards. It is, rather, a
plurality of smaller improvements that add up to the significant gain.

4.1 Sampled Representation of Pictures


For representing color video signals, HEVC typically uses a tristimulus YCbCr
color space with 4:2:0 sampling (although extension to other sampling formats is
straightforward, and is planned to be defined in a subsequent version). This separates
a color representation into three components called Y, Cb, and Cr. The Y component
is also called luma, and represents brightness. The two chroma components Cb and
Cr represent the extent to which the color deviates from gray toward blue and red,
respectively. Because the human visual system is more sensitive to luma than
chroma, the 4:2:0 sampling structure is typically used, in which each chroma
component has one fourth of the number of samples of the luma component (half the
number of samples in both the horizontal and vertical dimensions). Each sample for
each component is typically represented with 8 or 10 b of precision, and the 8-b case
is the more typical one.
The video pictures are typically progressively sampled with rectangular picture
sizes W×H, where W is the width and H is the height of the picture in terms of luma
samples. Each chroma component array, with 4:2:0 sampling, is then W/2×H/2.
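
As a rough illustration of this geometry, the sketch below downsamples
full-resolution chroma planes by simple 2×2 averaging (real systems use specified
filters and chroma siting; to_420 is a hypothetical helper):

    import numpy as np

    def to_420(y, cb, cr):
        """Downsample full-resolution chroma planes to 4:2:0 by 2x2
        averaging, so each chroma plane becomes (H/2) x (W/2)."""
        def down(c):
            h, w = c.shape
            return c.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        return y, down(cb), down(cr)

    # A W x H = 64 x 32 picture: luma stays 32x64 (rows x cols),
    # each chroma plane becomes 16x32.
    rng = np.random.default_rng(1)
    y, cb, cr = (rng.integers(0, 256, (32, 64)).astype(np.float64)
                 for _ in range(3))
    y, cb, cr = to_420(y, cb, cr)
    assert cb.shape == (16, 32) and cr.shape == (16, 32)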


Given such a video signal, the HEVC syntax partitions the pictures further as
described in the following sections.

4.2 Division of the Picture into Coding Tree Units


A picture is partitioned into coding tree units (CTUs), each of which contains a luma
coding tree block (CTB) and two chroma CTBs. A luma CTB covers a rectangular picture area of L×L
samples of the luma component and the corresponding chroma CTBs cover each
L/2×L/2 samples of each of the two chroma components. The value of L may be
equal to 16, 32, or 64.
The support of larger CTBs than in previous standards is particularly beneficial
when encoding high-resolution video content. The luma CTB and the two chroma
CTBs together with the associated syntax form a CTU. The CTU is the basic
processing unit used in the standard to specify the decoding process.
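
A minimal sketch of this grouping, assuming 4:2:0 sampling and picture dimensions
that are multiples of L (CTU and split_into_ctus are illustrative names; partial CTUs
at picture borders are handled as described in the next subsection):

    from dataclasses import dataclass
    import numpy as np

    @dataclass
    class CTU:
        """One coding tree unit: an LxL luma CTB plus the two
        co-located (L/2)x(L/2) chroma CTBs (4:2:0 sampling)."""
        luma: np.ndarray  # shape (L, L)
        cb: np.ndarray    # shape (L//2, L//2)
        cr: np.ndarray    # shape (L//2, L//2)

    def split_into_ctus(y, cb, cr, L=64):
        """Yield one CTU per LxL luma region, in raster order."""
        H, W = y.shape
        for ty in range(0, H, L):
            for tx in range(0, W, L):
                yield CTU(y[ty:ty + L, tx:tx + L],
                          cb[ty // 2:(ty + L) // 2, tx // 2:(tx + L) // 2],
                          cr[ty // 2:(ty + L) // 2, tx // 2:(tx + L) // 2])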

4.3 Division of the CTB into CBs


The blocks specified as luma and chroma CTBs can be directly used as coding
blocks (CBs) or can be further partitioned into multiple CBs. Partitioning is achieved using tree
structures. The tree partitioning in HEVC is generally applied simultaneously to both
luma and chroma, although exceptions apply when certain minimum sizes are
reached for chroma. The CTU contains a quadtree syntax that allows for splitting the
CBs to a selected appropriate size based on the signal characteristics of the region
that is covered by the CTB. The quadtree splitting process can be iterated until the
size for a luma CB reaches a minimum allowed luma CB size that is selected by the
encoder using syntax in the sequence parameter set (SPS) and is always 8×8 or larger (in units of luma
samples). The boundaries of the picture are defined in units of the minimum allowed
luma CB size. As a result, at the right and bottom edges of the picture, some CTUs
may cover regions that are partly outside the boundaries of the picture. This
condition is detected by the decoder, and the CTU quadtree is implicitly split as
necessary to reduce the CB size to the point where the entire CB will fit into the
picture.
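
The sketch below imitates this behavior: splits inside the picture follow a stand-in
for the encoder's signaled decision, while CBs that cross the picture boundary are
split implicitly, without any flag (split_cb and want_split are hypothetical names):

    def split_cb(x, y, size, pic_w, pic_h, min_size=8, want_split=None):
        """Recursively partition one CTB into CBs. A CB lying partly
        outside the picture is split implicitly until it fits; otherwise
        want_split (a callable) stands in for the signaled split flag."""
        fits = x + size <= pic_w and y + size <= pic_h
        must_split = not fits and size > min_size
        may_split = size > min_size and want_split and want_split(x, y, size)
        if must_split or may_split:
            half = size // 2
            cbs = []
            for dy in (0, half):
                for dx in (0, half):
                    if x + dx < pic_w and y + dy < pic_h:  # skip fully outside
                        cbs += split_cb(x + dx, y + dy, half,
                                        pic_w, pic_h, min_size, want_split)
            return cbs
        return [(x, y, size)]

    # A 64x64 CTU whose lower-right part falls outside a 96x48 picture
    # is trimmed by implicit splitting: [(64,0,32), (64,32,16), (80,32,16)].
    leaves = split_cb(64, 0, 64, pic_w=96, pic_h=48)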

4.4 Prediction Blocks (PBs) and PUs


The prediction mode for the coding unit (CU) is signaled as being intra or inter, according to
whether it uses intrapicture (spatial) prediction or interpicture (temporal) prediction.
When the prediction mode is signaled as intra, the PB size, which is the block size at
which the intrapicture prediction mode is established, is the same as the CB size for
all block sizes except for the smallest CB size that is allowed in the bitstream. For the
latter case, a flag is present that indicates whether the CB is split into four PB
quadrants that each have their own intrapicture prediction mode. The reason for
allowing this split is to enable distinct intrapicture prediction mode selections for
blocks as small as 4×4 in size. When the luma intrapicture prediction operates with
4×4 blocks, the chroma intrapicture prediction also uses 4×4 blocks (each covering
the same picture region as four 4×4 luma blocks).

Fig. 2. Modes for splitting a CB into PBs.

The luma and chroma PBs, together with the associated prediction syntax, form
the prediction unit (PU).
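
A small sketch of this rule, with intra_pbs as a hypothetical helper and part_split
standing in for the signaled flag:

    def intra_pbs(cb_x, cb_y, cb_size, min_cb_size=8, part_split=False):
        """Intra PB layout for one CB: identical to the CB, except that
        a CB of the minimum size may be split by a flag into four
        quadrant PBs, enabling 4x4 intra modes when min_cb_size is 8."""
        if cb_size == min_cb_size and part_split:
            h = cb_size // 2
            return [(cb_x + dx, cb_y + dy, h)
                    for dy in (0, h) for dx in (0, h)]
        return [(cb_x, cb_y, cb_size)]

    assert intra_pbs(0, 0, 8, part_split=True) == [(0, 0, 4), (4, 0, 4),
                                                   (0, 4, 4), (4, 4, 4)]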


4.5 Tree-Structured Partitioning Into Transform Blocks and Units


For residual coding, a CB can be recursively partitioned into transform blocks (TBs).
The partitioning is signaled by a residual quadtree. Only square CB and TB
partitioning is specified, where a block can be recursively split into quadrants, as
illustrated in Fig. 3. For a given luma CB of size M×M, a flag signals
whether it is split into four blocks of size M/2×M/2.

Fig. 3. Subdivision of a CTB into CBs and TBs: (a) CTB with its partitioning; (b) corresponding quadtree.

In contrast to previous standards, the HEVC design allows a TB to span across
multiple PBs for interpicture-predicted CUs to maximize the potential coding
efficiency benefits of the quadtree-structured TB partitioning.
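
A toy parser for such a residual quadtree might look as follows, where read_flag
stands in for a one-bit bitstream reader and the forced split above a maximum TB
size is a simplifying assumption:

    def parse_rqt(read_flag, x, y, size, min_tb=4, max_tb=32):
        """Parse a toy residual quadtree: at each node larger than the
        minimum TB size, one flag says whether the MxM block is split
        into four (M/2)x(M/2) blocks."""
        forced = size > max_tb  # too large to transform directly
        if size > min_tb and (forced or read_flag()):
            h = size // 2
            tbs = []
            for dy in (0, h):
                for dx in (0, h):
                    tbs += parse_rqt(read_flag, x + dx, y + dy, h,
                                     min_tb, max_tb)
            return tbs
        return [(x, y, size)]

    # Flags 1,0,1,0,0,0,0,0,0 split a 32x32 CB into one 16x16 TB,
    # four 8x8 TBs, and two more 16x16 TBs.
    bits = iter([1, 0, 1, 0, 0, 0, 0, 0, 0])
    tbs = parse_rqt(lambda: next(bits), 0, 0, 32)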

4.6 Slices and Tiles


A slice is a sequence of CTUs that are processed in raster-scan order. A
picture may be split into one or several slices as shown in Fig. 4(a) so that a picture is
a collection of one or more slices. Slices are self-contained in the sense that, given
the availability of the active sequence and picture parameter sets, their syntax
elements can be parsed from the bitstream and the values of the samples in the area
of the picture that the slice represents can be correctly decoded (except with regard to
the effects of in-loop filtering near the edges of the slice) without the use of any data
from other slices in the same picture. This means that prediction within the picture
(e.g., intrapicture spatial signal prediction or prediction of motion vectors) is not
performed across slice boundaries.

Fig. 4. Subdivision of a picture into (a) slices and (b) tiles; (c) illustration of wavefront parallel processing.

The main purpose of slices is resynchronization after data losses. Furthermore,
slices are often restricted to use a maximum number of bits, e.g., for packetized
transmission. Therefore, slices may often contain a highly varying number of CTUs
per slice in a manner dependent on the activity in the video scene. In addition to
slices, HEVC also defines tiles, which are self-contained and independently
decodable rectangular regions of the picture. The main purpose of tiles is to enable
the use of parallel processing architectures for encoding and decoding. Multiple tiles
may share header information by being contained in the same slice. Alternatively, a
single tile may contain multiple slices. A tile consists of a rectangularly arranged
group of CTUs (typically, but not necessarily, with all of them containing about the
same number of CTUs), as shown in Fig. 4(b). To assist with the granularity of data
packetization, dependent slices are additionally defined. Finally, with wavefront
parallel processing (WPP), a slice is divided into rows of CTUs. The decoding of each
row can begin as soon as a few decisions that are needed for prediction and adaptation
of the entropy coder have been made in the preceding row. This supports parallel
processing of rows of CTUs
by using several processing threads in the encoder or decoder (or both). An example
is shown in Fig. 4(c).
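
The wavefront dependency pattern can be sketched as a schedule in which each CTU
waits for its left neighbor and for the first two CTUs of the row above (in HEVC the
entropy-coder state is inherited after the second CTU of the preceding row;
wpp_schedule is purely illustrative):

    def wpp_schedule(rows, cols, lag=2):
        """Earliest time step at which each CTU can start under a
        wavefront schedule: CTU (r, c) waits for its left neighbor
        and for the first `lag` CTUs of the row above."""
        t = [[0] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                left = t[r][c - 1] if c else -1
                above = t[r - 1][min(c + lag - 1, cols - 1)] if r else -1
                t[r][c] = max(left, above) + 1
        return t

    for row in wpp_schedule(4, 8):
        print(row)  # each row starts two time steps after the one above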

4.7 Intrapicture Prediction


Intrapicture prediction operates according to the TB size, and previously decoded
boundary samples from spatially neighboring TBs are used to form the prediction
signal. Directional prediction with 33 different directional orientations is defined for
(square) TB sizes from 4×4 up to 32×32. The possible prediction directions are
shown in Fig. 5.

Fig. 5. Modes and directional orientations for intrapicture prediction


Alternatively, planar prediction (assuming an amplitude surface with a horizontal
and vertical slope derived from the boundaries) and DC prediction (a flat surface
with a value matching the mean value of the boundary samples) can also be used.
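
To illustrate the two nondirectional modes, here is a sketch of DC and planar
prediction from the boundary samples; the formulas follow the idea described above
rather than the exact HEVC equations, and all names are illustrative:

    import numpy as np

    def intra_dc(top, left):
        """DC prediction: fill the block with the mean of the
        already-decoded boundary samples above and to the left."""
        n = len(top)
        dc = int(round((top.sum() + left.sum()) / (2 * n)))
        return np.full((n, n), dc, dtype=np.int32)

    def intra_planar(top, left, top_right, bottom_left):
        """Simplified planar prediction: average a horizontal and a
        vertical linear interpolation across the block."""
        n = len(top)
        pred = np.empty((n, n))
        for y in range(n):
            for x in range(n):
                h = (n - 1 - x) * left[y] + (x + 1) * top_right
                v = (n - 1 - y) * top[x] + (y + 1) * bottom_left
                pred[y, x] = (h + v) / (2 * n)
        return np.round(pred).astype(np.int32)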

4.8 Interpicture Prediction

Interpicture prediction predicts temporal changes in an image sequence. Video is
encoded as a set of clean random access (CRA) pictures and pictures predicted from
them. The CRA picture syntax specifies the use of an independently coded picture at
the location of a random access point (RAP). A RAP is a location in a bitstream at
which a decoder can begin successfully decoding pictures without needing to decode
any pictures that appeared earlier in the bitstream. The pictures between RAPs are
coded predictively, relative to previously decoded pictures.
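
A minimal sketch of this prediction step, restricted to integer-sample motion vectors
(HEVC actually supports quarter-sample MV precision via interpolation filters); the
function names are illustrative, and the displaced block is assumed to lie inside the
reference picture:

    import numpy as np

    def motion_compensate(ref_pic, x, y, size, mv):
        """Integer-pel motion compensation: predict a size x size block
        at (x, y) from the reference picture displaced by mv = (dx, dy)."""
        dx, dy = mv
        return ref_pic[y + dy:y + dy + size, x + dx:x + dx + size]

    def inter_residual(cur_pic, ref_pic, x, y, size, mv):
        """Residual of interpicture prediction for one block."""
        pred = motion_compensate(ref_pic, x, y, size, mv)
        block = cur_pic[y:y + size, x:x + size]
        return block.astype(np.int32) - pred.astype(np.int32)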

4.9 Transform, Scaling, and Quantization

Two-dimensional transforms are computed by applying 1-D transforms in the
horizontal and vertical directions. The core transforms are finite-precision integer
approximations of the discrete cosine transform (DCT); together with quantization,
this yields lossy compression of the residual data. Because the transform matrices are
derived from scaled DCT basis functions, little additional scaling is required. As in
H.264/MPEG-4 AVC, uniform reconstruction quantization (URQ) is used in HEVC,
with quantization scaling matrices supported for the various transform block sizes.
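
A sketch of URQ with a simple dead-zone quantizer on the encoder side; the decoder
reconstructs uniformly spaced values, level * q_step (the rounding offset and the
function names are illustrative, since the standard constrains only the decoder side):

    import numpy as np

    def quantize(coeffs, q_step, offset=1.0 / 3.0):
        """Encoder-side dead-zone quantizer: the encoder is free to
        choose the rounding offset (smaller offsets widen the zone
        that maps to level 0)."""
        return (np.sign(coeffs)
                * np.floor(np.abs(coeffs) / q_step + offset)).astype(np.int32)

    def dequantize(levels, q_step):
        """Decoder side of URQ: reconstruction values are uniformly
        spaced multiples of the quantization step size."""
        return levels * q_step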

4.10 Entropy coding


Context adaptive binary arithmetic coding (CABAC) is used for entropy coding.
This is similar to the CABAC scheme in H.264/MPEG-4 AVC, but has undergone
several changes to increase its throughput speed (especially for parallel-processing
architectures) and its compression performance, and to reduce its context memory
requirements.
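
The context-adaptive idea can be sketched with a toy model that tracks the
probability of a bin being 1 and charges the ideal code length -log2(p) per coded
bin; real CABAC uses table-driven probability states and binary arithmetic coding,
both omitted here:

    from math import log2

    class ContextModel:
        """Toy adaptive binary model in the spirit of a CABAC context:
        it keeps an estimate of P(bin = 1) and adapts after every bin."""
        def __init__(self, p_one=0.5, rate=0.05):
            self.p_one, self.rate = p_one, rate

        def cost_and_update(self, bin_val):
            p = self.p_one if bin_val else 1.0 - self.p_one
            bits = -log2(max(p, 1e-12))  # ideal code length for this bin
            self.p_one += self.rate * (bin_val - self.p_one)
            return bits

    # Skewed bins cost well under one bit each once the model adapts.
    ctx = ContextModel()
    total = sum(ctx.cost_and_update(b) for b in [1] * 20 + [0, 1, 1, 1])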


5. High-Level Syntax of HEVC

The high-level syntax of HEVC contains numerous elements that have been inherited
from the network abstraction layer (NAL) of H.264/MPEG-4 AVC. The NAL provides
the ability to map the video
coding layer (VCL) data that represent the content of the pictures onto various transport
layers, including RTP/IP, ISO MP4, and H.222.0/MPEG-2 Systems, and provides a
framework for packet loss resilience.

5.1 Random Access and Bitstream Splicing Features


The new design supports special features to enable random access and
bitstream splicing. In H.264/MPEG-4 AVC, a bitstream must always start with an
instantaneous decoding refresh (IDR) access unit. An IDR access unit contains an independently coded picture, i.e.,
a coded picture that can be decoded without decoding any previous pictures in the
NAL unit stream. The presence of an IDR access unit indicates that no subsequent
picture in the bitstream will require reference to pictures prior to the picture that it
contains in order to be decoded. The IDR picture is used within a coding structure
known as a closed GOP (in which GOP stands for group of pictures). The new
clean random access (CRA) picture syntax specifies the use of an independently
coded picture at the location of a random access point (RAP), i.e., a location in a
bitstream at which a decoder can begin successfully decoding pictures without
needing to decode any pictures that appeared earlier in the bitstream, which
supports an efficient temporal coding order known as open GOP operation. Good
support of random access is critical for enabling channel switching, seek
operations, and dynamic streaming services. Some pictures that follow a CRA
picture in decoding order and precede it in display order may contain interpicture
prediction references to pictures that are not available at the decoder. These
nondecodable pictures must therefore be discarded by a decoder that starts its
decoding process at a CRA point. For this purpose, such nondecodable pictures
are identified as random access skipped leading (RASL) pictures. The location of
splice points from different original coded bitstreams can be indicated by broken
link access (BLA) pictures. A bitstream splicing operation can be performed by
simply changing the NAL unit type of a CRA picture in one bitstream to the value
that indicates a BLA picture and concatenating the new bitstream at the position
of a RAP picture in the other bitstream. A RAP picture may be an IDR, CRA, or
BLA picture, and both CRA and BLA pictures may be followed by RASL
pictures in the bitstream (depending on the particular value of the NAL unit type
used for a BLA picture). Any RASL pictures associated with a BLA picture must
always be discarded by the decoder, as they may contain references to pictures
that are not actually present in the bitstream due to a splicing operation. The other
type of picture that can follow a RAP picture in decoding order and precede it in
output order is the random access decodable leading (RADL) picture, which
cannot contain references to any pictures that precede the RAP picture in
decoding order. RASL and RADL pictures are collectively referred to as leading
pictures (LPs). Pictures that follow a RAP picture in both decoding order and
output order, which are known as trailing pictures, cannot contain references to
LPs for interpicture prediction.
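
A sketch of the decoder-side consequence described above: when decoding starts at
a CRA (or BLA) point, RASL pictures that follow it are discarded. The picture records
and type strings here are simplified stand-ins for NAL unit types:

    def decodable_pictures(pictures, start_index):
        """Return the POCs that can be decoded when decoding starts at
        the RAP at start_index; pictures are (nal_type, poc) tuples."""
        out, started = [], False
        for i, (nal_type, poc) in enumerate(pictures):
            if i == start_index and nal_type in ("IDR", "CRA", "BLA"):
                started = True
            if not started:
                continue
            if nal_type == "RASL" and i > start_index:
                continue  # leading picture referencing the skipped past
            out.append(poc)
        return out

    stream = [("IDR", 0), ("TRAIL", 1), ("CRA", 4), ("RASL", 2),
              ("RASL", 3), ("TRAIL", 5)]
    assert decodable_pictures(stream, 2) == [4, 5]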

5.2 Temporal Sublayering Support


Similar to the temporal scalability feature in the H.264/MPEG-4 AVC scalable
video coding (SVC) extension, HEVC specifies a temporal identifier in the NAL unit
header, which indicates a level in a hierarchical temporal prediction structure. This
was introduced to achieve temporal scalability without the need to parse parts of the
bitstream other than the NAL unit header.
Under certain circumstances, the number of decoded temporal sublayers can be
adjusted during the decoding process of one coded video sequence. The location of a
point in the bitstream at which sublayer switching is possible to begin decoding some
higher temporal layers can be indicated by the presence of temporal sublayer access
(TSA) pictures and stepwise TSA (STSA) pictures. At the location of a TSA picture,
it is possible to switch from decoding a lower temporal sublayer to decoding any
higher temporal sublayer, and at the location of an STSA picture, it is possible to
switch from decoding a lower temporal sublayer to decoding only one particular
higher temporal sublayer (but not the further layers above that, unless they also
contain STSA or TSA pictures).

Fig. 6. Example of a temporal prediction structure and the POC values, decoding
order, and RPS content for each picture
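
The resulting bitstream-thinning operation can be sketched as follows: because a
picture never references a higher sublayer, dropping all NAL units whose temporal
identifier exceeds a target leaves a decodable stream. The tuple representation below
is a simplification of the real NAL unit header:

    def extract_sublayers(nal_units, max_tid):
        """Keep only NAL units whose temporal ID does not exceed the
        target; nal_units are (temporal_id, payload) pairs."""
        return [(tid, data) for tid, data in nal_units if tid <= max_tid]

    # Halving the frame rate of a two-sublayer stream:
    full = [(0, "P0"), (1, "P1"), (0, "P2"), (1, "P3")]
    assert extract_sublayers(full, 0) == [(0, "P0"), (0, "P2")]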

5.3 Additional Parameter Sets


The video parameter set (VPS) has been added as metadata to describe the overall characteristics of
coded video sequences, including the dependences between temporal sublayers. The
primary purpose of this is to enable the compatible extensibility of the standard in
terms of signaling at the systems layer, e.g., when the base layer of a future extended
scalable or multiview bitstream would need to be decodable by a legacy decoder, but
for which additional information about the bitstream structure that is only relevant
for the advanced decoder would be ignored.


5.4 Reference Picture Sets and Reference Picture Lists

For multiple-reference picture management, a particular set of previously decoded
pictures needs to be present in the decoded picture buffer (DPB) for the decoding of
the remainder of the pictures in the bitstream. To identify these pictures, a list of
picture order count (POC) identifiers is transmitted in each slice header. The set of
retained reference pictures is called the reference picture set (RPS). Fig. 2 shows
POC values, decoding order, and RPSs for an example temporal prediction structure.

As in H.264/MPEG-4 AVC, there are two lists that are constructed as lists of
pictures in the DPB, and these are called reference picture list 0 and list 1. An index
called a reference picture index is used to identify a particular picture in one of these
lists. For uniprediction, a picture can be selected from either of these lists. For
biprediction, two pictures are selected—one from each list. When a list contains only
one picture, the reference picture index implicitly has the value 0 and does not need
to be transmitted in the bitstream.
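
A simplified sketch of how a decoder might apply an RPS and build the two lists; the
ordering shown (list 0 preferring past pictures, list 1 preferring future ones) is a
common default, and HEVC additionally provides explicit list-reordering syntax:

    def apply_rps(dpb, rps_pocs):
        """Keep only the reference pictures named by the slice header's
        RPS; everything else may be dropped from the DPB (a simplified
        view of the real marking process)."""
        keep = set(rps_pocs)
        return {poc: pic for poc, pic in dpb.items() if poc in keep}

    def build_ref_lists(rps_pocs, cur_poc):
        """List 0 prefers past pictures (nearest first), list 1 future."""
        past = sorted((p for p in rps_pocs if p < cur_poc), reverse=True)
        future = sorted(p for p in rps_pocs if p > cur_poc)
        return past + future, future + past  # list 0, list 1

    dpb = {0: "pic0", 2: "pic2", 8: "pic8", 9: "stale"}
    dpb = apply_rps(dpb, [0, 2, 8])
    l0, l1 = build_ref_lists([0, 2, 8], cur_poc=4)
    assert l0 == [2, 0, 8] and l1 == [8, 2, 0]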

The high-level syntax for identifying the RPS and establishing the reference
picture lists for interpicture prediction is more robust to data losses than in the prior
H.264/MPEG-4 AVC design, and is more amenable to such operations as random
access and trick mode operation (e.g., fast-forward, smooth rewind, seeking, and
adaptive bitstream switching).

A key aspect of this improvement is that the syntax is more explicit, rather than
depending on inferences from the stored internal state of the decoding process as it
decodes the bitstream picture by picture. Moreover, the associated syntax for these
aspects of the design is actually simpler than it had been for H.264/MPEG-4 AVC.


6. Questions and Clarifications

1. What is a SAO filter?


SAO stands for sample adaptive offset. It is an in-loop filter, applied after
the deblocking filter, that adds offsets to sample values to reduce artifacts
(such as banding and ringing) introduced by block-based processing and
quantization.

2. What is the compression rate of HEVC (H.265) compared to the previous
technique (i.e., H.264)?
When used well together, the features of the new design provide
approximately a 50% bit-rate savings for equivalent perceptual quality
relative to the performance of prior standards (especially for high-resolution
video).
3. What is intrapicture prediction?
The intrapicture prediction of HEVC operates in the spatial domain. It
exploits spatial relations between different blocks in a single picture to
predict blocks.
4. What do you mean by temporal prediction?
Temporal prediction means predicting images over time. The relation
between consecutive images in a timeline is exploited to predict the
images. Thus the entire sequence of images does not have to be stored
directly in the video. This reduces the size of the video file or stream.
5. What is the input to the encoder and what is its output?
The input to the encoder is a raw video sequence, and the output is a
compressed bitstream (which may be stored as a video file).


7. Conclusion
The emerging HEVC standard has been developed and standardized
collaboratively by both the ITU-T VCEG and ISO/IEC MPEG organizations. HEVC
represents a number of advances in video coding technology. Its video coding layer
design is based on conventional block-based motion compensated hybrid video
coding concepts, but with some important differences relative to prior standards.
When used well together, the features of the new design provide approximately a
50% bit-rate savings for equivalent perceptual quality relative to the performance of
prior standards (especially for high-resolution video).
