
HOW VIDEO CODEC WORKS
I'm not a runaway, but I have neither Twitter nor FB.

But you can still ask questions using #qacodec on Twitter.

PS: I do have a GitHub account: leandromoreira


#qacodec

MISSION: IMPOSSIBLE
OUR GOAL

To learn the whys, whats and hows about digital video compression.

WHY DO WE NEED COMPRESSION?

IN THE BEGINNING, A SINGLE PIXEL (PICTURE ELEMENT)

HOW CAN WE ENCODE THIS PIXEL?

1 bit

2 bits

1B = 8 bits

LET'S DEFINE AN IMAGE (2D IS ON)

width x height x bytes per pixel

4 x 4 x 1 = 16B

LET'S COLORIZE IT (RGB PRIMARY COLORS)

https://lumeniquessl.com/2012/03/01/12-in-12-for-2012-the-flicker-indicator-machine/
https://lightingstudio.wordpress.com/2012/03/27/week5-light-object-shadow-contrast/

NOW WE'RE DEALING WITH COLOR (3D IS ON)

width x height x color

THE COLOR COST

4 x 4 x 3 = 48B
BEFORE WE MOVE ON...

MATRIX OF NUMBERS

15 0 0 0

15 15 15 15
0
20 10 0 0

20 10 10 10
0
16 13 13 13
25 14 14 14

BEHOLD, THE 4TH DIMENSION

time

4 x 4 x 3 x 30 = 1440B
A SINGLE TV SHOW EPISODE

1080p, 24 fps, 30 min long

time in sec x fps x color x resolution

30 x 60 x 24 x 3 x 1080 x 1920 = 250.28 GiB (about 268.74 GB)

fps x color x resolution x bits

24 x 3 x 1080 x 1920 x 8 = 1.11 Gibit/s (about 1.19 Gbps)
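The arithmetic on these two slides can be checked with a few lines of Python. This is only a sketch under the slide's implicit assumptions (8 bits per color sample, 3 samples per pixel, no audio or container overhead); the function names are made up for illustration. It also shows that the figures 250.28 and 1.11 correspond to binary prefixes (GiB, Gibit/s) rather than decimal ones.

```python
# Raw (uncompressed) size and bitrate of a 30-minute 1080p episode at 24 fps.
# Assumption: 1 byte (8 bits) per color sample and 3 samples per pixel (RGB).
WIDTH, HEIGHT = 1920, 1080
FPS = 24
DURATION_S = 30 * 60
BYTES_PER_PIXEL = 3


def raw_size_bytes(width, height, fps, seconds, bytes_per_pixel):
    """Total bytes needed to store every pixel of every frame."""
    return width * height * bytes_per_pixel * fps * seconds


def raw_bitrate_bps(width, height, fps, bytes_per_pixel):
    """Bits that must be transferred per second of video."""
    return width * height * bytes_per_pixel * 8 * fps


size = raw_size_bytes(WIDTH, HEIGHT, FPS, DURATION_S, BYTES_PER_PIXEL)
rate = raw_bitrate_bps(WIDTH, HEIGHT, FPS, BYTES_PER_PIXEL)

print(f"{size / 2**30:.2f} GiB ({size / 1e9:.2f} GB)")          # 250.28 GiB (268.74 GB)
print(f"{rate / 2**30:.2f} Gibit/s ({rate / 1e9:.2f} Gbps)")    # 1.11 Gibit/s (1.19 Gbps)
```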
WHAT CAN WE DO?

exploit our vision
reduce repetitions in space
reduce repetitions in time

EXPLOITING OUR VISION



OUR EYES: AN OVERSIMPLIFICATION

The eye contains about 120M rod cells and 6M cone cells.

WE'RE BETTER AT SEEING LUMA THAN COLOR

AN ALTERNATIVE TO RGB

From RGB to YCbCr

Y = 0.299R + 0.587G + 0.114B


Cb = 0.564(B - Y) | Cr = 0.713(R - Y)

From YCbCr to RGB

R = Y + 1.402Cr | B = Y + 1.772Cb | G = Y - 0.344Cb - 0.714Cr

*ITU-R BT.601-7
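The conversion formulas above translate directly into code. The sketch below uses exactly the coefficients shown on the slide; the function names are illustrative, and it assumes Cb and Cr are left centered on zero (no +128 offset and no clamping to 0..255, which a real pipeline storing 8-bit samples would typically add).

```python
def rgb_to_ycbcr(r, g, b):
    """RGB -> YCbCr using the coefficients from the slide."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 0.564 * (b - y)
    cr = 0.713 * (r - y)
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """YCbCr -> RGB using the inverse formulas from the slide."""
    r = y + 1.402 * cr
    b = y + 1.772 * cb
    g = y - 0.344 * cb - 0.714 * cr
    return r, g, b


# Round trip for an arbitrary color: the rounded constants make the
# result approximate, but well within one 8-bit step.
y, cb, cr = rgb_to_ycbcr(200, 120, 50)
r, g, b = ycbcr_to_rgb(y, cb, cr)
print(round(r), round(g), round(b))  # 200 120 50
```

For a gray pixel (R = G = B) both chroma components come out as approximately zero, which is the point of the model: all the brightness lands in Y.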

COLOR MODEL YUV (YCBCR, YPBPR)

Y (luma), U (chroma blue), V (chroma red)

CHROMA SUBSAMPLING

[Figure: image planes with dimensions 1280 x 720 and 320 x 180]
24 bits per pixel

12 bits per pixel
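The drop from 24 to 12 bits per pixel follows from keeping luma at full resolution while storing each chroma plane at a reduced one. Below is a minimal sketch assuming 8-bit samples; the helper names are hypothetical, and `subsample` simply keeps one sample per block, whereas real encoders often average or filter the samples instead.

```python
def bits_per_pixel(h_factor, v_factor, bit_depth=8):
    """Average bits per pixel when both chroma planes are reduced by
    h_factor horizontally and v_factor vertically (luma is untouched)."""
    luma = bit_depth
    chroma = 2 * bit_depth / (h_factor * v_factor)
    return luma + chroma


def subsample(plane, h_factor=2, v_factor=2):
    """Keep the top-left sample of every h_factor x v_factor block."""
    return [row[::h_factor] for row in plane[::v_factor]]


print(bits_per_pixel(1, 1))  # 24.0 (no subsampling)
print(bits_per_pixel(2, 2))  # 12.0 (chroma halved in both directions)

cb_plane = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
print(subsample(cb_plane))  # [[1, 3], [9, 11]]
```

Halving chroma in both directions is the layout commonly called 4:2:0.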

time in sec x fps x color x resolution

30 x 60 x 24 x 3 x 1080 x 1920 = 250.28 GiB
30 x 60 x 24 x 1.5 x 1080 x 1920 = 125.14 GiB

CORRELATIONS IN TIME

TEMPORAL REDUNDANCY

frame 0 frame 1 frame 2 frame 3



ORIGINAL FRAMES

|||||||||| (103Kb) || (4Kb)



FRAME DIFFERENCE

FRAME DIFFERENCE COST



WHAT IS THE COST?

diff + reference

frame 0 frame 1
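The "diff + reference" idea can be sketched as follows: store frame 0 in full, then store frame 1 only as its per-pixel difference from frame 0. The tiny frames and function names here are made up for illustration; the point is that the difference is mostly zeros, which is much cheaper to encode than the frame itself.

```python
def frame_diff(reference, frame):
    """Per-pixel difference: frame - reference."""
    return [
        [cur - ref for ref, cur in zip(ref_row, cur_row)]
        for ref_row, cur_row in zip(reference, frame)
    ]


def reconstruct(reference, diff):
    """Rebuild the frame from the reference plus the stored difference."""
    return [
        [ref + d for ref, d in zip(ref_row, diff_row)]
        for ref_row, diff_row in zip(reference, diff)
    ]


# Hypothetical frames: a bright pixel moves one position to the right.
frame0 = [[10, 10, 10, 10], [10, 50, 10, 10], [10, 10, 10, 10]]
frame1 = [[10, 10, 10, 10], [10, 10, 50, 10], [10, 10, 10, 10]]

diff = frame_diff(frame0, frame1)
print(diff)  # [[0, 0, 0, 0], [0, -40, 40, 0], [0, 0, 0, 0]]
print(reconstruct(frame0, diff) == frame1)  # True
```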
CAN WE DO BETTER?

DESCRIBE THE MOTION (ESTIMATION)

frame 1: x=12, y=25
frame 2: x=34, y=26
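With the positions from the slide, the motion of the object is simply the displacement (34 - 12, 26 - 25) = (22, 1). How an encoder finds that displacement is not shown on the slide; one simple approach, used here purely as an assumption, is an exhaustive block-matching search that minimizes the sum of absolute differences (SAD). All names and the 6 x 6 test frames are hypothetical.

```python
def sad(frame, block, top, left):
    """Sum of absolute differences between block and the same-sized area of frame."""
    return sum(
        abs(frame[top + i][left + j] - block[i][j])
        for i in range(len(block))
        for j in range(len(block[0]))
    )


def find_motion_vector(prev_frame, cur_frame, top, left, size, search=2):
    """Search prev_frame around (top, left) for the best match of the
    size x size block of cur_frame located at (top, left).
    Returns (dx, dy, cost); (dx, dy) points from the block to its source."""
    block = [row[left:left + size] for row in cur_frame[top:top + size]]
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            t, l = top + dy, left + dx
            if 0 <= t <= len(prev_frame) - size and 0 <= l <= len(prev_frame[0]) - size:
                cost = sad(prev_frame, block, t, l)
                if best is None or cost < best[0]:
                    best = (cost, dx, dy)
    cost, dx, dy = best
    return dx, dy, cost


def make_frame(obj_top, obj_left, n=6):
    """Black n x n frame with a 2 x 2 bright object at the given position."""
    frame = [[0] * n for _ in range(n)]
    for i in range(2):
        for j in range(2):
            frame[obj_top + i][obj_left + j] = 9
    return frame


slide_motion = (34 - 12, 26 - 25)
print(slide_motion)  # (22, 1)

prev_frame = make_frame(1, 1)
cur_frame = make_frame(2, 3)
print(find_motion_vector(prev_frame, cur_frame, top=2, left=3, size=2))  # (-2, -1, 0)
```

The vector (-2, -1) says "copy this block from 2 pixels to the left and 1 pixel up in the previous frame", and a cost of 0 means nothing else needs to be stored for that block.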

FRAME DIFFERENCE VS MOTION ESTIMATION


ffmpeg -flags2 +export_mvs -i in.mp4 -vf codecview=mv=pf+bf+bb out.mp4

SO, CAN WE JUST LINK ALL THE FRAMES?



TEMPORAL REDUNDANCY (INTER PREDICTION)

FRAMES THAT CAN'T BE EASILY EXPLOITED BY TEMPORAL REDUNDANCY

CORRELATIONS IN SPACE
LOTS OF SIMILARITIES
PATTERNS / DIRECTIONS
SPATIAL REDUNDANCY (INTRA PREDICTION)

unknown values

100 100 100 200
100 ??? ??? ???
100 ??? ??? ???
100 ??? ??? ???

direction of the prediction

100 100 100 200
100 100 100 200
100 100 100 200
100 100 100 200

real values

100 100 100 200
100 100 100 200
100 100 100 200
100 100 120 210

difference (highly compressible)

100 100 100 200
100   0   0   0
100   0   0   0
100   0  20  10
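The numbers on this slide can be reproduced with a vertical prediction: every unknown pixel is predicted from the already-known pixel directly above the block, and only the difference to the real value is stored. The sketch below covers the 3 x 3 unknown region (the first row and first column are the known neighbors); the function names are illustrative and not taken from any particular codec.

```python
def vertical_prediction(top_neighbors, rows):
    """Predict every row of the block as a copy of the row above it."""
    return [list(top_neighbors) for _ in range(rows)]


def subtract(actual, predicted):
    """Residual = real values - prediction."""
    return [[a - p for a, p in zip(ar, pr)] for ar, pr in zip(actual, predicted)]


def add(predicted, residual):
    """Decoder side: prediction + residual gives back the real values."""
    return [[p + r for p, r in zip(pr, rr)] for pr, rr in zip(predicted, residual)]


top_neighbors = [100, 100, 200]   # known pixels above the unknown region
real_values = [
    [100, 100, 200],
    [100, 100, 200],
    [100, 120, 210],
]

prediction = vertical_prediction(top_neighbors, rows=3)
residual = subtract(real_values, prediction)
print(residual)  # [[0, 0, 0], [0, 0, 0], [0, 20, 10]]
print(add(prediction, residual) == real_values)  # True
```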
WHAT IS THE COST OF INTRA PREDICTION?

direction of the prediction + residual values

100 100 100 200
100   0   0   0
100   0   0   0
100   0  20  10
SOME PREDICTION DIRECTIONS
PIED PIPER
OUR IMAGINARY VIDEO CODEC

* Pied Piper is a startup company focused on "multi-platform technology based on a proprietary universal compression algorithm" featured in the HBO series Silicon Valley
CODEC

A codec is a device or computer program for encoding or decoding a digital data stream or signal.
CODEC

Video Source -> Compress (enCOde) -> Compressed Video -> Decompress (DECode) -> Display

*Vcodex: Introduction to Video Coding


ENCODER BLOCKS

picture partitioning -> predictions -> transform -> quantization -> entropy coding

technique: redundancy removal, entropy reduction, lossless compression

implementation:
predictions: intra-prediction, inter-prediction, motion estimation / compensation
transform: dct, dwt, ...
quantization: linear, logarithm
entropy coding: huffman, lzw
FRAME PARTITIONING
(INTRA|INTER)-PREDICTION (SPACE AND TIME)

direction of the prediction

100 100 100 200
100 100 100 200
100 100 100 200
100 100 100 200


TRANSFORM

Double:  f(x) = x + x,  [3] => [6]
Plus10:  f(x) = x + 10, [3] => [13]
Divide2: f(x) = x / 2,  [3] => [1.5]


TRANSFORM (DCT)
https://www.iem.thm.de/telekom-labor/zinke/mk/mpeg2beg/whatisit.htm
WALKTHROUGH: DCT APPLIED
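Since the walkthrough images are not available here, a small pure-Python sketch may help show what applying the DCT does. It assumes the orthonormal DCT-II computed separably (rows first, then columns), which is one common formulation and not necessarily the exact variant used on the slides. The input is the 4 x 4 block from the prediction slides.

```python
import math


def dct_1d(values):
    """Orthonormal DCT-II of a list of numbers."""
    n = len(values)
    out = []
    for k in range(n):
        total = sum(
            values[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n)
        )
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * total)
    return out


def dct_2d(block):
    """2D DCT: transform every row, then every column of the result."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back


block = [
    [100, 100, 100, 200],
    [100, 100, 100, 200],
    [100, 100, 100, 200],
    [100, 100, 100, 200],
]

coeffs = dct_2d(block)
for row in coeffs:
    print([round(c, 2) for c in row])
```

Because every row of this block is identical, all the energy ends up in the first row of coefficients, with the top-left (DC) coefficient equal to 500 and the remaining three rows approximately zero. That concentration of values into a few coefficients is what the later stages exploit.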
QUANTIZATION
ORIGINAL VS QUANTIZED (50%)
ORIGINAL VS QUANTIZED (85%)

https://www.mathworks.com/help/images/discrete-cosine-transform.html
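The slides only show quantization through before/after images. As a rough sketch of the mechanism, assuming the simplest uniform quantizer with a single step size (real codecs typically use per-frequency step sizes), each coefficient is divided by the step and rounded; the decoder multiplies back and the rounding error is lost. The coefficient values below are hypothetical, chosen so that the result resembles the block used in the entropy coding slides.

```python
def quantize(coeffs, step):
    """Uniform quantization: divide by the step size and round to an integer."""
    return [[round(c / step) for c in row] for row in coeffs]


def dequantize(levels, step):
    """Decoder side: multiply back; the rounding error is not recoverable."""
    return [[level * step for level in row] for row in levels]


# Hypothetical transform coefficients (not taken from the slides).
coeffs = [
    [274.0, 96.0, 3.0, -2.0],
    [104.0, 4.0, -1.0, 0.0],
    [2.0, -3.0, 1.0, 0.0],
    [-1.0, 0.0, 0.0, 0.0],
]

levels = quantize(coeffs, step=10)
print(levels)
# [[27, 10, 0, 0], [10, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]

print(dequantize(levels, step=10)[0])  # [270, 100, 0, 0]
```

Small coefficients collapse to zero, which is where the information is discarded and also what makes the following entropy coding step effective.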
ENTROPY CODING VLC

27 10 0 0

10 0 0 0

0 0 0 0

0 0 0 0
ZIG-ZAG SCAN (FROM 2D TO 1D)

27 10 0 0
10 0 0 0
0 0 0 0
0 0 0 0

27,10,10,0,0,0,0,0,0,0,0,0,0,0,0,0
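A zig-zag scan walks the anti-diagonals of the block, alternating direction on each one, so that the (usually larger) low-frequency values in the top-left corner come first and the zeros cluster at the end. The sketch below assumes the JPEG-style scan order for a square block; the function name is illustrative.

```python
def zigzag(block):
    """Flatten a square block by walking its anti-diagonals, alternating
    direction: odd diagonals go top-right to bottom-left, even ones the
    other way."""
    n = len(block)
    order = sorted(
        ((i, j) for i in range(n) for j in range(n)),
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else -p[0]),
    )
    return [block[i][j] for i, j in order]


block = [
    [27, 10, 0, 0],
    [10, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
print(zigzag(block))
# [27, 10, 10, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

# Scanning a block that holds its own row-major index shows the visiting order.
index_block = [[4 * i + j for j in range(4)] for i in range(4)]
print(zigzag(index_block))
# [0, 1, 4, 8, 5, 2, 3, 6, 9, 12, 13, 10, 7, 11, 14, 15]
```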
PROBABILITY OF EACH SYMBOL

27,10,10,0,0,0,0,0,0,0,0,0,0,0,0,0

symbols       27     10     0
probability   1/16   2/16   13/16
BITCODE

symbols 27 10 0

probability 1/16 2/16 13/16

binary code 110 10 0


ENCODING BIT CODES

binary code 0 10 110

27,10,10,0,0,0,0,0,0,0,0,0,0,0,0,0 16B

110 10 10 0 0 0 .... 3B + symbol table

*CAVLC example
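The 16B to 3B claim can be verified by encoding the sequence with the code table from the previous slide (27 -> 110, 10 -> 10, 0 -> 0). This is a minimal variable-length coding sketch, not an implementation of CAVLC itself; the function names are made up. Decoding works bit by bit because no code is a prefix of another.

```python
import math

CODE_TABLE = {0: "0", 10: "10", 27: "110"}  # most frequent symbol gets the shortest code


def encode(symbols, table):
    """Concatenate the variable-length code of every symbol."""
    return "".join(table[s] for s in symbols)


def decode(bits, table):
    """Read bits until they match a code, emit the symbol, repeat."""
    inverse = {code: symbol for symbol, code in table.items()}
    symbols, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            symbols.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete code")
    return symbols


symbols = [27, 10, 10] + [0] * 13
bits = encode(symbols, CODE_TABLE)

print(bits)                       # 11010100000000000000
print(len(bits))                  # 20 bits
print(math.ceil(len(bits) / 8))   # 3 bytes, versus 16 bytes at one byte per symbol
print(decode(bits, CODE_TABLE) == symbols)  # True
```

The decoder also needs the code table, which is why the slide counts "3B + symbol table".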
FROM MANY BYTES TO FEW BITS

0001101100001010000010100000
0000000000000000000000000000
0000000000000000000000000000
0000000000000000000000000000
0000000000000000

=> 010101101101 + code table
ENCODER

picture partitioning -> predictions (subtract) -> transform -> quantization -> entropy coding

DECODER

entropy decoding -> dequantization -> inverse transform -> predictions (add) -> picture reconstructing

time in sec x fps x color x resolution

30 x 60 x 24 x 3 x 1080 x 1920 = 250.28 GiB
30 x 60 x 24 x 1.5 x 1080 x 1920 = 125.14 GiB
30 x 60 x 24 x 0.127 x 1080 x 1920 / 8 = 1.32 GiB (0.127 is in bits per pixel, hence the division by 8)

fps x color x resolution x bits

24 x 3 x 1080 x 1920 x 8 = 1.11 Gibit/s (about 1.19 Gbps)

fps x bits per pixel x resolution

24 x 0.127 x 1080 x 1920 ~= 6.03 Mibit/s (about 6.32 Mbps)
FINALLY, HOW DO NEWER CODECS BEAT OLDER ONES?
AVC @ 4Mbps HEVC @ 2Mbps
AVC @ 400kbps HEVC @ 400kbps
References
https://github.com/leandromoreira/digital_video_introduction
https://en.wikipedia.org/wiki/List_of_monochrome_and_RGB_palettes
https://globoplay.globo.com/os-dias-eram-assim/p/9955/
https://commons.wikimedia.org/wiki/File:Ressorts_de_compression_coniques.jpg
https://commons.wikimedia.org/wiki/File:Sony_DME7000_Digital_Video_Multi_Effects_Processor_(13577533863).jpg
https://commons.wikimedia.org/wiki/File:High-speed.jpg
https://www.youtube.com/watch?v=o0DYP-u1rNM
http://silicon-valley.wikia.com/wiki/Pied_Piper_(company)
http://www.nintendo.com/
SEE MORE:

bit.ly/intro_codec

LEANDRO MOREIRA
leandromoreira.com.br

Disclaimer: The views and opinions expressed in this presentation are those of the authors and do not
necessarily reflect the policy or position of Globo.
