How Video Codecs Work

HOW VIDEO CODECS WORK
I'm not a runaway, but I don't have either Twitter or FB.
But you can still ask questions using #qacodec on Twitter.
PS: I do have a GitHub: leandromoreira

#qacodec
MISSION: IMPOSSIBLE
OUR GOAL
To learn the whys, whats and hows about digital video compression.
WHY DO WE NEED
COMPRESSION?
#qacodec
IN THE BEGINNING A SINGLE PIXEL

(PICTURE ELEMENT)
#qacodec
HOW CAN WE ENCODE THIS PIXEL?
1 bit
2 bits
1B = 8 bits
#qacodec
LET'S DEFINE AN IMAGE

(2D IS ON)
width
height
4 x 4 x 1 = 16B
#qacodec
LET'S COLORIZE IT
(RGB PRIMARY COLORS)
X
#qacodec
https://lumeniquessl.com/2012/03/01/12-in-12-for-2012-the-flicker-indicator-machine/
https://lightingstudio.wordpress.com/2012/03/27/week5-light-object-shadow-contrast/
#qacodec
NOW WE'RE DEALING WITH COLOR

(3D IS ON)
width
height
color
#qacodec
THE COLOR COST
4 x 4 x 3 = 48B
BEFORE WE MOVE ON...
#qacodec
MATRIX OF NUMBERS
(Figure: the color planes of the image shown as matrices of numbers)
#qacodec
BEHOLD, THE 4TH DIMENSION
time
4 x 4 x 3 x 30 = 1440B
A SINGLE TV SHOW
EPISODE
#qacodec
1080p, 24 fps, 30 min long

time in sec x fps x color x resolution
(30 x 60) x 24 x 3 x (1080 x 1920) = 250.28GB (counting 1GB = 2^30 bytes)
#qacodec
fps x color x resolution x bits
24 x 3 x (1080 x 1920) x 8 = 1.11Gbps (counting 1Gb = 2^30 bits; about 1.19Gbps in decimal units)
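The arithmetic on the last slides can be checked with a short sketch. The function names here are illustrative, not part of the talk, and sizes are reported in binary units (2^30) because that is what reproduces the slide figures.

```python
GIB = 2 ** 30  # binary "giga", the unit that reproduces the slide figures


def raw_video_bytes(width, height, bytes_per_pixel, fps, seconds):
    """Uncompressed size in bytes: every pixel of every frame is stored."""
    return width * height * bytes_per_pixel * fps * seconds


def raw_video_bitrate(width, height, bytes_per_pixel, fps):
    """Uncompressed bits per second."""
    return width * height * bytes_per_pixel * 8 * fps


# The 4 x 4 RGB toy clip at 30 frames: 4 x 4 x 3 x 30
print(raw_video_bytes(4, 4, 3, 30, 1))        # 1440

# A 30 minute 1080p episode at 24 fps with 3 bytes per pixel
size = raw_video_bytes(1920, 1080, 3, 24, 30 * 60)
rate = raw_video_bitrate(1920, 1080, 3, 24)
print(f"{size / GIB:.2f} GiB")                # 250.28 GiB
print(f"{rate / GIB:.2f} Gibit/s")            # 1.11 Gibit/s
print(f"{rate / 1e9:.2f} Gbit/s (decimal)")   # 1.19 Gbit/s (decimal)
```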
WHAT CAN
WE DO?
#qacodec
exploit our vision
reduce repetitions in space
reduce repetitions in time
#qacodec
EXPLOITING OUR VISION

#qacodec
OUR EYES: AN OVERSIMPLIFICATION
The eye contains about 120M rod cells and 6M cone cells.
#qacodec
WE'RE BETTER AT SEEING LUMA THAN COLOR

#qacodec
AN ALTERNATIVE TO RGB
#qacodec
From RGB to YCbCr
Y = 0.299R + 0.587G + 0.114B

Cb = 0.564(B - Y) | Cr = 0.713(R - Y)
From YCbCr to RGB
R = Y + 1.402Cr | B = Y + 1.772Cb | G = Y - 0.344Cb - 0.714Cr
*ITU-R BT.601-7
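As a minimal sketch, the conversion formulas above translate directly into code. The function names are illustrative; the chroma values are kept centered on zero exactly as in the formulas (no offset or clamping is applied, which a real pipeline would usually add).

```python
def rgb_to_ycbcr(r, g, b):
    """RGB to YCbCr using the coefficients from the slide."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 0.564 * (b - y)
    cr = 0.713 * (r - y)
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """YCbCr back to RGB using the coefficients from the slide."""
    r = y + 1.402 * cr
    b = y + 1.772 * cb
    g = y - 0.344 * cb - 0.714 * cr
    return r, g, b


y, cb, cr = rgb_to_ycbcr(200, 120, 50)
print(y, cb, cr)
print(ycbcr_to_rgb(y, cb, cr))  # close to (200, 120, 50), up to coefficient rounding
```

A gray pixel (R = G = B) has Cb and Cr close to zero, which is why the luma plane alone already looks like a black and white version of the image.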
#qacodec
COLOR MODEL YUV

(YCBCR, YPBPR)
Y (luma) U (chroma blue) V (chroma red)

#qacodec
CHROMA SUBSAMPLING
(Figure: a 1280 x 720 plane beside a 320 x 180 subsampled plane)
#qacodec
24 bits per pixel
12 bits per pixel
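One way to get from 24 to 12 bits per pixel is to keep the luma plane at full resolution and halve each chroma plane in both directions. That particular layout is an assumption of this sketch (the slide only gives the two totals), and the function names are illustrative.

```python
def subsample_2x2(plane):
    """Average each 2x2 block of a chroma plane (even width and height assumed)."""
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[y][x] + plane[y][x + 1] + plane[y + 1][x] + plane[y + 1][x + 1]) // 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


def bits_per_pixel(width, height, chroma_width, chroma_height, bit_depth=8):
    """Average bits per pixel for one luma plane plus two chroma planes."""
    luma_bits = width * height * bit_depth
    chroma_bits = 2 * chroma_width * chroma_height * bit_depth
    return (luma_bits + chroma_bits) / (width * height)


print(bits_per_pixel(1280, 720, 1280, 720))  # 24.0, chroma at full resolution
print(bits_per_pixel(1280, 720, 640, 360))   # 12.0, chroma halved in both directions
print(subsample_2x2([[10, 20, 30, 40], [10, 20, 30, 40]]))  # [[15, 35]]
```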
30 x 60 x 24 x 3 x 1080 x 1920 = 250.28GB
30 x 60 x 24 x 1.5 x 1080 x 1920 = 125.14GB
#qacodec
CORRELATIONS IN TIME
#qacodec
TEMPORAL REDUNDANCY
frame 0 frame 1 frame 2 frame 3

#qacodec
ORIGINAL FRAMES
(Bar chart: 103Kb versus 4Kb)

#qacodec
FRAME DIFFERENCE
#qacodec
FRAME DIFFERENCE COST

#qacodec
WHAT IS THE COST?
diff + reference
frame 0 frame 1
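A minimal sketch of the frame difference idea: store a reference frame in full and, for the next frame, only what changed. The tiny 3 x 3 frames and function names are made up for illustration.

```python
def frame_difference(reference, current):
    """Per-pixel difference between the current frame and the reference frame."""
    return [
        [cur - ref for ref, cur in zip(ref_row, cur_row)]
        for ref_row, cur_row in zip(reference, current)
    ]


def apply_difference(reference, diff):
    """Rebuild the current frame from the reference frame plus the difference."""
    return [
        [ref + d for ref, d in zip(ref_row, diff_row)]
        for ref_row, diff_row in zip(reference, diff)
    ]


frame0 = [[10, 10, 10], [10, 50, 10], [10, 10, 10]]
frame1 = [[10, 10, 10], [10, 10, 50], [10, 10, 10]]  # the bright pixel moved right

diff = frame_difference(frame0, frame1)
print(diff)  # [[0, 0, 0], [0, -40, 40], [0, 0, 0]]
print(apply_difference(frame0, diff) == frame1)  # True
```

Most of the difference is zero, which is cheap to encode, but a moving object still leaves two non-zero patches: where it was and where it went.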
CAN WE DO BETTER?
#qacodec
DESCRIBE THE MOTION

(ESTIMATION)
frame 1: x=12, y=25
frame 2: x=34, y=26
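Describing the motion means finding where a block went and storing only the displacement. The sketch below uses an exhaustive search with the sum of absolute differences (SAD) as the matching cost; that choice, the toy frames and the function names are assumptions for illustration, and real encoders use much faster search strategies.

```python
def sad(frame, block, x, y):
    """Sum of absolute differences between block and the frame area at (x, y)."""
    return sum(
        abs(frame[y + j][x + i] - block[j][i])
        for j in range(len(block))
        for i in range(len(block[0]))
    )


def find_block(frame, block):
    """Exhaustive search: the (x, y) position in frame with the smallest SAD."""
    block_h, block_w = len(block), len(block[0])
    best = None
    for y in range(len(frame) - block_h + 1):
        for x in range(len(frame[0]) - block_w + 1):
            cost = sad(frame, block, x, y)
            if best is None or cost < best[0]:
                best = (cost, x, y)
    return best[1], best[2]


def motion_vector(old_pos, new_pos):
    """Displacement (dx, dy) from the old position to the new one."""
    return new_pos[0] - old_pos[0], new_pos[1] - old_pos[1]


def make_frame(width, height, block, x, y):
    """Black frame with the block pasted at (x, y); helper for the example only."""
    frame = [[0] * width for _ in range(height)]
    for j, row in enumerate(block):
        for i, value in enumerate(row):
            frame[y + j][x + i] = value
    return frame


ball = [[9, 8], [7, 6]]
frame2 = make_frame(6, 6, ball, 3, 2)             # the block now sits at (3, 2)
print(find_block(frame2, ball))                   # (3, 2)
print(motion_vector((1, 1), find_block(frame2, ball)))  # (2, 1)
print(motion_vector((12, 25), (34, 26)))          # (22, 1), the positions on the slide
```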
#qacodec
FRAME DIFFERENCE VS MOTION ESTIMATION

ffmpeg -flags2 +export_mvs -i in.mp4 -vf codecview=mv=pf+bf+bb out.mp4
#qacodec
SO, CAN WE JUST LINK ALL THE FRAMES?

#qacodec
TEMPORAL REDUNDANCY
(INTER PREDICTION)

FRAMES WHERE TEMPORAL REDUNDANCY
CAN'T BE EASILY EXPLOITED
#qacodec
CORRELATIONS IN SPACE
LOTS OF
SIMILARITIES
PATTERNS /
DIRECTIONS
SPATIAL REDUNDANCY
(INTRA PREDICTION)
unknown values
100 100 100 200
100 ??? ??? ???
100 ??? ??? ???
100 ??? ??? ???

direction of the prediction
100 100 100 200
100 100 100 200
100 100 100 200
100 100 100 200

real values
100 100 100 200
100 100 100 200
100 100 100 200
100 100 120 210

difference (highly compressible)
100 100 100 200
100   0   0   0
100   0   0   0
100   0  20  10
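The example above can be sketched as a vertical prediction: every unknown value is guessed from the value at the top of its column, and only the error of that guess is kept. As on the slide, the top row and the left column are treated as already known. Function names are illustrative.

```python
def vertical_residual(block):
    """Keep the top row and left column; replace the rest by (real - predicted)."""
    top = block[0]
    residual = [list(top)]
    for row in block[1:]:
        residual.append([row[0]] + [row[x] - top[x] for x in range(1, len(row))])
    return residual


def vertical_reconstruct(residual):
    """Inverse step: add the prediction from the top row back to the residual."""
    top = residual[0]
    block = [list(top)]
    for row in residual[1:]:
        block.append([row[0]] + [row[x] + top[x] for x in range(1, len(row))])
    return block


real = [
    [100, 100, 100, 200],
    [100, 100, 100, 200],
    [100, 100, 100, 200],
    [100, 100, 120, 210],
]

residual = vertical_residual(real)
for row in residual:
    print(row)
# [100, 100, 100, 200]
# [100, 0, 0, 0]
# [100, 0, 0, 0]
# [100, 0, 20, 10]
print(vertical_reconstruct(residual) == real)  # True
```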
WHAT IS THE COST OF INTRA PREDICTION?
direction of the prediction, plus the residual values:

100 100 100 200
100   0   0   0
100   0   0   0
100   0  20  10
SOME PREDICTION DIRECTIONS
PIED PIPER
OUR IMAGINARY VIDEO CODEC
* Pied Piper is a startup company focused on "multi-platform technology based on a proprietary universal compression algorithm" featured in the HBO series Silicon Valley
CODEC
A codec is a device or computer program for encoding or decoding a digital data stream or signal.
CODEC
Video Source -> Compress (enCOde) -> Compressed Video -> Decompress (DECode) -> Display
*Vcodex: Introduction to Video Coding


ENCODER BLOCKS
picture partitioning -> predictions -> transform -> quantization -> entropy coding

predictions
  technique: redundancy removal
  implementation: intra-prediction, inter-prediction, motion estimation / compensation
transform
  implementation: dct, dwt
quantization
  technique: entropy reduction
  implementation: linear, logarithm
entropy coding
  technique: lossless compression
  implementation: huffman, lzw ...

FRAME PARTITIONING

(INTRA|INTER)-PREDICTION
(SPACE AND TIME)
direction of the prediction
100 100 100 200
100 100 100 200
100 100 100 200
100 100 100 200


TRANSFORM
Double [3] f(x): x + x => [ 6]
Plus10 [3] f(x): x + 10 => [ 13]
Divide2 [3] f(x): x / 2 => [1.5]

TRANSFORM
(DCT)
https://www.iem.thm.de/telekom-labor/zinke/mk/mpeg2beg/whatisit.htm
WALKTHROUGH
DCT APPLIED
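A minimal sketch of a 2D DCT in pure Python, assuming the orthonormal DCT-II variant and a 4 x 4 block (the slides do not say which variant or block size they use). It shows the property that matters for compression: a smooth block ends up with its energy concentrated in a few coefficients, and the transform itself loses nothing.

```python
import math


def dct_1d(values):
    """Orthonormal DCT-II of a list of samples."""
    n = len(values)
    out = []
    for k in range(n):
        total = sum(values[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * total)
    return out


def idct_1d(coeffs):
    """Inverse of dct_1d."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = coeffs[0] * math.sqrt(1 / n)
        total += sum(
            coeffs[k] * math.sqrt(2 / n) * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(total)
    return out


def transpose(matrix):
    return [list(row) for row in zip(*matrix)]


def dct_2d(block):
    """Apply the 1D transform to every row, then to every column."""
    rows = [dct_1d(row) for row in block]
    return transpose([dct_1d(col) for col in transpose(rows)])


def idct_2d(coeffs):
    """Undo dct_2d: inverse on columns, then on rows."""
    cols = transpose([idct_1d(col) for col in transpose(coeffs)])
    return [idct_1d(row) for row in cols]


flat = [[100] * 4 for _ in range(4)]
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 6))  # 400.0: a flat block is a single DC coefficient
restored = idct_2d(coeffs)
print([[round(v) for v in row] for row in restored] == flat)  # True
```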
QUANTIZATION

ORIGINAL VS QUANTIZED (50%)
ORIGINAL VS QUANTIZED (85%)
https://www.mathworks.com/help/images/discrete-cosine-transform.html
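A minimal sketch of linear (uniform) quantization, the simplest of the options the encoder overview lists: divide each coefficient by a step size and round. The coefficient values and the step of 10 are made up for illustration; they are not taken from the slides.

```python
def quantize(coeffs, step):
    """Divide by the step and round: small coefficients collapse to zero."""
    return [[round(c / step) for c in row] for row in coeffs]


def dequantize(levels, step):
    """Decoder side: multiply back. The rounding error is gone for good."""
    return [[level * step for level in row] for row in levels]


# Hypothetical transform coefficients for one 4 x 4 block
coeffs = [
    [270.3, 101.2, 3.9, -2.1],
    [98.7, 4.4, -1.2, 0.8],
    [2.5, -0.9, 0.3, 0.1],
    [-1.7, 0.6, -0.2, 0.0],
]

levels = quantize(coeffs, 10)
for row in levels:
    print(row)
# [27, 10, 0, 0]
# [10, 0, 0, 0]
# [0, 0, 0, 0]
# [0, 0, 0, 0]
print(dequantize(levels, 10)[0])  # [270, 100, 0, 0]
```

A larger step throws away more detail and leaves more zeros, which is where the loss in lossy compression comes from.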

ENTROPY CODING VLC
27 10 0 0
10 0 0 0
0 0 0 0
0 0 0 0
ZIG-ZAG SCAN
(FROM 2D TO 1D)
27 10 0 0
10 0 0 0
0 0 0 0
0 0 0 0
27,10,10,0,0,0,0,0,0,0,0,0,0,0,0,0
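The scan can be sketched by walking the anti-diagonals of the block and alternating direction on each one. This is the common zig-zag order for square blocks; the function name is illustrative.

```python
def zigzag(block):
    """Read a square block along its anti-diagonals, alternating direction."""
    n = len(block)
    out = []
    for s in range(2 * n - 1):  # s = row + column is constant on a diagonal
        coords = [(y, s - y) for y in range(n) if 0 <= s - y < n]
        if s % 2 == 0:
            coords.reverse()  # even diagonals run bottom-left to top-right
        out.extend(block[y][x] for y, x in coords)
    return out


block = [
    [27, 10, 0, 0],
    [10, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
print(zigzag(block))
# [27, 10, 10, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
```

The point of the scan is to push the zeros together at the end of the sequence, where they are cheap to encode.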
PROBABILITY OF EACH SYMBOL
27,10,10,0,0,0,0,0,0,0,0,0,0,0,0,0
symbols      27    10    0
probability  1/16  2/16  13/16
BITCODE
symbols 27 10 0
probability 1/16 2/16 13/16
binary code 110 10 0

ENCODING BIT CODES
symbols      0  10  27
binary code  0  10  110
27,10,10,0,0,0,0,0,0,0,0,0,0,0,0,0 = 16B
110 10 10 0 0 0 .... = 3B + symbol table
*CAVLC example
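A minimal sketch of the variable-length coding step, using the code table from the slides (0 -> 0, 10 -> 10, 27 -> 110). The function names are illustrative. Decoding works bit by bit because no code is a prefix of another.

```python
CODE_TABLE = {0: "0", 10: "10", 27: "110"}  # most frequent symbol, shortest code


def encode(symbols, table):
    """Concatenate the bit code of every symbol."""
    return "".join(table[s] for s in symbols)


def decode(bits, table):
    """Read bits until they match a code, emit its symbol, start over."""
    inverse = {code: symbol for symbol, code in table.items()}
    symbols, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            symbols.append(inverse[current])
            current = ""
    return symbols


sequence = [27, 10, 10] + [0] * 13
bits = encode(sequence, CODE_TABLE)
print(bits)                  # 11010100000000000000
print(len(bits))             # 20 bits, which fits in 3 bytes
print(len(sequence))         # 16 symbols, 16 bytes at one byte each
print(decode(bits, CODE_TABLE) == sequence)  # True
```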
FROM MANY BYTES TO FEW BITS
0001101100001010000010100000
0000000000000000000000000000
0000000000000000000000000000
0000000000000000000000000000
0000000000000000
-> 010101101101 + code table

ENCODER
picture partitioning -> predictions (subtract) -> transform -> quantization -> entropy coding
DECODER
entropy decoding -> dequantization -> inverse transform -> predictions (add) -> picture reconstructing

30 x 60 x 24 x 3 x 1080 x 1920 = 250.28GB
30 x 60 x 24 x 1.5 x 1080 x 1920 = 125.14GB
30 x 60 x 24 x 0.127 x 1080 x 1920 / 8 = 1.32GB (0.127 is in bits per pixel, hence the division by 8)
#qacodec
fps x color x resolution x bits
24 x 3 x (1080 x 1920) x 8 = 1.11Gbps
fps x bits per pixel x resolution
24 x 0.127 x (1080 x 1920) ~= 6.02Mbps
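To compare the three stages on equal terms, the sketch below expresses each one in bits per pixel: 24 for raw RGB, 12 after chroma subsampling, and 0.127 after the full codec. The 0.127 figure is read as bits per pixel because that is the reading that reproduces the slide totals; the function name is illustrative.

```python
def episode_size_gib(bits_per_pixel, width=1920, height=1080, fps=24, seconds=30 * 60):
    """Size of the episode in binary gigabytes for a given average bits per pixel."""
    total_bits = bits_per_pixel * width * height * fps * seconds
    return total_bits / 8 / 2 ** 30


for label, bpp in [("raw RGB", 24), ("chroma subsampled", 12), ("after the codec", 0.127)]:
    print(f"{label}: {episode_size_gib(bpp):.2f}")
# raw RGB: 250.28
# chroma subsampled: 125.14
# after the codec: 1.32

print(f"overall ratio: about {24 / 0.127:.0f} to 1")  # overall ratio: about 189 to 1
```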
FINALLY, HOW DO NEWER CODECS BEAT OLDER ONES?
AVC @ 4Mbps vs HEVC @ 2Mbps
AVC @ 400kbps vs HEVC @ 400kbps
References
https://github.com/leandromoreira/digital_video_introduction
https://en.wikipedia.org/wiki/List_of_monochrome_and_RGB_palettes
https://globoplay.globo.com/os-dias-eram-assim/p/9955/
https://commons.wikimedia.org/wiki/File:Ressorts_de_compression_coniques.jpg
https://commons.wikimedia.org/wiki/File:Sony_DME7000_Digital_Video_Multi_Effects_Processor_(13577533863).jpg
https://commons.wikimedia.org/wiki/File:High-speed.jpg
https://www.youtube.com/watch?v=o0DYP-u1rNM
http://silicon-valley.wikia.com/wiki/Pied_Piper_(company)
http://www.nintendo.com/
SEE MORE:
bit.ly/intro_codec
LEANDRO MOREIRA
leandromoreira.com.br
Disclaimer: The views and opinions expressed in this presentation are those of the authors and do not
necessarily reflect the policy or position of Globo.
