In 1992, the Joint Photographic Experts Group first published a method of digital compression and
coding of continuous-tone still images in both grayscale and color, colloquially known as the JPEG
format. The method was included in International Telecommunication Union Recommendation T.81,
which includes three elements: 1) an encoding process for taking digital source images and creating
compressed image data; 2) a decoder for taking compressed image data and generating reconstructed
image data for display; and 3) an interchange format which describes specifications allowing the
compressed image data to be used in different application environments. The purpose of this paper is to
discuss the method of converting digital images in the red/green/blue format to a compressed JPEG file.
Background Information
Red, Green, Blue Color Space and Human Vision
The red, green, and blue (RGB) color space was based on the Young-Helmholtz theory that human color
vision is trichromatic, meaning that the eye contains photoreceptors which are sensitive to particular
ranges of visible light: a long-wavelength photoreceptor (L) sensitive to red light, a medium-wavelength
photoreceptor (M) sensitive to green light, and a short-wavelength photoreceptor (S) sensitive to blue
light. Colors are interpreted by the brain based on the relative strengths of the signals from each of
these photoreceptors. The three human photoreceptors will produce a signal response, as shown in
Figure 1, for any particular wavelength of visible light. For example, a light with a wavelength of
approximately 500 nm would output approximately 0.1 on the S photoreceptor, 0.3 on the M
photoreceptor, and 0.2 on the L photoreceptor.
Figure 1: Cone Response Curves (Output Signal Strength between 380 nm and 700 nm)
This also means that if three wavelengths of light, say 450 nm, 550 nm, and 650 nm, were simultaneously
transmitted to the eye at relative intensities that produced a similar output response, the brain would
actually think it is looking at light with a wavelength of 500 nm. This property of human vision is utilized
on computers and computer monitors in order to store image information and display it to users.
Figure 2: Shades of Red, Green, and Blue
Modern liquid crystal display (LCD) computer monitors have a built-in display resolution, which indicates
the number of discrete, addressable points on the monitor which can have a color assigned to them. Each
of these picture elements, or pixels, may have unique RGB color data associated with it. The actual
construction of each pixel is as shown in Figure 3. Each pixel is actually composed of three subpixels,
one for each color in the color space. The density of the subpixels is such that the eye cannot discern the
discrete subpixels. Each pixel can be thought of as a vector in $\mathbb{R}^3$, with a range as shown in Figure 4.
Figure 3: LCD Pixel Geometry
Contemporary display cables (HDMI and DisplayPort) include specifications for transmitting in multiple color
spaces.
Figure 4: RGB Color Cube
While computer memory is considered to be one giant vector (i.e., an $N \times 1$ matrix), it is helpful to
think of the display memory as a giant $m \times n$ matrix, where $m$ is the horizontal resolution of the
display, and $n$ is the vertical resolution of the display. For the purposes of manipulation of the matrices,
it is also helpful to consider each color as a separate matrix, meaning one $m \times n$ matrix for red, one
$m \times n$ matrix for green, and one $m \times n$ matrix for blue. During each display frame, the computer
will send the information in these matrices to the monitor, so the display image can be updated.
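As a minimal sketch of this idea, the Python snippet below splits a small image into three per-color matrices. The 2x3 `pixels` array and the `split_channels` helper are hypothetical names chosen for illustration, and each matrix is stored as a list of rows (one row per scan line), which is a storage assumption rather than something the text specifies.

```python
# Hypothetical 2-row by 3-column "display"; each pixel is an (R, G, B) tuple
# of 8-bit values.
pixels = [
    [(255, 0, 0), (0, 255, 0), (0, 0, 255)],
    [(10, 20, 30), (40, 50, 60), (70, 80, 90)],
]


def split_channels(image):
    """Return three matrices (lists of rows), one per color channel."""
    red = [[p[0] for p in row] for row in image]
    green = [[p[1] for p in row] for row in image]
    blue = [[p[2] for p in row] for row in image]
    return red, green, blue


red, green, blue = split_channels(pixels)
print(red)    # [[255, 0, 0], [10, 40, 70]]
print(green)  # [[0, 255, 0], [20, 50, 80]]
print(blue)   # [[0, 0, 255], [30, 60, 90]]
```

Keeping the channels separate makes the later per-channel steps (color conversion, block splitting, DCT) simple loops over one matrix at a time.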
Procedure
1. Convert the source image to the YCbCr color space.
2. Reduce the resolution of the chroma data (Cb and Cr), as the eye is less sensitive to this information.
3. Split the source image color space matrices into blocks of size 8x8. Perform Discrete Cosine
Transforms (DCT) on each block and store the resulting matrices for each color vector.
4. Quantize the color space matrices, giving more weight to low-frequency data compared to
high-frequency data.
5. The quantized matrices are further compressed with lossless compression algorithm(s), such as
variants of run-length encoding.
These procedures will be further expanded upon below.
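Steps 2 and 3 of the list are mostly bookkeeping, so a short sketch may help. The code below assumes that the chroma resolution is halved in both directions by averaging each 2x2 group of samples, and that the channel dimensions are exact multiples of the block size; the text does not state either choice, and the helper names `subsample_2x2` and `split_blocks` are hypothetical.

```python
def subsample_2x2(channel):
    """Halve the resolution in both directions by averaging 2x2 groups.

    Assumes the channel (a list of rows) has even height and width.
    """
    out = []
    for r in range(0, len(channel), 2):
        row = []
        for c in range(0, len(channel[0]), 2):
            total = (channel[r][c] + channel[r][c + 1]
                     + channel[r + 1][c] + channel[r + 1][c + 1])
            row.append(round(total / 4))
        out.append(row)
    return out


def split_blocks(channel, size=8):
    """Cut a channel into size x size blocks, left to right, top to bottom.

    Assumes both dimensions are exact multiples of `size`.
    """
    blocks = []
    for r in range(0, len(channel), size):
        for c in range(0, len(channel[0]), size):
            blocks.append([row[c:c + size] for row in channel[r:r + size]])
    return blocks


# Hypothetical 2x4 chroma channel, just to show the averaging.
cb = [[100, 102, 200, 202],
      [104, 106, 204, 206]]
print(subsample_2x2(cb))   # [[103, 203]]

# Hypothetical 8-row by 16-column channel, which yields two 8x8 blocks.
channel = [[r * 16 + c for c in range(16)] for r in range(8)]
blocks = split_blocks(channel)
print(len(blocks))         # 2
```

Real encoders also have to pad images whose dimensions are not multiples of the block size; that case is left out here to keep the sketch short.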
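The conversion from RGB to YCbCr is given by:

$$
\begin{bmatrix} Y \\ C_B \\ C_R \end{bmatrix} =
\begin{bmatrix} 1 & 0 & 0 \\ 0 & \frac{1}{1.772} & 0 \\ 0 & 0 & \frac{1}{1.402} \end{bmatrix}
\begin{bmatrix} 0.299 & 0.587 & 0.114 \\ -0.299 & -0.587 & 0.886 \\ 0.701 & -0.587 & -0.114 \end{bmatrix}
\begin{bmatrix} R \\ G \\ B \end{bmatrix}
$$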
This transformation ends up rotating and shearing the RGB color cube, as shown in Figure 5.
Figure 5: YCbCr Color Cube
While the transformation shown above is a linear transformation, the actual transformation performed
for the purposes of JPEG encoding is not a linear transformation, since it is of the form
$M\mathbf{x} + \mathbf{b}$, in order to normalize and clamp the values used by the computer to 8-bit
values (i.e., 0 through 255 inclusive). The revised formula is as follows:
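$$
\begin{bmatrix} Y \\ C_B \\ C_R \end{bmatrix} =
\min\left(\max\left(0,\ \operatorname{round}\left(
\begin{bmatrix} 1 & 0 & 0 \\ 0 & \frac{1}{1.772} & 0 \\ 0 & 0 & \frac{1}{1.402} \end{bmatrix}
\begin{bmatrix} 0.299 & 0.587 & 0.114 \\ -0.299 & -0.587 & 0.886 \\ 0.701 & -0.587 & -0.114 \end{bmatrix}
\begin{bmatrix} R \\ G \\ B \end{bmatrix} +
\begin{bmatrix} 0 \\ 128 \\ 128 \end{bmatrix}
\right)\right),\ 255\right)
$$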
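The clamped conversion can be sketched for a single pixel as follows. The function name `rgb_to_ycbcr` is hypothetical, and the signs of the coefficients are taken from the usual definitions (Cb proportional to B minus Y, Cr proportional to R minus Y), which is an assumption wherever the extracted text lost a minus sign.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to 8-bit YCbCr, with offset and clamping."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    # Chroma rows are (B - Y) and (R - Y), scaled by 1/1.772 and 1/1.402,
    # then shifted by 128 so that "no color" sits mid-range.
    cb = (-0.299 * r - 0.587 * g + 0.886 * b) / 1.772 + 128
    cr = (0.701 * r - 0.587 * g - 0.114 * b) / 1.402 + 128

    def clamp(value):
        # Python's round() sends exact .5 ties to the nearest even integer;
        # either tie rule gives the same results for the pixels shown below.
        return min(max(0, round(value)), 255)

    return clamp(y), clamp(cb), clamp(cr)


print(rgb_to_ycbcr(255, 255, 255))  # (255, 128, 128)
print(rgb_to_ycbcr(0, 0, 0))        # (0, 128, 128)
print(rgb_to_ycbcr(255, 0, 0))      # (76, 85, 255)
print(rgb_to_ycbcr(0, 0, 255))      # (29, 255, 107)
```

Pure red and pure blue show why the clamp is needed: their Cr and Cb values come out at 255.5 before rounding, which would not fit in 8 bits without it.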
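$$
G_{u,v} = \frac{1}{4}\,\alpha(u)\,\alpha(v) \sum_{x=0}^{7} \sum_{y=0}^{7} g_{x,y}
\cos\left[\frac{(2x+1)\,u\,\pi}{16}\right] \cos\left[\frac{(2y+1)\,v\,\pi}{16}\right]
$$

Where,
$u$ is the horizontal spatial frequency, from 0 to 7 inclusive
$v$ is the vertical spatial frequency, from 0 to 7 inclusive
$\alpha(u) = \begin{cases} \frac{1}{\sqrt{2}}, & \text{if } u = 0 \\ 1, & \text{otherwise} \end{cases}$
$g_{x,y}$ is the pixel value at coordinates x, y of the block
$G_{u,v}$ is the DCT coefficient at coordinates u, v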
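A direct, unoptimized transcription of the double sum may make the definitions above concrete. This is only a sketch: the names `alpha` and `dct_2d` are hypothetical, and treating the first list index as x is a convention of the sketch, not something the text fixes.

```python
import math


def alpha(k):
    """Normalizing factor: 1/sqrt(2) for frequency 0, otherwise 1."""
    return 1 / math.sqrt(2) if k == 0 else 1.0


def dct_2d(block):
    """8x8 DCT by direct summation; block[x][y] holds the pixel values."""
    coeffs = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            total = 0.0
            for x in range(8):
                for y in range(8):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            coeffs[u][v] = 0.25 * alpha(u) * alpha(v) * total
    return coeffs


# A constant block has no spatial variation, so only the (0, 0) coefficient
# is non-zero: (1/4) * (1/2) * 64 * 10 = 80.
flat = [[10] * 8 for _ in range(8)]
G = dct_2d(flat)
print(round(G[0][0], 6))  # 80.0
```

The four nested loops cost 4096 multiply-adds per block, which is why practical encoders use the matrix form or a fast DCT instead.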
Figure 6: Frequency Patterns (DCT Basis Functions)
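This transformation, as noted above, changes the basis of the image to one based on the DCT basis
functions. Since it is a two-dimensional transformation, the transformation is represented by the
following, where $A$ is the source image, $U$ is the DCT matrix, and $G$ is the resulting matrix of DCT
coefficients:

$$
G = U^{\mathsf{T}} A \, U
$$

Where,

$$
U = \frac{1}{2}
\begin{bmatrix}
\frac{\sqrt{2}}{2} & \cos\frac{\pi}{16} & \cos\frac{2\pi}{16} & \cos\frac{3\pi}{16} & \cos\frac{4\pi}{16} & \cos\frac{5\pi}{16} & \cos\frac{6\pi}{16} & \cos\frac{7\pi}{16} \\
\frac{\sqrt{2}}{2} & \cos\frac{3\pi}{16} & \cos\frac{6\pi}{16} & \cos\frac{9\pi}{16} & \cos\frac{12\pi}{16} & \cos\frac{15\pi}{16} & \cos\frac{18\pi}{16} & \cos\frac{21\pi}{16} \\
\frac{\sqrt{2}}{2} & \cos\frac{5\pi}{16} & \cos\frac{10\pi}{16} & \cos\frac{15\pi}{16} & \cos\frac{20\pi}{16} & \cos\frac{25\pi}{16} & \cos\frac{30\pi}{16} & \cos\frac{35\pi}{16} \\
\frac{\sqrt{2}}{2} & \cos\frac{7\pi}{16} & \cos\frac{14\pi}{16} & \cos\frac{21\pi}{16} & \cos\frac{28\pi}{16} & \cos\frac{35\pi}{16} & \cos\frac{42\pi}{16} & \cos\frac{49\pi}{16} \\
\frac{\sqrt{2}}{2} & \cos\frac{9\pi}{16} & \cos\frac{18\pi}{16} & \cos\frac{27\pi}{16} & \cos\frac{36\pi}{16} & \cos\frac{45\pi}{16} & \cos\frac{54\pi}{16} & \cos\frac{63\pi}{16} \\
\frac{\sqrt{2}}{2} & \cos\frac{11\pi}{16} & \cos\frac{22\pi}{16} & \cos\frac{33\pi}{16} & \cos\frac{44\pi}{16} & \cos\frac{55\pi}{16} & \cos\frac{66\pi}{16} & \cos\frac{77\pi}{16} \\
\frac{\sqrt{2}}{2} & \cos\frac{13\pi}{16} & \cos\frac{26\pi}{16} & \cos\frac{39\pi}{16} & \cos\frac{52\pi}{16} & \cos\frac{65\pi}{16} & \cos\frac{78\pi}{16} & \cos\frac{91\pi}{16} \\
\frac{\sqrt{2}}{2} & \cos\frac{15\pi}{16} & \cos\frac{30\pi}{16} & \cos\frac{45\pi}{16} & \cos\frac{60\pi}{16} & \cos\frac{75\pi}{16} & \cos\frac{90\pi}{16} & \cos\frac{105\pi}{16}
\end{bmatrix}
$$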
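The matrix form can be checked numerically with a short sketch. It assumes the DCT matrix has entries U[x][u] = (1/2) α(u) cos((2x+1)uπ/16), with α(0) = √2/2 and α(u) = 1 otherwise, and that the transform of a block A is computed as the product of the transpose of U, then A, then U. The helper names `dct_matrix`, `matmul`, and `transpose` are hypothetical.

```python
import math


def dct_matrix():
    """Build the 8x8 matrix with rows indexed by x and columns by u."""
    U = []
    for x in range(8):
        row = []
        for u in range(8):
            alpha = math.sqrt(2) / 2 if u == 0 else 1.0
            row.append(0.5 * alpha * math.cos((2 * x + 1) * u * math.pi / 16))
        U.append(row)
    return U


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(p, q):
    """Plain triple-loop matrix product of two lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))]
            for i in range(len(p))]


U = dct_matrix()

# The columns of U are orthonormal, so U^T U is (numerically) the identity.
UtU = matmul(transpose(U), U)

# Transform a constant block: only the (0, 0) coefficient survives.
A_flat = [[10] * 8 for _ in range(8)]
G_flat = matmul(matmul(transpose(U), A_flat), U)
print(round(G_flat[0][0], 6))  # 80.0
```

Because U is orthogonal, the inverse transform needs no separate matrix: multiplying the coefficients by U on the left and by its transpose on the right recovers the block.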
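Prior to performing the DCT on the YCbCr source image, it is usually level shifted so that the 8-bit values
are centered around 0, with an interval from -128 to 127, rather than being from an interval between 0
and 255. This is accomplished by subtracting 128 from each value. If $A$ was as follows before level
shifting:

$$
A = \begin{bmatrix}
154 & 153 & 154 & 160 & 164 & 165 & 173 & 184 \\
161 & 155 & 149 & 150 & 157 & 162 & 166 & 171 \\
188 & 185 & 176 & 168 & 167 & 168 & 162 & 155 \\
197 & 199 & 196 & 190 & 188 & 186 & 181 & 176 \\
200 & 199 & 198 & 200 & 201 & 197 & 194 & 197 \\
181 & 191 & 194 & 196 & 201 & 195 & 191 & 198 \\
134 & 145 & 161 & 173 & 181 & 188 & 191 & 191 \\
111 & 108 & 112 & 122 & 130 & 142 & 160 & 176
\end{bmatrix}
$$

then after level shifting every entry of $A$ would be reduced by 128; for example, the first entry, 154,
would become 26.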
This operation would be performed for each 8x8 block in the source image, for each dimension in the
image (Y, Cr, and Cb).
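The level shift itself is a one-line operation, sketched here on the example block from the text (read row by row, which is an assumption about how the values were laid out).

```python
# Example 8x8 block before level shifting (values 0..255).
A = [
    [154, 153, 154, 160, 164, 165, 173, 184],
    [161, 155, 149, 150, 157, 162, 166, 171],
    [188, 185, 176, 168, 167, 168, 162, 155],
    [197, 199, 196, 190, 188, 186, 181, 176],
    [200, 199, 198, 200, 201, 197, 194, 197],
    [181, 191, 194, 196, 201, 195, 191, 198],
    [134, 145, 161, 173, 181, 188, 191, 191],
    [111, 108, 112, 122, 130, 142, 160, 176],
]

# Subtract 128 from every entry so values fall in -128..127.
shifted = [[value - 128 for value in row] for row in A]

print(shifted[0])  # [26, 25, 26, 32, 36, 37, 45, 56]
print(shifted[7])  # [-17, -20, -16, -6, 2, 14, 32, 48]
```

Centering the values on zero mainly reduces the size of the (0, 0) DCT coefficient, since that coefficient is proportional to the sum of the block.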
Quantize Resultant Matrices
Since the human eye is less able to distinguish the exact strength of high-frequency variations in a small
area, the coefficients representing these high-frequency components can be reduced, and in most cases,
eliminated (i.e., set equal to 0). This is accomplished by dividing each coefficient by a value from a
quantization matrix. The original JPEG standard utilized the following quantization matrix:
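A quantized matrix, $B$, is created from the DCT coefficients according to the following:

$$
B_{u,v} = \operatorname{round}\left(\frac{G_{u,v}}{Q_{u,v}}\right)
$$

where $G_{u,v}$ is the DCT coefficient at coordinates u, v and $Q_{u,v}$ is the corresponding entry of the
quantization matrix $Q$.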
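The element-wise division and rounding can be sketched as below. The table `Q_LUMA` is the example luminance quantization table from Annex K of the JPEG standard; that it is the matrix the text refers to is an assumption. The coefficient matrix `G_example` is hypothetical apart from its (0, 0) entry, 353.25, which is what the level-shifted example block gives (its entries sum to 2826, and 2826 / 8 = 353.25).

```python
# Example luminance quantization table (JPEG standard, Annex K).
# Assumed here to be the matrix Q that the text refers to.
Q_LUMA = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def quantize(coeffs, q=Q_LUMA):
    """Divide each coefficient by its table entry and round to an integer.

    Python's round() resolves exact .5 ties to the nearest even integer,
    which may differ from the rounding rule a particular encoder uses.
    """
    return [[round(coeffs[u][v] / q[u][v]) for v in range(8)] for u in range(8)]


# Hypothetical coefficients: mostly zero, with three illustrative entries.
G_example = [[0.0] * 8 for _ in range(8)]
G_example[0][0] = 353.25   # (0, 0) coefficient of the example block
G_example[0][1] = -30.0    # made-up low-frequency value
G_example[7][7] = 40.0     # made-up high-frequency value

B_example = quantize(G_example)
print(B_example[0][0], B_example[0][1], B_example[7][7])  # 22 -3 0
```

The (0, 0) result, 22, agrees with the first entry of the run-length list given later in the text, while the high-frequency value of 40 is wiped out by its divisor of 99.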
If the DCT coefficients of A were quantized with quantization matrix Q, the following matrix, B, would be created:
Lossless Compression
As indicated in the quantized matrix B, shown above, most of the coefficients from the DCT have been
reduced to zero. Utilizing a run-length encoding scheme, where each value is coded with the number of
times that value appears (i.e., four zeroes in a row are encoded [0,4]), and utilizing a zigzag ordering as
shown in Figure 7, matrix B could be represented as follows:
[22,1],[1,1],[3,1],[0,1],[4,1],[14,1],[5,1],[5,1],[1,1],[0,3],[1,1],[1,1],[0,1],[1,1],[0,1],[1,1],
[0,49]
The entire matrix can be recreated based on the 34 values described above, rather than the 64 entries of
the full 8x8 block, which represents a compression of 47%.
Figure 7: Zig-Zag Ordering of 8x8 Block
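The zigzag scan and the [value, count] run-length scheme described above can be sketched as follows. The helper names are hypothetical, the block `B_demo` is made-up data rather than the matrix from the text, and the scan order assumes the usual JPEG zigzag that starts at the top-left corner and moves right first.

```python
def zigzag_indices(n=8):
    """Return (row, col) pairs in zigzag order for an n x n block."""
    order = []
    for s in range(2 * n - 1):            # s = row + col picks one diagonal
        diagonal = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        if s % 2 == 0:
            diagonal.reverse()            # even diagonals run bottom-left to top-right
        order.extend(diagonal)
    return order


def run_length_encode(values):
    """Encode a sequence as [value, count] pairs, as described in the text."""
    pairs = []
    for v in values:
        if pairs and pairs[-1][0] == v:
            pairs[-1][1] += 1
        else:
            pairs.append([v, 1])
    return pairs


# Hypothetical quantized block: three non-zero low-frequency entries.
B_demo = [[0] * 8 for _ in range(8)]
B_demo[0][0], B_demo[0][1], B_demo[1][0] = 22, -1, 3

scanned = [B_demo[r][c] for r, c in zigzag_indices()]
encoded = run_length_encode(scanned)
print(encoded)  # [[22, 1], [-1, 1], [3, 1], [0, 61]]
```

The zigzag order places low-frequency coefficients first, so the zeros produced by quantization collect into one long run at the end, which is what makes the run-length step effective. The baseline entropy coding in the standard uses a different pairing (zero-run length with the next non-zero value, followed by Huffman coding); the sketch follows the simpler scheme in the text.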