
Introduction

In 1992, the Joint Photographic Experts Group first published a method of digital compression and
coding of continuous-tone still images in both grayscale and color, colloquially known as the JPEG
format. The method was included in International Telecommunication Union Recommendation T.81,
which includes three elements: 1) an encoding process for taking digital source images and creating
compressed image data; 2) a decoder for taking compressed image data and generating reconstructed
image data for display; and 3) an interchange format which describes specifications allowing the
compressed image data to be used in different application environments. The purpose of this paper is to
discuss the method of converting digital images in the red/green/blue format to a compressed JPEG file.

Background Information
Red, Green, Blue Color Space and Human Vision
The red, green, and blue (RGB) color space is based on the Young-Helmholtz theory that human color
vision is trichromatic, meaning that the eye contains photoreceptors which are sensitive to particular
ranges of visible light: a long-wavelength photoreceptor (L) sensitive to red light, a medium-wavelength
photoreceptor (M) sensitive to green light, and a short-wavelength photoreceptor (S) sensitive to blue
light. Colors are interpreted by the brain based on the relative strengths of the signals from each of
these photoreceptors. The three human photoreceptors will produce a signal response, as shown in
Figure 1, for any particular wavelength of visible light. For example, a light with a wavelength of
approximately 500 nm would output approximately 0.1 on the S photoreceptor, 0.3 on the M
photoreceptor, and 0.2 on the L photoreceptor.

Figure 1: Cone Response Curves (Output Signal Strength between 380 nm and 700 nm)

This also means that if three wavelengths of light, say 450 nm, 550 nm, and 650 nm, were simultaneously
transmitted to the eye at relative intensities that produced a similar output response, the brain would
perceive the same color as light with a wavelength of 500 nm. This property of human vision is utilized
by computers and computer monitors in order to store image information and display it to users.

Computer/Monitor Interface and Display of Images


The connection between the computer and the monitor, if using a VGA or DVI cable [1], has one or more
links dedicated to transmitting red, green, and blue data. In order to display an image on a computer
monitor, the data for the image must be stored in a format that can easily be understood by the display
renderer, so it can push this information to the computer monitor on a real-time basis. Since many
computer monitors expect to see data in an RGB format, this is how the display information is stored in
computer memory.
For the purposes of this discussion, RGB format will mean that 8 bits are dedicated to each color
channel, meaning $2^8$, or 256, possible shades of red, green, and blue can be displayed on each
respective color channel. Figure 2 shows the shading available for the red, green, and blue color space.

Figure 2: Shades of Red, Green, and Blue

Modern liquid crystal display (LCD) computer monitors have a built-in display resolution, which indicates
the number of discrete, addressable points on the monitor which can each have a color assigned. Each of
these picture elements, or pixels, may have unique RGB color data associated with it. The actual
construction of each pixel is as shown in Figure 3. Each pixel is actually composed of three subpixels,
one for each color in the color space. The density of the subpixels is such that the eye cannot discern the
discrete subpixels. Each pixel can be thought of as a vector in $\mathbb{R}^3$, with a range as shown in
Figure 4.

Figure 3: LCD Pixel Geometry

[1] Contemporary display cables (HDMI and DisplayPort) include specifications for transmitting in
multiple color spaces.


Figure 4: RGB Color Cube

While computer memory is considered to be one giant vector (i.e., an $n \times 1$ matrix), it is helpful
to think of the display memory as a giant $H \times W$ matrix, where $W$ is the horizontal resolution of
the display and $H$ is the vertical resolution of the display. For the purposes of manipulation of the
matrices, it is also helpful to consider each color as a separate matrix, meaning one $H \times W$ matrix
for red, one $H \times W$ matrix for green, and one $H \times W$ matrix for blue. During each display
frame, the computer will send the information in these matrices to the monitor, so the display image can
be updated.
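The view of display memory as three separate color matrices can be sketched in a few lines of Python. This is only an illustration: the interleaved R, G, B byte order of the flat buffer and the function name `split_channels` are assumptions made for the example, not details given in the text.

```python
def split_channels(buffer, width, height):
    """Split a flat, interleaved RGB buffer into three height x width matrices.

    Assumes (hypothetically) that the buffer stores pixels row by row as
    R, G, B, R, G, B, ... with one 8-bit value per channel.
    """
    if len(buffer) != 3 * width * height:
        raise ValueError("buffer length does not match width * height * 3")
    # One matrix (list of rows) per color channel: red, green, blue.
    planes = [[[0] * width for _ in range(height)] for _ in range(3)]
    for row in range(height):
        for col in range(width):
            base = 3 * (row * width + col)  # index of this pixel's red value
            for channel in range(3):
                planes[channel][row][col] = buffer[base + channel]
    red, green, blue = planes
    return red, green, blue


# A 2 x 1 image: one pure red pixel followed by one pure green pixel.
red, green, blue = split_channels([255, 0, 0, 0, 255, 0], width=2, height=1)
```

Keeping one matrix per channel makes the later per-channel steps (color conversion, DCT, quantization) straightforward to express as matrix operations.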

Conversion to JPEG format


The background information above described how color data located within computer memory is displayed
on the computer monitor at the full monitor resolution. However, it is important to note that individual
images may have color data stored in matrices of a different size (smaller or larger) than the display
resolution. What follows is a procedure for converting a source image in RGB format to JPEG format
utilizing the encoding described in the JPEG File Interchange Format (JFIF) standard (ITU-T
Recommendation T.871).

Procedure
1. Convert the source image to the YCbCr color space.
2. Reduce the resolution of the chroma data (Cb and Cr), as the eye is less sensitive to this information.
3. Split the source image color space matrices into blocks of size $8 \times 8$. Perform Discrete Cosine
Transforms (DCT) on each block and store the resulting matrices for each color vector.
4. Quantize the color space matrices, giving more weight to low-frequency data compared to
high-frequency data.
5. Further compress the quantized matrices with lossless compression algorithm(s), such as
variants of run-length encoding.
These steps are expanded upon below.

Convert source image color space


While images on a computer may be stored in memory in RGB format, human physiology makes this
an inefficient way of storing the information that makes up the image. Because of the substantial
overlap in the responses of the M and L photoreceptors, R and G information can be stored as one value
with little loss of information. The resulting representation uses one luminance value (Y) to store the
light intensity and two chrominance values (Cb and Cr) to store the color information. Per ITU-R BT.601,
the conversion to YCbCr is a change of basis represented by the following linear transformation:
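$$
\begin{bmatrix} Y \\ C_B \\ C_R \end{bmatrix}
=
\begin{bmatrix}
0.299 & 0.587 & 0.114 \\
-0.299 & -0.587 & 0.886 \\
0.701 & -0.587 & -0.114
\end{bmatrix}
\begin{bmatrix} R \\ G \\ B \end{bmatrix}
$$

This transformation ends up rotating and shearing the RGB color cube, as shown in Figure 5.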

Figure 5: YCbCr Color Cube

While the transformation shown above is a linear transformation, the actual transformation performed
for the purposes of JPEG encoding is not a linear transformation, since it is of the affine form
$M\mathbf{x} + \mathbf{b}$, followed by rounding and clamping, in order to normalize the values used by
the computer to 8-bit values (i.e., 0 through 255 inclusive).
The revised formula is as follows:
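$$
\begin{bmatrix} Y \\ C_B \\ C_R \end{bmatrix}
=
\min\left(\max\left(0,\ \operatorname{round}\left(
\begin{bmatrix}
1 & 0 & 0 \\
0 & \frac{1}{1.772} & 0 \\
0 & 0 & \frac{1}{1.402}
\end{bmatrix}
\begin{bmatrix}
0.299 & 0.587 & 0.114 \\
-0.299 & -0.587 & 0.886 \\
0.701 & -0.587 & -0.114
\end{bmatrix}
\begin{bmatrix} R \\ G \\ B \end{bmatrix}
+
\begin{bmatrix} 0 \\ 128 \\ 128 \end{bmatrix}
\right)\right),\ 255\right)
$$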
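The revised formula can be sketched per pixel as follows. This is a minimal illustration rather than a reference implementation: the function names are hypothetical, and "round" is assumed here to mean rounding halves upward, which the text does not specify.

```python
import math

# Luma weights from the transformation matrix in the text.
KR, KG, KB = 0.299, 0.587, 0.114


def clamp_round(value):
    """Round to the nearest integer (halves up, an assumption), then clamp to 0..255."""
    return min(max(0, math.floor(value + 0.5)), 255)


def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to 8-bit (Y, Cb, Cr) values."""
    y = KR * r + KG * g + KB * b
    # B - Y and R - Y are the second and third rows of the matrix;
    # they are scaled by 1/1.772 and 1/1.402 and offset by 128.
    cb = (b - y) / 1.772 + 128
    cr = (r - y) / 1.402 + 128
    return clamp_round(y), clamp_round(cb), clamp_round(cr)


white = rgb_to_ycbcr(255, 255, 255)
red = rgb_to_ycbcr(255, 0, 0)
```

Pure red shows why the clamp is needed: its Cr value lands at about 255.5 before rounding, which would otherwise overflow the 8-bit range.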

Resolution Reduction of Chrominance Data


Since the eye is more sensitive to the information carried by the luminance matrix compared to the two
chrominance matrices, the size of the chrominance matrices can be reduced. This is not required,
however, and in some cases may negatively affect the DCT and quantization steps, so this step will be
skipped in this discussion.
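Although the step is skipped here, one way the reduction could be done is sketched below. Averaging each 2 x 2 neighborhood is an assumption chosen for illustration (it halves both dimensions of a chrominance matrix); the text does not prescribe a particular method, and the function name is hypothetical.

```python
def subsample_2x2(plane):
    """Halve both dimensions of a chrominance matrix by averaging 2 x 2 neighborhoods.

    Edge pixels are repeated when a dimension is odd. Values are assumed to be
    non-negative integers (0..255), so (sum + 2) // 4 rounds to the nearest integer.
    """
    height, width = len(plane), len(plane[0])
    result = []
    for r in range(0, height, 2):
        out_row = []
        for c in range(0, width, 2):
            r2 = min(r + 1, height - 1)  # repeat the last row if height is odd
            c2 = min(c + 1, width - 1)   # repeat the last column if width is odd
            total = plane[r][c] + plane[r][c2] + plane[r2][c] + plane[r2][c2]
            out_row.append((total + 2) // 4)
        result.append(out_row)
    return result


cb_small = subsample_2x2([[100, 102, 50, 50],
                          [104, 106, 50, 50]])
```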

Discrete Cosine Transform


JPEG encoding utilizes a frequency-domain transformation in order to change the basis of the source
image. The type of transformation used is called a discrete cosine transform (DCT). The DCT transforms
each block of the source image into coefficients based on the DCT basis functions. The basis functions
are described by the following equation, with a pictorial representation shown in Figure 6:

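$$
G_{u,v} = \frac{1}{4}\,\alpha(u)\,\alpha(v) \sum_{x=0}^{7} \sum_{y=0}^{7} g_{x,y}
\cos\left[\frac{(2x+1)\,u\,\pi}{16}\right]
\cos\left[\frac{(2y+1)\,v\,\pi}{16}\right]
$$

Where,
$u$ is the horizontal spatial frequency, from 0 to 7 inclusive
$v$ is the vertical spatial frequency, from 0 to 7 inclusive
$\alpha(k) = \frac{1}{\sqrt{2}}$ if $k = 0$, and $\alpha(k) = 1$ otherwise
$g_{x,y}$ is the pixel value at coordinates x, y of the block
$G_{u,v}$ is the DCT coefficient at coordinates u, v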
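The double sum can be evaluated directly, as in the sketch below. It favors clarity over speed (four nested loops per block), and the names `alpha` and `dct_2d` as well as the row-major `block[y][x]` layout are choices made for this example.

```python
import math


def alpha(k):
    """Normalizing scale factor: 1/sqrt(2) for frequency 0, otherwise 1."""
    return 1 / math.sqrt(2) if k == 0 else 1.0


def dct_2d(block):
    """Direct evaluation of the 8 x 8 DCT.

    block[y][x] is the pixel at column x, row y; the result is indexed as
    coeffs[v][u] (vertical frequency v, horizontal frequency u).
    """
    coeffs = [[0.0] * 8 for _ in range(8)]
    for v in range(8):
        for u in range(8):
            total = 0.0
            for y in range(8):
                for x in range(8):
                    total += (block[y][x]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            coeffs[v][u] = 0.25 * alpha(u) * alpha(v) * total
    return coeffs


# A flat block has no spatial variation, so only the (0, 0) coefficient is non-zero.
flat = dct_2d([[8] * 8 for _ in range(8)])
```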

Figure 6: Frequency Patterns (DCT Basis Functions)

This transformation, as noted above, changes the basis of the image to one based on the DCT basis
functions. Since it is a two-dimensional transformation, the transformation is represented by the
following, where $A$ is the source image block, $U$ is the DCT matrix, and $G$ is the resulting matrix
of DCT coefficients:

$$
G = U A U^{\mathsf{T}}
$$

Where,

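$$
U = \frac{1}{2}
\begin{bmatrix}
\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \\
\cos\frac{\pi}{16} & \cos\frac{3\pi}{16} & \cos\frac{5\pi}{16} & \cos\frac{7\pi}{16} & \cos\frac{9\pi}{16} & \cos\frac{11\pi}{16} & \cos\frac{13\pi}{16} & \cos\frac{15\pi}{16} \\
\cos\frac{2\pi}{16} & \cos\frac{6\pi}{16} & \cos\frac{10\pi}{16} & \cos\frac{14\pi}{16} & \cos\frac{18\pi}{16} & \cos\frac{22\pi}{16} & \cos\frac{26\pi}{16} & \cos\frac{30\pi}{16} \\
\cos\frac{3\pi}{16} & \cos\frac{9\pi}{16} & \cos\frac{15\pi}{16} & \cos\frac{21\pi}{16} & \cos\frac{27\pi}{16} & \cos\frac{33\pi}{16} & \cos\frac{39\pi}{16} & \cos\frac{45\pi}{16} \\
\cos\frac{4\pi}{16} & \cos\frac{12\pi}{16} & \cos\frac{20\pi}{16} & \cos\frac{28\pi}{16} & \cos\frac{36\pi}{16} & \cos\frac{44\pi}{16} & \cos\frac{52\pi}{16} & \cos\frac{60\pi}{16} \\
\cos\frac{5\pi}{16} & \cos\frac{15\pi}{16} & \cos\frac{25\pi}{16} & \cos\frac{35\pi}{16} & \cos\frac{45\pi}{16} & \cos\frac{55\pi}{16} & \cos\frac{65\pi}{16} & \cos\frac{75\pi}{16} \\
\cos\frac{6\pi}{16} & \cos\frac{18\pi}{16} & \cos\frac{30\pi}{16} & \cos\frac{42\pi}{16} & \cos\frac{54\pi}{16} & \cos\frac{66\pi}{16} & \cos\frac{78\pi}{16} & \cos\frac{90\pi}{16} \\
\cos\frac{7\pi}{16} & \cos\frac{21\pi}{16} & \cos\frac{35\pi}{16} & \cos\frac{49\pi}{16} & \cos\frac{63\pi}{16} & \cos\frac{77\pi}{16} & \cos\frac{91\pi}{16} & \cos\frac{105\pi}{16}
\end{bmatrix}
$$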
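The matrix form of the transform can be sketched as below. The symbol U for the DCT matrix and the helper names are assumptions made for this example; the sketch builds the 8 x 8 matrix row by row (row k holds the cosines of k(2x+1)π/16, with the constant √2/2 in row 0, all scaled by 1/2) and applies it on the left and, transposed, on the right of a block.

```python
import math


def dct_matrix():
    """Build the 8 x 8 DCT matrix: row k, column x holds (1/2) cos(k (2x + 1) pi / 16),
    except row 0, whose entries are (1/2) (sqrt(2) / 2)."""
    rows = []
    for k in range(8):
        if k == 0:
            row = [math.sqrt(2) / 2] * 8
        else:
            row = [math.cos(k * (2 * x + 1) * math.pi / 16) for x in range(8)]
        rows.append([0.5 * entry for entry in row])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def dct_2d_matrix(block):
    """Two-dimensional DCT of an 8 x 8 block as U * block * U^T."""
    u = dct_matrix()
    return matmul(matmul(u, block), transpose(u))


U = dct_matrix()
identity_check = matmul(U, transpose(U))   # U is orthonormal, so this is ~identity
flat = dct_2d_matrix([[8] * 8 for _ in range(8)])
```

Because the matrix is orthonormal, the inverse transform only needs the transpose, which is one reason the matrix form is convenient.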

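Prior to performing the DCT on the YCbCr source image, it is usually level shifted so that the 8-bit
values are centered around 0, with an interval from -128 to 127, rather than an interval between 0
and 255. This is accomplished by subtracting 128 from each value. If A was as follows before level
shifting:

$$
A =
\begin{bmatrix}
154 & 161 & 188 & 197 & 200 & 181 & 134 & 111 \\
153 & 155 & 185 & 199 & 199 & 191 & 145 & 108 \\
154 & 149 & 176 & 196 & 198 & 194 & 161 & 112 \\
160 & 150 & 168 & 190 & 200 & 196 & 173 & 122 \\
164 & 157 & 167 & 188 & 201 & 201 & 181 & 130 \\
165 & 162 & 168 & 186 & 197 & 195 & 188 & 142 \\
173 & 166 & 162 & 181 & 194 & 191 & 191 & 160 \\
184 & 171 & 155 & 176 & 197 & 198 & 191 & 176
\end{bmatrix}
$$

then after level shifting, A would be as follows:

$$
A =
\begin{bmatrix}
26 & 33 & 60 & 69 & 72 & 53 & 6 & -17 \\
25 & 27 & 57 & 71 & 71 & 63 & 17 & -20 \\
26 & 21 & 48 & 68 & 70 & 66 & 33 & -16 \\
32 & 22 & 40 & 62 & 72 & 68 & 45 & -6 \\
36 & 29 & 39 & 60 & 73 & 73 & 53 & 2 \\
37 & 34 & 40 & 58 & 69 & 67 & 60 & 14 \\
45 & 38 & 34 & 53 & 66 & 63 & 63 & 32 \\
56 & 43 & 27 & 48 & 69 & 70 & 63 & 48
\end{bmatrix}
$$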

This operation would be performed for each 8x8 block in the source image, for each channel of the
image (Y, Cb, and Cr).
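The level shift is a simple element-wise subtraction, sketched below on the example block. The block values are the ones listed in the text, arranged here on the assumption that they were given column by column; the names `BLOCK_A` and `level_shift` are hypothetical. The sketch also computes the lowest-frequency (0, 0) DCT coefficient, which for an 8 x 8 block reduces to the sum of all shifted values divided by 8.

```python
# Example block from the text (assumed column-by-column ordering).
BLOCK_A = [
    [154, 161, 188, 197, 200, 181, 134, 111],
    [153, 155, 185, 199, 199, 191, 145, 108],
    [154, 149, 176, 196, 198, 194, 161, 112],
    [160, 150, 168, 190, 200, 196, 173, 122],
    [164, 157, 167, 188, 201, 201, 181, 130],
    [165, 162, 168, 186, 197, 195, 188, 142],
    [173, 166, 162, 181, 194, 191, 191, 160],
    [184, 171, 155, 176, 197, 198, 191, 176],
]


def level_shift(block, offset=128):
    """Center 8-bit values around zero by subtracting the offset from each entry."""
    return [[value - offset for value in row] for row in block]


shifted = level_shift(BLOCK_A)

# With u = v = 0 both cosines equal 1 and alpha(0)^2 = 1/2, so the (0, 0)
# coefficient is (1/4)(1/2) * sum = sum / 8.
dc_coefficient = sum(sum(row) for row in shifted) / 8
```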
Quantize Resultant Matrices
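Since the human eye is less able to distinguish the exact strength of high-frequency variations in a small
area, the coefficients representing these high-frequency components can be reduced, and in most cases,
eliminated (i.e., set equal to 0). This is accomplished by dividing each coefficient by a value from a
quantization matrix. The original JPEG standard utilized the following quantization matrix:

$$
Q =
\begin{bmatrix}
16 & 11 & 10 & 16 & 24 & 40 & 51 & 61 \\
12 & 12 & 14 & 19 & 26 & 58 & 60 & 55 \\
14 & 13 & 16 & 24 & 40 & 57 & 69 & 56 \\
14 & 17 & 22 & 29 & 51 & 87 & 80 & 62 \\
18 & 22 & 37 & 56 & 68 & 109 & 103 & 77 \\
24 & 35 & 55 & 64 & 81 & 104 & 113 & 92 \\
49 & 64 & 78 & 87 & 103 & 121 & 120 & 101 \\
72 & 92 & 95 & 98 & 112 & 100 & 103 & 99
\end{bmatrix}
$$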


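A quantized matrix, B, is created from the DCT coefficients according to the following, where $G$ is
the matrix of DCT coefficients and $Q$ is the quantization matrix:

$$
B_{j,k} = \operatorname{round}\left(\frac{G_{j,k}}{Q_{j,k}}\right)
$$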

Quantizing the DCT coefficients of A with the quantization matrix Q in this way produces the quantized
matrix B.
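The element-wise division and rounding can be sketched as follows. The table used is the example luminance quantization table from Annex K of ITU-T T.81, which is assumed to be the matrix the text refers to; "round" is assumed to round halves upward, and the function name `quantize` is hypothetical. The 353.25 entry is the (0, 0) coefficient of the level-shifted example block; the other non-zero input is purely illustrative.

```python
import math

# Example luminance quantization table from ITU-T T.81, Annex K
# (assumed to be the matrix Q referred to in the text).
Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def quantize(coeffs, table=Q):
    """Divide each DCT coefficient by its table entry and round to an integer."""
    return [[math.floor(coeffs[j][k] / table[j][k] + 0.5) for k in range(8)]
            for j in range(8)]


# Illustrative input: mostly zeros, with the example block's (0, 0) coefficient
# and one made-up coefficient at row 0, column 2.
G = [[0.0] * 8 for _ in range(8)]
G[0][0] = 353.25
G[0][2] = -140.0
B = quantize(G)
```

The larger divisors toward the bottom right of the table are what drive most high-frequency coefficients to zero.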

Lossless Compression
As the quantized matrix B indicates, most of the coefficients from the DCT have been reduced to zero.
Utilizing a run-length encoding scheme, where each value is coded with the number of times that value
appears (i.e., four zeroes in a row are encoded [0,4]), and utilizing a zigzag ordering as shown in
Figure 7, matrix B could be represented as follows:
[22,1], [1,1], [3,1], [0,1], [4,1], [14,1], [5,1], [5,1], [1,1], [0,3], [1,1], [1,1], [0,1], [1,1], [0,1], [1,1],
[0,49]
The entire matrix can be recreated based on the 34 values described above, compared with the 64 values
of the original block, which represents a compression of 47%.
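The zigzag traversal and the simple [value, count] run-length scheme described above can be sketched as follows. This mirrors the simplified scheme in the text rather than the entropy coding an actual JPEG encoder performs, and the function names are hypothetical. The traversal is assumed to start at the top-left entry and move right first.

```python
def zigzag_indices(n=8):
    """Return (row, column) pairs of an n x n block in zigzag order."""
    order = []
    for s in range(2 * n - 1):          # s = row + column, one anti-diagonal at a time
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)       # even diagonals run from bottom-left to top-right
        order.extend((r, s - r) for r in rows)
    return order


def run_length_encode(values):
    """Collapse consecutive repeats into [value, count] pairs."""
    pairs = []
    for value in values:
        if pairs and pairs[-1][0] == value:
            pairs[-1][1] += 1
        else:
            pairs.append([value, 1])
    return pairs


def encode_block(block):
    """Read a quantized block in zigzag order and run-length encode it."""
    return run_length_encode([block[r][c] for r, c in zigzag_indices(len(block))])


# Illustrative quantized block with three non-zero low-frequency entries.
example = [[0] * 8 for _ in range(8)]
example[0][0], example[0][1], example[1][0] = 22, -1, 3
encoded = encode_block(example)
```

The zigzag order places the low-frequency coefficients first, so the zeros produced by quantization tend to collect into one long run at the end.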

Figure 7: Zigzag Ordering of 8x8 Block
