COMPRESSION METHODS
RUN-LENGTH ENCODING (RLE)
& QUANTIZATION
Chapter 3
Run-length algorithms
In this chapter, we consider a type of redundancy, as in Example 2.24, where
a consecutive sequence of symbols can be identified, and introduce a class of
simple but useful lossless compression algorithms called run-length algorithms
or run-length encoding (RLE for short).
We first introduce the ideas and approaches of the run-length compression
techniques. We then move on to show how the algorithm design techniques
learnt in Chapter 1 can be applied to solve the compression problem.
3.1 Run-length
In a sequence of symbols, consecutive repeated symbols are usually called runs.
Hence the source data of interest is a sequence of symbols from an alphabet.
The goal of the run-length algorithm is to identify the runs and record the length
of each run and the symbol in the run.
1. K K K K K K K K K
2. ABCDEFG
3. ABABBBC
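To make the idea of identifying runs concrete, here is a minimal Python sketch that lists each run in a sequence together with its length. The function name `find_runs` is an assumption made for illustration; it does not come from the text.

```python
from itertools import groupby


def find_runs(sequence):
    """Return (symbol, run_length) pairs in order of appearance."""
    # groupby clusters consecutive equal symbols, which is exactly a run.
    return [(symbol, len(list(group))) for symbol, group in groupby(sequence)]


if __name__ == "__main__":
    for example in ("KKKKKKKKK", "ABCDEFG", "ABABBBC"):
        print(example, find_runs(example))
```

In the first example the whole input is one run of length 9, in the second every run has length 1, and in the third only the three consecutive Bs form a run longer than 1.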
␣␣␣␣␣␣␣VVVV
In hardware data compression (HDC), the non-run parts are handled by the
non-repeating control characters n1, n2, ..., n63. Each ni is followed by the
longest sequence of i non-repeating characters found before the next run or
the end of the entire file. For example, ABCDEFG will be
replaced by n7ABCDEFG.
This simple version of the HDC algorithm essentially uses only ASCII codes
for the single symbols, or a total of 123 control characters including a run-
length count. Each ri, where i = 2, ..., 63, is followed by either another control
character or a symbol. If it is followed by another control character, ri
(alone) signifies i repeating space characters (i.e. spaces or blanks). Otherwise,
ri signifies that the symbol immediately after it repeats i times. Each ni, where
i = 1, ..., 63, is followed by a sequence of i non-repeating symbols.
Applying the following 'rules', it is easy to understand the outline of the
encoding and decoding run-length algorithms below.
3.2.1 Encoding
Example 3.3 GGG␣␣␣␣␣␣BCDEFG␣␣55GHJK␣LM777777777777
(where ␣ denotes a space) can be compressed to r3Gr6n6BCDEFGr2n955GHJK␣LMr127.
Solution
5. The next nine non-repeating symbols are found and encoded by n955GHJK␣LM.
6. The next twelve '7's are found and encoded by r127.
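The encoding of Example 3.3 can be reproduced with a short Python sketch. It is an illustration under stated assumptions, not the book's own algorithm. The names `hdc_encode` and `render` are hypothetical. Control characters are modelled as tuples such as `("r", 3)`. The run thresholds (two or more spaces, three or more of any other symbol) are inferred from Example 3.3, where `55` is left inside a non-repeating block.

```python
from itertools import groupby

MAX_COUNT = 63  # largest count a single control character can carry
SPACE = " "


def hdc_encode(text):
    """Encode text into a list of tokens: ('r', i), ('n', i) or plain symbols."""
    tokens = []
    pending = []  # non-repeating symbols waiting for an n-control character

    def flush_pending():
        # Emit pending symbols in blocks of at most MAX_COUNT, each led by ni.
        for start in range(0, len(pending), MAX_COUNT):
            chunk = pending[start:start + MAX_COUNT]
            tokens.append(("n", len(chunk)))
            tokens.extend(chunk)
        pending.clear()

    for symbol, group in groupby(text):
        length = len(list(group))
        # Assumption inferred from Example 3.3: spaces form a run from 2,
        # other symbols from 3.
        threshold = 2 if symbol == SPACE else 3
        if length < threshold:
            pending.extend(symbol * length)
            continue
        flush_pending()
        while length >= threshold:
            count = min(length, MAX_COUNT)
            tokens.append(("r", count))
            if symbol != SPACE:
                tokens.append(symbol)  # ri alone already means i spaces
            length -= count
        pending.extend(symbol * length)  # leftover too short to be a run
    flush_pending()
    return tokens


def render(tokens):
    """Write tokens in the textual notation used in the examples."""
    return "".join(t if isinstance(t, str) else f"{t[0]}{t[1]}" for t in tokens)


if __name__ == "__main__":
    source = "GGG" + " " * 6 + "BCDEFG" + " " * 2 + "55GHJK LM" + "7" * 12
    print(render(hdc_encode(source)))
```

Keeping the output as a token list rather than a string avoids the ambiguity of the printed notation, where r127 has to be read as r12 followed by the symbol 7.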
3.2.2 Decoding
The decoding process is similar to that for encoding and can be outlined as
follows:
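As a hedged illustration, the rules for ri and ni can be turned into a short Python decoder. The function name `hdc_decode` and the token representation (tuples such as `("r", 3)` for control characters, one-character strings for symbols) are assumptions made for this sketch and do not come from the original text.

```python
def hdc_decode(tokens):
    """Rebuild the source text from a list of HDC tokens."""
    out = []
    i = 0
    while i < len(tokens):
        kind, count = tokens[i]  # each step starts at a control character
        i += 1
        if kind == "n":
            # ni: copy the next i non-repeating symbols unchanged.
            out.extend(tokens[i:i + count])
            i += count
        elif i < len(tokens) and isinstance(tokens[i], str):
            # ri followed by a symbol: that symbol repeats i times.
            out.append(tokens[i] * count)
            i += 1
        else:
            # ri followed by another control character (or end of input):
            # i space characters.
            out.append(" " * count)
    return "".join(out)


if __name__ == "__main__":
    example = [("r", 3), "G", ("r", 6), ("n", 6), *"BCDEFG",
               ("r", 2), ("n", 9), *"55GHJK LM", ("r", 12), "7"]
    print(repr(hdc_decode(example)))
```

The decoder never needs to look more than one token ahead, because every plain symbol is either announced by an ni or directly follows an ri.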
Observation
It is not difficult to observe from a few examples that the performance of the
HDC algorithm (as far as the compression ratio is concerned) is:
Footnote 2: It can be even better than entropy coding such as Huffman coding.
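For a rough sense of scale, the sizes in Example 3.3 can be counted directly. The sketch below assumes that every symbol and every control character occupies one unit of storage. It takes the compression ratio to be compressed size divided by original size, which is one common convention and an assumption here. The variable names are illustrative.

```python
# Source string of Example 3.3, with ordinary spaces for the blank symbols.
original = "GGG" + " " * 6 + "BCDEFG" + " " * 2 + "55GHJK LM" + "7" * 12

# Encoded output of Example 3.3, one list entry per stored unit.
encoded_units = ["r3", "G", "r6", "n6", *"BCDEFG",
                 "r2", "n9", *"55GHJK LM", "r12", "7"]

original_size = len(original)
compressed_size = len(encoded_units)
ratio = compressed_size / original_size  # assumed convention: after / before

if __name__ == "__main__":
    print(original_size, compressed_size, round(ratio, 3))
```

Under these assumptions the 38 source symbols shrink to 23 units. All of the saving comes from the three runs, while each non-repeating block costs one extra unit for its ni.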