

ROLL NO. E6802A21
REGD NO.-10801051


Abstract: In this term paper I would like to present an overview of audio
compression. Nothing else compares to the versatility of audio compression
with its immediate playback. Whether we are in a management or a technical
position, this term paper will help in understanding the immense role of
audio compression.

I. INTRODUCTION

Structured Audio means transmitting sound by describing it rather than
compressing it. That's the whole idea, and it's a very simple one, but as
you will see if you keep reading, it leads to a wealth of new directions
for sound research and low bit-rate coding.

Audio compression is a form of data compression designed to reduce the
transmission bandwidth requirement of digital audio streams and the storage
size of audio files. Audio compression algorithms are implemented in
computer software as audio codecs. Generic data compression algorithms
perform poorly with audio data, seldom reducing data size much below 87% of
the original, and are not designed for use in real-time applications.
Consequently, specifically optimized lossless and lossy audio algorithms
have been created. Lossy algorithms provide greater compression rates and
are used in mainstream consumer audio devices. In both lossy and lossless
compression, information redundancy is reduced, using methods such as
coding, pattern recognition and linear prediction to reduce the amount of
information used to represent the uncompressed data.

The trade-off between slightly reduced audio quality and transmission or
storage size is outweighed by the latter for most practical audio
applications in which users may not perceive the loss in playback rendition
quality. For example, one Compact Disc holds approximately one hour of
uncompressed high fidelity music, less than 2 hours of music compressed
losslessly, or 7 hours of music compressed in the MP3 format at medium bit
rates.

Fig.1

II. HISTORY
A literature compendium for a large variety of audio coding systems was
published in the IEEE Journal on Selected Areas in Communications (JSAC),
February 1988. While there were some papers from before that time, this
collection documented an entire variety of finished, working audio coders,
nearly all of them using perceptual (i.e. masking) techniques and some kind
of frequency analysis and back-end noiseless coding. Several of these
papers remarked on the difficulty of obtaining good, clean digital audio
for research purposes. Most, if not all, of the authors in the JSAC edition
were also active in the MPEG-1 Audio committee.

Solidyne 922: The world's first commercial audio bit compression card for
PC, 1990

The world's first commercial broadcast automation audio compression system
was developed by Oscar Bonello, an Engineering professor at the University
of Buenos Aires. In 1983, using the psychoacoustic principle of the masking
of critical bands first published in 1967, he started developing a
practical application based on the recently developed IBM PC computer, and
the broadcast automation system was launched in 1987 under the name
Audicom. Twenty years later, almost all the radio stations in the world
were using similar technology, manufactured by a number of companies.

The primary application areas of lossless encoding are:

* Archives: For archival purposes it is generally desired to preserve the
  source material exactly (i.e. at 'best possible quality').
* Editing: Audio engineers use lossless compression for audio editing to
  avoid digital generation loss.
* High fidelity playback: Audiophiles prefer lossless compression formats
  to avoid compression artefacts.
* Mastering of casual-use audio media: High quality master copies of
  recordings are used to produce lossily compressed versions for digital
  audio players. As formats and encoders improve, updated lossily
  compressed files may be generated from the lossless master.

Compression Goals
* Reduced bandwidth
* Make decoded signal sound as close as possible to original signal
* Lowest implementation complexity
* Robust
* Scalable

* Voc File Compression
* Linear Predictive Coding
* Mu-law compression
* Differential Pulse Code Modulation


Sending Ethernet data, audio, video through power

* Tx / Rx: up to 4 optional RJ-45 ports
* Video compression: MPEG-2
* Enhanced video quality by DBM
* Video input format auto detect
* MPEG-1 / Layer 2 audio compression
* Typical 250 ms latency time
* Interference immunity for microwave oven, WLAN, Bluetooth, cordless phone

A. MPEG
B. Moving Picture Experts Group
C. Part of a multiple standard for
   a. Video compression
   b. Audio compression
   c. Audio, video and data synchronization to an aggregate bit rate of
      1.5 Mbit/sec.

MPEG VIDEO
* Physically lossy compression algorithm
* Perceptually lossless, transparent algorithm
* Exploits perceptual properties of human ear
* Psychoacoustic modeling

The MPEG Audio Standard ensures inter-operability, defines the coded bit
stream syntax, defines the decoding process and guarantees the decoder's
accuracy.


The simplest compression techniques simply removed any silence from the
entire sample. Creative Labs introduced this form of compression with the
Sound Blaster line of sound cards. This method analyzes the whole sample
and then codes the silence into the sample using byte codes. It is very
similar to run-length coding.
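
The silence-coding idea above can be sketched in a few lines of Python. This is only an illustration of the principle and not the actual VOC file layout: the token format, the function names `encode_silence` and `decode_silence`, and the parameters `threshold`, `min_run` and `center` are all assumptions made for this example. It assumes 8-bit unsigned samples whose resting level is 128.

```python
def encode_silence(samples, threshold=2, min_run=4, center=128):
    """Replace long quiet stretches by ("silence", length) tokens.

    Hypothetical token format for illustration only:
      ("silence", n)    -> n samples that were within `threshold` of `center`
      ("data", [..])    -> samples stored verbatim
    """
    tokens, literal = [], []
    i, n = 0, len(samples)
    while i < n:
        # Measure how long the quiet stretch starting at i is.
        j = i
        while j < n and abs(samples[j] - center) <= threshold:
            j += 1
        run = j - i
        if run >= min_run:
            # Long enough to be worth a run-length token.
            if literal:
                tokens.append(("data", literal))
                literal = []
            tokens.append(("silence", run))
            i = j
        else:
            # Short quiet stretch or a loud sample: keep it verbatim.
            end = max(j, i + 1)
            literal.extend(samples[i:end])
            i = end
    if literal:
        tokens.append(("data", literal))
    return tokens


def decode_silence(tokens, center=128):
    """Rebuild the sample list; silent runs come back as flat `center` values."""
    out = []
    for kind, payload in tokens:
        if kind == "silence":
            out.extend([center] * payload)
        else:
            out.extend(payload)
    return out
```

Note that the decoded silence is perfectly flat, so any low-level noise inside a silent run is lost. The sample count is preserved, and no sample changes by more than `threshold`.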
Fig.6

The Program Stream is similar to MPEG-1 Systems Multiplex.

Linear Predictive Coding and Code Excited Linear Predictor

This was an early development in audio compression that was used primarily
for speech. A Linear Predictive Coding (LPC) encoder compares speech to an
analytical model of the vocal tract, then throws away the speech and stores
the parameters of the best-fit model. The output quality was poor and was
often compared to computer speech, and thus LPC is not used much today. A
later development, Code Excited Linear Predictor (CELP), increased the
complexity of the speech model further, while allowing for greater
compression due to faster computers, and produced much better results.
Sound quality improved, while the compression ratio increased. The
algorithm compares speech with an analytical model of the vocal tract and
computes the errors between the original speech and the model. It transmits
both model parameters and a very compressed representation of the errors.
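
As a rough illustration of "storing the parameters of the best-fit model", the sketch below fits a small linear predictor to a block of samples and computes the prediction errors that a CELP-style coder would go on to compress. The text does not say how the model is fitted; the autocorrelation method with the Levinson-Durbin recursion used here is a common textbook choice and is an assumption, as are the function names `fit_lpc` and `prediction_residual`.

```python
import math


def fit_lpc(x, order):
    """Fit coefficients a[0..order-1] so that
    x[n] is approximated by a[0]*x[n-1] + a[1]*x[n-2] + ...
    Returns (coefficients, remaining prediction error energy)."""
    n = len(x)
    # Autocorrelation of the block for lags 0..order.
    r = [sum(x[i] * x[i - lag] for i in range(lag, n)) for lag in range(order + 1)]
    if r[0] == 0:
        return [0.0] * order, 0.0  # all-zero block: nothing to model
    a = [0.0] * (order + 1)        # a[0] is unused; a[k] weights x[n-k]
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for this model order.
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


def prediction_residual(x, coeffs):
    """Errors between the original samples and the model's prediction."""
    res = []
    for n in range(len(x)):
        pred = sum(c * x[n - 1 - k] for k, c in enumerate(coeffs) if n - 1 - k >= 0)
        res.append(x[n] - pred)
    return res


# Usage: a slowly varying tone is almost fully predictable from two past samples.
tone = [math.sin(0.2 * n) for n in range(2000)]
coeffs, _ = fit_lpc(tone, order=2)
errors = prediction_residual(tone, coeffs)
```

In this picture a plain LPC coder would keep only `coeffs` (plus gain and pitch information, not shown), while a CELP-style coder would also send a compact description of `errors`.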


Logarithmic compression is a good method because it matches the way the
human ear works. It only loses information which the ear would not hear
anyway, and gives good quality results for both speech and music. Although
the compression ratio is not very high, it requires very little processing
power to achieve. It is the international standard telephony encoding
format, also known as ITU (formerly CCITT) standard G.711. It is commonly
used in North America and Japan for ISDN 8 kHz sampled, voice grade,
digital telephone service.

It packs each 16-bit sample into 8 bits by using a logarithmic table to
encode a 13-bit dynamic range, dropping the least significant 3 bits of
precision. The quantization levels are dispersed unevenly instead of
linearly to mimic the way that the human ear perceives sound levels
differently at different frequencies. Unlike linear quantization, the
logarithmic step spacings represent low-amplitude samples with greater
accuracy than higher-amplitude samples. This method is fast and compresses
data into half the size of the original sample. This method is used quite
widely due to the universal nature of its adoption.

D. Differential Pulse Code Modulation

The ADPCM coder takes advantage of the fact that neighboring audio samples
are generally similar to each other. Instead of representing each audio
sample independently as in PCM, an ADPCM encoder computes the difference
between each audio sample and its predicted value and outputs the PCM value
of the differential. Note that the ADPCM encoder uses most of the
components of the ADPCM decoder (Figure 2b) to compute the predicted
values.

The quantizer output is generally only a (signed) representation of the
number of quantizer levels. The re-quantizer reconstructs the value of the
quantized sample by multiplying the number of quantizer levels by the
quantizer step size and possibly adding an offset of half a step size.
Depending on the quantizer implementation, this offset may be necessary to
center the re-quantized value between the quantization thresholds.

The ADPCM coder can adapt to the characteristics of the audio signal by
changing the step size of either the quantizer or the predictor, or by
changing both. The method of computing the predicted value and the way the
predictor and the quantizer adapt to the audio signal vary among different
ADPCM coding systems.

Some ADPCM systems require the encoder to provide side information with the
differential PCM values. This side information can serve two purposes.
First, in some ADPCM schemes the decoder needs the additional information
to determine either the predictor or the quantizer step size, or both.
Second, the data can provide redundant contextual information to the
decoder to enable recovery from errors in the bit stream or to allow random
access entry into the coded bit stream.

The following section describes the ADPCM algorithm proposed by the
Interactive Multimedia Association (IMA). This algorithm offers a
compression factor of (number of bits per source sample)/4 to 1. Other
ADPCM audio compression schemes include the CCITT Recommendation G.721 (32
kilobits per second compressed data rate) and Recommendation G.723 (24
kilobits per second compressed data rate) standards and the compact disc
interactive audio compression algorithm.

The IMA ADPCM Algorithm

The IMA is a consortium of computer hardware and software vendors
cooperating to develop a de facto standard for computer multimedia data.
The IMA's goal for its audio compression proposal was to select a
public-domain audio compression algorithm able to provide good compressed
audio quality with good data compression performance. In addition, the
algorithm had to be simple enough to enable software-only, real-time
decompression of stereo, 44.1-kHz-sampled audio signals on a 20-megahertz
(MHz) 386-class computer. The selected ADPCM algorithm not only meets these
goals, but is also simple enough
to enable software-only, real-time encoding on the same computer.

The simplicity of the IMA ADPCM proposal lies in the crudity of its
predictor. The predicted value of the audio sample is simply the decoded
value of the immediately previous audio sample. Thus the predictor block in
Figure 2 is merely a time-delay element whose output is the input delayed
by one audio sample interval. Since this predictor is not adaptive, side
information is not necessary for the reconstruction of the predictor.

Figure 3 shows a block diagram of the quantization process used by the IMA
algorithm. The quantizer outputs four bits representing the signed
magnitude of the number of quantizer levels for each input sample.

Usability of lossy audio codecs is determined by:
* Perceived audio quality
* Compression factor
* Speed of compression and decompression
* Inherent latency of algorithm (critical for real-time streaming
  applications; see below)
* Product support

Lossy formats are often used for the distribution of streaming audio, or
interactive applications (such as the coding of speech for digital
transmission in cell phone networks). In such applications, the data must
be decompressed as the data flows, rather than after the entire data stream
has been transmitted. Not all audio codecs can be used for streaming
applications, and for such applications a codec designed to stream data
effectively will usually be chosen.

Latency results from the methods used to encode and decode the data. Some
codecs will analyze a longer segment of the data to optimize efficiency,
and then code it in a manner that requires a larger segment of data at one
time in order to decode. (Often codecs create segments called a "frame" to
create discrete data segments for encoding and decoding.) The inherent
latency of the coding algorithm can be critical; for example, when there is
two-way transmission of data, such as with a telephone conversation,
significant delays may seriously degrade the perceived quality.

In contrast to the speed of compression, which is proportional to the
number of operations required by the algorithm, here latency refers to the
number of samples which must be analysed before a block of audio is
processed. In the minimum case, latency is zero samples (e.g., if the
coder/decoder simply reduces the number of bits used to quantize the
signal). Time domain algorithms such as LPC also often have low latencies,
hence their popularity in speech coding for telephony. In algorithms such
as MP3, however, a large number of samples have to be analyzed in order to
implement a psychoacoustic model in the frequency domain, and latency is on
the order of 23 ms (46 ms for two-way communication).
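
Since latency is defined here as a number of samples that must be collected before coding can start, it converts to time by dividing by the sampling rate. The short sketch below shows that arithmetic. The text does not state which block size or sampling rate lies behind the 23 ms figure; the 1024-sample block and the 44.1 kHz rate used here are assumptions chosen because they reproduce it, and `latency_ms` is a hypothetical helper name.

```python
def latency_ms(samples_buffered, sample_rate_hz):
    """Time spent waiting for `samples_buffered` samples, in milliseconds."""
    return 1000.0 * samples_buffered / sample_rate_hz


# Assumed values: a 1024-sample analysis block at 44.1 kHz.
one_way = latency_ms(1024, 44100)   # about 23.2 ms
two_way = 2 * one_way               # about 46.4 ms for a round trip

# A coder that only re-quantizes each sample buffers nothing.
sample_by_sample = latency_ms(0, 44100)
```

The same function shows why block size matters for telephony: at an 8 kHz sampling rate even a 160-sample block already costs 20 ms in each direction.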

It has been shown that there is a wide range of applications for audio
compression products. Furthermore, ongoing research and development is
continually expanding the current range of applications. One of the most
important characteristics of audio compression is its clarity in the MPEG
format and its ability to take data and make it small.