Chapter 1

Problem 1.1
Describe media types handled by MIRSs.
Static media: do not have a time dimension, and their meanings do not
depend on the presentation time (for example, text and still images).
Dynamic media: have a time dimension, and their meanings and
correctness depend on the rate at which they are presented (for example,
audio and video).
Problem 1.2
What are the main characteristics of multimedia data and applications?
Multimedia data is very data intensive (for example, a 10-minute video
can occupy about 1.5 GB).
Audio and video have a temporal dimension and must be played out at a
fixed rate.
Digital audio, image, and video are represented as a series of individual
sample values and lack an obvious semantic structure from which their
contents can be recognized.
Many multimedia applications require simultaneous presentation of
multiple media types in a spatially and temporally coordinated way.
The meaning of multimedia data is sometimes fuzzy and subjective.
Multimedia data is information rich.
Problem 1.3
Why can DBMSs not handle multimedia data effectively?
Because DBMSs lack the capabilities needed for content-based multimedia
retrieval, such as the following:
Tools to automatically extract contents and features from multimedia
data;
Multidimensional indexing structures to handle multimedia feature
vectors;
Similarity metrics for multimedia retrieval, instead of exact match;
Storage subsystems redesigned to cope with large data sizes and high
bandwidth and to meet real-time requirements;
A user interface designed to allow flexible queries in different media
types and to provide multimedia presentations.
Problem 1.4
What are IR systems? Are they suitable for handling multimedia data?
Why?
Information retrieval (IR) systems deal mainly with the retrieval of text
documents. IR techniques are important in multimedia information
management systems for two main reasons.
First, there exist a large number of text documents. To use information
stored in these documents, an efficient and effective IR system is needed.
Second, text can be used to annotate other media such as audio, images,
and video. Conventional IR techniques can be used for multimedia
information retrieval.
However, the use of IR for handling multimedia data has the following
limitations:
Annotation is commonly a manual and time-consuming process;
Text annotation is incomplete and subjective;
IR techniques cannot handle queries in forms other than text (e.g.,
audio);
Some multimedia features, such as those of an image, are difficult to
describe using text.
Problem 1.5
Describe the basic operation of an MIRS.
Information items in the database are preprocessed to extract features
and semantic contents and are indexed based on them. During
information retrieval, a user's query is processed and its main features
are extracted. Then, they are compared with the features or index of each
information item in the database. Information items whose features are
most similar to those of the query are retrieved and presented to the user.
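The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the feature extractor (a 4-bin histogram of values in the range 0 to 255), the Euclidean distance, and all function and item names are hypothetical and are not taken from any particular MIRS.

```python
import math

def extract_features(item):
    """Hypothetical extractor: normalized 4-bin histogram of values in 0..255."""
    bins = [0, 0, 0, 0]
    for value in item:
        bins[min(value // 64, 3)] += 1
    total = len(item) or 1
    return [count / total for count in bins]

def distance(a, b):
    """Euclidean distance between two feature vectors (smaller = more similar)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_index(database):
    """Preprocessing step: extract and store the features of every item."""
    return {name: extract_features(item) for name, item in database.items()}

def retrieve(index, query_item, k=2):
    """Extract the query's features and return the k most similar item names."""
    query_features = extract_features(query_item)
    ranked = sorted(index, key=lambda name: distance(index[name], query_features))
    return ranked[:k]

# Toy "database": each item is a short list of sample values.
database = {
    "dark": [10, 20, 30, 40],
    "bright": [200, 220, 240, 250],
    "mixed": [10, 100, 150, 250],
}
index = build_index(database)
results = retrieve(index, [5, 15, 25, 60], k=2)
print(results)
```

The query here contains only low values, so the item with the closest histogram is ranked first. A real system would replace the linear scan over the index with a multidimensional indexing structure.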
Problem 1.6
Describe the types of queries that are expected to be supported by MIRSs.
Metadata-Based Queries
Metadata refers to the formal attributes of database items such as author
names and creation date. An example query in a video on demand (VOD)
application can be "List movies directed by NAME in 1997." This type can
be handled by DBMS capabilities.
Annotation-Based Queries
Annotation refers to the text description of the contents of database
items. Queries are in keywords or free-text form and retrieval is based on
similarity between the query and annotation. An example query can be
"Show me the video segment in which ACTOR is riding a bicycle." This type
assumes that items are properly annotated and can be handled by IR
techniques.
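As a rough illustration of annotation-based retrieval, the sketch below ranks annotated video segments by word overlap with a free-text query. The Jaccard measure, the segment names, and the annotations are assumptions chosen for the example; practical IR systems typically use more refined term weighting.

```python
def tokenize(text):
    """Lowercase the text and split it into a set of words."""
    cleaned = text.lower().replace(".", " ").replace(",", " ")
    return set(cleaned.split())

def jaccard(a, b):
    """Similarity of two word sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def search(query, annotations):
    """Return segment names ordered by decreasing similarity to the query."""
    query_words = tokenize(query)
    scored = [(jaccard(query_words, tokenize(text)), seg)
              for seg, text in annotations.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [seg for score, seg in scored if score > 0]

# Hypothetical manual annotations of three video segments.
annotations = {
    "seg1": "An actor is riding a bicycle along the beach",
    "seg2": "Two people are talking in a kitchen",
    "seg3": "A child is riding a horse",
}
hits = search("actor riding a bicycle", annotations)
print(hits)
```

The result depends entirely on the quality of the annotations, which is the limitation noted in Problem 1.4.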
Define audio and describe its main characteristics.
Audio is caused by a disturbance in air pressure that reaches the human eardrum.
When the frequency of the air disturbance is in the range of 20 to 20,000 Hz, the
human ear hears sound.
Another parameter used to measure sound is amplitude, variations in which
cause sound to be soft or loud. The dynamic range of human hearing is very
large: the lower limit is the threshold of audibility, and the upper limit is the
threshold of pain.
The audibility threshold for a 1-kHz sinusoidal waveform is generally set at
0.000283 dyne per square centimeter.
The amplitude of the sinusoidal waveform can be increased from the
threshold of audibility by a factor of between 100,000 and 1,000,000 before
pain is reached. It is difficult to work with such a large amplitude range.
Thus the audio amplitude is often expressed in decibels (dB). Given two
waveforms with peak amplitudes X and Y, the decibel measure of the
difference between these two amplitudes is defined by
dB = 20 log10(X / Y)
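As a quick check of the amplitude range quoted above, the following sketch applies the conventional amplitude form of the decibel, 20 log10(X / Y). The function name is hypothetical, and the threshold of audibility is taken as a reference amplitude of 1 purely for illustration.

```python
import math

def amplitude_db(x, y):
    """Decibel difference between peak amplitudes x and y: 20 * log10(x / y)."""
    return 20.0 * math.log10(x / y)

# Amplitude factors between the threshold of audibility and the threshold of pain.
low = amplitude_db(100_000, 1)
high = amplitude_db(1_000_000, 1)
print(low, high)
```

An amplitude factor of 100,000 to 1,000,000 therefore corresponds to roughly 100 to 120 dB, a far more convenient range to work with.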
Explain the basic principle of Huffman coding.
It assigns fewer bits to symbols that appear more often and more bits to
symbols that appear less often.
It is efficient when the probabilities of symbol occurrences vary widely. It is
usually used in combination with other coding schemes.
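The principle can be illustrated with a minimal Huffman code builder. This is a sketch, not a production encoder: the function name and the sample message are made up for the example, and ties between equal weights are broken by insertion order, so other valid Huffman codes for the same message exist.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Build a prefix code in which frequent symbols get shorter bit strings."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_built_so_far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie_breaker = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees, prefixing their codes with 0 and 1.
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, tie_breaker, merged))
        tie_breaker += 1
    return heap[0][2]

message = "aaaaaaaabbbccd"  # 'a' is frequent, 'd' is rare
codes = huffman_codes(message)
encoded = "".join(codes[s] for s in message)
print(codes, len(encoded))
```

For this message the encoded length is 23 bits, compared with 28 bits for a fixed 2-bit code over the four symbols. The gain shrinks as the symbol probabilities become more uniform.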
Describe the principle of predictive compression techniques.
Predictive Coding
In general, the sample values of spatially neighboring picture
elements are correlated.
Correlation, or linear statistical dependency, indicates that a linear
prediction of the sample values based on the sample values of
neighboring picture elements results in prediction errors that have a
smaller variance than the original sample values.
One-dimensional prediction algorithms use the correlation of adjacent
picture elements within a scan line. Other, more complex schemes
also exploit line-to-line and frame-to-frame correlation and are
denoted as two-dimensional and three-dimensional prediction,
respectively.
In a predictive coding system, the smaller variance of the signal to be
quantized, coded, and transmitted can diminish the
amplitude range of the quantizer, reduce the number of quantizing
levels, and lessen the required bits per pixel without decreasing the
signal-to-quantizing-noise ratio.
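The variance argument can be checked with a small one-dimensional example. The sketch below assumes the simplest possible predictor (each sample is predicted by the previous sample in the scan line) and omits quantization, so it is lossless; the function names and the sample values are hypothetical.

```python
from statistics import pvariance

def prediction_errors(scan_line):
    """Previous-sample predictor: e[n] = x[n] - x[n-1]; the first sample is sent as-is.

    Assumes a non-empty scan line.
    """
    errors = [scan_line[0]]
    for previous, current in zip(scan_line, scan_line[1:]):
        errors.append(current - previous)
    return errors

def reconstruct(errors):
    """Decoder: rebuild the samples by accumulating the prediction errors."""
    samples = [errors[0]]
    for error in errors[1:]:
        samples.append(samples[-1] + error)
    return samples

# Slowly varying pixel values along one scan line (strongly correlated neighbors).
scan_line = [100, 102, 104, 107, 109, 110, 112, 115]
errors = prediction_errors(scan_line)

var_original = pvariance(scan_line)
var_errors = pvariance(errors[1:])  # exclude the first sample, which is not predicted
print(errors, var_original, var_errors)
```

The prediction errors stay within a few units while the original samples span a much wider range, which is why fewer quantizing levels and fewer bits per pixel suffice for the error signal.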
