
Eastern Visayas State University

Graduate School

MIDTERM EXAM
MSIT 215-Multimedia

GRACELY J. CARSON

Instruction.
I. Provide an example of a multimedia application that uses lossy
compression and one that uses lossless compression. For each example, show
how the data is compressed.
>> The MP3 format is one that uses lossy compression. This means that it
loses some of the audio information found in the original to make the
compressed file much smaller. The information that lossy compression loses
is the information deemed least important to the file. In music, this tends to be
the very high and very low frequencies that are not considered to add as
much to the music as the range of frequencies in between.
>> Briefly, here's how MP3 (and most other lossy audio compression schemes)
works. The process employs a combination of digital technology and the
science of aural perception (psychoacoustics) to remove data bits from the
original digital file that are considered to be essentially inaudible. These
bits can include frequencies beyond the normal threshold of human
hearing, sounds that are masked by other sounds, and various other
redundant sonic information.
The point of contention with this whole concept is just how much of that
data is truly inaudible. While some bits can be removed with little
consequence, much of what gets stripped away can subtly affect our
perception of how things sound. While moderately compressed files can
deliver near-CD quality sound, too much compression can remove elusive
qualities that can make a difference to how we perceive music on a
subconscious level.
With any compression, some audio quality loss is inevitable. Very high
frequencies are typically the first data to be eliminated, and while in theory
these sounds are inaudible, their loss can rob your music of its subtle
overtones, presence, dynamic range and depth of field.
The audio resolution and sonic quality of an MP3 is determined by the
bit rate at which it's encoded. The higher the bit rate, the more data per
second of music. As you'd expect, a higher bit rate creates better quality
audio, along with a larger file.
Generally speaking, 128 kbps (kilobits per second) is considered the
bit rate at which an MP3 begins to exhibit artifacts of data compression.
Not coincidentally, it's also the rate many websites use for downloads, since
it offers a smaller file size with relatively minimal loss. Rates below 128
kbps are usually not recommended for anything other than spoken word
recordings. Bitrates of 192 kbps, 256 kbps, or higher preserve most of the
original sonic information, making them a better bet for music you care
about.
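
The link between bit rate and file size can be sketched with a little
arithmetic. This is a minimal illustration, not part of any encoder: the
4-minute duration is a hypothetical example, the function names are made up
for this sketch, and the result ignores file headers and tags.

```python
def mp3_size_bytes(bitrate_kbps: int, seconds: float) -> float:
    """Size of a constant-bit-rate stream: bits per second times duration, over 8."""
    return bitrate_kbps * 1000 * seconds / 8


def pcm_size_bytes(sample_rate: int, bits_per_sample: int,
                   channels: int, seconds: float) -> float:
    """Size of uncompressed PCM audio (the payload of a typical WAV file)."""
    return sample_rate * bits_per_sample * channels * seconds / 8


duration = 240  # hypothetical 4-minute track, in seconds
cd_size = pcm_size_bytes(44_100, 16, 2, duration)  # CD-quality stereo
for rate in (128, 192, 256):
    size = mp3_size_bytes(rate, duration)
    print(f"{rate} kbps: {size / 1e6:.2f} MB, "
          f"{size / cd_size:.1%} of the uncompressed size")
```

Doubling the bit rate doubles the file size, which is why the choice of
rate is a trade-off between fidelity and storage.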
Another alternative is to encode using VBR, or variable bit rate. VBR
examines the data as it encodes, using a lower rate for simple passages
and a higher rate for more complex ones. While the resulting file is
smaller than one encoded at a constant high bit rate, VBR encoding can
sometimes end up
compromising the audio fidelity of delicate material like a solo acoustic
guitar or vocal.
>> WAV files typically hold uncompressed PCM audio, so their size follows
directly from the sample rate, bit depth, number of channels, and duration.
Lossless compressed formats such as FLAC reduce that data to roughly half
of the original size, depending on the material, and the original samples
can be restored exactly. One technique FLAC applies to blocks of identical
samples (such as silence) is run-length encoding, which looks for values
repeated in a row and, instead of recording each one separately, stores the
value together with how many times it occurs. Let us take a hypothetical
set of sample points:

00000000000000000000012345432100000000000000000123456787656789876

As you can see, the silent areas take up a large part of the file. Instead
of recording these samples individually, we can store how many silent
samples there are in a row, massively reducing the file size:

(21-0)123454321(17-0)123456787656789876
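
The run-length idea above can be sketched in a few lines of Python. This is
an illustration of the technique only, not FLAC's actual bitstream; the
function names are made up for this sketch, and the sample list is rebuilt
from the counts in the compressed form shown above.

```python
from itertools import groupby


def rle_encode(samples):
    """Collapse each run of equal values into a (count, value) pair."""
    return [(len(list(run)), value) for value, run in groupby(samples)]


def rle_decode(pairs):
    """Expand (count, value) pairs back into the original sequence."""
    out = []
    for count, value in pairs:
        out.extend([value] * count)
    return out


samples = ([0] * 21 + [1, 2, 3, 4, 5, 4, 3, 2, 1]
           + [0] * 17
           + [1, 2, 3, 4, 5, 6, 7, 8, 7, 6, 5, 6, 7, 8, 9, 8, 7, 6])
encoded = rle_encode(samples)
print(encoded[0], encoded[10])          # the two silent runs
print(rle_decode(encoded) == samples)   # lossless round trip
```

Encoding every value as a pair would enlarge the non-repeating stretches,
so a practical scheme, like the notation above, only collapses the long
runs and stores the remaining samples as they are.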

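Another technique used by FLAC is linear prediction: each sample is
predicted from the samples before it, and only the (usually small)
prediction error is stored.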
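
A minimal sketch of linear prediction, assuming a fixed second-order
predictor that extends a straight line through the previous two samples.
The function names and the sample values are hypothetical, and a real
encoder would go on to entropy-code the residuals, which this sketch omits.

```python
def lp_encode(samples):
    """Keep the first two samples as warm-up, then store prediction errors."""
    residuals = list(samples[:2])
    for n in range(2, len(samples)):
        predicted = 2 * samples[n - 1] - samples[n - 2]  # straight-line guess
        residuals.append(samples[n] - predicted)
    return residuals


def lp_decode(residuals):
    """Rebuild the samples by repeating the same prediction and adding the error."""
    samples = list(residuals[:2])
    for n in range(2, len(residuals)):
        predicted = 2 * samples[n - 1] - samples[n - 2]
        samples.append(residuals[n] + predicted)
    return samples


ramp = [100, 110, 120, 130, 141, 152, 163, 173]  # hypothetical smooth signal
residuals = lp_encode(ramp)
print(residuals)
print(lp_decode(residuals) == ramp)
```

After the two warm-up samples, the residuals stay close to zero, so they
need far fewer bits than the original values, and decoding recovers the
signal exactly.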

II. How is speech encoded and decoded?


>> Speech can be encoded using different methods. One method is
variable-rate code-excited linear prediction (VCELP). This approach
partitions the
sampled speech into frames (of typically 20 ms) and each frame is encoded
at a rate of 8, 4, 2, or 1 kbps. The encoding rate to be used for each speech
frame is determined by an "activity measure" selected to monitor how much
energy occurs in the frame. This is useful in two-way communications, since
rarely are both talkers speaking simultaneously, so that while one talker is
speaking, the rate used to encode the sound from the other talker can be kept
to a minimum. This algorithm is used in some cellular telephone systems.
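
The rate selection described above can be sketched as follows. This is a
simplified illustration rather than the algorithm of any deployed codec:
the 8 kHz sampling rate and the energy thresholds are assumptions, the
function names are hypothetical, and the "activity measure" is taken to be
the mean squared amplitude of the frame.

```python
import math

SAMPLE_RATE = 8000                     # assumed telephone-band sampling rate
FRAME_LEN = SAMPLE_RATE * 20 // 1000   # 20 ms frame -> 160 samples

# Hypothetical (energy threshold, rate in kbps) pairs, highest first.
RATE_TABLE = [(0.1, 8), (0.01, 4), (0.001, 2)]


def frame_energy(frame):
    """Activity measure: mean squared amplitude of the frame."""
    return sum(s * s for s in frame) / len(frame)


def choose_rate_kbps(frame):
    """Pick the highest rate whose threshold the frame energy reaches."""
    energy = frame_energy(frame)
    for threshold, rate in RATE_TABLE:
        if energy >= threshold:
            return rate
    return 1  # near-silent frame: lowest rate


def split_frames(signal):
    """Yield consecutive, non-overlapping 20 ms frames."""
    for start in range(0, len(signal) - FRAME_LEN + 1, FRAME_LEN):
        yield signal[start:start + FRAME_LEN]


# One frame of a loud 200 Hz tone followed by one frame of silence.
tone = [0.8 * math.sin(2 * math.pi * 200 * n / SAMPLE_RATE)
        for n in range(FRAME_LEN)]
silence = [0.0] * FRAME_LEN
rates = [choose_rate_kbps(f) for f in split_frames(tone + silence)]
print(rates)
```

The active frame is assigned the full 8 kbps while the silent frame drops
to 1 kbps, which is how the listening side of a conversation costs so
little bandwidth.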
Another method is called improved multiband excitation (IMBE) speech
coding. In this approach the speech is also partitioned into frames, and each
frame is analyzed using the DFT to determine pitch and harmonic
frequencies. The magnitude of the amplitude spectrum is then (coarsely)
quantized and encoded and the phase is not encoded. The decoding involves
synthesizing sinusoids for the encoded frequencies and amplitudes, and
carefully maintaining continuity of phase between one frame and the next.
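
The analyze, quantize, and resynthesize steps can be sketched in the spirit
of this description. This is not the IMBE codec itself: the frame length,
quantization step, and amplitude floor are assumptions, the function names
are hypothetical, and the test frequencies sit exactly on DFT bins to keep
the example short.

```python
import cmath
import math

N = 64  # frame length in samples (assumed; small enough for a naive DFT)


def dft_magnitudes(frame):
    """Amplitude of each DFT bin from 0 to N/2; the phase is discarded."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        x_k = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        scale = 1 if k in (0, n // 2) else 2
        mags.append(scale * abs(x_k) / n)
    return mags


def encode(frame, step=0.25, floor=0.1):
    """Keep (bin, coarsely quantized amplitude) for bins above the floor."""
    return [(k, round(m / step) * step)
            for k, m in enumerate(dft_magnitudes(frame)) if m >= floor]


def decode(params, n=N, start_phase=None):
    """Sum one sinusoid per encoded bin; return samples and ending phases."""
    start_phase = start_phase or {}
    out = [0.0] * n
    end_phase = {}
    for k, amp in params:
        phi = start_phase.get(k, 0.0)
        w = 2 * math.pi * k / n
        for t in range(n):
            out[t] += amp * math.cos(w * t + phi)
        end_phase[k] = (phi + w * n) % (2 * math.pi)
    return out, end_phase


# A fundamental at bin 4 plus a weaker harmonic at bin 8, with arbitrary phases.
frame = [1.0 * math.cos(2 * math.pi * 4 * t / N + 0.3)
         + 0.5 * math.cos(2 * math.pi * 8 * t / N + 1.1) for t in range(N)]
params = encode(frame)
rebuilt, phases = decode(params)
print(params)
```

Passing the returned phases as `start_phase` when decoding the next frame
lets each sinusoid continue where it left off, which is the phase
continuity the decoder has to maintain.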
>> In speech recognition, decoding can be performed using either a
single-pass or a multi-pass approach. Lattice rescoring is a form of
multi-pass decoding in which the lattice is generated in the first pass
using simple, low-order knowledge sources, and rescoring is performed in
the second pass using higher-order knowledge sources.
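
The two-pass idea can be sketched as below. To keep it short, a plain list
of hypotheses stands in for the lattice (a lattice encodes many such
hypotheses compactly), and every score and word sequence here is
hypothetical, as are the function names.

```python
# Hypothetical first-pass output: (word sequence, acoustic log score) pairs.
first_pass = [
    (("recognize", "speech"), -12.0),
    (("wreck", "a", "nice", "beach"), -11.5),
]

# Hypothetical bigram log probabilities: the higher-order knowledge source.
BIGRAM = {
    ("<s>", "recognize"): -1.0,
    ("recognize", "speech"): -0.5,
    ("<s>", "wreck"): -3.0,
    ("wreck", "a"): -1.0,
    ("a", "nice"): -1.0,
    ("nice", "beach"): -2.0,
}
UNSEEN = -5.0  # assumed penalty for word pairs missing from the table


def lm_score(words):
    """Sum the bigram scores over the sentence, starting from <s>."""
    history = ("<s>",) + words[:-1]
    return sum(BIGRAM.get(pair, UNSEEN) for pair in zip(history, words))


def rescore(hypotheses, lm_weight=1.0):
    """Second pass: pick the hypothesis with the best combined score."""
    return max(hypotheses, key=lambda h: h[1] + lm_weight * lm_score(h[0]))


best_first = max(first_pass, key=lambda h: h[1])
best_second = rescore(first_pass)
print(" ".join(best_first[0]))
print(" ".join(best_second[0]))
```

The first pass prefers the acoustically closer hypothesis, while the second
pass overturns that choice once the language model is taken into account.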

III. Why is speech synthesis important? Aside from text-to-speech
conversion, are there any other examples of speech synthesis?
>> By speech synthesis we can, in theory, mean any kind of artificial
production of speech. For example, it can be the process in which a speech
decoder generates the speech signal from the parameters it has received
over the transmission line, or it can be a procedure performed by a
computer to produce some kind of representation of the speech signal from
a text input.
>> Yes. One example is a telephone inquiry system whose information is
frequently updated and which can use TTS to deliver answers to customers.
Speech synthesizers are also important to the visually impaired and to
those who have lost their ability to speak. Several other examples can be
found in everyday life, such as listening to messages and news instead of
reading them, and using hands-free functions through a voice interface in
a car.

IV. Discuss multimedia database management systems. How do they behave
with multimedia applications?
>> Consider a police investigation of a large-scale drug operation. A
multimedia database management system must store and query all of the
following types of data that such an investigation may generate:
a. Video data captured by surveillance cameras that record the activities
taking place at various locations.
b. Audio data captured by legally authorized telephone wiretaps.
c. Image data consisting of still photographs taken by investigators.
d. Document data seized by the police when raiding one or more places.
e. Structured relational data containing background information, bank
records, etc., of the suspects involved.
f. Geographic information system data containing geographic data relevant
to the drug investigation being conducted.

Possible Queries

Image Query (by example):
Police officer Rocky has a photograph in front of him.
He wants to find the identity of the person in the picture.
Query: "Retrieve all images from the image library in which the
person appearing in the (currently displayed) photograph appears."

Image Query (by keywords):
Police officer Rocky wants to examine pictures of Big Spender.
Query: "Retrieve all images from the image library in which Big
Spender appears."

Video Query:
Police officer Rocky is examining a surveillance video of a
particular person being fatally assaulted by an assailant. However,
the assailant's face is occluded and image processing algorithms
return very poor matches. Rocky thinks the assault was by
someone known to the victim.
Query: Find all video segments in which the victim of the assault
appears.
By examining the answer of the above query, Rocky hopes to find
other people who have previously interacted with the victim.

Heterogeneous Multimedia Query:
Find all individuals who have been photographed with Big
Spender and who have been convicted of attempted murder in
South China and who have recently had electronic fund transfers
made into their bank accounts from ABC Corp.
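
A heterogeneous query like this one combines results from several kinds of
data. A minimal sketch, assuming each source has already been reduced to
simple records: all names and records below are hypothetical, and the
"recently" condition is left out (it would add a date filter on the
transfers).

```python
# Hypothetical results of image analysis: who appears in photos with whom.
photographed_with = {"Big Spender": {"Alice", "Bob", "Carol"}}

# Hypothetical structured relational data.
convictions = [
    {"person": "Alice", "offense": "attempted murder", "region": "South China"},
    {"person": "Bob", "offense": "fraud", "region": "South China"},
    {"person": "Dave", "offense": "attempted murder", "region": "South China"},
]
transfers = [
    {"to": "Alice", "from": "ABC Corp"},
    {"to": "Carol", "from": "ABC Corp"},
]


def heterogeneous_query(target, offense, region, payer):
    """Intersect the answers obtained from each data source."""
    in_photos = photographed_with.get(target, set())
    convicted = {c["person"] for c in convictions
                 if c["offense"] == offense and c["region"] == region}
    paid = {t["to"] for t in transfers if t["from"] == payer}
    return in_photos & convicted & paid


result = heterogeneous_query("Big Spender", "attempted murder",
                             "South China", "ABC Corp")
print(result)
```

Each condition is answered by a different kind of data (images, conviction
records, bank records), and the system's job is to combine them behind a
single query.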

V. Create an interactive product using multimedia. One example to cite is an
interactive website. Show your interactive product, even if only as a concept.
Give the following:
1. Title: Disaster Risk Reduction Awareness and Management Application
for Secondary School Students
2. Its purpose:
a. Provide guidance to secondary school students on how to act
before, during, and after a disaster; and
b. Facilitate immediate and efficient information flow during disasters
and emergencies.
3. How can it be used: The end-users will open the application and will be
directed to a gallery of different disasters.
4. Who will benefit: This app will benefit the students and teachers, as well
as the DRRMO.
5. What is the learning outcome once end users or clients use it or try it:
The end users will be able to learn and apply the measures to take
before, during, and after disasters.
