
Audio Fingerprinting

PRESENTED BY JEYBIN M GEORGE

What is Audio Fingerprinting?

A small, unknown segment of audio data (it can be as short as just a couple of seconds) is used to identify the original audio file from which it came

Definition
An audio fingerprint can be seen as a short summary of an audio object. Therefore a fingerprint function F should map an audio object X, consisting of a large number of bits, to a fingerprint of only a limited number of bits.

Audio Fingerprint System Parameters


Robustness
Reliability
Fingerprint size
Granularity
Search speed and scalability

Applications

Broadcast monitoring
Connected audio
Filtering technology for file sharing
Automatic music library organization
Other
Napster: use of fingerprinting systems to prohibit the transmission of copyrighted materials
Finding desired content efficiently in an overwhelming amount of audio material

Benefits

Automated search of illegal content on the Internet


examines the real audio information rather than just tag information

For the consumer


Makes the metadata of songs in a library consistent, allowing for easy organization
Can guarantee that what is downloaded is actually what it says it is
Will allow consumers to record signatures of sound and music on small handheld devices

Two principal components

Compute the fingerprint

Compare it to a database of previously computed fingerprints.
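
The comparison step can be sketched as follows. This is a minimal illustration, not the method of any particular system: it assumes each fingerprint is a list of 32-bit integers, that query and reference fingerprints have equal length, and that similarity is measured by the bit error rate. The function names and the 0.35 threshold are illustrative assumptions.

```python
from typing import Dict, List, Optional, Tuple


def bit_error_rate(a: List[int], b: List[int]) -> float:
    """Fraction of differing bits between two equal-length lists of 32-bit words."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    differing = sum(bin((x ^ y) & 0xFFFFFFFF).count("1") for x, y in zip(a, b))
    return differing / (32 * len(a))


def best_match(
    query: List[int],
    database: Dict[str, List[int]],
    threshold: float = 0.35,  # illustrative value, not taken from the slides
) -> Optional[Tuple[str, float]]:
    """Return (title, bit error rate) of the closest entry, or None if none is close enough."""
    best: Optional[Tuple[str, float]] = None
    for title, reference in database.items():
        ber = bit_error_rate(query, reference)
        if best is None or ber < best[1]:
            best = (title, ber)
    if best is not None and best[1] <= threshold:
        return best
    return None


database = {
    "song_a": [0xFFFFFFFF, 0x00000000],
    "song_b": [0x0F0F0F0F, 0xF0F0F0F0],
}
# A query that differs from song_a in two bits still matches song_a.
print(best_match([0xFFFFFFFE, 0x00000001], database))
```

The linear scan above is only meant to show the comparison itself. A database of realistic size would need an index to meet the search speed and scalability parameter listed earlier.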

GENERAL FRAMEWORK

Extraction

Techniques (general)
Any segment of x seconds may be used to compute the fingerprint
Audio gets separated into frames

Features computed for each frame:


Fourier coefficients
MFCC, LPC
Spectral flatness
Sharpness

Features are mapped into a more compact representation using an HMM or quantization
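
The pipeline above (framing, one feature per frame, compact representation) can be sketched with the standard library alone. This is an illustration under stated assumptions: spectral flatness is chosen as the single feature, a naive DFT stands in for an FFT, and uniform 16-level quantization stands in for the "quantization" step. The function names, frame length, and hop size are hypothetical.

```python
import cmath
import math
from typing import List


def split_into_frames(samples: List[float], frame_len: int, hop: int) -> List[List[float]]:
    """Cut the signal into (possibly overlapping) frames of frame_len samples."""
    return [samples[i:i + frame_len] for i in range(0, len(samples) - frame_len + 1, hop)]


def power_spectrum(frame: List[float]) -> List[float]:
    """One-sided power spectrum via a naive DFT (fine for short illustrative frames)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def spectral_flatness(power: List[float], eps: float = 1e-12) -> float:
    """Geometric mean divided by arithmetic mean: near 1 for noise-like, near 0 for tonal."""
    log_mean = sum(math.log(p + eps) for p in power) / len(power)
    arith_mean = sum(power) / len(power)
    return math.exp(log_mean) / (arith_mean + eps)


def quantize(value: float, levels: int = 16) -> int:
    """Map a value in [0, 1] to an integer code in 0..levels-1."""
    clipped = min(max(value, 0.0), 1.0)
    return min(int(clipped * levels), levels - 1)


# Hypothetical usage: 64-sample frames with a hop of 32 samples.
signal = [math.sin(2 * math.pi * 4 * t / 64) for t in range(128)]
codes = [quantize(spectral_flatness(power_spectrum(f)))
         for f in split_into_frames(signal, frame_len=64, hop=32)]
print(codes)
```

A real extractor would compute several features per frame and typically use a windowed FFT. The point here is only the shape of the data flow: samples, then frames, then features, then a small integer code per frame.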

The fingerprint extraction


The fingerprint extraction derives a set of relevant perceptual characteristics of a recording in a concise and robust form. The fingerprint extraction consists of a front-end and a fingerprint modeling block.

Computing the fingerprint

Compare to hash functions?


compare computed hash value with that stored in a database

Drawback
need to worry about perceptual similarity and not mathematical similarity
PCM audio vs. MP3: both sound alike but mathematically (i.e. spectral content) are quite different

perceptual similarity is not transitive


not possible to design a system which computes mathematically equal fingerprints for perceptually similar objects
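
The drawback can be seen in a small experiment. The sketch below assumes 16-bit PCM samples and uses SHA-256 from Python's `hashlib` as a stand-in for "a hash function". The test tone and helper names are illustrative. Changing a single sample by one quantization step, which is inaudible, yields a completely different digest.

```python
import hashlib
import math
import struct
from typing import List


def pcm16_bytes(samples: List[int]) -> bytes:
    """Pack integer samples as little-endian signed 16-bit PCM."""
    return struct.pack("<%dh" % len(samples), *samples)


def differing_bits(a: bytes, b: bytes) -> int:
    """Number of bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# One second of a 440 Hz tone at an 8 kHz sample rate (illustrative values).
tone = [int(10000 * math.sin(2 * math.pi * 440 * t / 8000)) for t in range(8000)]

# Perceptually identical copy: one sample moved by a single quantization step.
nearly_identical = list(tone)
nearly_identical[100] += 1

digest_a = hashlib.sha256(pcm16_bytes(tone)).digest()
digest_b = hashlib.sha256(pcm16_bytes(nearly_identical)).digest()
changed = differing_bits(digest_a, digest_b)
print("digests equal:", digest_a == digest_b)
print("differing digest bits out of 256:", changed)
```

Because a cryptographic hash is designed so that any input change scrambles the output, exact digest lookup cannot identify perceptually similar audio. A fingerprint has to be compared by similarity instead.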

Techniques

one 32-bit sub-fingerprint every 11.6 ms


A block consists of 256 sub-fingerprints
Corresponds to a granularity of only 3 seconds

Large overlap (31/32), so subsequent sub-fingerprints are similar and vary slowly in time
Worst-case scenario: the frame boundaries used during identification are 5.8 ms off from those in the database
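
The figures on this slide are consistent with one another, as a short calculation shows. The sketch assumes that an overlap of 31/32 means the hop between frames is 1/32 of the frame length, and that the worst-case misalignment is half a hop. The frame length itself is derived here and is not stated on the slide.

```python
HOP_MS = 11.6                  # one sub-fingerprint every 11.6 ms (from the slide)
SUBFINGERPRINTS_PER_BLOCK = 256
BITS_PER_SUBFINGERPRINT = 32
OVERLAP = 31 / 32

# Duration covered by one fingerprint block: about 3 seconds.
block_seconds = SUBFINGERPRINTS_PER_BLOCK * HOP_MS / 1000

# Frame length implied by the hop and the overlap (derived, not from the slide).
frame_ms = HOP_MS / (1 - OVERLAP)

# Frame boundaries can be misaligned by at most half a hop.
worst_offset_ms = HOP_MS / 2

# Size of one fingerprint block in bits.
block_bits = SUBFINGERPRINTS_PER_BLOCK * BITS_PER_SUBFINGERPRINT

print(f"block duration: {block_seconds:.2f} s")
print(f"implied frame length: {frame_ms:.1f} ms")
print(f"worst-case offset: {worst_offset_ms:.1f} ms")
print(f"block size: {block_bits} bits")
```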

Techniques

Data from each frame is sent through a filterbank


33 filters between 300 and 2000 Hz, logarithmically spaced (to correspond roughly to the Bark scale)

phase is neglected (perceptual reasons)
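
A sketch of how 33 band energies can yield a 32-bit sub-fingerprint follows. The band layout comes from the slide. The bit derivation (the sign of the energy difference between adjacent bands, differenced again across consecutive frames) follows the scheme described by Haitsma and Kalker and is an assumption here, since the slides do not spell it out. Function names are illustrative, and only the power spectrum is used, so phase is discarded.

```python
from typing import List


def band_edges(low_hz: float = 300.0, high_hz: float = 2000.0, bands: int = 33) -> List[float]:
    """Edges of `bands` logarithmically spaced bands (bands + 1 values)."""
    ratio = (high_hz / low_hz) ** (1.0 / bands)
    return [low_hz * ratio ** i for i in range(bands + 1)]


def band_energies(power: List[float], sample_rate: float, edges: List[float]) -> List[float]:
    """Sum a one-sided power spectrum (length n/2 + 1) into the given bands."""
    n_fft = 2 * (len(power) - 1)
    energies = [0.0] * (len(edges) - 1)
    for k, p in enumerate(power):
        freq = k * sample_rate / n_fft
        for b in range(len(energies)):
            if edges[b] <= freq < edges[b + 1]:
                energies[b] += p
                break
    return energies


def sub_fingerprint(prev: List[float], cur: List[float]) -> int:
    """One bit per adjacent band pair: 33 band energies give a 32-bit integer."""
    bits = 0
    for m in range(len(cur) - 1):
        diff = (cur[m] - cur[m + 1]) - (prev[m] - prev[m + 1])
        bits = (bits << 1) | (1 if diff > 0 else 0)
    return bits


edges = band_edges()
# Toy energies: silence followed by a frame whose energy falls with frequency.
previous = [0.0] * 33
current = [float(33 - m) for m in range(33)]
print(hex(sub_fingerprint(previous, current)))
```

Using only the sign of differences makes each bit insensitive to overall level changes, which is one reason such a scheme is a plausible fit for the robustness parameter listed earlier.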

Techniques

Audio is downsampled to 11.025 kHz and split into frames with an overlap factor of 2


The MCLT (modulated complex lapped transform) is then applied to each frame. A 128-sample log spectrum is generated by taking the log modulus of each MCLT coefficient

Conclusion
Much of the literature emphasizes the range of uses that fingerprinting opens up to the average consumer. Firstly, audio fingerprinting will make the metadata of the songs in a library consistent, allowing easy organization. Secondly, fingerprinting can guarantee that what is downloaded is actually what it says it is. Most importantly, the algorithm is efficient enough to run on small handheld devices. The other application of this technology is protecting audio-related intellectual property by monitoring the Internet. File names and extensions would no longer matter, as algorithms could automatically scan the Internet for the actual audio content rather than just tag information.
