
Chapter III.

The Audio Signal

1. Analog
Any disturbance (mechanical energy) that propagates through a material medium as a wave is called sound. This includes vibrations at frequencies outside the sensitivity range of the ear: infrasound (below 20 Hz) and ultrasound (above 20 kHz). From a physiological point of view, sound is the sensation produced in the auditory organ by the material vibrations of bodies, transmitted by acoustic waves. The human ear is sensitive to air vibrations with frequencies between 20 Hz and 20 kHz, with maximum auditory sensitivity around 3500 Hz.

A particular case of sound is noise, which is marked by the objective or subjective absence of informational content. Noise disturbs either through the unpleasant sensation it produces or through its negative effect on the transmission of information.

Sound is characterized by four attributes: pitch, duration, intensity and timbre. Pitch corresponds to frequency (measured in Hz). Intensity corresponds to the sound intensity level (measured in dB).

Types of sound:
Associated sound - an audio-frequency signal that accompanies a television picture.
Complex sound - a sound composed of several pure tones.
Reverberated sound - a sound that persists after a sound source stops emitting, prolonging the original sound for a finite time.
Wobbulated sound - a sound whose frequency varies periodically around a mean value, used in electroacoustic measurements.

Characteristics of sound:
Amplitude is the characteristic of sound waves that we perceive as loudness. The frequency of a sound is the number of periods, or oscillations, that a sound wave completes in a given time. Frequency is measured in hertz, or cycles per second. Sound waves propagate at both high and low frequencies. Sound intensity is measured in decibels (dB). For example, the intensity at the threshold of hearing is 0 dB, the intensity of a whisper is on average 10 dB, and the intensity of rustling leaves is 20 dB.

Reflection. The result of sound reflection is an echo.
A megaphone is a horn-shaped tube that forms a beam of sound waves by reflecting some of the diverging rays off the walls of the tube. A similar tube can gather sound waves if its larger end is pointed toward the sound source; the human outer ear is such a device.

Refraction. In a medium of uniform density, sound travels forward in a straight line but, as with light, sound is subject to refraction, which bends sound waves away from their original direction.

Speed of sound. The frequency of a sound wave is a measure of the number of waves that pass a given point in one second. The distance between two successive crests of the wave is called the wavelength. The product of wavelength and frequency equals the propagation speed of the wave, which is the same for sounds of any frequency (provided the sound propagates in the same medium at the same temperature).

Digitizing sound takes place in three stages:
Conditioning the analog signal and passing it through an analog-to-digital converter;
Sampling the converted signal, so that a small volume of data is kept that still approximates the shape of the original audio signal well enough; this consists of slicing the analog signal 5,500 to 48,000 times per second and keeping the measured values; the denser the sampling, the better the approximation of the original signal shape, but the more values there are to store in the file;
Storing the numeric data on an external storage medium in a standard format.

The critical stage in digitizing sound is sampling the signal. This means slicing the analog signal horizontally (along the time axis) a number of times per second, between 4,500 and 40,000.
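The sampling stage and the wavelength relation can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the 440 Hz test tone is arbitrary, and the speed of 343 m/s assumes air at roughly 20 degrees Celsius.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, duration_s, amplitude=1.0):
    """Return the amplitude of a sine tone at evenly spaced sampling instants."""
    n_samples = int(round(sample_rate_hz * duration_s))
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(n_samples)
    ]

def wavelength_m(freq_hz, speed_m_s=343.0):
    """Wavelength from speed = wavelength * frequency.

    The default speed is an assumption (air at about 20 degrees Celsius).
    """
    return speed_m_s / freq_hz

# The same 10 ms of a 440 Hz tone, sampled sparsely and densely.
coarse = sample_sine(440.0, 8000, 0.01)
fine = sample_sine(440.0, 44100, 0.01)
print(len(coarse), len(fine))  # denser sampling means more values to store
print(wavelength_m(440.0))     # wavelength of the tone in metres
```

The denser sampling follows the waveform more closely, at the cost of about five and a half times as many stored values for the same duration.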

Figure 1. Graphical representation of sound

The vocal cords vibrate and the eardrum receives these vibrations. The transfer takes place through the motion of air molecules, which allows the vibrations to be perceived. In analog form, the fluctuating vibration is translated into a continuous variation of voltage, producing an oscillating electrical wave that is applied to the loudspeaker diaphragm.

The advantages of digitization are:
Much easier storage and manipulation;
Information quality is preserved when copying to another medium, unlike the analog form, whose quality degrades with each copy;
Far less degradation of the physical storage medium, in the case of sound files, compared with the analog form.

The most widely used sampling frequencies are 8 kHz (for spoken announcements), 11 kHz (for voice recordings made through a microphone or telephone), as well as 22 kHz and 44.1 kHz (the latter used for CD audio). Besides the horizontal resolution, sound quality also depends on the vertical resolution, that is, on the interval between the loudest sound and the quietest sound. This interval, also called the dynamic range, depends on the precision of the digitized sound, that is, on the precision of the number stored for the sound amplitude at each sampling instant. In this respect there are two widespread standards, 8 bits and 16 bits, with 12 bits used occasionally.
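The effect of vertical resolution can be illustrated with a small Python sketch. The function names are hypothetical, and the dynamic range figure uses the common approximation of about 6 dB per bit, which is a textbook rule of thumb and not a value taken from this chapter.

```python
import math

def quantize(value, bits):
    """Map a value in [-1.0, 1.0] to a signed integer code of the given width."""
    max_code = 2 ** (bits - 1) - 1
    clipped = max(-1.0, min(1.0, value))
    return int(round(clipped * max_code))

def dynamic_range_db(bits):
    """Ratio between full scale and the smallest step, in dB (about 6 dB per bit)."""
    return 20 * math.log10(2 ** bits)

def pcm_bytes(sample_rate_hz, bits, channels, seconds):
    """Size in bytes of uncompressed PCM audio."""
    return sample_rate_hz * bits * channels * seconds // 8

for bits in (8, 12, 16):
    print(bits, quantize(1.0, bits), round(dynamic_range_db(bits), 1))

# One minute of stereo audio at the CD rate and 16 bits per sample.
print(pcm_bytes(44100, 16, 2, 60))
```

Each extra bit doubles the number of amplitude levels, so moving from 8 to 16 bits roughly doubles the dynamic range expressed in dB while also doubling the storage needed per sample.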

2. AES
AES3 is a standard for carrying digital audio signals between professional audio devices. It is also known as AES/EBU and was published by the Audio Engineering Society (AES); it has been incorporated into IEC 60958. It was developed by the AES and the European Broadcasting Union (EBU), first published in 1985 and later revised in 1992 and 2003. It can carry two channels of PCM audio over several transmission media, including optical fiber.

Hardware connections: The AES3 standard parallels part 4 of the international standard IEC 60958. Of the physical interconnection types defined by IEC 60958, three are in common use:

IEC 60958 Type I Balanced: 3-conductor, 110-ohm twisted pair cabling with an XLR connector, used in professional installations (AES3 standard).
IEC 60958 Type II Unbalanced: 2-conductor, 75-ohm coaxial cable with an RCA connector, used in consumer audio (coaxial S/PDIF).
IEC 60958 Type II Optical: optical fiber, usually plastic but occasionally glass, with an F05 connector, also used in consumer audio (optical S/PDIF).

The AES-3id standard defines a 75-ohm BNC electrical variant of AES3. This uses the same cabling, patching and infrastructure as analogue or digital video, and is thus common in the broadcast industry. F05 connectors, 5 mm connectors for plastic optical fiber, are more commonly known by their Toshiba brand name, TOSLINK. The precursor of the IEC 60958 Type II specification was the Sony/Philips Digital Interface, or S/PDIF.

Connector types:

IEC 60958 Type I Balanced:
INPUT: XLR male plug (cable) mates to XLR female jack (device)
OUTPUT: XLR female plug (cable) mates to XLR male jack (device)

IEC 60958 Type II Unbalanced:
INPUT: RCA male plug (cable) mates to RCA female jack (device)
OUTPUT: RCA male plug (cable) mates to RCA female jack (device)

IEC 60958 Type II Optical fiber:
INPUT: Fiber male plug (cable) mates to TOSLINK female jack (device)
OUTPUT: Fiber male plug (cable) mates to TOSLINK female jack (device)

Protocol: AES/EBU was designed primarily to support stereo PCM encoded audio in either DAT format at 48 kHz or CD format at 44.1 kHz. No attempt was made to use a carrier able to support both rates; instead, AES/EBU allows the data to be run at any rate, and recovers the clock rate by encoding the data using biphase mark code (BMC).

Each sample time, one 64-bit frame is transmitted. This is divided into two 32-bit subframes or channels containing one sample each: A (left) and B (right). Each subframe consists of 32 time slots used to transmit individual data bits or synchronization information. 24 bits are available for audio data, of which 20 bits are normally used. 192 consecutive frames are grouped into an audio block. Certain status information is transmitted once per audio block. At the default 48 kHz sample rate, there are 250 audio blocks per second, and 3,072,000 bits per second with a biphase clock of 6.144 MHz.

The 32 time slots of each subframe are used as follows:

Time slots 0 to 3: These slots contain a specially coded preamble that identifies the subframe and its position within the audio block. They are not normal BMC-encoded data bits, although they do still have zero DC bias. Three preambles are possible:

X (or M): 11100010 if previous time slot was "0", 00011101 if it was "1". (Equivalently, 10010011 NRZI encoded.) Marks a word for channel A (left), other than at the start of an audio block.
Y (or W): 11100100 if previous time slot was "0", 00011011 if it was "1". (Equivalently, 10010110 NRZI encoded.) Marks a word for channel B (right).
Z (or B): 11101000 if previous time slot was "0", 00010111 if it was "1". (Equivalently, 10011100 NRZI encoded.) Marks a word for channel A (left) at the start of an audio block.

They are called X, Y, Z in the AES3 standard; and M, W, B in IEC 958 (an AES extension). The 8-bit preambles are transmitted in the time allocated to the first four time slots of each subframe (time slots 0 to 3). Any of the three marks the beginning of a subframe. X or Z marks the beginning of a frame, and Z marks the beginning of an audio block.

Time slots 4 to 7: If the audio word length is more than 20 bits, these slots carry the least significant bits of the audio sample data. If the audio word length is 20 bits (the default) or less, these time slots can carry auxiliary information such as a low-quality auxiliary audio channel for producer talkback or recording studio-to-studio communication.

Time slots 8 to 27: These time slots carry 20 bits of audio information starting with the LSB and ending with the MSB. If the source provides fewer than 20 bits, the unused LSBs are set to logical 0 (for example, for the 16-bit audio read from CDs, bits 8-11 are set to 0).
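The line coding and the rate arithmetic described above can be checked with a short Python sketch. The function names are hypothetical, and "previous time slot" is taken here to mean the line level of the preceding half-bit cell, which is an assumption of this illustration.

```python
def nrzi(bits, prev_level):
    """NRZI: a 1 toggles the line level, a 0 keeps it. Returns one level per cell."""
    levels = []
    level = prev_level
    for bit in bits:
        if bit:
            level ^= 1
        levels.append(level)
    return levels

def bmc_encode(data_bits, prev_level):
    """Biphase mark code: each data bit becomes two cells, with a transition at
    the start of every bit and an extra mid-bit transition when the bit is 1."""
    cells = []
    for bit in data_bits:
        cells.extend([1, bit])  # forced toggle, then a toggle only for a 1
    return nrzi(cells, prev_level)

# NRZI forms of the three preambles, as listed in the text.
PREAMBLES_NRZI = {"X": "10010011", "Y": "10010110", "Z": "10011100"}

def preamble_cells(name, prev_level):
    """Line levels of a preamble, given the level of the preceding cell."""
    bits = [int(c) for c in PREAMBLES_NRZI[name]]
    return "".join(str(level) for level in nrzi(bits, prev_level))

def link_rates(sample_rate_hz):
    """Blocks per second, bit rate and biphase clock for a given sample rate."""
    bit_rate = sample_rate_hz * 64  # one 64-bit frame per sample time
    return {
        "blocks_per_s": sample_rate_hz / 192,  # 192 frames per audio block
        "bit_rate": bit_rate,
        "biphase_clock_hz": 2 * bit_rate,      # two cells per data bit
    }

print(preamble_cells("X", 0), preamble_cells("X", 1))
print(bmc_encode([1, 0], 0))
print(link_rates(48000))
```

Because every data bit occupies two cells, the biphase clock is twice the bit rate, which is where the 6.144 MHz figure at 48 kHz comes from. The preambles break the rule of a transition at every bit boundary, which is what lets a receiver tell them apart from data.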

Time slots 28 to 31: These time slots carry associated bits as follows:

V (28) Validity bit: it is set to zero if the audio sample word data are correct and suitable for D/A conversion. Otherwise, the receiving equipment may be instructed to mute its output during the presence of defective samples. It is used by most CD players to indicate that concealment rather than error correction is taking place.
U (29) User bit: any kind of data such as running time, song, track number, etc. One bit per audio channel per frame forms a serial data stream. Each channel of each audio block has a single 192-bit control word.
C (30) Channel status bit: like the user bit, the bits from each frame of an audio block are grouped to make a 192-bit channel status word. Its structure depends on whether AES/EBU or S/PDIF is used.
P (31) Parity bit: for error detection. A parity bit is provided to permit the detection of an odd number of errors resulting from malfunctions in the interface. If used, it is set to provide even parity over bits 4-31.
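The slot layout described above can be sketched as a function that builds time slots 4 to 31 of one subframe. The function name and the list representation are hypothetical; the preamble in slots 0 to 3 is left out because it is not made of ordinary data bits. Samples shorter than 24 bits are aligned to the MSB in slot 27, so the unused low slots stay at 0, as the text describes for 16-bit CD audio.

```python
def build_subframe_payload(sample, word_bits=24, validity=0, user=0, status=0):
    """Return time slots 4..31 of a subframe as a list of 28 bits.

    Index 0 of the returned list is slot 4 and index 27 is slot 31.
    """
    if not 1 <= word_bits <= 24:
        raise ValueError("word_bits must be between 1 and 24")
    # Keep the low word_bits of the sample (two's complement for negatives),
    # then align the MSB with slot 27 by padding zeros below it.
    word = (sample & ((1 << word_bits) - 1)) << (24 - word_bits)
    slots = [(word >> i) & 1 for i in range(24)]  # slots 4..27, LSB first
    slots += [validity, user, status]             # slots 28, 29, 30
    parity = sum(slots) % 2                       # even parity over slots 4..31
    slots.append(parity)                          # slot 31
    return slots

payload = build_subframe_payload(0x1234, word_bits=16, status=1)
print(payload)
print(sum(payload) % 2)  # 0 whenever the parity bit is set correctly
```

A receiver can run the same sum over slots 4 to 31: a non-zero result modulo 2 means an odd number of bits were corrupted in transit.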

Fig 2. Simple representation of the protocol for both AES/EBU and S/PDIF

Channel status word in AES/EBU

As stated before, there is one channel status bit in each subframe, making one 192-bit word for each channel in each block. This 192-bit word is usually presented as 192/8 = 24 bytes. The contents of the channel status word are completely different between the AES3 and S/PDIF standards, although they agree that the first channel status bit (byte 0 bit 0) distinguishes between the two. In the case of AES3, the standard describes in detail how the bits have to be used. Here is a summary of the channel status word:

byte 0: basic control data: sample rate, compression, emphasis

bit 0: A value of 1 indicates this is AES/EBU channel status data. 0 indicates this is S/PDIF data.
bit 1: A value of 0 indicates this is linear audio PCM data. A value of 1 indicates other (usually non-audio) data.
bits 2-4: Indicates the type of signal preemphasis applied to the data. Generally set to 100 (none).
bit 5: A value of 0 indicates that the source is locked to some (unspecified) external time sync. A value of 1 indicates an unlocked source.
bits 6-7: Sample rate. These bits are redundant when real-time audio is transmitted (the receiver can observe the sample rate directly), but are useful if AES/EBU data is recorded or otherwise stored. Options are unspecified, 48 kHz (the default), 44.1 kHz, and 32 kHz.

byte 1: indicates if the audio stream is stereo, mono or some other combination.
bits 0-3: Indicates the relationship of the two channels; they might be unrelated audio data, a stereo pair, duplicated mono data, music and voice commentary, or a stereo sum/difference code.
bits 4-7: Used to indicate the format of the user channel word.

byte 2: audio word length

bits 0-2: Aux bits usage. This indicates how the aux bits (time slots 4-7) are used. Generally set to 000 (unused) or 001 (used for 24-bit audio data).
bits 3-5: Word length. Specifies the sample size, relative to the 20- or 24-bit maximum. Can specify 0, 1, 2 or 4 missing bits. Unused bits are filled with 0, but audio processing functions such as mixing will generally fill them in with valid data without changing the effective word length.
bits 6-7: Unused

byte 3: used only for multichannel applications

byte 4: Additional sample rate information.

bits 0-1: indicate the grade of the sample rate reference, per AES11.
bit 2: reserved
bits 3-6: Extended sample rate. This indicates other sample rates, not representable in byte 0 bits 6-7. Values are assigned for 24, 96, and 192 kHz, as well as 22.05, 88.2, and 176.4 kHz.
bit 7: This "sampling frequency scaling flag", if set, indicates that the sample rate is multiplied by 1/1.001 to match NTSC video frame rates.

byte 5: reserved

bytes 6-9: Four ASCII characters for indicating channel origin. Widely used in large studios.

bytes 10-13: Four ASCII characters indicating channel destination, to control automatic switchers. Less often used.

bytes 14-17: 32-bit sample address, incrementing block-to-block by 192 (because there are 192 frames per block). At 48 kHz, this wraps every 24h51m18.485333s.

bytes 18-21: as above, but offset to indicate samples since midnight.[2]

byte 22: contains information about the reliability of the channel status word.
bits 0-3: reserved
bit 4: if set, bytes 0-5 (signal format) are unreliable.
bit 5: if set, bytes 6-13 (channel labels) are unreliable.
bit 6: if set, bytes 14-17 (sample address) are unreliable.
bit 7: if set, bytes 18-21 (timestamp) are unreliable.

byte 23: CRC. This byte is used to detect corruption of the channel status word, as might be caused by switching mid-block. (Generator polynomial is x^8 + x^4 + x^3 + x^2 + 1, preset to 1.)
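As a rough illustration of the layout summarized above, the following Python sketch groups the 192 channel status bits of one block into 24 bytes and reads back the fields whose meaning is stated unambiguously in the text. All names are hypothetical. Storing bit 0 of each byte as the least significant bit is an assumption of this sketch, as is reading the ASCII labels with that same bit order. The multi-bit code fields (emphasis, sample rate, word length) and the CRC are deliberately left undecoded, because their exact bit-level conventions are not given here.

```python
def pack_status_bits(bits):
    """Group 192 channel status bits (one per frame) into 24 bytes.

    Assumption: bit 0 of each byte is stored as the least significant bit.
    """
    if len(bits) != 192:
        raise ValueError("a channel status block has exactly 192 bits")
    return bytes(
        sum(bits[8 * i + j] << j for j in range(8)) for i in range(24)
    )

def status_bit(block, byte_index, bit_index):
    """Read a single bit of the channel status block."""
    return (block[byte_index] >> bit_index) & 1

def decode_channel_status(block):
    """Decode the single-bit flags and ASCII labels described in the text."""
    return {
        "professional": bool(status_bit(block, 0, 0)),  # 1 = AES/EBU data
        "non_audio": bool(status_bit(block, 0, 1)),     # 0 = linear PCM
        "unlocked": bool(status_bit(block, 0, 5)),      # 0 = locked source
        "origin": block[6:10].decode("ascii", errors="replace"),
        "destination": block[10:14].decode("ascii", errors="replace"),
        "unreliable": {
            "signal_format": bool(status_bit(block, 22, 4)),
            "channel_labels": bool(status_bit(block, 22, 5)),
            "sample_address": bool(status_bit(block, 22, 6)),
            "timestamp": bool(status_bit(block, 22, 7)),
        },
    }

def address_wrap_seconds(sample_rate_hz):
    """Time until the 32-bit sample address in bytes 14-17 wraps around."""
    return 2 ** 32 / sample_rate_hz

# Build an example block: AES/EBU flag set, unlocked source, origin "STU1".
example_bits = [0] * 192
example_bits[0] = 1
example_bits[5] = 1
for i, ch in enumerate(b"STU1"):
    for j in range(8):
        example_bits[8 * (6 + i) + j] = (ch >> j) & 1
info = decode_channel_status(pack_status_bits(example_bits))
print(info["professional"], info["unlocked"], info["origin"])
print(address_wrap_seconds(48000))
```

The wrap time at 48 kHz comes out to about 89,478.49 seconds, which matches the 24h51m18.485333s quoted for bytes 14-17.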
