Running Mbrola

Installing MBROLA
Create a directory, for example mbrola (the name is not critical), copy the mbrXXX.zip file
into it (XXX stands for the version number), and unzip the file:
unzip mbrXXX.zip (or pkunzip on PC/DOS)
You are now ready to synthesize your first French words.
First try
mbrola
to get a help screen on how to use the software.
Testing MBROLA
To go further, you need an MBROLA language/voice database from the MBROLA project
homepage. Let us assume you have copied the FR1 database and followed the accompanying
fr1.txt file for its installation. Then try:
mbrola fr1/fr1 bonjour.pho bonjour.wav
to create a sound file for the word 'bonjour'.
Command line format of MBROLA
mbrola diphone_database phonetic_file output_file
The phonetic_file format is described in the section "Phonetic file format of MBROLA V2.04" below.
The output file consists of 16-bit signed integer samples at the sampling frequency of the
MBROLA voice/language database (16 kHz for Fr1, the diphone database supplied by the
author of MBROLA). Since release 2.00b, MBROLA produces .au, .wav, .aiff, .aif, and .raw
file formats depending on the output_file extension. If the extension is not recognized,
the format is RAW.
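As an illustration of the RAW format described above, the following Python sketch wraps headerless samples in a WAV container using the standard wave module. The function name raw_to_wav is hypothetical, and the sketch assumes mono, little-endian samples; the text above only states 16-bit signed samples at the database sampling frequency, so check the byte order on your platform.

```python
import wave


def raw_to_wav(raw_bytes, wav_file, sample_rate=16000):
    """Wrap headerless 16-bit PCM samples in a WAV container.

    wav_file may be a file name or a binary file object.
    Assumptions (not stated in the MBROLA text): mono, little-endian samples.
    Returns the number of samples written.
    """
    if len(raw_bytes) % 2:
        raise ValueError("16-bit PCM data must have an even number of bytes")
    with wave.open(wav_file, "wb") as wav:
        wav.setnchannels(1)            # assumption: one channel
        wav.setsampwidth(2)            # 16 bits = 2 bytes per sample
        wav.setframerate(sample_rate)  # 16 kHz for the Fr1 database
        wav.writeframes(raw_bytes)
    return len(raw_bytes) // 2
```

For a file produced with an unrecognized extension, one would read it in binary mode and pass its contents to raw_to_wav together with the sampling frequency of the database in use.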
Optional time and frequency ratios let you shorten or lengthen the synthetic speech and
transpose its pitch:
mbrola -t 1.2 -f 0.8 fr1/fr1 bonjour.pho bonjour.wav
for instance, produces a RIFF Wav file bonjour.wav that is 1.2 times longer than the previous
one (slower rate), and in which all fundamental frequency values have been multiplied by
0.8 (so the voice sounds deeper).
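To make the arithmetic concrete, here is a small Python sketch that applies the two ratios to the durations and pitch values of the bonjour.pho example listed later in this document. It only illustrates the expected scaling; it does not reproduce how MBROLA applies the ratios internally, and the function name apply_ratios is hypothetical.

```python
def apply_ratios(durations_ms, pitches_hz, time_ratio=1.0, freq_ratio=1.0):
    """Scale phone durations by time_ratio and pitch values by freq_ratio."""
    scaled_durations = [d * time_ratio for d in durations_ms]
    scaled_pitches = [p * freq_ratio for p in pitches_hz]
    return scaled_durations, scaled_pitches


# Durations (ms) and pitch values (Hz) taken from the bonjour.pho example.
durations_ms = [51, 62, 127, 110, 211, 150, 91]
pitches_hz = [114, 170, 116, 91]

scaled_durations, scaled_pitches = apply_ratios(
    durations_ms, pitches_hz, time_ratio=1.2, freq_ratio=0.8
)
print("total duration before:", sum(durations_ms), "ms")
print("total duration after: ", round(sum(scaled_durations), 1), "ms")
print("pitch values after:   ", [round(p, 1) for p in scaled_pitches])
```

With -t 1.2 the nominal total goes from 802 ms to about 962.4 ms, and with -f 0.8 a 114 Hz pitch point becomes about 91.2 Hz.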
A - in place of phonetic_file or output_file means stdin or stdout. On multitasking
machines, it is easy to run the synthesizer in real time and send its output to the
audio device by using pipes.
On Solaris, try:
mbrola fr1/fr1 bonjour.pho -.au | audioplay
On HP-UX, try:
mbrola fr1/fr1 bonjour.pho -.au | splayer
Generally, on a UN*X machine, provided you have SoX, try:
mbrola fr1/fr1 bonjour.pho -.au | sox -t au - -t raw -r 8000 -Ub - > /dev/audio
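The same kind of pipeline can be driven from a script. The Python sketch below builds an mbrola command line from the options described in this document (-t, -f, and the three positional arguments) and connects it to a player with the standard subprocess module. The helper names mbrola_argv and play_through_pipe are hypothetical, and the player command is whatever your system provides (audioplay, splayer, and so on).

```python
import subprocess


def mbrola_argv(database, pho_file, output, time_ratio=None, freq_ratio=None):
    """Build the argument list: mbrola [-t r] [-f r] database pho_file output."""
    argv = ["mbrola"]
    if time_ratio is not None:
        argv += ["-t", str(time_ratio)]
    if freq_ratio is not None:
        argv += ["-f", str(freq_ratio)]
    argv += [database, pho_file, output]
    return argv


def play_through_pipe(database, pho_file, player_argv):
    """Equivalent of: mbrola database pho_file -.au | player"""
    synth = subprocess.Popen(
        mbrola_argv(database, pho_file, "-.au"), stdout=subprocess.PIPE
    )
    player = subprocess.Popen(player_argv, stdin=synth.stdout)
    synth.stdout.close()  # the player now owns the read end of the pipe
    player.wait()
    return synth.wait()
```

For example, play_through_pipe("fr1/fr1", "bonjour.pho", ["audioplay"]) would mirror the Solaris command above, assuming both programs are on the PATH.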
As these examples show, the format of stdout output can be forced with an extension after
the dash. For example, "mbrola fr1/fr1 bonjour.pho -.wav" writes the audio to stdout with a
RIFF Wav header. Note, however, that the audio length declared in the header is invalid
(some audio file manipulation programs will cope with it, others won't): the length is only
known once the whole sentence has been synthesized, and stdout cannot be rewound.
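If a program downstream rejects the invalid length, one workaround is to capture the whole stream first and patch the header afterwards. The Python sketch below does this for a generic RIFF Wav byte string. It assumes a standard chunk layout in which the data chunk is the last chunk, which has not been verified against actual MBROLA output, and the function name fix_wav_sizes is hypothetical.

```python
import struct


def fix_wav_sizes(data):
    """Return a copy of a RIFF Wav byte string with corrected size fields.

    Assumption: the 'data' chunk is the last chunk and runs to the end.
    """
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF Wav stream")
    buf = bytearray(data)
    pos = 12  # first chunk header follows 'RIFF' + size + 'WAVE'
    while pos + 8 <= len(buf):
        chunk_id = bytes(buf[pos:pos + 4])
        if chunk_id == b"data":
            # Size of the sample data actually present in the stream.
            struct.pack_into("<I", buf, pos + 4, len(buf) - (pos + 8))
            # RIFF size covers everything after the first 8 bytes.
            struct.pack_into("<I", buf, 4, len(buf) - 8)
            return bytes(buf)
        (size,) = struct.unpack_from("<I", buf, pos + 4)
        pos += 8 + size + (size & 1)  # chunks are padded to even length
    raise ValueError("no data chunk found")
```

The trade-off is that the whole utterance has to be buffered, so this suits saving piped output to a file rather than real-time playback.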
Phonetic file format of MBROLA V2.04
(For automatic production of pho files from recorded speech please check out
MBROLIGN).
The input file bonjour.pho supplied as an example with FR1 simply contains :
_ 51 25 114
b 62
o~ 127 48 170
Z 110 53 116
u 211
R 150 50 91
_ 91
This shows the format of the input data required by MBROLA. Each line contains a
phoneme name, a duration (in ms), and a series (possibly none) of pitch pattern points
composed of two integer numbers each : the position of the pitch pattern point within
the phoneme (in % of its total duration), and the pitch value (in Hz) at this position.
Hence, the first line of bonjour.pho :
_ 51 25 114
tells the synthesizer to produce a silence of 51 ms, and to put a pitch pattern point of
114 Hz at 25% of 51 ms. Pitch pattern points define a piecewise linear pitch curve.
Notice that the pitch pattern they define is continuous, since the program automatically
drops pitch information when synthesizing unvoiced phones.
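The line format above can be read with a few lines of Python. The sketch below parses phone lines into (name, duration, pitch points) records and converts the pitch pattern points to absolute times, which gives the breakpoints of the piecewise linear pitch curve. The names Phone, parse_pho and pitch_curve are hypothetical. The sketch also assumes that a semi-colon comment runs to the end of the line and it skips the "#" flush command; it does not interpret "T=" or "F=" comments.

```python
from typing import List, NamedTuple, Tuple


class Phone(NamedTuple):
    name: str
    duration_ms: float
    pitch_points: List[Tuple[float, float]]  # (position in %, pitch in Hz)


def parse_pho(text: str) -> List[Phone]:
    """Parse phone lines of the form: name duration [position pitch]..."""
    phones = []
    for raw_line in text.splitlines():
        line = raw_line.split(";", 1)[0].strip()  # assumption: ';' to end of line
        if not line or line == "#":               # skip blanks and the flush command
            continue
        fields = line.split()                     # blanks or tabs separate the data
        if len(fields) < 2 or len(fields) % 2:
            raise ValueError("malformed phone line: %r" % raw_line)
        values = [float(v) for v in fields[2:]]
        points = list(zip(values[0::2], values[1::2]))
        phones.append(Phone(fields[0], float(fields[1]), points))
    return phones


def pitch_curve(phones: List[Phone]) -> List[Tuple[float, float]]:
    """Return (time in ms, pitch in Hz) breakpoints of the pitch curve."""
    curve = []
    start_ms = 0.0
    for phone in phones:
        for position, hz in phone.pitch_points:
            curve.append((start_ms + phone.duration_ms * position / 100.0, hz))
        start_ms += phone.duration_ms
    return curve


BONJOUR = """_ 51 25 114
b 62
o~ 127 48 170
Z 110 53 116
u 211
R 150 50 91
_ 91
"""

if __name__ == "__main__":
    for time_ms, hz in pitch_curve(parse_pho(BONJOUR)):
        print("%.2f ms -> %g Hz" % (time_ms, hz))
```

For bonjour.pho the first breakpoint falls at 12.75 ms (25% of 51 ms) with a pitch of 114 Hz, matching the explanation of the first line above.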
The data on each line is separated by blank characters or tabs. Comments can optionally
be introduced in command (.pho) files, starting with a semi-colon (;). A comment beginning with
"T=ratio" or "F=ratio" changes the time or frequency ratio respectively.
Notice, finally, that the synthesizer outputs synthetic speech in chunks that correspond to
sections of the piecewise linear pitch curve. Phones inside a section of this curve are
synthesized in one go. The last phone of each chunk, however, cannot be properly
synthesized until the next phone is known (since the program uses diphones as
base speech units). When using mbrola with pipes, this may be a problem. Imagine, for
instance, that mbrola is used to create a pipe-based speaking clock on an HP:
speaking_clock | mbrola fr1/fr1 - -.au | splayer
which tells the time, say, every 30 seconds. The last phone of each time announcement
will only be synthesized when the next announcement starts. To bypass this problem,
mbrola accepts a special command phone, "#", which flushes the synthesis buffer.
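A generator program such as the speaking clock could end every announcement with the flush command. The Python sketch below shows one way to do that; write_utterance is a hypothetical helper, and placing "#" on a line of its own is an assumption. It also flushes Python's own output buffer, since otherwise the phones might not reach the pipe until later.

```python
import sys


def write_utterance(phone_lines, stream=sys.stdout):
    """Write one utterance to stream, then ask mbrola to flush its buffer."""
    for line in phone_lines:
        stream.write(line.rstrip("\n") + "\n")
    stream.write("#\n")  # special command phone: flush the synthesis buffer
    stream.flush()       # make sure the lines leave this process right away
```

A speaking clock would call write_utterance once per announcement, with the phone lines for the current time, and sleep between calls.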
Limitations of the program
1. There may be up to 20 pitch pattern points in each phone, although two or three are
usually sufficient to copy natural prosody. The limit has been set this high to enable
the use of MBROLA to produce synthetic singing voices, in which case long vowels with
vibrato may require a large number of pitch pattern points.
2. Phones can be synthesized with a maximum duration which depends on the fundamental
frequency with which they are produced. The higher the frequency, the shorter the maximum
duration. For a frequency of 133 Hz, the maximum duration is 7.5 sec. For a frequency of
66.5 Hz, it is 15 sec. For a frequency of 266 Hz, it is 3.75 sec.
3. Although pitch pattern points are optional, the synthesizer will refuse to produce
sequences of more than 250 phones with no pitch information.
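The three figures in item 2 are consistent with a maximum duration inversely proportional to the frequency (133 x 7.5 = 66.5 x 15 = 266 x 3.75 = 997.5). The Python sketch below uses that inferred constant to check phones against the three limits before they are sent to the synthesizer. The constant is deduced from the examples above rather than from the MBROLA sources, the function names are hypothetical, and using the highest pitch point of a phone for the duration check is a conservative assumption.

```python
MAX_PITCH_POINTS = 20          # limit 1
MAX_UNPITCHED_RUN = 250        # limit 3
DURATION_TIMES_F0 = 133 * 7.5  # inferred from limit 2: seconds x Hz = 997.5


def max_duration_s(f0_hz):
    """Estimated maximum phone duration (seconds) at a given frequency."""
    return DURATION_TIMES_F0 / f0_hz


def check_phones(phones):
    """phones: list of (name, duration_ms, [(position_percent, hz), ...]).

    Returns a list of human-readable problems (empty if none were found).
    """
    problems = []
    unpitched_run = 0
    for index, (name, duration_ms, points) in enumerate(phones):
        if len(points) > MAX_PITCH_POINTS:
            problems.append("phone %d (%s): too many pitch points" % (index, name))
        if points:
            unpitched_run = 0
            highest_hz = max(hz for _, hz in points)  # conservative choice
            if duration_ms / 1000.0 > max_duration_s(highest_hz):
                problems.append("phone %d (%s): too long" % (index, name))
        else:
            unpitched_run += 1
            if unpitched_run == MAX_UNPITCHED_RUN + 1:
                problems.append("phone %d (%s): over 250 phones without pitch"
                                % (index, name))
    return problems
```

For instance, max_duration_s(66.5) gives 15.0 and max_duration_s(266) gives 3.75, matching the figures quoted in item 2.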
Last updated December 17, 1999, send comments to Mbrola Team
