==============================================
    Speech recognition script for Asterisk
==============================================

This script uses Google's speech recognition engine to convert speech
to text and return it to the dialplan as an Asterisk channel variable.

------------
Requirements
------------
Perl            The Perl Programming Language
perl-libwww     The World-Wide Web library for Perl
IO-Socket-SSL   Perl module that implements an interface to SSL sockets
flac            Free Lossless Audio Codec
Internet access in order to contact Google and get the speech data.

** Optional/Highly experimental **
speex           Patent-free audio compression format designed for speech.
                Works only with a patched speex encoder that supports
                MIME "x-speex-with-header-byte":
                https://github.com/zaf/Speex-with-header-bytes

------------
Installation
------------
To install, copy speech-recog.agi to your agi-bin directory.
Usually this is /var/lib/asterisk/agi-bin/
To make sure, check your /etc/asterisk/asterisk.conf file.
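
As a hedged illustration of what to look for: the agi-bin location is normally
set by the astagidir entry in the [directories] section of asterisk.conf. The
path shown here is only the usual default mentioned above and may differ on
your system.

[directories]
;Directory where Asterisk looks for AGI scripts (assumed default path)
astagidir => /var/lib/asterisk/agi-bin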

-----
Usage
-----
agi(speech-recog.agi,[lang],[timeout],[intkey],[NOBEEP])

Records from the current channel until 2 seconds of silence are detected
(this can be changed with the 'timeout' argument, -1 for no timeout) or the
interrupt key (# by default) is pressed. If NOBEEP is set, no beep sound is
played back to the user to indicate the start of the recording.
The recorded sound is sent to Google's speech recognition service and the
returned text string is assigned as the value of the channel variable
'utterance'.

The script sets the following channel variables:
status     : Return status. 0 means success, non-zero values indicate
             different errors.
id         : Some id string that Google's engine returns, not very useful(?).
utterance  : The generated text string.
confidence : A value between 0 and 1 indicating the probability of a correct
             recognition. Values bigger than 0.95 usually mean that the
             resulting text is correct.
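
A minimal sketch of a call that sets every optional argument. Extension 1237
and the values chosen are assumptions made for illustration only: US English,
a silence timeout of 5 (taken to be seconds, like the default of 2), * as the
interrupt key (assuming the script accepts it in place of #), and no beep.

;Hypothetical example using all the optional arguments
exten => 1237,1,Answer()
exten => 1237,n,agi(speech-recog.agi,en-US,5,*,NOBEEP)
exten => 1237,n,Verbose(1,Status: ${status} , text: ${utterance})
exten => 1237,n,Hangup()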

--------
Examples
--------
Sample dialplan code for your extensions.conf

;Simple speech recognition
exten => 1234,1,Answer()
exten => 1234,n,agi(speech-recog.agi,en-US)
exten => 1234,n,Verbose(1,The text you just said is: ${utterance})
exten => 1234,n,Verbose(1,The probability to be right is: ${confidence})
exten => 1234,n,Hangup()

;Speech recognition demo also using googletts.agi for text to speech synthesis:
exten => 1235,1,Answer()
exten => 1235,n,agi(googletts.agi,"Say something in English, when done press the pound key.",en)
exten => 1235,n(record),agi(speech-recog.agi,en-US)
exten => 1235,n,Verbose(1,Script returned: ${status} , ${id} , ${confidence} , ${utterance})
;Check return status:
exten => 1235,n,GotoIf($["${status}" = "0"]?success:fail)
;Check the probability of a successful recognition:
exten => 1235,n(success),GotoIf($["${confidence}" > "0.8"]?playback:retry)
;Playback the text
exten => 1235,n(playback),agi(googletts.agi,"The text you just said was...",en)
exten => 1235,n,agi(googletts.agi,"${utterance}",en)
exten => 1235,n,goto(end)
;Retry in case speech recognition wasn't successful:
exten => 1235,n(retry),agi(googletts.agi,"Can you please repeat more clearly?",en)
exten => 1235,n,goto(record)
exten => 1235,n(fail),agi(googletts.agi,"Failed to get speech data.",en)
exten => 1235,n(end),Hangup()

;Voice dialing example
exten => 1236,1,Answer()
exten => 1236,n,agi(googletts.agi,"Please say the number you want to dial.",en)
exten => 1236,n(record),agi(speech-recog.agi,en-US)
exten => 1236,n,GotoIf($[$["${status}" = "0"] & $["${confidence}" > "0.8"]]?success:retry)
exten => 1236,n(success),goto(${utterance},1)
exten => 1236,n(retry),agi(googletts.agi,"Can you please repeat?",en)
exten => 1236,n,goto(record)
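
The recognised text may contain spaces or words rather than a bare number, in
which case jumping straight to ${utterance} will not find an extension. The
variant sketched here reduces the text to digits first. It assumes the standard
Asterisk FILTER() dialplan function is available; extension 1238 and the
variable name 'number' are hypothetical.

;Voice dialing with the recognised text reduced to digits (hypothetical variant)
exten => 1238,1,Answer()
exten => 1238,n,agi(googletts.agi,"Please say the number you want to dial.",en)
exten => 1238,n(record),agi(speech-recog.agi,en-US)
exten => 1238,n,GotoIf($["${status}" = "0"]?filter:retry)
;Keep only the digits of the returned text
exten => 1238,n(filter),Set(number=${FILTER(0-9,${utterance})})
;Ask again if no digits are left
exten => 1238,n,GotoIf($["${number}" = ""]?retry:dial)
exten => 1238,n(dial),goto(${number},1)
exten => 1238,n(retry),agi(googletts.agi,"Can you please repeat?",en)
exten => 1238,n,goto(record)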

Under the folder wolfram you can find a sample AGI script that, in combination
with speech-recog.agi, sends queries to WolframAlpha and returns the answers as
a dialplan variable. See wolfram/README for details and dialplan examples.

-------------------
Supported Languages
-------------------
[['Afrikaans',       ['af-ZA']],
 ['Bahasa Indonesia',['id-ID']],
 ['Bahasa Melayu',   ['ms-MY']],
 ['Català',          ['ca-ES']],
 ['Čeština',         ['cs-CZ']],
 ['Deutsch',         ['de-DE']],
 ['English',         ['en-AU', 'Australia'],
                     ['en-CA', 'Canada'],
                     ['en-IN', 'India'],
                     ['en-NZ', 'New Zealand'],
                     ['en-ZA', 'South Africa'],
                     ['en-GB', 'United Kingdom'],
                     ['en-US', 'United States']],
 ['Español',         ['es-AR', 'Argentina'],
                     ['es-BO', 'Bolivia'],
                     ['es-CL', 'Chile'],
                     ['es-CO', 'Colombia'],
                     ['es-CR', 'Costa Rica'],
                     ['es-EC', 'Ecuador'],
                     ['es-SV', 'El Salvador'],
                     ['es-ES', 'España'],
                     ['es-US', 'Estados Unidos'],
                     ['es-GT', 'Guatemala'],
                     ['es-HN', 'Honduras'],
                     ['es-MX', 'México'],
                     ['es-NI', 'Nicaragua'],
                     ['es-PA', 'Panamá'],
                     ['es-PY', 'Paraguay'],
                     ['es-PE', 'Perú'],
                     ['es-PR', 'Puerto Rico'],
                     ['es-DO', 'República Dominicana'],
                     ['es-UY', 'Uruguay'],
                     ['es-VE', 'Venezuela']],
 ['Euskara',         ['eu-ES']],
 ['Français',        ['fr-FR']],
 ['Galego',          ['gl-ES']],
 ['Hrvatski',        ['hr_HR']],
 ['IsiZulu',         ['zu-ZA']],
 ['Íslenska',        ['is-IS']],
 ['Italiano',        ['it-IT', 'Italia'],
                     ['it-CH', 'Svizzera']],
 ['Magyar',          ['hu-HU']],
 ['Nederlands',      ['nl-NL']],
 ['Norsk bokmål',    ['nb-NO']],
 ['Polski',          ['pl-PL']],
 ['Português',       ['pt-BR', 'Brasil'],
                     ['pt-PT', 'Portugal']],
 ['Română',          ['ro-RO']],
 ['Slovenčina',      ['sk-SK']],
 ['Suomi',           ['fi-FI']],
 ['Svenska',         ['sv-SE']],
 ['Türkçe',          ['tr-TR']],
 ['български',       ['bg-BG']],
 ['Pусский',         ['ru-RU']],
 ['Српски',          ['sr-RS']],
 ['한국어',          ['ko-KR']],
 ['中文',            ['cmn-Hans-CN', '普通话 (中国大陆)'],
                     ['cmn-Hans-HK', '普通话 (香港)'],
                     ['cmn-Hant-TW', '中文 (台灣)'],
                     ['yue-Hant-HK', '粵語 (香港)']],
 ['日本語',          ['ja-JP']],
 ['Lingua latina',   ['la']]];

-----------------------
Security Considerations
-----------------------
This script contacts Google's servers in order to send the recorded voice data
and get back the resulting text. The script uses SSL by default to encrypt all
the traffic between your PBX and Google's servers so no third party can
eavesdrop on your communication, but your voice data will be available to
Google under a not yet defined policy.

-------
License
-------
The speech-recog script for Asterisk is distributed under the GNU General Public
License v2. See COPYING for details.

--------
Homepage
--------
http://zaf.github.com/asterisk-speech-recog/
