Jingyu Zhang
Jiarui Wang
wsshelly1037@gmail.com
jiarui.wang523@gmail.com
ABSTRACT
Language is the most important, common, and direct way to
exchange information, and speech recognition is a key
technology for man-machine communication. Speech recognition
is widely applied in voice-dialing telephones, remote control
of home appliances, industrial control, and other fields, and
it has significant practical value. Our project uses speech
recognition to implement an application that lets users
control Android smartphones to play songs by speaking the
names or lyrics of the songs. The application translates the
speech into a string, then searches for that string in a
database of strings that includes the lyrics and other
information about the songs. If there is a match, it plays
the song.
Keywords
recognition
1. INTRODUCTION
Figure 1: database
ure 2. After the user presses the start button, the application
invokes the phone's listener using the AudioRecord class
of the Android API. If nobody is speaking, the application
returns null and goes back to the start page to wait for the next
operation. If the listener detects that someone is speaking, it
records the speech, and the audio is then fed into the translation
model, which translates the input speech into a string as output.
After that, the application searches the database, song by song,
to check whether the string matches the name or the lyrics of any
song. If the application cannot find a match, it displays "no
match" and goes back to the start page. If it finds a match, it
displays the name, artist, and album of the song and plays the
song, which has already been stored in the memory of the smartphone.
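The search step described above can be sketched in Java as follows. This is only a minimal illustration, not the authors' implementation: the `Song` and `SongMatcher` classes, their field names, and the choice of case-insensitive substring matching are all assumptions made for the example.

```java
import java.util.List;
import java.util.Locale;
import java.util.Optional;

/** One database entry; the field names are illustrative, not the authors' schema. */
final class Song {
    final String title;
    final String artist;
    final String album;
    final List<String> lyricLines;

    Song(String title, String artist, String album, List<String> lyricLines) {
        this.title = title;
        this.artist = artist;
        this.album = album;
        this.lyricLines = lyricLines;
    }
}

final class SongMatcher {
    /** Lower-cases the text and collapses punctuation and whitespace into single spaces. */
    static String normalize(String s) {
        return s.toLowerCase(Locale.ROOT).replaceAll("[^\\p{L}\\p{N}]+", " ").trim();
    }

    /**
     * Scans the songs one by one and returns the first song whose title or
     * any lyric line contains the recognized text (assumed matching rule).
     */
    static Optional<Song> findMatch(List<Song> songs, String recognizedText) {
        if (recognizedText == null) {
            return Optional.empty();          // nobody spoke
        }
        String query = normalize(recognizedText);
        if (query.isEmpty()) {
            return Optional.empty();
        }
        for (Song song : songs) {
            if (normalize(song.title).contains(query)) {
                return Optional.of(song);
            }
            for (String line : song.lyricLines) {
                if (normalize(line).contains(query)) {
                    return Optional.of(song);
                }
            }
        }
        return Optional.empty();              // caller displays "no match"
    }
}
```

An empty result corresponds to the "no match" branch, and a present result carries the name, artist, and album to display before playback.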
The Iflytek package is developed by IFLYTEK CO., LTD. It is
an automatic speech recognition (ASR) package with powerful
functions such as speech synthesis, speech recognition, semantic
understanding, and speech search. We use only the speech recognition
function, mainly through four essential elements: InitListener,
SpeechRecognizer, setParameter, and RecognizerDialogListener.
Figure 2: procedure
2.
3. IMPLEMENTATION
3.1
To get the music information, we first need to know the format of the .lrc file (Figure 5).
LRC is a text-based format used to synchronize lyrics
with a song while the audio file is playing. A file can be divided
into two areas: the ID part and the lyric part (Figure 6).
Usually, the ID information is placed at the head of the file,
and the lyric contents are placed after it.
For the first part, there can be many tags. The three most common
are ti (song title), al (album name), and ar (artist). These tags
can be arranged in any order.
For the lyrics part, each line consists of a time tag and a
sentence of lyrics.
Thus, we can create a class for lyric files as shown in
Figure 7.
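As a rough illustration of the format described above, the following Java sketch holds the three ID tags and the timed lyric lines in one object and fills it from the lines of a file. The class names `LrcFile` and `LrcParser`, the field names, and the `[mm:ss.xx]` time-tag layout are assumptions for this example and are not taken from Figure 7.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Parsed contents of one .lrc file; the field names are illustrative. */
final class LrcFile {
    String title = "";   // from [ti:...]
    String album = "";   // from [al:...]
    String artist = "";  // from [ar:...]
    final List<Long> timesMs = new ArrayList<>();   // start time of each lyric line
    final List<String> lines = new ArrayList<>();   // lyric text, same order as timesMs
}

final class LrcParser {
    // ID tag such as [ti:Song title]; only ti, al, and ar are handled here.
    private static final Pattern ID_TAG =
            Pattern.compile("^\\[(ti|al|ar):(.*)\\]$");
    // Time tag such as [01:23.45] followed by the lyric text (assumed layout).
    private static final Pattern TIME_LINE =
            Pattern.compile("^\\[(\\d+):(\\d{2})(?:[.:](\\d{1,3}))?\\](.*)$");

    /** Parses the lines of an .lrc file; ID tags may appear in any order. */
    static LrcFile parse(List<String> rawLines) {
        LrcFile lrc = new LrcFile();
        for (String raw : rawLines) {
            String line = raw.trim();
            Matcher id = ID_TAG.matcher(line);
            if (id.matches()) {
                String value = id.group(2).trim();
                switch (id.group(1)) {
                    case "ti": lrc.title = value; break;
                    case "al": lrc.album = value; break;
                    default:   lrc.artist = value; break;   // "ar"
                }
                continue;
            }
            Matcher t = TIME_LINE.matcher(line);
            if (t.matches()) {
                long ms = Long.parseLong(t.group(1)) * 60_000L
                        + Long.parseLong(t.group(2)) * 1_000L;
                String fraction = t.group(3);
                if (fraction != null) {
                    // "5" means 500 ms, "50" means 500 ms, "503" means 503 ms
                    ms += Math.round(Double.parseDouble("0." + fraction) * 1000);
                }
                lrc.timesMs.add(ms);
                lrc.lines.add(t.group(4).trim());
            }
            // Other tags, blank lines, and lines with several time tags
            // are not handled in this sketch.
        }
        return lrc;
    }
}
```

The lines could be read with `java.nio.file.Files.readAllLines` before being passed to `LrcParser.parse`; the resulting title, album, artist, and lyric text are the strings that would go into the song database.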
Thus, to parse the .lrc file, we can do the following
Figure 4: wrong data, panels (a) and (b)
3.2 Audio translation
3.3
one song, so the application could work successfully to find
the right match. We think that this is a fatal disadvantage
of lyric matching, no matter how the algorithm and the
application are improved.
Another problem of our application is that it can be
severely affected by noise while the listener of the
smartphone is recording. To address this problem, we
would have to go back to the fast Fourier transform
(FFT): convert the signal from the time domain to the
frequency domain and eliminate the low frequencies,
which are noise, to enhance the robustness of the
application. If we can achieve this, we can also try our
first idea again and improve the application with the
frequency matching method. Frequency matching is a
better way to implement our idea, even though it is
hard for us.
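The noise-removal idea described above can be sketched in Java as follows. This is not part of the application; it is a minimal illustration that follows the paper's assumption that the noise sits in the low frequencies. The class name `FftNoiseFilter`, the cutoff parameter, and the restriction to power-of-two signal lengths are assumptions of the example.

```java
final class FftNoiseFilter {
    /**
     * In-place radix-2 Cooley-Tukey FFT. The length must be a power of two.
     * With inverse=true the inverse transform is computed without the 1/n scaling.
     */
    static void fft(double[] re, double[] im, boolean inverse) {
        int n = re.length;
        if (n != im.length || Integer.bitCount(n) != 1) {
            throw new IllegalArgumentException("length must be a power of two");
        }
        // Reorder the samples by bit-reversed index.
        for (int i = 1, j = 0; i < n; i++) {
            int bit = n >> 1;
            for (; (j & bit) != 0; bit >>= 1) {
                j ^= bit;
            }
            j ^= bit;
            if (i < j) {
                double tr = re[i]; re[i] = re[j]; re[j] = tr;
                double ti = im[i]; im[i] = im[j]; im[j] = ti;
            }
        }
        // Combine butterflies of growing size.
        for (int len = 2; len <= n; len <<= 1) {
            double angle = 2 * Math.PI / len * (inverse ? 1 : -1);
            for (int start = 0; start < n; start += len) {
                for (int k = 0; k < len / 2; k++) {
                    double wr = Math.cos(angle * k);
                    double wi = Math.sin(angle * k);
                    int a = start + k;
                    int b = a + len / 2;
                    double tr = re[b] * wr - im[b] * wi;
                    double ti = re[b] * wi + im[b] * wr;
                    re[b] = re[a] - tr;
                    im[b] = im[a] - ti;
                    re[a] += tr;
                    im[a] += ti;
                }
            }
        }
    }

    /** Zeroes every frequency bin below cutoffHz and returns the filtered signal. */
    static double[] highPass(double[] samples, double sampleRateHz, double cutoffHz) {
        int n = samples.length;
        double[] re = samples.clone();
        double[] im = new double[n];
        fft(re, im, false);                       // time domain -> frequency domain
        for (int k = 0; k < n; k++) {
            int mirrored = Math.min(k, n - k);    // upper bins mirror negative frequencies
            double freqHz = mirrored * sampleRateHz / n;
            if (freqHz < cutoffHz) {
                re[k] = 0;
                im[k] = 0;
            }
        }
        fft(re, im, true);                        // back to the time domain
        for (int i = 0; i < n; i++) {
            re[i] /= n;                           // apply the 1/n scaling of the inverse
        }
        return re;
    }
}
```

A recorded buffer would need to be padded or split into power-of-two blocks before calling `highPass`, and a suitable cutoff would have to be chosen experimentally.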
Despite the problems above, considering that the application
works successfully with a small database of lyrics, we think
our project is basically complete and successful. Figures 8,
9, and 10 show the results of correct lyrics matching.
Match
3.4
The .lrc file and the .mp3 file of a song have the same
file name. For example, the lyric file and the audio file
for the song "Let It Go" are let it go.lrc and
let it go.mp3. Therefore, the title of the LRC object
points us to the corresponding mp3 file, which we can
then play using MediaPlayer.
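The file lookup can be sketched in Java as below. The helper `AudioLocator.findAudioFor` and the music directory argument are hypothetical names introduced for the example; the MediaPlayer calls are shown only as comments because they need the Android runtime.

```java
import java.io.File;

final class AudioLocator {
    /**
     * Returns the .mp3 file that shares its base name with the song title,
     * or null if no such file exists in the given directory.
     */
    static File findAudioFor(File musicDir, String title) {
        File mp3 = new File(musicDir, title + ".mp3");
        return mp3.isFile() ? mp3 : null;
    }

    // On Android, playback of the located file could then look like this
    // (android.media.MediaPlayer; prepare() may throw IOException):
    //   MediaPlayer player = new MediaPlayer();
    //   player.setDataSource(mp3.getAbsolutePath());
    //   player.prepare();
    //   player.start();
}
```

For the example above, `findAudioFor(musicDir, "let it go")` would look for let it go.mp3 inside `musicDir`.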
4. CONCLUSIONS
Figure 8: result1
Figure 9: result2