
A Prototype Arabic Sign Language Recognition System Using the Microsoft Kinect

Salihu Oladimeji Aliyu (G201303230)
Electrical Engineering Department, King Fahd University of Petroleum and Minerals, Dhahran, 31261, KSA.
G201303230@kfupm.edu.sa

Abstract

Sign language is important for facilitating communication between the hearing impaired and the rest of society. However, very few hearing people know sign language. There is therefore a need for systems that translate automatically between spoken and sign languages. Several approaches to this problem have been proposed in the literature; however, to date no practically deployable system for Arabic Sign Language recognition (ArSLR) has been achieved. Limitations of previously proposed approaches include lighting and background restrictions in the case of image-based approaches, and cumbersomeness in the case of glove-based approaches, which reduces their user acceptability. In this work, we propose a prototype for Arabic Sign Language recognition that is expected to overcome some of the constraints of these previous approaches. In particular, we use the recently introduced Microsoft Kinect device as the backbone of an ArSLR prototype. The proposed system is a two-way system with the capability to capture a performed Arabic sign, translate it to spoken language, and vice versa. The work also includes data collection, preprocessing, and the development of machine learning algorithms for recognizing performed signs.

System Description

The Microsoft Kinect (MK) sensor shares many of the core capabilities of the Kinect for Xbox 360 sensor. First, both devices contain an RGB camera that captures three-channel color images or video at a resolution of 1280 x 960 at 12 frames per second, or 640 x 480 at 30 frames per second. Second, both devices contain an infrared (IR) emitter, which emits infrared light beams, and an IR depth sensor, which reads the IR beams reflected back to the sensor. The reflected beams are converted into depth information measuring the distance between an object and the sensor, thereby facilitating the capture of depth images.
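As a concrete illustration, the sketch below shows how such a depth frame might be thresholded to isolate the signer's hands. Here read_depth_frame() is a hypothetical helper standing in for whichever Kinect SDK binding is used, and the depth band is an assumed working range, not a value taken from the SDK.

import numpy as np

def segment_hand(depth_mm: np.ndarray,
                 near: int = 400, far: int = 1200) -> np.ndarray:
    """Return a boolean mask of pixels whose depth falls in the band where
    the signer's hands are assumed to lie (40 cm to 1.2 m from the sensor)."""
    valid = depth_mm > 0                     # 0 means "no depth reading"
    return valid & (depth_mm >= near) & (depth_mm <= far)

# depth = read_depth_frame()                 # hypothetical SDK call, mm units
# mask = segment_hand(depth)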
Third, both devices contain a 4-channel microphone array for capturing sound; the microphone channels make it possible to record audio from a specific direction, as well as to identify the location of the sound source and the propagation direction of the audio waves. This feature will be used in the conversion of spoken language to Arabic Sign Language.
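To illustrate the principle behind such localization, the sketch below estimates a direction of arrival from just two channels via the classical time-difference-of-arrival relation sin(theta) = c * tau / d. The Kinect's own beamformer is more elaborate, and the microphone spacing used here is an assumed value, not the actual array geometry.

import numpy as np

SPEED_OF_SOUND = 343.0   # m/s at room temperature
MIC_SPACING = 0.04       # assumed spacing between the two mics, in meters

def direction_of_arrival(ch_a, ch_b, sample_rate):
    """Estimate the arrival angle (radians from broadside) of a sound source
    from the delay that maximizes the cross-correlation of two channels."""
    corr = np.correlate(ch_a, ch_b, mode="full")
    lag = np.argmax(corr) - (len(ch_b) - 1)   # delay in samples
    tau = lag / sample_rate                   # delay in seconds
    # sin(theta) = c * tau / d, clipped to the valid domain of arcsin
    return np.arcsin(np.clip(SPEED_OF_SOUND * tau / MIC_SPACING, -1.0, 1.0))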
Finally, both devices contain a three-axis accelerometer configured for a 2G range, where G is the acceleration due to gravity. The accelerometer can be used to determine the current orientation of the sensor.
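A minimal sketch of this computation follows: when the sensor is stationary, the accelerometer measures only gravity, so pitch and roll follow from the direction of the measured gravity vector. read_accelerometer() is a hypothetical helper, not an actual SDK call.

import math

def tilt_angles(ax: float, ay: float, az: float) -> tuple:
    """Return (pitch, roll) in degrees from a static accelerometer sample,
    with ax, ay, az given in units of g."""
    pitch = math.degrees(math.atan2(ax, math.hypot(ay, az)))
    roll = math.degrees(math.atan2(ay, az))
    return pitch, roll

# ax, ay, az = read_accelerometer()   # hypothetical SDK call, units of g
# pitch, roll = tilt_angles(ax, ay, az)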
The MK device also includes a Near Mode, which enables the camera to see objects as close as 40 centimeters in front of the sensor without losing accuracy or precision, with graceful degradation out to 3 meters. Fig. 1 shows the overall block diagram of the prototype, while a detailed block diagram of the system is shown in Fig. 2.

[Figure: Sign Language ↔ Microsoft Kinect ↔ Spoken Language]

Fig. 1: Block diagram of the proposed prototype ArSLR system.

[Figure: Spoken Language / Sign Language → Microsoft Kinect → Data collection → Feature extraction → Training/validation → Classification → Sign Language / Spoken Language]

Fig. 2: Detailed block diagram of the proposed ArSLR system.
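To make the training/validation and classification blocks of Fig. 2 concrete, the sketch below outlines one possible realization in Python using scikit-learn. The choice of a support vector classifier is illustrative only, since the system commits only to machine learning techniques in general; X and y stand for the feature matrix and gesture labels produced by the earlier blocks.

from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def build_classifier():
    # Scale features, then classify with an RBF-kernel support vector machine.
    return make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))

# X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
# clf = build_classifier()
# scores = cross_val_score(clf, X_train, y_train, cv=5)  # validation stage
# clf.fit(X_train, y_train)                               # training stage
# predictions = clf.predict(X_test)                       # classification stage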
