Enhancing Art History Education Through Mobile Augmented Reality
Ann M. McNamara
Department of Visualization, Texas A&M University
College Station, Texas, USA
ann@viz.tamu.edu
Abstract
This paper describes a new project focusing on the integration of eye-tracking technology with mobile Augmented Reality (AR) systems. AR provides an enhanced view of the physical
world by integrating virtual elements, such as text and graphics,
with real-world environments. The advent of affordable mobile
technology has sparked a resurgence of interest in mobile AR
applications. Inherent in mobile AR applications is the powerful
ability to visually highlight information in the real world. We are
working on new algorithms to harness this ability to direct gaze
to Points of Interest (POIs). Combining mobile AR with image manipulation lends visual distinction to POIs in order to directly influence and direct gaze in real-world scenes. Our initial test domain is Art History Education. POIs are determined from salient regions of paintings, as identified by each painting's visual narrative. We are developing a new system, to be deployed at the Museum of Fine Arts, Houston, that will enhance visitor education through the use of gaze-directed mobile AR.
Introduction
Imagine an AR application that can place virtual annotations in regions of the screen that will not obstruct image features that are (or will become) important to the user. Also, imagine a complementary AR system that can influence where viewers look in a scene, both spatially and temporally. This work proposes strategies to realize these AR scenarios. The ideal outcome is an eye-tracking AR system that is fully integrated into mobile devices and can inform AR applications of the optimal placement of AR elements based on gaze information, while also manipulating AR elements to direct visual attention to specific regions of interest in the real world.
The AR overlay may include virtual text, images, web links, or even video. AR applications for education are proliferating in many academic arenas [Medicherla et al. 2010] [Kaufmann and Schmalstieg 2002] [Kaufmann and Meyer 2008]. What separates this work from existing applications is the integration of eye-tracking. Eye-tracking reveals where the student is looking, what they are looking at, and whether they have actually looked at all of the most pertinent regions. If they are not looking where they are supposed to, subtle techniques will be introduced to draw attention back to those regions. These techniques may be subtle, but they do not have to be: the most promising solution may iteratively increase in strength until the user's gaze is drawn to the target location. Building on Subtle Gaze Direction (SGD) [Bailey et al. 2007], we plan to incorporate innovative ways to attract and focus attention on visual information in mobile AR applications. The initial focus is on Art History Education, although the ideas presented here have potential application in many disciplines. A number of mobile AR applications have been successfully applied in the art domain [Gwilt 2009] [Damala et al. 2008] [Andolina et al. 2009] [Bruns et al. 2007] [Choudary et al. 2009] [Chou et al. 2005] [Srinivasan et al. 2009]. To date, however, none has proposed eye-tracking as an added dimension. The novelty of this approach lies in the eye-tracking, and in attracting and directing the gaze to the correct regions of the artwork in a sequence that encourages appropriate visual navigation, deepens understanding of the image, and strengthens observation skills.
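To make the iterative strengthening concrete, the following minimal sketch (C++ with OpenCV) shows one way such a gaze-contingent loop could be structured. The current_gaze() function is a hypothetical stand-in for an eye-tracker query, and the luminance boost is an illustrative cue, not the SGD modulation itself.

// Minimal sketch of a gaze-contingent attention loop. current_gaze()
// is a hypothetical stand-in for a real eye-tracker query, and the
// luminance boost is an illustrative cue, not the SGD modulation.
#include <algorithm>
#include <cmath>
#include <opencv2/opencv.hpp>

// Stub: a real system would return the tracked gaze position here.
static cv::Point2f current_gaze() { return cv::Point2f(320.0f, 240.0f); }

// Modulate the region around a POI until the viewer's gaze reaches it.
static void direct_gaze_to(cv::VideoCapture& cam, cv::Point poi)
{
    const double fixation_radius = 60.0;  // px: "close enough" to the POI
    double strength = 0.0;                // cue amplitude, ramps up over time

    cv::Mat frame;
    while (cam.read(frame)) {
        cv::Point2f g = current_gaze();
        if (std::hypot(g.x - poi.x, g.y - poi.y) < fixation_radius)
            break;  // gaze has arrived at the POI: stop cueing

        // Brighten a small patch around the POI; the cue grows stronger
        // on every frame the viewer fails to look there.
        cv::Mat mask = cv::Mat::zeros(frame.size(), CV_8UC1);
        cv::circle(mask, poi, 40, cv::Scalar(255), cv::FILLED);
        cv::add(frame, cv::Scalar::all(strength), frame, mask);
        strength = std::min(strength + 2.0, 80.0);

        cv::imshow("AR view", frame);
        if (cv::waitKey(1) == 27) break;  // Esc aborts
    }
}

In the proposed system the cue would instead be keyed to the sequence of POIs derived from the painting's visual narrative.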
1 Approach

1.1 Eye-Tracking
1.2 Subtle Gaze Direction
This approach will also ensure that the viewer does not inadvertently overlook the most pertinent regions of the artwork.
2 Implementation

2.1 Image Retrieval
To retrieve the appropriate AR information to present, image recognition is built into the application. OpenCV, a library of programming functions for real-time computer vision, will be used in the proposed work, as its functions can easily capture and analyze images and video [openCV 2011]. OpenCV has also been successfully ported to iOS, the operating system of the target mobile devices. OpenCV can additionally handle event input (such as mouse events); rather than using the x,y position of the mouse, we measure the x,y gaze position to drive the gaze-direction events. OpenCV commands are used to stream captured video to the device (e.g., CvCapture* capture = cvCaptureFromCAM(0);).
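As an illustration, the legacy C call above maps onto the following modern C++ capture loop; current_gaze() is again a hypothetical eye-tracker hook supplying the x,y coordinates that a mouse callback would normally provide.

// Sketch: streaming camera frames while substituting gaze coordinates
// for the x,y input a mouse callback would normally supply. Modern
// C++ analogue of the legacy cvCaptureFromCAM(0) call.
#include <opencv2/opencv.hpp>

// Stub: hypothetical eye-tracker hook.
static cv::Point2f current_gaze() { return cv::Point2f(320.0f, 240.0f); }

int main()
{
    cv::VideoCapture capture(0);  // equivalent of cvCaptureFromCAM(0)
    if (!capture.isOpened()) return 1;

    cv::Mat frame;
    while (capture.read(frame)) {
        // The gaze position drives events in place of the mouse position.
        cv::Point2f g = current_gaze();
        cv::circle(frame, cv::Point(cvRound(g.x), cvRound(g.y)),
                   8, cv::Scalar(0, 255, 0), 2);  // mark the gaze point

        cv::imshow("camera", frame);
        if (cv::waitKey(1) == 27) break;  // Esc exits
    }
    return 0;
}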
2.2 Image Alignment
Aligning the real-world image on the mobile device with the (enhanced) augmented version of the image is necessary for this work. Two popular algorithms for achieving this are SIFT and SURF. The Scale-Invariant Feature Transform (SIFT) detects and describes local features in images, transforming an image into a large collection of local feature vectors, each of which is invariant to scaling, rotation, and translation of the image. Speeded-Up Robust Features (SURF) offers performance similar to SIFT but executes faster, which matters on mobile devices with limited processing power. OpenSURF, an open-source implementation of SURF that finds salient regions in images, forms the basis of many vision-based tasks, including object recognition and image retrieval, and will be used here to address image recognition and registration [openSURF 2011] [Takacs et al. 2008] [Chen and Koskela 2011].
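To illustrate the registration step, the sketch below matches local features between a stored reference image of a painting and a live camera frame and recovers a homography with RANSAC. It uses OpenCV's built-in SIFT (available in OpenCV 4.4+) purely for illustration; the proposed system would use OpenSURF as described above.

// Sketch: aligning a camera frame to a reference painting image via
// local features. Detector choice (SIFT here) is interchangeable with
// the SURF/OpenSURF features proposed in the text.
#include <opencv2/opencv.hpp>
#include <vector>

// Returns the homography mapping reference-image coordinates into the
// frame, or an empty Mat if registration fails.
cv::Mat align_to_reference(const cv::Mat& reference, const cv::Mat& frame)
{
    auto sift = cv::SIFT::create();
    std::vector<cv::KeyPoint> kp_ref, kp_frm;
    cv::Mat desc_ref, desc_frm;
    sift->detectAndCompute(reference, cv::noArray(), kp_ref, desc_ref);
    sift->detectAndCompute(frame, cv::noArray(), kp_frm, desc_frm);
    if (desc_ref.empty() || desc_frm.empty()) return cv::Mat();

    // Match descriptors, keeping only distinctive matches (ratio test).
    cv::BFMatcher matcher(cv::NORM_L2);
    std::vector<std::vector<cv::DMatch>> knn;
    matcher.knnMatch(desc_ref, desc_frm, knn, 2);

    std::vector<cv::Point2f> pts_ref, pts_frm;
    for (const auto& m : knn) {
        if (m.size() == 2 && m[0].distance < 0.75f * m[1].distance) {
            pts_ref.push_back(kp_ref[m[0].queryIdx].pt);
            pts_frm.push_back(kp_frm[m[0].trainIdx].pt);
        }
    }
    if (pts_ref.size() < 4) return cv::Mat();  // too few correspondences

    // Robust homography from reference coordinates to the live frame.
    return cv::findHomography(pts_ref, pts_frm, cv::RANSAC, 3.0);
}

A POI stored in reference-image coordinates can then be projected into the live view with cv::perspectiveTransform before any visual emphasis is applied.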
Conclusion
References
ANDOLINA, S., SANTANGELO, A., CANNELLA, M., GENTILE, A., AGNELLO, F., AND VILLA, B. 2009. Multimodal virtual navigation of a cultural heritage site: The medieval ceiling of Steri in Palermo. In Proceedings of the 2nd Conference on Human System Interactions, IEEE Press, Piscataway, NJ, USA, HSI '09, 559–564.

AZUMA, R. T. 1997. A survey of augmented reality. Presence: Teleoperators and Virtual Environments 6, 4 (Aug.), 355–385.

BAILEY, R., MCNAMARA, A., SUDARSANAM, N., AND GRIMM, C. 2007. Subtle gaze direction. In ACM SIGGRAPH 2007 Sketches, ACM, New York, NY, USA, SIGGRAPH '07.
head's eye tells the mind's eye. Brain Research 1367, 287–297.

CHOUDARY, O., CHARVILLAT, V., GRIGORAS, R., AND GURDJOS, P. 2009. MARCH: Mobile augmented reality for cultural heritage. In Proceedings of the 17th ACM International Conference on Multimedia, ACM, New York, NY, USA, MM '09, 1023–1024.

DECARLO, D., AND SANTELLA, A. 2002. Stylization and abstraction of photographs. ACM Trans. Graph. 21, 3 (July), 769–776.

DODGE, R., AND CLINE, T. 1901. The angle velocity of eye movements. Psychological Review 8, 145–157.

ITTI, L., AND KOCH, C. 2000. A saliency-based search mechanism for overt and covert shifts of visual attention. Vision Research 40, 10–12 (May), 1489–1506.

ITTI, L., AND KOCH, C. 2001. Computational modelling of visual attention. Nature Reviews Neuroscience 2, 3 (Mar.), 194–203.

MEDICHERLA, P. S., CHANG, G., AND MORREALE, P. 2010. Visualization for increased understanding and learning using augmented reality. In Proceedings of the International Conference on Multimedia Information Retrieval, ACM, New York, NY, USA, MIR '10, 441–444.

OPENCV, 2011.

OPENSURF, 2011.

SIELHORST, T., FEUERSTEIN, M., AND NAVAB, N. 2008. Advanced medical displays: A literature review of augmented reality. J. Display Technol. 4, 4 (Dec.), 451–467.

TUMLER, J., DOIL, F., MECKE, R., PAUL, G., SCHENK, M., PFISTER, E. A., HUCKAUF, A., BOCKELMANN, I., AND ROGGENTIN, A. 2008. Mobile augmented reality in industrial applications: Approaches for solution of user-related issues. In Proceedings of the 7th IEEE/ACM International Symposium on Mixed and Augmented Reality (ISMAR '08).