
In: RUIU, San Francisco: Morgan Kaufmann, 1998, pp. 1-13.

INTRODUCTION

Intelligent User Interfaces: An Introduction

This introduction describes the need for intelligent user interfaces (IUIs), specifies
the intended purpose and use of this collection, outlines the collection's scope, and
defines basic terminology used in the field. After outlining the theoretical foundations
of intelligent user interfaces, this introductory section describes the current state of
the art and summarizes the structure and contents of this collection, which addresses
some remaining fundamental problems in the field.

1. MOTIVATION
The explosion of available materials on corporate, national, and global information
networks is driving the need for more effective, efficient, and natural interfaces to
support access to information, applications, and people. This is exacerbated by the
increasing complexity of systems, the shrinking of task time lines, and the need to
reduce the cost of application and interface development. Fortunately, the basic
infrastructure for advanced multimedia user interfaces is rapidly appearing or already
available. In addition to traditional public telephone networks, cable, fiber-optic,
wireless and satellite communications are rapidly evolving with the aim of serving many
simultaneous users through a great variety of multimedia communications (e.g., video,
audio, text, data). Rapidly advancing microprocessor and storage capabilities, coupled
with multimedia input and output devices integrated into workstations and portable
machines, provide a dizzying array of potential for personal and personalized multimedia
interaction.

Interface technology has advanced from initial command line interfaces to the
established use of direct manipulation or WIMP (windows, icons, menus, and pointing)
interfaces in nearly all applications. Even some of the first computing systems
incorporated graphical displays and light pens as pointing devices (Everett et al. 1957).
The next generation of interfaces, often called "intelligent," will provide a number of
additional benefits to users, including adaptivity, context sensitivity, and task
assistance. As with traditional interfaces, principled intelligent interfaces should be
learnable, usable, and transparent. In contrast, however, intelligent user interfaces
promise to provide additional benefits to users that can enhance interaction, such as:
• Comprehension of possibly imprecise, ambiguous, and/or partial multimodal input
• Generation of coordinated, cohesive, and coherent multimodal presentations
• Semi- or fully automated completion of delegated tasks
• Management of the interaction (e.g., task completion, tailoring interaction styles,
  adapting the interface) by representing, reasoning, and exploiting models of the user,
  domain, task, and context

In addition to these end-user benefits, new model-based interface tools promise to help
user interface designers and developers decrease the time, expense, and level of
expertise necessary to construct successful user interfaces.

In search of these benefits, governments, industry, and academia have emphasized the
importance of the human-machine interface in the global information economy. For example,
the United States Digital Library, European Telematics, and the Japanese Human Interface
Programs are all well funded in the long term, but their continued success will require
researchers and managers to rapidly acquire the standard literature and train in the
latest interface tools and techniques. As an example, the $500 million, 10-year Real
World Computing (RWC) Program initiated in 1992 (Tsukuba, Japan) focuses on
pattern/symbol processing and includes an emphasis on multimodal interfaces integrating
gesture, speech, and body language.

Academic and commercial advances in human-computer interface technology (Baecker, Grudin,
Buxton, and Greenberg 1995) have dramatically improved interaction with computers;
however, these efforts are necessary but not sufficient to address the preceding
challenges. Instead, a new class of interfaces is required that goes beyond the current
tripartite interface model of application, dialogue, and presentation. This collection of
papers points the way toward interfaces that model the situation, task, user, discourse,
and media and that enable model-based specification and generation of interfaces,
agent-based interaction, and integrated multimodal input and output. Unlike traditional
human-computer interfaces, intelligent interfaces are those that represent and reason
about the user, domain, task, media, and situation. A number of applications are
emerging, ranging from mail filters (Maes, Section VII of this volume) to office
assistants (Horvitz 1997) to speaking and listening interface agents (Nagao and Takeuchi,
Section VII of this volume).

2. PURPOSE AND USE
The purpose of this collection is multifold. First, it is intended to motivate and define
the field of intelligent user interfaces. Second, it is intended to capture and place
into context key developments in this field. Third, it is intended to serve as a stimulus
for continued research into the many interesting and challenging problems that remain.
Finally, a principal goal of this collection is to bridge the gap between scientists and
engineers working in the distinct but interdependent fields of human-computer interaction
and intelligent user interfaces/artificial intelligence. We hope the collection will also
serve to foster scientific interchange among individuals working in both theory and
applications, and, as such, the collection reflects a mix of these activities.

This collection can be used as a key reference source for students, researchers, and
practitioners of IUI or as a text in user interface classes or advanced graduate
seminars. To satisfy these purposes, the book is organized around the key areas of IUI:
input analysis, output generation, user- and discourse-adapted interaction, agent-based
interaction, model-based interface design, and intelligent interface evaluation. In
addition to a traditional author and keyword index, we also provide a two-dimensional
content index in Section 5.7, to facilitate tailored access to relevant content for a
range of purposes: research, analysis, or teaching.

Articles were chosen from a broad range of sources, including journals, conference
proceedings, workshop notes, and previous book collections. Each article was nominated by
a member of the editorial board and evaluated by multiple reviewers considering selection
criteria of quality, significance, originality, clarity, and relevance as well as special
considerations such as historical and sustained influence and difficulty of acquisition.
Each article is followed by a brief reflection written by the original authors,
indicating important developments that have influenced, have resulted from, and/or have
followed the publication of their work, including key follow-up publications by the
authors and others. After providing the scope and definitions of the field, we overview
its brief history and then provide summaries of the key sections of the readings.

3. SCOPE AND DEFINITIONS
Intelligent user interfaces (IUIs) are human-machine interfaces that aim to improve the
efficiency, effectiveness, and naturalness of human-machine interaction by representing,
reasoning, and acting on models of the user, domain, task, discourse, and media (e.g.,
graphics, natural language, gesture). As a consequence, this interdisciplinary area draws
upon research in and lies at the intersection of human-computer interaction, ergonomics,
cognitive science, and artificial intelligence and its subareas (e.g., vision, speech and
language processing, knowledge representation and reasoning, machine learning/knowledge
discovery, planning and agent modeling, user and discourse modeling). Whereas previous
collections have focused on related enabling technologies such as text processing (Grosz,
Sparck Jones, and Webber 1986; MUC-6 1995), spoken language processing (Waibel and Lee
1990), human-computer interaction (Baecker, Grudin, Buxton, and Greenberg 1995), user
modeling (Kobsa and Wahlster 1989), artificial intelligence (Webber and Nilsson 1985),
knowledge representation (Brachman and Levesque 1985), and planning (Allen and Hendler
1990), intelligent human-computer interaction requires a synergistic integration of these
areas. This collection complements previous works focused on human-computer interaction,
multimedia or intelligent interfaces (Blattner and Dannenberg 1992; Sullivan and Tyler
1991), and intelligent multimedia interfaces (Maybury 1993).

Figure 1 illustrates a high-level architecture of intelligent user interfaces and, as
such, defines many of the subareas in the field. These include analyzing and interpreting
input, designing and rendering output, managing the interaction, and representing and
reasoning about models that support intelligent interaction. An example of a model is a
user model or, more generally, an agent model (e.g., one that could represent the user,
system, intermediary, addressee, etc.). The "intelligence" in IUIs that distinguishes
them from traditional interfaces is indicated in bold in Figure 1; it includes mechanisms
that perform automated media analysis, design, and interaction management. Thus, the
collection does not address input and output devices and drivers for input processing and
output rendering.

FIGURE 1. Architecture of Intelligent User Interfaces

As the dotted regions in Figure 2 illustrate, traditional user interfaces distinguish
only three models: presentation, dialog, and application. Refinements beyond these three
models that are found in IUIs include explicit models of the user, discourse and domain,
input analysis and output generation, and mechanisms to manage the interaction, such as
fusing and interpreting imprecise, ambiguous, and/or inaccurate input, controlling the
dialog progression, or tailoring presentation output to the current situation.

FIGURE 2. Current Interface Practice and Its Relation to IUI

Research so far has shown that it is possible to adapt many of the fundamental concepts
developed to date in computational linguistics and discourse theory in such a way that
they become useful for multimedia user interfaces as well. In particular, semantic and
pragmatic concepts like communicative acts, coherence, focus, reference, discourse model,
user model, implicature, anaphora, rhetorical relations, and scope ambiguity take on an
extended meaning in the context of multimodal communication. As Figure 3 illustrates,
artificial intelligence has much to contribute to user interfaces, including the use of
knowledge representations for model-based interface development tools, the
application of plan generation and recognition in dialog management, the application of
temporal and spatial reasoning to media coordination, the use of user models to tailor
interaction, and so on. We will detail these theoretical and technical foundations in the
following.

As shown in Figure 3, more effective, efficient, and natural human-computer or
computer-mediated human-human interaction will require both automated understanding and
generation of multimedia. Fluent conversational interaction demands explicit models of
the user, discourse, task, and context. It also requires a richer understanding of media
in its use both in the interface to support interaction with the user and in access to
content by the user during a session.

FIGURE 3. AI Meets User Interfaces

Because of widespread terminology confusion, we begin with a clarification of the terms
medium and mode. By mode or modality, we refer primarily to the human senses employed to
process incoming information: vision, audition, olfaction, touch, and taste. In contrast,
medium refers to the material object (e.g., the physical carrier of information such as
paper or CD-ROM) used for presenting or saving information and, particularly in the
context of human-computer interaction, to computer input/output devices (e.g.,
microphone, speaker, screen, pointer). We use the term code to refer to a system of
symbols (e.g., natural language, pictorial language, gestural language). For example, a
natural language code might use typed or written text or speech, which in turn would rely
upon visual or auditory modalities and associated media (e.g., keyboard, microphone). It
is important to note, however, that the terms media and mode in particular are frequently
used ambiguously in the literature. Indeed, in this collection we will use them
interchangeably when their distinction is not important.

Medium, mode, and code are related nontrivially (see Figure 4). First, a single medium
may carry several modalities and, in turn, codes. For example, a piece of paper may
support both language and graphics codes just as a visual display may support text,
images, and video. Likewise, a single code may be supported by many media and modalities.
For instance, language can be supported visually (i.e., written language) and aurally
(i.e., spoken language); in fact, spoken language can have a visual component (e.g., lip
reading). Analogously, a user of a multimedia CD-ROM is interacting with a physical
medium used to store information captured in a variety of codes (e.g., language,
graphics) using multiple modalities (e.g., auditory, visual) and using various
input/output media (e.g., mouse, display, speaker). A multimedia document on the CD-ROM
might include text, graphics, speech, and video that affect several modalities, for
example, visual and auditory perception of natural language, visual perception of images
(still and moving), and auditory perception of sounds. Finally, this multimedia and
multimodal interaction occurs over time. Therefore, it is necessary to account for the
processing of discourse, context shifts, and changes in agent states over time.

FIGURE 4. Medium, Mode, and Code
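
The many-to-many relationship among medium, mode, and code described above can be made concrete with a small data model. The following Python fragment is only an illustrative sketch, not taken from any system in this collection; the class names (`Mode`, `Code`, `Channel`), the `CHANNELS` table, and the helper functions are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    """Human senses used to process incoming information."""
    VISION = "vision"
    AUDITION = "audition"
    OLFACTION = "olfaction"
    TOUCH = "touch"
    TASTE = "taste"


class Code(Enum):
    """Systems of symbols."""
    NATURAL_LANGUAGE = "natural language"
    PICTORIAL = "pictorial language"
    GESTURAL = "gestural language"


@dataclass(frozen=True)
class Channel:
    """One way a medium can carry a code through a modality."""
    medium: str
    mode: Mode
    code: Code


# Hypothetical examples mirroring the text: paper and a display carry both
# language and graphics; language is also carried aurally by a speaker.
CHANNELS = [
    Channel("paper", Mode.VISION, Code.NATURAL_LANGUAGE),
    Channel("paper", Mode.VISION, Code.PICTORIAL),
    Channel("display", Mode.VISION, Code.NATURAL_LANGUAGE),
    Channel("display", Mode.VISION, Code.PICTORIAL),
    Channel("speaker", Mode.AUDITION, Code.NATURAL_LANGUAGE),
]


def codes_for_medium(medium):
    """All codes a single medium carries."""
    return {c.code for c in CHANNELS if c.medium == medium}


def media_for_code(code):
    """All media that support a single code."""
    return {c.medium for c in CHANNELS if c.code == code}


def modes_for_code(code):
    """All modalities through which a single code is perceived."""
    return {c.mode for c in CHANNELS if c.code == code}
```

Under these assumptions, `codes_for_medium("paper")` yields both the language and the pictorial code, while `modes_for_code(Code.NATURAL_LANGUAGE)` yields both vision and audition, which is the sense in which the three notions are related nontrivially.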



The new generation of intelligent multimodal systems (Maybury 1993, 1995) goes beyond the
standard canned text, predesigned graphics, and prerecorded images and sounds typically
found in commercial multimedia systems of today. A basic principle underlying these
so-called intellimedia systems is that the various constituents of a multimodal
communication should be generated on the fly from a common representation of what is to
be conveyed without using any preplanned text or images; that is, the principle is "no
generation without representation." It is an important goal of such systems not simply to
merge the verbalization and visualization results of a text generator and a graphics
generator but to carefully coordinate them in such a way that they generate a synergistic
improvement in communication. Such multimodal presentation systems are highly adaptive
since all presentation decisions are postponed until runtime. The quest for adaptation is
based on the fact that it is impossible to anticipate the needs and requirements of each
potential user in an infinite number of presentation situations.

Figure 5 indicates the key processes and exemplifies some systems that have addressed
multimodal information processing, including media generation and media conversion. The
large arrows indicate where processing typically begins, that is, from formal
representations such as a data or knowledge base or from the media themselves (e.g.,
text, graphics, or images). Key processes include verbalization (moving from formal
representations or graphics or images to text) and visualization (from representations or
text to graphics or images). Several systems have focused on multimodal presentation
generation, designing and realizing coordinated text, speech, graphical, and cartographic
presentations. As the dial in the middle indicates, these systems raise the opportunity
to select along a scale from an entirely linguistic to a completely visual presentation.

FIGURE 5. Key Processes in Multimedia Processing

Multimedia dialog prototypes have been developed in several application domains,
including CUBRICON to support mission planning (Neal et al., Section I of this volume),
XTRA for tax form preparation (Wahlster, Section V of this volume; Kobsa et al. 1986),
MMI2 for network management (Binot et al. 1990), AIMI for air mission planning (Burger
and Marshall, Section V of this volume), and ALFresco to enable art history information
exploration (Stock, Section V of this volume). Typically, these systems parse mixed and
asynchronous multimedia input and generate coordinated multimedia output. They also
attempt to maintain coherency, cohesion, and consistency across both multimedia input and
output. For example, these systems often support integrated language and deixis for both
input and output. They
extend research in discourse and user modeling (Kobsa and Wahlster 1989) by incorporating
representations of media to enable media reference, cross-reference, and reuse over the
course of a session with a user. These enhanced representations support the exploitation
of user perceptual abilities and media preferences as well as the resolution of
multimedia references (e.g., "Send this plane there" articulated with synchronous
gestures on a map).

The details of discourse models in these systems, however, differ significantly. For
example, CUBRICON represents a global focus space ordered by recency whereas AIMI
represents a focus space segmented by the intentional structure of the discourse (i.e., a
model of the domain tasks to be completed). Although intelligent multimedia interfaces
promise natural and personalized interaction, they remain complicated and require
specialized expertise to build. One practical approach to achieving some of the benefits
of these more sophisticated systems without the expense of developing full multimedia
interpretation and generation components was demonstrated in ALFresco (Stock, Section V
of this volume), a multimedia information kiosk for Italian art exploration. By adding
natural language processing to a traditional hypermedia system, ALFresco achieved the
benefits of hypermedia (e.g., organization of heterogeneous and unstructured information
via hyperlinks, direct manipulation to facilitate exploration) together with the benefits
of natural language parsing (e.g., direct query of nodes, links, and subnetworks, which
provides rapid navigation). Providing a user with natural language query within a
hypertext system helps overcome the indirectness of the hypermedia web as well as
disorientation and cognitive overhead caused by large numbers of semantically
heterogeneous links (e.g., part-of, class-of, instance-of, or elaboration-of). In
addition, as in other systems previously mentioned (e.g., CUBRICON, XTRA), ambiguous
gesture and language can yield a unique referent through mutual constraint. Finally,
ALFresco incorporates simple natural language generation that can be combined with more
complex canned text (e.g., art critiques) and images. Reiter, Mellish, and Levine
(Section II of this volume) analogously integrate traditional language generation with
hypertext to produce hypertext technical manuals.

Whereas practical systems are possible today, the multimedia interface of the future may
have facilities that are much more sophisticated. These interfaces may include humanlike
agents that converse naturally with users, monitoring their interaction with the
interface (e.g., keystrokes, gestures, facial expressions) and the properties of those
interactions (e.g., conversational syntax and semantics, dialog structure) over time and
for different tasks and contexts. Equally, future interfaces will likely incorporate more
sophisticated presentation mechanisms. For example, Pelachaud, Badler, and Steedman
(1996) characterize spoken language intonation and associated emotions (anger, disgust,
fear, happiness, sadness, and surprise) and from these use rules to compute facial
expressions, including lip shapes, head movements, eye and eyebrow movements, and blinks.
Finally, future multimedia interfaces should support richer interactions, including user
and session adaptation (Schneider-Hufschmidt et al. 1993), dialog interruptions,
follow-up questions, and management of the focus of attention.

In summary, as Figures 1 through 5 illustrate, principal areas of intelligent interface
research include:
• Analysis of input (e.g., spoken, typed, and handwritten language; gestures, including
  hand, eye, and body states and motion)
• Generation (planning or realization) of coordinated output
• Modeling of the user, discourse, task, and situation and interaction management,
  including possible tailoring of interaction to the user, task, and/or situation
As such, we distinguish these functions in the organization of this collection, described
later in this introductory section.

4. THE ROOTS OF INTELLIGENT USER INTERFACES
Enabling conversational interaction with computers has been a vision since the creation
of the first computers. In part stimulated by attempts to pass the Turing test, a number
of initial efforts attempted to literally simulate conversation with computers using
pattern matching to select possible responses from a conversational database (e.g.,
McCarthy's ADVICE, Weizenbaum's ELIZA, Colby's PARRY). Other efforts focused on specific
aspects of conversation, most notably the focus on natural language interfaces. We refer
the reader to Readings in Natural Language Processing (Grosz, Sparck Jones, and Webber
1986), which outlines the history of natural language processing research, including
theoretical and computational investigations into tasks, discourse, attention, beliefs,
and plans in support of both analysis and generation of natural language. Intelligent
user interfaces have benefited from a rich interaction between theoretical developments,
such as Grice's work on implicatures
and Austin and Searle's work on speech acts, and practical application areas, such as
interfaces to databases, intelligent tutoring, and automated interface design. Figure 6
captures some of the important events in the emergence of the discipline of IUIs,
including the appearance of the first international workshops and conferences,
specialized collections, and the emergence of commercial products and standards. Depicted
is the creation of natural language interfaces in the seventies and eighties (including
natural language processing toolkits), agents in commercial products in the nineties, and
a standard reference model (SRM) for intelligent multimodal presentation systems (IMMPS)
in 1998 (Bordegoni et al. 1998). The nineties are also characterized by increasing
scientific advances (many captured in this collection), tool developments (e.g., Kobsa et
al.'s BGP-MS user modeling shell (Section V of this volume)), and commercial applications
such as e-mail filters (Maes, Section VII of this volume) and Bayesian-based user models
for Microsoft's Office Assistant (Horvitz 1997).

5. STATE OF THE ART: AN OVERVIEW OF THE READINGS
This readings collection is organized around solutions to the key elements introduced in
the architecture outlined at the beginning of this introductory section: input analysis,
output generation, user and discourse modeling, model-based interfaces, agent
interaction, and evaluation. We briefly describe each of these in turn, referring the
reader to the section introductions for summary overviews of the included papers.

5.1. Analysis of Input
The chapters in Section I of the collection focus on supporting intelligent input
processing. Motivated by the observation that human-human communication is multimedia,
multimodal, and multicodal (including spoken and written language, gesture, gaze), these
papers investigate enriching human-machine interactions using such capabilities.
Collectively, the authors illustrate how supporting integrated input from multiple
sources can simultaneously enhance communication efficiency, effectiveness (e.g., speed
and accuracy), and naturalness. They provide technical solutions that support
interpretation of parallel, imprecise, and ambiguous multimedia input. The papers also
illustrate the context-dependent, multifunctional nature of multimodal input, providing
rich ground for future research. Taken together, these papers advance our ability to
enable humans to utilize the full extent of their linguistic, gestural, and gaze input,
with the attendant benefits.
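
As a rough illustration of how parallel, individually ambiguous inputs can constrain one another, the following Python sketch resolves a spoken deictic phrase (such as "this plane") against an imprecise pointing gesture by intersecting the candidates each channel allows. It is a toy example written for this introduction's discussion, not the algorithm of CUBRICON, XTRA, or any other system in the collection; the object table, its coordinates, and the pointing radius are hypothetical.

```python
from math import hypot

# Hypothetical map objects: name -> (type, x, y).
OBJECTS = {
    "plane-1": ("plane", 10.0, 10.0),
    "plane-2": ("plane", 80.0, 40.0),
    "truck-1": ("truck", 12.0, 11.0),
    "base-A": ("airbase", 50.0, 50.0),
}


def candidates_from_speech(noun):
    """Objects whose type matches the spoken noun (ambiguous if several)."""
    return {name for name, (kind, _, _) in OBJECTS.items() if kind == noun}


def candidates_from_gesture(x, y, radius=5.0):
    """Objects within `radius` of an imprecise pointing gesture at (x, y)."""
    return {
        name
        for name, (_, ox, oy) in OBJECTS.items()
        if hypot(ox - x, oy - y) <= radius
    }


def resolve(noun, x, y, radius=5.0):
    """Return the unique referent allowed by both channels, else None."""
    both = candidates_from_speech(noun) & candidates_from_gesture(x, y, radius)
    return next(iter(both)) if len(both) == 1 else None
```

In this sketch, the word "plane" alone leaves two candidates and a gesture near (11, 10) alone leaves a plane and a truck; only the intersection of the two channels is unique. A real system would also have to handle timing between speech and gesture, confidence scores, and discourse context, none of which is modeled here.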

FIGURE 6. Historical Emergence of Intelligent User Interfaces



5.2. Generation of Output
The next three sections of the book address semi- or fully automatic generation of
coordinated multimedia output. Designing and realizing coherent and cohesive multimedia
presentations can be subdivided into several co-constraining processes, which include the
determination of communicative intent, the selection of content to achieve this, its
grouping/structuring and ordering, its allocation to a particular code (e.g., text versus
graphics), its realization in a coordinated fashion across media, and finally its layout.
Ideally, the generation process is tailored to the context, task, and user. Investigators
have explored these processes in several domains using a range of algorithms. This area
is accordingly divided into three sections: "Multimedia Presentation Design," "Automated
Graphics Design," and "Automated Layout."

5.2.1. Multimedia Presentation Design
The second section of the book addresses multimedia presentation design, which
encompasses the tasks of content selection, media allocation, media realization, and
layout. The papers suggest these processes are interdependent and must be closely
coordinated during generation to ensure cohesive and coherent output. Together, the
papers also argue for the importance of key knowledge sources in these processes,
including models of the information and media, the user, the discourse context, the
producer, and the interface itself. They further demonstrate generalizations of
text-linguistic notions, such as coherence, speech acts, anaphora, and rhetorical
relations for multimedia presentation design. Each paper contributes concrete but widely
ranging technical approaches and solutions to these tasks (e.g., employing templates,
rules, plans, constraints).

5.2.2. Automated Graphics Design
Whereas several articles in Section II address automated generation of natural language
text, this third section looks at the automatic design and realization of two- and
three-dimensional graphics from structured data (see Maybury 1994 for pointers to eight
surveys and six specialized collections on natural language generation and Grosz, Sparck
Jones, and Webber 1986). These papers collectively aim to elucidate not only how but why
graphics should be designed. Moving beyond descriptive works of graphical design practice
that consider examples of successful graphics and make observations on how to avoid
ambiguous, confusing, or imprecise and misleading graphics, these efforts aim at formal
and prescriptive theories of graphics design.

A number of factors have motivated researchers to seek automated graphic designers.
Currently, application designers are forced to anticipate and predesign every possible
data and presentation situation. Moreover, in order to create effective graphics,
developers need to be design experts, which is often not the case.

Several features characterize the papers. First, they all focus on a move toward explicit
representation of graphical presentation knowledge. Second, they support explicit choices
among graphical encoding mechanisms that reason about the expressiveness and
effectiveness of underlying representations and resulting presentations. Finally, the
authors increasingly focus on representation of knowledge of the user, task, and context
and its exploitation to generate more effective, tailored presentations while decreasing
required user expertise. Whereas some papers focus principally on information
characteristics and data graphics, others consider how differences in users' goals impact
the effectiveness of designed graphics. Still others exploit perceptual operations that
yield more rapid results than cognitive operations (e.g., arithmetic, comparison); for
example, grouping and ordering information as well as encoding it using color, shading,
and layout to support "preattentive" and sometimes parallel visual search to enable both
more accurate and more efficient task performance.

5.2.3. Automated Layout
The fourth section of the book addresses the layout of media objects, which has a strong
influence on the attentional structure of multimodal communication. A change in the
layout of a multimedia document does not necessarily change the meaning of the document
but certainly changes the focus of attention of the reader. Multimedia presentations are
too dynamic and come too fast to have the layout of every visual presentation designed
manually, so automated layout becomes a necessity. In addition, automated layout may help
to adapt an interface to the screen or window size of a user as well as to the user's
perceptual abilities and preferences. The articles in this section survey the most
important techniques for automated multimedia layout, including approaches based on
rules, constraints, or simulated annealing.

5.3. User and Discourse Models
Section V of the collection addresses the adaptation of interfaces to the user and
context of the interaction, specifically addressing the acquisition, tracking, and
utilization of models of the user and discourse. The
articles address user modeling issues, including stereotypes, plan- and goal-based user models, system initiative, and user modeling shells. The articles also address the modeling and use of models of discourse for such tasks as planning explanations, answering follow-up questions in the context of prior discourse, and supporting interruption. The articles cover a wide range of application areas, including interactive consultation (e.g., recommending books or guiding software use), user- and context-adaptive hypertext (e.g., art exploration), and multimedia interfaces to decision support. Innovations in these systems include the extension of user and discourse models to multimedia interfaces (e.g., to process multimodal deixis), the use of incremental explanation planning, and interleaved design and realization.

5.4. Model-Based Interfaces
Section VI of the collection describes efforts to create tools that decrease both the time and the expertise required to create interfaces through automation or design assistance. These efforts go beyond user interface toolkits by separating dialog control from application code and teasing out presentation and style decisions from the toolkit code libraries. They are also distinct from user interface management systems in that they make finer-grained distinctions and provide more powerful design tools to interface developers. Much research effort has been focused on more crisply defining the functional areas of the interface in order to support declarative expression and modularization of interface functionality. These papers make a few key contributions. First, they move toward declarative specifications of interfaces, refining the distinctions among models and processes associated with the domain, the application, the user-machine dialog control, and the presentation. Second, they promise increased portability and ease of evolution, as maintenance and extension are done within a more formal framework. Finally, they enable new forms of designer support, such as automated design critique, refinement, and implementation.

5.5. Agent-Based Interaction
The papers in Section VII consider the use of agents in the interface. Important questions explored include, What can and should an agent do? How should they do it? and How, when, and why should they interact with the user when doing it? Agents promise to decrease human workloads and make the overall experience of interaction less stressful and more productive. Agents may assist by decreasing task complexity, bringing expertise to the user (in the form of expert critiquing, task completion, coordination), or simply providing a more natural environment with which to interact. The papers in this section report examples of each of these and also describe open architectures for building agent-based multimodal interfaces, the use of agents to express system and discourse status via facial displays, and the multimodal communication between animated computer agents.

5.6. Empirical Evaluation
The final section focuses on IUI evaluation. Whereas community-based evaluation using standard corpora and tasks has been applied in several areas related to intelligent interfaces (most notably, DARPA evaluations in speech, starting with Hirschman 1989; information extraction, e.g., MUC-6 1995; and information retrieval, e.g., TREC-1, Harmon 1993), relatively little evaluation has been systematically performed on IUIs. This section attempts to collect the best examples to foster more rigorous development and widespread use of evaluation in the future. Important dimensions of the problem include considering human-human versus human-computer communication, spoken versus written communication, unimodal versus multimodal communication, and direct versus mediated communication.
As the papers in this section illustrate, the evaluator and analyst have at their disposal a range of tools, such as Wizard-of-Oz experiments, simulations, and instrumentation of live environments, to evaluate a range of metrics using a variety of quantitative and qualitative measures and evaluation methodologies (e.g., corpus based, task based).

5.7. Content Index
Because many of the papers address issues that cut across section distinctions, Table 1 provides cross-references to facilitate access to chapters according to the following categories:
1. Media input and output data types investigated (e.g., text, speech, graphics, gesture), which, as a result, indicate if the investigations examine cross-stream or multiple media processing
2. The underlying models (e.g., of the user, discourse, task, situation) that are created, maintained, and exploited
3. Representational devices utilized, such as numerical-, rule-, plan-, model-, or agent-based processing
4. Application areas addressed (e.g., decision/design support, information access or creation, training)

TABLE 1. Content-Based Index



6. CONCLUSION
Intelligent interfaces promise to improve the quality of interaction for all who interact with computers, at work and at play. They promise
more efficient interaction: enabling more rapid task completion with less work.
more effective interaction: doing the right thing at the right time, tailoring the content and form of the interaction to the context of the user, task, and dialog.
more natural interaction: supporting spoken, written, and gestural interaction, ideally as if interacting with a human interlocutor.
When these interfaces are created in a model-based fashion, modifying their behavior will require model changes, not reprogramming. This will reduce the time, cost, and expertise required to develop interfaces and, at the same time, will facilitate the creation of more principled interfaces. Intelligent interface technology will be essential to effective information interaction in the future. For example, better interaction via the web (Brusilovsky 1996) has been identified as a challenging problem, and intelligent web sites in the future promise to discover user and group skills and interests, tailor presentations, and automatically improve web site interfaces (Perkowitz and Etzioni 1997). In short, this area has the potential to improve the quality and effectiveness of interaction for everyone who communicates with a machine in the future. To achieve these benefits, however, we must overcome the remaining fundamental problems outlined in the chapters herein.

7. RESOURCES
A number of resources for teachers, students, and researchers contain additional information about this subject area. Several relevant collections of papers are cited in the references at the end of this introductory section, notably, Intelligent User Interfaces (Sullivan and Tyler 1991), Intelligent Multimedia Interfaces (Maybury 1993), and Readings in Human-Computer Interaction (Baecker et al. 1995). Maybury (1995) provides a summary of research in multimedia parsing and generation. Key journals (and associated World Wide Web sites) in which new results in intelligent user interfaces appear include the international journals Human-Computer Interaction (www.parc.xerox.com/istl/projects/HCI), including a special issue on multimodal interfaces (Oviatt and Wahlster 1997), User Modeling and User-Adapted Interaction (umuai.informatik.uni-essen.de), Artificial Intelligence (www.elsevier.com/locate/artint), Cognitive Science, and ACM Transactions on Graphics, as well as more general forums such as Communications of the ACM and IEEE Computer.
In addition to journals and books, a series of conferences and workshops can provide additional sources, such as the annual ACM-sponsored International Conference on Intelligent User Interfaces (sigart.acm.org/iui99/), the International Conference on User Modeling, the International Workshop on Advanced Visual Interfaces (AVI), and the User Interface Software and Technology (UIST) conference. Proceedings from a number of annual or semiannual conferences typically contain sessions on intelligent user interfaces, such as the conference of the Association for Computing Machinery Special Interest Group on Computer Human Interaction (www.acm.org/sigchi), the American Association for Artificial Intelligence (AAAI) National Conference on Artificial Intelligence, the European Conference on Artificial Intelligence (ECAI), and the International Joint Conference on Artificial Intelligence (IJCAI). There are also many related conferences and specialized workshops in subdisciplines, including speech and language processing, user and discourse modeling, multimedia, and intelligent training systems.
Finally, materials are increasingly available on-line, such as an on-line tutorial on intelligent multimedia interfaces (www.mitre.org/resources/centers/advanced_info/mark.html); an on-line survey, "State of the Art in Human Language Technology" (www.cse.ogi.edu/CSLU/HLTsurvey); a study by the National Research Council on Every Citizen Interfaces (www.nap.edu/readingroom/books/screen); a human-computer interaction index (is.twi.tudelft.nl/hci); and the Electronic Transactions on Artificial Intelligence (www.ida.liu.se/ext/etai/indexframe.html), which includes a special area on intelligent user interfaces (www.dfki.de/~andre/etai/colloqb.html). Additional pointers are available from the ACL SigMedia special interest group in multimedia language processing (www.dfki.de/sigmedia). Related government initiatives include the European Intelligent Information Interfaces program (www.i3net.org) and DARPA's Intelligent Collaboration and Visualization program (snad.ncsl.nist.gov/~icv-ewg/).

8. REFERENCES
Allen, J., and Hendler, J. (eds.). 1990. Readings in Planning. San Francisco: Morgan Kaufmann.
Arens, Y.; Feiner, S.; Foley, J.; Hovy, E.; John, B.; Neches, R.; Pausch, R.; Schorr, H.; and Swartout, W. 1991.

Intelligent User Interfaces. (ISI/RR-91-288). University of Southern California Information Sciences Institute Research Report, Marina del Rey, CA.
Baecker, R.; Grudin, J.; Buxton, W.; and Greenberg, S. 1995. Readings in Human-Computer Interaction: Toward the Year 2000 (second edition). San Francisco: Morgan Kaufmann.
Binot, J-L.; Falzon, P.; Perez, R.; Peroche, B.; Sheehy, N.; Rouault, J.; and Wilson, M. D. 1990. Architecture of a Multimodal Dialogue Interface for Knowledge-Based Systems. In Proceedings of Esprit '90 Conference, 412-433. Dordrecht, Netherlands: Kluwer Academic Publishers.
Blattner, M., and Dannenberg, R. (eds.). 1992. Multimedia Interface Design. New York: ACM Press.
Bordegoni, M.; Faconti, G.; Feiner, S.; Maybury, M.; Rist, T.; Ruggieri, S.; Trahanias, P.; and Wilson, M. 1997. A Standard Reference Model for Intelligent Multimedia Presentation Systems. The International Journal on the Development and Application of Standards for Computers, Data Communications and Interfaces, Special Issue on Intelligent Multimedia Presentation Systems. Vol. 18, Nos. 6 and 7. Amsterdam: Elsevier Science.
Brachman, R. J., and Levesque, H. J. (eds.). 1985. Readings in Knowledge Representation. San Francisco: Morgan Kaufmann.
Brusilovsky, P. 1996. Methods and Techniques of Adaptive Hypermedia. In Brusilovsky, P., and Vassileva, J. (eds.), Special Issue on Adaptive Hypertext and Hypermedia, User Modeling and User-Adapted Interaction 6(2-3): 87-129.
Everett, R.; Zraket, C. A.; and Bennington, H. 1957. SAGE: A Data Processing System for Air Defense. In Proceedings of the Eastern Joint Computer Conference, Washington, DC, December. IRE (now IEEE), 148-155. Reprinted in Everett, R.; Zraket, C. A.; and Bennington, H. 1983. SAGE: A Data Processing System for Air Defense. Annals of the History of Computing, 5(4): 330-339. October.
Grosz, B. J.; Sparck Jones, K.; and Webber, B. 1986. Readings in Natural Language Processing. San Francisco: Morgan Kaufmann.
Hirschman, L. (chair). 1989. Proceedings of the Speech and Natural Language Workshop, February. San Francisco: Morgan Kaufmann.
Horvitz, E. 1997. Agents with Beliefs: Reflections on Bayesian Methods for User Modeling. Invited Talk, Sixth International Conference on User Modeling, Chia Laguna, Sardinia, June 2-5. (http://www.research.microsoft.com/research/dtg/horvitz/lum.htm)
Kobsa, A.; Allgayer, J.; Reddig, C.; Reithinger, N.; Schmauks, D.; Harbusch, K.; and Wahlster, W. 1986. Combining Deictic Gestures and Natural Language for Referent Identification. In Proceedings of 11th International Conference on Computational Linguistics, Bonn, Germany, 356-361.
Kobsa, A., and Wahlster, W. (eds.). 1989. User Models in Dialog Systems. Berlin: Springer-Verlag.
Maybury, M. T. 1993. Intelligent Multimedia Interfaces. Menlo Park, CA/Cambridge, MA: AAAI/MIT Press.
Maybury, M. T. 1994. Automated Explanation and Natural Language Generation. In Sabourin, C. (ed.), Computational Text Generation, Bibliography, Montreal: Infolingua, 1-88.
Maybury, M. T. 1995. Research in Multimedia Parsing and Generation. In McKevitt, P. (ed.), Artificial Intelligence Review: Special Issue on the Integration of Natural Language and Vision Processing, 9(2-3): 103-127.
Maybury, M. T. 1997. Intelligent Multimedia Information Retrieval. Menlo Park, CA/Cambridge, MA: AAAI/MIT Press.
MUC-6. 1995. Proceedings of the Sixth Message Understanding Conference. Advanced Research Projects Agency Information Technology Office, Columbia, MD, November 6-8.
Oviatt, S., and Wahlster, W. (eds.). 1997. Special Issue on Multimodal Interfaces. Human-Computer Interaction: A Journal of Theoretical, Empirical, and Methodological Issues, of User Science and of System Design, 12(1, 2). Mahwah, NJ: Lawrence Erlbaum Associates.
Pattabhiraman, T., and Cercone, N. (eds.). 1991. Computational Intelligence: Special Issue on Natural Language Generation, 7(4) November. National Research Council of Canada.
Pelachaud, C.; Badler, N. I.; and Steedman, M. 1996. Generating Facial Expressions for Speech. Cognitive Science, 20(1): 1-46.
Perkowitz, M., and Etzioni, O. 1997. Adaptive Web Sites: An AI Challenge. In Proceedings of Fifteenth International Joint Conference on Artificial Intelligence (IJCAI '97), Nagoya, Japan, August 23-29, 16-21.
Schneider-Hufschmidt, M.; Kühme, T.; and Malinowski, U. (eds.). 1993. Adaptive User Interfaces: Principles and Practice. Amsterdam: Elsevier.
Sullivan, J., and Tyler, S. (eds.). 1991. Intelligent User Interfaces. New York: ACM Press.
TREC-1. 1993. First Text REtrieval Conference (TREC-1), NIST, Harmon, Donna K. (ed.). NIST Special Publication 500-207, March.
Waibel, A., and Lee, K. 1990. Readings in Speech Recognition. San Francisco, CA: Morgan Kaufmann.
Webber, B. L., and Nilsson, N. J. (eds.). 1985. Readings in Artificial Intelligence: A Collection of Articles. San Francisco: Morgan Kaufmann.
