
Katherine Lang

University of Rochester
Department of Computer Science
Rochester, New York, U.S.A.
lang@cs.rochester.edu
https://www.cs.rochester.edu/~lang/

Research Interests

My current research focus is the structure of dialogue managers within natural language dialogue
systems that interact with humans to gather information. My goal is to better understand this
structure and to explore how adding simple components can empower a basic system.
1.1 Previous Work

In order to accomplish my goal, I created a state-based dialogue manager paired with a deep
semantic parser. The Artificially Intelligent Interview Dialogue system, or AID, is designed to
extract information from user-provided input. The system presents a question, waits for the user
to record his or her answer, and then processes and evaluates the given answer.
1.2 Current Work

The common assumption that the type of control a natural language dialogue system (NLDS) uses
dictates the aptitude of the system is based on an outdated practice. When describing NLDSs, the
parser and the dialogue manager are treated as two distinct components of the same system, but in
practice the parser and the dialogue manager are often intertwined. We found that this practice
inhibits the dialogue manager, and that by encoding the two components as separate entities we can
get more power out of even the most simplistic type of control.
1.3 AID

AID is a finite-state natural language dialogue system coupled with a variant of the TRIPS parser
(Ferguson and Allen 1998). It is designed to handle natural language answers within a domain of
question-answer types derived from interview questions and their potential answers, allowing for a
flexible and wide array of possible user responses.
As Figure 1 illustrates, the instant messenger-like
GUI receives messages from the dialogue manager and
sends messages to the parser.

Figure 1: AID architecture


The user interacts with AID by typing into the GUI and submitting the text to indicate that the
input is complete. The text is then sent to the parser to be processed. The parser outputs the
best three logical forms (LFs), linguistically based detailed semantic representations, and passes
them to the dialogue manager (Allen et al. 2007).
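This GUI-to-parser-to-dialogue-manager round trip can be sketched roughly as follows. This is a
toy illustration only: the function names and the message format are invented stand-ins, not the
actual TRIPS or AID interfaces.

```python
def process_turn(user_text, parse, dialogue_manager):
    """One turn of the pipeline: text -> parser -> top-3 LFs -> DM reply.

    `parse` is a stand-in for the TRIPS parser and must return logical
    forms ranked best-first; `dialogue_manager` consumes them and returns
    the next message to display in the GUI. Both are hypothetical.
    """
    logical_forms = parse(user_text)[:3]    # keep the best three LFs
    return dialogue_manager(logical_forms)  # next message for the GUI
```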
The dialogue manager selects the top-ranked LF from the parser and attempts to extract the desired
information, indicated by the expected question-answer type of the state. AID's dialogue manager
does not rely on word pattern matching for extraction, but uses broad-coverage semantic rules that
can be reused in other states and interviews. Once the state is satisfied, the dialogue manager
transitions to the next state and proceeds with the next interview question, which is sent to the
GUI and made visible to the user.
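A finite-state dialogue manager of this shape could be sketched as below. The data structures are
hypothetical, chosen only to illustrate the satisfy-then-transition loop; the paper does not give
AID's actual implementation.

```python
class State:
    """One interview state: a question plus an extraction rule."""
    def __init__(self, question, extract):
        self.question = question
        # extract: logical form -> extracted value, or None if unsatisfied
        self.extract = extract

def run_interview(states, get_top_lf):
    """Ask each state's question until its extraction rule is satisfied.

    `get_top_lf` is an invented stand-in for the GUI + parser round trip:
    given a question, it returns the top-ranked logical form of the
    user's reply.
    """
    answers = {}
    for state in states:
        while True:
            lf = get_top_lf(state.question)
            value = state.extract(lf)
            if value is not None:       # state satisfied: save the answer
                answers[state.question] = value
                break                   # transition to the next state
            # otherwise, the same question is repeated
    return answers
```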
To better illustrate the flow of AID, we will walk through a simple example. Suppose we want to
know the user's age. The dialogue manager enters the desired state and sends the question, "How
old are you?", to the GUI for the user to read. Once the user has input a response, it is sent to
the TRIPS parser. The state then takes the logical form and attempts to match it to one in a set
of extraction rules. For example, :SCALE ONT::DURATION-SCALE :UNIT (:* ONT::TIME-UNIT ONT::YEAR)
:AMOUNT X is an extraction rule. This rule matches phrases such as "X years old", "I am X years
old", "X years", and "I am X years".

We expect a number, X, to satisfy this state, so all extraction rules for this state contain some
variation of ONT::NUMBER :VALUE X or :AMOUNT X, indicating the value. The age is then extracted
from the LF and saved as the answer to the question. The state, now satisfied, transitions to the
next state. If the LF did not contain a number, then the state would not be satisfied and the
question would be repeated until an age can be extracted from the input. AID also has sub-states
implemented, so that questions can be skipped, such as in the case of the user providing extra
information that answers a follow-up question.
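The rule-matching step in the age example could be approximated as follows. This is a toy matcher
over flat feature dictionaries, invented for illustration; real TRIPS LFs are richer nested
structures, and the '?X' variable notation here is an assumption, not AID's actual rule syntax.

```python
def match_rule(rule, lf):
    """Match an extraction rule against a logical form.

    Rule values starting with '?' are variables; all other values must
    match the LF exactly. Returns the variable bindings on success, or
    None on mismatch.
    """
    bindings = {}
    for key, pattern in rule.items():
        if key not in lf:
            return None                 # required feature missing
        if isinstance(pattern, str) and pattern.startswith("?"):
            bindings[pattern] = lf[key]  # bind variable, e.g. ?X -> 27
        elif lf[key] != pattern:
            return None                 # constant feature mismatch
    return bindings

# The paper's age rule, written in this toy notation:
AGE_RULE = {":SCALE": "ONT::DURATION-SCALE",
            ":UNIT": "(:* ONT::TIME-UNIT ONT::YEAR)",
            ":AMOUNT": "?X"}
```

A matching LF yields a binding for ?X, which the state saves as the answer; an LF with no number
fails to match, and the question is repeated.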
1.4 Evaluation

Traditionally, in order to evaluate AID, we would compare its characteristics and capabilities to
those of other natural language dialogue systems. For example, we originally compared AID to four
example systems: VoiceXML, Olympus, TrindiKit, and TRIPS. TRIPS and TrindiKit had more
capabilities than AID, as was expected of these agent-based systems. However, when comparing AID
to Olympus and frame-based VoiceXML systems, the comparison was not as clearly defined, and it
seemed as though AID would perform better, as if it had more capabilities, than either of these
systems.
As we can see, this comparison practice can be very vague and, moreover, does not consider the
user experience. Thus, we designed and implemented a user study collecting a variety of
qualitative and quantitative metrics, including metrics related to the user experience, gathered
from a variety of NLDS comparisons.
The results of the user study are pending.

Future of Spoken Dialog Research

The future of spoken dialogue research will likely revolve around the application of the systems.
This will then motivate research into improvements to the overall system. But will this make the
area stronger or more disjoint? The answer is yes to both. By treating the components of a
dialogue system as independent entities, each component will be researched and strengthened. With
each generation of students, researchers will learn and explore more about the specific components
than about the overall system, and thus the area will become disjoint. That is why it is important
that there remain researchers of the system as a whole, in order to provide perspective and to
steer the advancement of each component toward the advancement of the natural language dialogue
system.
As dialogue systems gain popularity and efficiency in applications, ethics starts to become a
major issue. Ethical concerns are always present, but will become more of a concern as more and
more information is collected in order to improve or test these systems. Where and when to draw
the line will be key questions for many areas of computer science in the future.
Lastly, this generation of young researchers has the potential to produce great applications and
make advancements in this area. However, we are reaching a point where evaluating and comparing
these systems is becoming less well defined. More research needs to be done to illuminate a set of
metrics that will not only provide a useful method of comparison and evaluation, but that also
encompass the user experience. Otherwise, we ignore an important factor of natural language
dialogue systems and potentially set this area on a path comprised of academic tools that no one
wants to fund.

Suggestions for discussion

Some suggestions for discussion are listed below.

- Useful metrics (quantitative and qualitative) for comparing NLDSs with any type of control
- Ethics of having a computer-assisted person- or self-interviewing system collect personal data

References
James F. Allen, Myroslava Dzikovska, Mehdi
Manshadi, and Mary Swift. 2007. Deep linguistic
processing for spoken dialogue systems. Workshop
on Deep Linguistic Processing, Association for
Computational Linguistics. Prague.
George Ferguson and James Allen. 1998. TRIPS: An
Integrated Intelligent Problem-Solving Assistant.
Proceedings of the Fifteenth National Conference
on AI (AAAI-98). Madison, WI.

Biographical Sketch
Katherine is currently a research assistant for James
Allen at the University of Rochester. She graduated
from Dickinson College in Pennsylvania, United
States of America, with a B.S. in both Mathematics
and Computer Science. Then, Katherine went on to earn her M.S. in Computer Science at the
University of Rochester. She loves reading, traveling, Zumba, and talking with people.
