
GameDev.net -- Artificial Intelligence


Artificial Intelligence (Total Resources: 86)

What is it? How can I do it? What are the various algorithms? And more...

For more information about Artificial Intelligence, check out:
● Our Artificial Intelligence books.
● Our Artificial Intelligence forum.

Jump To: AI Links | AI Theory | Applications | Documentation | Gaming | Genetic Algorithms | Introduction | Natural Language Processing | Neural Networks | Pathfinding and Searching

AI Links
aboutAI.net [Added: 5/13/2002 Clicks: 579]
A second generation AI portal and a successor to ai.about.com, featuring hundreds of articles and thousands of AI-related links.

Amit's Game Programming Information [Added: 5/17/2000 Clicks: 7156]
Outstanding large collection of game development articles, concentrating on path-finding and AI.

Artificial Intelligence [Added: 11/1/2001 Clicks: 3520]
A relatively new site dedicated to all aspects of AI.

Artificial Intelligence Resources [Added: 5/17/2000 Clicks: 5136]
Tons of links. A terrific starting point for your AI research.

Aske Plaat Minimax [Added: 5/17/2000 Clicks: 1890]
Apparently Mr. Plaat was a faculty member of the University of Alberta. I don't think he's there anymore, but he wrote up some of the best scholarly documents about minimax tree-searching that you'll find. For the uninitiated, minimax is a way to find optimum moves in board games.

GA Archives [Added: 5/17/2000 Clicks: 2151]
A very complete Genetic Algorithms site. Includes mailing list archives and pointers to lots of papers.

Machine Learning in Games [Added: 5/17/2000 Clicks: 2693]
Good starting point for developing games that learn.

The Game AI Page [Added: 5/17/2000 Clicks: 4143]
Steve "Ferretman" Woodcock's excellent repository of game AI related information.

The Genetic Programming Cookbook [Added: 5/17/2000 Clicks: 2219]
A good GA site with a tutorial and lots of info.

The University of Alberta GAMES Group [Added: 5/17/2000 Clicks: 1590]
No free code, but lots of projects writing games that clobber the best folks out there. Home of Chinook (which you can play online), which is probably the best checkers-playing program out there.

AI Theory

A Modular Framework for Artificial Intelligence Based on Stimulus Response Directives (Charles Guy) [Added: 2/21/2000]
An approach to modeling the biological nervous system without using neural nets.

Allis' Ph.D. thesis: Searching for Solutions in Games and AI [Added: 7/16/1999]

Does the Top-down or Bottom-Up Approach Best Model the Human Brain? (J. Matthews) [Added: 7/30/1999]
Huge essay (7000+ words) detailing many aspects of AI, including conceptual representation, Mentalese, the CYC project and COG. This was written as my Extended Essay for the International Baccalaureate. Abstract, diagrams, bibliography all included!

Introduction to Machine Vision (James Matthews) [Added: 7/31/1999]
An introduction to machine vision. Edge detection and prototyping are discussed.

Logic Programming with Fuzzy Sets (Ludek Matyska) [Added: 7/31/1999]
Zipped .PDF format.

Philosophical Arguments For and Against AI (J. Matthews) [Added: 7/31/1999]
Philosophy often helps artificial intelligence in many fields, yet when looking at the concept of AI as a whole, it often criticizes the very fundamental parts of AI, questioning whether intelligent machines will ever be possible. This essay looks at some of the different schools of thought that agree and disagree with AI.

Production Systems (S. Hsiung) [Added: 7/31/1999]
Most AI systems today are production systems; in fact, many argue that all computer programs are production systems. How are production systems different from the rest? What are their strengths and weaknesses? How do they work? Many task-structured programs, from puzzle solvers to chess-playing programs to medical diagnosis expert systems to the monsters in Quake 2, are production systems.

Project AI (Mark Lewis Baldwin and Bob Rakosky) [Added: 7/28/1999]
This document covers the basics of designing an artificial intelligence for a strategy game.

The Intuitive Algorithm (Abraham Thomas) [Added: 10/7/1999]
An essay concerning the idea that intuition may be a pattern recognition algorithm. Due to its scientific nature, it can be a difficult read for some.

The Natural Mind: Consciousness and Self-Awareness (S. Hsiung) [Added: 7/31/1999]
Perhaps another of the greatest challenges in AI is to create something that has knowledge of having knowledge.

The Turing Machine (S. Hsiung) [Added: 7/31/1999]
Production systems continued: how they relate to Turing machines, and whether they are truly capable of computing anything that is computable.

Applications

AI in Gaming (James Matthews) [Added: 7/30/1999]
Discusses current and future AI in shoot-em-up, flight simulator and board games.

Applications in Music (J. Matthews) [Added: 7/30/1999]
This essay looks at AI and music, and more specifically the guitar: the problems that arise as effects are added, as speed increases, and as more and more instruments are added to a piece of music.

Military Applications of AI (James Matthews) [Added: 7/30/1999]
This essay looks at AI and the military. It covers some of the uses, and also the moral issues of allowing a computer to autonomously destroy targets.

Documentation

Hierarchal AI (Andrew Luppnow) [Added: 9/7/1999]
This document proposes an approach to the problem of designing the AI routines for intelligent computer wargame opponents.

Multilayer Feedforward Networks and the Backpropagation Algorithm (S. Hsiung) [Added: 7/31/1999]
A guide to creating multilayer feedforward networks, with tips and details on how to implement the backpropagation algorithm in C++, plus heavy theoretical discussion.

Gaming

A Practical Guide to Building a Complete Game AI: Volume I (Geoff Howland) [Added: 10/12/1999]
Part 1 of this series covers state machines, unit actions, and grouping.

A Practical Guide to Building a Complete Game AI: Volume II (Geoff Howland) [Added: 10/12/1999]
Part 2 covers unit goals and pathfinding.

AI for Games and Animation: A Cognitive Modeling Approach (John Funge) [Added: 2/21/2000]
Transcends geometric, kinematic, physical, and behavioral models to examine the possibilities of cognitive models.

AI In Empire-Based Games (Various) [Added: 7/31/1999]
A thread on implementing AI in strategy and empire-building games.

AI Madness: Using AI to Bring Open-City Racing to Life (Joe Adzima) [Added: 1/29/2001]
Provides a strategy for programmers who are trying to create AI for open-city racing games.

AI Uncertainty (Scott Thomson) [Added: 9/7/1999]
Discusses using Bayesian networks to deal with uncertainty in game AI.

Artificial Emotion: Simulating Mood and Personality (Ian Wilson) [Added: 2/21/2000]
A look at how the appearance of emotion can be used in games.

Building Brains Into Your Games (André LaMothe) [Added: 7/31/1999]
Discusses various AI topics ranging from the simple to the complex.

Chess Programming Part I: Getting Started (François Dominic Laramée) [Added: 5/17/2000]
This series takes the reader through the algorithms and strategies needed to create the artificial intelligence involved in chess and similar games. Part One gives an overview of the material that will be covered.

Chess Programming Part II: Data Structures (François Dominic Laramée) [Added: 6/11/2000]
In the second installment in this series about developing an opponent AI, the most common and useful data structures are explained.

Chess Programming Part III: Move Generation (François Dominic Laramée) [Added: 7/17/2000]
The third installment in this series examines the two major move generation strategies and explains how to choose between them for a given application.

Chess Programming Part IV: Basic Search (François Dominic Laramée) [Added: 8/6/2000]
Part IV of this series focuses on the basics of two-agent search in strategy games: why it is useful, how to do it, and what it implies for the computer's style of play.

Chess Programming Part V: Advanced Search (François Dominic Laramée) [Added: 9/6/2000]
In this next-to-last article, FDL examines advanced search-related techniques which can speed up and/or strengthen your chess-playing program.

Chess Programming Part VI: Evaluation Functions (François Dominic Laramée) [Added: 10/8/2000]
The series ends with a close look at creating a good evaluation function, and includes a demo chess program.

Designing Need-Based AI for Virtual Gorillas (Ernest Adams) [Added: 1/6/2001]
Describes need-based AI in the context of virtual gorillas, but the information can be applied generally.

Evolving Go Playing Strategy in NN (P. Donnelly) [Added: 7/31/1999]

Game AI: The State of the Industry (Steve Woodcock) [Added: 2/21/2000]
Ferretman looks at the current trends of AI in games.

Game AI: The State of the Industry, Part One (Steve Woodcock) [Added: 11/2/2000]
Ferretman's 2000 update on the current importance of AI in games.

Game AI: The State of the Industry, Part Two (David C. Pottinger and Prof. John E. Laird) [Added: 11/13/2000]
The second installment of Game Developer magazine's annual investigation into game AI presents two more experts discussing this ever-evolving field.

Game Developers Conference 2001: An AI Perspective (Eric Dybsand) [Added: 5/11/2001]
Highlights some of the more salient points from the computer game AI related sessions the author attended during GDC 2001.

Machine Learning, Game Play, and Go (David Stoutamire) [Added: 7/31/1999]

More AI in Less Processor Time: Egocentric AI (Ian Wright) [Added: 6/21/2000]
Presents techniques to control and manage real-time AI execution, techniques that open up the possibility for future hardware acceleration of AI.

Recognizing Strategic Dispositions (Steve Woodcock) [Added: 7/5/2000]
A compilation of a 1995 newsgroup thread about creating an AI that recognizes strategic situations.

Searching for Solutions in Games and Artificial Intelligence (L.V. Allis) [Added: 7/31/1999]
In PostScript ZIP format.

Genetic Algorithms

Application of Genetic Programming to the Snake Game (Tobin Ehlis) [Added: 8/10/2000]
This article covers the development and analysis of a successful function set that allows the evolution of a genetic program enabling the AI to attain maximal performance.

GA Playground (Ariel Dolan) [Added: 9/7/1999]
A genetic algorithm toolkit for use with Java.

Genetic Algorithm Example: Diophantine Equation (S. Hsiung) [Added: 7/30/1999]
A step-by-step example of how genetic algorithms can be used to solve Diophantine equations. Now with accompanying C++ code!

Genetic Algorithms Tutorials (Darrell Whitley) [Added: 7/31/1999]
ZIPped PostScript format.

Representing Trees in Genetic Algorithms (C. Palmer, A. Kershenbaum) [Added: 7/31/1999]
ZIPped PDF format.

Introduction

An Introduction to Artificial Intelligence (S. Hsiung) [Added: 7/30/1999]
A history-oriented introduction to AI: how thought and beliefs surrounding AI have evolved since the 1950s, and what we can hope for in the future.


An Introduction to Genetic Algorithms and Genetic Programming (S. Hsiung and J. Matthews) [Added: 7/30/1999]
Genetic algorithms are based on biological natural evolution, including reproduction, mutation and natural selection. They have very many applications, from math to music and puzzles. The effectiveness that GAs (genetic algorithms) have as heuristic search systems certainly makes them very interesting.

An Introduction to Neural Networks (Dr. Leslie Smith) [Added: 9/7/1999]
A very good introductory article on neural network theory.

An Introduction to Robotics (James Matthews) [Added: 7/31/1999]
A simple essay detailing how AI and ALife can be applied to robotics. Also focuses on two example robots, Cog and Kismet.

Artificial Intelligence and Skepticism (S. Hsiung) [Added: 7/30/1999]
Can machines really be intelligent? What is understanding? Can we create a human being? Clears up skepticism, with an introduction to the fascinating world of AI.

Artificial Life (James Matthews) [Added: 7/30/1999]
ALife is a fascinating area to study. This essay gives you a brief introduction to cellular automata, flocking and other areas of ALife. Watch Conway's Life run and stabilize as you read the essay!

Aspects of Artificial Intelligence Systems (S. Hsiung) [Added: 7/30/1999]
A brief summary of topics from neural networks to artificial life systems, and how they compare to each other.

Basics of Game AI (Geoff Howland) [Added: 7/28/1999]
As the title states, this article provides a brief but thought-provoking overview of the basics of game AI.

Comp.AI.Games FAQ [Added: 7/31/1999]
Though it hasn't been updated since January 2000, this still contains some useful information.

Introduction to Learning in Games (Geoff Howland) [Added: 7/28/1999]
This document provides an introduction on how to make your games learn from their players.

Introduction to Neural Networks (James Matthews) [Added: 7/31/1999]
A simple introduction to neural networks: how they're structured and the different types of NN.

The History of AI (S. Hsiung) [Added: 7/30/1999]
An introduction to AI and a report on the major advances in AI from the 1950s to the 1990s.

Natural Language Processing

An Introduction to Natural Language Processing (S. Hsiung) [Added: 7/31/1999]
Do natural language systems such as ELIZA possess actual understanding? Or are they truly, simply "chatterboxes", programs that just play with words, syntax, and semantics? This essay provides a brief look at the Turing test, real understanding, and the issues that face many natural language processing researchers.

An Introduction to Natural Language Theory (S. Hsiung) [Added: 7/31/1999]
Parsers, syntax, and semantics. Have you ever wondered what makes many NLP programs run? This tutorial provides a brief overview of what makes chatterboxes and other NLP programs tick.

Conceptual Representation and Scripting (James Matthews) [Added: 7/31/1999]
A basic essay detailing the ideas of conceptual representation and scripting. The essay has some output from a few CR programs. A good essay for those not familiar with these concepts.

Object Oriented Parsing (U. Hahn) [Added: 7/31/1999]
Introduces ParseTalk, a grammar model for natural language analysis.

Neural Networks

Back Propagation Neural Network Engine source (Patrick Ko Shu-pui) [Added: 7/31/1999]
(.ZIP, 25 KB)

Four Neural Networks (Various) [Added: 7/31/1999]
A collection of four neural networks in a .ZIP file.

Neural Netware (André LaMothe) [Added: 10/7/1999]
A nice introduction to neural nets.

Neural Network FAQ (Warren S. Sarle) [Added: 9/7/1999]
FAQ from comp.ai.neural-nets. Updated monthly.

Neural Network FAQ (Lutz Prechelt) [Added: 7/31/1999]

Neurons and Neural Networks: The Most Abstract View (Michael A. Arbib) [Added: 7/31/1999]

Pathfinding and Searching

A* Algorithm Tutorial (Justin Heyes-Jones) [Added: 9/7/1999]
A great start for learning the A* algorithm.

Chess Tree Search (Paul Verhelst) [Added: 11/1/2001]
An excellent tutorial on minimax searching. Despite the name, it's not just for chess.

Coordinated Unit Movement (Dave Pottinger) [Added: 12/22/1999]
Focusing mainly on the RTS, this article discusses pathfinding for multiple units.


Implementing Coordinated Movement (Dave Pottinger) [Added: 12/22/1999]
Shows how to apply the methods discussed in Coordinated Unit Movement.

Knowing the Path (Richard Fine) [Added: 6/18/2002]
Describes an 'expandable path table,' which allows NPCs both to navigate through an environment and to explore and learn it, rather than having perfect knowledge from the start.

Motion Planning Using Potential Fields (Stefan Baert) [Added: 7/15/2000]
Describes several pathfinding techniques using potential fields that can sometimes be used in place of the popular A* algorithm.

Pawn Captures Wyvern: How Computer Chess Can Improve Your Pathfinding (Mark Brockington) [Added: 6/27/2000]
Summarizes some recently proposed enhancements to A*, and shows why you might want to consider a "computer chess" style A* approach the next time you have to implement A*.

Smart Moves: Intelligent Pathfinding (Bryan Stout) [Added: 10/7/1999]
Covers quite a few search algorithms and their effectiveness in pathfinding.

Toward More Realistic Pathfinding (Marco Pinter) [Added: 2/17/2002]
Walks you through some modifications to the A* algorithm that can refine the movements of your game's units.



Gamasutra - Features [11.10.99]

A Modular Framework for Artificial Intelligence Based on Stimulus Response Directives
by Charles Guy
November 10, 1999
There are three fundamental technologies used in modern computer games: graphics, physics, and artificial intelligence (AI). But while graphics and physics have shown great progress in the last five years, current AI still continues to display only simple repetitive behavior, which is of little replay value. This deficiency, unfortunately, is often sidestepped, and the emphasis switched to multi-player games, which take advantage of real human intelligence.

In this article, I have attempted to model AI based on the functional anatomy of the biological nervous system. In the pure sense of the word, a biological model of AI should use neural networks for all stimulus encoding and motor response signal processing. Unfortunately, neural networks are still difficult to control (for game design) and very computationally expensive. Therefore I have chosen a hybrid model, which uses a "biological" signal path framework in conjunction with more traditional heuristic methods for goal selection. The main features of this model are:

1. Stimulus detection based on signal strength thresholds.

2. Target goal selection based on directives, known goals and acquired goals.

3. Target goals acquired by servo feedback loops that drive the body.

4. Personalities constructed from sets of directives. Because the directives are modular, it is fairly straightforward to construct a wide range of distinctive personalities. These personalities can display stereotypical behavior while still retaining enough flexibility to exercise "judgment" and adapt to unique situations. The framework of this model should be useful to a wide range of applications because of its generic nature.

Some Background
This AI model was developed for use in
SpecOps II, a tactical infantry combat
simulation of Green Beret covert
missions. While the emphasis of the
project has been on realism and squad
level tactics, it still falls under the
category of a first-person shooter. The
original SpecOps project was based on
the U.S. Army Ranger Corps and was one
of the first "photo-realistic" tactical
combat simulators released for the
computer gaming market. The
combination of high quality motion
capture data, photo-digitized texture
maps and sound effects recorded from
authentic sources produces a rather
compelling combat experience. Although
the original game was fun to play, it was
justifiably criticized for having poor AI.
Therefore one of the major goals for

SpecOps II was to improve the AI. The
previous game logic and AI was based on
procedural scripts; the new systems are
based on data driven ANSI C code. (My
experience has convinced me that data
driven code is more reliable, flexible and
extensible than procedural scripts.) When
the data structures that drive the code
are designed correctly, the code itself can
become very simple.
Table 1. Parallels to Biological Nervous System

Functional Unit                  | Biological System
---------------------------------|------------------------------
Stimulus Detection Unit          | Visual / Auditory Cortices
Directives                       | Reflex / Conditioned Response
Known / Acquired Goals           | Short-Term Memory
Goal Selector / Navigator        | Frontal Cortex
Direction / Position Goal Servos | Motor Cortex / Cerebellum
Typical Behavior in SpecOps II

In the course of a normal game, a player can order one of his buddies to attack an enemy. If this enemy is blocked by the world, the path finder will navigate him until there is a clear path. Once the path is clear, the direction servo points the AI at the enemy and he begins firing his gun. If that enemy is taken out, the AI may engage other enemies that were aroused by the weapons fire. If all the known enemies have been taken out, the buddy returns to formation with his commander.

[Figure 1. Block diagram of the data flow between functional units for this brain model.]

Another typical sequence might begin when a player issues a "demolish position" command to a squad member. The AI will then navigate to the position goal, place a satchel charge and yell out: "fire in the hole!" The "get away from explosive" directive will then cause him to move outside of the danger radius of the explosive. I have observed an interesting case where the initial evasive maneuver led to a dead end, followed by backtracking towards the explosive object. Eventually the navigator got the AI a safe distance away from the explosive in time.
Overview of Data Flow

The data flow begins with the Stimulus Detection Unit, which filters sound events and visible objects and updates the Known Goals queue. The Goal Selector then compares the Known Goals and Acquired Goals against the personality and commander directives and then selects Target Goals. The navigator determines the best route to get to a position goal using a path finding algorithm. The direction and position goal servos drive the body until the Target Goals are achieved, and then the Acquired Goals queue is updated.

Data Structures

The primary data structures used by this brain model are BRAIN_GOAL and DIRECTIVE. AI personalities are represented by an array of DIRECTIVE structures and other parameters. The following is a typical personality declaration from SpecOps II:

PERSONALITY_BEGIN( TeammateRifleman )
    PERSONALITY_SET_FIRING_RANGE( 100000.0f )              // must be this close to fire gun (mm)
    PERSONALITY_SET_FIRING_ANGLE_TOLERANCE( 500.0f )       // must point this accurately to fire (mm)
    PERSONALITY_SET_RETREAT_DAMAGE_THRESHOLD( 75 )         // retreat if damage exceeds this amount (percent)
    DIRECTIVES_BEGIN
        DIRECTIVE_ADD( TEAMMATE_FIRING_GOAL,    AvoidTeammateFire,        BaseWeight+1, AvoidTeammateFireDecay )
        DIRECTIVE_ADD( EXPLOSIVE_GOAL,          GetAwayFromExplosive,     BaseWeight+1, NoDecay )
        DIRECTIVE_ADD( HUMAN_TAKES_DAMAGE_GOAL, BuddyDamageVocalResponce, BaseWeight,   AcquiredGoalDecay )
        DIRECTIVE_ADD( DEMOLISH_POSITION_GOAL,  DemolishVocalResponce,    BaseWeight,   AcquiredGoalDecay )
        DIRECTIVE_ADD( SEEN_ENEMY_GOAL,         StationaryAttackEnemy,    BaseWeight-1, SeenEnemyDecayRate )
        DIRECTIVE_ADD( HEARD_ENEMY_GOAL,        FaceEnemy,                BaseWeight-2, HeardEnemyDecayRate )
        DIRECTIVE_ADD( UNCONDITIONAL_GOAL,      FollowCommander,          BaseWeight-3, NoDecay )
        DIRECTIVE_ADD( UNCONDITIONAL_GOAL,      GoToIdle,                 BaseWeight-4, NoDecay )
    DIRECTIVES_END
PERSONALITY_END

The DIRECTIVE structure contains four fields:

● Goal type (known goals, acquired goals, unconditional goals)

● Response function pointer (called if the priority weight is best; assigns target goals)

● Priority weight (importance of the directive)

● Decay rate (allows older goals to become less important over time)

The BRAIN_GOAL structure contains all necessary data for object recognition and action response.
The stimulus detection fields are:
● Goal type (i.e. seen/heard teammates, seen/heard enemies, heard gun fire, acquired goals)

● Goal object pointer (void *, cast to typed pointer based on object type)

● Goal position type (i.e. dynamic object position, fixed position, offset position etc.)

● Time of detection (timestamp in milliseconds)

● Previously known (true/false)


The response fields are:
● Action at target (IO_FIRE, IO_USE_INVENTORY etc.)

● Yaw velocity (degrees per second)

● Movement mode (Forward, Forward slow, Sidestep left, Sidestep right etc.)

● Inner radius (navigator threshold)

● Outer radius (goal selector threshold)

● Time of arrival (timestamp in milliseconds, for acquired goals)
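
Putting the two field lists together, the structures might be sketched in C roughly as follows. The member names and exact types are assumptions for illustration; only the fields themselves come from the lists above.

/* Minimal sketch of the two core structures, reconstructed from the field
 * lists above. Member names and types are assumptions, not shipped code. */
typedef struct BRAIN_GOAL BRAIN_GOAL;

typedef struct
{
    int      goalType;                       /* known, acquired or unconditional goal type */
    void   (*response)( BRAIN_GOAL *goal );  /* called if priority weight is best; assigns target goals */
    float    priorityWeight;                 /* importance of the directive */
    float    decayRate;                      /* lets older goals become less important over time */
} DIRECTIVE;

struct BRAIN_GOAL
{
    /* stimulus detection fields */
    int      goalType;                       /* seen/heard teammate, seen/heard enemy, heard gun fire, ... */
    void    *object;                         /* cast to a typed pointer based on object type */
    int      positionType;                   /* dynamic object, fixed position, offset position, ... */
    unsigned timeOfDetection;                /* timestamp in milliseconds */
    int      previouslyKnown;                /* true/false */

    /* response fields */
    int      actionAtTarget;                 /* IO_FIRE, IO_USE_INVENTORY, ... */
    float    yawVelocity;                    /* degrees per second */
    int      movementMode;                   /* forward, forward slow, sidestep left/right, ... */
    float    innerRadius;                    /* navigator (position servo) completion threshold */
    float    outerRadius;                    /* goal selector acquisition threshold */
    unsigned timeOfArrival;                  /* milliseconds, for acquired goals */
};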

The Stimulus Detection Unit

Modeling stimulus detection in a physical way can achieve symmetry and help fulfill the user's
expectations (i.e. if I can see him, he should be able to see me). This also prevents the AI from
receiving hidden knowledge and having an unfair advantage. The stimulus detection unit models the
signal strength of an event as a distance threshold. For example, the HeardGunFire event can be
detected within a distance of 250 meters. This threshold distance can be attenuated by a number of
factors. If a stimulus event is detected, it is encoded into a BRAIN_GOAL and added to the known
goals queue. This implementation of stimulus detection considers only three sensory modalities:
visual, auditory and tactile.
Visual stimulus detection begins by considering all humans and objects within the field of view of the
observer (~180 degrees). A scaled distance threshold is then computed based on the size of the
object, object illumination, off-axis angle and tangential velocity. If the object is within the scaled
distance threshold, a ray cast is performed to determine if the object is not occluded by the world. If
all these tests are passed, the object is encoded into a BRAIN_GOAL. For example, a generic human
can be encoded into a SeenEnemyGoal, or a generic object can be encoded into a SeenExplosiveGoal.
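
As a concrete illustration, the visual test just described might look roughly like this in C. Every helper function and constant here (InFieldOfView, BASE_VISUAL_RANGE_MM, the multiplicative attenuation factors, etc.) is an assumption for the sketch, not the shipped SpecOps II code.

/* Sketch of the visual stimulus test: field-of-view check, a distance
 * threshold scaled by several attenuation factors, then an occlusion ray
 * cast as the final, most expensive test. Helper names are assumptions. */
int VisuallyDetected( const Observer *obs, const Object *obj )
{
    if( !InFieldOfView( obs, obj, 180.0f ) )       /* ~180 degree field of view */
        return 0;

    /* Base threshold scaled by object size, illumination, off-axis angle
     * and tangential velocity (modeled here as multiplicative factors). */
    float threshold = BASE_VISUAL_RANGE_MM
                    * SizeFactor( obj )
                    * IlluminationFactor( obj )
                    * OffAxisFactor( obs, obj )
                    * TangentialVelocityFactor( obs, obj );

    if( DistanceMM( obs->position, obj->position ) > threshold )
        return 0;

    /* Finally, is the object occluded by the world? */
    return !RayBlockedByWorld( obs->eyePosition, obj->position );
}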
As sounds occur in the game, they are added to the sound event queue. These sound events contain
information about the source object type, position and detection radius. Audio stimulus detection
begins by scanning the sound event queue for objects within the distance threshold. This distance
threshold can be further reduced by an extinction factor if the ray from the listener to the sound
source is blocked by the world. If a sound event is within the scaled distance threshold, it is encoded
into a BRAIN_GOAL and sent to the known goals queue.
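
The audio test can be sketched the same way; again, the helper names and the extinction constant are assumptions for illustration.

/* Sketch of the audio stimulus test: scan uses the sound event's own
 * detection radius, attenuated if the world blocks the direct ray. */
int HeardSoundEvent( const Observer *obs, const SoundEvent *ev )
{
    float threshold = ev->detectionRadius;

    /* Sound still leaks around corners, so a blocked ray only shrinks
     * the detection radius rather than eliminating it. */
    if( RayBlockedByWorld( obs->position, ev->position ) )
        threshold *= EXTINCTION_FACTOR;    /* e.g. 0.5f, an assumed constant */

    return DistanceMM( obs->position, ev->position ) <= threshold;
}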
When the known goals queue is updated with a BRAIN_GOAL, a test is made to determine whether it was previously known. If it was previously known, the matching known goal is updated with a new time of detection and location; otherwise, the oldest known goal is replaced by it. The PREVIOUSLY_KNOWN flag of this known goal is set appropriately for directives that respond to the rising edge of a detection event.
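
A minimal sketch of that queue update, assuming a fixed-size array for the known goals queue and matching entries by the goal object pointer:

/* Sketch of the known-goals queue update described above. */
void UpdateKnownGoals( BRAIN_GOAL *queue, int count, const BRAIN_GOAL *incoming )
{
    BRAIN_GOAL *slot = NULL;

    /* Was this object previously known? */
    for( int i = 0; i < count; i++ )
        if( queue[i].object == incoming->object )
            slot = &queue[i];

    if( slot )
    {
        /* Previously known: refresh the time of detection (the goal's
         * location follows its object pointer). */
        slot->timeOfDetection = incoming->timeOfDetection;
        slot->previouslyKnown = 1;     /* suppresses rising-edge directives */
    }
    else
    {
        /* New stimulus: evict the oldest known goal and take its slot. */
        slot = &queue[0];
        for( int i = 1; i < count; i++ )
            if( queue[i].timeOfDetection < slot->timeOfDetection )
                slot = &queue[i];

        *slot = *incoming;
        slot->previouslyKnown = 0;     /* rising edge of a detection event */
    }
}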

Injuries and collision can generate tactile stimulus detection events. These are added to the acquired
goals queue directly. Tactile stimulus events are primarily used for the generation of vocal responses.
The Goal Selector

The goal selector chooses target goals based on stimulus response directives. The grammar for the
directives is constructed as a simple IF THEN statement:

IF I detect an object of type X (and priority weight Y is best) THEN call target goal function Z.
The process of goal selection starts by evaluating each active directive for a given personality. The
known goals queue or the acquired goals queue is then tested to find a match for this directive object
type. If a match is found and the priority weight is the highest in the list, then the target goal function
is called. This function can perform additional logic to determine if this BRAIN_GOAL is to be chosen as

http://www.gamasutra.com/features/19991110/guy_02.htm (2 of 3) [25/06/2002 1:26:29 PM]


Gamasutra - Features - "A Modular Framework for Artificial Intelligence Based on Stimulus Response Directives" [11.10.99]
a target. For example, if the AI is already within the target distance of a BRAIN_GOAL's position, an
alternate goal (i.e. direction) could be chosen. Once a target goal is selected, the position, direction
and posture goals can be assigned. Unconditional directives do not require a matching object type to
be called. These are used for default behavior in the absence of known goals.
The priority weight for a directive can decay at a linear rate based on the age of a known goal (current time minus time of detection). For example, if an AI last saw an enemy 20 seconds ago and the directive has a decay rate of 0.1 units per second, the accumulated priority decay is -2. This decay allows the AIs to lose interest in known goals that haven't been observed for a while.
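
Pulling the selection and decay rules together, the goal selector's main loop might be sketched as follows. The MatchGoalType helper and the personality layout are assumptions; the decay arithmetic follows the millisecond timestamps described earlier.

#include <float.h>

/* Sketch of the goal selector: evaluate each active directive, apply
 * linear priority decay by goal age, and fire the best match's response. */
void SelectTargetGoals( Brain *brain, unsigned now )
{
    DIRECTIVE  *best       = NULL;
    BRAIN_GOAL *bestGoal   = NULL;
    float       bestWeight = -FLT_MAX;

    for( int i = 0; i < brain->personality->numDirectives; i++ )
    {
        DIRECTIVE  *d    = &brain->personality->directives[i];
        BRAIN_GOAL *goal = MatchGoalType( brain, d->goalType );

        /* unconditional directives need no matching goal */
        if( !goal && d->goalType != UNCONDITIONAL_GOAL )
            continue;

        float weight = d->priorityWeight;
        if( goal )   /* age in seconds times the decay rate */
            weight -= d->decayRate * (now - goal->timeOfDetection) / 1000.0f;

        if( weight > bestWeight )
        {
            bestWeight = weight;
            best       = d;
            bestGoal   = goal;
        }
    }

    if( best )
        best->response( bestGoal );   /* may assign position/direction/posture goals */
}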
The goal selector can assign the three target goals (direction, position and posture) orthogonally or in
a coupled fashion. In addition to these target goals, the goal selector can also select an inventory item
and directly activate audio responses. When a direction goal is assigned, the action at target field can
be set. For example, the stationary attack directive sets the action at target field to IO_FIRE. When
the direction servo gets within the pointing tolerance threshold, the action is taken (i.e. the gun is
fired). When a position goal is selected, an inner and outer radius are set by the directive - the outer
radius specifies the distance threshold for the goal selector to acquire, and the inner radius is the
distance threshold that the position goal servo uses for completion. The inner and outer radius
thresholds are different by a small buffer distance (~250 millimeters), so as to prevent oscillations at
the boundary. When a position goal is acquired, the action at target can be evoked. For example, the
Demolish Target directive sets the action at goal field to IO_USE_INVENTORY. This directive also
selects the satchel explosive from the inventory. Some directives can set the posture goal; for
example, the StationaySniperAttack directive sets the posture goal to prone. Likewise the HitTheDirt
directive sets the posture goal to prone.
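
The inner/outer radius pair amounts to a simple hysteresis test, sketched below; the function names are assumptions.

/* Sketch of the inner/outer radius hysteresis. The ~250 mm buffer between
 * the two thresholds keeps the goal selector and the position servo from
 * fighting each other at the boundary. */

/* goal selector: is the goal close enough to be considered acquirable? */
int WithinOuterRadius( float distMM, const BRAIN_GOAL *g )
{
    return distMM <= g->outerRadius;       /* = innerRadius + ~250 mm buffer */
}

/* position goal servo: keep driving until the inner radius is reached */
int PositionGoalComplete( float distMM, const BRAIN_GOAL *g )
{
    return distMM <= g->innerRadius;
}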
The Navigator and Goal Servos

join | contact us | advertise | write | my profile


news | features | companies | jobs | resumes | education | product guide | projects | store

Copyright © 2002 CMP Media LLC. All rights reserved.


privacy policy | terms of service

http://www.gamasutra.com/features/19991110/guy_02.htm (3 of 3) [25/06/2002 1:26:29 PM]


The Navigator
Once a position goal has been selected, the navigator must find a path to get there. The navigator first determines if the target can be acquired directly (i.e. can I walk straight to it?). My initial implementation of this test used a ray cast from the current location to the target location. If the ray was blocked, then the target was not directly accessible. The ray cast method has two problems:

1. an intervening drop-off or obstacle might not block the ray, and

2. the ray can be blocked by smoothly curving slopes that can be walked over.

My final solution for obstacle detection uses a step-wise walk-through. Each step (~500 millimeters) along the path to the target is tested for obstacles and drop-offs. This method produces reliable obstacle detection and is a good basis for navigation through a world composed of triangles.

[Figure 2. Side view of linear ray cast vs. step-wise walk-through obstacle detection.]
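
A sketch of the step-wise walk-through test, assuming a Vec3 type and world-query helpers (BlockedByObstacle, IsDropOff) that are not in the article:

#define STEP_MM 500.0f   /* approximate sample spacing along the path */

/* Sketch: instead of a single ray cast, sample the straight path every
 * ~500 mm and test each sample for obstacles and drop-offs. */
int CanWalkDirectly( Vec3 from, Vec3 to )
{
    float total = DistanceMM( from, to );

    for( float d = STEP_MM; d < total; d += STEP_MM )
    {
        Vec3 sample = Lerp( from, to, d / total );

        if( BlockedByObstacle( sample ) )   /* wall, prop or unwalkable slope */
            return 0;
        if( IsDropOff( sample ) )           /* no walkable floor below sample */
            return 0;
    }
    return 1;   /* the position goal servo may head straight for the target */
}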

If a position goal is not blocked by the world, the position goal servo goes directly to the target.
Otherwise a path finding algorithm is used to find an alternate route to get to the target position. The
path finding algorithm that is used in SpecOps II is based on Navigation Helper Nodes that are placed
in the world by the game designers. These nodes are placed at the junctions of doors, hallways, stairs
and boundary points of obstacles. There are typically a few hundred Navigation Helper Nodes per level.
The first step in the path finding process is to update the known goals queue with all Navigation Helper Nodes that are not blocked by the world. Because the step-wise walk-through obstacle test is fairly expensive, it is distributed over a number of frame intervals. Once the known goals queue has been updated with all valid Navigation Helper Nodes, the next position goal can be selected. This selection is based on when the Navigation Helper Node was last visited and how close it is to the target position. When a Navigation Helper Node is acquired by the position goal servo, it is updated in the acquired goals queue with the time of arrival. Selecting only Navigation Helper Nodes that have not been visited, or that have the oldest time of arrival, ensures that the path finder will exhaustively scan all nodes until the target can be reached directly. When two Navigation Helper Nodes have the same age status, the one closer to the target position is selected.
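
That selection rule can be sketched as a single scan over the helper nodes; the Node layout here is an assumption, with a time of arrival of 0 meaning "never visited":

typedef struct
{
    Vec3     position;
    int      reachable;       /* not blocked by the world (updated over frames) */
    unsigned timeOfArrival;   /* milliseconds; 0 == never visited */
} Node;

/* Sketch: prefer never-visited nodes, then the oldest time of arrival,
 * breaking ties by distance to the target position. */
Node *SelectNextHelperNode( Node *nodes, int count, Vec3 target )
{
    Node *best = NULL;

    for( int i = 0; i < count; i++ )
    {
        Node *n = &nodes[i];
        if( !n->reachable )
            continue;

        if( !best
            || n->timeOfArrival <  best->timeOfArrival
            || ( n->timeOfArrival == best->timeOfArrival
                 && DistanceMM( n->position, target ) <
                    DistanceMM( best->position, target ) ) )
            best = n;
    }
    return best;   /* NULL if no helper node is currently reachable */
}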

Direction and Position Goal Servos

The direction and position goal servos take an X, Y, Z position as their goal. This position is
transformed into local coordinates by translation and rotation. The direction servo drives the local X
component to 0 by applying the appropriate yaw velocity. The local Y component is driven to 0 by

applying the appropriate pitch velocity. When the magnitude of the local X, Y coordinates goes below
the target threshold, the goal is "acquired". The position goal servo is nested within a direction servo.
When the direction servo is pointing at the goal to within the desired tolerance, the AI approaches the
target using the movement mode (i.e. IO_FORWARD, IO_FORWARD_SLOW) set by the directive. Once
the distance to the position goal falls below the inner radius, the goal is "acquired", actions at goal can
be evoked and the acquired goals queue is updated. The acquired goals queue is used as a form of
feedback loop to tell the goal selector when certain goals are completed. This allows the goal selector
to step through a sequence of actions (i.e. state machine).
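
A sketch of the direction servo under these rules, assuming a simple proportional control law (the article does not specify the actual control law) and illustrative gain and tolerance constants:

#include <math.h>

/* Sketch: transform the goal into the body's local frame, then drive the
 * local X component to zero with yaw velocity and local Y with pitch. */
void RunDirectionServo( Body *body, Vec3 goalWorld )
{
    Vec3 local = WorldToLocal( body, goalWorld );   /* translate + rotate */

    body->yawVelocity   = -YAW_GAIN   * local.x;    /* degrees per second */
    body->pitchVelocity = -PITCH_GAIN * local.y;

    /* goal "acquired" once the pointing error falls below the tolerance */
    if( fabsf( local.x ) < AIM_TOLERANCE_MM && fabsf( local.y ) < AIM_TOLERANCE_MM )
        body->directionGoalAcquired = 1;            /* action at target may fire */
}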
Brain/Body Interface

Most actions are communicated to the body through a 128 bit virtual keyboard called the action flags.
These flags correspond directly to keys the player can press to control his avatar. Each action has an
enumerated type for each bit mask (i.e. IO_FIRE, IO_FORWARD, IO_POSTURE_UP,
IO_USE_INVENTORY etc.) These action flags are then encoded into animation states. Because the
body is articulated, rotation is controlled by separate scalar fields for body yaw velocity, chest yaw
angle, bicep pitch angle and head yaw/pitch angle. These allow for partially orthogonal direction goals
(i.e. the head and gun can track an enemy while the body is pointing at a different position goal).
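
One plausible way to pack such a 128-bit virtual keyboard in C, purely as an illustration (the actual bit packing is not described in the article):

#include <stdint.h>

typedef struct { uint32_t words[4]; } ActionFlags;   /* 4 x 32 = 128 bits */

/* each action (IO_FIRE, IO_FORWARD, ...) is an enumerated bit index 0..127 */
static void ActionSet( ActionFlags *f, int action )
{
    f->words[action >> 5] |= 1u << (action & 31);
}

static int ActionIsSet( const ActionFlags *f, int action )
{
    return (f->words[action >> 5] >> (action & 31)) & 1u;
}

/* e.g. the stationary attack directive, once aimed within tolerance:
 *     ActionSet( &body->actions, IO_FIRE );                            */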

Commands

Because of their modular nature, directives can be given to an AI by a commander at runtime. Each
brain has a special slot for a commander directive and a commander goal. This allows the commander
to tell one of his buddies to attack an enemy that is only visible to himself. Commands can be given to
a whole squad or to an individual. Note that it is very easy to create directives for commander AIs to issue commands to their teammates. The following is a list of commander directives used in SpecOps II:

TypeDirective CommanderDirectiveFormation  = { TEAMMATE_GOAL,          GoBackToFormation,   BaseWeight,   NoDecay };
TypeDirective CommanderDirectiveHitTheDirt = { POSTURE_GOAL,           HitTheDirt,          BaseWeight+1, NoDecay };
TypeDirective CommanderDirectiveAttack     = { SEEN_ENEMY_GOAL,        ApproachAttackEnemy, BaseWeight,   NoDecay };
TypeDirective CommanderDirectiveDefend     = { FIXED_POSITION_GOAL,    DefendPosition,      BaseWeight,   NoDecay };
TypeDirective CommanderDirectiveDemolish   = { DEMOLISH_POSITION_GOAL, DemolishPosition,    BaseWeight,   NoDecay };

Future Improvements

Because this brain model is almost entirely data driven, it would be fairly easy to have it learn from
experience. For example, the priority weights for each directive could be modified as a response to
victories or defeats. Alternatively, an instructor could punish (reduce directive priority weight) or
reward (increase directive priority weight) responses to in-game events. The real problem with
teaching an AI during game play is the extremely short life span (10-60 seconds). However, each
personality could have a persistent communal brain, which could learn over the course of many lives.
In my opinion, the real value of dynamic learning in game AI is not to make a stronger opponent, but
to make a continuously changing opponent. It is easy to make an unbeatable AI opponent; the real
goal is to create AIs that have distinctive personalities, and these personalities should evolve over
time.


generation5.org - Does the Top-Down Approach or the Bottom-Up Approach Best Model the Human Brain?


Does the Top-Down Approach or the Bottom-Up Approach Best Model the Human Brain?
Abstract
Recently in the field of Artificial Intelligence, scientists are wondering which approach best models the human brain — bottom-up or
top-down. Both approaches have their advantages and their disadvantages. The top-down approach has the advantage of having all
the necessary knowledge already present for the program to use (given that the knowledge was pre-programmed) and thus it can
perform relatively high-level tasks such as language processing. The bottom-up approach has the advantage of being able to model
lower-level human functions, such as image recognition, motor control and learning capabilities. Each method fails where the other
excels — and it is this trait of the two approaches that is the root of the debate.
In order to assess this, this essay deals with two areas of Artificial Intelligence — Natural Language Processing and robotics. Natural
Language Processing uses the top-down methodology, whereas robotics uses the bottom-up approach. Jerry A. Fodor, Mentalese, and
conceptual representation support the ideas of a top-down approach. The MIT Cog Robot Team fervently supports the bottom-up
approach when modelling the human brain. This essay looks at the theory behind conceptual representation and its parallel in
philosophy, Mentalese. The theory involved in the bottom-up approaches is skipped due to the background knowledge required in
both computer science and neurobiology.
After looking at the two approaches, I concluded that currently Artificial Intelligence is too segmented to create any universal
approach to modelling the brain — the top-down approach has all the necessary traits needed for NLP, and the bottom-up approach
has all the necessary traits for fundamental robotics. A universal model will probably arise from the bottom-up approach since
Artificial Intelligence is seeking its information from neurobiology instead of philosophy, thus more concrete theories about the
functioning of the brain are formed. For the moment, though, either the approaches should be used in their respective fields, or a
compromise (such as an object-orientated approach) should be formulated.

Introduction
Throughout the history of artificial intelligence one question has always been asked when given a problem. Should the solution to a
problem be solved via the top-down method or through the bottom-up method? There are many different areas of artificial
intelligence where this question arises — but none more so than in the areas of Natural Language Processing (NLP) and robotics.
As we grow up we learn the language of our parents, making mistakes at first, but slowly growing used to using language to
communicate with others. Humans definitely learn through a bottom-up approach — we all start with nothing when we are born. It is
through our own intellect and learning that we master language. In the field of computing, though, such methods cannot always be
utilised.
The two approaches to the problems are called top-down and bottom-up, according to how they tackle the problems. Top-down takes
pre-programmed knowledge (like a large knowledge base) and uses symbolic creation, manipulation, linking and analysis to perform
the calculations. The top-down approach is the one most commonly used in the field of classical (and neo-classical) artificial

intelligence that utilises serial programming. The bottom-up approach tackles the problem at hand by starting with a relatively simple
abstract program that is built to learn by itself — building its own knowledge base and commonsense assertions. This is normally
done with parallel processing, or data structures simulating parallel processing, such as neural networks.
When looking at the similarities of these approaches to the human brain, one can see the similarities in both. The bottom-up approach
has the obvious connection that it uses learning mechanisms to gain its knowledge. The top-down approach has most of its
connections in the field of natural language, and the philosophical computational models of the brain. Much philosophical theory
supports the idea of an inherently computational brain, one that uses symbol manipulation in order to do its ‘calculations’.
The two approaches differ greatly in their usefulness to the two fields concerned. In NLP, the bottom-up approach would take such a
long time before the knowledge system was rich enough to paraphrase text, or infer from newspaper articles, that a large amount of
pre-programmed (but reusable) starting information would be a much more practical approach. With robotics, though, the amount of
space required for a large pre-programmed knowledge base is huge, and with the space restrictions on a computer chip, large code
segments are not an option. More importantly, the top-down, linear approaches are very easily subjected to exceptions that the
knowledge base cannot handle, especially given a real world environment in which to operate. Since bottom-up approaches learn
to deal with such exceptions and difficulties, the systems adapt, and are incredibly flexible in their execution.
As stated, the bottom-up approach utilises parallel processing, and advanced data structures such as neural networking, evolutionary
computing, and other such methods. The theory behind these ideas is too broad and is, aside from a brief introduction to the subject,
beyond the boundaries of this essay. Instead, this essay deals with one of the computational models of the brain, Mentalese, and its
parallel in computer science — conceptual representation.
Despite the applications of the bottom-up and the top-down approach to different fields of NLP and robotics, both fields have a
common problem — how to code commonsense into the program or robot, whether to use brute computation, or when to let the
program learn for itself.

Commonsense and General Knowledge


Commonsense, or consensus reality, is an incredible obstacle that AI faces. Over the course of our childhood, millions of tiny ‘facts’
are gathered by us that are taken for granted later in life. Think about the importance of this for any program that is to exhibit general
intelligence. It has been generally accepted by the AI community that any future parsing program will need command of general
knowledge. Such is the opinion of the ParseTalk team:
"…Natural language understanding tasks, such as large-scale text or speech understanding…[would] not only require
considerable portions of grammatical knowledge but also a vast amount of so-called non-linguistic, e.g., domain and discourse
knowledge…"
Dreyfus (a well-known sceptic of AI) says that a program can only be classified as intelligent after it has exhibited a general
command of commonsense. For instance, a classic program designed to answer questions concerning a restaurant might be able to
answer, "What food was ordered?" or, "How much did the person pay for his food?" Yet it could not answer "What part of the body
was used to eat the food?" or "How many pairs of chopsticks were required to eat the food?" So much of our life requires us to relate
to our commonsense that we never notice it. So, how could commonsense be coded into a computer?
The top-down method relies on vast amounts of pre-programmed information, concepts, and other such symbols for the program to
use. The bottom-up method uses techniques such as neural networking, evolutionary computing, and parallel processing to allow the
program to adapt and learn from its environment. Classical AI chooses the top-down method for use in programs such as inference
engines, expert systems and other such programs where learning knowledge over many years is not an option. The field of robotics
often takes the bottom-up methodology, letting the systems get information from the ‘senses’ and allowing them to adapt to the
environment.
When looking at natural languages though, a top-down approach is often needed. This is due to several reasons, the first being that
most modern day computers do not have the capabilities to learn through sensory experience. Secondly, a program that is sold with
the intent to read and paraphrase large amounts of text will not be acceptable if it requires two years of continual running so it can
learn the knowledge required before usage. Therefore, an incredible amount of pre-programmed information is required. The most
comprehensive top-down knowledge base so far is the CYC Project.

The CYC Project: The Top-Down Approach


The CYC Project is a knowledge base (KB) being created by the Cycorp Corporation in Austin, Texas. CYC aims to turn several
million general knowledge assumptions into a computable form. The project has been going on for about 14 years and over a million
discrete concepts and commonsense assertions have been turned into CYC entities.


A CYC entity is not necessarily limited to one word; often it represents a group of words, or a concept. Look at this example taken
from the CYC KB:
;;; #$Food-ReadyToEat
(#$isa #$Food-ReadyToEat #$ProductType)
(#$isa #$Food-ReadyToEat #$ExistingStuffType)
(#$genls #$Food-ReadyToEat #$FoodAndDrink)
(#$genls #$Food-ReadyToEat #$OrganicStuff)
(#$genls #$Food-ReadyToEat #$Food)
You can see how ‘food that is ready to eat’ is represented in CYC as a group of IS-A relationships and GENLS relationships. The
IS-A relationships are an ‘element of’-relationship whereas the GENLS relationships are a ‘subset of’-relationship. This hierarchy
creates a large linked concept for a very simple term. For example, the Food-ReadyToEat concept IS-A ExistingStuffType, and
in the CYC KB, ExistingStuffType is represented as:
;;; #$ExistingStuffType
(#$isa #$ExistingStuffType #$Collection)
(#$genls #$ExistingStuffType #$TemporalStuffType)
(#$genls #$ExistingStuffType #$StuffType)
With the following comment about it: "…A collection of collections. Each element of #$ExistingStuffType is a collection of
things (including portions of things) which are temporally and spatially stufflike; they may also be stufflike in other ways, e.g., in
some physical property. Division in time or space does not destroy the stufflike quality of the object…" It is apparent how generic
many of the concepts get as they rise in the CYC hierarchy.
Evidently, such a huge KB would generate a large concept for a small entity, but such a large concept is necessary. For example, the
CYC team created a sample program in 1994 that fetched images given search criteria. Given a request to search for images of seated
people, the program retrieved an image with the following caption: "There are some cars. They are on a street. There are some trees
on the side of the street. They are shedding their leaves. Some of them are yellow taxicabs. The New York City skyline is in the
background. It is sunny." The program had deduced that cars have seats, in which people normally sit, when the car is in motion.

COG: The Bottom-Up Approach


After many years of GOFAI (Good Old Fashioned Artificial Intelligence) research, scientists started doubting the classical AI
approaches. The MIT Cog Robot team eloquently put their doubts:
"…We believe that classical and neo-classical AI make a fundamental error: both approaches make the mistake of assuming
that because a description of reasoning/behavior/learning is possible at some level, then that description must be made explicit
and internal to any system that carries out the reasoning/behavior/learning. This introspective confusion between surface
observations and deep structure has led AI away from its original goals of building complex, versatile, intelligent systems and
toward the construction of systems capable of performing only within limited problem domains and in extremely constrained
environmental conditions.…We believe that [our] abilities are a direct result of four intertwining key human attributes:
developmental organization, social interaction, embodiment and physical coupling, and multimodal integration of the system…"
The whole robot is designed without any complex systems modelling, or pre-programmed human intelligence. An impressive
example of this is Cog’s ability to successfully play with a slinky. The robot’s arms are controlled by a set of self-adaptive
oscillators. Each joint in the arm is actuated by an oscillator that uses local information to adjust the frequency and phase of the arm.
None of the oscillators are connected, nor do any of them share information. When the arms are unconnected, they are uncoordinated,
yet if a slinky is used to connect them, the oscillators adapt to the motion, and coordinated behaviour is achieved.
You can see how simple systems can model quite complex behaviour — the problem is that it takes a long time for a
system to adjust to its environment, just like a real human. This can prove impractical. So, in the field of NLP, a top-down
approach is required most of the time. An exception would perhaps be a computer program that can learn a new language
dynamically.
With both approaches to common sense and general knowledge, there is one thing in common — the vast amount of knowledge
required. A method of learning, storing, and retrieving all this information is also needed. A lot of this is automatically taken care of
through the bottom-up approach. With the top-down approach, such luxuries are not present; everything has to be hard-coded. For
example, all the text written by the CYC project is useless unless a way can be created to conceptualise and link all the entities
together. A computer cannot understand the entity #$ExistingStuffType as it stands. A program that parses the KB, and turns it
into a data structure that the computer can manipulate is necessary. There is no set way of doing this — but one promising field of
Artificial Intelligence exists for this purpose, that of Conceptual Representation.

Conceptual Representation: Theory


Conceptual Representation (CR) relies on symbolic data types called conceptual structures. A conceptual structure (now referred to

as a C-structure) must represent the meaning of a natural language idiom in an unequivocal way. Roger Schank, one of the pioneers
of CR, says the following:
"…The representation of this conceptual content…must be in terms that are interlingual and as neutral as possible. That is, we
will not be concerned with whether it is possible to say something in a given language, but only with finding, once something is
said, a representation that will account for the meaning of that utterance in an unambiguous way and one that can be
transformed back into that utterance or back into any other utterances that have the same meaning…"
C-structures are created, stored, manipulated and interpreted within a CR program. In a typical CR program there are three parts —
the parser and conceptualizer, another module that acts as an inference engine (or something similar), and finally a module to output
the necessary information. The parser takes the natural language it is designed to parse and creates the C-structure primitives
necessary. Then, the main program uses the concepts to either paraphrase the input text, draw inferences from the text provided or
other similar functions. Finally, the output module converts those structures back into a natural language; this does not
necessarily have to be the same language in which the text was input.

Parsing
A look at parsing and its two approaches is necessary at this point. Parsers generally take information and convert it into a data
structure that the computer can manipulate. With reference to Artificial Intelligence, a parser is generally a program (or module of the
program) that takes a natural language sentence and converts it into a group of symbols. There are generally two methods of parsing,
bottom-up and top-down. The bottom-up method takes each word separately, matches the word to its syntactic category, does this for
the following word, and attempts to find grammar rules that can join these words together. This procedure continues until the whole
sentence is parsed, and the computer has represented the sentence in a well-formed structure. The top-down method, on the other
hand, starts with the various grammar rules and then tries to find instances of the rules within the sentence.
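To make the two strategies concrete, here is a minimal Python sketch of each, using a toy grammar and lexicon invented for illustration (a real parser would be far more sophisticated about ambiguity and word order):

LEXICON = {"john": "NAME", "the": "DET", "dog": "N", "hit": "V"}

RULES = [("NP", ["DET", "N"]),     # each rule: the left-hand symbol
         ("NP", ["NAME"]),         # derives the right-hand sequence
         ("VP", ["V", "NP"]),
         ("S",  ["NP", "VP"])]

def bottom_up(words):
    # Match each word to its syntactic category, then repeatedly join
    # adjacent categories wherever a grammar rule allows it.
    cats = [LEXICON[w.lower()] for w in words]
    changed = True
    while changed:
        changed = False
        for lhs, rhs in RULES:
            for i in range(len(cats) - len(rhs) + 1):
                if cats[i:i + len(rhs)] == rhs:
                    cats[i:i + len(rhs)] = [lhs]
                    changed = True
                    break
            if changed:
                break
    return cats == ["S"]           # parsed: everything reduced to a sentence

def top_down(words, derivation=("S",)):
    # Start from the sentence rule and expand non-terminals, looking for
    # an expansion that matches the input's categories.
    cats = [LEXICON[w.lower()] for w in words]
    derivation = list(derivation)
    if derivation == cats:
        return True
    if len(derivation) > len(cats):
        return False               # expansion already too long: dead end
    for i, symbol in enumerate(derivation):
        expansions = [rhs for lhs, rhs in RULES if lhs == symbol]
        if expansions:             # expand the first expandable symbol
            return any(top_down(words, derivation[:i] + rhs + derivation[i + 1:])
                       for rhs in expansions)
    return False

print(bottom_up("John hit the dog".split()))   # True
print(top_down("John hit the dog".split()))    # True

The wasted work discussed in the next paragraph shows up directly here: the bottom-up version tests rules against runs of categories that never combine, while the top-down version explores expansions the sentence cannot match.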
Here the bottom-up and top-down relationship is slightly different. Nevertheless, a parallel can be drawn if the grammar of a sentence
can be seen as the base of language (like commonsense is the base of cognitive intelligence). Both approaches have problems largely
due to the large amount of computational time both require. With the bottom-up approach, a lot of time is wasted looking for
combinations of syntactic categories that do not exist in the sentence. The same problem appears in the top-down approach, although
there it is the search for instances of grammar rules that are not present that wastes the time.
So which method more closely represents the human mind? Neither is perfect, because both methods simply involve syntactic analysis.
Take these two examples:
Carrie’s box of strawberries was edible.
Carrie’s love of Kevin was intense.
If a program syntactically analyzed these two statements, it would come to the correct conclusion that the strawberries were edible,
but the incorrect conclusion that Kevin was intense. Despite the syntactical structure of the two sentences being exactly the same, the
meaning is different. Nevertheless, even if a syntactical approach is used, it can be used to point the computer to the correct meaning.
As you will see with conceptual representation, if prior knowledge is known about the word ‘love’ then the computer can create the
correct data structure to represent the sentence. This still does not answer the question of what type of parser the brain is closer to. In
Schank’s words, ‘Does a human who is trying to understand look at what he has or does he look to fulfill his expectations?’ The
answer seems to be both; a person not only handles the syntax of the sentence, but also does a certain degree of prediction. Take the
following incomplete sentence:
John: I’ve played my guitar for over three hours and my fingers feel like ——
Looking at the syntax of the sentence it is easy to see that the next word will be a verb (‘dying’) or a noun (‘jelly’). It is easy,
therefore, to predict the conceptual structure of a sentence. Problems arise when meaning has to be predicted too. We have the
problem of context, for instance. The fingers could be very worn out; they could be very callused from playing, or they could feel hot
from playing for so long.
Prediction is easier when there is more information to go on, for example, if John had said "and my poor fingers," from the context of
the sentence, we could have gathered that the fingers do not feel so good. This kind of prediction is called conversational prediction.
Another type of prediction is based upon the listener’s knowledge. If the listener knows John to be an avid guitar player, then he
might expect a positive comment, but if he knows John’s parents force him to play the guitar, the listener could expect a negative
remark.
All these factors come into play when a human listens to someone talking. With all this taken into account, Schank sums up the
answer the following way:
"…We can therefore say that it would seem to be reasonable to claim that a human is a top-down parser with respect to some
well-defined world model. The hearer, however, is a bottom-up parser in that when he hears a given word he tries to understand
what it is rather than decide whether it satisfies his ordered list of expectations…"


Types of Concepts
A concept can be any one of three different types — a nominal, an action, or a modifier. Nominals are concepts that do not need to be
explained further. Schank refers to nominals as picture-producers, or PPs, because he says that nominals produce a picture relating to
the concept in the mind of the hearer. An action is what a nominal can do, or more specifically, what an animate nominal can perform
on some object. Therefore, a verb like ‘hit’ is classified as an action, but ‘like’ is not, since no action is performed. Finally, a modifier
is a descriptor of a nominal or an action. In English, modifiers could be given the names adverbs and adjectives, yet since CR is
supposedly independent of any language, the non-grammatical terms PA (picture aiders – for modifiers of nominals) and AA (action
aiders – for modifiers of actions) are used by Schank.
These three categories can all relate to each other; such relationships are called dependencies. Dependencies are well described by
Schank:
"…A dependency relation between two conceptual items indicates that the dependent item predicts the existence of the
governing item. A governor need not have a dependent, but a dependent must have a governor. The rule of thumb in
establishing dependency relations between two concepts is whether one item can be understood without the other. A governor
can be understood by itself. In order for a conceptualisation to exist, however, even a governor must be dependent on some
other concept in that conceptualisation…"
Therefore, nominals and actions are always governors, and the two types of modifiers are dependents. This does not mean, though,
that a nominal or an action cannot also be a dependent. For instance, some actions are derived from other actions; take, for example,
the imaginary structure CR-STEAL (a conceptual type for stealing). Since stealing is really a swap of possession (with one party not
wanting that change of possession), it can be derived from the simpler concept of possession change.

C-Diagrams
C-Diagrams are the graphical equivalent of the structures that would be created inside a computer, showing the different relationships
between the concepts. C-Diagrams can get extremely complicated, with many different associations between the primitives; this
essay will cover the basics. Below is an example of a C-Diagram:
[C-Diagram: "John hit his little dog" - image not reproduced]
The above represents the sentence "John hit his little dog." ‘John’ is a nominal since it does not need anything further to describe it.
‘Hit’ is an action, since it is something that an animate nominal does. The dependency is said to be a ‘two-way dependency’ since
both ‘John’ and ‘hit’ are required for the conceptualisation — such a dependency is denoted by a ⇔ in the diagram. ‘Dog’ is also a
governor in this conceptualisation, yet it does not make sense within the conceptualisation without a dependency on the action ‘hit.’
Such a dependency is called an ‘objective dependency’ — this is denoted by an arrow (the ‘o’ above it denotes the objectivity). Now
all the governors are in place, and we have created "John hit a dog" as a concept. We have to further this by adding the dependencies
— ‘little’ and ‘his’. ‘Little’ is an attributive dependency, since it is a PA for ‘dog’. Attributive dependencies are denoted by a ↑ in
the diagram. Finally, the ‘his’ has to be added — since ‘his’ is just a pronoun, another way of expressing John, you would think it
would be dependent on ‘dog.’ It is not this simple, though, since ‘his’ also implies possession of ‘dog’ by ‘John.’ This is called a
prepositional dependency, and is denoted by a ⇑, followed by a label indicating the type of prepositional dependency. POSS-BY, for
instance, denotes possession.
With all this in mind, let’s look at a more complicated C-Diagram. Take the sentence, "I gave the man a book." Firstly, the ‘I’ and
‘give’ relationship is a two-way dependency, so a ⇔ is used. The ‘p’ above the arrow denotes that the event took place in the
past. The ‘book’ is objectively dependent on ‘give’, so the arrow is used to denote this. Now, though, we have a change in
possession; this is represented in the diagram by two arrows, with one arrow pointing toward the governor (the originator) and one
pointing away (the actor). The final diagram would look like:
[C-Diagram: "I gave the man a book" - image not reproduced]
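As a rough sketch of how a program might hold such conceptualisations, here is a minimal Python illustration; the class and field names are invented for this essay's purposes, not Schank's own notation:

from dataclasses import dataclass, field

@dataclass
class Concept:
    name: str
    kind: str                # "PP" (nominal), "ACT" (action), "PA"/"AA" (modifier)
    modifiers: list = field(default_factory=list)   # attributive dependents (the up-arrows)

@dataclass
class Conceptualisation:
    actor: Concept           # actor and action form the two-way dependency
    action: Concept
    obj: Concept = None      # the objective ("o") dependency
    tense: str = "present"   # the "p" marker would set this to "past"

# "John hit his little dog" as a conceptualisation:
john = Concept("John", "PP")
dog = Concept("dog", "PP", modifiers=[Concept("little", "PA"),
                                      Concept("POSS-BY John", "PA")])
structure = Conceptualisation(actor=john, action=Concept("hit", "ACT"),
                              obj=dog, tense="past")

A real CR program would of course need far richer primitives (Schank's conceptual dependency theory defines a small set of primitive acts), but even this toy structure is independent of the input language in the way the essay describes.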
You can see, through conceptual representation, how a computer could create, store and manipulate symbols or concepts to represent
sentences. How can we be sure that such methods are accurate models of the brain? We cannot be certain, but we can look at
philosophy for theories that can support such computational, serial models of the brain.


Computational Models of the Brain


In order to explain the functioning of the brain, the field of philosophy has come up with many, many models of the brain. The ones
that are of most interest to Artificial Intelligence are the computational models. There are quite a few computational models of the
brain ranging from GOFAI approaches (like conceptual representation) to connectionism and parallelism, to the neuropsychological
and neurocomputational theories and evolutionary computing theories (genetic algorithms).

Connectionism and Parallelism


Branches of AI started to look at other areas of neurobiology and computer science. The serial methods of computing were put to one
side in favour of parallel processing. Connectionism attempts to model the brain by creating large networks of entities simulating the
neurones of the brain — these are most commonly called ‘neural networks.’ Connectionism attempts to model lower-level functions
of the brain, such as motion-detection. Parallelism, otherwise known as PDP (parallel distributed processing), is the branch of
computational cognitive psychology that attempts to model the higher-level aspects of the brain, such as face recognition or
language learning. This is yet another instance of how the bottom-up approach has been limited largely to fields like robotics.

Introduction to Mentalese
Despite this movement away from GOFAI by some researchers, the majority of scientists carried on with the classical approach. One
such pioneer of the ‘computational model’ field was Jerry A. Fodor. He says the following:
"…I assume that psychological laws are typically implemented by computational processes. There must be an implementing
mechanism for any law of a non-basic science, and the putative intentional generalisations of psychology are not
exceptions…Computational processes are ones defined over syntactically structured objects; viewed in extension, computations
are mappings from symbol to symbol; viewed in intension, they are mappings of symbols under syntactic description to
symbols under syntactic description…"
This quote is very reminiscent of conceptual representation and its methodology. Fodor argues that since all sciences have laws
governing their phenomena, psychology and the workings of the brain are no exception.

Language of Thought and Mentalese


Fodor was an important figure in the idea of a ‘language of thought.’ Fodor theorises that the brain ‘parses’ information and then
transforms it into a mental language of its own, which it can subsequently manipulate and change more efficiently. When a person
needs to ‘speak’, the brain converts the language it uses into a natural language. Fodor thought that this mental language went
beyond natural language, extending to the senses too:
"…[The] emphasis upon the syntactical character of thought suggests a view of cognitive processes in general — including, for
example, perception, memory and learning — as occurring in a language-like medium, a sort of ‘language of thought’…"
Fodor called this ‘language of thought’ Mentalese. The theory of Mentalese says that thought is based upon language, that we think
in a language not like English, French or Japanese but in the brain’s own universal language. Before studying the idea of Mentalese,
the concepts ‘syntax’ and ‘semantics’ should be fully explained.

Syntax and Semantics


Syntactic features of a language are the words and sentences that relate to the form rather than to the meaning of the sentence. Syntax
tells us how a sentence in a particular language is formed, how to form grammatically correct sentences. For example, in English the
sentence, ‘Cliff went to the supermarket’ is syntactically correct whereas, ‘To supermarket Cliff the went’ makes no sense
whatsoever. Semantics are the features of words that relate to the overall meaning of the sentence/paragraph. The semantics of a word
also define the relations of the word to the real world, and also to its contextual place in the sentence.
How are syntax and semantics related to symbols and representations? The syntax of symbols is a huge hierarchy: simple base
symbols represent simple representations, and from these simple symbols, more complicated ones are derived. The
semantics of symbols is easy to explain — symbols are inherently semantic. Symbols represent various objects and relate them to
each other; therefore, when you hear or read a sentence, it is broken up into symbols that are all related to one another.
With these terms defined, we can see how Mentalese a) can be restated as a computational theory and b) supports the idea of
conceptual representation. Mentalese is computational because ‘it invokes representations which are manipulated or processed
according to formal rules.’ The syntax of Mentalese is just like the hierarchy of CR structures — with different, more complex
structures derived from base structures. The theory even has similarities to the architecture of an entire CR program. The brain
receives sentences and turns them into Mentalese, just like a CR program would parse a stream of text, and conceptualize it into
structures within the computer that do not resemble (and are independent of) the language that the text was in. When the brain needs
to retrieve this information, it converts the Mentalese back into the natural language needed, just like a CR program takes the
concepts and changes them back into the necessary language.
Earlier on, the idea of an implementing mechanism was introduced. How can such a mechanism be viewed within the Mentalese
paradigm? The basic idea of an implementing mechanism (IM) is that lower-level mechanisms implement higher-level ones. In more
generic terms, an IM goes further than the "F causes G" stage to the "How does F cause G?" stage. Fodor says that an IM specifies ‘a
mechanism that the instantiation of F is sufficient to set in motion, the operation of which reliably produces some state of affairs that
is sufficient for the instantiation of G.’

Which Model does the Brain More Closely Follow?


After looking at all the examples of the different approaches, we are presented with the final question: which one best represents
the human brain? In the field of NLP, it seems that the GOFAI, top-down approach is by far the best, whereas the fields of
robotics and image recognition benefit from the connectionist, bottom-up approach. Despite this seemingly concrete conclusion,
we are still plagued with a few problems.

Consistency
How many different types of cells are our brains composed of? Essentially, just one — the neurone. How can the brain
exhibit both serial and parallel behaviours with only one type of cell? The obvious answer is that it does not. This is a fault in
the overall integrity of both approaches.
For example, we have parallel, bottom-up neural networks that can successfully recognise pictures, but cannot figure out the algorithm
for an XOR bit-modifier. We have serial, top-down CR programs that can read newspaper articles, make inferences, and even learn
from these inferences — yet such programs often make bogus discoveries, due to a lack of background information and general
knowledge. We have robots that can play the guitar, walk up stairs, and aid in bomb disposal, yet nothing comes close to a program
with overall intellect equal to that of a human.
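Strictly, the XOR limitation is the classic result about single-layer perceptrons (multi-layer networks can learn XOR); the following minimal sketch, with an invented learning rate and loop count, shows the failure. The weights never settle, because no single line separates XOR's outputs:

# XOR training data: the two inputs and the target output
DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

w0, w1, bias = 0.0, 0.0, 0.0
for _ in range(1000):                     # the perceptron learning rule
    for (x0, x1), target in DATA:
        out = 1 if w0 * x0 + w1 * x1 + bias > 0 else 0
        w0 += 0.1 * (target - out) * x0   # the weights keep oscillating:
        w1 += 0.1 * (target - out) * x1   # XOR is not linearly separable, so
        bias += 0.1 * (target - out)      # no (w0, w1, bias) classifies all four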

Integration of the Fields


What will happen when robotics reaches the stage where it is on the brink of real-time communication, voice-recognition and
synthesis? The fields of NLP and robotics will have to integrate, and a common method will have to be devised. The question of
what method to follow would simply arise again. Since the current breed of robots use parallel processing, future robots will no doubt
use similar data systems to the ones today, and would not easily adapt to a serial, top-down approach to language. Even if a
successful bottom-up approach is found for robots, will it be practical?
Will humans tolerate ‘infant’ robots that require to be ‘looked after’ for the first 3 or 4 years of their life while they learn the
language of their ‘parents’? For robots that are to immediately know a certain language (for instance, English) from the minute they
are turned on, a top-down approach is bound to be necessary. No doubt, a comprise system of language processing would have to be
developed, perhaps the object-orientated system of ParseTalk could be promising, since OOP offers the advantages of serial
programming with some of the advantages of parallel processing too.

Top-Down Approach
One of the best ways to support the top-down approach and its similarities to the brain is to look at just how similar Mentalese and
conceptual representation are. Mentalese assumes that there is a language of the brain completely independent of the natural language
(or languages) of the individual. CR assumes the exact same thing, creating data structures with more abstract names that are
independent to the language, but rely on the parser for its input. This can explain the ease at which programs utilising CR easily
convert between two languages.
Both the ideas of Mentalese and CR must have been formulated from the same direction and perspective of the brain. Both assume a
certain independence that the brain has over the rest of the language/communication areas of the brain. Both assume that
computational manipulation is performed only after the language has been transformed into this language of the brain. It is about this
area that Mentalese and conceptual representation diverge — conceptual representation is limited to language (or has not yet been
applied to other areas), whereas Fodor maintains that this mental language applies to cognitive and perceptive areas of the brain too.
A fault in the Mentalese theory is that Fodor says Mentalese is universal. It seems hard to imagine that as we all grow up, learning

http://www.generation5.org/topdown.shtml (7 of 9) [25/06/2002 1:33:25 PM]


generation5.org - Does the Top-Down Approach or the Bottom-Up Approach Best Model the Human Brain?
how to see, hear, speak, and all the other activities we learn as a new-born, we all adopt the same language. A possible counter to this
is that the language is already there — but this creates a lot of new complications. Nevertheless, this fault is not detrimental to its
applications in computer processing, since nearly any base for an NLP program will require some prior programming (analogous to
the pre-made rules of Mentalese).
The COG Team also outlined three basic (but very important) faults with a GOFAI approach to AI:
"…Three conceptual errors commonly made…are presuming the presence of monolithic internal representations, monolithic
control, and the existence of general purpose processing. These and other errors primarily derive from naïve models based on
subjective observation and introspection, and biases from common computational metaphors (mathematical logic, von
Neumann architectures)…"
The team backs up their assertions with various results from psychological tests that have been performed. Such monolithic
representations and controlling units underlie the theory of Mentalese and conceptual representation.

Bottom-Up Approach
The main advantages of the bottom-up approach are its (relative) simplicity and its flexibility. Using structures such as neural
networks, programs have been created that can do things that would be impossible with conventional, serial approaches. For
example, Hopfield networks can recognise partial, noisy, shrunken, or obscured images that they have ‘seen’ before, and relatively quickly.
Another program powered by a neural network has been trained to accept present tense English verbs and convert them to past tense.
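As an illustration of the Hopfield idea, here is a minimal sketch (NumPy assumed; the patterns are invented). Patterns are stored as +1/-1 vectors with a Hebbian rule and recalled by repeated thresholding, which lets a corrupted input settle back into the nearest stored pattern:

import numpy as np

def train(patterns):
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for p in patterns:
        W += np.outer(p, p)        # Hebbian learning: strengthen co-active units
    np.fill_diagonal(W, 0)         # no self-connections
    return W / len(patterns)

def recall(W, x, steps=10):
    for _ in range(steps):
        x = np.sign(W @ x)         # settle into a 'basin of attraction'
        x[x == 0] = 1              # break ties
    return x

patterns = np.array([[1, -1, 1, -1, 1, -1],
                     [1, 1, 1, -1, -1, -1]])
W = train(patterns)
noisy = np.array([1, -1, 1, -1, 1, 1])     # first pattern with one unit flipped
print(recall(W, noisy))                    # restores the first stored pattern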
The strongest argument for the bottom-up approach is that its learning processes are the same as those any human undergoes while
growing up. If you expose the robot or program to the outside world, it will learn things and adapt itself to them. As the COG Team
asserts, the key to human intelligence is the four traits they spelled out: developmental organisation, social interaction, embodiment,
and integration.
The bottom-up approach also models human behaviour (emotions included) due to the chaotic nature of parallel processing and neural
networking:
"…Neural networks combine both chaotic behaviour, since they are a nonlinear system, and reasonable, if unexpected
behaviour since this nonlinearity is controlled by so-called basins of attraction in the memory formed in the connection weight
values…"
The main downfall of the bottom-up approach is its impracticality. Building a robot’s knowledge base from the ground up every time
one is made might be reasonable for a research robot, but for commercial robots, or even very intelligent programs,
such a procedure would be out of the question.

Conclusion
In conclusion, a definite approach to the brain has not yet been developed, but the two approaches (top-down and bottom-up)
describe different aspects of the brain. The top-down approach seems like it can explain how humans use their knowledge in
conversation. What the top-down approach does not solve, however, is how we get that initial knowledge to begin with — the
bottom-up approach does. Through ideas such as neural networking and parallel processing, we can see how the brain could possibly
take sensory information and convert it into data that it can remember and store. Nevertheless, these systems have so far only
demonstrated an ability to learn, and not sufficient ability to manipulate and change data in the way that the brain and programs
utilising a top-down methodology can.
These attributes of the approaches explain their division between the two fields of AI. Natural Language Processing took up the
top-down approach, since it had all the data manipulation necessary for the advanced analysis of languages. Yet the
large amount of storage space required by a top-down program and the lack of a good learning mechanism made the top-down
approach too cumbersome for robotics, which adopted the bottom-up approach instead. This proved to be good for face
recognition, motor control, sensory analysis and other such ‘primitive’ human attributes. Unfortunately, any degree of higher-level
complexity is very hard to achieve with a bottom-up approach.
Now we have one approach modelling the lower-level aspects of the human brain, and another modelling the higher levels — so
which models the brain best overall? Top-down approaches have been in development for as long as AI has been around, but
serious work on bottom-up methodologies has only really started in the last twenty years or so. Since bottom-up
approaches draw on what we know from neurobiology and psychology, rather than from philosophy as GOFAI scientists tend to,
there may be a lot more we have yet to discover. These discoveries, though, may be many years in the future. For the time being, a
compromise should be reached between the two levels to attempt to model the brain consistently given current technology. The
object-orientated approach might be one answer; research into neural networks trained to create and modify data structures similar to
those used in conceptual representation might be another.


Artificial Intelligence is the science of the unknown — trying to emulate something we cannot understand. GOFAI scientists have
always hoped that AI might one day explain the brain, instead of the other way around — connectionist scientists do not, and perhaps
this is the key to creating flexible code that can react given any environment, just like the human brain — real Artificial Intelligence.

Bibliography.
Altman, Ira. The Concept of Intelligence: A Philosophical Analysis. Maryland: 1997.

Brooks, R. A., Breazeal (Ferrell), C., Irie, R., Kemp, C. C., Marjanović, M., Scassellati, B. & Williamson, M. M. (1998), Alternative Essences of Intelligence, in ‘Proceedings
of the American Association for Artificial Intelligence (AAAI-98)’.

Brooks, R. A., Breazeal (Ferrell), C., Irie, R., Kemp, C. C., Marjanović, M., Scassellati, B. & Williamson, M. M. (1998), The Cog Project: Building a Humanoid Robot.

Churchland, Patricia and Sejnowski, Terrence. The Computational Brain. London: 1996.

Churchland, Paul. The Engine of Reason, the Seat of the Soul: A Philosophical Journey into the Brain. London: 1996.

Crane, Tim. The Mechanical Mind: A Philosophical Introduction to Minds, Machines and Mental Representation. London: 1995.

Fodor, Jerry A. Elm and the Expert: An Introduction to Mentalese and Its Semantics. Cambridge: 1994.

Hahn, Udo, Schacht, Susanne and Bröker, Norbert. Concurrent, Object-Orientated Natural Language Parsing: The ParseTalk Model. Arbeitsgruppe Linguistische
Informatik/Computerlinguistik. Freiburg: 1995.

Lenat, Douglas B. Artificial Intelligence. Scientific American, September 1995.

Lenat, Douglas B. The Dimensions of Context-Space. Cycorp Corporation, 1998.

Penrose, Roger. The Emperor’s New Mind: Concerning Computers, Minds and The Laws of Physics. Oxford: 1989.

Schank, Roger. The Cognitive Computer: On Language, Learning and Artificial Intelligence. Reading: 1984.

Schank, Roger and Colby, Kenneth. Computer Models of Thought and Language. San Francisco: 1973.

Watson, Mark. AI Agents in Virtual Reality Worlds. New York: 1996.

● Introduction to AI and Philosophy.
● The Natural Mind: Conciousness and Self-Awareness.
● Interview with Marvin Minsky!
● Interview with John Searle!
● NLP Essays - Many essays on theory and applications of NLP.
● NLP Programs - Full source code included.
● AISolutions

All content copyright © 1998-2002, Generation5


An Introduction to Machine Vision


Machine vision is an incredibly difficult task - a task that seems relatively trivial to humans is infinitely complex for computers to
perform. This essay should provide a simple introduction to computer vision, and the sort of obstacles that have to be overcome.

Data size
We will be looking at the picture at the right throughout the essay. We will be making a few
changes though - we will say that the picture is a 24-bit 640x480 image (not the 200x150
24-bit image it actually is) since this is the "standard" size and colour-depth of a computer
image.
Why is this important? Well, the first consideration/problem of vision systems is the sheer size
of the data they have to deal with. Doing the math, we have 640x480 pixels to begin with
(307,200). This is multiplied by three to account for the red, green and blue (RGB) data
(921,600 bytes). So, with just one image we are looking at 900K of data!
If we are looking at video of this resolution, we would be dealing with 23MB/sec (or
27MB/sec in the US) of information! The solution to this is fairly obvious - we just cannot deal with this sort of resolution at this
speed at this colour depth! Most vision systems will work with greyscale video at perhaps 200x150. This greatly reduces the data rate -
from 23MB/sec to 0.72MB/sec! Most modern-day computers can manage this sort of rate very easily.
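The arithmetic, as a quick Python script (frame rates of 25 and 30 frames/sec are assumed for the two figures):

frame = 640 * 480 * 3        # bytes in one 24-bit frame: 921,600 (roughly 900K)
print(frame * 25 / 1e6)      # 23.04 - MB/sec of raw video at 25 frames/sec
print(frame * 30 / 1e6)      # 27.65 - MB/sec at 30 frames/sec (the US figure)
print(200 * 150 * 25 / 1e6)  # 0.75  - MB/sec for greyscale 200x150 video
                             # (about 0.72 if counted in 1024-based megabytes)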
Of course, receiving the data is the smallest problem that vision systems face - it is processing it that takes the time. So how can we
simplify the data down further? I'll present two simple methods - edge detection and prototyping.

Edge Detection
Most vision systems will be determining where and what something is and, for the most part, detecting the edges of the various
shapes in the image should be sufficient to help us on our way. Let us look at two edge detections of our picture:

The left picture is generated by Adobe Photoshop's "Edge Detection" filter, and the right picture is generated by Generation5's
ED256 program. You can see that both programs picked out the same features, although Photoshop has done a better job of
accentuating more prominent features.
The process of edge detection is surprisingly simple. You merely look for large changes in intensity between the pixel you are
studying and the surrounding pixels. This is achieved by using a filter matrix. The two most common edge detection matrices are
called the Laplacian and Laplacian Approximation matrices. I'll use the Laplacian matrix here since the numbers are all integers. The
Laplacian matrix looks like this:

1 1 1
1 -8 1
1 1 1
Now, let us imagine we are looking at a pixel that is in a region bordering a black-to-white block. So the pixel and its surrounding 8
neighbours would have the following values:

255 255 255
255 255 255
  0   0   0
Where 255 is white and 0 is black. We then multiply the corresponding values with each other:

255   255 255
255 -2040 255
  0     0   0
We then add all of the values together and take the absolute value - giving us the value of 765. Now, if this value is above our
threshold (normally around 20-30, so this is way above the threshold!) then we say that the point denotes an edge. Try the above
calculation with a matrix that consists only of 255s. Experiment with the ED256 program, which allows you to play with either the
Laplacian or Laplacian Approximation matrices, or even create your own.
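Here is a minimal Python sketch of the whole procedure (the image is assumed to be a plain 2-D list of greyscale values, and the default threshold follows the 20-30 range mentioned above):

LAPLACIAN = [[1,  1, 1],
             [1, -8, 1],
             [1,  1, 1]]

def detect_edges(image, threshold=25):
    h, w = len(image), len(image[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):            # skip the border pixels
        for x in range(1, w - 1):
            total = 0
            for dy in (-1, 0, 1):        # multiply the 3x3 neighbourhood by the
                for dx in (-1, 0, 1):    # matrix entries and sum the products
                    total += image[y + dy][x + dx] * LAPLACIAN[dy + 1][dx + 1]
            if abs(total) > threshold:   # a large intensity change marks an edge
                edges[y][x] = 255
    return edges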

Prototyping
Prototyping came about through a data classification technique called competitive learning. Competitive learning is employed
throughout different fields in AI, especially in neural networks, or more specifically self-organizing networks. Competitive learning
is meant to create x prototypes from a given data set. These prototypes are meant to be approximations of groups of data within
the dataset.
Somebody thought it would be neat to apply this sort of technique to an image to see if there are data patterns within it.
Obviously the result is different for every image but, on the whole, areas of an image can be classified very well using this technique.
Here is a more specific overview of the algorithm (a code sketch follows the list):

Prototyping Algorithm
1. Take x samples of the image (x is a high number like 1000). In our case, these samples would consist of small regions of the
image (perhaps 15x15 pixels).
2. Create y prototypes (y is normally a smaller number like 9). Again, these prototypes would consist of 15x15 groups
of pixels.
3. Initialize these prototypes to random values (noisy images).
4. Cycle through the samples, and try to find the prototype that is closest to the current sample. Then alter that prototype to be a
little closer to the sample. This is normally done by a weighted average. ED256 brings the chosen prototype 10% closer to the
sample.
5. Do this many times - around 5000. You will find that the prototypes now actually represent the groups of pixels that are
predominant in the image.
6. Now, you can create a simpler image made up of only y colours by classifying each pixel according to the prototype it is
closest to.
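A minimal NumPy sketch of steps 1-5 (the sample count, patch size, iteration count and 10% pull all follow the numbers above; the function name is invented):

import numpy as np

def make_prototypes(image, n_samples=1000, n_protos=9, size=15,
                    iterations=5000, rate=0.10):
    h, w = image.shape
    # 1. take n_samples random size x size patches of the image
    ys = np.random.randint(0, h - size, n_samples)
    xs = np.random.randint(0, w - size, n_samples)
    samples = np.array([image[y:y + size, x:x + size].ravel()
                        for y, x in zip(ys, xs)], dtype=float)
    # 2-3. create the prototypes, initialised to random values (noise)
    protos = np.random.rand(n_protos, size * size) * 255
    # 4-5. repeatedly pull the closest prototype 10% toward a random sample
    for _ in range(iterations):
        s = samples[np.random.randint(n_samples)]
        k = np.argmin(((protos - s) ** 2).sum(axis=1))   # nearest prototype
        protos[k] += rate * (s - protos[k])
    return protos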
Here is our picture in greyscale and another that has been fed through the prototyping algorithm built into ED256. We use greyscale
to make prototyping a lot simpler. I've also enlarged the prototypes and their corresponding colours to help you visualize the process:


Notice how the green corresponds to pixels that have predominantly white surroundings; most pixels are red because they are similar to
the "brick" prototype. Very dark areas (look at the far right window frame) are classified as dark red.
For another example, look at this picture of an F-22 Raptor. Notice how the red corresponds to the edges on the right wing (and the
left too, for some reason!) and the dark green to the left trailing edges/intakes and right vertical tail. Dark blue is for horizontal
edges, purple for the dark aircraft body and black for the land.

Conclusion
How do these techniques really help machine vision systems? It all boils down to simplifying the data that the computer has to deal
with. The less data there is, the more time can be spent extrapolating features. The trade-off is between data size and retaining the
features within the image. For example, with the prototyping example, we would have no trouble spotting the buildings in the picture,
but the tree and the car are a lot harder to differentiate. The same applies for a computer.
In general, edge detection helps when you need to fit a model to a picture - for example, spotting people in a scene. Prototyping helps
to classify images by detecting their prominent features. Prototyping has a lot of uses, since it can "spot" traits of an image that
humans do not.

● ED256 - An edge-detection/prototyping program. Full source code included.


● Hardware Reviews - Reviews for robots like the Sony AIBO, LEGO Mindstorms and more!
● Introduction to Robotics - The basics of robots.
● Problems with Machine Vision - An intro to the problems that image recognition faces.
● AISolutions - Many robotics articles.

All content copyright © 1998-2002, Generation5


Philosophical Arguments For and Against AI


There is a fundamental question in the field of Artificial Intelligence — what is Artificial Intelligence? What constitutes intelligence?
When will a program be called intelligent? At the moment, the fields of artificial intelligence are not unified. We have hundreds of
scientists pursuing different areas of AI - parallelism, evolutionary computing, image recognition, voice recognition and synthesis.
Yet none (with perhaps the exception of long-term robotic projects like Cog) attempt to model intelligence to the level of human
capacity. In order to deal with the question of artificial intelligence, we are going to assume that we are attempting to create a
program of such magnitude. We will ignore the computational limitations, since those will disappear in time.

Dreyfus and Commonsense


One famous skeptic of AI, Dreyfus, says that a computer will never be intelligent unless it can display a good command of
commonsense. Dreyfus follows up by saying that computers will never be able to fully grasp commonsense, since much of our
commonsense is on a 'know-how' basis. For example, the notion that one solid cannot easily penetrate another is commonsense, yet
the knowledge required to ride a bicycle is not something you can gain from a book, or from someone telling you. You can only learn it
through experience. Dreyfus doesn't stop there; he suggests that all knowledge is received in such a way - when learning about an
apple, for example, we learn how to use it, how to eat it, where to get it and many other things not immediately attainable
without direct experience.
Now, current computers can only really 'represent' things (actually, that's all they do) - so how to take a skill, emotion, or something
else equally abstract and change it into a series of 0s and 1s is close to impossible according to Dreyfus. This presents quite a large
problem for the overall field of artificial intelligence, since it is very hard to contradict Dreyfus when assuming both of his premises
are true - that general intelligence requires commonsense, and that commonsense requires know-how.
Most defenders of general intelligence AI (GIAI) agree with the first premise, but disagree with the second, stating that it is not
impossible, merely very difficult to do. A good example of this is Doug Lenat's CYC Project - see the Does the Top-Down Approach
or the Bottom-Up Approach Best Model the Human Brain? essay for more details. Such an approach, though, basically attempts to
take such knowledge and convert it into a computational form through human conversion. Therefore, I myself have to strongly
agree with a lot of what Dreyfus says.

Cog: Goodbye GOFAI


Dreyfus' arguments against artificial intelligence specifically target GOFAI (good, old-fashioned AI) approaches to problems. With
the advent of parallelism and bottom-up approaches to AI problems, Dreyfus' arguments may not apply. For example, with Cog, the
entire aim of the project was to build the robot from the bottom up, allowing it to learn things, with nothing hard-coded into the robot
itself. In this way, some incredibly interesting behaviour has arisen from the robot - the kind of emergent behaviour that
can only be described as artificial intelligence. I have to give credit to the Cog team for coming up with a paragraph that I really
feel sums up the problem with modern GOFAI attempts:
"Three conceptual errors commonly made by classical AI researchers are presuming the presence of monolithic internal
models, monolithic control, and the existence of general purpose processing. These and other errors primarily derive
from naive models based on subjective observation and introspection, and biases from common computational
metaphors (mathematical logic, Von Neumann architectures, etc.). A modern understanding of cognitive science and
neuroscience refutes these assumptions."

http://www.generation5.org/ai_phil.shtml (1 of 3) [25/06/2002 1:37:36 PM]


generation5.org - Philosophical Arguments For and Against AI
- Alternative Essences of Intelligence

A deep explanation of all of these is beyond the scope of this essay (see Robotics Essays for more information), but basically what
this is saying is that people have often tried to model the brain on a computer by modelling the brain on the computer! That is, they
use the computer as the analogy when trying to figure out how the brain works - therefore, our ideas of the brain are often
distorted, oversimplified, or merely too computationally based.

Searle and the Chinese Room


Let us now assume that the bottom-up approach yields a robot (or computer program) that exhibits intelligent behaviour to
the extent that it passes the Turing Test. Is it intelligent? John Searle would argue that it wasn't - that it was merely mimicking intelligent
behaviour. He uses an interesting analogy called "The Chinese Room." Imagine he is in a room with two
windows, labelled I and O (input and output, I assume). He gets handed a piece of paper with a complicated series of strokes on
it. These complicated strokes happen to be questions written in Chinese. Now, in his room he has a huge book of all possible
questions in Chinese, and very detailed instructions about how to look up the question and answer given the strokes. When he
finds the answer he writes it on another piece of paper, and hands it out of the O-window. He argues that he could take any Chinese
question, output the correct answer and still not understand Chinese. Critics of this argument say that the computer should be
represented by the room as a whole, and not just Searle - therefore the computer would understand Chinese. Searle fired back by saying
he could memorize the entire book, still do it, and STILL not understand Chinese - the arguments go on and on. I have my own
reservations about this analogy. I agree that if a program were a huge hash table of possible answers arranged according to
question, and answering were a simple case of retrieval, then the program wouldn't be intelligent. Yet most modern programs do not
act as retrieval systems. Admittedly, they do perform set, pre-defined computations upon the data structures in memory, but this is very
different to a hash table!
I must admit that over the past couple of months I have become seriously disillusioned with GOFAI and natural language processing. I
believe that early researchers were incredibly optimistic about what they could achieve, and the positive momentum they started only
showed signs of slowing a few years ago. Despite this, I still think that Searle is going about the entire problem the wrong way.
Dismissing AI as mimicry is a dangerous assumption in and of itself.

Mimicking Intelligence?
A wonderful topic to throw about is the topic of mimicking intelligence. Can you mimic intelligence? Deep Blue beat Kasparov at
the game that has often signified man's intelligence. Did Deep Blue exhibit intelligence? It played (and won) a game that requires a
significant amount of 'thought' and 'planning.' Deep Blue analyzes the board through immense computational power, so what does
and does not constitute intelligence? If a human made a list of all plausible moves given the board diagram, then started removing
options using a set of rules until he came up with what he thought was the best move, would that be classified as intelligence?
Definitely - this type of approach to problems is taken in many fields (granted, not chess) and the question of whether intelligence is
being used is never raised. Now, take a computer doing the same thing and ask yourself the same question. Why do people find it so
hard to see the same thing?
Some people will say that it is merely mimicking intelligence. What is the formal definition of mimicking intelligence? If forced into
answering this question, my second reply would be "the ability to display all traits of intelligence" - my first reply would be that no
such thing exists. To me, 'mimicking intelligence' is an oxymoron. If all traits of intelligence are exhibited (for the meanwhile, let us
assume traits of intelligence reduce down to the ability to have a meaningful conversation with a human) then intelligence exists! I
cannot see how humans as a race can classify how intelligence should be defined when we do not know how our own brains work.
The human race is easily threatened because we have been on top for so long that we have never had to deal with something
potentially superior to ourselves. Indeed, Kasparov (and many others) saw the match between himself and Deep Blue as a way for the
human race to "help defend our dignity." The inherent narcissistic tendencies of the human race have been deflated slowly since
Copernicus told us we weren't the centre of the universe. Then again by Darwin, who said we'd evolved from protozoans. And now,
perhaps, by IBM's Deep Blue team telling us it's not just us who can play chess!

Conclusion
Artificial Intelligence is fraught with philosophical questions, since much about the brain and its functioning remains unanswered. In this
essay, I merely moved from topic to topic as I wrote. These topics are by NO means the only ones brought up by Artificial
Intelligence. Also, as AI advances (and I believe it will) toward completely humanoid robots, many more philosophical, moral,
ethical and indeed even theological questions will arise. If computers' apparent lack of intelligence isn't being battered, their
apparent lack of a consciousness (or indeed, a soul) is. Now there's some food for thought...
"...You can't do without philosophy, since everything has its hidden meaning which we must know..."
- Maxim Gorky The Zykovs 1918.


● Introduction to AI and Philosophy.


● The Natural Mind: Conciousness and Self-Awareness.
● Interview with Marvin Minsky!
● Interview with John Searle!
● AISolutions

All content copyright © 1998-2002, Generation5


Production Systems
Symbolic AI Systems vs Connectionism
Symbolic AI systems manipulate symbols instead of numbers. Humans, as a matter of fact, reason symbolically (in the most general
sense); children must learn to speak before they are able to deal with numbers, for example. More specifically, these systems operate
under a set of rules, and their actions are determined by those rules. They always operate in task-oriented environments and are
wholly unable to function in any other case. You can think of symbolic AI systems as "specialists". A program that plays 3d tic-tac-toe
will not be able to play Pente (a game where five in a row is a win, but the players are allowed to capture two pieces if they are
sandwiched by the pieces of an opposing player). Although symbolic AI systems can't draw connections between meanings or
definitions and are very limited with respect to types of functionality, they are very convenient for tackling task-centered
problems (such as solving math problems, diagnosing medical patients, etc.). The more flexible approach to AI involves neural
networks, yet NN systems are usually so underdeveloped that we can't expect them to do the "complex" things that symbolic AI
systems can, such as playing chess. While NN systems can learn more flexibly and draw links between meanings, our traditional
symbolic AI systems will get the job done fast.
An example of a programming language designed to build symbolic AI systems is LISP. LISP was developed by John McCarthy
during the 1950s to deal with symbolic differentiation and integration of algebraic expressions.

Production Systems
(This model of production systems is based on chapter 5 of Stan Franklin's book, The Artificial Mind; the example of the 8-puzzle
is also based on Franklin's example.)
Production systems are applied to problem-solving programs that must perform a wide range of searches. Production systems are
symbolic AI systems. The difference between these two terms is only one of semantics: a symbolic AI system may not be restricted
to the very definition of production systems, but they can't be much different either.
Production systems are composed of three parts: a global database, production rules and a control structure.
The global database is the system's short-term memory: the collection of facts that are to be analyzed. A part of the global
database represents the current state of the system's environment. In a game of chess, the current state could represent all the
positions of the pieces, for example.
Production rules (or simply productions) are conditional if-then branches. In a production system, whenever a condition in the
system is satisfied, the system is allowed to execute or perform the specific action specified under that rule. If the rule is
not fulfilled, it may perform another action. This can be simply paraphrased:
WHEN (condition) IS SATISFIED, PERFORM (action)


A Production System Algorithm

DATA ← the initial global database

until DATA satisfies the halting condition do
begin
    select some rule R that can be applied to DATA
    DATA ← the result of applying R to DATA
end

In a scenario where a production system is attempting to solve a puzzle, pattern matching is required to tell whether or not a
condition is satisfied. If the current state of the puzzle matches the desired state (the solution to the puzzle), then the puzzle is solved.
If this is not the case, however, the system must attempt an action that will contribute to manipulating the global database, under the
production rules, in such a way that the puzzle will eventually be solved.

[Figure: an initial 8-puzzle state and the goal state]

In order to take a closer look at control structures, let us look at a problem involving the eight puzzle. The eight puzzle contains eight
numbered squares laid in a three-by-three grid, leaving one square empty. Initially it appears in some scrambled state. The goal of
the production system is to reach some final state (the goal). This can be obtained by successively moving squares into the empty
position. The system changes with every move of a square; thus, the global database changes with time. The current state of the
system is given as the position and enumeration of the squares. This can be represented, for example, as a 9-dimensional vector with
components 1, 2, 3, ..., 8 and NULL, the NULL object being the empty space.
In this puzzle, the most general production rule can be simply summed up in one sentence:
If the empty square isn't next to the left edge of the board, move it to the left

However, in order to move the empty square to the left, the system must first make room for the square to move left. For example,
from the initial state (refer to the above figure) the 1-square would be moved down one space, then the 8-square right one space, then
the 6-square up one space in order for the empty square to be moved left (i.e., a heuristic search). All these sequences require further
production rules. The control system decides which production rules to use, and in what sequence. To reiterate: in order to move the
empty square left, the system must check whether the square is towards the top, or somewhere in the middle or bottom, before it can
decide which squares can be moved where. The control system thus picks the production rule to be used next in a production system
algorithm (refer to the production system algorithm figure above). A small code sketch of the whole arrangement follows.
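Here is a minimal Python sketch of the three parts working together on the 8-puzzle. The state is the 9-component vector described above (None standing in for the NULL empty square), the production rules are the legal slides of the empty square, and a simple breadth-first search stands in for the control structure; the goal layout is an assumption for illustration:

from collections import deque

GOAL = (1, 2, 3, 8, None, 4, 7, 6, 5)       # an assumed goal layout

def rules(state):
    # Production rules: slide the empty square up/down/left/right when legal.
    i = state.index(None)
    row, col = divmod(i, 3)
    for delta, legal in ((-3, row > 0), (3, row < 2), (-1, col > 0), (1, col < 2)):
        if legal:
            s = list(state)
            s[i], s[i + delta] = s[i + delta], s[i]   # apply the rule to the database
            yield tuple(s)

def control(start):
    # Control structure: decide which rule to fire next (here, breadth-first).
    # Returns the sequence of global databases from start to goal, or None.
    frontier, seen = deque([(start, [])]), {start}
    while frontier:
        state, path = frontier.popleft()
        if state == GOAL:                   # halting condition satisfied
            return path + [state]
        for nxt in rules(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + [state]))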
Another example of a production system can be found in Ed Kao's 3-dimensional tic-tac-toe program. The rules and conditions for
the AI are conveniently listed, just as for the 8-puzzle.
The question that faces many artificial intelligence researchers is how capable is a production system? What activities can a
production system control? Is a production system really capable of intelligence? What can it compute? The answer lies in Turing
machines...

Programs
These programs are examples of production systems.
3d Tic Tac Toe - E. Kao
PenteAI - J. Matthews


● Essays - Other essays on Production Systems.

All content copyright © 1998-2002, Generation5


See Also:
Artificial Intelligence:AI Theory

Project AI
by Mark Lewis Baldwin and Bob Rakosky

Introduction
When designing an artificial intelligence (AI) for a strategy game, one must keep clearly in mind the final goal,
i.e. winning the game (actually the goal is to entertain the customer, but the sub-goal of trying to win the
game seems more appropriate here). Normally, winning the game is accomplished by reaching a set of victory
conditions. To achieve these victory conditions, the computer needs to control a divergent set of resources
(units and other decisions) in a coordinated and sophisticated manner.
In order to give a frame of reference for this discussion, we will be discussing Project AI from the point of view
of a strategic wargame with multiple units, each of which needs to make a separate movement decision in each
turn of the game. However, this system is not restricted to wargames. It's applicable to any game in
which a large number of decisions have to be made controlling a number of resources, which work best in
coordination with each other.

There are a number of ways to approach the problem, and what will be discussed is by no means the only
approach. Project AI is a methodology that allows the computer to solve this problem at a strategic as well as
tactical level. But first, we need to build up to it by discussing the levels of AI decision making upon which it is
based. Each level described below builds upon the previous levels.

First level...
Approach: In each turn (or cycle), examine each unit to be moved (i.e. each node of decision making), build a list
of possible decisions and pick one randomly. In other words, look around and move somewhere. It doesn't
matter where. Note that an action of doing nothing is still an action.

Problems: This does not direct the computer AI's resources toward the goal of victory, other than at a noise
level. It will, however, confuse the opponent something awful.

Second level...
Approach: When selecting from the possible moves (the decision list) for each unit, pick the move that achieves
the victory conditions. To be specific: look around. Are there any victory goals achievable by the unit (i.e. can
the unit move into a victory location)? If so, implement.
Problem: When there are multiple actions which achieve the same goal, there is no filter for differentiating
equal (or nearly equal) actions. Also, most victory conditions cannot be achieved by any one decision,
which places us back at the first level.

Third level...
Approach: Evaluate each alternative move for a unit by how well it moves us toward the final victory goal. Each
move needs to be evaluated not as TRUE/FALSE but as a numeric value, analyzing how well the target victory
goal is reached. For example, an action which would move to a victory location in two turns is worth more
than an action that would move to a victory location in 20 turns.

Problems: This method does not allow for actions that support reaching the final victory conditions but in
and of themselves do not achieve victory (i.e. killing another unit when that is not part of the victory goals).


Fourth level...
Approach: Define specific sub-goals which assist the artificial intelligence in achieving the victory conditions,
but in and of themselves are not victory conditions. When a unit is making a decision, it evaluates the
possibility of achieving these sub-goals. Such sub-goals might include killing enemy units, protecting friendly
units, maintaining a defensive line, achieving a strategic position, etc. Accomplishment of each of the sub-goals
is then factored into the evaluation of the decision tree, and a decision is made upon it. This process can
actually produce a semi-intelligent game; a sketch of this kind of weighted scoring follows.
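A minimal sketch of fourth-level decision making in Python; every candidate move is scored against a set of weighted sub-goals, and the best score wins. All names, weights and evaluator functions here are invented for illustration:

SUBGOALS = [
    ("reach_victory_location", 100),    # weights reflect relative importance
    ("kill_enemy_unit",         40),
    ("protect_friendly_unit",   25),
    ("hold_defensive_line",     15),
]

def score(move, evaluators):
    # evaluators maps each sub-goal name to a function returning 0.0-1.0,
    # e.g. how close this move brings the unit to a victory location.
    return sum(weight * evaluators[name](move) for name, weight in SUBGOALS)

def best_move(possible_moves, evaluators):
    # Level one built the list of possible moves; levels three and four rank it.
    return max(possible_moves, key=lambda m: score(m, evaluators))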

Problems: Each unit makes its decisions independently of all others. It's like pre-Napoleonic warfare: it works
well until the opponent starts coordinating their forces.

Fifth level...
Approach: Allow a unit making decisions to examine the decisions of other friendly units in weighing its own
decision. Weigh the possible outcomes of the other units' planned actions, and balance those results with the
current unit's action tree.
Problems: This allows for coordination, but not strategic control of the resources. However, this level is
actually beyond what many computer AIs do today. It can also lead to iterative cycling, in which each unit keeps
modifying its decision based on the others in a vicious circle.

Sixth level...
Approach: Create a strategic (or grand tactical) decision making structure that can control units in a
coordinated manner.
This leads us to the problem of how one coordinates diverse resources to reach a number of sub-victory
goals. This question may be described as strategic-level decision making. One solution would be to look at
how the problem is solved in reality, i.e. on the battlefield or in business.

The solution on the battlefield is several layers of hierarchical command and control. For example, squads
1, 2 and 3 are controlled by Company A, which is in turn controlled by a higher layer of the hierarchy.
Information mostly flows up the hierarchy (information about a squad 20 miles away is not
relayed down to the local squad), while control mostly flows down the hierarchy. Upon occasion, information
and control can cross the hierarchy, and although that happens more now than 50 years ago, it is still
relatively infrequent.

As a result, the lowest-level unit must depend on its hierarchical commander to make strategic decisions. It cannot make them itself because a) it doesn't have as much information to base the decision on as its commander, and b) it might make a different decision from the same known data than others would, causing chaos instead of coordination.
OK, here's a first-cut solution. We build our own hierarchical control system, assigning units to theoretical larger units or, in the case where the actual command/control system is modeled (V for Victory), the actual larger units. Allow these headquarters to control their commands and to work with other headquarters as some type of 'mega-unit'. These in turn could report to and be controlled by some still larger unit.
Note that this was actually my first approach to the problem.
But there seem to be some problems here. A hierarchical command system modeled on the real world does not make optimal use of the resources. Because of the hierarchical structure, too many resources may be assigned to a specific task, or resources in parallel hierarchies will not cooperate. For example, two units might easily capture a victory location they are near, but because they each belong to a separate command hierarchy (mega-unit) they will not coordinate to do so; yet if by chance they did belong to the same hierarchy, they would be able to accomplish the task. In other words, this artificial structure can be too constraining and might produce suboptimal results.
And the human player does not have these constraints.

First, we have to ask ourselves: if the hierarchical command and control structure is not the best solution, why is it used by business and the military? The difference is in the realities of the situations. As we previously pointed out, on the battlefield, information known at one point in the decision-making structure might not be known at another point in the hierarchy. In addition, even if all information were known everywhere, identical
decisions might not be made from the same data. However, in game play there is only one decision maker (either the human or the AI), and all known information is known by that decision maker. This gives the decision maker much more flexibility in controlling and coordinating her resources than the military hierarchy allows.
In other words, the military and business system of strategic decision making is not our best model. Its solution exists because of constraints on communication. But those constraints do not exist in strategy games (command and control is perfect), and therefore military command-and-control decision making is not the right model for solving the problem in game-play AI.
And we want the best decision-making technique we can construct for our AI. So below is an alternative Sixth Level attack on the problem...

Project AI (Level 6 Alternative)


This leads us to a technique we call Project AI. Project AI is a methodology that extrapolates the military hierarchical control system into something much more flexible.

The basic idea behind Project AI is to create a temporary mega-unit control structure (called a Project) designed to accomplish a specific task. Units (resources) are assigned to the Project on an as-needed basis, used to accomplish the project, and then released when no longer required. Projects exist only temporarily, to accomplish a specific task, and are then dissolved.

Therefore, as we cycle through the decision making process of each unit, we examine the project the unit is
assigned to (if it is assigned to one). The project then contains the information needed for the unit to
accomplish its specific goal within the project structure.

Note that these goals are not the final victory conditions of the game, but very specific sub-goals that can lead
to game victory. Capturing a victory location is an obvious goal here, but placing a unit in a location with a
good line of sight could also be a goal, although less valuable.
Let's get a little more into the nitty-gritty of the structure of such projects, and how they would interact.

What are some possible characteristics of a project?

- Type of project -- What is the project trying to accomplish? For example: defend a city, kill an enemy unit, capture a geographical location, etc.

Project Type Examples:

- Kill an enemy unit.
- Capture a location.
- Protect a location.
- Protect another unit.
- Invade a region.
- Specifics of the project -- Exactly what are the specifics of the project? Examples are "kill the 239 Panzer Division", "capture the town of St. Vith", etc.
- Priority of the project -- How important is the project, compared to other ongoing projects, toward the final victory? This priority is used in general prioritizing, and in killing off low-priority projects should there be memory constraints.

- Formula for calculating the incremental value of assigning a unit to a project -- In other words, given a unit and a large number of projects, how do we discern which project to assign the unit to? This formula might take into account many different factors, including how effective the unit might be on this project, how quickly the unit can be brought in to support the project, what other resources have already been allocated to the project, what the value of the project is, etc. In practice, we have associated the formula with the project type, and each project just carries specific constants that are plugged into the formula. Such constants might include enemy forces opposing the project, minimum forces required to accomplish the project, and probability of success. (A data-structure sketch follows this list.)
- A list of units assigned to the project.
- Other secondary data.
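
As a concrete, speculative illustration, the characteristics above might be collected into a record like the following Python sketch. All field and method names here are invented for illustration, not taken from the article, and the value formula is a stand-in (it assumes a hypothetical unit.strength attribute).

from dataclasses import dataclass, field

@dataclass
class Project:
    kind: str                 # type of project, e.g. "capture_location"
    target: str               # specifics, e.g. "the town of St. Vith"
    priority: float           # importance relative to other projects
    # Constants plugged into the shared per-type value formula:
    enemy_strength: float = 0.0
    min_force_required: float = 0.0
    success_probability: float = 1.0
    assigned_units: list = field(default_factory=list)

    def incremental_value(self, unit):
        # Illustrative stand-in for the per-type formula: the value of
        # adding this unit, scaled by priority and the odds of success.
        committed = sum(u.strength for u in self.assigned_units)
        shortfall = max(0.0, self.min_force_required - committed)
        return self.priority * self.success_probability * min(unit.strength, shortfall)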

OK, now how do we actually use these 'projects'? Here is one approach...
1) Every turn, examine the domain for possible projects: update the data on current projects, delete old projects that no longer apply or have too low a priority to be of value, and initialize new projects that present themselves. For example, if we have just spotted a unit threatening one of our towns, we create a new project to defend the town; if the project already existed, we might have to reevaluate its value and the resources required in light of the new threat.

2) Walk through all units one at a time, assigning each unit to the project that gives the best incremental value for that unit (a sketch follows this list). Note that this may actually take an iterative process, since assigning/releasing units to a project can change the value of assigning other units to that project. Also, some projects may not receive enough resources to accomplish their goal, and may then release the resources previously assigned.

3) Reprocess all units, designing their specific move orders taking into account which Project they are assigned to and what the other units assigned to that project are planning on doing. Again, this may be an iterative process.
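
A minimal sketch of step 2, reusing the hypothetical Project record sketched earlier (with its illustrative incremental_value method) and assuming each unit starts with a unit.project attribute set to None:

def assign_units(units, projects, max_passes=10):
    # Greedily assign each unit to the project offering the best
    # incremental value; iterate, since each assignment changes the
    # value of assigning the others, until assignments settle.
    for _ in range(max_passes):
        changed = False
        for unit in units:
            best = max(projects, key=lambda p: p.incremental_value(unit))
            if unit.project is not best:
                if unit.project is not None:
                    unit.project.assigned_units.remove(unit)
                best.assigned_units.append(unit)
                unit.project = best
                changed = True
        if not changed:
            break  # assignments have stabilized
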
The result of this Project structure is a very flexible floating structure that allows units to coordinate between
themselves to meet specific goals. Once the goals have been met, the resources can be reconfigured to meet
other goals as they appear during the game.

One of the implementation problems that Project AI can generate is oscillation of units between projects. In other words, a unit gets assigned to one project in one turn, thus making a competing project more important, which grabs the unit the next turn, and so on. This can result in a unit wandering between two goals and never reaching either. The designer needs to be aware of this possibility and protect against it. Although there can be several specific solutions to the problem, there is at least one generic solution. Previously, we mentioned a formula for calculating the incremental value of adding a unit to a project. The solution lies in this formula. To be specific, a weight should be added to the formula when a unit is evaluating a project it is already assigned to (i.e., a preference is given to remaining with a project instead of jumping projects). The key problem here is assigning a weight large enough that it stops the oscillation problem, but small enough that it doesn't prevent necessary jumps between projects. So one may have to massage the weight several times before a satisfactory value is achieved.
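
As a sketch, this fix amounts to one extra term in the value calculation. STICKINESS is an invented name for the tunable weight the designer must massage by hand:

STICKINESS = 1.15  # >1.0; too large and units never make necessary jumps

def biased_value(project, unit):
    # Prefer the project the unit already belongs to, damping oscillation.
    value = project.incremental_value(unit)
    if unit.project is project:
        value *= STICKINESS
    return value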

Seventh Level
One can extrapolate past the "Project" structure just as we built up to it. One extrapolation might be a multilayered hierarchy of projects and sub-projects. There are other possibilities to explore as well.
© Copyright 1997 - Mark Baldwin and Robert Rakosky

You can find out more about me and the services provided by Baldwin Consulting at
http://members.aol.com/markb01

The Intuitive Algorithm


An Essay Concerning Artificial Intelligence, Intuition and The Mind
Intuition may be a pattern recognition algorithm
Abraham Thomas
Copyright 1997
INTRODUCTION
This essay offers an unusual profile of the mind. It is based on a novel insight concerning intuition, a little-known and mysterious mental faculty. The profile begins with an overview of some of the current problems faced by science in understanding the mind. It outlines seven specific issues which shroud major aspects of human intelligence in mystery. It goes on to explain a new algorithm, the logic of which appears to point to answers to these very puzzles. (An algorithm solves a problem in a finite number of steps, by executing a set of instructions in a specific order.) This algorithm uses a simple but unconventional logic in an expert system which diagnoses diseases.
This logic has classic grace and exceptional power. It appears to have immediate pertinence to the speed and subtlety of
the intuitive process. The mind instantly identifies a single thought, in context, from a lifetime of memories. The act is
equivalent to a search process which instantly locates a single needle on a vast beach. The logic of the algorithm may
make such an achievement feasible. The ingenuity of the logic enables one to imagine a viable process which can
convert a network transmitting nerve impulses into a real time system with knowledge, feelings, consciousness and
awareness.
Based on this critical insight, this essay presents a hypothesis concerning the mind. It suggests where human memory
may be stored, how memory can be recalled, how objects and events may be recognised, and how the mind may control
the body. The thesis suggests how emotions, judgement and will may finally manipulate the system.
" .............. The concept of an intuitive algorithm may provide us a key to the mechanisms and working of the human
brain and the concept of "MIND". Dr.K.Jagannathan, MD DTM FAMS, Consultant Neurologist.
"........... The tenet of The Intuitive Algorithm raises innovative and interesting questions on the very basis of intuitive
thinking." Dr.Prithika Chary MD DM(Neuro) PhD (Neuro) MNAMS (Neuro) MCh (Neurosurgey) Neurologist &
Neurosurgeon. Recipient - Indian Council of Medical Research Award for Outstanding Woman Scientist of The Year
1982.
"............ A highly commendable intellectual endeavour, which can provide leads to researchers in artificial intelligence,
cognitive sciences and advanced computer systems". Dr.K.Sundaram, PhD., Head of the Department of Computer
Science, University of Madras, Principal Contributions in Bio-Physics and Computer Science at the University, at the
All India Institute of Medical Sciences, New Delhi and at NASA, U.S.A.
CONTENTS:
Barriers to understanding the mind. How does the mind internally represent information? How does it instantly isolate a single pattern from a mass of interweaving patterns? How does it handle "uncertainty"? How does it achieve this in an astronomically large search space? How is such speed achieved despite slower neuronal transmissions? Does it use a reasoning process? Where is memory stored? A brief survey of these issues.
A new algorithm. Describes an algorithm, which successfully diagnoses diseases. Essentially, it reverses the logic of
the search process from selection to elimination, to achieve remarkably speedy results.
Instant recognition. When presented with unique links, the algorithm achieves instant recognition in massive search
spaces. It logically handles uncertainty, avoids stupid questions and is holistic. It also ignores the age-old reasoning
chains of science, travelling a new avenue in the application of inductive logic.
The nerve cell and recognition. Currently, nerve cells are believed to be computational devices. A new recognition
role is suggested for neurons. They may recognise incoming patterns. Recognition may explain such phenomena as the
modification of pain, the focus of attention, awareness and consciousness.
Memory. Recognition is "the establishment of an identity". It may be achieved by comparing the features of an entity
to those in memory. Recognition may mandate memory. Nerve cells may carry such memory. Feelings may be nerve
impulses, the recognition of which may provide context for the recall of memory.
Recognition of objects. Nerve channels project from point to point, observing neighbourhood relationships. Such
mapping may suggest a matrix type transmission. Intuition may be the instant recognition of such cyclic transmitted
pictures. Cortical association regions recognise objects and may transmit pictures, for recognition by the system.
Motor control. Instant intuitive recognition of pictures may empower motor control functions. Persisting iterating
patterns may form the basis for achieving objectives. Such goal patterns may be triggered by feelings. Habitual
activities may be recalled through intuitive and iterative pattern recognition by the cerebellum.
Event recognition. Intuitive iterating patterns are suggested as enabling the recognition of events. Event recognition
may be the key to complex thought processes. Event recognition may automatically trigger feelings.
The goal drive. Iterating goal patterns may provide basic drives and long term goals and may represent the "purpose"
of the system. Purpose is set by the current feeling. The will of the system may be decided by the limbic system which
may determine the "current feeling".
The mind. Consciousness may be an independent intelligence, which expresses judgment and will, and resides in a
restricted group of nerve channels. The limbic system may overrule will to determine the current feeling and hence set
goals for the system.
An expert system shell. Details of the design of an AI shell program, which can be utilised to create expert systems. Explains a simple method of knowledge input. Suggests areas in which expert systems can be helpful.
References.

Barriers to Understanding the Mind


Artificial Intelligence awaits a breakthrough. This essay concerns Artificial Intelligence, pattern recognition and the
concept of mind. The first of these, the term "Artificial Intelligence" (AI), originated in the mid-fifties, representing, at
the time, an ambitious effort to define human intelligence for simulation by machines. The AI effort has succeeded in
solving many problems which were believed to require intelligence, including those in information processing, pattern
recognition, game playing and medical diagnostics. Yet, several decades later, as continuing research unravels the
awesome complexity of the mind, the scientific community has serious doubts as to whether true AI can ever be
developed. AI faces a series of hurdles in defining human intelligence. A new view from a different perspective may
overcome some of these restraints.
The problem of internal representation. The primary restraint is the mystery surrounding the internal language of the
mind. An information processing system may receive data as language, formulae, or even digital readouts. The system
must translate these into its own internal representation. Computers manage with the digital format. These are stored in

http://www.gamedev.net/reference/articles/article770.asp (2 of 38) [25/06/2002 1:40:15 PM]


GameDev.net - The Intuitive Algorithm

memory, recalled, processed and then translated into an acceptable output mode. In AI, problems are translated into
specialised languages. Problem specific languages assist programs to play chess, or diagnose diseases. This need for
specialised languages partitions AI solutions into compartments. There is no single way in which problems can be
represented in AI to tackle chess, diagnostics, chemical analysis and banking. While the ultimate goal of AI may be to
become a single equivalent to human intelligence, its own languages fail to communicate with each other. As opposed
to this, the internal language used by the mind appears to fathom the whole world as we know it. This mystery is
sought to be addressed in this essay, using the logic of a new algorithm. The logic may point to a single internal
representation, for use by the mind. This may be its own interior language of communication.
Pattern Recognition. The second issue that has baffled AI researchers is the problem of how to identify a problem as
belonging to the field of mathematics, vision, or game playing, even before attempting to solve it. With its abstract
qualities, one can see difficulties in identifying a problem. Never mind identifying a problem: AI efforts have failed even to identify a tangible physical object, such as a face. Today, in spite of huge advances in technology, a computer cannot
identify a particular face as belonging to a particular person. The difficulty is that all recognisable objects and events in
our environment have innumerable shared qualities. For a computer, they form trillions of patterns, which overlap each
other. Establishing the identity of a single pattern among a range of overlapping patterns is called pattern recognition.
The recognition of a known face is a pattern recognition task. In AI, a computer algorithm may follow a logical
procedure to solve this problem. A pattern recognition algorithm may attempt to establish the identity of a seen pattern
through a sequence of logical steps. It may seek to identify a seen face as one belonging to a known person.
An exact match an impossibility. Current AI algorithms attempt to identify a pattern by matching its characteristics
strictly with that of a known pattern. The characteristics of known patterns can be stored in the memory of computers
for recall. Consider the problems in the recognition of a face. There are billions of faces in the world. They share
thousands of common features. The characteristics of colour, skin texture, facial features and makeup overlap each
other on a virtually infinite scale. People age, grow beards or change appearances with moods. The changes caused by
light and shade add further complexity. In such an environment, where patterns themselves have millions of shifting
characteristics, it is virtually impossible to find an exact match even if patterns are matched at the microscopic level of
detail. This essay suggests an algorithm which can establish the identity of a pattern in such a complex and changing
environment.
The problem of uncertainty. The third issue which has posed problems for AI programs is the factor of "uncertainty".
Computers work with a "Yes or No" logic. A characteristic belongs to a pattern, or it does not. A pattern can be
selected, or rejected on this basis. Unfortunately many characteristics have vague relationships to patterns. They are
only sometimes present. "Fuzzy logic" attempts to handle vagueness by giving grades to a characteristic, such as short,
medium height, tall and very tall. While this helps to define a characteristic in greater detail, it fails to handle
identification of a person who sometimes wears spectacles. A computer can match "wears glasses", or "does not wear
glasses". It cannot handle both. Unfortunately most patterns have such variable qualities. This essay attempts to show
how such uncertainty can still help pattern recognition.
Instant identification of context. The fourth issue, which has frustrated AI research, is the inadequacy of available tools to gauge the awesome size of the search space. When an AI program attempts machine translation of a word in
context, it must store contextual data and recall this through a search process. It is like searching for a needle on the
beach. The mind instantly identifies context. Every seen object or event fetches its own contextual background. When
the word "pool" is used with "swim", it suggests one meaning and quite another when used with "cartel". As we read,
specific meanings, which exactly suit the context, are instantly recalled. The mind holds a lifetime of memories and
associative thoughts. Yet it instantly identifies a single contextual meaning from such a gargantuan search space.
Computers seek an item in memory through a serial match. One characteristic of the perceived object is compared with
the characteristic of an item in memory. If this matches, the second characteristic is compared and so on, in a
systematic search.
An intractable search problem. The search space is enormous. In AI, a systematic search brings related problems as
to where to begin a search, and the direction of the search. "Heuristics" is a term used for determining a search
direction. If one is searching for a needle on the beach, heuristics would suggest a search to the North to locate it. But
such solutions work only in small search spaces. In spite of many attempted shortcuts, all such search algorithms
eventually face the problem of a "combinatorial explosion". The back and forth search paths become intractably
prolonged and cumbersome. While it takes milliseconds for the mind to locate a memory in context, the AI search and
match algorithm would take years, if it was to recall a single memory from a lifetime of memories. This essay suggests
an algorithm which can make instant identification practical for the mind in the context of a large search space.
A slower processing mechanism. The fifth puzzle is that the human nervous system is known to process data far
slower than a computer. (1) While messages in integrated circuits travel at the speed of light, nerve impulses travel just
a few yards per second. While computers process information in millions of cycles per second, the mind runs at
between 50 and 10,000 cycles per second. When one considers the enormous size of the memory bank of the mind,
how does a slower processing system achieve such incredible speed in locating one memory from trillions of memory
traces? This process of instant identification is usually called intuition, a hitherto unexplained and mysterious capability
of the mind. Parallel processing by the billions of nerve cells in the nervous system does explain some of the
complexity of the mind. Even then, no known search algorithm can achieve such precision with such speed. This essay
suggests a search algorithm which could be used by the mind to practically achieve the speed of intuition, even within
the limitations of the slower processing speeds of the mind.
No chain of reasons. The sixth issue is the mystery surrounding the reasoning processes of the mind. AI programs
attempt to give "backward chaining". When a solution is offered for a problem, step by step reasoning is provided for
the final conclusions. A chain of reasons links the premise to the conclusion. Yet, the average person detects a mistake
in the syntax of a sentence, without necessarily knowing anything about nouns, verbs, prepositions, deep structure, or
other intricacies of grammar. When a person pays attention to a sentence, errors are detected, without always knowing
why they are errors. Thus the reasoning processes used by AI do not appear to be the methods used by the mind. This
essay suggests that the mind may be constructed around a pattern recognition model, which does not apply reasoning
chains to draw its conclusions.
Where does memory reside? The seventh issue that has baffled scientific research is the scarcity of data concerning
the location of human memory. (2) Classic experiments carried out in the early part of this century on the memories of
rats concluded that no particular location of the brain stored memories and that memories were somehow stored in a
distributed fashion across the entire network. Current theory supports this hypothesis that memory is a network
phenomenon. Research from the seventies in "neural networks" suggested that a network could be induced to carry a
memory through their tendency to balance the relationships between various nodes. By providing "weightage" to
nodes, it was possible for units of memory to be stored. Such an explanation implied that the nodes were devices which
received inputs, carried out certain computation and sent out nerve signals. Opposing this theory, this essay suggests a
recognition rather than a computational role for nerve cells. In the process, the paper suggests a location for human
memory.

A New Algorithm
Recognition and intelligence. Consider the process of reading. The words are just black and white patterns on paper.
Recognition of the patterns conveys the purpose of the author to the reader. A single message on paper can move an
army. The act of recognition of the patterns on the paper provides a powerful but invisible link. If we did not comprehend the recognition process, the response to the arrival of a march order would appear puzzling. The nervous system appears a mysterious network, with billions of inter-linked communicating nodes. The processes of becoming conscious, or of paying attention, appear as baffling activities of the system, without any rational explanation.
This essay shows how instant recognition of patterns by neural processes can reasonably trigger intelligent activity in
real time. Recognition appears to be the key to intelligence.
The Intuitive Algorithm (IA). While the geography and functions of the human nervous system are well known and well documented, the mind remains a mysterious entity. The key insight behind the answers suggested in this essay comes from a diagnostic expert system which uses a new pattern recognition algorithm. It logically achieves virtually instant
recognition in a large search space - the suspected quality of intuition. A similar logic can enable intuition to achieve
the equivalent of instantly finding a needle on the beach. It removes the mystery surrounding intuition. It can be viewed

http://www.gamedev.net/reference/articles/article770.asp (4 of 38) [25/06/2002 1:40:15 PM]


GameDev.net - The Intuitive Algorithm

as a practical process which can identify a single item from an astronomically large database. It grants the mind the
ability of timely recognition in context. The insight opens to view the awesome range and power of an intelligently
interactive mind. The concept begins with the expert system. It uses a singular algorithm. Let us call it the Intuitive
Algorithm (IA).
The conventional expert system. When presented with a list of indicated symptoms, a diagnostic expert system
identifies a disease. Its database contains hundreds of diseases and their symptoms, including many commonly shared
symptoms. If a disease is a pattern, the objective is to identify a single pattern in a collection of interweaving patterns.
As explained before, traditional expert systems achieve this with an open ended search, based on indicated symptoms.
The database is searched for a disease that exhibits the first symptom. The first located disease having the first
symptom is tested for the second symptom. If the test fails, a new disease with the first symptom is located and the
second symptom is again tested. Each new symptom brings new diseases into evaluation. The search ends when all the
presented symptoms match the indicators of a single disease.
The IA process. IA uses a different approach in a logical search of a database. Each disease is stored with one of three relationships ("Yes" (Y), "Neutral" (U), or "No" (N)) to each symptom question. Y means a positive link - the symptom is always present in the disease. U means the symptom is sometimes present. And N means the symptom is absent for the disease. After each answer to a presented symptom question, the Y/U/N relationships of all diseases are tested in a single step, just the way all cells in a spreadsheet are instantly recalculated. The Y/U/N relationships are entered specifically for their negative impact. A "Yes" answer eliminates all "N" diseases: if the problem is unilateral, all bilateral eye diseases are eliminated. A "No" answer eliminates all "Y" diseases: if visual acuity is not affected, all eye diseases which impact visual acuity are eliminated. IA also purges questions which have "Y" relationships only to eliminated diseases. The questioning process begins with the question which has the maximum number of "Y" relationships. It ends when the presented symptoms eliminate all but a single disease. Specific questions can then confirm the diagnosis. If all diseases are eliminated, the conclusion is that the presented symptoms do not match any disease in the database; for IA, it is then an unknown disease. Such a problem-solving approach gives IA some exceptional capabilities.
IA circumvents "stupid questions". Normal search algorithms serially seek to match a symptom with a single disease. IA narrows the search faster by evaluating the entire database against the current answer. IA is holistic. Doctors know that the lack of a particular symptom clearly indicates the absence of a particular disease, so a subsequent query which suggests the possibility of that disease is a "stupid question". If a patient reports a lack of pain, a subsequent question posing the possibility of a disease which always presents a powerful pain symptom is, naturally, considered stupid. Such a question annoys the user. With its "back and forth, open ended" serial searches, a traditional expert system is blind to the global impact of a previous answer on subsequent questions. Additional steps are required to correct this defect. IA avoids "stupid questions" by purging all "Y" questions which relate only to diseases eliminated by the process.
IA logically manages "uncertainty". When a disease exhibits a symptom only occasionally, (a "U" condition), it is
retained within the database regardless of whether the answer to the symptom question is "Yes" or "No". The disease is
not eliminated. It remains available for "further consideration". IA continues the elimination process. Each answer
eliminates "Y" or "N" diseases as per the entered relationships, taking IA ever closer to the answer. IA achieves the
subtle objective of making a decision on an uncertain piece of information. While the disease with the uncertain
condition is "retained", every answer continues the elimination process. On the other hand, an uncertain condition is
"garbage" for a traditional expert system, which cannot "match" a disease which has a "maybe" relationship to a
symptom. Since IA does not seek an exact match, it logically handles "uncertainty". For correctly entered relationships,
the IA logic is flawless in diagnosis. Traditional expert systems are slowed down through the exponential growth of
their back and forth search steps. They ask a tediously long series of questions, including stupid ones. They fail to
handle uncertainty. IA is generations ahead of current expert systems. Doctors certify that IA is fast and never asks
stupid questions.
Inductive logic. But, IA follows the logic that a person does not have a particular disease if he does not have a
particular symptom. This is not a conventional logical derivation. In any diagnostic process, we can use deductive, or
inductive reasoning. In deductive reasoning, a generally accepted principle is used to draw a specific conclusion. All
men are mortal. Socrates is a man. Therefore Socrates is mortal. When a person uses a number of established facts to
draw a general conclusion, he uses inductive reasoning. For instance, the observation of swans over the centuries has
led to the conclusion that all swans are white. This is the kind of logic which is normally used in the sciences. An
inductive argument, however, is never final. It is always open to the possibility of being falsified. The discovery of one
black swan would falsify "the white swan theory". Inductive reasoning is always subject to revision if new facts are
discovered. The sciences progress through this process of induction and falsification.
Exclusion is also a logical process. Inductive reasoning has traditionally been based on the principle of inclusion. The
white swan theory is a result of experience over time. If we saw a white bird, we would move one step forward in
identifying it as a swan. But logic is equally sound in exclusion. If the bird was black, we could conclude that it is not a
swan. Subsequent discovery of a black swan would make this induction wrong. But, if the reasoning that all swans are
white was true, then the induction that a black bird is not a swan would be equally true. The white swan theory can
logically lead to both conclusions. In a similar manner, if a symptom is always present for a particular disease,
inductive logic also implies that an absence of the symptom excludes that disease from further consideration. This is
not a conventional conclusion, but is accurate and unassailable.
IA avoids an exact match and uses elimination. A conventional search algorithm seeks an exact match between
indicated symptoms and the symptoms in memory for a known disease. The objective of IA is not to find an exact
match, but to eliminate those diseases which fail to meet the search criteria. Both "Yes" and "No" answers are
specifically encoded to eliminate unrelated diseases. Consider a patient with a disease, who approaches a computer
diagnostic session. Let us say the computer has a list of 200 diseases, which can be identified by 1000 symptom
specific questions stored in the system. (Many diseases will share common symptoms). In practice, on an average, each
disease may answer "Yes" to 20 of the 1000 questions.
More clues in elimination. But up to 200 "Yes" answers may justify the elimination of a disease, since most symptoms will promptly point to specific groups of diseases, excluding others. The conventional expert system looks
only for "Yes" answers. It will match the answers for the disease of the patient to just 20 of the 1000 questions. For this
patient, 980 answers will not take the search forwards. But for IA, every "Yes" answer can eliminate up to 20 percent
of the diseases. Elimination of a disease also removes its related questions. The elimination process will yield speedy
results even for "No" answers. IA will identify the disease long before the 20 relevant questions for the disease are
exhausted by swiftly purging any remaining alternatives. In pattern recognition, an elimination procedure is
unbelievably faster than one which seeks an exact match.

Instant Recognition
A logic for instant recognition. The speed of the elimination process is even more striking for IA in a special
situation. When IA identifies a special condition, its recognition process is virtually instantaneous. Its memory stores
the relationships of all diseases to symptoms. Suppose only one disease has a "Y" relationship and all others, an "N"
relationship to an exceptional symptom. The symptom is unique to the disease. Then, a "Yes" answer to this symptom
eliminates all "N" diseases, leading immediately to recognition. The symptom indicates the disease. It is recognised in
a single step of massive elimination. The process is logical. It evaluates every disease in its database against a single
clue from one symptom. A doctor may walk into a surgery and instantly attend to a patient suffering from a heart
attack. He may not even ask a question. With minimum visual clues, he instantly identifies a single disease from his
"known database" of thousands of diseases. He instantly recognises a single pattern in a maze of interweaving patterns.
IA may be imitating the logic of this recognition process.
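
Using the eliminate() sketch from the previous section, a unique symptom ('Y' for one disease, 'N' for every other) produces exactly this one-step recognition. The toy data below is invented for illustration:

table = {
    'heart attack': {'crushing chest pain': 'Y'},
    'migraine':     {'crushing chest pain': 'N'},
    'influenza':    {'crushing chest pain': 'N'},
}
print(eliminate(set(table), table, 'crushing chest pain', True))
# -> {'heart attack'}  (all 'N' diseases eliminated in a single step)
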
Unique features can identify a pattern. The IA logic does not seek an exact match, but concentrates on the
elimination of alternate possibilities. Elimination is most effective when there are unique features. It is a practical
strategy for recognition in nature. All the recognised objects in our environment are unique. Despite millions of shared
characteristics, they also have individual qualities. Even where patterns shift constantly, some characteristics remain
stable. Consider a face in a newspaper cartoon. It contains the barest minimum of information - a few lines which
define the edges of facial features. But a public figure is identified by just the curve of a nose. The context of being in
the newspaper eliminates all ordinary people. The turn of the nose eliminates all politicians with straight noses. Unique features and elimination can determine the outcome. Massive amounts of data are not evaluated; a few clues suffice, and recognition is virtually instant. Elimination based on uniqueness can achieve logical and acceptable recognition.
IA imitates parallel processing. With the discovery of the spreadsheet, it became possible for computers with single
processors to imitate one characteristic of parallel processing. Even if a spreadsheet has thousands of cells, a single
entry in one cell is instantly reflected in all the related cells. Thousands of serial calculations appear to the user as a
single parallel calculation. Logically, the spreadsheet can have billions of cells and a sufficiently powerful processor
can still deliver this result. The spreadsheet is holistic, since every cell reflects the current re-calculated position. IA is
similar. By evaluating the results of a single answer on all the diseases in its database, it is holistic and imitates parallel
processing. Logically, IA too can produce instant recognition in any size of search space. Any unique symptom can
enable IA to instantly identify one among several thousand diseases. If IA is to attempt a problem on the scale of the
human nervous system, the only limitation will be the practical problem of data entry.
IA compared to intuition. Consider the steps followed by IA. It stores details of all diseases, their characteristics and
the relationships between them in memory. It receives inputs concerning symptoms through "Yes/No" answers. It
simulates parallel processing to globally evaluate the current input. It is encoded negatively to use all inputs to
eliminate unrelated diseases. If an input indicates any unique symptom, it achieves instant recognition by eliminating
all except the related disease. It follows an algorithm which results in instant identification. Compare IA to the
recognition process of the mind. When a face is familiar, there is instant recognition. Let us call it intuition. Such
recognition, of thousands of such objects, is repeated by people world-wide millions of times every day. Like most
other events in nature, such a process must follow an orderly set of instructions to achieve results in a finite number of
steps. In essence, intuition must also follow an algorithm.
Memory and relationships. A comparison of IA with the current knowledge of the mind reveals some similarities and several unexplained enigmas. This essay attempts to fill in the gaps to create a composite view of the mind. Firstly, IA stores the names of all diseases in memory. It is logical to assume that the mind stores data on all known faces in
memory. But the mechanics of memory remains unknown. This essay suggests a sound possibility. Secondly, IA infers
that certain symptoms are present, or absent, based on simple "Yes/No" answers to queries. There is considerable
evidence that the mind isolates thousands of characteristics of any seen object. Obviously, the mind must perceive the
characteristics of faces to be present, or absent. Thirdly, IA stores the relationships between symptoms and diseases. In
recognising a face, the mind establishes its identity. Identification demands a link between a face and its known
characteristics. One must know that the face is oval, or round. It is reasonable to presume that the mind must have such
links. But how the mind stores such links remains a mystery. This essay suggests how nerve cells can establish and
store such relationships.
Nerve cells eliminate alternative possibilities. Fourthly, IA encodes a negative relationship between diseases and
their symptoms. It is deliberately coded to eliminate. Deliberate elimination of alternatives is a well documented
feature of the nervous system. (3) Nerve cells have a powerful system of parallel inhibition of surrounding neurons
when a particular group of neurons start to send information. This inhibition is strongest for those immediately adjacent
to the excited neurons. Throughout the nervous system there are neural circuits which switch off other circuits when
their own areas are energised. There is evidence that the mind carries such systematic elimination beyond logic. This is
illustrated in the popular vision experiment, where a drawing can be interpreted as a vase, or two faces facing each
other. The mind eliminates one interpretation to recognise the other - a vase, or two faces. Evidently each recognition
path acts powerfully to inhibit the other. Recognition is firmed up by eliminating even logical alternative solutions.
The coding of elimination by nerve cells. The mind is known to have specialised networks which perform unique
functions. There is a network to identify the edges of a seen object. Another to detect the beginning and end of
movements by muscles. This essay gives some examples of how such intelligence can be achieved through recognition
based on the memory codes of neurons. In fact, the key theme of this essay is that such recognition can give
intelligence to a network. Such a tool can give neural networks the capability of achieving a variety of intelligent tasks.
It is assumed that neurons may be suitably coded, to facilitate elimination of less viable alternatives. This essay does
not suggest any probable process the mind may use to determine such elimination. But, elimination, as a neural
process, remains a well documented and practically experienced event.
Parallel links for speed. Definitive research suggests that the brain simultaneously isolates every incoming sensory
image into myriad characteristics. (4) The visual image alone is divided into several hundred million separate
characteristics of light, shade, colour, outline and movement. We do not know how all this information gets organised
and processed. But, each nerve cell in the system is known to have a hundred to a quarter of a million links with other
cells. (5) The average nerve cell is known to respond within about 5 milliseconds of receiving a message. Since all cells
work in parallel, any message received by any cell can reach any other cell in the system within just five or six steps -
in just one fiftieth of a second. Currently, science does not know how such a process can rapidly transfer information in
the system. Recognition may be provide the pivotal link. It can link every cell to the system. If so, every cell in the
network can recognise and respond to every flash of incoming information. If we assume a recognition role for the
nerve cell, global interpretation of incoming information and instant response becomes feasible for the system.
IA imitates intuition. IA has classic simplicity and power in its logic. The elimination process is logical. It is discrete
and does not leave a fuzzy answer. Yet it has the ability to evaluate possibilities with vague qualities. If a face is known
to occasionally wear spectacles, all faces which never wear spectacles can be eliminated. A vague characteristic is
productive for IA. As opposed to this, a search and match algorithm finds the "occasional use" type of information
futile. IA logic is holistic, since it evaluates its entire database, with each input. Every answer updates its perspective,
by eliminating all elements that fail the search criteria. Every answer narrows its focus. It creates in IA the equivalent
of "global awareness" of the mind. As against this, a search and match algorithm ambles about in the vast search space
without a clue as to the global picture and appears stupid. Finally, IA instantly identifies a pattern, if it indicates even a
single unique quality, through simultaneous elimination. In conclusion, IA is logical. It imitates intuition in being
holistic, avoiding "stupid questions", handling uncertainty and in providing instant recognition.

The Nerve Cell and Recognition


A nerve cell has many inputs and a single output. A cell is the basic unit of all living tissue. In the human body,
there are specialised cells called neurons, which transfer information rapidly from one part of the body to another
through electrical nerve impulses. Each of the one hundred billion or so nerve cells has many inputs and a single
output. (6) A typical neuron has thousands of minute threadlike growths called "dendrites" which conduct impulses
towards the cell body. A central "cable" called an "axon", conducts impulses away from the cell body. The output of
every cell in the entire nervous system is an "all, or nothing" impulse, called an action potential, despatched through its
axon. A neuron receives many inputs and dispatches a single output.
Neuron believed to be a computational device. Current research views this output of the cell as a computational
message. (7) The voltage of a neuron at any given moment, is presumed to reflect all the summation activities of a
thousand inputs. As the inputs arrive, they are supposed to be rapidly added to or subtracted from the total neuron
voltage. It is presumed that if the stimulus is strong enough to breach a critical threshold level, an action potential is
fired. Other neural network theories assume complex calculations, giving weightages across neurons. Current scientific
theory assumes that nerve cells use some form of computation, meaning mathematical, especially numeric methods.
">Nerve cells may not compute. They may recognise. IA points to intuition as a process, which acts through
elimination based on simultaneous recognition of millions of separate characteristics. It has been reasoned that, at the
seminal level, recognition may be accomplished by a nerve cell. There are many supporting arguments for this thesis.
"Recognise" means "to establish an identity". Mathematical computational ability does not focus on the identity of a
node. Weightages may give greater identity, but fail to give a node a singular quality, which can be recognised by
millions of other nodes. Yet, there is experimental evidence that a single nerve cell may inhibit the actions of millions
of other cells. If addition or subtraction is the principle, it is hard to justify the idea that the firing of a single nerve cell
among thousands of others can add up to trigger an action potential in an axon. You cannot add "1" to "-1000" and get
"+1". If recognition is the key, even a single microscopically small input from a single cell can trigger recognition and
inhibition of a whole battery of cells.
The nerve cell may operate a form of Boolean Logic. Each nerve cell may be functionally competent to recognise a
single event. It may fire a volley of impulses when the event is recognised. The all or nothing response of the nerve cell
may be a form of Boolean logic. In Boolean algebra, all objects are divided into separate classes, each with a given
property. Each class may be described in terms of the presence or absence of the same property. An electrical circuit,
for example, is either on or off. Boolean algebra has been applied in the design of binary computer circuits and
telephone switching equipment. These devices make use of Boole's two-valued (presence or absence of a property)
system. Firing by each neuron may represent the presence, or absence of a distinct property. The entire nervous system
may recognise an input from a cell as a perception of the presence of a property. Alternatively, the system may
recognise firing by a cell and respond with a specific activity, such as a muscle movement.
Recognition at the input level. For sensory inputs, the firing of a nerve cell is known to indicate recognition. The
entire information input into the human nervous system is through cells called receptors which convert sensory
information into nerve impulses. (8) Chemoreceptors in the nose and tongue report on molecules which provide
information on taste and smell. Other receptors are massed together to form sense organs such as the eye and the ear.
There are receptors which report on pressure, touch, pulling and stretching. Nociceptors report on cutaneous pain.
Peripheral nerves connect these sensory receptors to the central nervous system. At the entire input level, nerve
impulses indicate recognition of the occurrence of millions of isolated events. The whole system recognises the firing
by each one of these cells as the perception of a single microscopic event. At the input level, the firing of a cell
indicates an act of recognition and not one of computation.
Motor events at the output level. At the output level, individual nerve impulses control motor outputs. There are
motor areas in the cortex, the wrinkled surface layer of the cerebral hemispheres of the human brain. (9) Careful
electrical stimulation of these areas send nerve impulses which invoke flexion or extension at a single finger joint,
twitching at the corners of the mouth, elevation of the palate, protrusion of the tongue and even involuntary cries or
exclamations. The nerve fibres carrying inputs to and outputs from the cortex pass through the thalamus, a major neural
junction in the brain. This junction plays a key role in this explanation of the activities of the mind. The nerve impulses
passing through follow a form of Boolean logic. They report the presence or absence of individual events, or activate or
are quiescent to isolated motor functions. Each action potential indicates, at the input and output levels, the perception
or the triggering of a property - a distinctive event.
Nerve cells cannot add apples to pears. At the input and output levels, the firing of a nerve cell indicates an event.
Current theory admits the Boolean function at these levels. But scientists imagine computation by nerve cells at
subsequent levels, where these messages are interpreted and transmitted further. While it has a single "all or nothing"
output, a typical neuron receives thousands of inputs from other nerve cells. Numeric computation (adding, subtracting,
dividing, or multiplying) of widely varying inputs is quite improbable. The inputs are distinctly different events such as
sound, light, pressure, or smell. The outputs are complex muscle movements. It is wildly chaotic to include all this into
an integrated computation. It is like adding apples to pears, or subtracting the sense of touch from the sense of pain. It
is more realistic to assume that a pain cell recognises touch and reacts by despatching or inhibiting a pain message.
Recognition can evaluate varied inputs and trigger an appropriate output. Recognition may provide the key to
understanding intelligence.
Recognition the first step to intelligence. Throughout the nervous system there are networks of cells, which appear to
act intelligently. These events have been assumed to be some form of network intelligence - a mysterious mental
capability. But such intelligence can be explained if we assume that nerve cells recognise incoming information and
respond with action potentials through their axons. A typical unexplained act of intelligence is the baffling capability of
the mind to modify the sensation of pain on its route to the cortex. The sensation of pain is known to be reported,
enhanced or suppressed, under varying conditions. Consider the following explanation. A neuron which reports
cutaneous pain may receive inputs from its primary pain sensory neuron (P), along with other dendritic inputs from
neighbouring (sympathetic) pain (SP) and touch sensory (T) cells. The cell may report pain and sympathetic pain. It
may ignore the sense of touch to report pain. It may also inhibit sympathetic pain giving priority to the sense of touch.
In such a context, the cell responses to the listed inputs may be as follows:
P - Fire. Reports pain.
SP - Fire. Reports sympathetic pain.
P+T - Fire. Ignores touch and reports pain.
SP+T - Inhibit. Suppresses sympathetic pain to highlight touch.
In reporting, or suppressing sympathetic pain, the cell may be selectively responding to combinations of nerve impulses
received at different dendritic inputs. It may be recognising unique combinations to trigger its own interpretation of a
single event.
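
The table above can be read as a tiny recognition function. The sketch below is purely illustrative of the hypothesis (recognition of input combinations rather than summation), not a model of real neurophysiology:

def pain_cell_fires(p, sp, t):
    # p: primary pain input, sp: sympathetic pain, t: touch (booleans).
    if p:
        return True   # pain is reported even alongside touch
    if sp and not t:
        return True   # sympathetic pain is reported on its own...
    return False      # ...but inhibited when touch is present
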
An executive attention centre. The recognition model can also illuminate the puzzling process of paying attention.
(10) William James, in one of the best writings on the mind, suggested that attention is "the taking possession by the
mind, in clear and vivid form, of one out of what seem several simultaneous objects or trains of thought. Focalisation,
concentration of consciousness are its essence". The focus of attention is believed to be the key in trying to understand
the concept of consciousness. Research has revealed some facts concerning attention. (11) PET scans create images of
brain activity by detecting the presence of glucose in blood flow to nerve cells in the brain. When particular cells are
more active, there is more glucose in the local blood flow. The scans detect increased presence of glucose to construct a
three dimensional model of the brain on a computer screen showing greater activity with brighter colours. Recent
research using PET scans has revealed activity in an executive attention centre (EAC) in the cortex when people
focus attention. This area of the cortex lights up when a person pays attention to a sensory input. Mystery remains as to
how activity in this region can enable the system to pay attention.
Directing attention. The process of paying attention can be shown to act through selective recognition by nerve cells.
Touch sensory receptors in the skin are known to fire impulses, when pressure is applied on the skin. Such messages
are relayed to the cortex in several stages. Consider a relay neuron which transmits impulses from a touch sensory
receptor on the shoulder to the cortex. Let us assume that, among its many inputs, this reporting neuron receives
impulses from EAC through a single dendrite. The reporting neuron may normally be inhibited to prevent an overload
of sensory data to the cortex. The signal from EAC may be recognised by the neuron as an instruction to re-transmit
received messages. When it recognises the input from EAC, the reporting neuron may transmit received impulses from
the receptor to the cortex. These impulses simultaneously further inhibit neighbouring sensory neurons, thus
highlighting the message. By sending nerve impulses to these specific neurons, EAC may create awareness of the pressure of cloth on the shoulder.
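A crude way to picture this gating is the sketch below, which assumes (as the essay does) a relay that is inhibited by default and re-transmits only while it recognises an EAC signal; the names relay and eac_active are illustrative, not established terms:

    # Sketch of an attention gate: a relay neuron forwards sensory impulses
    # to the cortex only while it recognises an instruction from the EAC.
    def relay(sensor_impulses, eac_active):
        if not eac_active:
            return []                    # inhibited by default: no data overload
        return list(sensor_impulses)     # re-transmit the received messages

    shoulder_touch = [1, 1, 0, 1]        # impulses from a touch receptor
    print(relay(shoulder_touch, eac_active=False))  # [] - goes unnoticed
    print(relay(shoulder_touch, eac_active=True))   # [1, 1, 0, 1] - attended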
Awareness and consciousness. This reasoning points to increased awareness as a process, which causes inhibited
sensory neurons to fire. Recognition of EAC impulses by reporting neurons may focus attention by creating localised
awareness. Attention may become the process of increasing awareness in a local sensory region. Signals from EAC
may act by causing inhibited sensory relays to fire. Such control fibres may be linked to the entire sensory system to
enable EAC to focus the attention of the mind on any sensory input. A similar group of fibres may constitute a
consciousness channel, which may aid the mind to be generally aware of sensory inputs. Impulses in this channel may
instruct inhibited sensory neurons to begin reporting sensory events, to wake us into consciousness, with global
awareness. When this channel is inhibited, there may be no sensory awareness. The channel may be inhibited when we
sleep. A consciousness channel may wake us up, just as EAC focuses attention. Currently, awareness, attention, and
consciousness remain mysterious processes, which stand in the way of an understanding of the mind. If we accept the
possibility that individual nerve cells perform acts of recognition at the most rudimentary levels, we may explain many
such intelligent activities of the mind.

Memory
Current knowledge about how memory is stored in the brain is limited. (12) Some researchers suggest that memory is stored in specific sites and others that memories involve
network functions, with many regions working together. This essay suggests a method for the storage of human
memory and a mechanism of its recall. This explanation forms an enabling requirement to support the insight that
instant recognition is a key function of the mind. This follows the hypothesis that nerve cells act as primary recognition
devices at the most fundamental level. Such a premise can explain how memory enables nerve cells to support

http://www.gamedev.net/reference/articles/article770.asp (10 of 38) [25/06/2002 1:40:15 PM]


GameDev.net - The Intuitive Algorithm

intelligent networks, recognition of entities and habitual motor functions. This view of memory structure is vital for all
the functions of the mind, as described in this essay. This section provides an overview of how a nerve cell may store a
memory and how the nervous system may recall a memory.
Recognition requires memory. At the input and output levels, the firing by a nerve cell signifies a finite event.
Receptor cells interpret sensory inputs and send impulses. These impulses are relayed to the cortex in several
stages. At an intermediate stage, a cell may receive messages from multiple locations representing multiple categories
of such information. The modification of the sensation of pain, or the focusing of attention were suggested to act
through the recognition of incoming messages by reporting cells. This essay suggests that a cell fires when it receives a
distinct pattern which it recognises. To "recognise" is to establish an identity. The identity of any entity can be
established only when it has a known relationship to certain characteristics. Knowledge requires consistency. If a cell
knows a relationship, it must fire every time the relationship is recognised. So, the cell must store a memory of this
relationship, if it is to recognise it. If a cell has the power of recognition, it must have a memory. It is suggested that
such memory may be an ability to selectively recognise different combinations of incoming nerve impulses.
The structure of memory. A nerve cell with, say, 26 dendritic inputs coded from A to Z may have a memory for
combinations of simultaneous inputs, such as CDE, DXZ, etc. The neuron can be said to store a memory for each
combination, if it fires (or is inhibited) on receiving simultaneous impulses at C, D and E, or at D, X and Z. Each
combination becomes a relationship which the cell remembers. Each cell has a functional specialisation. When it fires,
it reports, or triggers a finite and unique event. The combination represents the relationship of this event to other events
(CDE, or DXZ) it perceives. As suggested earlier, the pain reporting neuron fires for pain (P), sympathetic pain (SP) or
pain and touch (P+T). It is inhibited by (SP+T). Each remembered combination becomes a unit of memory, which
triggers a dependable response from the cell.
A massive memory. Perception of each unit of memory may cause the cell to fire, or to be inhibited. 26 inputs, each either active or silent, can form 2^26 - over 67 million - distinct combinations. For a nerve cell with just 26 inputs, there can therefore be millions of such units of memory. The cell may selectively respond to millions of combinations. Recognition on this basis may give massive
selective intelligence to the nerve cell. Contemporary research has so far failed to locate a physical location for human
memory. The possibility suggested here can point to incredible memory capabilities in individual nerve cells. If an
individual cell can have such a large memory, imagine the total memory capacity of 100 billion cells! The concept may
also highlight the problem of memory recall. There may be as many units of memory as the number of grains of sand
on a beach. The task may truly be the equivalent of locating a needle on a beach.
A memory at a synapse. High frequency stimulation of the dendrites of a neuron has been known to improve the
sensitivity of the synaptic junctions. This phenomenon (13) is called long-term potentiation (LTP). Since such activity
is seen to be "remembered" by the cell through greater sensitivity at specific inputs, LTP is considered to be a hopeful
direction for research in locating human memory. This essay suggests that memory derives from a pattern recognition
function. It may follow from the cyclic recognition of the unique features of the multitudes of dendritic inputs of a
neuron. A neuron may become more sensitive to an individual input through LTP. Neurochemicals at the synaptic
junctions have also been known to increase such sensitivity. But, memory may derive from the global pattern
recognised by the nerve cell rather than from a greater sensitivity to a specific dendritic input.
Cell memory feasible. Each microscopic living cell contains the DNA molecule which carries within it the entire
blueprint for a human being. Recognition codes in cells interact in the handling of the millions of chemical interactions
in the body. The immune system is also known to use powerful code recognition systems. Under the circumstances, it
is feasible that the protein neuroreceptors which mediate neuronal interactions (or the innumerable chemical synaptic
intermediaries) contain sufficiently powerful memories and code recognition systems for the sustenance of a practically
limitless memory in each nerve cell. If such a massive memory exists within each one of billions of nerve cells, there is
the possibility of an astronomically large human memory - trillions of trillions of megabytes in computer terms.
Acceptance of the presence of such an immense memory may take us a step further in understanding the awesome
power of the mind. It may also create a massive barrier to AI in its efforts to imitate human intelligence.
The memory of nerve cells may be for patterns. Recognition requires a memory for the cell. Instead of just 26
inputs, many nerve cells have thousands, or even hundreds of thousands of incoming dendrites. 26 inputs can be
represented as characters on a page and each unit of memory as a group of characters, such as ABC or CDE. But, with
hundreds of thousands of inputs, the closer equivalent is a pattern of dots on a screen - a picture. With Boolean logic,
the pattern would consist of dots which are either on or off, with a defined frequency. The memory of a nerve cell would be its ability to store, and so recognise, multiple patterns of dots - the patterns of incoming dendritic impulses on a cyclic basis. This cyclic pattern of dots is the equivalent of a black and white picture. Recognition of a
picture triggers an impulse from the cell, indicating that the current incoming information has relevance to this
particular cell. Each nerve cell may have a memory for millions of such pictures, recognising individual pictures to
respond with impulses, or with inhibition.
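As a toy illustration of such a pattern memory, one cycle of dendritic input can be encoded as a tuple of on/off dots and compared against the pictures the cell has stored. The eight-input cell and its two stored pictures are invented for the sketch:

    # A cell's memory modelled as a set of stored input "pictures".
    # Each picture is one cycle of dendritic input: one dot per dendrite.
    stored_pictures = {
        (1, 0, 1, 1, 0, 0, 0, 1),
        (0, 1, 1, 0, 0, 1, 0, 0),
    }

    def recognises(incoming):
        # Fire only if the incoming cyclic pattern matches a stored picture.
        return tuple(incoming) in stored_pictures

    print(recognises([1, 0, 1, 1, 0, 0, 0, 1]))  # True  - the cell fires
    print(recognises([1, 1, 1, 1, 0, 0, 0, 1]))  # False - the cell stays silent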
Memory must be recalled in context. Wherever memory may be stored, it concerns a whole lifetime of activity and is
available for instant recall. A threatened animal carries a potent memory bank of past perilous experiences. It has
memories of initial sensory indications of danger, of muscular responses for battle and of escape routes from the battle
zone. With contextual memory recalled within fractions of a second, the whole power of experience is brought to focus
on the ongoing task of survival. A contextual filing system for memories is a vital requirement of life. Contextual use
of memory existed from the beginning of evolution. (14) In the early aeons, "Nosebrains" recalled memories for smells
to decide if an object was edible and to be consumed, or inedible and to be avoided. Smells became the file pockets
which triggered physical activity. Simple odour based filing systems in vertebrates evolved to more sophisticated
feeling based systems in mammals. Feelings provided context for many subtle shades of activities, including leisure,
play, upbringing of the young, and mild hostility, or deadly combat. This essay suggests that feelings may provide the
key to the recall of memory.
Feelings and emotions are real. But, for centuries, feelings were discarded by scientists as not being part of the
rational modern mind, a throwback from primitive times. It was Charles Darwin who first suggested that emotions have
a real world existence, visibly expressed in the behaviour of humans and lower animals. The existence of an emotion
could be derived from an angry face, or even a bad feeling in the stomach. Later theory suggested that each emotional
experience is generated by a unique set of bodily and visceral responses. Visceral responses switch the nervous system
between the sympathetic system which supports energetic activities and the parasympathetic system, which supports
relaxation. (15) Subsequently, this view was disputed by W. B. Cannon. He countered that emotions do not follow artificial stimulation of visceral responses. Emotional behaviour was still present when the viscera were surgically or accidentally isolated from the central nervous system.
Nerve impulses can represent feelings. This view that emotions have an independent existence is supported by
current research. Euphoric states of mind are created by drugs. (16) Electrical excitation of certain parts of the temporal
lobe of the brain produces intense fear in patients. Excitation of other parts causes feelings of isolation, loneliness or sometimes of disgust. (17) The feeling of pleasure has been shown to be located in the septal areas of the brain in rats. The animals were able to stimulate themselves by pressing a lever connected to electrodes implanted in the septal area. They continued pressing the lever till they were exhausted, preferring the effect of
stimulation to normally pleasurable activities such as consuming food. All experimental evidence over the years
suggests that nerve impulses can trigger feelings. This fits in with the reasoning that nerve impulses represent finite
events. In such a case, a group of fibres which carry feeling impulses can be viewed as a picture in a channel,
representing the real time feelings in the system.
The limbic system - a feeling centre. (18) In 1937 Papez postulated that the functions of central emotion may be
elaborated and emotional expression supported by a region of the brain called the limbic system. This system is a ring
of interconnected neurons containing over a million fibres. These fibres also pass through the thalamus, the main nerve
junction to the cortex mentioned earlier. The limbic system is a feedback ring with impulses travelling in both
directions. (19) This essay suggests that the pattern of impulses in this million fibre channel of the nervous system may represent our global feelings - a feeling channel. For a system which is constantly interpreting nerve impulses, the cell
of origin of the impulse indicates whether the impulse represents a point of light, a pitch of sound, an element of pain
or a twinge of disgust. Feelings are triggered as nerve impulses which represent measurements of the parameters of the
system. They are ever present. The pattern in this channel reflects the current feeling and may provide the context for
the recall of memories by the mind. Feelings may be expressed as a picture with a million dots. This essay suggests that
each subtle variation of the picture could recall a specific memory.
A sensory map on the cortex. It was reasoned that nerve cells store memories in the context of their relationships.
Such data must be stored somewhere to be recalled. It is widely known that the brain physically isolates each pixel of
sensory information. (20) When light enters the eye, it passes through the lens and focuses its image onto the retina.
The light is received by special cells in the retina called rods and cones. Light-sensitive chemicals in the rods and cones
react to specific wavelengths of light and trigger nerve impulses. About 125 million rods perceive only light and dark
tones in an image. 6 million cones receive colour sensations. The light from a single rod is perceived as a microscopic
spot of light when impulses reach the visual cortex. (21) Similarly, the tones heard by the ear reach a region of the cortex called Heschl's gyrus. There is a spatial representation with respect to the pitch of sounds in this region. Like a
piano keyboard, tones of different pitch or frequency produce signals at measurably different locations of the cortex.
Each pixel of sensory information terminates in a specialised complex on the cortex. The entire sensory input to the mind impinges as a picture on a region of the cortex. Consider the possibility that the memory of each sensory image is
stored exactly where it is received. There is experimental evidence of this possibility.
A Barrel to store memory. Each of the millions of sensory signals is finally known to reach a specialised barrel of
cells in the cortex. (22) In 1959 Powell and Mountcastle identified this complex as the elementary functional unit in the
cortex. Each unit is unique. It is a vertical column of thousands of nerve cells within a diameter of 200 to 500 microns,
extending through all layers of the cortex. Let us call this unit a Barrel. Research has demonstrated the functional
specialisation of each Barrel. Each Barrel represents a single pixel of sensory information. The neurons of one Barrel
are related to the same receptor field and are activated by the same peripheral stimulus. All the cells of the Barrel
discharge at more or less the same latency following a brief peripheral stimulus. The activation of one Barrel indicates
the arrival of one finite element of information to the cortex. A single rod reports the incidence of light on a
microscopic spot on the retina. The impulses from this cell are carried through the optic nerve to a single Barrel in the
visual centre in the cortex. The firing of a Barrel in the primary visual cortex signifies the perception of a point source
of light by the mind. This essay reasons that memories may be stored in the same Barrels.
Barrel - logical location for memory. The firing of one Barrel represents a single pixel of the global sensory
information. The location of the Barrel defines it as a point of light, a pitch of sound or a pressure point on the skin.
The firing of a pattern of Barrels is interpreted by the mind as a sensory image. The Barrels will fire when the image is
received. If the same Barrels fire again, a memory of the same image will be recalled. It was reasoned that a memory
may be recalled in its context. Feelings may provide that context. Feelings are the logical filing references for the recall
of memory. Feelings form a picture in the feeling channel. It was reasoned that nerve cells store memories of
relationships. These relationships were stored as pictures. It is now suggested that such a memory may be recorded into
a Barrel. The current feeling may be recorded into the memory of all Barrels which receive the current sensory
perception. Each Barrel recalls the relationship of this feeling and fires. When this feeling is recalled again, the same
Barrels fire and the sensory memory is recalled. For this reasoning to be plausible, feelings must have access to each
barrel.
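The proposed storage-and-recall cycle behaves like a content-addressable memory: each Barrel records the feeling pattern that was current when it fired, and a later recurrence of that feeling re-assembles the image pixel by pixel. The sketch below is a loose analogy under that assumption; Barrel, perceive and recalls are invented names:

    # Each barrel stores the feeling patterns under which it fired.
    # Recalling a feeling makes every barrel that recognises it fire again,
    # re-assembling the original sensory image pixel by pixel.
    class Barrel:
        def __init__(self):
            self.feelings = set()        # remembered feeling patterns

        def perceive(self, feeling):
            self.feelings.add(feeling)   # record the current feeling

        def recalls(self, feeling):
            return feeling in self.feelings

    barrels = [Barrel() for _ in range(8)]
    image = [0, 1, 1, 0, 1, 0, 0, 1]             # pixels of a sensory image
    for barrel, pixel in zip(barrels, image):
        if pixel:
            barrel.perceive("joy-at-flower")     # lit pixels store the feeling

    recalled = [1 if b.recalls("joy-at-flower") else 0 for b in barrels]
    print(recalled)   # [0, 1, 1, 0, 1, 0, 0, 1] - the image returns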
A "non-specific" access. If feelings trigger the firing of Barrels and the resultant recall of memory, then the feeling
channel must have access to each Barrel. Current research supports the view that there could be such an access. (23)
The nerve cells in the Barrels of the cortical layer are known to have both radial and parallel fibres. Radiating
downwards from the cortex are millions of fibres which directly link Barrels through the thalamus to all sensory and
motor functions. This link is called the "specific link". The cortex also has a surface layer which runs a thick network
of fibres parallel to the surface. These fibres are also known to be linked to the thalamus. This link is called
"non-specific thalamo-cortical link". The link was recognised when it was discovered that stimulation of the
"non-specific nuclei" of the thalamus led to wide-spread "recruiting activity" in the outer layers of the cortex. This
essay suggests that this "recruiting activity" could be the process of recalling memory.
Feelings have access to Barrels. The feeling channel in the limbic system passes through the thalamus. The impulses
in this channel may be broadcast through the "non-specific thalamo-cortical link" to the cortical Barrels. The complex
of cells in each Barrel may receive dendritic inputs from the million fibre feeling channel through the surface layer of
the cortex. Each Barrel may instantly recognise feeling patterns. Recognition of a feeling may cause a pattern of
Barrels to fire. The firing inhibits Barrels with weaker recall. Firm firing by a contour of Barrels recalls the original
sensory image. There is evidence that strong feelings result in more powerful memory traces. When strong feelings are
experienced during a sensory event, each Barrel stores a more intense feeling pattern. As a result, more Barrels recall
the image and a more vivid memory of the event is recalled.
Memory of a flower. As explained earlier, when light from a flower enters the eye, it passes through the lens and
focuses its image onto the retina. The light is broken into millions of pixels. The impulses representing each pixel are
carried through the optic nerve to a single Barrel in the visual centre in the cortex. Each Barrel is a complex of cells
with a vast number of inputs. It is suggested that each such Barrel also receives a feeling image - the feelings
experienced when viewing the flower. Each Barrel, which receives a pixel of the image of the flower, records the
current feeling picture. Later, if the feeling was strong and is recalled, the Barrel fires. Firing by the relevant Barrels
inhibits weaker recognition paths. When all Barrels which recognise this feeling fire, the image of the flower is
recalled. (24) This hypothesis concerning the location of sensory memory is also supported by a recent discovery. In
1988, Kosslyn reported that the recall of a visual image involves activity in the same areas where visual perceptions are
received. Effectively, the same Barrels fired when an object was perceived and when its memory was recalled.
A gargantuan memory. Consider the impact of this view of memory storage and recall. The mass of nerve cells in
each Barrel in the sensory region may store memories of all the feelings one has ever experienced whenever it fired.
The recall of any relevant feeling causes the Barrel to fire. If recognition is weak, it is inhibited by the stronger
recognition of neighbouring Barrels. Firm firing by a pattern of Barrels recalls a clear sensory image. This implies that
any perception is stored as millions of microscopic pixels of the global sensory image, in the context of a relevant
feeling. Such a memory would be widely distributed through all sensory Barrels. This could explain why scientists could not remove memory by ablating portions of the brain in their experiments on rats. The finding of Kosslyn that the recall of vision involves activity in the same cortical region as visual perception also supports this view of memory.
Such a system could store a lifetime of sensory memories and instantly assemble a single contextual image. This
process could explain your ability to recall an image from everything you have ever read, seen, or heard, in the blink of
an eye. But, the tendency of the system to inhibit weaker recognition paths may also prevent the recall of weaker
memory traces and appear as the fading of memory.
The cell memory can be inherited, or instantly acquired. It is reasoned that the memory required for pattern
recognition by nerve cells, as envisaged in this essay, can be both inherited and acquired. Inherited processes may be
seen in the visual processing regions of the cortex. The varying attributes of a visual image are analysed in different
regions of the visual cortex. One of these locations analyses the orientation of the outlines of a visual image. The cells
are arranged into distinct modules, with orientation selective cells which fire only when an edge or bar in their
fields is held at a particular orientation. All the cells in one column respond to one orientation; an adjacent column responds to an orientation a few degrees off from the first, and so on, till all possibilities are covered. If a column of cells is to select a single orientation, it must receive inputs concerning all orientations and then select one. Selection implies choice. From multiple received pictures, a single row of cells selects a single picture. This is a consistent response. Evidently, the cells remember the picture. Such responses by cells have to be inherited. The recognised
pattern may be its inherited memory. Evidence of such automatic responses by many neural systems provides proof of
a cell memory for patterns.
Sensory memories are, of course, acquired. As against a wide range of inherited responses by nerve cells, new
sensory memories are continually recorded. Every day, events that provoke feelings record thousands of images into
memory. When a Barrel fires to recall a new memory, the pattern of feeling impulses which triggered recall has already
been recorded afresh into the memory of the complex of cells. Since the cells can be sensitive to inputs from even a
single dendrite, the process of recording a memory can be a simple process of recording the current incoming picture
on receiving such an instruction from any source. This essay assumes that cells can have both inherited and acquired
memories.

Recognition of Objects
Channels carrying pictures. We have assumed that nerve cells recognise received pictures. A feeling channel carrying
a picture through a million fibres has also been suggested. There is a vital difference between a picture and a parcel of
messages, when transmitted through a bunch of fibres. Take the 32 bit parallel connection in a computer cable
connecting two parallel ports. A computer can recognise only two states, on or off, in each of its millions of circuit switches. But when 32 switches are linked together, it can recognise, in a single cycle, 2^32 - over four billion - distinct values. But such connections must maintain the integrity of neighbourhood relationships at the sending and receiving ends. Only
cyclic information received simultaneously through all the inter-related switches can be interpreted. Compare this to a
glass fibre channel transmitting information. If each fibre carried an individual message, the relative location of the
fibres would not matter. But suppose each fibre in the channel carries a single pixel of a picture. Then, if the relative
positions of the fibres change between the sending and receiving ends, the picture will be lost. It is in such a context that the computer cable transmits a primitive 32 dot picture. The relative position of the dots must be maintained at the
sending and the receiving ends. The feeling channel may similarly transmit a million dot picture.
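The arithmetic behind the four billion figure, and the reason fibre order matters, can be checked directly; this snippet is purely illustrative:

    # 32 linked on/off switches distinguish 2**32 states in a single cycle.
    print(2 ** 32)            # 4294967296 - over four billion

    # But the encoding depends on each dot keeping its position:
    bits = [1, 0, 1, 1, 0, 0, 1, 0] * 4          # a 32-dot "picture"
    value = int("".join(map(str, bits)), 2)
    shuffled = bits[1:] + bits[:1]               # fibres swapped in transit
    garbled = int("".join(map(str, shuffled)), 2)
    print(value != garbled)   # True - the picture is lost if positions change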
Neighbourhood relationships critical for a picture. If a channel is to accurately transmit a picture, the fibres must be
projected and "mapped" at the receiving end. Such projection and mapping is done in the nervous system. (25)
Throughout their growth, axons extend and map on to specific target regions. Each area of the somato-sensory cortex is
proportionally linked to the number of nerve endings in the corresponding part of the body. Similar parallel projections
exist in many other regions. Proximity relationships are critical for these connections. They maintain integrity in the
relative location of the fibres in a transmission. The principle of projection suggests that relative location has meaning
for the nervous system. The transmission is essentially a matrix of precisely located dots. Millions of such dots, in a
magazine, are recognised by us as a picture. If the relative locations of dots change, the picture is seen to change. Just
as it decodes information from the characters on this page, the mind instantly decodes the information in such a matrix
of dots. In this essay, a picture is defined as a cyclic transmitted pattern of dots, in a fixed matrix, which is recognised
at the receiving end.
Pictures may be the language of the mind. The nerve fibres reporting pain to the cortex form a pain reporting channel. Each fibre recognises incoming patterns, to be inhibited or to report pain and sympathetic pain. The mass and location of dots in the picture in the channel reports the precise location and severity of perceived pain. This essay
suggests that recognition of such pictures is the basic capability of the nerve cell and of the nervous system. More
reasons are given to support the view that information may move in the nervous system as such pictures in dedicated
channels. The capability of recognising pictures may apply not just to visual images, but to all messages transmitted in
the system. The meaning of such messages may be defined as the information carried by them. Un-related, a single
pixel of a picture has little meaning. Meaning is derived from its contextual whole. An arrangement of dots can
represent a character of text. Higher and higher orders of meaning can be conveyed by a single character, a word, a
sentence, and a paragraph of text. A picture is said to carry more meaning than a thousand words. It is at the apex of the
hierarchy in the conveyance of meaning. A multi-million dot picture channel can carry an infinite range of information.
Neural channels may convey the most powerful meaning at this level. The remainder of this essay assumes that the
nervous system transmits such meaning.
From primary to secondary, then to association areas. When we assume that channels in the system carry
meaningful pictures, the flow of information reveals awesome order and purpose. The regions and pathways of the
human nervous system have been extensively mapped. Each receiving region performs a function and transmits the
pictures further. (26) The areas of the cortex which receive sensory information are called the primary areas. They were
seen to perceive and recall sensory images. The sensory pictures proceed from primary to secondary areas, which
co-ordinate those from similar sensory receptors in the other half of the body. Neuron channels from the primary areas
send pictures only to the secondary areas. All secondary areas in both hemispheres of the brain are inter-connected. The
secondary areas are known to deal with more complex functions such as binocular vision and stereophonic sound. The
pictures proceed from secondary areas to the so called "association areas" of the cortex. These areas receive the
consolidated pictures from all secondary regions. The association areas appear to be the principal pattern recognition
engines of the mind. They perceive and recognise entities.
Many categories of recognition. Each association area recognises an entity in the context of its received sensory
information. (27) The primary somesthetic area of the cortex receives pictures of the sense of touch. If this area is intact
and there is damage to the somesthetic association area, a patient can feel a common object, such as a pair of scissors
held in the hand, while his eyes are closed, but is unable to identify it. The picture in the somesthetic areas enables the
sense of touch and that in the somesthetic association area enables recognition of the touched object. Failure of each
association area causes failure of a particular recognition ability. The visual association area impacts on visual
recognition. Tactile categorisation affects the recognition of an object by its feel. When the speech association area is
damaged, a person knows the object, but is unable to name it. The association areas appear to perform individual acts
of recognition.
A picture to represent the recognition of an object. The premise is that pictures transmit information in the system.
Pictures imply a distinctive pattern of dots in a fixed matrix. A visual image may be recalled through the firing of the
same Barrels in the cortex which received the original image. Groups of Barrels are also known to transfer information
between different regions of the cortex. The nerve fibres from the Barrels in the somesthetic association area can also be reasoned to be sending pictures. Damage to this area implies loss of the ability of the nerve cells in this region to send
these pictures. Subsequent failure to recognise a pair of scissors suggests that this picture represents a pair of scissors to
the mind. Such a recognition is a stable repeatable event. For recognition to be stable, this picture must be consistent
for this object. If pictures transmit information, the same picture must fire every time scissors are recognised. Each
object that is recognised would require equally consistent pictures. This would require a process which imprints such
pictures in this channel.
A recognition image. This essay suggests two routes for intelligent activity through instantaneous recognition of
patterns by billions of individual nerve cells. One is to recall an image in the exact geographic format in which it is
recorded as in the recall of a visual memory. The second is to recall a reference image, imprinted at the point of
recognition. A recognition picture fired by the association channel could be any random arrangement of dots. It would
be more logical to think that this arrangement is obtained from the system. Since feelings are always present, feeling
patterns could provide random reference points. If the geographic map of the association channel duplicates that of the
feeling channel, and has a parallel link to it, the channel can fire the same pattern as the feeling experienced when the
recognition of an object is first imprinted. Subsequent recognition would fire the same picture and the system would
recognise the same object. Stability of recognition may be provided by nerve cells in Barrels which act to trigger
inhibition of weaker recognition paths. In conclusion, each time the recognition of an object is imprinted, a picture is
imprinted in the association channel. Later when the channel perceives the object, it fires this picture in recognition.
The elimination algorithm to recognise an object. The mind instantly recognises one object from thousands of
known objects through a sense of touch. But the number of identifiable objects is finite. Each identification triggers a
picture by the Barrels of the somesthetic association channel. The Barrels receive integrated sensory pictures from both
halves of the body. They receive the global touch sensory information concerning an object. Assume that a random
group (X) of Barrels store the sensory picture at the point of recognition. X, a random reference, is provided by the
system. Later, during recognition, all Barrels perceive the object. The X Barrels which recognise some unique element
of the object trigger inhibition of Barrels which fail in such recognition. The X picture fires. Firm recognition would
imply a consistent firing of X. Active inhibition of all unrelated Barrels would eliminate other categories, leaving X,
indicating recognition of a single object. Sometimes the name of a recognised object may remain at the boundary of
consciousness, to be lost suddenly. Such "tip of the tongue" feelings may be derived from solutions that are eliminated
at the last moment.
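The elimination algorithm itself can be stated in a few lines. The sketch below starts from the full, finite list of known objects and discards every candidate that fails to match a perceived feature, instead of searching memory for a positive match; the objects and features are invented for the example:

    # Recognition by elimination: discard everything the percept rules out,
    # rather than searching memory for a match.
    KNOWN_OBJECTS = {
        "scissors": {"metallic", "two-loops", "hinged"},
        "key":      {"metallic", "toothed"},
        "sponge":   {"soft", "porous"},
    }

    def recognise(perceived_features):
        candidates = set(KNOWN_OBJECTS)           # the fixed, finite list
        for feature in perceived_features:
            candidates = {name for name in candidates
                          if feature in KNOWN_OBJECTS[name]}
        return candidates                         # ideally a single survivor

    print(recognise({"metallic", "hinged"}))      # {'scissors'}

Note that the loop never compares the percept against every stored memory; it only prunes a known list, which is why the next paragraph insists the list must be finite.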
">Elimination from a fixed list. No one will dispute the thesis that the human memory has gargantuan proportions. A
search and match algorithm would just go further and further back into memory to identify an object. Such a search
would be endless. But the elimination algorithm presumes a fixed list of known objects - a limited memory. From the
first recognition of its mother by an infant, the mind continually expands its list of differentiated entities. But at any point in time, the list must be fixed. It is logical to conclude that the list is finite. Secondly, if the list of
known objects was an amorphous mass, the mind would attach little importance to a new addition to the list. But the
first recognition of a new category is more vividly remembered than the second or the third. The Xerox machine and
the Polaroid camera are typical new objects, which are better remembered. The principle of "positioning" in advertising
depends on finding new categories against which products can be "hooked". The marketing world attaches importance
to creating a "new niche" in the customer's mind, because the context of imprinting a new category is better
remembered. The importance attached to the creation of new categories implies a finite list of known categories.
Elimination from this list brings the recognised category into focus.
Analytical logic vs. IA pattern recognition. Initial AI efforts assumed that computation was the key to intelligence - that the mind was just a sophisticated calculator. Later, it was acknowledged that any intelligent action requires a
knowledge based response. The mind uses a store of knowledge to respond differently to varying circumstances. AI
scientists attempt to assemble the knowledge and the related responses using the tools of analytical and inductive logic.
Data is chunked into categories which follow particular rules. Known relationships need to be generalised to fit rules
for such categories. This need to generalise limits analytical thinking. It cannot cope with infinitely differentiated steps.
It misses galaxies of fine detail. So, it fails to distinguish between charm and dignity, or anger and enmity. It ends up as a
subset of the human thought processes. IA suggests a pattern recognition process which also follows the principles of
inductive logic. It also uses a store of knowledge to draw conclusions from past experience. While analytical logic
requires a series of steps from premise to conclusion, IA pattern recognition leaps logically from perception to
conclusion based on unique category links in memory.
Fine logical differentiation. The IA logic implies a multi-million dot recognition picture, which can represent trillions
of categories. Each dot in such a picture (a Barrel) results from the evaluation of millions of stored multi-million dot
pictures. Such pattern recognition pinpoints millions of categories with precision, by identifying the unique quality of
each category. It views huge masses of data, using an astronomically large memory. In such a process, it instantly
identifies marble from jade, or tea from coffee. With the capability for massive discrimination, it conquers the
subtleties of language, poetry, art, and music. Such pattern recognition handles both analysis and the highest levels of
subtlety. While analytical logic fails to evaluate a great painting, pattern recognition identifies it as a work of art. It also
instantly recognises the stupid question in an analytical AI process. The only tool available to AI for modelling the
knowledge relationships of the mind was a logically analytical one. Unfortunately it was less sensitive, pitifully slow
and stupid. This became a barrier to an understanding of the mind. The IA pattern recognition model is logical and
capable of fine and massive differentiation. Above all, it functions in real time. It can sift mountains of data in
milliseconds. IA can better help to explain the vast capabilities of the human mind.

Motor Control
Mind's control of the body. A complex mind must interact with the physical world. Nerve impulses must transform thoughts into actions. The intelligence involved in such a process has appeared mysterious, with almost spiritual
overtones. It was as if the body followed the instructions of a phantom spirit, which resided somewhere in the brain. As
against such obscurity, this essay has argued that apparently mysterious activities of the intellect can be explained if we acknowledge the act of recognition - that recognition of pictures by the nervous system is the key element of intelligence. This section suggests how the mind may control actions of the body through such a process. A network of
fibres, which convert sensory perceptions into thoughts, may consciously manage fluent physical skills through the
phenomenon of recognition by nerve cells.
The motor control process. This analysis of how the mind may control the body begins with an outline of the
subordinate motor control system, validating intelligence at this level. It goes on to propose how a pattern recognition
system can communicate decisions and objectives at higher levels. An explanation is offered for the logic of
consciousness, which is the pre-requisite for purposive activity. It tries to show how feelings can control conscious
motor activity. It adds a description of a very special organ which mediates the storage and retrieval of skilled activities.
Such managed recall of skilled physical activity is shown to virtually create the modern human being, with the finely
honed skills of a gymnast, or a concert pianist. The explanation hints at the awesome power and finesse of a control
system which is empowered by sensitive pattern recognition. It ends with more thoughts on a neural channel which
may contextually choose intelligent physical goals for the body.
Habitual and purposive responses to feelings. The recall of a memory and the recognition of an object are
instantaneous acts. As opposed to this, motor activities persist over time. Since motor control impulses may fire up to 10,000 times a second, sequences of millions of impulses command the act of writing a letter, or of playing a game.
Current knowledge is that these controls have both purposive and habitual components, which interact seamlessly.
Experimental evidence has shown that purposive controls come from the cortical areas. Habitual controls are known to
come from an organ known as the cerebellum. These two control systems co-operate to achieve the objectives of the
mind. It is reasonable to assume that the process achieves the satisfaction of felt needs through physical activity. The
million fibre feeling channel was suggested to have access to information on the needs of the system. This chapter
suggests how pictures in this channel can control motor activity.
Control and intelligent response. At the highest level, impulses that control muscle movements originate in the
Barrels of the motor area of the cortex. Electrical stimulation of the cortical motor areas triggers, through 60,000 or so
motor neurons, specific acts of muscle contraction. These are supported by controls from the cerebellum. This organ is
known to co-ordinate motor activity. When these signals are despatched to subordinate levels, a single motor neuron
processes further lower level information from up to 20,000 dendritic inputs (28) from other neurons. These are supportive controls. In an airliner, a pilot expresses a purpose by moving a cockpit control lever. This act switches in a
series of hydraulic and electrical motors which finally achieve the intention of the pilot. There is purposive movement
at higher levels and intelligent support at lower levels. Similarly, in the human system, signals from the cortex and the
cerebellum provide high level controls. These are converted into smooth activity by the co-ordination of numerous
muscle groups. The 20,000 inputs add intelligent support to cortical purpose.
An inherited cell memory for low level intelligence. Current research does not assign a recognition role to neuronal
inputs. So 20,000 inputs become a mysterious network which achieves co-ordination. Any contracted muscle remains
contracted unless pulled back by an opposing muscle. Smooth activity requires co-ordination between muscles. Such
co-ordination is aided by sophisticated receptors which report back with nerve impulses on pressure, stretching of skin
and the initiation and cessation of movements. Let us assume that neuronal interactions achieve intelligence through recognition - that each input combination triggers a remembered "fire or inhibit" response. If so, the significance of 20,000 inputs becomes obvious. Recognition of an impulse indicating the contraction of one muscle requires automatic
inhibition of impulses which contract an opposing muscle. Imagine the interactions in the ordinary act of sitting down.
It involves numerous muscle groups. Each muscle co-ordinates this simple act with the activities of other muscles and
movement related parameters. Changes in the responses of opposing muscles must be immediately reckoned. There
may be an inherited mechanical logic in such responses. Each microscopic muscle movement may inform other
neurons with nerve impulses. The impulses reaching each input may be unique in its information content. There can be
millions of combinations of such inputs. With the intuitive IA process, every decision may inhibit all irrelevant
activities to achieve a single choice. Each motor neuron may recall its memory codes to resolve 20,000 independent
inputs for a single decision on movement for the next instant. Imagine this to be the inherited intelligence at the lowest
level of the system.
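One slice of that inherited logic - the essay's rule that a command to contract one muscle must inhibit impulses contracting its antagonist - can be sketched as follows. The muscle names and the keep-the-stronger-drive tie-break are simplifying assumptions of the sketch, not claims from the essay:

    # Inherited low-level logic: contracting a muscle automatically
    # inhibits the impulses that would contract its antagonist.
    ANTAGONISTS = {"biceps": "triceps", "triceps": "biceps"}

    def resolve(commands):
        # commands: muscle -> drive strength; drop the weaker of any
        # antagonistic pair (an assumed tie-breaking rule).
        active = dict(commands)
        for muscle, rival in ANTAGONISTS.items():
            if muscle in active and rival in active:
                weaker = min(muscle, rival, key=active.get)
                del active[weaker]                # inhibit the weaker drive
        return active

    print(resolve({"biceps": 0.9, "triceps": 0.4}))  # {'biceps': 0.9}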
A decision must persist. The final co-ordinated output of impulses in the motor channel trigger muscle movements
through the 60,000 motor neurons. Pictures are reasoned to be the language of intelligence. The final motor activity is
the result of a cyclic output picture in the motor channel. Obviously, the highest cortical levels originate this activity
through similar pictures. A cortical purpose picture is intelligently transformed into a motor output picture. It is now
suggested that decisions of the system can also be conveyed as pictures. The complexity of such decisions can be
imagined. The performance of a concert pianist is a product of such decisions. But, pictures can convey meaning at the
highest levels of complexity. Such decision pictures can be recognised by the motor channel to control muscle
movement. But, any decision to act is instantaneous. The impact of a decision must persist till its objective is achieved.
Muscle movements extend over time, while any decision to act occurs in a flash. If an objective is to be achieved, the
muscle must move till the desired action is completed. A decision to sit down must persist till the act of sitting down is
over. If a picture represents a decision, the picture must remain until the task is completed.
An iterating decision picture. In a cyclic system, a feasible solution of the need for persistence is for a decision
picture to iterate, till the task is completed. In a television set, the channel number that appears on the screen is a
constant iterating image, while images on the remainder of the screen change with each cycle. A stationary symbol is
produced in a cyclic system. Any decision, which requires a fixed objective, can be represented by a picture - the
channel number. A fixed picture is needed till an objective is achieved. If a television set could recognise the channel
number on the screen, it could respond with desired programs in the channel. If the channel number changes, the image
events could change. The change in channel number is instantaneous, while the program persists. This essay suggests that a channel with iterating signals may convey the decisions of the nervous system to the motor control regions - that such pictures iterate in a goal channel. One practical method for the mind to control the body would be to
consciously produce such iterating goal pictures.
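The persistence requirement can be mocked up as a control loop in which the goal picture is re-asserted on every cycle until the sensed state satisfies it, much like the channel number held on the screen. The quantities below are invented:

    # A goal "picture" iterates each cycle until the objective is met.
    goal_height = 0.0                  # target posture: seated (toy units)
    height = 1.0                       # current body height

    cycle = 0
    while abs(height - goal_height) > 0.01:     # goal picture still iterating
        height += 0.2 * (goal_height - height)  # motor channel moves the body
        cycle += 1

    print(f"goal reached after {cycle} cycles, height={height:.3f}")

The decision (setting goal_height) is instantaneous; the loop that satisfies it persists over many cycles, which is the distinction this paragraph draws.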
The feeling to goal link. Let us assume that feelings represent the needs of the system. The channel's constantly changing patterns represent demands from the system in real time. If impulses in this channel could control motor activity, felt needs
could be satisfied. Feeling stimuli, like all nerve impulses, are cyclic patterns, which must trigger decisions. Any
decision is an instantaneous event. But it must initiate a persisting activity. Let us assume that feelings trigger goal
pictures. These pictures persist till goals are met. Motor activities are again cyclic, needing to be continually triggered.
Thus, a goal channel establishes a link between an instantaneous decision following a felt need, and continuing motor
activity to meet the need, by providing a persisting objective. The location of the feeling channel has already been
indicated. The goal channel is suggested as a necessary adjunct to a cyclic pattern recognition system which achieves
continuing intelligent behaviour. A possible physical location for this channel is indicated later in this essay.
The wellspring of consciousness. The mind controls the body when it is conscious. This theory of how the mind can
learn to control the body requires an explanation of consciousness. The source of this phenomenon may be evaluated
from a special context where a person becomes unconscious. This is known to occur when there is damage to a region
of the brain called the reticular formation. Damage to most other regions of the brain, (including the removal of half of
the cortex in an operation called hemispherectomy to remove tumours), causes only selective defects. (29) But, serious
damage to the reticular formation results in prolonged coma. Cutaneous or olfactory stimuli to the reticular formation are known to restore a person from a fainting fit. Electrical stimulation of the reticular formation is also known to induce
sleep in animals. Activity in this region can both raise the levels of consciousness and alertness as well as induce sleep.
The reticular formation appears to be critical to consciousness.
A consciousness control channel. (11) It was shown how an executive attention centre could increase awareness by
causing inhibited sensory inputs to fire. It was suggested that a similar channel may wake us into consciousness, with
global awareness. It is now suggested that the reticular formation has initiating links to the sensory input and goal
channels. That these links form a consciousness control channel. That these channels recognise cyclic impulses from
the reticular formation and wake us into consciousness. That these control signals provide an ongoing consciousness
drive. Cyclic impulses keep us conscious. The consciousness drive opens sensory channels and people become aware
of their surroundings. It also triggers activity in the goal channel, which generates pictures defining objectives of the
system. Goal pictures are automatically interpreted into motor activity.
Purposive movement through learned pattern recognition. In conscious activity, primitive animals may have
inherited links between felt needs and purposive activity. Inherited memories may enable a need felt by a primitive
brain to lead to an activity, which automatically satisfies that need. But human systems have highly differentiated
purposes and even new ones. New purpose cannot be an inherited memory. Pattern recognition also implies the
recognition of a pattern in memory. A pattern recognition system can only recognise a known pattern. Human beings
learn through play and experimentation. These can lead to the successful achievement of goals. Feelings related to such
successes can become memories for subsequent recall. Imagine that a contextual feeling is recorded against a goal
picture which achieved a desired result. If this feeling is recalled later in a similar context, it can trigger the same goal
picture, resulting in the needed motor activity. If a goal channel records and recalls successful goal pictures in the
context of feelings, feelings can then trigger goals. The goal channel must learn to record contextual feelings against
activities which led to successful goals.
Learning purposive movement. Consider this explanation of how the goal channel learns to recognise feelings to
trigger goal pictures. It may be a process which begins in the cradle, with the intense activity of an infant. As the baby
wakes up, the consciousness drive initiates the goal and sensory input channels. The goal channel produces random
images. These symbols trigger erratic hand and leg movements. The active infant sees an object in its field of vision. Its
waving hand touches the object. A feeling of satisfaction is experienced. This feeling is recorded in the goal channel
against the goal picture which achieved this goal. A subsequent view of the object recalls this feeling. The goal channel
recognises the feeling to trigger the goal picture. This picture results in the hand movement. Contextual recall of the feeling enables the child to move its hand. An ongoing learning process continually adds memories of similar feeling-to-goal relationships in the goal channel. In a continuing process of repeated play and experimentation, the child learns
to move its hand towards seen objects.
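A toy version of this learning loop, assuming random goal pictures and an imprint-on-success rule (both assumptions of the sketch, with invented goal names), might look like this:

    import random

    # The goal channel first emits random goal pictures; when one happens
    # to satisfy a felt need, the current feeling is recorded against it,
    # so the feeling alone can trigger the same goal later.
    random.seed(1)
    feeling_to_goal = {}                          # learned associations

    # play and experimentation: the infant flails until something works
    for _ in range(100):
        goal = random.choice(["wave", "kick", "reach", "grasp"])
        if goal == "reach":                       # the hand touches the toy
            feeling_to_goal["sees-toy"] = goal    # imprint feeling -> goal
            break

    # later, the feeling alone recalls the successful goal picture
    print(feeling_to_goal.get("sees-toy"))        # reach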
An ongoing learning process. Each achieved goal increases control. Pattern recognition permits fine discrimination of feelings to choose ever more precise goals. Practice and millions of similar memory refinements later, the child learns to
reach out and grasp a pencil. As it learns to control its movements, the feeling channel takes charge of the goal channel
and the random activities of the infant cease. Feelings control the movements of the child. They become purposive.
Ultimately, such purpose covers the entire range of human activity, including speech. Speech, in fact, becomes one of
the best expressions of an individual's feelings. Each of these learned activities is the result of the memory of a goal and
a learned activity, recalled in the context of a feeling. The cerebellum may store and recall memories of learned
activities. Such a store of memory of habitual movements is known to interface seamlessly with purposive cortical
movements.
Computation and pattern recognition for movement. It is argued that the body achieves daily routines through the
instant recall of memories of habitual movements. A proof of this becomes powerful support for the theory that pattern recognition (not computation) is the key to an understanding of the mind. Both computation and remembered control systems can enable a robot arm to touch an object. It can compute the precise location of the object and make a
sequence of inter-related decisions. It moves a joint at the shoulder to an optimum position and locks. The movement
then switches to the elbow and subsequently to the wrist. Such an activity would be a precise, mechanically computed
movement. Alternatively, the arm could be guided directly to the target. The complex joint movements which result
can be recorded into the system memory. Subsequently, when the target is indicated to the system, it could recall its
memory to follow this "learned" path to reach the object. The first process involves a complex computational capability
in the control system and the second, a powerful memory. Pattern recognition implies the use of memory for intelligent
activity.
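The contrast between the two control styles can be made concrete. Below, computed_path stands in for the joint-by-joint calculation and replay for the remembered route; both are toy stand-ins, not real robotics code:

    # Two ways to move an arm to a target: compute the path, or replay a
    # remembered one. Pattern recognition corresponds to the second.
    recorded_paths = {}                  # target -> remembered joint angles

    def computed_path(target):
        # crude stand-in for a computed movement: one joint at a time
        shoulder, elbow, wrist = target
        return [("shoulder", shoulder), ("elbow", elbow), ("wrist", wrist)]

    def record(target, joint_angles):
        recorded_paths[target] = joint_angles    # "learned" by guiding the arm

    def replay(target):
        return recorded_paths.get(target)        # instant recall, no maths

    target = (30, 45, 10)
    record(target, [("shoulder", 28), ("elbow", 47), ("wrist", 11)])
    print(computed_path(target))   # computed, joint by joint
    print(replay(target))          # recalled in one step

The first approach demands computational machinery; the second demands memory. The essay's claim is that the cerebellum takes the second route.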
A filing cabinet for habitual movements. There is an organ of the mind which appears to store such memory. The
cerebellum is a miniature single purpose brain. It is laid out to assist cortical motor functions. It inserts habitual
movements into purposive activity. Electrical stimulation of the proper primary motor areas of the cortex invokes simple
actions such as the extension at a finger joint. While cortical control is simple, the cerebellum (30) is "necessary for
smooth, co-ordinated, effective movement". Failure of the cerebellum causes movements of a patient to become jerky.
With cerebellar problems, the patient converts a movement which requires simultaneous actions at several joints into a
series of movements, each involving a single joint. (31) When asked to touch his nose, with a finger raised above his
head, the patient will first lower his arm and then flex the elbow to reach his nose. This problem is called
"decomposition of movement". Such behaviour is remarkably similar to the actions of a primitive robot. Complex
joint movements for directly touching an object are forgotten and cortical purpose just manages a tedious joint by joint
movement. Purposive action continues to be achieved without the cerebellum. The patient can still reach the goal. But
presence of the organ achieves the same goals smoothly.
Sequential control of motor functions. Research has shown (32) that the cerebellar cortex has all motor control
functions with sensory inputs related to motor locations spread over its cortical layer with topographic precision. The
cerebellum receives all the information that is needed for motor activity. As it takes over control of habitual motor
functions from the cortex, (33) the entire output of nerve impulses from the cerebellum passes through a type of nerve cell called the Purkinje cell. Each Purkinje cell is known to have hundreds of thousands of dendritic inputs, with inputs
from global sensory and motor functions. Each input evaluates a single parameter. In 1967, V. Braitenberg suggested
the possibility of control of sequential events by the cerebellum. (34) The organ appeared to have an accurate biological
clock. Impulses in fibres which link successive Purkinje cells reach the cell dendrites at intervals of one ten-thousandth of a second. Alternate rows of Purkinje cells are excited, while in-between rows are inhibited. The cells fire
sequentially.
A memory for habits. The cerebellum is known to perform a co-ordinating function. It has access to the entire range
of contextual motor control information, an accurate pace setting mechanism and is purposively controlled by the
cortex. The only outputs from the cerebellum are the Purkinje cells. They control habitual motor functions. Such motor
activity meets cortical goals. This essay suggests that each Purkinje cell records a microscopic motor activity for a
single motor neuron in the context of current global motor control data and the current cortical goal. The cell fires again
whenever this picture is recognised. It recalls a memory to generate motor activity. Consider the habitual controls from
the cerebellum in the simple act of sitting down. It is, essentially, a complex movement, controlled by both cortical
purpose and habitual controls from the cerebellum. The height and position of a chair provide cortical goal information.
The cerebellum manages the objectives of many muscle groups to achieve the cortical goal. It is reasoned that, with
trillions of contextual pictures in its memory, the Purkinje cells may sensitively recognise each microscopic motor and
goal prospect to support habitual acts.
A goal channel into the cerebellum. The cerebellum provides some hints of the existence of a goal channel as
suggested in this essay. The cerebellar cortex receives a major input from a nucleus of cells called the olivary nucleus.
Fibres from many regions of the cortex reach the inferior olivary complex and are distributed to all parts of the
cerebellar cortex. (35) Damage to this group of fibres is equivalent in effect to damage to the cerebellum, causing
severe loss of co-ordination of all movements. The cerebellum is known to insert a massive range of learned
movements into normal activities. These inserted activities meet cortical goals. It is suggested that these fibres could
form a goal channel, which continually informs the cerebellum of the current goals of the system.
A seamless interface. The cerebellum could be acting as a memory store, switching controls between it and the motor
areas of the cortex. Neurons in the cerebellum could learn and reproduce remembered movements, becoming inhibited
when cortical intercession takes place in any habitual movement. A person who reaches for an object could be moving his hand smoothly to a point close to the object through cerebellar controls, with conscious controls taking over for a brief instant to adjust the hand. The cerebellum could again take over to grasp the object smoothly. The Purkinje cells also have inputs from stretch receptors which report increased muscle tension, enabling the organ to hand habitual movements back to purposive cortical controls.
Evaluation of a quarter million parameters for an imperceptible muscle shift. Children take years to learn to walk, run, or ride a bicycle. At ten thousand frames per second for each of sixty thousand motor neurons, the actions learned by this organ to meet cortical goals could represent astronomical memory capacities. Habitual acts are unique to each individual. If they were computed movements, they would have been similar for everyone. Being idiosyncratic yet repeatable, it is more probable that these are remembered actions. Even the movements of a skilled gymnast are learned with painstaking practice. It is training (requiring memory) which achieves the unique but co-ordinated movements of many muscles to precisely meet cortical objectives. This essay reasons that finely discriminative feelings trigger sensitive pictures in the goal channel to recall myriad learned movements from the cerebellum. The Purkinje cells represent a system which evaluates a quarter of a million parameters to generate a one ten-thousandth of a second movement of a single muscle. Multiply that by 10,000 decisions every second and again by 60,000 motor neurons. Imagine the power of such a system.
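Multiplying out the essay's own figures makes the claimed scale concrete; the snippet below is nothing more than that arithmetic.

    parameters_per_cell = 250_000    # dendritic inputs per Purkinje cell (as above)
    decisions_per_second = 10_000    # one decision every ten-thousandth of a second
    motor_neurons = 60_000           # one output stream per motor neuron

    rate = parameters_per_cell * decisions_per_second * motor_neurons
    print(f"{rate:.1e} parameter evaluations per second")   # 1.5e+14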
Event Recognition
Intelligence through recognition. Recognition has been proposed as the key to understanding intelligence. It was shown how recognition may cause the feeling of pain to be heightened or suppressed; how it may enable signals from the reticular formation to create consciousness, or those from the EAC to focus attention; how recognition by association channels may enable the identification of objects; and how recognition by Purkinje cells may enable the mind of a skilled gymnast to finely control his feats. Recognition, in reality, even provides the links, beyond the level of an individual intellect, for the highest levels of integration of a modern society. The lifeline of today's technological world is the link created by the recognition of spoken and written messages. If there were no recognition, intelligence would not exist. It is now suggested that, just as it identifies objects, the mind may have a capacity to recognise the events around them. The process of event recognition may produce an altogether nobler level of intelligence.
Event recognition exists. Recognition of a pattern which triggers a sequence of events is normal for computers. A "sort" command sorts a column of figures. A "copy" command copies a document from one file to another. The
computer recognises the command to set in motion a train of events. In a reverse process, a computer in a bank may
evaluate a sequence of transactions to trigger an alarm concerning a suspicious, or fraudulent event. A sequence of
activities becomes the cause rather than the result of the recognition of a pattern. Computers can be programmed to
recognise such events. We know that the mind recognises events with enormous power and subtlety. Words and
sentences in language identify and define events in the environment in all their complexity. There is also simultaneous
recognition of multiple events. When one drives through traffic, there is awareness of the movements of many objects
in the field of vision. Cars move in various lanes. People cross the road. Signals flash. Each event has a distinct context,
history and future possibilities. One recognises where a pedestrian comes from and where he is likely to go. Event
recognition implies a complex past, a present and a future.
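As an illustration of the bank example above, the hypothetical sketch below raises an alarm only when a sequence of transactions, rather than any single one, matches a stored pattern; the field names and threshold are invented.

    def suspicious(transactions, window=3, threshold=5000):
        # Flag any run of `window` consecutive large withdrawals: the event,
        # not any individual transaction, is what gets recognised.
        for i in range(len(transactions) - window + 1):
            run = transactions[i:i + window]
            if all(t["type"] == "withdrawal" and t["amount"] > threshold
                   for t in run):
                return True
        return False

    log = [{"type": "deposit", "amount": 100},
           {"type": "withdrawal", "amount": 6000},
           {"type": "withdrawal", "amount": 7000},
           {"type": "withdrawal", "amount": 9000}]
    print(suspicious(log))   # True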
The time span for event recognition. Events are continuously perceived by the mind. It also absorbs complex
information through the process of reading. It may be absorbing such information in discrete packets. The structure of
language may indicate such a process. Each sentence has an acceptable length and every sentence closes with a period.
A working memory (36) may store the first part of a sentence, while sense is made of the second part. While memory
capacity is astronomically large, the working memory of the mind is just comfortable with a single telephone number.
This may suggest a chunking of information into comparatively manageable segments before it is absorbed. A sentence
contains data regarding objects and their static and dynamic relationships. It would be reasonable to assume that the
time taken to absorb an average comprehensible sentence spans the time period for an event to be learned for storage in
memory for subsequent recall and recognition. While even the instantaneous click of a camera lens is a recognised
event, the structure of our literature suggests a period of ten to fifteen seconds for the absorption of a more complex
event by the mind. The information in a sentence may be assimilated in this period.
An event picture. While an object can be identified instantly, an event needs to be evaluated over time to achieve recognition. Even the simple act of running generates a sequence of complex patterns. A sequence of images is needed to represent the action. Even so, the event can still be represented by the simple word "run". The word covers the sequence of activities. Words and sentences are static images. If they are considered symbols, then events can be identified through them. Symbols can be represented by pictures. Fitting into the IA pattern recognition model of this essay, an event can be represented by a picture. It is reasoned that such pictures may be imprinted in a goal association channel. Compare the process to imprinting an iterating image "run" on every frame of a movie of a running person. Subsequent recall of any single frame of the movie will contain the name of the event. The name symbolises the act and enables recognition. The process imprints an iterating image on every frame of a sequence of images. Subsequent recognition is achieved by identifying any unique quality of any one of these images. The imprinted iterating image identifies the event.
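The imprinting idea can be mimicked in a few lines. In this illustrative sketch (the frame contents are invented), every frame of a sequence is stamped with the event's name, so recalling any one frame also recovers the event it belonged to.

    def imprint(frames, event_name):
        # Stamp each frame with the event label, like a channel number that
        # iterates on every frame of a TV broadcast.
        return [{"image": f, "event": event_name} for f in frames]

    movie = imprint(["stride-1", "stride-2", "stride-3"], "run")
    print(movie[1]["event"])   # "run": any single frame recalls the event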
IA for event recognition. In IA, a pattern is recognised by identifying its unique qualities, which are absent in any other known pattern. Events are sequences of patterns. It is suggested that unique qualities differentiate every recognised event at the microscopic level. A fleeting smile is instantly recognised because of some infinitesimally differentiated and unique quality. That quality differentiates a smile from a grin or a smirk. While it may be impossible to define the characteristics, the ability to recognise and differentiate such events remains a normal human ability. It is known that event images are analysed by the mind into thousands of characteristics. If we assume an astronomically large memory, which stores profoundly small variations between the characteristics of events, finite event recognition may be practical for the nervous system. Once this capability is assumed, much of the mystery surrounding thought processes disappears. Most cognitive processes may revolve around the recognition of an event, its recall from memory and a visualisation of its consequences.
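A minimal sketch of this style of recognition by elimination follows; the events and their qualities are invented for illustration. Candidates are discarded as soon as an observed quality contradicts them, until one uniquely matching event remains.

    known_events = {
        "smile": {"mouth": "curved", "eyes": "crinkled", "teeth": "hidden"},
        "grin":  {"mouth": "curved", "eyes": "crinkled", "teeth": "shown"},
        "smirk": {"mouth": "asymmetric", "eyes": "neutral", "teeth": "hidden"},
    }

    def recognise(observed):
        candidates = set(known_events)
        for quality, value in observed.items():
            # Eliminate every event whose stored quality contradicts the input.
            candidates = {e for e in candidates
                          if known_events[e].get(quality) == value}
            if len(candidates) == 1:
                return candidates.pop()    # a unique quality settled it
        return candidates                  # still ambiguous: more qualities needed

    print(recognise({"mouth": "curved", "teeth": "shown"}))   # "grin"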
A goal association channel. This essay suggests that events may be identified by the mind as pictures in a goal association channel. An iterating image (like a channel number on a TV screen) is suggested, since events cover a sequence of images spread over a period of time. Association channels have access to sensory perceptions. All Barrels may perceive, say, a person sitting down. The goal association channel is suggested to have a parallel link to the goal channel. Certain Barrels (triggered by the current goal picture) may record the sensory image of the event. Subsequently, when the action is again perceived, these Barrels dip into recorded memory to trigger the recognition of the event "Sit". Barrels which lack the images in memory are inhibited. The event is recognised and an event picture assembled in the channel. Just as a picture can represent many objects, event pictures in the channel may represent
multiple events. The equivalent may be visualised as a matrix of "iterating channel numbers" on a single screen,
representing several simultaneous events. Each can be represented by an independently changing number. A single
picture may identify several events, in parallel. An astronomically large and finely differentiated mass of such pictures
may represent our understanding of events.
Assembling event images. A musical composition is also an event. The process of combining such events to create a
unique new event in memory was described by Mozart. (37) According to his narration, when feeling well and in good
humour, melodies crowded into his mind. He retained those that pleased him. Once he selected a theme for a new
composition, another melody came, linking with the first one. Each part fitted into the whole. Mozart continued that,
when the composition was finished, he saw it as a beautiful whole in his head. Just as he could see a beautiful picture,
he saw the whole composition at "a single glance". This essay suggests that each melody may be an event, recalled by
Mozart, evaluated and stored again into his memory as an event picture. These are again combined into a complex
event picture, which covered the entire composition. Even though the composition is played over a long period of time,
Mozart saw the complex event "at a glance" as a single picture in his head. This narration may support the concept of
pictures representing events. The mind does recognise pictures, or even the implications of the written word, "at a glance".
Thinking in symbols. Running, sitting down, or walking are simple events, recognisable from a single picture.
Recognition of such events and their linkages is reflected in our language. A combination of multiple events is
understood from "John ran home and got into bed". The mind visualises the events "ran" and "got into" in the context
of the objects "John", "home" and "bed", to create a new, recognised event. Recognition of each event may fire a
picture in a goal association channel. With IA, the mind may have access to its global meaning. Language enables the
combination of such symbols into a complex event, which is also assimilated. Language has certain universal elements
in its construction. This may imply that the goal association process, which finally comprehends language, is itself
divided into different segments, which are then assembled according to specific rules. Nouns and verbs may be
recognised separately and combined according to rules for recognition to generate meaning for the whole. The whole
process of comprehending events has been suggested as being the function of a complex goal association channel, with
many components.
Event pictures within event pictures. A logical extension of the recognition of events is an event picture which stores within it a sequence of event pictures. The musical composition of Mozart appears to him as a single composition,
which contains many pieces within. A shopping trip can be recalled as a sequence of many sub events within the main
event. Such event recognition implies a combination of current inputs with memories from the past. These can
represent rising hierarchies of understanding. Since pattern recognition, as envisaged in this essay, permits infinitely
differentiated steps, event recognition can explain virtually any type of human intelligence from planning a strategy for
war to comprehending the theory of relativity. They are hierarchies of events, which contain millions of images. Since
pattern recognition requires only uniquely remembered and not logically connected links, any pattern can be linked to
any other in complex associations. Any alteration of a component image leads to a new understanding and a fine
difference of infinite subtlety in a new image of recognition.
Inherited emotional event recognition. The management of feelings has played a critical role in this essay. It dealt
with the concept that nerve impulses represent different shades of feeling. Feelings include those generated by bodily
demands and those fired by intellectual perceptions. Thirst is triggered by a bodily demand. Fear, a sense of isolation,
loneliness, or disgust are feelings which result from intellectual perceptions. This essay suggests that certain feelings
may be triggered by the recognition of event pictures from the goal association channel. Impulses which represent fear,
sorrow, or jealousy may be triggered by specific, recognised categories of events. These impulses may form the signals
in the feeling channel. Each Barrel may recognise event pictures specific to an emotion. The Barrels which fire to
trigger a feeling of disgust may recognise an event picture which represents a revolting event. Such memories may be
inherited at birth in the Barrels of the feeling channel to enable the system to respond with feelings to events.
Event memories. There is reason to believe that our memories of events are recorded as the feelings which we experienced during the event. When we recall a conversation, we can recall what was said, but not the exact words. But we can remember the tone and the meaning. Feelings can convey such meaning. The recall of visual memories was reasoned to be triggered by the Barrels of the visual cortex, and these memories were recalled in the context of
feelings. It is suggested that the Barrels of the feeling channel may trigger feelings when events are recognised or recalled, and that the goal association channel recognises events as event pictures. It is further suggested that each Barrel in the feeling channel receives and stores memories of these event pictures, which triggered the related emotion. Subsequent
recall of the event recalls the related feeling. Since event pictures have been suggested to be iterating images, the
sequence of feelings related to an event are recalled when an event is remembered. Iteration may give a time dimension
to experienced feelings, enabling the mind to recall events in sequential detail. The feelings, in turn, may recall visual
and sensory memories.
Feelings as an accurate record. Modern society communicates a major portion of its sophisticated messages through
the medium of language. Text books, novels and scientific articles convey complex meaning to readers. A language,
with about a quarter of a million words, conveys the majority of the information exchanged between human beings. It is suggested that
the sequence of images in a million dot feeling channel can convey all this and more within the mind. When we narrate
an event, the recalled event pictures convey the concept of the event through a sequence of recalled feeling images. The
goal association channel recognises and translates it through the speech mechanism into language. As a person
expresses the words related to a feeling, the mind has access to its entire memory store in the context of that feeling. IA
selects the words which exactly suit the context with precision. When an inner voice speaks the ideas in the mind, the
speech mechanism is merely translating the current sequence of feelings.
A summary of the event-to-feeling link. Event recognition has so far not been suggested by AI research. Infinitely graded categorisation itself has not been considered a possibility, so event recognition was never visualised. But IA makes such profoundly sensitive pattern recognition possible. It is but a step further to imagine a time-dimensioned pattern recognition system which can recognise events. Human experience and the clarity of language clearly indicate that events can be recognised with precision. This process also may use the mind's language of pictures. Events can be represented through words. Pictures convey more meaning than words. Iterating pictures in a goal association channel
may represent events. Events could be absorbed in brief time capsules, as in the time span in which the mind grasps a
sentence. They may be recorded for recall as an event picture by the goal association channel. The mind is known to be
aware of several simultaneous events. An event picture could represent several such concurrent events just as an
ordinary picture could represent many objects. Combinations of events could also become complex event pictures,
which represent sophisticated concepts such as war, or democracy. And, finally, event memories may be stored as
sequences of feelings in the feeling channel, which could be recalled by the iterating event pictures. The recalled
feelings would, in turn, trigger the recall of sensory memories of the event.
The Goal Drive
The goal channel. This section elaborates the concept of a goal channel. It has been suggested as an essential link
between two known and experimentally verified entities - the feeling channel and the motor control network. Such a
channel is a needed link, if the IA pattern recognition process is offered as a basis for explaining the workings of the
mind. The goal channel is presumed to function from certain cortical areas, which are known to trigger sequences of
motor activities. The channel would include all those regions which issue motor control instructions. This essay
suggests that the purpose of the mind may be an iterating picture in a goal channel. These pictures may essentially
represent physical objectives to be achieved by the motor system. These pictures are presumed to be intelligently
interpreted by the mind. The motor channel may interpret the goal pictures as instructions for sequences of motor
outputs. Immediate purpose may be determined by the current feeling. This feeling is shown, later in this essay, to be a
sophisticated intuitive choice by the system. This section explains the likely functions of a goal channel.
">Knowledge which achieves objectives. This essay assumes that nerve channels have powerful memories. Each
channel may store the global knowledge of the system, concerning its specialisation. The channels may use IA to
mutually and instantly exchange and interpret intelligent responses. A goal channel may trigger motor activity and
represent the intelligent purpose of the system. It is purpose which determines and executes those steps which enable a
person to achieve an objective. Purposive activity has three components - the desire, the purpose and the activity which
achieves that purpose. An infant's desire to touch an object is followed by a purpose, which triggers muscle movements
to touch the entity. A person may wish to copy a file on a computer. He interprets this purpose to the computer as a
typed-in "copy" command. The computer then executes a series of steps which achieve his objective. This essay
suggests that the goal channel may be a special interface for purpose. It may interpret feelings (the needs of the system)
to determine purpose and trigger motor activity. For this, it may store the knowledge of the system concerning groups
of motor objectives which achieve each purpose. Such purpose may, ultimately, be the driving force of the system.
Purpose as a route map. A nosebrain, which recognised the smell of an object, issued additional instructions to
consume or avoid the food. These instructions were followed by its motor systems. The mammalian feelings system
permits a wider range of options. The cerebellum does not provide cortical purpose, but is known to assist in its
achievement through sequences of recalled motor activities. Just as a route map recalls the physical directions to reach
a destination, the goal channel may recall and set the physical goals which control motor activity to achieve an
objective. If the objective is to leave the room, the goal channel may identify the door as a physical goal. The
cerebellum may co-operate with muscle movements in a stroll to the door. If the objective is to escape from danger, the
channel may contextually select the easiest escape route. During a drive, the goal channel may determine the right
turns, in context, to reach a destination. The channel may contextually respond with the next most suitable physical
goal for motor activity, to meet a particular objective. This objective may meet the needs of the current feeling. Such
physical goals may be in the channel memory in the context of the past achievement of similar objectives.
A goal channel with intuitive intelligence. The channel may have access to an adequate inflow of information to
intuitively choose physical goals for the system. Society teaches an individual how to achieve objectives from driving a
car to building a house through a range of pre-defined physical activities. The channel may build up a massive memory
of physical goals as responses to feelings. The channel may set sequences of physical goals for complex objectives - to
flee, attack, or negotiate. The choice of goals in response to feelings may be established at a young age. Such goals may
be learned gradually from infancy, forming sequences of physical activities, to be recalled instantly. Many patterns of
social interaction may be learned in playing fields, where each feeling may result in a particular fashion of personal
contact.
Inherited responses to feelings. A goal picture may have many components. Geographically, the channel may be
widely distributed. The levels below the thalamus are known to have substantial powers of self management of basic
life support systems, including feeding, drinking, apparent satiation and copulatory responses. The interpretation of
feelings and the issue of such control instructions may be perennial elements of a goal picture. Some bodily responses
to feelings are automatic. The cerebellum was shown to control habitual movements under cortical guidance; the process may be learned by the complex of cells surrounding the Purkinje cells in the cerebellum. It is now suggested
that over and above such learned movements, the cerebellum may respond with specific physical activity to the
interpretation of feelings by the goal channel. The cold sweat of fear, or the shuddering sobs of sorrow may be the
inherited responses triggered through the cerebellum by the goal channel when it recognises specific feelings.
A goal picture may be the primary drive. The purposive element of the channel may provide a mechanical interface
between feeling and motor activity. Feelings compel action. The individual may not be conscious of the many small
subsidiary motor activities which achieve a goal. The next subconscious objective that meets a feeling may be selected
and acted upon without significant conscious input. A child goes into a tantrum. A man commits a violent act.
Recognition of an event causes strong feelings to be experienced. These may automatically trigger goal pictures and
resultant motor activity. The process may be stopped only if the goal is changed. Once a goal decision is made, the
body is compelled to achieve the goal. The concept of a goal channel may explain the powerful drives that impel an
individual. Many day-to-day activities may also involve goals that are constant over hundreds of sleep and waking
cycles. Childhood feelings may set long term goals, providing contexts for the launch of current feelings. Such
elements of the goal picture may compel one to continue, consistently keeping a focus on primary objectives, over the
years.
Feedback loops co-ordinate output. It has been reasoned that the current feeling determines the goals and hence the
activity of an individual. From thousands of competing wants, the system must select a single one for action. Intuition,
as implied by IA, may be uniquely fit to contextually pick the single most germane selection. This capability is best
illustrated in the motor channel. Each one of 60,000 motor neurons has up to 20,000 inputs from other neurons as it travels down the spinal cord. Feedback loops use information from lower levels to modify inputs at higher levels.
Every muscle has an opposing one and many muscles must co-ordinate to achieve even the simplest task. Any selection
may instantly inhibit conflicting demands. Such decisions occur thousands of times a second. It is logical to conclude
that these feedback circuits may have a singular ability to instantly consolidate the backward and forward interactions
of millions of simultaneous inputs. Galaxies of parameters may be processed, using phenomenal intelligence
concerning their interactive impact. After such assessment, a single final picture delivers smooth muscle movement. It
is suggested that a similar process may determine the current feeling of the system.
The limbic system may decide the current feeling. Experimental evidence attributes a significant role for the limbic
system in the realm of emotions. This essay suggested that the output of the limbic system may represent the current
feeling. It is a ring passing through the thalamus, consisting of over a million fibres, which acts in both directions. A
process similar to that in the motor channel may take place in the limbic system. It may evaluate millions of received
parameters to determine, instant by instant, the final output. As in the case of the motor channel, where many muscle
movements oppose each other, many feelings may also be in conflict with each other. While a person is reading a story,
an unexpected sound occurs in the background. The sound generates a feeling. The feeling related to the situation in the
story dominates the system. At some point, suddenly, the feeling generated by the background sound obtrudes. This
feeling may now set system goals. The attention of the mind now changes focus to the sound. It is suggested that the
limbic system may continually process myriad feelings generated by bodily needs and intellectual perceptions to
generate the current feeling. This feeling may inhibit conflicting emotions to dominate the system and trigger goal
images.
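The selection attributed here to the limbic system can be caricatured as a winner-take-all computation. The sketch below is purely illustrative; the feeling names and intensities are invented.

    def current_feeling(feelings):
        # feelings: name -> momentary intensity. The strongest feeling wins
        # and inhibits (zeroes) every competing feeling.
        winner = max(feelings, key=feelings.get)
        return winner, {f: (v if f == winner else 0.0)
                        for f, v in feelings.items()}

    state = {"story-suspense": 0.7, "background-sound": 0.4, "thirst": 0.2}
    print(current_feeling(state)[0])   # "story-suspense" dominates

    state["background-sound"] = 0.9    # the sound obtrudes
    print(current_feeling(state)[0])   # "background-sound" now sets the goals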
When a wish becomes an act of will. William James (38) narrates the internal conflicts of a person while getting up
from a warm bed on a cold winter morning. One lies unable to brace oneself to get out of bed. Then there is a sudden
decision. One may think of some thought connected with the day's activities. He calls it a lucky idea, which "awakens
no contradictory or paralysing suggestions, and consequently produces immediately its appropriate motor effects...."
Suddenly there are no negative feelings and one gets quickly out of bed. He calls it a shift from "wish" to an act of
"will". It is suggested that the lucky thought may have been triggered by that segment of the goal channel which
manages longer term goals. It may have called up images which create a feeling of the need to achieve the day's duties.
The limbic system may evaluate competing feelings, and balance them to determine the current feeling. This "lucky"
emotion may inhibit opposing feelings. It may bring appropriate context and memories. The goal channel may
recognise the feeling to initiate "appropriate motor effects". The "act of will" may have been a sophisticated decision
by the limbic system.
A subtle feeling to goal relationship. It is suggested that the distinction between feelings and goals may be hard to
draw in some areas. Some feelings may be subconscious, with their impact triggering goal pictures and resultant visible
activity. Thus the impulses that trigger many feelings, such as curiosity, or playfulness may be subconscious, but may
produce resulting activity which meets the parameters of the feeling. The event recognition process is reasoned to have
an inherited code to trigger specific feelings when a particular type of event is recognised. As an example, when an
object or event evokes interest and cannot be recognised, a feeling of curiosity may be triggered. This feeling may, in turn, trigger goal images which facilitate investigation. The person may then follow those activities which assist in
recognition of the significance of the event.
The Mind
A composite picture. The IA concept visualises intuition as a process of infinitely graded category recognition, which
enables a supersession of the "understanding" of science by the "wisdom" of the mind. Such wisdom is reasoned to be
the property of neural channels. The channels are assumed to be electrical circuits with powerful memories, with
intuitive intelligence of a very high order. Biochemical messages may further aid this process. The mind does not
appear as a single network intelligence. Myriad separate intelligences seem to operate independently from thousands of
specialised and geographically identifiable neural channels. These channels may be distinct entities, mutually
exchanging and recognising unique and perceptive messages. The picture theory of internal exchange of information is
offered as their medium of communication. A holistic, real time interaction is made feasible in such a system by the
swiftness of IA. Such circuits may further explain certain mysterious functions, such as drives, consciousness, will and
judgement. An attempt is made, in this section, to combine these ideas and functions to present a composite picture of
the mind.
Reasoning chains for understanding. Some scientists may dispute the superiority of human wisdom over modern
scientific understanding. Science is founded on inductive, analytical logic. Logical analysis chunks information into
minimums that fit a specific rule, or reason. Science assembles facts that fit these rules to create understanding. It
presumes that any understanding must be built on a logical structure of underlying reasons. Reasoning chains underpin
science. A phenomenon is presumed to be understood only when the underlying causes are well defined. But the
information inflow into the scientific world overwhelms its capability for providing supporting reasoning chains. Every
science has spawned a dozen more. The vastness of the universe, billions of years of history, the complexity of living
things and the miniature worlds in the cell and the atom dwarf scientific ability to provide reasons. Over centuries of
research, the reasoning chains proposed by the scientific community, even those underlying its most fundamental
beliefs, have also been overturned by new discoveries. Reasoning chains have fallen far behind in providing
understanding.
A wisdom which supersedes an understanding. Intuition, as implied by IA, uses inductive logic to identify the
unique elements which link two patterns through a process of elimination. Such recognition of unique links between
complex patterns is reasoned here to be the basis for human wisdom. Intuition instantly recognises the link of the
pattern "the eyes" to the pattern "look friendly". This is human (or even animal) wisdom. As against this, a reasoning
chain would be hard pressed to explain the causes, since analysis of the two patterns may yield an astronomically large
number of categories with vague relationships. Each element of vagueness further weakens a reasoning chain. The
intuitive link, on the other hand, may be based on powerfully accurate and logical perceptions of concrete experience.
Instead of seeking underlying reasons, the intuitive process may find unique links from a vast storehouse of experience.
Intuition may as often be just as wrong as scientific reasoning. But it girdles a wider horizon and is a powerful weapon
for coping with the environment. This essay suggests that the wisdom created by pattern recognition may be superior to
the analytical understanding created by science. Science assists such wisdom with reasoning chains. This essay
attempts to be one such reasoning chain.
Many intelligences in a federal system. Current neural networks theory may be compared to the effect of ripples
created by a sequence of pebbles dropped into a still pool. The ripples interact and can be expected to indicate the
global outcome of every dropped pebble. The theory suggests a similar global intelligence, with every portion of the
network reflecting every event that occurs in the nervous system. As against such a single intelligence, this essay
suggests that many intelligent regions may perform independent functions in the nervous system. Such regions, their
functions and the nerve fibre links between them have been extensively charted by science. These regions may
communicate internally through intelligent pictures. The evidence for the "picture mode" of transmission is provided
by the phenomenon of point to point "mapping" between the myriad neural channels. It was reasoned that the
association region may inform the prefrontal regions that a pair of scissors has been recognised through a picture. This
message is an independent communication between two finite intelligences and not "signals that balance" an entire
network. Medical evidence also supports the concept in this essay of myriad independent intelligences. These
intelligent circuits are known to form a hierarchy of interactive subsystems, each demanding only critical inputs from
higher levels. The management has been likened to a federal government. At the lowest levels, people manage their
affairs by themselves. Higher level decisions are made by the communities, by the state governments and finally, by the
central government.
A self managed system at lower levels. This decision making system is revealed in the "homeostasis" of animals in
the survival process. Homeostasis (39) is the achievement of a relatively constant state within the body, in a changeable
environment. It is naturally maintained. It is brought about by various sensing, feedback and control systems,
supervised by a hierarchy of control centres. The concept that these centres mediate these controls is based on a wide
base of experimental evidence, gathered by studying the impact of destruction of localised topographical targets in
animals. As higher levels are included with the spinal cord below the cut-off section, more effective controls are
retained. The thalamus is the major nerve junction sitting at the apex of this survival hierarchy. The levels below can
sustain a wide range of activities including feeding, drinking, apparent satiation and copulatory responses in a wide
range of adverse conditions. Obviously, an incredibly high level of intelligence and self management exists at these
lower levels.
Selective awareness. But, are we just mechanically constructed objects which respond with electrical and chemical
impulses to the external environment? All of us have a deep down knowledge of being free of the mechanisms that
generate the impulses. We can vividly see visual images and powerfully experience a multitude of sensations and
feelings. Unlike a television camera or a microphone, we are independently conscious that we are seeing and hearing
the world around us. If something is seen, surely there must be someone who sees it - a ghost, or a soul? But, while
neural impulses pulse through every part of our body, we have the sensation of seeing only when these impulses
impinge on the visual cortex. Nerve impulses in Heschl's gyrus alone cause us to hear sounds. Are these portals into
the soul? This essay suggests that among the myriad pictures evaluated by the nervous system, consciousness involves
a limited group of pictures of which a human being is conscious. Like every group in the nervous system with its own
intelligence, the conscious intelligence may be an independent entity, constituting a group of neural circuits. It may feel
and act as an independent entity. It is suggested that such an intelligence may operate in the region around the
pre-frontal lobe of the cortex.
Pre-frontal regions and a sense of self. The geography of nerve channels pinpoints many functions which intercommunicate. While all other regions of the cortex interact mostly within finite regions, the prefrontal lobes have
abundant connections (40) with the association regions of the three sensory lobes. The association regions are known to
perform the most important act of recognising perception. The message of recognition is carried to the prefrontal
regions. These connections may be one to one projections. Recalled memories, recognition of multiple objects and
complex events may travel as pictures to the prefrontal cortex. This region may be the conscious mind that sees and
knows that it sees. Suppose a computer is constructed to receive, categorise and store received sensory images.
Suppose parallel processing enables a second internal system to receive all such information, including its own
operational parameters. The second system may truthfully say "I can see and hear you. My speech mechanism is
functioning at optimum efficiency". An autonomous intelligence in this region may independently evaluate the system
to enhance our impression that we are independent of ourselves. Consciousness and the sense of self may be moulded
by the circuits in the pre-frontal regions.
Consciousness may provide context. The conscious mind receives sensory inputs, feels emotions, recalls memories,
focuses attention, recognises objects and events, visualises and evaluates alternatives, and wills motor activity. But,
while all sensory inputs are monitored, only a small fraction enters consciousness. The motor functions, stored and
recalled by the cerebellum, remain subconscious. Even the act of will does not enter consciousness. Only if the mind
is questioned does it reveal a decision to sit down, or to go to the water cooler. Many feelings which trigger goal events
also may not enter consciousness. From an astronomically large volume of information, and a wide range of options,
intuition forces the elimination of all alternatives to pin down a single choice. While the mind may be processing many
feelings, the conscious mind may experience only a single dominant feeling. The feeling may provide the context for
the recall of a memory of an event. It may provide a file pocket and reference point. A single hook, a focal point is vital
for context in recalling memories. The conscious mind may provide a critical focusing point for context. Since the
volume of information manipulated by the nervous system is massive, nature may have restricted stored memories to
those entering a limited region of consciousness.
Pre-frontal regions pass judgement. It has been suggested that motor activities may be triggered by feelings. Animals
are known to sustain a wide range of activities including feeding, drinking, apparent satiation and copulatory responses
in a wide range of adverse conditions, in spite of being disconnected from the levels above the thalamus. As such, it is
reasonable to presume that a wide range of feelings which trigger these activities may be generated by levels below the
thalamus. The prefrontal regions appear to generate a different set of feelings. Some years ago, (41) a procedure called
prefrontal lobotomy was applied to patients with intractable pain, or in attempts to modify the behaviour of severely
psychotic patients. The surgery disconnected the prefrontal zones from the regions around the thalamus by cutting
nerve fibre connections. It was noted that such patients were "tactless and unconcerned in social relationships, with an
inability to maintain a responsible attitude". These patients were seen to "lack judgement". Presumably, judgement may
result from the more intellectual feelings triggered by the prefrontal regions.
Cutting off judgement. The geographic differentiation between perception and action is seen in prefrontal lobotomy.
Judgement is a process which evaluates the impact of a proposed course of action. This essay suggests that any
proposed action, even a rude one, will trigger a goal picture. A goal picture is a planned event. The event may be
recognised by the prefrontal area to generate feelings related to its outcome. Normally, a person recognises the impact
of rudeness, to generate a feeling of impropriety. If the limbic system received this message, it may instantly select it as
the current feeling. If the current feeling was negative, the rude action would be instantly inhibited. With pre-frontal
lobotomy, this feeling may not be conveyed to the limbic system. This essay suggests that event recognition by the
prefrontal regions may trigger feelings concerning complex human interactions. Without access to such feelings, the
limbic system may permit the execution of tactless actions. While such intellectual feelings may be generated in the
region of consciousness, the so called primeval urges may be generated from regions below the thalamus.
When will is bypassed. Even while the system is incredibly sensitive to one's needs, one is aware of the difference
between voluntary and involuntary actions. This essay has suggested that many intelligences operate in the system. The
conscious mind may be one of these. It may appear as the "self" and "the master". The system seeks to be sensitive to
"the needs of the master". But it may not always yield control. Any planned course of action generates a feeling. If it is
acceptable to the system, action is triggered. The limbic system may select the current feeling from a range, including
the "wishes" of the conscious mind. It may have inherited code recognition parameters, which even prohibit the
dominance of self destructive feelings. If a feeling is unacceptable, conscious will may be ignored and the action
inhibited. While an individual may "will" the movement of a limb, such will may be overruled if it does not conform
to a "WASP" formula. The action should be Worthwhile, Appropriate, Safe and Practical. One gets up out of bed if one
feels it is worthwhile. No ordinary person can will himself to take an action which is inappropriate, unsafe, or
impractical. This can be seen when a person freezes on the high diving board, in spite of his "wish" to dive.
Limited intellectual control. The outcome of a proposed activity may be instantly transmitted to the pre-frontal
regions as a picture in the goal association channel. Recognition triggers related feelings. One wishes to bring one's
knee up. A goal association picture would inform the pre-frontal regions of the outcome of this move. One's wish,
expressed as a feeling, faithfully triggers motor activity through an appropriate goal picture. The knee comes up
dependably. But what happens if one had this ridiculous wish while standing in a crowded lift on the way to the office?
The event recognition picture instantly transmits the impact of this move on a neighbour. The picture would trigger a
powerful feeling that it would be inappropriate. This feeling immediately triggers a goal picture which inhibits such
motor activity. When one sets out to do anything, one instantly knows of the social impact of that action. This
knowledge exists in the prefrontal regions. Evidently, if the pre-frontal regions are disconnected, such controls are
disconnected from the system, and one's activities lack judgement. This may also be the reason why an individual may
sense a lack of control. The conscious mind may reside in a region which has only an advisory role, while major
decisions are taken elsewhere.
Decision by the system. Let us consider the process that converts a judgement into a motor activity - the decision
making process. We don't merely respond to sensory inputs. Beyond mere recognition and evaluation, we have the
powerful ability to initiate, cause, activate, begin, create events. Who initiates all this activity? Is there a free will,
which is exercised by the individual to control his actions? This essay suggests that the consciousness drive continually
triggers activity in the feeling, awareness and goal channels. The most powerful indicator of a free will is demonstrated
by one's ability to move one's muscles, or to focus attention. This initiation may be only an automatic mechanism
which merely triggers the next highest priority activity of the system, while there is consciousness. The "initiation"
could merely be a switching process by the limbic system, which selects the most powerful feeling as the current motor
control option. That feeling becomes the will of the mind. The water balance in the body falls. A feeling of thirst is
triggered. The limbic system switches the feeling in as the highest system priority of the moment. A goal picture is
triggered. The cerebellum assists the cortical decision in a habitual trip to the water dispenser. A series of motor events
meet the goal. Thirst is quenched. A high level goal picture triggers a reminder of the "urgent" file demanding
attention. The next feeling arrives to trigger a quick trip back.
"Will" may be an illusion. One can focus attention wherever one wishes. It is an act obviously seen to be willed by an
individual. This process is controlled by the executive attention centre (EAC). In reality, the process may be the result
of an intuitive search. The creative process demands focus on new contexts to find solutions. An idea or object which
becomes the focus of attention may be contextually the most appropriate in the light of the current goal. The goal
channel may select the focus. Since it precisely meets one's objectives, one is deluded into believing that one "willed"
the focus of attention. Imagine a slave who is so sensitive to its master's needs that he meets these instantly. The master
may believe that his will controls the slave. The truth may be that the slave is voluntarily following the will of the
master. It is one of the key themes of this essay that a pattern recognition system can be so microscopically sensitive to
the demands of the nervous system that its need (will) becomes its command. This sensitivity may give one the illusion
that one is in command of one's body. It may be the equivalent of believing that one controls one's shadow.
Even animals are creative. A search process, which enables the mind to seek information to assist the achievement of
goals may be a powerful subconscious process. Konrad Lorenz (1972) describes a chimpanzee in a room (42) which
contains a banana suspended from the ceiling just out of reach, and a box elsewhere in the room. "The matter gave him
no peace, and he returned to it again. Then, suddenly - and there is no other way to describe it - his previously gloomy
face 'lit up'. His eyes now moved from the banana to the empty space beneath it on the ground, from this to the box,
then back to the space, and from there to the banana. The next moment he gave a cry of joy, and somersaulted over to
the box in sheer high spirits. Completely assured of his success, he pushed the box below the banana. No man watching
him could doubt the existence of a genuine 'Aha' experience in anthropoid apes". This brilliant insight implies that
creative effort is not necessarily a human prerogative, but an essential nervous system process existing in all animals.
Creativity as a pattern recognition process. The mind has the unique ability to question itself. What is to be its next
course of action to meet a particular goal? The act of selecting an option may be considered a genuine act of will. That
act may come from a feeling. An element of uncertainty precedes such a feeling. This is an interim subconscious period
of search. It is suggested that there may be an intelligent search process in the nervous system, which continually
evaluates alternative contexts against a visualised goal image. Goal pictures control ongoing motor activity. A
sequential test of all perceived contexts for an answer to the current objective may merely be another motor activity.
Instead of despatching sequential impulses to manage muscle movements, such impulses may manage a continuing test
of current context against current goals. Such testing may occur constantly in the subconscious, bringing on the "Aha!"
experience of discovery, when a set of imagined events is perceived to meet all the parameters required for achieving a
singular goal.
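Such a subconscious search can be caricatured as generate-and-test: imagined combinations of known actions are checked against a goal until a combination is "recognised" as meeting it. The actions and goal test below are invented to mirror the chimpanzee example.

    from itertools import permutations

    actions = ["push box under banana", "climb box", "reach up", "walk away"]

    def achieves_goal(plan):
        # The imagined sequence works only if the box is placed, then climbed,
        # then the banana is reached, in that order.
        try:
            return (plan.index("push box under banana")
                    < plan.index("climb box")
                    < plan.index("reach up"))
        except ValueError:
            return False

    for plan in permutations(actions, 3):
        if achieves_goal(plan):
            print("Aha!", plan)   # the first imagined sequence meeting the goal
            break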
Creativity from an algorithm. The old adage is that a computer can never be original, since it only spews out what
has been programmed into it. Computers follow algorithms. Creativity of the human mind has been the most powerful
argument against an algorithmic explanation of the mind. But, if a sophisticated computer could keep experimenting in
its memory with multitudes of combinations, with the goal of achieving a desired result, it could arrive at a new and
original solution. A computer can be programmed to "recognise" an "imagined" event which can achieve a specific
goal. The chimpanzee manipulated many images in its mind, chancing on the possibilities following the position of the
box below the banana. It instantly perceived the sequence of events which could achieve its goal. As of this
writing, the memory capabilities of computers and their capacity for manipulating images are woefully limited. Using
its massive memory based on experience, the mind may create myriad images in imagination. Some of these may link
in exotic combinations to create brand new inventions. If a prodigious memory and sensitive pattern recognition is
assumed for the human system, it may explain the development of imaginative and exciting concepts, products and
processes. An algorithmic (and intuitive) recognition process may be primary to this capability.
An Expert System Shell
Artificial Intelligence for accessing data. The use of the personal computer has become a world-wide phenomenon,
enabling people everywhere to improve the quality of their work. Initial applications focused on financial accounting,
word processing and spread sheets. Recently, the Internet opened opportunities to access information from computer
databases in a wide variety of fields. The use of key-words now enables people to locate topics of interest. But, in fields
where specialised words are used, the user needs to know the exact word to locate a subject. Expertise is essentially the
knowledge of the exact word that defines a problem - such as the name of a disease which exhibits a group of
symptoms. Expert systems can locate a problem from a description of such symptoms. They can play a major role in
assisting people to locate vitally needed information. But, expert systems should be fast and they should avoid stupid
questions.
A wide field of possibilities. Expert systems can assist millions of users to access key information regarding computer
software, which grows more complex by the day. The legal aspects of commercial activities cover taxation, company
law and constitutional law. Speedy access to particular case laws is a vital need for the legal profession. Computer
diagnosis of diseases can assist hospitals, general practitioners and students to find vital information in specialised
fields. Expert systems can guide staff in large organisations which have thousands of pages of manuals concerning
complex procedures. Diagnostics can assist in problems related to machinery and equipment. In all these fields,
existing manuals can be entered into expert systems, if only the process were fairly simple and straightforward.
Simplification of procedures. Traditional expert systems require knowledge engineers, who understand the logical
reasoning in a diagnostic session and can encode this logic into "If, then, else" rules. When the database is large,
questioning priorities may need to be supported by probability estimates of likely enquiries or heuristic assessment of
enquiry directions. Such rule-based systems also become complex and intractable when the size of the knowledge base
expands. This section describes an Expert System Shell based on the Intuitive Algorithm (IA). The IA shell requires
merely the categorised entry of data and the design of questions which can identify these categories. The shell isolates
categories, taking uncertainty into account - a question may or may not identify a particular category. The shell avoids
the perennial AI problem of asking stupid questions. The shell prioritises questions and produces answers based on the
IA elimination process.
General terminology. The Shell follows a certain terminology in its diagnostic processes. There are: Objects. Objects
have Properties. Properties suffer Alterations. Alterations are induced by Causes. The Relationship between Causes and
Alterations form Patterns. Causes, Alterations and the Patterns of their Relationships are stored in the memory of an
Expert System. Typical Applications: Object: Person. Property: Health. Alteration: Symptom. Cause: Disease.
Objective: Recognise Disease from an evaluation of Symptoms, using the Pattern of their Relationships. Similarly, an
Object could be a Legal Entity. Property: Freedom. Alteration: Civil Activity. Cause: Legislation, or Case Laws.
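This terminology maps naturally onto a small data model. The Python sketch below is an assumption about how such a shell might hold its data (the original shell's implementation language is not stated); the class and field names simply follow the terminology above.

    from dataclasses import dataclass, field

    @dataclass
    class Alteration:                # e.g. a Symptom
        name: str                    # the 20-character Alteration Name
        question: str                # the 80-character question put to the User
        description: str = ""        # the detail screen text

    @dataclass
    class Cause:                     # e.g. a Disease
        name: str                    # the 20-character Cause Name
        statement: str               # the 80-character identifying statement
        description: str = ""
        # Relationship of each Alteration to this Cause: "Y", "N" or "M"
        relations: dict = field(default_factory=dict)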
The Shell Program. An Expert inputs Knowledge into the Shell Program to create a User Program. The User inputs
Y/N answers to onscreen Alteration Queries which help to identify Causes. The general functions are as follows:
*Type Names. A 40 Character Alteration Type Name and Cause Type Name for data entry reference. For
a Medical Program: Alteration Type Name = Symptom. Cause Type Name = Disease. Further references
in the Program will be to Symptom and Disease.
*Alterations. A 20 Character Alteration Name. An 80 Character Question to User. Each screen holds 64
Alteration Entries, so that the Expert can have a global view of the questioning process. A 4000 Character
description screen permits the end user to obtain details concerning the question covered by the Alteration.
All data entry can be edited.
*Causes. A 20 Character Cause Name. An 80 Character Identifying Statement. Each screen holds 64
Cause entries. A 4000 Character description screen permits the end user to obtain details of the Cause. All
data entry can be edited.
*Hypertext. The Shell allows the Expert to create hypertext links between Causes, allowing the User to
search through the database, by clicking on highlighted words.
*Relationships. The Shell screen permits the entry of the Relationship between an Alteration and a Cause.
Yes/No/Maybe entries can be entered with a single keystroke. "Yes" is entered when the Alteration is
positively present for the Cause and absence of the Alteration clearly indicates absence of the Cause. "No"
is entered when the Alteration is absent for the Cause and presence of the Alteration indicates that this
Cause can be eliminated from further consideration. "Maybe" is entered when presence, or absence of the
Alteration does not indicate presence or absence of the Cause.
*Preparation of the expert system. The Shell is designed to enable the Expert to view the global range of
Causes and design Alteration questions which efficiently slice the matrix of Causes in multiple directions.
Other inputs include the Title of the Expert System, Introductory opening screens and Menu screens. Data
in the completed program is compressed and the program is compiled producing a .EXE file.
*User interaction. The User is presented with the option to carry out a word search, a menu search, or an
expert system search. The expert system choice presents the User with a sequence of questions, with
Yes/No/Skip options to arrive at a list of Probable Causes. The User can get further details of each
selected Cause to verify the diagnosis. The User can also backtrack the questioning process and alter the
Y/N/S entries.
*The process. An "Yes" answer eliminates all Causes which have been entered with a "No" relationship
to the Alteration question. A "No" answer eliminates all Causes which have been entered with a "Yes"
relationship to the Alteration question. The program chooses questioning priority by selecting Alteration
with highest number of "Y" relationships. The program also eliminates all Alteration questions, which
have "Y" relationships only to eliminated Causes. When there are less than 4 remaining Causes, the
program presents a list of Probable Causes.
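To make the elimination loop concrete, here is a minimal sketch of it in C. The Alteration/Cause matrix, the automatic "Yes" answers and all names are invented for illustration - the real shell reads its matrix from the Expert's entries and its answers from the User:

#include <stdio.h>

#define N_CAUSES 5
#define N_ALTS   4

typedef enum { NO, YES, MAYBE } Rel;

/* rel[alteration][cause]: the Expert's Yes/No/Maybe matrix (invented data). */
static const Rel rel[N_ALTS][N_CAUSES] = {
    { YES,   NO,  MAYBE, YES,   NO    },
    { NO,    YES, YES,   NO,    MAYBE },
    { YES,   YES, NO,    MAYBE, YES   },
    { MAYBE, NO,  YES,   YES,   NO    },
};

static int live_cause[N_CAUSES] = { 1, 1, 1, 1, 1 };
static int live_alt[N_ALTS]     = { 1, 1, 1, 1 };

/* Questioning priority: the live Alteration with the most "Y"
   relationships to still-live Causes slices the matrix best. */
static int pick_question(void)
{
    int best = -1, best_score = 0;
    for (int a = 0; a < N_ALTS; a++) {
        if (!live_alt[a]) continue;
        int score = 0;
        for (int c = 0; c < N_CAUSES; c++)
            if (live_cause[c] && rel[a][c] == YES) score++;
        if (score > best_score) { best_score = score; best = a; }
    }
    return best;    /* -1 when no useful question remains */
}

static void answer(int a, int yes)
{
    live_alt[a] = 0;
    for (int c = 0; c < N_CAUSES; c++) {
        /* "Yes" eliminates "No"-related Causes; "No" eliminates
           "Yes"-related ones; MAYBE never eliminates anything. */
        if (yes  && rel[a][c] == NO)  live_cause[c] = 0;
        if (!yes && rel[a][c] == YES) live_cause[c] = 0;
    }
    /* Avoid stupid questions: drop any Alteration whose "Y"
       relationships point only at eliminated Causes. */
    for (int q = 0; q < N_ALTS; q++) {
        int useful = 0;
        for (int c = 0; c < N_CAUSES; c++)
            if (live_cause[c] && rel[q][c] == YES) useful = 1;
        if (!useful) live_alt[q] = 0;
    }
}

int main(void)
{
    int q, remaining = N_CAUSES;
    /* Stop once fewer than 4 Causes remain, as in the text. */
    while (remaining >= 4 && (q = pick_question()) >= 0) {
        answer(q, 1);               /* pretend the User answers "Yes" */
        remaining = 0;
        for (int c = 0; c < N_CAUSES; c++) remaining += live_cause[c];
    }
    printf("Probable Causes:");
    for (int c = 0; c < N_CAUSES; c++)
        if (live_cause[c]) printf(" #%d", c);
    printf("\n");
    return 0;
}

Because MAYBE entries never eliminate anything, uncertainty is handled for free: each question only splits the Causes it is sure about and leaves the rest alone.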
Unlimited rules. Since it is not necessary to design complex reasoning chains, there is no theoretical limit to the size of
the database which can be handled by the IA system. Each Cause is eliminated based on a logical relationship. Such
logical relationships are entered as "rules" in a traditional expert system. While such systems become prone to error
when the number of rules exceeds a thousand, the IA system can accurately work with even a hundred thousand rules.
This opens the possibility of using AI in voluminous subjects which have never been attempted because of the
complexity of rule-based expert systems.
Uncertainty. An extremely powerful part of the program is its ability to handle questions which may or may not have
an impact on the outcome. A particular symptom may or may not be present for a disease. The program will still
eliminate those diseases which have a positive or negative impact, depending on the answer. In spite of the uncertainty,
the elimination still proceeds powerfully. The ability to deal logically with uncertainty is an exceptional feature, which is
not present in any other type of computer-based logic.
Stupid questions. If an answer clearly indicates the absence of a related disease, a further question which indicates the
disease is called a "stupid question". Traditional expert systems struggle with the problem of trying to avoid stupid
questions. In the IA system, when a Cause is eliminated, the program also eliminates any Alteration question which has
a "Y" relationship only to the Cause. So, the program will never ask a stupid question and the expert does not need to
design the program to cover this eventuality.
Commercial value and optimal size. Speedy access to data has a commercial value in all those areas where people
routinely use computers. Expert systems which use IA can provide a third level of help for commercial computer
programs. The experience of the author is that expert systems, which solve problems in other areas, require an optimal
size to be of value. They should not appear to be toys. Speedy access to all the information in a 400 page manual may
not enthuse users. They may consider such information to be basic. A 3000 page database may be considered more
useful. In India, Constitutional Law can be summed up in about 400 pages. Related case laws may cover 3000 pages. A
law practitioner may consider the extraction of a Constitutional Law Provision as too basic, but would value the
extraction of a related Case Law. An expert system may be planned only for areas of commercial value and should be
of optimal size.
Unutilised potential. AI researchers have tended to focus on the need for codification of knowledge from experts. But
in all commercially viable fields in today's world, expertise is already recorded in research papers, reference books and
manuals. It is more practical to design an expert system from published data and use the expert only to verify the
accuracy of the data and the acceptability of the questions. The lack of availability of a wide range of expert systems
for public use is a clear indication of the rule-size limitations, complexity and impracticality of current rule-based
expert systems. There is an urgent need for the use of practical AI solutions in thousands of areas for problems which
people encounter in their daily lives.
References
1.The nerve impulse is a sudden change in the permeability of the membrane to sodium ions. The sodium
ions carry a positive charge and displace the potassium ions, raising the voltage. This increased voltage is
carried through the axon in successive steps. The speed is barely .5 to 120 meters per second. A volley of
such nerve impulses are carried by the axon in a single direction only. The Oxford Companion to The
Mind, 1987, Richard L.Gregory, Nervous System, P.W.Nathan, Page 517.
2.Experiments by Karl Lashley in the 1940s showed that the skills learned by rats in maze running could
not be obliterated by removal of particular cortical areas. The results of such ablations were generalised
deficits proportional to the amount and not the region of the cortex removed. The Oxford Companion to
The Mind, 1987, Richard L.Gregory, Memory: Biological Basis, Steven Rose, Page 458.
3.If the touch of a single hair is critical information, all surrounding sensory inputs are shut off to highlight
the message. Similar automatic emphasising of contrasts takes place for visual, auditory and other sensory
inputs. The brain actively participates in closing off irrelevant sensory inputs. Gray's Anatomy, 1989, 37th
Edition, Neural Organisation, Inhibitory Circuits, Page 865.
4.The visual system categorises the perceived images in terms of edges, orientation of lines, and even in
terms of isolation of moving lines. Human Neuroanatomy, 1975, 6th Edition, Raymond C. Truex and
Malcolm B. Carpenter, The Cerebral Cortex, The Primary Visual Area, Page 562-565.
5.The average nerve cell responds within about 5 milliseconds of receiving a message. Gray's Anatomy,
1989, 37th Edition, Physiological Properties of Neurons, Page 878-879.
6.Current understanding is that there is a step by step conversion of dendritic input impulses into output
impulses by a nerve cell. According to this understanding, a neuron has a resting voltage of about 80 mV,
inside negative. This resting voltage can change gradually, by "graded potentials" or suddenly, through
"action potentials". Gradual changes occur across membranes of dendrites, and the cell body. Such
changes can go up, or down. They can inhibit the cell, or trigger an impulse from it. Action potentials
reverse polarity across the membranes of axons. It is an all-or-none response, completed in about 5
milliseconds. Once initiated, the action potential spreads rapidly down the axon. Action potentials travel as impulses,
maintaining a specific frequency. Gray's Anatomy, 1989, 37th Edition, Physiological Properties of
Neurons, Page 879.
7."Of the numerous synaptic terminals clustered on dendrites and soma of a multipolar neuron, some are
excitatory while those from other sources are inhibitory. Depending on the activity or quiescence of such
sources, the ratio of active excitatory and inhibitory synapses continuously varies. Their effects
summate..........., an action potential is generated and spreads along the axon as a nerve impulse." Gray's
Anatomy, 1989, 37th Edition, Neural Organisation, Neurons, Page 864.
8.There are receptors for pressure, touch, pulling and stretching. There's even one to detect hair
movement. Peritrichial receptors are cage-like formations that surround hair follicles. A single axon
receives data from many hair follicles and each follicle reports to two to twenty axons. Some receptor
branches encircle the follicle and others run parallel to its long axis. Nociceptors are free nerve endings
which convert energy from substances released by damaged cells into pain impulses. The Human Nervous
System, 1983, 4th Edition, Murray L. Barr and John A. Kiernan, Introduction and Neurohistology,
Peripheral Nervous System, Cutaneous Sensory Endings, Physiological Correlates, Page 37.
9.Careful stimulation of the proper motor areas can invoke flexion or extension at a single finger joint,
twitching at the corners of the mouth, elevation of the palate, protrusion of the tongue, and even
involuntary cries or exclamations. Human Neuroanatomy, 1975, 6th Edition, Raymond C. Truex and
Malcolm B. Carpenter, The Cerebral Cortex, Efferent Cortical Areas, The Primary Motor Area, Page 571.
10."Everyone knows what attention is. It is the taking possession by the mind, in clear and vivid form, of
one out of what seem several simultaneous objects or trains of thought. Focalisation, concentration of
consciousness are its essence". The Principles of Psychology, 1890, William James. Quoted in: In the
Theater of Consciousness, 1997, Bernard J. Baars, Page 95.
11."In landmark work using cognitive and brain imaging techniques, Michael Posner and his coworkers
recently discovered a network of brain centres involved in visual and executive attention". In the Theater
of Consciousness, 1997, Bernard J. Baars, Page 100.
12."Little is known about the physiology of memory storage in the brain. Some researchers suggest that
memories are stored at specific sites, and others that memories involve widespread brain regions working
together; both processes may be involved". "Memory," Microsoft Encarta 97 Encyclopedia.
13.Long-term potentiation (LTP) is "the enduring facilitation of synaptic transmission that occurs
following the activation of a synapse by high-frequency stimulation of the presynaptic neuron." This
phenomenon (LTP) has been found to occur in the mammalian hippocampus. Researchers believe the
hippocampus to be one of the major brain regions responsible for processing memories. Pinel, J. (1993).
Biopsychology (2nd Edition), Allyn & Bacon: Toronto.
14.In the early periods of evolution, "Nosebrains" dominated decision making systems of lower
vertebrates. The smell of an object decided whether it was edible and could be consumed. If the odour was
wrong, it was inedible and had to be avoided. The Human Nervous System 1983, 4th Edition, Murray L.
Barr and John A. Kiernan, Introduction and Neurohistology, Telencephalon, Page 8.
15.In the late 1920s, W.B.Cannon published a paper which suggested that emotional behaviour was still
present when the viscera was surgically or accidentally isolated from the central nervous system. Different
emotions had similar patterns of visceral responses. Perceptions of visceral responses were non-specific.
Emotional responses were far quicker than visceral responses. Emotions do not follow artificial
stimulation of visceral responses as a matter of course. The Oxford Companion to The Mind, 1987,
Richard L.Gregory, Emotion, George Mandler, Pages 219-220.
16.Scar tissue in the cerebral cortex is one of the causes of epilepsy. When operating to remove the scar
tissue, the surgeon has to stimulate the brain electrically on the conscious patient to locate the problem
area. Excitation of certain parts of the temporal lobe produces intense fear in the patient. Other parts cause
feelings of isolation, of loneliness or sometimes of disgust. The Oxford Companion to The Mind, 1987,
Richard L.Gregory, Nervous System, P.W.Nathan, Page 527.
17.The septal area has been shown to be a pleasure zone for rats. Experiments were conducted on the
animals with electrodes planted in this area, where they could stimulate themselves by pressing on a
lever. They were observed to continue until they were exhausted, preferring the effect of stimulation to
normally pleasurable activities such as consuming food. The Oxford Companion to The Mind, 1987,
Richard L.Gregory, Centers in The Brain, O.L.Zangwill, Page 129.
18.The limbic system of the brain contains a ring of interconnected neurons containing over a million
fibres connecting the thalamus, the hippocampus, the septal areas and the amygdaloid body. The ring
transmits impulses in both directions. In 1937 Papez postulated that these parts of the brain constitute a
harmonious mechanism which may elaborate functions of central emotion as well as participate in
emotional expression. Bilateral removal of the hippocampal formation and amygdaloid bodies in monkeys
is followed by docility and lack of emotional responses such as fear or anger. The Human Nervous
System, 1983, 4th Edition, Murray L. Barr and John A. Kiernan, Regional Anatomy of the Central
Nervous System, Circuits of the Limbic System, Page 268.
19.The current understanding of medical experts is that the limbic system is intimately
involved in seeking and capturing prey, courtship, mating, rearing of young, subjective and expressive
elements in emotional responses and the balance between aggressive and communal behaviour. Gray's
Anatomy, 1989, 37th Edition, The Limbic Lobe and Olfactory Pathways, Page 1028.
20."The total number of rods in the human retina has been estimated at 110-125 million and of the cones
at 6.3-6.8 million (Osterberg 1935)." Gray's Anatomy, 1989, 37th Edition, The Visual Apparatus, Page
1197.
21.When mapping activity in the cerebral cortex, the tones heard by the ear were noted to be processed
within a region of the cortex called the Heschl gyrus. This auditory area of the brain receives fibres from
the medial geniculate nucleus in the thalamus. There is a spatial representation in the auditory area with
respect to the pitch of sounds. Tones of different pitch or frequency produce brain signals at measurably
different locations within the Heschl gyrus. It was laid out like a piano keyboard. A report by
Dr.Christopher Gallen of the Scripps Clinic in La Jolla, California.
22.A study in 1959 by Powell and Mountcastle indicated that a vertical column of cells extending across
all cellular layers in the somatic sensory cortex constitutes the elementary functional cortical unit. The
columns form a barrel, varying in diameter from 200 microns to 500 microns, with a height equal to the
thickness of the cortex. Neurons of one barrel are related to the same receptive field, are activated as a rule
by the same peripheral stimulus and all the cells of a vertical column discharge at more or less the same
latency following a brief peripheral stimulus. A barrel represents a piece of the cortex activated by a single
axon from one of the specific thalamic nuclei. Similar barrels also exist for associate and commisural
fibres, which transfer information between different regions of the cortex. Human Neuroanatomy, 1975,
6th Edition, Raymond C. Truex and Malcolm B. Carpenter, The Cerebral Cortex, Sensory Areas of the
Cerebral Cortex, Page 555-556. The Human Nervous System, 1983, 4th Edition, Murray L. Barr and John
A. Kiernan, Regional Anatomy of the Central Nervous System, Histology of the Cerebral Cortex,
Intracortical Circuits, Page 228.
23.In the early forties, Dempsey and Morison reported that repeated electrical stimuli into the
"non-specific" nuclei of the thalamus resulted in widespread activity in the outermost cortical layers. The
activity appeared to be of a "recruiting" nature. In 1960 Jasper again suggested that the synaptic
termination of the fibres of the "non-specific" system in the cortex travels parallel to the surface and is
widely distributed in all layers, but the principal functional processes appear to be within the outermost
layers. Human Neuroanatomy, 1975, 6th Edition, Raymond C. Truex and Malcolm B. Carpenter, The
Cerebral Cortex, Nonspecific Thalamocortical Relationships, Page 582-584.
24.Stephen Kosslyn and Martha Farah have shown extensively that visual imagery elicits activity in the
same parts of the cortex as visual perception (Kosslyn, 1980). In the Theater of Consciousness, 1997,
Bernard J. Baars, Page 74.
25.Throughout the growth of the nervous system, axons grow from one region to another and "map" on to
specific target regions. The Oxford Companion to The Mind, 1987, Richard L.Gregory, Brain
Development, Colwyn Trevarthen, Pages 101-110.
26.The information proceeds from primary areas of the cortex to secondary areas which co-ordinate the
information from similar sensory receptors in the other half of the body. Neurons in the primary areas
connect only to the secondary areas. All secondary areas in both hemispheres of the brain are
interconnected. These areas assist binocular vision and stereo-phonic sound. The association areas receive
information from every other secondary sensory region. The Human Nervous System, 1983, 4th Edition,
Murray L. Barr and John A. Kiernan, Regional Anatomy of the Central Nervous System, Medullary
Center, Internal Capsule and Lateral Ventricles, Medullary Center, Page 242.
27. All sensory inputs are first received in the primary somesthetic area. Electrical stimulation of this area
gives modified tactile senses, such as tingling, or numb sensations. If this area gets damaged, the related
sensory inputs cannot be felt. If the somesthetic area is intact and there is damage in the somesthetic
association area, awareness of general senses persists but significance of information with reference to
previous experience is elusive. It is impossible to correlate the surface texture, shape, size, and weight of
the object or to compare the sensations with previous experience. A patient is unable to identify a common
object such as a pair of scissors held in the hand while his eyes are closed. The Human Nervous System,
1983, 4th Edition, Murray L. Barr and John A. Kiernan, Regional Anatomy of the Central Nervous
System, Functional Localisation in the Cerebral Cortex, The Somesthetic Association Cortex, Page 232-233.
28.Each of the 30,000 motor neurons, which control motor activity, receives approximately 20,000
synaptic contacts. The greatest number are from interneurons in the spinal tract. They run up and down the
spinal pathway and synapse with the motor neurons. The Human Nervous System, 1983, 4th Edition,
Murray L. Barr and John A. Kiernan, Regional Anatomy of the Central Nervous System, Spinal Cord,
Ventral Horn, Page 71.
29.Situated in the brain stem, the reticular formation is an early predecessor to the brain. The reticular
formation is the recipient of data from most of the sensory systems. While damage to most other regions
of the brain causes only selective defects, serious damage to the reticular formation results in prolonged
coma. Cutaneous and olfactory stimuli to the reticular formation appear to be especially important in
maintaining consciousness. The latter stimuli may be the reason for the success of smelling salts in
restoring a person from a fainting fit. Experimental results show that electrical stimulation of the reticular
formation can also induce sleep in animals. While there are processes in the reticular formation which
raise the level of consciousness and alertness, there may be a co-existing process that induces sleep. The
Human Nervous System, 1983, 4th Edition, Murray L. Barr and John A. Kiernan, Regional Anatomy of
the Central Nervous System, Reticular Formation, Page 145, 152.
30.Medical research confirms that the cerebellum is "necessary for smooth, co-ordinated, effective
movement". Gray's Anatomy, 1989, 37th Edition, Cerebellar Dysfunction, Page 978.
31.Terminations of movements are affected by damage to the cerebellum. For a normal person, when the
elbow is made to flex against resistance and the arm is released suddenly, contraction of opposing muscle
fibres prevents overflexion. In cerebellar disease, flexion is uncontrolled and the patient may hit himself in
the face or chest. This is called the "rebound phenomenon". With cerebellar problems, the patient converts
a movement which requires simultaneous actions at several joints into a series of movements, each
involving a single joint. When asked to touch his nose, with a finger raised above his head, the patient will
first lower his arm and then flex the elbow to reach his nose. This problem is called "decomposition of
movement". Human Neuroanatomy, 1975, 6th Edition, Raymond C. Truex and Malcolm B. Carpenter,
The Cerebellum - Functional Considerations, Page 434.
32.Each half of the body is represented in the cerebellar cortex. The cerebellum has an arrangement that
represents all motor control functions spread over its cortical layer, with topographic precision.
Researchers have mapped out localised areas on the cerebellar cortex for the control of leg, arm and facial
movements which they found were identical with tactile receiving areas. Motor and sensory functions
were integrated in the cerebellum. Human Neuroanatomy, 1975, 6th Edition, Raymond C. Truex and
Malcolm B. Carpenter, The Cerebellum - Functional Considerations, Page 439.
33.The only fibres leaving the cerebellar cortex are the axons of a specialised group of neurons called the
Purkinje cells. The Human Nervous System, 1983, 4th Edition, Murray L. Barr and John A. Kiernan,
Regional Anatomy of the Central Nervous System - Cerebellum, Gross Anatomy, Cerebellar Cortex,
Cortical Layers, Page 159.
34.In 1967, V.Braitenberg suggested the possibility of control of sequential events by the cerebellum.
These neural relationships appear to create, in the cerebellum, an accurate biological clock. Impulses in
fibres which link successive Purkinje cells, reach the cell dendrites at intervals of about one
ten-thousandth of a second. Alternate parallel rows of Purkinje cells are excited, while the in-between rows
are inhibited. Gray's Anatomy, 1989, 37th Edition, Mechanisms of the Cerebellar Cortex, Page 974.
35.The inferior olivary complex is the source of climbing fibres to all regions of the cerebellar cortex. In
1940 Brodal noted that in young cats and rabbits, all regions of the cerebellar cortex receive exquisitely
marked out projections from the olivary nucleus. Destruction of this olivary neuron branch to the
cerebellar cortex results in severe loss of co-ordination of all movements. Such damage appears to cause
problems very similar to those caused by damage to the cerebellum, even though this bundle of nerves is
only one of the many nerve tracts connecting the cerebellum. Human Neuroanatomy, 1975, 6th Edition,
Raymond C. Truex and Malcolm B. Carpenter, The Cerebellum, Olivocerebellar Fibers, Page 422.
36.Sensory events occurring within a tenth of a second merge into a single conscious sensory experience,
suggesting a 100-millisecond scale. But working memory, the domain in which we talk to ourselves or use
our visual imagination, stretches over roughly 10 second steps. In the Theater of Consciousness, 1997,
Bernard J. Baars, Page 48.
37.Mozart, Wolfgang Amadeus. (Based on his quotation in Hadamard 1945, Page 16). Taken from The
Emperor's New Mind, 1989, Roger Penrose, Page 547.
38.The Principles of Psychology, 1890, William James. Quoted in: In the Theater of Consciousness, 1997,
Bernard J. Baars, Page 130.
39.Homeostasis is the naturally maintained, relatively constant state within the body, maintained in a
changeable environment. It is brought about by various sensing, feedback and control systems, supervised
by a hierarchy of control centres. The frontal cortex, limbic system, hypothalamus, reticular formation and
spinal cord constitute some of the components of this hierarchy. The concept that these centres mediate
these controls is based on a wide base of experimental evidence gathered by studying the impact of
destruction of localised topographical targets in animals. As higher levels are included with the spinal cord
below the cut off section, more effective controls are retained. With transection below the hypothalamus,
minor reflex adjustments of cardiovascular, respiratory and alimentary systems survive, but are not
integrated and normal temperature is not maintained. With transection above the hypothalamus, separating
it from the limbic system, effective controls are maintained within a moderate range of conditions. Innate
drives and motivated behaviour are preserved, including feeding, drinking, apparent satiation, and
copulatory responses. But such controls fail if environmental stresses exceed a certain range e.g.,
persistently high or low temperatures. Animals may attack, try to eat, drink or copulate with inappropriate
objects. But if the connections between the limbic system and the hypothalamus survive and only the
frontal cortex is cut off, normal homeostasis is preserved even in a wide range of adverse conditions.
Gray's Anatomy, 1989, 37th Edition, Functions of the Hypothalamus, Page 1011.
40.The prefrontal area forms a part of the frontal lobe of the cortex including much of the frontal gyri,
orbital gyri, most of the medial frontal gyrus and the anterior part of the cingulate gyrus. While all other
regions of the cortex communicate mostly within finite regions, the prefrontal lobe has abundant
connections with the association cortex of the three sensory lobes. Human Neuroanatomy, 1975, 6th
Edition, Raymond C. Truex and Malcolm B. Carpenter, The Cerebral Cortex, Prefrontal Cortex, Page 587.
41.Medical evidence suggests that patients with extensive frontal lobe damage show disregard for the
general tenets of behaviour and a marked lack of concentration. Some years ago, a procedure called
prefrontal lobotomy, or leucotomy was widely used, either for patients with intractable pain or in attempts
to modify the behaviour of severely psychotic patients. The basic operation disconnected the prefrontal
area from the lower regions by cutting its nerve fibre connections. Many institutionalised patients were
able to return home and even to resume their former activities. The results of these operations were
evaluated in a number of publications. While there was abolition of morbid anxiety and obsessional states,
Freeman and Watts noted a lessening of the consciousness of self. The patients were "tactless and
unconcerned in social relationships, with an inability to maintain a responsible attitude". Human
Neuroanatomy, 1975, 6th Edition, Raymond C. Truex and Malcolm B. Carpenter, The Cerebral Cortex,
Prefrontal Cortex, Page 588.
42.Lorenz, Konrad, 1972. As quoted in The Emperor's New Mind, 1989, Roger Penrose, Page 551.

Date this article was posted to GameDev.net: 10/7/1999

The Natural Mind: Consciousness and Self-Awareness
Science's biggest mystery is the nature of consciousness. It is not that we possess bad or imperfect theories of human awareness; we
simply have no such theories at all.
- Nick Herbert

Consciousness: Subjective Experiences and Neural Firings


Third-person consciousness, according to David Chalmers, can be likened to a characteristic that allows complex systems to scan their
own processing. This is quite a straightforward definition. Existing self-diagnosing machines technically share this aspect of
consciousness. Do such machines really have consciousness? Well, if they do, at most this phenomenon occurs at a very
rudimentary level.
Have you ever wondered why blue books look blue? Or why things look like anything at all? This is the problem of sensory qualia.
Qualia are qualitative aspects of our mental states, which can include taste, smell, touch, pleasure, etc. Physically speaking, when we
look at a blue book, our brain triggers a pattern of nerve firings which allows us to see the book. If looking at objects can be reduced
to patterns of nerve firings, then why does the experience of looking at a book seem so subjective? A scientific look at nerve
firings and patterns will help us better understand qualia and subjective experience.
Well, what of subjective experiences? Why do we experience anything at all? Could we carry out daily activities without
these experiences? Are our physical actions the only things that matter? The epiphenomenalists certainly think so: they believe
subjective experiences are of no importance and only physical processes count. Subjective states are "epiphenomenal"; they are of no
consequence. Zombies are beings that carry out physical processes without possessing any subjective mental states. Beyond what
may live in our memories from horror movies, zombies do not exist - but could they? Certainly, if we were to create an android devoid of
subjective experiences, it could be classified as a zombie. However, many scientists believe machines are capable of experiencing
subjective mental states. If we wish to endow machines with subjective experience, we must first discover the engine that creates
consciousness and self-awareness.

Pattern and Information


● Pattern and information always occur together; both are aspects of an entity we will label pattern-information.
● All information is carried by a pattern in the physical world.
● All patterns carry some information.
Patterns always come with information. Even obfuscatory patterns that cannot be read or easily discerned include information.
There can be infinitely many "types" of information extracted from any arbitrary pattern. The kind of information
that an agent (such as us) can extract from an essay, for example, depends on how it is processed. We could read the essay and
summarize it, or we could simply count the number of words in it. Information can come in words, numbers, almost anything.
Suppose we take another pattern, this time composed of a bunch of nonsense scribbling on a wall. It doesn't say or mean
anything; however, there is still information that can be extracted from that scribbling. We could, for example, analyze the hue of the
scribbling and come up with a color, say red. Or we could calculate what percentage of the wall is covered. Thus there are very many
processes that exist for extracting information given a pattern. Information is only relative to the choice of process.
We recall that, objectively speaking, third-person mental events are patterns of neural firings in the brain. So if we are given patterns
of neural firings, where can the information be found? Qualia are just information from our subjective experience. No doubt, if an
agent undergoes subjective experiences, then it is conscious. Thus the claim that consciousness arises from patterns of neural firings
can be substantiated.

Conclusion
Like David Chalmers, Donald Griffin believes that consciousness results from patterns of activity involving thousands or millions of
neurons. Perhaps this claim is ambiguous, but it is a pretty good lead. If we modeled this neural activity on a machine, could it then
be conscious - at least to some degree? We will have to wait and see.

Turing Machines
During the 1930s-1950s, many researchers debated what was computable and what wasn't, arguing over formal
approaches to computability. In 1936, Alan Turing, a British mathematician now considered the father of computing and
artificial intelligence, sought an answer to this dilemma. He constructed the theory of a Turing machine. His thesis (the
Church-Turing thesis) states that
Any effective procedure (or algorithm) can be implemented through a Turing machine.

So what are Turing machines? Turing machines are abstract mathematical entities composed of a tape, a read-write head, and
a finite-state machine. The head, basically an input-output device, can either read or write symbols on the tape, and can
change its position by moving left or right. The finite-state machine is a memory/central processor that keeps track of which of
finitely many states it is currently in. Knowing its current state, the finite-state machine can determine which state to
change to next, what symbol to write onto the tape, and which direction the head should move (left or right). (Note: the tape is
assumed to be as large as necessary for the current computation.) Input on the tape is drawn from some finite alphabet (in this
case 0, 1, and blank). Thus, the Turing machine can do three possible things.
1. It can write a new symbol at its current position on the tape.
2. It can assume a new state.
3. It can move the position of the head one position to either the left or the right.
This machine is (by the Church-Turing thesis) capable of making any computation. This is not a provable theorem (though it has yet
to be disproved), nor a strictly formal definition; the Church-Turing thesis is based on our intuition of what computation is about.
By understanding what Turing machines can compute, we can also gain a better grasp of the potential of production systems for
computing.
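To make this concrete, here is a minimal Turing machine simulator in C - a sketch, not a definitive implementation. The example transition table increments a binary number on the tape; the state names and rule layout are my own invention:

#include <stdio.h>
#include <string.h>

#define TAPE_LEN 32    /* "as large as is necessary" for this computation */
#define BLANK    'B'

/* One transition: in `state` reading `read`, write `write`,
   move the head (+1 right, -1 left, 0 stay) and enter `next`. */
typedef struct {
    int  state; char read;
    char write; int move; int next;
} Rule;

enum { SCAN = 0, CARRY = 1, HALT = 2 };

static const Rule rules[] = {
    { SCAN,  '0',   '0',   +1, SCAN  },   /* scan right to the end */
    { SCAN,  '1',   '1',   +1, SCAN  },
    { SCAN,  BLANK, BLANK, -1, CARRY },
    { CARRY, '1',   '0',   -1, CARRY },   /* 1 -> 0, keep carrying */
    { CARRY, '0',   '1',    0, HALT  },   /* 0 -> 1, done */
    { CARRY, BLANK, '1',    0, HALT  },   /* overflow: new leading 1 */
};

int main(void)
{
    char tape[TAPE_LEN];
    memset(tape, BLANK, sizeof tape);
    memcpy(tape + 1, "1011", 4);          /* the number 11 in binary */
    int head = 1, state = SCAN;

    while (state != HALT)                 /* apply the matching rule */
        for (size_t i = 0; i < sizeof rules / sizeof rules[0]; i++)
            if (rules[i].state == state && rules[i].read == tape[head]) {
                tape[head] = rules[i].write;
                head += rules[i].move;
                state = rules[i].next;
                break;
            }

    printf("Result: ");
    for (int i = 1; tape[i] != BLANK; i++) putchar(tape[i]);
    putchar('\n');                        /* prints 1100, i.e. 12 */
    return 0;
}

Note that the rule table is really just a list of productions - "in this state, reading this symbol, do that" - which is exactly the observation the next section builds on.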

Production Systems and Turing Machines


If a production system can emulate a Turing machine, then by the Church-Turing thesis it can make any computation.
Production systems run on digital computers, so if that statement is true, then today's digital computers can make any
computation.
To recap, a production system contains a control structure, a knowledge base and a global database. From this you can probably infer
how such an emulation would be possible. The position of the head, the symbols written onto the tape,
and their relative positions would be stored in the global database of the production system. (The global database is assumed
to be as large as necessary to complete the computation.) The production rules can be generalized as: for every given state and
symbol (from the Turing machine), do one of the three possible actions (move head, change state, write symbol). For example:
if reading symbol 1 and in state A, write symbol 0, and so on. As many production rules as needed would be produced. The
control structure would then choose whichever rule satisfies the current state of the system (applying this to the last example, if
the system was reading symbol 1 and was in state A, then that production rule would be applied).
With this in mind, we have shown that production systems can do what Turing machines do, including computing
anything that is computable. We can conclude that production systems are powerful computing tools.


AI in Gaming
Artificial Intelligence in games is slowly getting better. With the advent of games like Half-Life and Unreal, even the notoriously
dumb AI engines in first-person shooters are gradually getting more and more intelligent! Is it due to neglect that games have taken
so long to get half-intelligent enemies? Perhaps, but it is also due to the incredible complexity of advanced AI engines, which has put
many programming groups off investing the effort and research to create one. This essay deals with the techniques often used in
AI engines in games, and possible uses for other paradigms in AI.

Finite State Machines


Despite the rather technical-sounding term, finite state machines (FSMs) are the simplest AI engines. They are often used in
first-person shooters such as Doom, Quake and Quake2. The enemies have about 8 states that they can be assigned, each state having
its own behaviour and its own trigger.
Quake2 is a good example to look at, since most people have played it. Quake2 uses 9 different states: standing, walking, running,
dodging, attacking, melee, seeing the enemy, idle and searching. For the programmers, here is a bit of code from Quake2 that shows
how the monsters check whether they should change to their dodge state:
static void check_dodge (edict_t *self, vec3_t start, vec3_t dir, int speed)
{
    vec3_t  end;
    vec3_t  v;
    trace_t tr;
    float   eta;

    /* Project the shot vector well ahead of the firing entity. */
    VectorMA (start, 6144, dir, end); // JM: Lengthened vector.

    /* Trace along the shot; if it would hit a live monster that has a
       dodge handler and can see the shooter, tell it how long it has. */
    tr = gi.trace (start, NULL, NULL, end, self, MASK_SHOT);
    if ((tr.ent) && (tr.ent->svflags & SVF_MONSTER) && (tr.ent->health > 0)
        && (tr.ent->monsterinfo.dodge) && infront(tr.ent, self))
    {
        VectorSubtract (tr.endpos, start, v);
        eta = (VectorLength(v) - tr.ent->maxs[0]) / speed;   /* estimated time of arrival */
        tr.ent->monsterinfo.dodge (tr.ent, self, eta);       /* switch to the dodge state */
    }
}
Note that this comes from a Quake2 mod I created, so I've changed the code slightly. The states in a game might be connected
together to form actions. For example, to attack in Quake2, the states go from IDLE to RUN (to a point closer to the player); once
the point has been reached, the state switches to ATTACK. The states in a game, and the triggers between them, can often be
represented by a simple flow diagram.
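In code, such a state machine reduces to a switch over the current state. Here is a minimal runnable sketch in C, with invented state names and stub sensor functions standing in for real engine queries:

#include <stdio.h>
#include <stdlib.h>

typedef enum { IDLE, RUN, ATTACK, DODGE } MonsterState;
static const char *names[] = { "IDLE", "RUN", "ATTACK", "DODGE" };

/* Stub sensors -- a real engine would query the game world instead. */
static int player_visible(void)      { return rand() % 4 != 0; }
static int in_attack_range(void)     { return rand() % 3 == 0; }
static int projectile_incoming(void) { return rand() % 5 == 0; }

/* One tick of the FSM: each state checks its own triggers and
   returns the next state, exactly like the flow diagram. */
static MonsterState monster_think(MonsterState s)
{
    switch (s) {
    case IDLE:   return player_visible() ? RUN : IDLE;
    case RUN:    if (projectile_incoming()) return DODGE;
                 return in_attack_range() ? ATTACK : RUN;
    case ATTACK: return in_attack_range() ? ATTACK : RUN;
    case DODGE:  return RUN;            /* after dodging, resume the chase */
    }
    return IDLE;
}

int main(void)
{
    MonsterState s = IDLE;
    for (int tick = 0; tick < 10; tick++) {
        printf("tick %d: %s\n", tick, names[s]);
        s = monster_think(s);
    }
    return 0;
}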


Finite-state machines are a good way of creating a quick, simple, and sufficient AI model for the games they are incorporated in. The
"fun-factor" in first-person shooters comes from the sheer numbers of enemies, combined with (in modern ones) stunning 3D
graphics. The new first-person shooters allow for add-ons, like bots. A lot of bots incorporate much more advanced AI algorithms,
like A* pathfinding (run over information they've dynamically learnt about the level), and better dodging, jumping and
general fighting code.
There is also a slight derivative of the state-based engine, used in more complicated games like flight simulators and games like
MechWarrior. These use goal-based engines - each entity within the game is assigned a certain goal, be it 'protect base', 'attack
bridge' or 'fly in circles'. As the game world changes, so do the goals of the various entities.

Minimax Trees and Alpha-Beta Pruning


Moving on to another genre of games completely - board games. Board gaming AI has received a huge amount of publicity since the
famous chess match between Deep Blue (IBM's master chess computer) and Kasparov - the first time a chess world champion had
been beaten by a machine.
Games like chess, checkers, Pente, and Go require a great deal of thinking ahead, predicting what moves the opponent might pick, and
how to counter them. This is where minimax trees come in - the goal of a minimax tree is to minimize the opponent's maximum
gain.
A board can be represented as a huge tree of moves, starting with a blank board as the root, then branching off with all the possible
first moves, all of which in turn branch, until a winning state (or draw) is achieved. Yet creating an entire tree of ALL moves would
be impossible on current computers - even a simple game could require around a million nodes. So games like chess and Go are
definitely impossible to represent as a complete tree. Therefore, the algorithms only generate trees 5-10 layers deep.
Looking ahead uses the assumption that the computer and its opponent use the same set of heuristics (rules) to determine
which move to make. This is a rather tenuous assumption, especially when playing against humans (and it is
often these heuristics that make or break an AI engine). So, the computer generates all possible board positions from the current one,
for about 5-10 moves ahead. It then sets about evaluating the nodes, assigning values to them according to the
"decency" of each move (again, the heuristic involved in this process can have a huge impact on the game). It then uses these values to
determine which path to take so as to minimize its opponent's best outcome.
Just as real trees can grow out of control and need to be pruned, so do minimax trees! Even when only evaluating a few moves ahead,
in a complicated board game such as chess, these trees can be immense, and take a long time to generate and evaluate. Therefore,
pruning is used to ensure time isn't wasted evaluating pointless nodes. Alpha-beta pruning does exactly this: it abandons any branch
that cannot improve on a result already guaranteed elsewhere in the tree.
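Here is a minimal sketch of minimax with alpha-beta pruning in C. It searches a hard-coded, uniform game tree rather than a real board, and the leaf evaluation is an arbitrary stand-in for a board-scoring heuristic:

#include <stdio.h>
#include <limits.h>

#define BRANCH 3    /* moves per position */
#define DEPTH  4    /* plies searched ahead */

/* Stand-in for a board evaluation heuristic; a real engine would
   score material, position, etc. Here a "board" is just a node id. */
static int evaluate(int node)
{
    return (node * 37) % 21 - 10;   /* arbitrary scores in [-10, 10] */
}

/* Returns the best score for the side to move. [alpha, beta] is the
   window of scores still worth exploring; once alpha >= beta, the
   remaining siblings cannot affect the result and are pruned. */
static int alphabeta(int node, int depth, int alpha, int beta, int maximizing)
{
    if (depth == 0)
        return evaluate(node);

    for (int move = 0; move < BRANCH; move++) {
        int child = node * BRANCH + move + 1;
        int score = alphabeta(child, depth - 1, alpha, beta, !maximizing);
        if (maximizing  && score > alpha) alpha = score;
        if (!maximizing && score < beta)  beta  = score;
        if (alpha >= beta)
            break;                  /* cutoff: prune this subtree */
    }
    return maximizing ? alpha : beta;
}

int main(void)
{
    int best = alphabeta(0, DEPTH, INT_MIN, INT_MAX, 1);
    printf("Best achievable score looking %d plies ahead: %d\n", DEPTH, best);
    return 0;
}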

Possible AI Applications
AI techniques such as genetic algorithms and neural networks can be applied to gaming, and are increasingly making their
way into some AI engines. Generation5 has interviewed both Steve Grand (the lead programmer of Creatures, a program that
utilizes both neural networks and genetic emulation) and Andre LaMothe (a famous computer programmer and author) about their AI
programming methods.

Genetic Algorithms
Genetic algorithms are excellent at searching very large problem spaces, and also at evolutionary development. For example, an
idea I was going to implement was to create a large structure of possible traits for Quake II monsters (aggressiveness, probability of
running away when low on health, probability of running away when few in numbers, etc.), then use a genetic algorithm to find the
best combination of these traits to beat the player. So, the player would go through a small level, and at the end the program
would pick the monsters that fared the best against the player, and use those in the next generation. Slowly, after a lot of playing,
some reasonable characteristics would be evolved.
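A toy version of this idea might look like the following C sketch. The trait set, the stand-in fitness function and the GA parameters are all invented; in the real scheme, fitness would come from how each monster actually fares against the player:

#include <stdio.h>
#include <stdlib.h>

#define POP  16
#define GENS 50

typedef struct {            /* one monster's "genome" */
    float aggression;       /* all traits in 0..1 */
    float flee_low_health;
    float flee_outnumbered;
} Traits;

static float frand(void) { return (float)rand() / RAND_MAX; }

/* Stand-in fitness: a real version would measure performance
   against the player over one level. */
static float fitness(const Traits *t)
{
    return t->aggression * 2.0f - t->flee_low_health + t->flee_outnumbered * 0.5f;
}

/* Crossover: each gene comes from a random parent, with rare mutation. */
static Traits breed(const Traits *a, const Traits *b)
{
    Traits c;
    c.aggression       = (rand() & 1) ? a->aggression       : b->aggression;
    c.flee_low_health  = (rand() & 1) ? a->flee_low_health  : b->flee_low_health;
    c.flee_outnumbered = (rand() & 1) ? a->flee_outnumbered : b->flee_outnumbered;
    if (rand() % 10 == 0) c.aggression = frand();   /* occasional mutation */
    return c;
}

int main(void)
{
    Traits pop[POP];
    for (int i = 0; i < POP; i++)
        pop[i] = (Traits){ frand(), frand(), frand() };

    for (int g = 0; g < GENS; g++) {
        /* crude selection: sort best-first by fitness */
        for (int i = 0; i < POP; i++)
            for (int j = i + 1; j < POP; j++)
                if (fitness(&pop[j]) > fitness(&pop[i])) {
                    Traits tmp = pop[i]; pop[i] = pop[j]; pop[j] = tmp;
                }
        /* bottom half replaced by children of the top half */
        for (int i = POP / 2; i < POP; i++)
            pop[i] = breed(&pop[rand() % (POP / 2)], &pop[rand() % (POP / 2)]);
    }
    printf("best aggression after %d generations: %.2f\n", GENS, pop[0].aggression);
    return 0;
}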

Neural Networks
Neural networks can be used to evolve the gaming AI as the player progresses through the game. LaMothe suggests that a
neural network can be used to assess what fighting moves are to be made in a 3D fighting game (such as VirtuaFighter). The best
thing about neural networks is that they will continually evolve to suit the player, so even if the player changes his tactics, before long
the network will pick up on it. The biggest problem with NN programming is that no formal definition of how to construct an
architecture for a given problem has been discovered, so producing a network to perfectly suit your needs takes a lot of trial-and-error.
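As a minimal illustration of the adaptive idea, here is a single perceptron in C that learns to predict one player habit from two invented input features. This is only a sketch of the learning rule, not LaMothe's actual design:

#include <stdio.h>

int main(void)
{
    /* features: [0] player's last move was high, [1] player is cornered */
    double w[2] = { 0.0, 0.0 }, bias = 0.0, rate = 0.1;

    /* tiny invented training log: features and whether the player blocked high */
    double x[4][2] = { {1,0}, {1,1}, {0,1}, {0,0} };
    int    y[4]    = {  1,     1,     0,     0   };

    for (int epoch = 0; epoch < 20; epoch++)
        for (int i = 0; i < 4; i++) {
            double sum = w[0]*x[i][0] + w[1]*x[i][1] + bias;
            int guess  = sum > 0.0;
            double err = y[i] - guess;      /* perceptron update rule */
            w[0] += rate * err * x[i][0];
            w[1] += rate * err * x[i][1];
            bias += rate * err;
        }

    printf("learned weights: %.2f %.2f, bias %.2f\n", w[0], w[1], bias);
    return 0;
}

In a game, the same update would run continuously against the player's observed moves, which is why the network keeps tracking tactic changes.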

Conclusion
I'm very pleased with the direction that Artificial Intelligence is heading - it is slowly taking over more and more of the game loop as
computers get quicker. Players who are getting used to network play are looking for intelligent opponents when playing offline too.
As artificial intelligence techniques get formalized and become more mainstream, we can expect to see some excellent games
emerging over the next 3 to 5 years.
Coming soon: Essays on how to apply AI to your games. Hopefully, I'll get a few essays up on minimax trees (I know the above
explanation wasn't the best), A* pathing, and FSMs.


Applications in Music
The Artificial Intelligence applications in music are endless - unfortunately, at present there is very little to show for it. Artificial
Intelligence and music sit at opposite ends of the spectrum, AI being seen as the epitome of computer science, and music the epitome
of art and abstractness.
Dynamic and autonomous music creation has endless possibilities - pieces could be composed in seconds, for garden centre, elevator
and trance music! Computers could provide brilliant 'jam' partners for guitarists and blues/jazz players, to help develop their style.
Computers could provide piano accompaniment for orchestral practices. The possibilities are endless! Yet how can all of this be
achieved?

Generating Computer Music


Creating music takes a certain amount of inspiration to get a piece started. How do we inspire computers? As yet, there is no
music-generating program that takes no human input. Most programs at the moment require the human user to set various parameters
and options - these are used by the computer to generate the piece.
A lot of music programs use genetic algorithms as a means of generating pieces. GenJam by Al Biles uses genetic algorithms to
create jazz improvisation riffs. The program would come out with a riff, and Biles would tell the program whether it was good or
bad, thus improving the fitness of the riff. After much training, GenJam is a formidable improvisation partner!
Other programs use several modules communicating with each other - some GA-controlled, others not - to create a piece.
The non-GA-controlled modules use mathematical formulas to form note patterns, chords, etc. Nevertheless, the largest direction in
computer-generated music seems to be genetically created music.
Why is this? Well, GAs do seem well suited to the problem. They set out to search for the best note sequence in an infinite search
space. Given initial criteria, they may find some relatively good sequences. It is the fitness function that is so hard to program,
because exactly what does make a musical piece good? For some, it has to have structure, flow and definite movements - for myself,
it has to evoke an emotion of sorts, or have an excellent guitar solo! Therefore, creating a fitness function for these kind of goals is
incredibly difficult.
What's more, music has to bind together - it is no use getting an genetic algorithm to generate 20 bars of music, and expecting them
to gel together. There is a huge likelihood they'll make no musical-sense. So, how do generate something on the fly, yet allowing it to
bind together? Fractals spring to mind for me. Fractals are used to generate landscapes on the fly that tile perfectly, since they often
use Fourier Transforms.

Fractal Music
What do fractals have to do with music? Ever since the 1920s and the work of Joseph Schillinger, music has been recognized to have
a chaotic and recursive nature. Many other studies have asked why we find certain music pleasing and other music cacophonous. It
has been shown that music often has a spectral density of 1/f (the concept of spectral density is unimportant here), and that most
fractals fall into a similar 1/f category too. Fractals have been used to generate music in several ways - you can select a row of a
rendered fractal and use each pixel to represent a certain note. Other approaches create music in the same way that the fractal is
drawn, with each pixel position representing a certain note. For me, the best example of a fractal-generated piece has been one
created from a Mandelbrot set:


mandel1.mp3 (347Kb)

This is a very chaotic piece, with occasional breaks - nevertheless, it has a certain Eastern tone to it. There is a lot of evidence that
fractal music could provide us with some very real, very entertaining pieces in the near future. If you are interested in finding out
more about fractal music, please see our links section.
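The pixel-to-note idea is easy to sketch. The following C program scans one row of the Mandelbrot set and maps each point's escape time onto a pentatonic scale of MIDI note numbers; the row, the scale and the mapping are arbitrary choices of mine, not a standard method:

#include <stdio.h>

#define COLS     32
#define MAX_ITER 64

/* Escape-time iteration count for c = (re, im). */
static int mandelbrot(double re, double im)
{
    double zr = 0.0, zi = 0.0;
    for (int i = 0; i < MAX_ITER; i++) {
        double zr2 = zr * zr - zi * zi + re;
        zi = 2.0 * zr * zi + im;
        zr = zr2;
        if (zr * zr + zi * zi > 4.0) return i;   /* escaped */
    }
    return MAX_ITER;                             /* inside the set */
}

int main(void)
{
    /* A pentatonic scale keeps arbitrary note choices consonant. */
    static const int scale[] = { 0, 2, 4, 7, 9 };
    for (int x = 0; x < COLS; x++) {
        double re = -2.0 + 2.5 * x / COLS;       /* sweep across one row */
        int iter = mandelbrot(re, 0.35);
        int note = 60 + scale[iter % 5] + 12 * ((iter / 5) % 2);
        printf("%d ", note);                     /* MIDI note numbers */
    }
    printf("\n");
    return 0;
}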

Automated Transcription
Automated transcription would indeed revolutionize the music industry - imagine putting in a CD, pressing a button and having the
computer create a perfect score of the piece. Transcription can be incredibly difficult, especially for pieces like canon music, or
highly layered pieces such as the works of Frank Zappa or Steve Vai. The human ear (well, the human brain) has the ability to listen
in on a certain sound, ignoring (or at least not paying as much attention to) the other sounds. For example, when you listen to a song
you can listen to the words without being distracted by the music, because you can concentrate on the singer's voice. If you want,
though, you can listen to the guitar, bass, or drums without any trouble.
Creating a program able to home in on these sounds is inherently difficult, since we have no idea how the brain is able to
distinguish sounds within sounds. The area of voice recognition may some day lead us to answers, since voice recognition is
basically the study of finding meaning in one sound - automated transcription would be finding 'meaning' (individual instruments)
in a multitude of sounds.

Other Applications
While these two are the main areas of use for Artificial Intelligence in music, being a guitar player, I see other areas. Roland released
a software package a few years ago - a MIDI program for guitarists. It printed MIDI files in terms of piano keys, standard notation,
or showed the fret board and the positions played (essentially, tablature). The one complaint was that the tablature feature was
terrible, since the notes and positions the program suggested had no 'logical' order. For non-guitarists: on a guitar you can play the
same note in several places - in fact, on my guitar I can play a certain note (E) in 6 different places. Therefore, when playing a piece,
you can play a certain note sequence in different areas, and these areas can make the piece easier or harder to play, depending on the
stretches, jumps and string skipping required. I have often contemplated creating a MIDI-to-TAB program that takes a guitar riff and
uses a genetic algorithm to find the best tablature to play it with.
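One piece of such a program - the cost function the genetic algorithm would minimize - might be sketched as follows. The representation and the scoring weights are invented; a real version would also account for open strings, position shifts and hand span:

#include <stdio.h>

/* A candidate tablature: one (string, fret) pair per note. */
typedef struct { int string; int fret; } Position;

/* Lower cost = easier to play: penalize fret stretches and
   string skips between consecutive notes. */
static int playing_cost(const Position *tab, int n_notes)
{
    int cost = 0;
    for (int i = 1; i < n_notes; i++) {
        int stretch = tab[i].fret - tab[i - 1].fret;
        int skip    = tab[i].string - tab[i - 1].string;
        if (stretch < 0) stretch = -stretch;
        if (skip < 0)    skip = -skip;
        cost += stretch * 2 + skip;   /* stretches hurt more than skips */
    }
    return cost;
}

int main(void)
{
    /* The same three-note riff fingered in two places on the neck. */
    Position low[3]  = { {6, 5}, {5, 7}, {4, 5} };
    Position high[3] = { {6, 5}, {6, 12}, {5, 10} };
    printf("cost low: %d, cost high: %d\n",
           playing_cost(low, 3), playing_cost(high, 3));
    return 0;
}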


Military Applications of AI
The military and the science of computers have always been incredibly closely tied - in fact, the early development of computing was
virtually exclusively limited to military purposes. The very first operational use of a computer was the gun director used in the
Second World War to aid ground gunners in predicting the path of a plane given its radar data. Famous names in AI, such as Alan
Turing, were scientists heavily involved with the military. Turing, recognized as one of the founders of both contemporary computer
science and artificial intelligence, helped create a machine (called the Bombe, based on previous work done by Polish
mathematicians) to break any portion of the German Enigma code.
As computing power increased and pragmatic programming languages were developed, more complicated algorithms and
simulations could be realized. For instance, computers were soon utilized to simulate nuclear escalations and wars or how arms races
would be affected by various parameters. The simulations grew powerful enough that the results of many of these 'wargames' became
classified material, and the 'holes' that were exposed were integrated into national policies.
Artificial Intelligence applications in the West started to become extensively researched when the Japanese announced in 1981 that
they were going to build a 5th Generation computer, capable of logical deduction and much more.
Inevitably, the 5th Generation project failed, due to the inherent problems that AI is faced with. Nevertheless, research continued
around the globe to integrate more 'intelligent' computer systems into the battlefield. Emphatic generals foresaw battle by hordes of
entirely autonomous buggies and aerial vehicles - robots that would have multiple goals, whose missions might last for months,
driving deep into enemy territory. The problems in developing such systems are obvious - the lack of functional machine vision
systems has led to problems with object avoidance, friend/foe recognition, target acquisition and much more. Problems also occur in
trying to get the robot to adapt to its surroundings, the terrain, and other environmental aspects.
Nowadays, developers seem to be concentrating on smaller goals, such as voice recognition systems, expert systems and advisory
systems. The main military value of such projects is to reduce the workload on a pilot. Modern pilots work in incredibly complex
electronic environments, receiving information not only from their own radar but from many others (the principle behind J-STARS).
Not only is the information load high, but the multi-role aircraft of the 21st century have highly complex avionics, navigation,
communications and weapon systems. All this must be organized in a highly accessible way. Through voice recognition, systems
could be checked, modified and altered without the pilot looking down into the cockpit. Expert/advisory systems could predict what
the pilot would want in a given scenario and automatically decrease the complexity of a given task.
Aside from research in this area, various paradigms in AI have been successfully applied in the military field - for example, using an EA (evolutionary algorithm) to evolve algorithms that detect targets given radar/FLIR data, or neural networks that differentiate between mines and rocks given sonar data in a submarine. I will look into these two examples in depth below.

Genetic Programming
Genetic programming is an excellent way of evolving algorithms that map data to a given result when no set formula is known. Mathematicians and programmers can normally find algorithms to deal with a problem of 5 or so variables, but when the problem grows to 10, 20 or 50 variables it becomes close to impossible to solve by hand. Briefly, a GP-powered program works by generating a series of random expression trees that represent various formulas. These trees are then tested against the data; poor ones are discarded, good ones are kept and bred. Mutation, crossover and the other elements of genetic algorithms are used to breed the highest-fitness tree for the given problem. At best, this tree will perfectly map the variables to the answer; other times it will generate an answer very close to the wanted one. (For a more in-depth look at GP, read the case study.)
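To make the expression-tree idea concrete, here is a minimal sketch in C of how such a tree might be represented, evaluated and scored against training data. The node types and names are illustrative, and far simpler than the trees e actually uses:

/* A node in a GP expression tree: either a terminal (an input
   variable or a constant) or an operator applied to two subtrees. */
typedef enum { NODE_VAR, NODE_CONST, NODE_ADD, NODE_MUL } NodeType;

typedef struct ExprNode {
    NodeType type;
    int varIndex;                  /* which input variable, for NODE_VAR */
    double value;                  /* constant value, for NODE_CONST */
    struct ExprNode *left, *right; /* subtrees, for operator nodes */
} ExprNode;

/* Recursively evaluate a tree against one input vector. */
double evalTree(const ExprNode *n, const double *inputs)
{
    switch (n->type) {
    case NODE_VAR:   return inputs[n->varIndex];
    case NODE_CONST: return n->value;
    case NODE_ADD:   return evalTree(n->left, inputs) + evalTree(n->right, inputs);
    case NODE_MUL:   return evalTree(n->left, inputs) * evalTree(n->right, inputs);
    }
    return 0.0;
}

/* Fitness: mean squared error of the tree over the training set
   (here 42 variables per sample, echoing the e example below).
   Lower is better; selection keeps low-error trees for breeding. */
double fitness(const ExprNode *tree, const double inputs[][42],
               const double *targets, int numSamples)
{
    double err = 0.0;
    for (int i = 0; i < numSamples; i++) {
        double diff = evalTree(tree, inputs[i]) - targets[i];
        err += diff * diff;
    }
    return err / numSamples;
}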


A notable example of such a program is SDI's e, an evolutionary algorithm designed by Steve Smith. e has been used by SDI to research algorithms for the radars of modern helicopters such as the AH-64D Longbow Apache and RAH-66 Comanche. e is presented with a mass of numbers generated by a radar and perhaps a low-resolution television camera or FLIR (Forward-Looking Infra-Red) device. The program then attempts to find (through various evolutionary means) an algorithm to determine the type of vehicle, or to differentiate between an actual target and mere "noisy" data.
Basically, the EA is fed a list of 42 different variables collected from the two sensors, along with a truth value specifying whether the test data was clutter or a target. The EA then generates a series of expression trees (much more complicated than those normally used in GP programs). When a new best program is discovered, the EA uses a hill-climbing technique to get the best possible result out of the new tree. The tree is then subjected to a heuristic search to optimize it.
Once the best possible tree is found, e will output the program as pseudocode, C, Fortran or Basic.
Once the EA had been trained on the training data, it was put to work on some test data. The results were quite impressive:

Percent Errors

    Type        Training Data    Test Data
    Radar           2.5%           8.3%
    Imaging         2.0%           8.0%
    Fused           0.0%           4.2%

While the algorithms performed well on the training data, performance decreased considerably when they were applied to the test data. Nevertheless, the fused detection algorithm (using both radar and FLIR information) still provided a decent error percentage.
An additional plus of this technique is that the EA itself (not just the algorithm it outputs) could be programmed into the weapon systems, so that the system could dynamically adapt to the terrain and other mission-specific parameters.

Neural Networks
Neural networks (NNs) are another excellent technique for mapping numbers to results. Unlike the EA, though, a network will only output certain results. A NN is normally pre-trained with a set of input vectors and a 'teacher' to tell it what the output should be for each given input. The NN can then adapt to a series of patterns: when fed information after being trained, it will output the result whose trained input most closely resembles the input being tested.
This was the method some scientists took to identify sonar sounds. Their goal was to train a network to differentiate between rocks and mines - a notoriously difficult task for human sonar operators to accomplish.
The network architecture was quite simple: it had 60 inputs, one hidden layer with 1-24 units, and two output units. The output would be <0,1> for a rock and <1,0> for a mine. The large number of input units was needed to incorporate 60 normalized energy levels of frequency bands in the sonar echo. What this means is that a sonar echo would be detected and subsequently fed into a frequency analyzer, which would break the echo down into 60 frequency bands. The energy level of each band was measured and converted into a number between 0 and 1.
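As a sketch of what the forward pass of such a network looks like in C: the layer sizes follow the description above (here with 12 hidden units, within the 1-24 range mentioned), while the function and array names are illustrative.

#include <math.h>

#define NUM_INPUTS  60   /* normalized band energies, 0..1 */
#define NUM_HIDDEN  12   /* the study varied this between 1 and 24 */
#define NUM_OUTPUTS  2   /* <0,1> = rock, <1,0> = mine */

double sigmoid(double x) { return 1.0 / (1.0 + exp(-x)); }

/* One forward pass: a weighted sum plus a bias, squashed by the
   sigmoid, first into the hidden layer and then into the outputs. */
void forward(const double in[NUM_INPUTS],
             const double wHid[NUM_HIDDEN][NUM_INPUTS], const double bHid[NUM_HIDDEN],
             const double wOut[NUM_OUTPUTS][NUM_HIDDEN], const double bOut[NUM_OUTPUTS],
             double out[NUM_OUTPUTS])
{
    double hidden[NUM_HIDDEN];
    for (int h = 0; h < NUM_HIDDEN; h++) {
        double sum = bHid[h];
        for (int i = 0; i < NUM_INPUTS; i++)
            sum += wHid[h][i] * in[i];
        hidden[h] = sigmoid(sum);
    }
    for (int o = 0; o < NUM_OUTPUTS; o++) {
        double sum = bOut[o];
        for (int h = 0; h < NUM_HIDDEN; h++)
            sum += wOut[o][h] * hidden[h];
        out[o] = sigmoid(sum);
    }
}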
A simple training method (gradient descent) was used as the network was fed examples of mine echoes and rock echoes. After the network had made its classification, it was told whether it was correct or not. Soon the network could differentiate as well as or better than its equivalent human operator.
The network also beat standard data classification techniques. Data classification programs could successfully detect mines 50% of the time by using parameters such as the frequency bandwidth, onset time, and rate of decay of the signals. Unfortunately, the remaining 50% of sonar echoes do not follow the rather strict heuristics that the data classifiers used. The network's power came from its ability to focus on the more subtle traits of the signal and use them to differentiate.

Morality: A Quick Thought


All these systems are quite impressive, and perfected models could prove incredible assets on the battlefield. Yet artificial intelligence may only ever be developed to a certain level, due to the threat humans feel as computers become more and more intelligent. The concept behind movies such as Terminator, where our robotic military technology backfires on us and destroys us, is rampant. Are there moral issues that we must confront as artificial military intelligence develops? As Gary Chapman puts it:
Autonomous weapons are a revolution in warfare in that they will be the first machines given the responsibility for
killing human beings without human direction or supervision. To make this more accurate, these weapons will be the
first killing machines that are actually predatory, that are designed to hunt human beings and destroy them.

Conclusion
The applications of AI in the military are wide and varied, yet due to the robustness, reliability, and durability required of most military programs and hardware, AI is not yet an integral part of the battlefield. As techniques are refined and improved, more and more AI applications will filter into the war scene - after all, silicon is cheaper than a human life.

● Applications in Gaming.
● Applications in the Military.
● Applications in Music.
● AISolutions
● Generation5 Interview with Steve Smith.
● Neural Networks - Generation5 Essays on NNs.
● Genetic Algorithms - Generation5 Essays on GAs.

All content copyright © 1998-2002, Generation5


Hierarchal AI

See Also:
Artificial Intelligence:Documentation

Courtesy of Amit Patel

Newsgroup: comp.ai.games
From: andrew@cs.uct.ac.za (Andrew Luppnow)
Date: Fri, 2 Dec 1994 10:10:50 +0200 (SAT)
This document proposes an approach to the problem of designing the AI routines for intelligent computer wargame
opponents. It is hoped that the scheme will allow the efficient, or at least feasible, implementation of opponents
which are capable of formulating strategy, rather than behaving predictably according to fixed sets of simple rules.
In the text below, "DMS" is an abbreviation for "decision-making-system". I use the term very loosely to denote
any programming subsystem which accepts, as input, a "situation" and which generates, as output, a "response".
The DMS may be a simple neural network, a collection of hard-coded rules, a set of fuzzy logic rules, a simple lookup table, or whatever you want it to be! Its most important feature is that it must be SIMPLE and TRACTABLE - in particular, it must accept input from a small, finite set of possible inputs and generate output belonging to a similarly small, finite set of possible outputs.
Some time ago I asked myself how a programmer might begin to implement the AI of a wargame which requires
the computer opponent to develop a sensible military strategy. I eventually realized that simply feeding a SINGLE
decision-making system with information concerning the position and status of each friendly and enemy soldier is
hopelessly inefficient - it would be akin to presenting a general with such information and expecting him to dictate
the movement of each soldier!
But in reality a general doesn't make that type of decision, and neither does he receive information about the
precise location of each soldier on the battlefield. Instead, he receives strategic information from his commanders,
makes strategic decisions and presents the chosen strategy to the commanders. The commanders, in turn, receive
tactical information and make tactical decisions based on (1) that information and (2) the strategy provided by the
general.
And so the process continues until, at the very bottom level, each soldier receives precise orders about what he
and his immediate comrades are expected to accomplish.
The important point is that the whole process can be envisaged in terms of several 'levels'. Each level receives
information from the level immediately below it, 'summarises' or 'generalises' that information and presents the
result to the level immediately above it. In return, it receives a set of objectives from the level above it and uses
(1) this set of objectives and (2) the information from the lower level to compute a more precise set of objectives.
This latter set of objectives then becomes the 'input from above' of the next lower level, and so on. In summary:
information filters UP through the levels, becoming progressively more general, while commands and objectives
filter DOWN through the levels, becoming progressively more detailed and precise.

I decided that this paradigm might represent a good conceptual model for the implementation of the AI procedures
in a complex strategy-based game: a "tree of DMS's" can be used to mimic the chain of command in a military
hierarchy. Specifically, one might use one or more small, relatively simple DMS's for each level. The inputs for a
DMS of level 'k' would be the outputs of a level (k+1) DMS and the information obtained by 'summarising' level
(k-1) information. The outputs of the level k DMS would, in turn, serve as inputs for one or more level (k-1)
DMS's. Outputs of the level zero DMS's would be used to update the battlefield.
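To make the scheme concrete, here is a minimal sketch in C of such a tree of DMS's. Everything in it - the type names, the two callbacks, the integer stand-ins for "situation" and "objective" - is illustrative, not part of the original proposal:

/* A node in a hierarchy of decision-making systems (DMS's).
   Information is summarised on the way up; objectives are
   refined on the way down. */
#define MAX_CHILDREN 8

typedef int Situation;   /* summarised report from below   */
typedef int Objective;   /* orders handed down from above  */

typedef struct DMS {
    int numChildren;
    struct DMS *child[MAX_CHILDREN];
    /* Summarise the children's reports into one situation. */
    Situation (*summarise)(const Situation *reports, int n);
    /* Turn an objective plus the local situation into a more
       precise order for one child. */
    Objective (*decide)(Objective fromAbove, Situation local, int childIndex);
} DMS;

/* Filter information UP: each level generalises its children's reports. */
Situation gatherUp(DMS *node)
{
    Situation reports[MAX_CHILDREN];
    if (node->numChildren == 0)
        return node->summarise(0, 0);    /* leaf: read the battlefield */
    for (int i = 0; i < node->numChildren; i++)
        reports[i] = gatherUp(node->child[i]);
    return node->summarise(reports, node->numChildren);
}

/* Filter objectives DOWN: each level makes its orders more specific. */
void issueDown(DMS *node, Objective fromAbove, Situation local)
{
    for (int i = 0; i < node->numChildren; i++) {
        Objective order = node->decide(fromAbove, local, i);
        issueDown(node->child[i], order, local);
    }
}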


"Top brass" - fewer,


MORE GENERAL options
allow lookahead and
Level 3 ^ o "what-if reasoning."
/|\ / \
Level 2 / | \ o o
| /|\ |\
Level 1 | o o o o o
\ | / /| | | | |\
Level 0 \|/ o o o o o o o Individual soldiers -
V many options, but
decision-making is
As information simple and doesn't
filters UP the attempt "lookahead",
tree, it becomes "what-if reasoning",
more general. As etc.
objectives filter
DOWN the tree,
they become more
specific.

The main advantage of this scheme is that it allows the "higher levels" of the hierarchy to formulate strategy,
without being overwhelmed by the immense and intractably large number of possibilities which the computer AI
would have to consider if it possessed only information about individual soldiers. Indeed, at the topmost level,
decisions would involve rather abstract options such as
● "direct all military activity towards seizing territory X", or

● "conduct wars of attrition in territories X, Y, and Z", or


● "buy time - stick to diplomacy for the time being", or
● "avoid direct military engagement - concentrate on disrupting enemy supply routes",
● etc.
Under these circumstances, it would be feasible for the computer to attempt a certain amount of "lookahead", or
to consider "what-if" scenarios - something which would be out of the question if options were presented in terms
of the actions of individual soldiers.
At the time of writing this, I haven't yet had the opportunity to explore an implementation of these ideas in a
working game, but if anybody DOES enjoy some practical success with these ideas, I'd be interested in hearing
from him/her!

--- Andrew Luppnow

Date this article was posted to GameDev.net: 9/7/1999



Multilayer Feedforward Network and the Backpropagation Algorithm
Introduction
Backpropagation
The backpropagation algorithm is perhaps the most widely used training algorithm for multi-layered feedforward networks.
However, many people find it quite difficult to construct multilayer feedforward networks and training algorithms from scratch, whether because of the difficulty of the math (which can seem daunting at first glance, with all the derivations) or the difficulty involved in actually coding the network and training algorithm. Hopefully, after you have read this guide, you'll walk away
knowing more about the backpropagation algorithm than you ever did before. Before continuing further on in this tutorial you might
want to check out James' introductory essay on neural networks.

Summary
The problem with the perceptron is that it cannot express non-linear decisions. The perceptron is basically a linear threshold device which returns a certain value, 1 for example, if the dot product of the input vector and the associated weight vector, plus the bias, surpasses the threshold, and another value, -1 for example, if the threshold is not reached.
When the decision boundary f(x1,x2,...,xn) = w1x1 + w2x2 + ... + wnxn + wb = threshold is graphed in the x1,x2,...,xn coordinate plane/space, one will notice that it is obviously linear. More than that, however, this function separates the space into two categories: all the input vectors that give an f value greater than the threshold fall into one region, and those that do not fall into another (see figure).


The obvious problem with this model, then, is: what if the decision cannot be linearly separated? The failure of the perceptron to learn the XOR function, and to distinguish between even and odd, almost led to the demise of faith in neural network research. The solution came, however, with the development of neuron models that apply a sigmoid function to the weighted sum (w1x1 + w2x2 + ... + wnxn + wb) to make the activation of the neuron non-linear, scaled and differentiable (continuous). An example of a commonly used sigmoid function is the logistic function, given by o(y) = 1/(1+e^(-y)), where y = w1x1 + w2x2 + ... + wnxn + wb. When these "sigmoid units" are arranged layer by layer, with each layer taking the outputs of the layer upstream of it as its input vector, the multilayer feedforward network is created.
Multilayer feedforward networks normally consist of three or four layers: there is always one input layer and one output layer, and usually one hidden layer, although in some classification problems two hidden layers may be necessary - this case is rare, however. The term "input layer neurons" is a misnomer: no sigmoid unit is applied to the value of these neurons. Their raw values are fed into the layer downstream of the input layer (the hidden layer). Once the neurons of the hidden layer are computed, their activations are fed downstream to the next layer, until all activations eventually reach the output layer, in which each output neuron is associated with a specific classification category. In a fully connected multilayer feedforward network, each neuron in one layer is connected by a weight to every neuron in the layer downstream of it. A bias is also associated with each of these weighted sums. Thus, to compute the value of each neuron in the hidden and output layers, one must first take the weighted sum plus the bias and then apply f(sum) (the sigmoid function) to calculate the neuron's activation.

How then does the network learn the problem at hand? By modifying all the weights, of course. If you know calculus, you might have already guessed that by taking the partial derivative of the error of the network with respect to each weight, we learn a little about the direction in which the error of the network is moving. In fact, if we take the negative of this derivative (i.e. the rate of change of the error as the value of the weight increases) and then add it to the weight, the error will decrease until it reaches a local minimum. This makes sense: if the derivative is positive, the error is increasing as the weight increases, so the obvious thing to do is to add a negative value to the weight, and vice versa if the derivative is negative. The actual derivation will be covered later. Because these partial derivatives are taken and applied to the weights starting from the output-layer-to-hidden-layer weights and then the hidden-layer-to-input-layer weights (as it turns out, this is necessary, since changing this set of weights requires that we know the partial derivatives calculated in the layer downstream), this algorithm has been called the "backpropagation algorithm".


How is the error of the network computed? In most classification networks, the output neuron that achieves the highest activation determines what the network classifies the input vector to be. For example, if we wanted to train our network to recognize 7x7 binary images of the numbers 0 through 9, we would expect our network to have 10 output neurons, with each output neuron corresponding to one number. Thus, if the first output neuron is most activated, the network classifies the image (which had been converted to an input vector and fed into the network) as "0", the second neuron "1", and so on. In calculating the error, we create a target vector consisting of the expected outputs. For example, for the image of the number 7, we would want the eighth output neuron to have an activation of 1.0 (the maximum for a sigmoid unit) and all other output neurons to have an activation of 0.0. Now, starting from the first output neuron and ending at the tenth, calculate the squared error for each by squaring the difference between the target value (the expected value for that output neuron) and the actual output value. Take the average of all these squared errors and you have the network error. The error is squared so as to make the derivative easier to work with.
Once the error is computed, the weights can be updated one by one. This process continues from image to image until the network is
finally able to recognize all the images in the training set.

Training
Recall that training basically involves feeding training samples as input vectors through a neural network, calculating the error of the output layer, and then adjusting the weights of the network to minimize the error. Each "training epoch" involves one exposure of the network to a training sample from the training set, followed by adjustment of the weights of the network, layer by layer. Selection of training samples from the training set may be random (I would recommend this method, especially if the training set is particularly small), or selection can simply involve going through each training sample in order.
Training can stop when the network error dips below a particular error threshold (this is up to you; a squared-error threshold of .001 is good, though this varies from problem to problem - in some cases you may never even reach .001 squared error or less). It is important to note, however, that excessive training can have damaging results in problems such as pattern recognition. The network may become over-adapted to the samples in the training set, and thus be unable to accurately classify samples outside of it. For example, if we over-trained a network with a training set consisting of sound samples of the words "dog" and "cog", the network might become unable to recognize the word "dog" or "cog" said by an unusual voice unfamiliar from the sound samples in the training set. When this happens, we can either include these samples in the training set and retrain, or we can set a more lenient error threshold.
These "outside" samples make up the "validation" set. This is how we assess our network's performance. We can not expect to assess
network performance based solely on the success of the network in learning an isolated training set. Tests must be done to confirm
that the network is also capable of classifying samples outside of the training set.

Backpropagation Algorithm
The first step is to feed the input vector through the network and compute the activation of every unit in the network. Recall that this is done by computing the weighted sum coming into the unit and then applying the sigmoid function. The 'x' vector is the activation of the previous layer.


The second step is to compute the squared error of the network. Recall that this is done by taking the sum of the squared errors of every unit in the output layer. The target vector involved is associated with the training sample (the input vector).

The third step is to calculate the error term of each output unit, indicated below as 'delta'.

The fourth step is to calculate the error term of each of the hidden units.

The fifth step is to compute the weight deltas. 'Eta' here is the learning rate. A low learning rate can ensure more stable convergence.
A high learning rate can speed up convergence in some cases.

The final step is to add the weight deltas to each of the weights. I prefer adjusting the weights one layer at a time. This method involves recomputing the network error before the error terms of the next weight layer are computed.
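The equation images from the original page are not reproduced here. For reference, the standard textbook formulas that these steps describe are given below in LaTeX (t_k is the target for output unit k, o is an activation, eta the learning rate, and x_ji the i-th input to unit j); note that the textbook convention uses half the summed squared error, which differs from the averaging described earlier only by a constant factor.

    E = \frac{1}{2} \sum_{k \in \mathrm{outputs}} (t_k - o_k)^2

    \delta_k = o_k (1 - o_k)(t_k - o_k) \quad \text{(output units)}

    \delta_h = o_h (1 - o_h) \sum_{k \in \mathrm{downstream}(h)} w_{kh}\,\delta_k \quad \text{(hidden units)}

    \Delta w_{ji} = \eta\,\delta_j\,x_{ji}, \qquad w_{ji} \leftarrow w_{ji} + \Delta w_{ji}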

● Neural network essays - A lot of essays on all aspects of neural networking.
● Neural network programs - Most Windows-based, with full source code.
● Neural Network Books
● Neural Network Interviews
● Neural Network Software
● AISolutions

All content copyright © 1998-2002, Generation5




Introduction to Neural Networks


Introduction
'Neural network' is one of those terms that is becoming fashionable in the new era of technology. Most people have heard of them, but very few actually know what they are. This essay is designed to introduce you to all the basics of neural networks — their function, generic structure, terminology, types and uses.
The term 'neural network' is in fact a biological term, and what we refer to as neural networks should really be called Artificial Neural Networks (ANNs). I will use the two terms interchangeably throughout the essay, though. A real neural network is a collection of neurons, the tiny cells our brains are comprised of. A network can consist of anywhere from a few to a few billion neurons, connected in an array of different ways. ANNs attempt to model these biological structures both in architecture and operation. There is a small problem: we don't quite know how biological NNs work! Therefore, the architecture of neural networks changes greatly from type to type. What we do know is the structure of the basic neuron.

The Neuron
Although it has been proposed that there are anything between 50 and 500 different types of neurons in our brain, they are mostly just
specialized cells based upon the basic neuron. The basic neuron consists of synapses, the soma, the axon and dendrites. Synapses are
connections between neurons - they are not physical connections, but miniscule gaps that allow electric signals to jump across from
neuron to neuron. These electrical signals are then passed across to the soma which performs some operation and sends out its own
electrical signal to the axon. The axon then distributes this signal to dendrites. Dendrites carry the signals out to the various synapses,
and the cycle repeats.
Just as there is a basic biological neuron, there is a basic artificial neuron. Each neuron has a certain number of inputs, each of which has a weight assigned to it. The weights are simply an indication of how 'important' the incoming signal on that input is. The net value of the neuron is then calculated: the net is simply the weighted sum - the sum of all the inputs multiplied by their specific weights. Each neuron also has its own unique threshold value; if the net is greater than the threshold, the neuron fires (or outputs a 1), otherwise it stays quiet (outputs a 0). The output is then fed into all the neurons it is connected to.
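In C, the basic artificial neuron just described is only a few lines - a weighted sum compared against a threshold. The type and names here are illustrative:

/* A minimal threshold neuron: fire (output 1) if the weighted sum
   of the inputs exceeds the threshold, otherwise stay quiet. */
typedef struct {
    int     numInputs;
    double *weights;     /* one weight per input */
    double  threshold;
} Neuron;

int neuronOutput(const Neuron *n, const double *inputs)
{
    double net = 0.0;
    for (int i = 0; i < n->numInputs; i++)
        net += n->weights[i] * inputs[i];   /* the weighted sum */
    return net > n->threshold ? 1 : 0;      /* fire, or stay quiet */
}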

Learning
All this talk about weights and thresholds leads to an obvious question: how are these values set? There are nearly as many training methods as there are network types (a lot!), but some of the more popular ones include back-propagation, the delta rule and Kohonen learning.
As architectures vary, so do the learning rules, but most rules can be categorized into two areas - supervised and unsupervised. Supervised learning rules require a 'teacher' to tell the network what the desired output is for a given input. The learning rule then adjusts all the necessary weights (this can be very complicated in networks), and the whole process starts again until the data can be correctly analyzed by the network. Supervised learning rules include back-propagation and the delta rule. Unsupervised rules do not require a teacher, because they produce their own output, which is then further evaluated.


Architecture
This area of neural networking is the "fuzziest" in terms of a definite set of rules to abide by. There are many types of networks, ranging from simple boolean networks (perceptrons) to complex self-organizing networks (Kohonen networks) to networks modelling thermodynamic properties (Boltzmann machines)! There is, though, a standard network architecture.
The network consists of several "layers" of neurons: an input layer, hidden layers, and an output layer. The input layer takes the input and distributes it to the hidden layers (so-called hidden because the user cannot see the inputs or outputs of those layers). These hidden layers do all the necessary computation and pass the results to the output layer, which (unsurprisingly) outputs the data to the user. Now, to avoid confusion, I will not explore the architecture topic further. To read more about different neural nets, see the Generation5 essays.
Even after discussing neurons, learning and architecture, we are still unsure about what exactly neural networks do!

The Function of ANNs


Neural networks are designed to work with patterns - they can be classified as pattern classifiers or pattern associators. A network can take a vector (a series of numbers) and then classify it. For example, my ONR program takes an image of a number and outputs the number itself, and my PDA32 program takes a coordinate and classifies it as either class A or class B (the classes are determined by learning from the examples provided). More practical uses can be seen in military radars, where radar returns can be classified as enemy vehicles or trees (read more in the Applications in the Military essay).
Pattern associators take one vector and output another. For example, my HIR program takes a 'dirty' image and outputs the image closest to the one it has learnt. Again, at a more practical level, associative networks can be used in more complex applications such as signature/face/fingerprint recognition.

The Ups and Downs of Neural Networks


There are many good points to neural networks, and advances in this field will increase their popularity. They are excellent as pattern classifiers/recognizers and can be used where traditional techniques do not work. Neural networks can handle exceptions and abnormal input data - very important for systems that handle a wide range of data (radar and sonar systems, for example). Many neural networks are biologically plausible, which means they may provide clues as to how the brain works as the field progresses. Advances in neuroscience will also help advance neural networks to the point where they will be able to classify objects with the accuracy of a human at the speed of a computer! The future is bright; the present, however...
Yes, there are quite a few down points to neural networks. Most of them, though, lie with our lack of hardware. The power of neural networks lies in their ability to process information in a parallel fashion (that is, to process multiple chunks of data simultaneously). Unfortunately, machines today are serial: they only execute one instruction at a time. Therefore, modelling parallel processing on serial machines can be a very time-consuming process. As with everything in this day and age, time is of the essence, which often leaves neural networks out of the list of viable solutions to a problem.
Another problem with neural networks is the lack of defining rules to help construct a network for a given problem - there are many factors to take into consideration: the learning algorithm, the architecture, the number of neurons per layer, the number of layers, the data representation and much more. Again, with time being so important, companies cannot afford to invest the time needed to develop a network to solve the problem efficiently. This will all change as neural networking advances.

Conclusion
Hopefully, by now you have a good understanding of the basics of neural networks. Generation5 has recently had a lot of information
added on neural networking, both in essays and in programs. We have examples of Hopfield networks, perceptrons (2 example
programs), and even some case-studies on back-propagation. Please browse through the site to find out more!

● Neural network essays - A lot of essays on all aspects of neural networking.
● Neural network programs - Most Windows-based, with full source code.

● Neural Network Books
● Neural Network Interviews
● Neural Network Software
● AISolutions

All content copyright © 1998-2002, Generation5




See Also:
Artificial Intelligence:Gaming

A Practical Guide to Building a Complete Game AI: Volume I


by Geoff Howland
Artificial Intelligence (AI) in games has taken a backseat in development for a long time, for many reasons, but the future of games is definitely going to be weighted heavily with increasingly detailed game AI. If your game's AI is not up to the level that game players' expectations demand, your game will feel dated and will suffer for it in their opinions.
Game AI is not just neural networks, learning systems and complex mathematical structures - although it can be. Primarily, game AI is about creating an environment and the appearance of thought from units. Game AI is behavioral, not scientific.
The key to understanding how to create game AI is understanding what you want your final results to be and then building the system to provide those results. It all comes down to what the player can see; if they can't tell it's happening, then it might as well not be happening.
The examples and discussion will be given in terms of Real-Time Strategy (RTS) games; however, some of these concepts can be translated to other genres as well. All data examples are done in standard C format.

State Machines
Finite State Machine
A finite state machine (FSM) is a system that has a limited number of states of operation. A real-world example could be a light switch, which is either on or off, or an alarm clock, which is either idling (telling the time), ringing an alarm, or having its time or alarm set. Any system that has a limited number of possibilities, where something can be defined by one state (even combinations), can be represented as a finite state machine.
Finite state machines are natural for any type of computer program and understanding how to use them effectively to create game
AI is only as hard as understanding the system you are trying to represent, which is as detailed or simple as you make it.
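As a concrete illustration, one common way to code a small FSM in C is an enum for the states and a switch for the transitions. The alarm-clock states follow the example above; the event names are invented for this sketch:

/* A tiny finite state machine for the alarm clock example. */
typedef enum { CLOCK_IDLE, CLOCK_RINGING, CLOCK_SETTING } ClockState;
typedef enum { EV_ALARM_TIME, EV_BUTTON_SET, EV_BUTTON_DONE } ClockEvent;

ClockState clockUpdate(ClockState state, ClockEvent ev)
{
    switch (state) {
    case CLOCK_IDLE:
        if (ev == EV_ALARM_TIME) return CLOCK_RINGING;
        if (ev == EV_BUTTON_SET) return CLOCK_SETTING;
        break;
    case CLOCK_RINGING:
        if (ev == EV_BUTTON_DONE) return CLOCK_IDLE;  /* alarm silenced */
        break;
    case CLOCK_SETTING:
        if (ev == EV_BUTTON_DONE) return CLOCK_IDLE;  /* time/alarm set */
        break;
    }
    return state;   /* no transition: stay in the current state */
}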

Using Finite State Machines


There are many purposes for using FSMs in games, but one of the more intricate is modeling unit behavior, since trying to simulate human beings is the toughest simulation there is. As hard as human behavior is to simulate, there have been many stories of detailed game AIs that were mistaken for human players - and vice versa - by other players and spectators of the games, especially with some detailed FSM systems.
While some other systems are designed to more accurately model the way humans think and learn, sometimes you can just never
beat the simplicity of having a choice, weighing the factors and deciding, as a human, which one you would make given that choice.
When learning more about AI decision and learning systems always keep this in mind as often the best system for the job is the
simplest and not the most scientifically accurate.
Don't misunderstand this as opposition to neural networks, genetic algorithms or any other artificial intelligence systems; just don't mistake clever routines and interesting algorithms for a better solution if they won't give better results. Weigh your choices based on what you need to get your end result, not the latest trends.

Game State Machines


Creating a believable environment for your game means considering as many of the detailed elements that the player might focus their attention on as you can. The more of these you anticipate through planning and testing, the more immersive the environment will be for the player when they are discovering your creation.
In your total game state there will be at least two divisions of state machines that you will need to keep your game going. The first state machine deals with the game interface: whether the game is paused; which of the different modes the player can view the world in is active; what things the player can and can't see; and any other flags you might use for your particular interface.
The second state machine deals with what is actually going on in the game: the current state of the environment, objects in the level, objectives completed or failed in the mission, and all the other variables that you use to guide and challenge the player.
You may have an alert system where the enemies will be actively patrolling if the player has been spotted or shots have been fired, or flags for whether certain critical pieces have been destroyed. All of these items can be contained inside a structure such as the example below.

struct GameLevelState {
    int alert;              // Alert status of enemies //
    struct Position shot;   // Position of last shot fired //
    int shotTime;           // Game cycle last shot was fired //
    int hostage;            // Hostage rescued //
    int explosives;         // Explosives set or not //
    int tank;               // Tank destroyed //
    int dialogue;           // Dialogue variable //
    int complete;           // Mission completed //
};

Flexibility
Keeping your AI flexible is extremely important. The more modular you make your routines, the more you will be able to expand them as you go. It's important to understand that designing a game AI is very much an iterative process; you need to try things out and build upon them.
The goal in creating a good AI is to have units that react in situations that seem realistic in an environment they seem to be
interacting with. If you box in what your units will be able to do too early it will be difficult to expand the breadth of their actions
later on when you decide to augment the game world to feel more complete or interactive.

Unit Actions
In a game where the player controls units, it is all-important to have meaningful and well-organized information on them. Without this, adapting the units to the players becomes difficult, and the user's interface with the game can suffer. If the player doesn't feel he is controlling the units and getting appropriate information back from them, then all he is doing is clicking around in an interface, and all the immersive aspects the game held will be lost - meaning the player won't be having any fun and may become frustrated.

Anatomy 101
To get a sense of what kind of information you may want to provide to the player, let's take a look at a sample data structure.

struct Character {
    struct Position pos;                // Map position //
    int screenX, screenY;               // Screen position //
    int animDir, animAction, animNum;   // Animation direction, action and frame number //
    int rank;                           // Rank //
    int health;                         // Health //
    int num;                            // Number of unit //
    int group;                          // Group number //
    int style;                          // Style of unit (elf, human) //

    struct AnimationObject animObj;     // Animation object for complex animations //
};
Now some definitions of the variables:
The pos variable determines the unit's position in the game world and the screenX, screenY variables are useful for easily adding
information around the unit on the screen such as health or selection information.
The animDir, animAction and animNum variables all refer to the unit's current animated state that will be drawn to the screen.
The rank and health variables are both fairly obvious and are extremely simplified versions of what information they could hold.
The num variable is the number of the unit in the main unit array. Later, when calling the unit information from its group, it is sometimes useful to pass which unit it is without giving the actual structure address.
The group variable determines which group the unit belongs to, as a unit should belong to a group at almost all times. The only time a unit should not be in a group is if it is dead.
The style and animObj variables are both more information about how the unit's graphics will be drawn.

Building Past Basics


Once you have your initial routines working, based on a simple version of your units such as the above, it's time to start building more information into them to really bring them to life.
You will need to think about what kind of actions and reactions you want the units to have. Do you want them to be controlled by emotions? Do you want them to be able to freeze up? Run away? Charge like a madman?
If you do, then adding variables to determine emotional states could be a next step. The best way to understand what components your units can have is to try to understand yourself, and what you would be dealing with in the situation they are in. In order to create human-like reactions, you need to base them on how a human would react.
There is another side to this, almost the opposite of human reaction: providing a synthetic experience that is not based on reality but instead on challenging the player. Instead of basing your units on your own instincts, you will need to think in whatever manner you want your units to react in. The point is that you have to make decisions about every facet you can, or you will end up with flat, boring reactions that are too easy to predict, or even seemingly random. Either of these traits could ruin an otherwise good game, so put a lot of thought into it and, most importantly, play it to death to make sure it works!

Grouping
To group or not to group?
If you are creating a First Person Shooter game, then it comes as no big surprise that grouping isn't for you. However, if you are creating an RTS or a game that has the player controlling more than one unit at a time, then you have a question to ask yourself.
Do you need your units to act in a coordinated way?
If the answer is yes, then there is a good chance that grouping is for you. If the answer is no, there still may be advantages to grouping, but you will have to sort those out on your own, as they will no doubt depend entirely on exactly the kind of actions you want your units to perform.

Benefits of Grouping
1. Units can move in a formation only accessing one master list of movement information. The advantage here is that you do
not have to propagate information to every unit in the group when a destination or target changes as they all get their
movement information off of the group source.
2. Multi-unit coordinated actions, such as surrounding a building, can be controlled at a central location instead of each unit trying to work out where it is in relation to other units and bumping back and forth until they are in the correct position.
3. Groups can maintain their structure so that issuing new orders takes only the same amount of time and data as issuing an order to a single unit. Most importantly, you will create something that can be easily understood and read. Changing around 25 units or so and having them try to pass information to each other can be quite a chore if they don't have any common ground.
4. Depending on the formation of the group, obstacle avoidance and detection can be simplified and time to find paths can be
reduced, which can be a serious concern when dealing with a large amount of units.

The Big Picture


Organizing your group, just like everything else in creating a game AI, is about understanding the final effect you want to gain with control of the units. The idea is to create a central repository of information which can be found quickly and shared between units. The idea is also not to duplicate any data: you want data to be found at one source and one source only, and that source needs to be the most logical place for the information, so that when you are later working with it and building off of it, other logical extensions will equally seem to be in the correct places.
From my experience, I decided that this separation in my work should be split so that anything that has to do with the unit as an enclosed entity is placed in the unit's data structure, while anything that has to do with movement or actions - since those are what we are trying to organize and share - is placed in the group data structures.
This means that the units alone will not have any information on where they are going, or what they are doing beyond the physical position they are in, like their animation frame and position in the world. To do this, a unit must ALWAYS be in a group as long as it is capable of moving or changing its actions. If a unit is alone, then it is just a group of one.

One of many
While we are ultimately looking for a group to act as a coordinated system, the system is definitely made up of individual pieces, and it's important to keep track of what each unit is doing individually so that, when we need the group to break formation and move about as separate entities with common or individual purposes, we can. For this goal I created a structure similar to the one below.

struct GroupUnit {
    int unitNum;                     // Character number //
    struct Unit *unit;               // Unit character data //
    struct Position waypoint[50];    // Path in waypoints for unit //
    int action[50];                  // Actions by waypoints //
    int stepX, stepY;                // Step for individual units, when in cover mode //
    int run, walk, sneak, fire, hurt, sprint, crawl;   // Actions //
    int target;                      // Target for unit //
    struct Position targetPos;       // Target position //
};
Explanations of the variables:
The unitNum is the number of the unit in the group. If there is a maximum of 10 units in a group, then there will be 10 possible slots
that could have units. The first unit would be unitNum 0, following to unitNum 9.
The unit is a pointer to the unit's character data, which holds information like the character's current position, health and every other piece of information on the individual. It's important that a unit's vital signs and other information can be monitored from the group, so that you can easily check whether a group member has been wounded and communicate this to the other members, along with a myriad of other possibilities.
The waypoint array contains all the places that the unit has to move in a queue. All actions and waypoints are only specified in the
GroupUnit structure if the group is not in a formation and units need to move about on their own.
The action array contains actions that are associated with the movements to waypoints. This allows you to create more detailed command chains: telling units to sneak across one area and then sprint across another adds a lot of possibilities for more strategic and thought-out movements by the player.
The stepX, stepY information can be used for simple velocity: every frame, move this unit this many world-position units in any direction on the map. Used properly, this can be just as applicable for all situations as doing real physics modeling, only with a simpler system and usually reduced processing time (not to mention being easier to implement the first time, and quickly).
The run, walk, sneak… variables all deal with different states the unit is in. These are not animations, but action states that can be toggled easily; multiple states can even be on at once and affect each other differently.
The target and targetPos variables hold the unit number of the enemy being targeted and his current position. The enemy's position, as well as health and other attributes, could be looked up each time via the enemy's unit number, but for readability I decided it would be easier to keep a local copy of the enemy's position.

Mob mentality
The ultimate goal, of course, is to have a centralized location for as much of the data as possible, to keep look-ups and processing to a minimum and keep things simple. Let's take a look at a sample data structure for doing this.

struct Group {
    int numUnits;                    // Units in group //
    struct GroupUnit unit[4];        // Unit info //
    int formation;                   // Formation information for units and group //
    struct Position destPos;         // Destination (for dest. circle) //
    int destPX, destPY;              // Destination screen coords //
    struct Position wayX[50];        // Path in waypoints for group //
    float formStepX, formStepY;      // Formation step values for group movements //
    int formed;                      // If true, then find cover and act as individuals, otherwise move in formation //
    int action, plan;                // Group action and plans //
    int run, walk, sneak, sprint, crawl, sniper;   // Actions //
    struct Position spotPos;         // Sniper coords //
    int strategyMode;                // Group strategy mode //
    int orders[5];                   // Orders for group //
    int goals[5];                    // Goals for group //
    int leader;                      // Leader of the group //
    struct SentryInfo sentry;        // Sentry list //
    struct AIState aiState;          // AI state //
};
The numUnits variable refers to the number of units in the group, and the unit array holds the GroupUnit information. For this particular group the maximum number of units has been hard-coded to 4.
The formation flag determines what type of formation the group is in. The units could be formed into a column, wedge or diamond shape easily by just changing this variable and letting them reposition themselves appropriately.
The destPos and destPX, destPY variables all keep track of the final destination of the group and relay that information quickly to the player. The waypoints and steps work in the same manner as for individuals, except that when units are in a formation they all have the same speed, so they stay in formation. There is no need to update each unit by its own speed value, as the group's can be used.
The formed variable is one of the most important, as it determines whether the units act in formation or as individuals. The concept is that if the group is formed, then all the units have the same operations performed on them each cycle. If there is a reason they can't all move the same way, such as enemies attacking or a necessary break in formation to get past an obstacle, then the units need to move on their own.
The actions are the same as the individual ones, and you'll notice there is a sniper variable that is not in the GroupUnit structure, as there is no reason to have one unit in a group be a sniper while the rest are off running around. It is more logical to split that unit into its own group and then control the sniper activities at the group level. This is the kind of planning you need to do to figure out what information is best served in which section of your structures.
The strategyMode is a quick variable that determines how the units respond to enemies: is the group aggressive, aggressive with cause, defensive, or run-on-sight? Having an easy-to-access overview variable that controls basic responses is a good way to cut out a lot of individual unit and group situation calculations. Beyond that, it gives control to the player, who can set different groups to different modes and so know how each group will react if it encounters any enemies.
The orders and goals arrays point to orders and goals described in an order-and-goal database, so that orders can be assigned to multiple groups easily, with each group feeding off the same information.
The sentry and aiState members are fairly self-explanatory: they contain sentry information and more detailed AI-state information for doing detailed pattern matching.

Putting it together
Now that we have some structures for our groups, what's next? The next step is to figure out how you are going to use this information in the routines of your game.
It's crucial that you carefully plan for your AI routines to be modular and flexible, so that you can add to them later and easily call different pieces. The concept here, as in data structure organization, is to do each thing in only one function, and to keep that function limited so that it does one specific thing. Then, if you need to do that thing again, you can call the routine that is already tested and is a known single point of operation on that data. Later, if you run into problems with your AI and need to debug it, you won't have to go hunting all over the place for the offending routine, because there is only one routine that operates on that data, or at least in that manner.

Tips
Walk before you run. Learn the basics, create the basics, add more advanced routines for your particular game as you need them.
Never fall into the trap of using a routine because it is popular and everyone else seems to be using it. Often other people are using a routine because its benefits fit their needs - but it may not fit yours. Furthermore, people often use routines just because they are the standard, even if they are not actually the best routines for their situation.
Your concern should always be getting the best results, not having the currently fashionable routines; if it works, use it. The game developer mantra used to be, and always should be, "If it looks right, it is right." Don't let the people who are interested in designing total real-world physics simulations make you feel bad for creating a simple positional system where you add X to the position each frame. If that is what works in your particular situation, then do it. Correct physics has its place in some games, but not all of them, and there are other ways of achieving nearly the same results.
You can never build a perfect replica of reality. That is just a fact. So you need to draw your own line on where good enough is and
then make your good enough reality.
Volume II: Unit Goals and Path Finding

Date this article was posted to GameDev.net: 10/12/1999



See Also:
Artificial Intelligence:Gaming

A Practical Guide to Building a Complete Game AI: Volume II


by Geoff Howland
Artificial Intelligence (AI) is based on making intelligent-looking decisions; for the units in our games to look intelligent, they have to perform actions that seem reasonable for the situations they are in.
In a Real-Time Strategy (RTS) type game these actions would consist of moving, patrolling, avoiding obstacles, targeting enemies and pursuing them. Let's take a look at what it would take to implement each of these actions.

Movement
Moving, in its most basic form, consists of simply advancing from one set of coordinates to another over a period of time. This can be performed easily by finding the direction vector to the destination and multiplying it by the speed the unit is moving and the time since we last calculated the position.
Because we are working from a mouse-based input system, we don't expect the user to make all the movements around obstacles like they would with a joystick or in a first-person shooter. The way to keep the user from having to click their way around obstacles is to create an action queue, so that more than one action in a row can be completed. This way, if a path has to avoid an obstacle, we can add the additional path legs in front of the final destination to walk the unit around the obstacle without player intervention.
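A minimal sketch in C of the movement step just described - the direction vector scaled by speed and elapsed time. The struct Position fields and function name are illustrative (the articles never show their own struct Position):

#include <math.h>

struct Position { float x, y; };

/* Advance a unit toward its destination by speed * dt along the
   normalized direction vector. Returns 1 on arrival. */
int moveToward(struct Position *pos, struct Position dest,
               float speed, float dt)
{
    float dx = dest.x - pos->x;
    float dy = dest.y - pos->y;
    float dist = sqrtf(dx * dx + dy * dy);
    float step = speed * dt;

    if (dist <= step) {               /* close enough: snap to the goal */
        *pos = dest;
        return 1;
    }
    pos->x += dx / dist * step;       /* move along the unit direction */
    pos->y += dy / dist * step;
    return 0;
}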

Patrolling


Patrolling consists of moving to a series of specified positions in order. When a unit has reached a destination and has nowhere else to go, we can compare his current position to his list of patrol points and set a new destination: the point after the one he is closest to.
There isn't a lot to this, but having units moving on the screen, as opposed to standing still and waiting, makes the world look a lot more alive, gives the units less chance of being snuck up on, and gives them more chance of catching intruders.
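Following that description, here is a sketch of how a unit might resume its patrol route, reusing the struct Position from the movement sketch above; the names are illustrative:

/* Resume a patrol: find the patrol point closest to the unit and
   head for the one after it, wrapping around the route.
   (Uses struct Position from the movement sketch.) */
int nextPatrolPoint(struct Position pos,
                    const struct Position *patrol, int numPoints)
{
    int closest = 0;
    float best = 1e30f;
    for (int i = 0; i < numPoints; i++) {
        float dx = patrol[i].x - pos.x;
        float dy = patrol[i].y - pos.y;
        float d2 = dx * dx + dy * dy;   /* squared distance is enough */
        if (d2 < best) { best = d2; closest = i; }
    }
    return (closest + 1) % numPoints;   /* the point after the closest */
}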

Obstacle Avoidance


Avoidance algorithms require an understanding of how your maps
are going to work and how you want your units to interact with them
while moving around. In our case we are going to assume an outdoor
environment with relatively small and simple obstacles, such as small
buildings and objects. We will also assume that you cannot go inside
an obstacle and that obstacles are convex polygons with 4 vertices.
In an environment such as this we are mostly dealing with open
movement, and obstacles can be avoided with only 1-2
avoidance movements.

In the example to the left we have a very thin 4-point poly that is
between the unit and the destination. In this case we move away
from the vertex closest to the destination, moving several units
out along the angle perpendicular to the unit's collision. This gives
us the buffer space we need to move around the obstacle.
Obstacle avoidance has to be designed around the type of obstacles
you are going to be providing. In this simple example we are going
to use convex 4-point polygons, which will usually be in a diamond
or square shape. Because of these obstacle limitations we can simply
find the vertex closest to the destination the unit is trying to get to
and move out from the obstacle a little, thereby creating a simple way
to avoid obstacles that works fairly well as long as obstacles don't
get too close together.
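One plausible way to code that push-out step, again using the hypothetical Vec2 type; the buffer distance is an invented tuning constant:

    #include <cmath>

    // Given a 4-vertex convex obstacle and the unit's destination, return a
    // waypoint just outside the vertex nearest the destination.
    Vec2 avoidanceWaypoint(const Vec2 verts[4], const Vec2 &dest, float buffer)
    {
        Vec2 center = { (verts[0].x + verts[1].x + verts[2].x + verts[3].x) / 4.0f,
                        (verts[0].y + verts[1].y + verts[2].y + verts[3].y) / 4.0f };
        int best = 0;
        float bestD2 = 1e30f;
        for (int i = 0; i < 4; ++i) {        // vertex closest to the destination
            float dx = dest.x - verts[i].x, dy = dest.y - verts[i].y;
            float d2 = dx * dx + dy * dy;
            if (d2 < bestD2) { bestD2 = d2; best = i; }
        }
        float ox = verts[best].x - center.x; // push outward from the obstacle
        float oy = verts[best].y - center.y;
        float len = std::sqrt(ox * ox + oy * oy);
        if (len == 0.0f) return verts[best]; // degenerate obstacle: no push
        return Vec2{ verts[best].x + ox / len * buffer,
                     verts[best].y + oy / len * buffer };
    }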
When you want to get into more advanced pathfinding
algorithms, you should look into A*, a popular algorithm
for finding the shortest path through very maze-like areas. Beyond
A* there are various steering algorithms for gradually moving
around obstacles, as well as more hard-coded solutions, such as creating funneling intersections that can be used to
get to different areas of the map.


Targeting Enemies
Targeting other units will greatly depend on what you want
your player to be doing in your game. You may want your
player's units to automatically fire on enemies they see, so that
the player can devote themselves to the big picture. Or you may
want your units to attack only when specifically told to, keeping
your player's attention on the units and their surroundings.
Either way, you will want your enemies to be on the lookout
for the player's units, to provide the challenge of on-the-ball
enemies.
In a situation where you have split up your directions 8 ways,
you can assume that for a unit to be facing another unit within
vision range, the target must be either directly in front of the unit or
in one of the adjacent directions. A simple test to determine whether
the target is within maximum sight distance and is in one of
these three directions from the unit can give you good results
with a minimal number of test cases.
Of course you will want to add a test for obstacles to see if the units are blocked; most likely you can use the same
test as for obstacle avoidance, since you usually either have a clear path or you don't. Adding height to the visibility
testing will of course totally change the nature of these tests, but that is for another article.
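A sketch of the three-direction visibility test; the 0-7 direction encoding (0 = east, counting counterclockwise) is our assumption, not the article's:

    #include <algorithm>
    #include <cmath>
    #include <cstdlib>

    // True if target is within sight range and within one direction step of facing.
    bool canSee(const Vec2 &pos, int facing, const Vec2 &target, float sightRange)
    {
        float dx = target.x - pos.x, dy = target.y - pos.y;
        if (dx * dx + dy * dy > sightRange * sightRange)
            return false;                    // too far away to see
        const float kPi = 3.14159265f;
        // Quantize the bearing to the target into one of 8 directions.
        int dir = (int)std::floor(std::atan2(dy, dx) / (2.0f * kPi / 8.0f) + 0.5f);
        dir = ((dir % 8) + 8) % 8;
        int diff = std::abs(dir - facing);
        diff = std::min(diff, 8 - diff);     // wrap-around difference
        return diff <= 1;                    // facing, or one of the adjacent directions
    }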

Pursuit
Once your enemies have found a target, you won't want them to just wander around aimlessly if they lose sight of
their prey. At this point, though, you need to make a choice about how you wish to handle searching. Up until
now we have only talked about spotting units based on actually being able to see them. In some cases this may get a
little tricky when pursuing an enemy, so you may opt to cheat and just set the position of the unit being tracked as
the tracker's destination.


If you wish to keep things more realistic and do less "cheating", then you
need to store the last position the target was seen at, to give the unit a place
to start searching for it. For our example we will just take a random search
approach. First you would set the last position the target was seen at as the
first destination. Then, as the next destination, you would pick a point a random
distance away, along the direction the target was from the unit before
it lost sight of the target.
In this way we assume that the target ran away from the unit, and if we are
correct, the unit will hopefully find him quickly after passing the first
destination. In case the unit did not find his target, we can make a backup
plan of setting a patrol at random distances around the first
destination. The unit will then go back to the spot where his target was last
seen and walk in a pattern, searching for him.
While this doesn't cover a lot of possibilities, it does give us a reasonable
response given the situation.
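A sketch of setting up that two-stage search, reusing the hypothetical Unit type from the movement sketch (the overshoot distances are arbitrary):

    #include <cmath>
    #include <cstdlib>

    // Queue up: go to the last seen position, then a random distance further
    // along the bearing from where the tracker lost sight of the target.
    void beginSearch(Unit &tracker, const Vec2 &lastSeen, const Vec2 &posWhenLost)
    {
        tracker.waypoints.clear();
        tracker.waypoints.push_back(lastSeen);   // first, the last known position
        float dx = lastSeen.x - posWhenLost.x;
        float dy = lastSeen.y - posWhenLost.y;
        float len = std::sqrt(dx * dx + dy * dy);
        if (len > 0.0f) {
            float extra = 50.0f + (float)(std::rand() % 100);  // random overshoot
            tracker.waypoints.push_back(Vec2{ lastSeen.x + dx / len * extra,
                                              lastSeen.y + dy / len * extra });
        }
    }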

Conclusion
The secret to implementing all game AI is understanding the cases
you are trying to deal with and what you want the results to look like. If you can picture what you want the
actions to look like and formulate an algorithm to make them turn out that way, you are 90% of the way there.
However, the last 10%, getting it to work, can easily take 10 times as long as figuring out how to do it…

-Geoff Howland
Lupine Games

The first article: Practical Guide to Building a Complete Game AI: Volume I


AI for Games and Animation: A Cognitive Modeling Approach
by John Funge
Gamasutra, December 6, 1999
Excerpted from AI for Games and Animation (AK Peters, 1999)

Modeling for computer games addresses the challenge of automating a variety of difficult development tasks. An
early milestone was the combination of geometric models and inverse kinematics to simplify keyframing. Physical
models for animating particles, rigid bodies, deformable solids, fluids, and gases have offered the means to generate
copious quantities of realistic motion through dynamic simulation. Biomechanical modeling employs simulated
physics to automate the lifelike animation of animals with internal muscle actuators. Research in behavioral
modeling is making progress towards self-animating characters that react appropriately to perceived environmental
stimuli. It has remained difficult, however, to instruct these autonomous characters so that they satisfy the
programmer's goals. Hitherto absent in this context has been a substantive apex to the computer graphics modeling
pyramid (Figure 1), which we identify as cognitive modeling.

Figure 1. Cognitive modeling is the new apex of the CG modeling hierarchy

Cognitive models go beyond behavioral models, in that they govern what a character knows, how that
knowledge is acquired, and how it can be used to plan actions. Cognitive models are applicable to
instructing the new breed of highly autonomous, quasi-intelligent characters that are beginning to find
use in interactive computer games. Moreover, cognitive models can play subsidiary roles in controlling
cinematography and lighting. See the color plates at the end of this article for some screenshots from
two cognitive modeling applications.
We decompose cognitive modeling into two related sub-tasks: domain knowledge specification and
character instruction. This is reminiscent of the classic dictum from the field of artificial intelligence
(AI) that tries to promote modularity of design by separating out knowledge from control.
knowledge + instruction = intelligent behavior
Domain (knowledge) specification involves administering knowledge to the character about its world
and how that world can change. Character instruction involves telling the character to try to behave in
a certain way within its world in order to achieve specific goals. Like other advanced modeling tasks,
both of these steps can be fraught with difficulty unless developers are given the right tools for the
job.
Background

The situation calculus is the mathematical logic notation we will be using. It has many advantages
in terms of clarity and being implementation agnostic, but it is somewhat of a departure from the
repertoire of mathematical tools commonly used in computer graphics. We shall therefore overview in
this section the salient points of the situation calculus, whose details are well-documented in the book
[Funge99] and elsewhere [LRLLS97,LLR99]. It is also worth mentioning that from a user's point of
view the underlying theory can be hidden. In particular, a user is not required to type in axioms
written in first-order mathematical logic. Instead, we have developed an intuitive high-level
interaction language, CML (Cognitive Modeling Language), whose syntax employs descriptive keywords,
but which has a clear and precise mapping to the underlying formalism (see the book [Funge99], or
the website www.cs.toronto.edu/~funge, for more details).

The situation calculus is an AI formalism for describing changing worlds using sorted first-order logic.
A situation is a "snapshot" of the state of the world. A domain-independent constant s0 denotes the
initial situation. Any property of the world that can change over time is known as a fluent. A fluent is a
function, or relation, with a situation term (by convention) as its last argument. For example,
Broken(x, s) is a fluent that keeps track of whether an object x is broken in a situation s.
Primitive actions are the fundamental instrument of change in our ontology. The sometimes
counter-intuitive term "primitive" serves only to distinguish certain atomic actions from the "complex",
compound actions that we will define later. The situation s' resulting from doing action a in situation
s is given by the distinguished function do, so that s' = do(a,s). The possibility of performing action a
in situation s is denoted by a distinguished predicate Poss(a,s). Sentences that specify what the state
of the world must be before performing some action are known as precondition axioms. For example,
it is possible to drop an object x in a situation s, if and only if a character is holding it:

Poss(drop(x), s) <=> Holding(x, s)
The effects of an action are given by effect axioms. They give necessary conditions for a fluent to take
on a given value after performing an action. For example, the effect of dropping a fragile object x is
that the object ends up being broken:

Fragile(x, s) => Broken(x, do(drop(x), s))
Surprisingly, a naive translation of effect axioms into the situation calculus does not give the expected
results. In particular, stating what does not change when an action is performed is problematic. This is
called the "frame problem" in AI. That is, a character must consider whether dropping a cup, for
instance, results in, say, a vase turning into a bird and flying about the room. For mindless animated
characters, this can all be taken care of implicitly by the programmer's common sense. We need to
give our thinking characters this same common sense. They need to be told that they should assume
things stay the same unless they know otherwise. Once characters in virtual worlds start thinking for
themselves, they too will have to tackle the frame problem. The frame problem has been a major
reason why approaches like ours have not previously been used in computer animation or until
recently in robotics. Fortunately, the frame problem can be solved provided characters represent their
knowledge with the assumption that effect axioms enumerate all the possible ways that the world can
change. This so-called closed world assumption provides the justification for replacing the effect
axioms with successor state axioms. For example, the following successor state axiom says that,
provided the action is possible, then a character is holding an object if and only if it just picked up the
object or it was holding the object before and it did not just drop the object:

Poss(a, s) => [Holding(x, do(a, s)) <=> a = pickup(x) v (Holding(x, s) ^ a != drop(x))]
Character Instruction

We distinguish two broad possibilities for instructing a character on how to behave: predefined
behavior and goal-directed behavior. Of course, in some sense, all of a character's behavior is defined
in advance by the animator/programmer. Therefore, to be more precise, the distinction between
predefined and goal-directed behavior is based on whether the character can nondeterministically
select actions or not.
What we mean by nondeterministic action selection is that whenever a character chooses an action it
also remembers the other choices it could have made. If, after thinking about the choices it did make,
the character realizes that the resulting sequence of actions will not result in a desirable outcome, then
it can go back and consider any of the alternative sequences of actions that would have resulted from a
different set of choices. It is free to do this until it either finds a suitable action sequence, or exhausts
all the (possibly exponential number of) possibilities.
A character that can nondeterministically select actions is usually a lot easier to instruct, but has a
slower response time. In particular, we can tell a cognitive character what constitutes a "desirable
outcome" by giving it goals, and it can then use its background domain knowledge to figure out
whether it believes a given action sequence will achieve those goals or not. Although we are using the
word "nondeterministic" in a precise technical sense, the trade-off between execution speed and
programming effort should already be a familiar and intuitive concept for many readers.
A third possibility we will consider is something of a compromise between the two extremes of
predefined and goal-directed behavior. In particular, we introduce the notion of complex actions and
explain how they can be used to provide goals, and a "sketch plan" for how to achieve those goals.
Before we continue, it is worth pointing out that sometimes people identify a particular class of
programming languages with a particular kind of behavior. For example, logic programming languages
are often associated with nondeterministic goal-directed behavior, and regular imperative languages
with deterministic predefined behavior. While it is true that logic programming languages have built-in
support for nondeterministic programming, there is nothing to stop us implementing either kind of
behavior in any programming language we choose (assuming it is Turing complete). To avoid
unnecessary confusion, we shall not tie the following discussion to any particular programming
languages.


Predefined Behavior

There are many convenient techniques we can use to predefine a character's behavior. In this article, however, we
are more interested in techniques for which the character's behavior is not completely determined in advance.
Therefore, we shall not attempt a comprehensive survey of techniques for predefining behavior. Instead, we shall
take a brief look at two particularly popular approaches: reactive behavior rules, and hierarchical finite-state
machines (HFSM).

Reactive Behavior Rules

We will use the term reactive behavior when a character's behavior is based solely on its perception of the current
situation. What we mean by this is that the character has no memory of previous situations it has encountered. In
particular, there is no representation of its own internal state and so it will always react in the same way to the same
input stimuli, regardless of the order in which the inputs are received. A simple way to encode reactive behavior is
as a set of stimulus-response rules. This has a number of important advantages:
● Although the set of rules might be short, and each of the rules very simple, that doesn't
necessarily mean the behavior that results from the character following the rules is simple at all.
That is, we can often capture extremely sophisticated behavior with some simple rules.
● We can usually evaluate the rules extremely quickly so there should be no problem obtaining
real-time response from our characters.
● There is no need to worry about various knowledge representation issues that arise when
characters start thinking for themselves. That is, the characters are not doing any thinking for
themselves; we have done it all for them, in advance.

The use of reactive behavior rules was also one of the first approaches proposed for generating
character behaviors, and it is still one of the most popular and commonplace techniques. Great
success has been obtained in developing rule sets for various kinds of behavior, such as flocking and
collision avoidance. As an example of a simple stimulus-response rule that can result in extremely
sophisticated behavior, consider the following rule: keep a wall on your left-hand side and keep moving.

Believe it or not, this simple "left-hand rule" will let a character find its way through a maze. It is an
excellent example of how one simple little rule can be used to generate highly complex behavior. The
character that follows this rule doesn't need to know it is in a maze, or that it is trying to get out. It
blindly follows the rule and the maze-solving ability simply "emerges". Someone else did all the
thinking about the problem in advance and managed to boil the solution down to one simple
instruction that can be executed mindlessly. This example also shows how difficult thinking up these
simple sets of reactive behavior rules can be. In particular, it is hard to imagine being the one who
thought this rule up in the first place, and it even requires some effort to convince oneself that it
works.
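As a rough sketch of how such a wall-following rule might be executed on a grid maze (the direction encoding and the canMove query are our own assumptions, not the book's):

    enum Dir { North, East, South, West };              // clockwise order

    Dir left(Dir d)  { return (Dir)((d + 3) % 4); }     // counterclockwise turn
    Dir right(Dir d) { return (Dir)((d + 1) % 4); }

    // One step of wall-following: prefer left, then straight, then right,
    // then turn around. canMove(d) asks whether the adjacent cell in
    // direction d is open.
    template <typename CanMove>
    Dir nextDirection(Dir facing, CanMove canMove)
    {
        if (canMove(left(facing)))  return left(facing);
        if (canMove(facing))        return facing;
        if (canMove(right(facing))) return right(facing);
        return right(right(facing));                    // dead end: go back
    }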

We can thus see that despite some of the advantages, there are also some serious drawbacks to using
sets of reactive behavior rules:
● The biggest problem is thinking up the correct set of rules that leads to the behavior we want. It
can require enormous ingenuity to think of the right set of rules and this can be followed by
hours of tweaking parameters to get things exactly right.
● The difficult and laborious process of generating the rules will often have to be repeated, at least
in part, every time we want to effect even a slight change in the resulting behavior.
● Since the behavior rules are deterministic, once an action is chosen, there is no way to
reconsider the choice. There are many cases when a cognitive character could use its domain
knowledge to quickly anticipate that an action choice is not appropriate. An autonomous
character has no ability to make such judgments and, regardless of how appropriate it is, must
blindly follow the predefined behavior rules that pertain to the current situation.
● When there are many rules it is quite likely their applicability will overlap and they could give
conflicting suggestions on which action to choose. In such cases some conflict resolution strategy
must be employed.

It is often easier to write a controller if we can maintain some simple internal state information for the
character. One popular way to do this is with HFSMs, which we discuss in the next section.
Hierarchical Finite-state Machines (HFSM)

Figure 2. The WhichDir FSM

Finite-state machines (FSMs) consist of a set of states (including an initial state), a set of inputs, a set
of outputs, and a state transition function. The state transition function takes the input and the current
state and returns a single new state and a set of outputs. Since there is only one possible new state,
FSMs are used to encode deterministic behavior. It is commonplace, and convenient, to represent
FSMs with state transition diagrams. A state transition diagram uses circles to represent the states and
arrows to represent the transitions between states. Figure 2 depicts an FSM that keeps track of which
compass direction a character is heading each time it turns "left".
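To make the idea concrete, here is a rough guess at the WhichDir machine in code; the state ordering is our reading of the figure, not taken from the book:

    enum Heading { N, W, S, E };            // counterclockwise order

    // The transition function: on input "left", output the new heading.
    Heading turnLeft(Heading h)
    {
        return (Heading)((h + 1) % 4);      // N -> W -> S -> E -> N
    }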
As the name implies, an HFSM is simply a hierarchy of FSMs. That is, each node of an HFSM may itself
be an HFSM. Just like functions and procedures in a regular programming language, this provides a
convenient way to make the design of an FSM more modular. For example, if a character is at
coordinates (x,y), Figure 3 depicts an HFSM that uses the FSM in Figure 2 as a sub-module to calculate
the new cell after turning "left", or moving one cell ahead.


Figure 3. HFSM that uses the WhichDir FSM

HFSMs are powerful tools for developing sophisticated behavior and it is easy to develop graphical user
interfaces to assist in building them. This has made them a popular choice for animators and game
developers alike.
HFSMs maintain much of the simplicity of sets of reactive-behavior rules but, by adding a notion of
internal state, make it easier to develop more sophisticated behaviors. Unfortunately, they also have
some of the same drawbacks. In particular, actions are chosen deterministically and there is no explicit
separation of domain knowledge from control information. This can lead to a solution which is messy,
hard to understand and all but impossible to maintain. Just like reactive-behavior rules, there can also
be a large amount of work involved if we want to obtain even slightly different behavior from an HFSM.


Goal-Directed Behavior

The first step in describing goal-directed behavior is to come up with a way to define a cognitive character's goals.
The situation calculus provides a simple and intuitive theoretical framework to explain how this can be done. In
particular, a character's goals can be expressed in terms of the desired value of various relevant fluents. A goal can
therefore be expressed as a defined fluent, i.e., a fluent defined in terms of other fluents. For example, suppose we
have two characters, call them Dognap and Jack, such that Dognap is armed with a gun, and wants to kill Jack.
Then, we can state that Dognap's goal is to kill Jack:

goal(s) <=> Dead(Jack, s)

Clearly, Dognap will have achieved this goal in any situation s' for which goal(s') is true. We recall that any
situation is either the initial situation s0, or of the form do(a, s) for some action a and situation s.

Therefore, if goal(s0) is not true, then Dognap must search for a sequence of n actions, a0,...,an-1, such that

goal(do(an-1, ... do(a1, do(a0, s0)) ...))

is true.
Situation Tree

To explain how characters can automatically search for sequences of actions that meet their goals, we
will introduce the idea of a situation tree. In particular, we can think of the actions and effects as
describing a tree of possible future situations. The root of the tree is the initial situation s0, each
branch of the tree is an action, and each node is a situation. Figure 4 shows an example of a tree with
n actions, a0,a1...,an-1.



Figure 4. An abstract situation tree

The value of the fluents at each node (situation) is determined by the effect axioms. Figure 5 shows a
simple concrete example using the Dognap and Jack example, and the corresponding effect axioms,
that we described earlier.

Figure 5. A concrete example of a situation tree

A goal situation is a situation in which the goal fluent is true. For example, in Figure 5 we can see that if the goal is
still to kill Jack then the situation

is a goal situation. We can see that in this example there are many goal situations, for example

is another goal situation. In general, however, there is no guarantee that a goal situation exists at all.
If a goal situation does exist, then any action sequence that leads to one of the goal situations is called
a plan.


Figure 6. An abstract situation tree with just three actions.

Figure 6 shows a simple abstract situation tree with just three actions, and three goal situations. We
will use this figure to illustrate how a character can search the tree to automatically find a plan (a
path) that leads from the initial situation (the root) to a goal situation. Depending on how we choose
to search the tree we will find different plans (paths). In particular, we can see some common search
strategies being applied. We can see that a bounded depth-first search strategy finds the plan
[a0,a2,a0], whereas a breadth-first search finds [a1,a2].

A breadth-first search tries exhaustively searching each layer of the tree before proceeding to the next
layer. That is, it considers all plans of length 0, then all plans of length 1, etc. Thus, a breadth-first
search is guaranteed to find a plan if there is one. Moreover it will find the shortest such plan.
Unfortunately, a breadth-first search requires an exponential amount of memory as the character has
to remember all the previous searches.

A depth-first search doesn't require an exponential amount of memory, as there is no need to
explicitly store large portions of the tree. That is, a depth-first search only needs to remember one
branch of the tree at a time. It keeps looking down this one branch until it gets to a goal, or it reaches
a leaf node. If it reaches a leaf-node, it backs up to the previous node and searches another branch. If
there are no more branches, it backs up one step further and proceeds recursively until it has
searched the entire tree. Unfortunately, even if there is a goal in the tree, depth-first search is not
guaranteed to find it. In particular, it is quite likely that the tree will have branches that are infinite.
That is, the character can just keep doing some sequence of actions over and over again, but it never
leads to a goal. A depth-first search can get sidetracked by searching down one of these fruitless
infinite branches. Because it never reaches a goal, or a leaf node, the algorithm never terminates.
Another drawback of a depth-first search is that even if it does find a plan, this plan is not guaranteed
to be the shortest possible plan. Depending on the application, this may or may not be important.
A bounded depth-first search attempts to resolve some of the limitations of a depth-first search by
putting a bound on how deeply in the tree the search can proceed. Now the search backs up if it finds
a leaf node, or if the maximum search depth is exceeded. It is even possible to iteratively search with
a deeper and deeper bound. To avoid redoing the work of the previous search, the results of the last
search can be stored so that we don't have to begin from scratch each time the depth bound is
increased. Unfortunately, we are now back to remembering large portions of the tree and, just like a
breadth-first search, this requires an exponential amount of memory.
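As a sketch of the searches under discussion, here is a bounded depth-first search over an implicit situation tree; the Situation and Action types and the three callbacks are illustrative placeholders, not from the book:

    #include <vector>

    // Returns true and fills 'plan' if a goal situation is reachable from s
    // within 'bound' actions; backs up on leaf nodes and at the depth bound.
    template <typename Situation, typename Action>
    bool boundedDFS(const Situation &s,
                    const std::vector<Action> &actions,
                    bool (*poss)(const Action &, const Situation &),
                    Situation (*result)(const Action &, const Situation &),
                    bool (*goal)(const Situation &),
                    int bound,
                    std::vector<Action> &plan)
    {
        if (goal(s)) return true;            // found a goal situation
        if (bound == 0) return false;        // depth bound exceeded: back up
        for (const Action &a : actions) {
            if (!poss(a, s)) continue;       // precondition axioms prune branches
            plan.push_back(a);
            if (boundedDFS(result(a, s), actions, poss, result, goal,
                           bound - 1, plan))
                return true;
            plan.pop_back();                 // backtrack, try the next action
        }
        return false;
    }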

In the worst case, the situation tree does not contain any goal situations. If this is the case, then any
exhaustive search algorithm will take an exponential amount of time to respond that there is no plan
available to achieve the goal. This is one of the major limitations of planning and is something we will
look at in more detail in the next section. In the meantime, we mention that looking for different
search algorithms is an important topic in AI research and the interested reader should consult the
further reading section. One of the most interesting new developments is the use of stochastic search
algorithms.
It should also now be apparent how choosing actions nondeterministically entails searching for
appropriate action sequences in a search space that potentially grows exponentially. This corresponds
to the usual computer science notion of computational complexity. Another interesting point to note is
that CPU processing power is also growing exponentially. Therefore, according to Moore's law, our
computer characters can be expected to be able to search one layer deeper in the situation tree every
eighteen months or so.


The Middle Ground

As we explained, for predefined behaviors the character doesn't have to do any searching for actions that achieve its
goals. It simply follows the instructions it was given and ends up at a goal situation. In effect, for a given set of
inputs, the path through the tree of possible situations has been determined in advance. If the predefined behaviors
were defined properly, then the path that they specify through the tree will lead to a goal situation.

In this section, the question we want to ask is whether there is some middle ground between asking the character to
do all the work at run-time and asking the programmer to do all the work at compile time. In particular, consider
that on the one hand we have predefined behavior, which corresponds to a single path through the situation tree,
and on the other hand we have goal-directed behavior, which corresponds to searching the whole tree. Clearly, the
middle ground has to be searching some subset of the tree.
Note that this "middle ground" is still technically goal-directed behavior, but we now have control over
how much nondeterminism is allowed in the behavior specification. Only in the limiting case, when we
have removed all the nondeterminism, does the behavior reduce to deterministic predefined behavior.
Precondition Axioms

Although we might not have realized it, we have already seen one way to exclude parts of the
situation tree from the search space. In particular, precondition axioms prune off whole chunks of the
tree by stating that not all actions are possible in all situations. Figure 7 shows an example of an
abstract tree in which it is not possible to do an action a2 because an action a1 changed something
which made it impossible.

Figure 7. Preconditions preclude portions of the tree

While preconditions are important for cordoning off parts of the situation tree, they are a clumsy way
to try and coerce a character to search a particular portion of the tree. In particular, we need a way to
give a character general-purpose heuristics to help it find a goal faster. For example, we might want to
give the character a heuristic that will cause it to look at certain groups of actions first, but we do not
want to absolutely exclude the other actions.

The hard part of exploiting the middle ground between predefined and goal-directed behavior is to
think up a useful way to specify subsets of the tree. In the next section, we will introduce a convenient
way to specify arbitrary subsets of the situation tree to search.
Complex Actions

We would like to provide a character with a "sketch plan" and have it responsible for filling in the
remaining missing details. In this way, we salvage some of the convenience of the planning approach
while regaining control over the complexity of the planning tasks we assign the character. We will
show how we can use the idea of complex actions to write sketch plans.
The actions we discussed previously, defined by precondition and effect axioms, are referred to as
primitive actions. (The term "primitive action" is only meant to indicate an action is an atomic unit,
and not a compound action. Unfortunately, the term can be misleading when the action actually refers
to some sophisticated behavior, but we will stick with the term as it is widely used in the available
literature). Complex actions are abbreviations for terms in the situation calculus; they are built up
from a set of recursively defined operators. Any primitive action is also a complex action. Other
complex actions are composed using various operators and control structures, some of which are
deliberately chosen to resemble a regular programming language. When we give a character a
complex action a, there is a special macro Do that expands a out into terms in the situation calculus.
Since complex actions expand out into regular situation calculus expressions, they inherit the solution
to the frame problem for primitive actions.
Complex actions are defined by the macro Do(a,s,s'), such that s' is a situation that results from doing the
complex action a in situation s. The complete list of operators for the (recursive) definition of Do is given
below. Together, the operators define an instruction language we can use to issue directions to
characters. The mathematical definitions can be difficult to follow, and the reader is encouraged to
consult the book [Funge99], in which we explain the basic ideas more clearly using numerous
examples of complex actions (note there are two freely available implementations of complex actions
that can be studied for a more practical insight into how the macro expansion works--see
www.cs.toronto.edu/~funge/book).


Figure 8. Effect of the complex action on a situation tree

The macro expansion Do(a,s,s') specifies a relation between two situations s and s', such that s' is a
situation that results from doing the complex action a in situation s. In general, there is not a unique
s', so if we have some initial situation s0, a complex action "program", and a bunch of precondition
and effect axioms, then Do(program, s0, s') specifies a subset of the situation tree. Figure 8 shows a
quick example of how a complex action can be used to limit the search space to some arbitrary subset
of the situation tree. The other thing we can see from the figure is that the mathematical syntax can
be rather cryptic. Therefore, in the appendix, we introduce some alternative syntax for defining
complex actions that is more intuitive and easy to read.
On its own, just specifying subsets of the situation tree is not particularly useful. Therefore, we would
normally explicitly mention the goal within the complex action. We shall see many examples of this in
what follows. For now, suppose the complex action "program" is such a complex action. If we can find
any

such that Do(program, s0, s'), then the plan of length n, represented by the actions a0,...,an-1. , is
the behavior that the character believes will result in it obtaining its goals. Finding such an s' is just a
matter of searching the (pruned) situation tree for a suitable goal situation. Since we still end up
searching, research in planning algorithms is just as relevant to this section as to the straight
goal-directed specification section.
Implementation

Note that we defined the notion of a situation tree to help us visualize some important ideas. We do
not mean to suggest that in any corresponding implementation there need be (although, of
course, there may be) any data structure that explicitly represents this tree. In particular, if we
explicitly represent the tree, then we need a potentially exponential amount of memory. Therefore, it
makes more sense to simply build portions of the tree on demand, and delete them when they are no
longer required. In theorem provers and logic programming languages (e.g., Prolog), this is exactly
what happens continually behind the scenes.
Logic programming languages also make it straightforward to under-specify the domain knowledge.
For example, it is perfectly acceptable to specify an initial state that contains a disjunction, e.g.
OnTable(cup,s0) v OnFloor(cup,s0). Later on, we can include information that precludes a previously
possible disjunct, and the character will still make valid inferences without us having to go back and
alter any of the previous information. If we do not need such a sophisticated notion of elaboration
tolerance, then it might be simpler to build a situation tree explicitly. Moreover, if the tree is not too
deep, or if it is heavily pruned, it needn't be excessively large and thus can be fast to search. Whether
such a shallow, or sparse, tree is useful or not will depend on the particular application, but in
computer games and animation there are countless examples where a character with even a moderate
ability to plan ahead can be extremely useful.


A Simple Tutorial Example: Maze Solving

We already looked at some predefined behavior for solving a maze. Let's take a look at a goal-directed approach to
the problem. Of course, since there are well-known predefined behaviors for maze solving, we would not suggest
using a goal-directed approach in a real application. Therefore, this section is simply meant as a tutorial example to
show how some of the different pieces fit together.

Domain Knowledge

Let us suppose we have a maze defined by a predicate Free(c), that holds when, and only when, the grid cell c is
"free". That is, it is within range and is not occupied by an obstacle:

Free((x,y)) <=> 0 <= x < sizex ^ 0 <= y < sizey ^ ~Occupied((x,y))

Occupied(c), sizex, and sizey each depend upon the maze in question. In addition, there are two maze-dependent
constants, start and exit, that specify the entry and exit points of a maze. Figure 9 shows a simple maze and the
corresponding definition.

Figure 9. A simple maze.

We also need to define some functions that describe a path within the maze. We say that the adjacent
cell "North" of a given cell is the one directly above it, and similarly for "South", "East", and "West":

north((x,y)) = (x, y+1)    south((x,y)) = (x, y-1)
east((x,y)) = (x+1, y)     west((x,y)) = (x-1, y)

There are two fluents: position denotes which cell contains the character in the current situation, and
visited denotes the cells the character has previously visited.
The single action in this example is a move action that takes one of four compass directions as a
parameter. It is possible to move in some direction d, provided the cell to which we are moving is free
and has not been visited before:

Poss(move(d), s) <=> Free(d(position(s))) ^ ~member(d(position(s)), visited(s))
Figure 10 shows the possible directions a character can move when in two different situations.

Figure 10. Possible directions to move

A fluent is completely specified by its initial value and its successor-state axiom. For example, the
initial position is given as the start point of the maze, and the effect of moving to a new cell is to
update the position accordingly:

position(s0) = start
position(do(move(d), s)) = d(position(s))

So, for example, in Figure 9, if the character has previously been to the locations marked with the filled
dots, and in situation s the character moves north to the unfilled dot, then we have that position(s) =
(2,0) and that position(do(move(north), s)) = (2,1).
The list of cells visited so far is given by the defined fluent visited. It is defined recursively on the situation to
be the list of all the positions in previous situations (we use standard Prolog list notation):

visited(s0) = []
visited(do(a, s)) = [position(s) | visited(s)]

For example, in Figure 9, when s = do(move(north), do(move(east), do(move(east), s0))), we have that
position(s) = (2,1), and that visited(s) = [(2,0),(1,0),(0,0)].


Character Instruction

We have now completed telling the character everything it needs to know about the concept of a
maze. Now we need to move on and use complex actions to tell it about its goal and any heuristics
that might help it achieve those goals. As a first pass, let's not give it any heuristics, but simply
provide a goal-directed specification of maze-solving behavior. Using complex actions we can express
this behavior elegantly as follows:

while ~(position = exit) do (pi d) move(d)
Just like a regular "while" loop, the above program expands out into a sequence of actions. Unlike a
regular "while" loop, it expands out, not into one particular sequence of actions, but into all possible
sequences of actions. The precondition axioms that we previously stated, and the exit condition of the
loop, define a possible sequence of actions. Therefore, any free path through the maze, which does
not backtrack and ends at the exit position, meets the behavior specification.
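One way to picture the search this specification induces is the following sketch, which realizes "while not at the exit, pick a direction and move" as backtracking over non-self-intersecting grid paths (the Cell type and isFree query are our own stand-ins):

    #include <set>
    #include <vector>

    struct Cell {
        int x, y;
        bool operator<(const Cell &o) const {
            return x < o.x || (x == o.x && y < o.y);
        }
    };

    // Try every direction at each step, undoing the choice when it fails.
    bool solve(Cell pos, Cell exit, bool (*isFree)(Cell),
               std::set<Cell> &visited, std::vector<Cell> &path)
    {
        path.push_back(pos);
        if (pos.x == exit.x && pos.y == exit.y) return true;  // loop guard fails: done
        visited.insert(pos);
        const Cell moves[4] = { {pos.x, pos.y + 1}, {pos.x, pos.y - 1},
                                {pos.x + 1, pos.y}, {pos.x - 1, pos.y} };
        for (const Cell &c : moves) {        // the (pi d) choice of direction
            if (isFree(c) && !visited.count(c) &&
                solve(c, exit, isFree, visited, path))
                return true;
        }
        visited.erase(pos);                  // backtrack: undo this choice
        path.pop_back();
        return false;
    }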

Note that the use of regular programming constructs may initially cause confusion to the reader of the
above code. Most of the work is being done by the nondeterministic choice-of-arguments operator "(pi d)".
The example makes it clear that by "nondeterministic" we do not mean that anything random is
happening; we simply mean that we can specify a large number of possibilities all at once. In
particular, the (pi d) construct should be read as "pick the correct direction d". For the mathematically
inclined, perusing the definitions may serve to alleviate any sense of bewilderment. To make things
even clearer we shall, however, consider the expansion of the complex actions in terms of their
definitions. The expansion is based on the simple maze described previously in Figure 9.

In the initial situation we have position(s0) = start != exit. Thus the guard of the "while" loop holds and we
can try to expand

Do((pi d) move(d), s0, s).

Expanding this out into the full definition gives

exists d . Poss(move(d), s0) ^ s = do(move(d), s0).

However, from the action preconditions for move and the definition of the maze, we can see that the only
directions in which it is possible to move from the start cell are north and east. This leaves us with
s = do(move(north), s0) v s = do(move(east), s0). That is, there are two possible resulting situations.
That is why we refer to this style of program as nondeterministic.

In contrast, in situation s = do(move(north), s0) there is only one possible resulting situation. We have
Do((pi d) move(d), s, s') that expands out into s' = do(move(north), s).

If we expand out the macro from start to finish then, as depicted in Figure 11, our "program" does indeed
specify all paths through the maze.


Figure 11. Valid Behaviors

Although we disallow backtracking in the final path through the maze, the character may use
backtracking when it is reasoning about valid paths. In most of the mazes we tried, the character can
reason using a depth-first search to find a path through a given maze quickly. For example, Figure 12
shows a path through a reasonably complicated maze that was found in a few seconds.

Figure 12. Maze solving in practice

To speed things up, we can start to reduce some of the nondeterminism by giving the character some
heuristic knowledge. For example, we can use complex actions to specify a "best-first" search
strategy. In this approach, we will not leave it up to the character to decide how to search the possible
paths, but constrain it to first investigate paths that head toward the exit. This requires extra lines of
code, but could result in faster execution.
For example, suppose we add an action goodMove(d), such that it is possible to move in a direction d
if it is possible to "move" to the cell in that direction and that cell is closer to the goal than we are now.

Now we can rewrite our high-level controller as one that prefers to move toward the exit position
whenever possible.

At the extreme, there is nothing to prevent us from coding in a simple deterministic strategy such as
the "left-hand" rule. For example, if we introduce a defined fluent dir that keeps track of the direction
the character is traveling, and a function ccw that returns the compass direction counterclockwise to
its argument, then the following complex action implements the left-hand rule.


The important point is that using complex actions does not rule out any of the algorithms one might
consider when writing the same program in a regular programming language. Rather, it opens up new
possibilities for high-level specifications of behavior at a cognitive level of abstraction.



Discussion

Complex actions provide a convenient tool for giving a character "advice" in the form of heuristic rules that will
help it solve problems faster. In general, the search space will still be exponential, but reducing the search space can
make the difference between a character that can plan 5 steps ahead, say, and one that can plan 15 steps ahead. That
is, we can get characters that appear a lot more intelligent.

The possibility also exists for incremental refinement of a specification, perhaps from a high-level specification to
the point where it more closely resembles a controller written using a conventional imperative programming
language. That is, we can quickly create a working prototype by relying heavily on goal-directed specification. If
this prototype is too slow, we can use complex actions to remove more and more of the nondeterminism. If
required, we can even do this to the point where the behavior is completely predefined.
To sum up, if we can think of, or look up, a simple predefined way to produce the behavior we are
interested in, then it makes a lot of sense to use it. This is especially so if we don't think the behavior
will need to be modified very often, or at least if the anticipated modifications are minor ones. It is not
surprising, therefore, that a lot of simple reactive behavior is implemented using simple reactive
behavior rules. For simple reactive behavior, like collision avoidance, it is not hard to think of a small
set of reactive behavior rules that will do the job. Moreover, once we have this set of rules working, it
is unlikely that we will need to modify it.
We have tried to make it clear that one type of behavior can be implemented using a variety of
techniques. We have, therefore, chosen not to classify behavior according to what the character is
trying to achieve, but rather on the basis of the technique used to implement it. The reader should
note however that some others do try to insist that behavior in the real world is of a certain type, and
its virtual world counterpart must therefore be implemented in a particular way. Unfortunately, this
leads to lots of confusion and disagreement among different research camps. In particular, there are
those who advocate using predefined behavior rules for implementing every kind of behavior, no
matter how complex. In the sense that, given enough time and energy it can be done, they are
correct. However, they are somewhat like the traditional animator who scoffs at the use of physical
simulators to generate realistic-looking motion. That is, to the traditional animator a physical simulator
is an anathema. She has an implicit physical model in her head and can use this to make realistic
motion that looks just as good (if not better), and may only require the computer to do some simple
"inbetweening". Compared to the motion that needs a physical simulator to execute, the key-framed
approach is lightning fast. If we could all have the skill of a professional animator there would not be
so much call for physical simulators. Unfortunately, most of us do not have the skill to draw
physically-correct looking motion and are happy to receive all the help we can get from the latest
technology. Even artists who can create the motion themselves might prefer to expend their energies
elsewhere in the creative process.
In the same vein, many of us don't have any idea of how to come up with a simple set of
stimulus-response rules that implement some complex behavior. Perhaps, we could eventually come
up with something, but if we have something else we'd rather do with our time it makes sense to get
the characters themselves to do some of the work for us. If we can tell them what we want them to
achieve, and how their world changes, then perhaps they can figure it out for themselves.
We should also point out that there are those who advocate a cognitive modeling approach for every
kind of behavior, even simple reactive ones. This view also seems too extreme as, to coin a phrase,
there is no point "using a sledgehammer to crack a nut". If we have a simple reactive behavior to
implement, then it makes sense to look for a simple set of predefined rules. Also, if lightning-fast
performance is an absolute must, then we might be forced to use a predefined approach, no matter
how tough it is to find the right set of rules.

Of course, there is a big gray area in which there is no clear answer as to whether we should just stick
with predefined behavior rules or not. In such cases, the choice of how to proceed can depend on
personal preference and the available tools and expertise. Obviously, this article is primarily aimed at
those who decide to go the cognitive modeling route.


Notes

For some basic information on FSMs see [HU79]. For more in-depth information on predefined behavior
techniques, consult [Maes90,BBZ91,Tu99]. There are even some commercial character development packages that
use HFSMs to define character behavior. See [Nayfeh93] for a fascinating discussion on maze-solving techniques.
Many of the classic papers on planning can be found in [AHT90]. See [SK96] for some work on the use of
stochastic techniques for planning. Prolog is the best-known nondeterministic programming language and there are
numerous references; for example, see [Bratko90].

The complex action macro expansion is closely related to work done in proving properties of computer programs
[GM96]. Our definitions are taken from those given in [LRLLS97]. A more up-to-date version, which includes
support for concurrency, appears in [LLR99]. See [Stoy77] for the Scott-Strachey least fixed-point definition of
(recursive) procedure execution.
References

[AHT90] J. Allen, J. Hendler, and A. Tate, editors. Readings in Planning. Morgan Kaufmann, San
Mateo, CA, 1990.
[BBZ91] N. I. Badler, B. A. Barsky, and D. Zeltzer, editors. Making Them Move: Mechanics, Control, and
Animation of Articulated Figures. Morgan Kaufmann, San Mateo, CA, 1991.
[Bratko90] I. Bratko. PROLOG Programming for Artificial Intelligence. Addison Wesley, Reading, MA,
1990.
[Funge99] J. Funge. AI for Games and Animation: A Cognitive Modeling Approach. A. K. Peters. Natick,
MA, 1999.

[GM96] J. A. Goguen and G. Malcolm. Algebraic Semantics of Imperative Programs. MIT Press,
Cambridge, MA, 1995.
[HU79] J. E. Hopcroft and J. D. Ullman. Introduction to Automata Theory, Languages, and
Computation. Addison-Wesley, Reading, MA, 1979.
[LLR99] Y. Lespérance, H. J. Levesque, and R. Reiter. A Situation Calculus Approach to Modeling and
Programming Agents. In A. Rao and M. Wooldridge, editors, Foundations and Theories of Rational
Agency. Kluwer, New York, 1999. (See also: www.cs.toronto.edu/cogrobo)
[LRLLS97] H. Levesque, R. Reiter, Y. Lespérance, F. Lin, and R. Scherl. Golog: A Logic Programming
Language for Dynamic Domains. Journal of Logic Programming, 31:59-84, 1997.

[Maes90] P. Maes (editor). Designing Autonomous Agents: Theory and Practice from Biology to
Engineering and Back. MIT Press, Boston, 1990.
[Nayfeh93] B. A. Nayfeh. "Using a Cellular Automata to Solve Mazes." Dr. Dobb's Journal, February
1993.
[SK96] B. Selman and H. Kautz. "Knowledge compilation and theory approximation." Journal of the
ACM, 43(2):193-224, 1996.
Letters to the Editor: [Stoy77] J. E. Stoy. Denotational Semantics: The Scott-Strachey Approach to Programming Language
Write a letter Theory. MIT Press, Cambridge, MA, 1977.
View all letters
[Tu99] X. Tu. Artificial Animals for Computer Animation: Biomechanics, Locomotion, Perception, and

http://www.gamasutra.com/features/19991206/funge_07.htm (1 of 2) [25/06/2002 1:58:04 PM]


Gamasutra - Features - "AI for Games and Animation" [12.6.99]
Behavior. ACM Distinguished Ph.D Dissertation Series, Springer-Verlag, 1999.
John Funge recently joined a research group at Sony Computer Entertainment America
(SCEA) that investigates software issues related to the PlayStation. Previously John was a
member of Intel's microcomputer research lab. He received a B.Sc. in Mathematics from
King's College London in 1990, an M.Sc. in Computer Science from Oxford University in
1991, and a Ph.D. in Computer Science from the University of Toronto in 1997. For his Ph.D.
John successfully developed a new approach to high-level control of characters in games
and animation. John is the author of numerous technical papers and his new book "AI for
Games and Animation: A Cognitive Modeling Approach" is one of the first to take a serious
look at AI techniques in the context of computer games and animation. His current research
interests include computer animation, computer games, smart networked devices, interval
arithmetic and knowledge representation.


AI In Empire-Based Games
Courtesy of Amit Patel
http://www-cs-students.stanford.edu/~amitp

From: Free at last!


To: patel@shell.com
Subject: RE: Space Empire Games
Date: Fri, 23 Jul 93 16:22:48 EDT
Amit -
I just wrote up a bit of a description of Second Conflict for another correspondent. I append it here. It's certainly what I consider Space Empire. I'm trying to clone Second Conflict, which is what my editorial or side comments on extensions or computer strategies refer to.
Thoughts and comments are appreciated. I'm still somewhat overwhelmed by the thought of programming a decent
computer opponent.
Thanks!
Kevin
-------------------
Second Conflict appears to be produced by the folks who run the Galactica BBS. It has multi-player capability
[possibly for BBS users?] but I've never tried that, playing purely Human-Computer conflicts. I can't really
remember the premise, so I'll just talk about the mechanics [which is what I'm trying to clone right now, I'll figure
out my own premise later].
You start by selecting the game parameters. Up to 26 star systems (one per alphabet letter) and up to 10 players.
Each player gets a beginning star system with 10 planets [each planet produces troops], a random number of
Warships, Stealthships and Transports, a certain number of Missiles and Factories, and a random number of system
defenses.
Two basic scenarios are available with the shareware version of Second Conflict. The first is that each player gets
one system. The second is that all systems are divided evenly between all players.
The winner is the player who conquers the whole universe, or who has the largest number of points when the game ends [game length can be selected].
You can choose to build any of the ships, defenses, or factories. You can send scouts to check out other systems.
Each turn every player makes orders (produce X, send ships to system Y, etc) and then all movement and combat
orders are reconciled at the end of the turn.
Stealthships are more powerful but cost 3 points to build versus 1 point for Warships. Fleets can 'Conquer' (fight
until win or die), 'Probe' (attack once then retreat), or 'Raid' (seize transports and/or build points from enemy
systems). Items can be wrecked to retrieve 70% of original points.
Score points are awarded for ships, star systems, planets, and troops belonging to a player at the end of the turn. More points are awarded for star systems owned, followed by planets, and then ships/defenses/factories. [So an obvious decision weight factor comes to mind: conquering a system is a higher priority than building more ships, unless you've got lots of ships.]
In an enemy system, one must first destroy the protecting fleet/defenses. Then you must destroy the enemy troops
occupying the planets. Every turn that you have unconquered planets, the enemy can destroy your ships, possibly reducing the occupation fleet enough that the system overthrows your rule.
So a typical game for me starts out scouting nearby systems while building up my fleet. I try to find the nearest
'neutral' (non-player-occupied) system that has high defenses (usually an indication of a large number of existing
factories; since it costs 5 points per *existing* factory to build a new one, the more already in a system, the better).
Or if there are any nearby enemy systems I send raid fleets to get points to build with [the player's home system has
no production limit; that is if you have extra points you can build as many of X with those points as you can,
whereas other systems can only build as many X as they have factories].
One of the tricks the computer opponent might do is to wreck factories to build stealthships. Since production in the
home system is not limited by number of factories, 1 factory can build several hundred stealthships from the points
recovered by wrecking the other factories. Then the computer can easily conquer several nearby systems, and use
those systems' factories to build. The computer opponent only seems to do this early in the game if there are lots of
nearby neutral systems. I haven't decided why the opponent decides to wreck factories later in the game.
A weakness of the computer opponent is to send most of the fleet to attack a new system, leaving an old system relatively unprotected. If the computer has a small enough fleet, it's possible to occupy the old system with little fear of a successful return take-over.
There are some other parts to the game, but that's pretty much it in a nutshell. The authors have produced a Windows version that has different rules for some of the above (e.g., it's harder to raid). Part of my motivation for making my own version is that I think their Windows interface is a dog, I'd like to learn, and the version I have has some annoying bugs (like with a large game [26 star systems, 5 players] the game will tend to have field overwrite problems, so that all of a sudden one player has got -32000 ships and is completely unconquerable).
Some potential additions include having systems that are rich in metals versus good crop planets, taking the time to
mine planets, colonization versus conquest, spy satellites, more ship types, trade, diplomacy, etc. But I'd like to get
my clone working first and then extend it.

From: Free at last!


To: patel@shell.com, robert@gtx.com
Subject: Medieval SimCity
Date: Fri, 23 Jul 93 16:46:43 EDT
Amit, someone posted this response, which I think pretty much echoes your comments re: realism.
>One tip for the Medieval Sim-City game... Ditch realism (or at least some of
>it) and invent something that will work well from a game balance perspective
>and make the game fun to play. Reality can be a good source of game ideas
>sometimes, other times it can be crippling.
It's sort of sinking in for me that I do need to concentrate a little more on playability rather than strict realism, although history provides a number of ideas that can be incorporated into a game.
You mentioned, for example, that Civilization has emperors living 4500 years! This just so happens to touch on something I was having a little trouble with: figuring out how to handle the transition when the current ruler dies, like what happens if the ruler doesn't produce an heir (or produces multiple heirs?). But it could just be ignored if necessary.


Rob, you wrote:


>I really like the idea of a leader and his group getting dumped in the middle of nowhere.
The basic premise I was working from was the fact that around 600-700 A.D. large numbers of new villages were
founded in portions of Northern Europe that had not been extensively settled. At that time, most of N. Europe was
forested. What I hadn't figured out was how to explain how the potential villagers got there without a path or road,
but I shrugged that off for the moment.
This causes the game to start out as one of resource management, as the village must work to create fields using
existing grain resources. Then, as exploration takes place, they will encounter traders and other villages, heathen and
bandits, etc.
The goal, in my mind, is to start with nothing like this, and develop into a successful large city [possibly the political center of a new country or a bishopric]. Obstacles include the barbarian invasions, trade wars, the black death, and constant war.
>Have you considered hunting & gathering as a potentially
>bountiful resource for small populations, as a springboard until farming
>begins to yield its returns?
Yes, as pigs were typically fed on wild acorns as a major staple, and hunting was a significant contributor of food. I
wanted to get the grain plant/yield ratios settled first but maybe I'm trying to take baby steps that are too small. I may
be trying to make things too complicated by assigning different activities different costs in terms of grain eaten.
I originally thought to require the user to select which activities to use the people on. For example, with 25 starting
people they can plow 5 fields in a season if no one does anything else, but they may end up with no food because
they eat all the grain. On the other hand, 5 people can build houses, 10 people can hunt, and 15 people can plow,
resulting in 3 fields but enough food from hunting, _and_ a place to stay.
>If i recall, farming without basic tools is supposed to be hardly worth
>the effort (unless you live on a flood plain as rich as the ancient Nile).
>Likewise, great skill in farming, even with relatively primitive tools,
>can reap rewards.
Right. The original plow (pre-700 A.D.) was an ox-drawn version of the original stick plow. Ideal for light
Mediterranean soils but poor in the heavier soils of N. Europe. I wanted the village to start out with this plow (and
consequently lower grain yields) plus a two-field rotation system.
Then, after the development of the shoulder-harness for horses, horse shoes and the heavy mouldboard plow circa
750 AD, the village can acquire this knowledge and increase production. The additional use of a three-field rotation
system can also increase production. With increased production the player can then spend more resources for
building a church, grain mills, windmills, trade, and developing more of a city.
Perhaps I'm trying to be too realistic, making it much less fun?
Kevin

From: Free at last!


To: fingon@nullnet.fi
Subject: RE: Space Strategy AI
Date: Mon, 26 Jul 93 11:12:55 EDT
Markus -


>I, too, am creating space strategy game. Only part working 100% now is computer AI, :).
Care to share details? I'm rather lost when it comes to the AI part. Both Amit Patel and Robert Eaglestone have
expressed interest or ideas wrt the AI.
At this point I've done nothing on the AI (leave the hard part for last :). I haven't even thought much on the potential
computer operations, much less how the computer makes decisions between them [nor even how the computer
gathers the data to make the decisions, but that should be easier].
Is your AI data-driven? What computer operations/decision-points do you have, and how does the computer decide
between them? Don't feel that you have to give everything away, any input at all would be helpful at this point.
Thanks!
Kevin

From: Free at last!


To: robert@gtx.com, patel@shell.com
Cc: routley@4gl.enet.dec.com
Subject: Promised info from Markus Stenberg on Computer AI
Date: Wed, 28 Jul 93 16:57:34 EDT
From: US2RMC::"fingon@nullnet.fi" "Markus Stenberg" 27-JUL-1993 15:07:29.00
To: 4gl::routley (Free at last!)
CC:
Subj: Re: Space Strategy AI
> >I, too, am creating space strategy game. Only part working 100% now is
> >computer AI, :).
> Care to share details? I'm rather lost when it comes to the AI part.
> Both Amit Patel and Robert Eaglestone have expressed interest or ideas
> wrt the AI.
I'll write something.. :)
> At this point I've done nothing on the AI (leave the hard part for last :).
> I haven't even thought much on the potential computer operations, much
> less how the computer makes decisions between them [nor even how the
> computer gathers the data to make the decisions, but that should be easier].
Data gathering is simple - at least in my model it uses the same data as players + some statistical data from previous turns..
> Is your AI data-driven? What computer operations/decision-points do you
> have, and how does the computer decide between them? Don't feel that you
> have to give everything away, any input at all would be helpful at this
> point.
The AI I have designed uses mostly data to make decisions - some random chance has been thrown in, too. I think that the AI has to be quite game-specific - at least mine wouldn't work even in VGAPlanets, which is _very_ like my game..
For example about planetary conquest: Computer saves all previous attempts, &c. When ship's turn comes (it
handles em quite easily), it checks out 20 nearest not-own planets, and what kind of success it has had before trying
to conquer em. Then it orders the ship to go to the planet &c. L8er, if some other ship thinks that the same planet is easiest conquest in terms of range/defense, it merges them to fleet before attacking. If planetary defenses were last
time better than the fleet, it just flies to the system & waits until there is great enough force to wipe out the planet.
(Planetary defenses cannot attack, as name implies)
--
Markus Stenberg / fingon@nullnet.fi / Finland / Europe
Public PGP key available on request.
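
As an editorial aside, the target-selection scheme Markus sketches above could be roughed out like this in C++. Everything here (names, weights, the random nudge) is invented for illustration; it is not his code:

#include <cstdlib>
#include <vector>

// Score the 20 nearest non-owned planets by past success against their
// range and defense, with a little randomness thrown in, as described.
struct Planet { float range, defense, pastSuccess; };

int PickTarget(const std::vector<Planet>& nearest20)
{
    int best = -1;
    float bestScore = -1e9f;
    for (int i = 0; i < (int)nearest20.size(); ++i) {
        const Planet& p = nearest20[i];
        float score = p.pastSuccess - p.range - p.defense
                      + (std::rand() % 100) * 0.01f;   // small random nudge
        if (score > bestScore) { bestScore = score; best = i; }
    }
    return best;  // caller merges fleets, or waits if defenses outgun it
}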


Gamasutra - Features - "AI Madness: Using AI to Bring Open-City Racing to Life" [1.24.01]

| | | |

Features
by Joe Adzima
Gamasutra
[Author's Bio] AI Madness: Using AI to Bring Open-City
January 24, 2001
Racing to Life
Angel Studios' Midtown Madness 2 for PC and Midnight Club
for Playstation 2 are open racing games in which players
AI Map: Roads, have complete freedom to drive where they please. Set in
Intersections, and
Open Areas
"living cities," these games feature interactive entities that
include opponents, cops, traffic, and pedestrians. The role of
Curves Ahead: artificial intelligence is to make the behaviors of these
Creating Traffic high-level entities convincing and immersive: opponents
must be competitive but not insurmountable. Cops who spot
City People: you breaking the law must diligently try to slow you down or
Simulating Pedestrians stop you. Vehicles composing ambient traffic must follow all
traffic laws while responding to collisions and other
Simulating Vehicles unpredictable circumstances. And pedestrians must go about their routine business, until you swerve
with Full Physics towards them and provoke them to run for their lives. This article provides a strategy for programmers
who are trying to create AI for open city racing games, which is based on the success of Angel Studios'
Printer Friendly implementation of AI in Midtown Madness 2 and Midnight Club. The following discussion focuses on the
Version autonomous architecture used by each high-level entity in these games. As gameplay progresses, this
autonomy allows each entity to decide for itself how it's going to react to its immediate circumstances.
Discuss this This approach has the benefit of creating lifelike behaviors along with some that were never intended,
Article but add to gameplay in surprising ways.

AI Map: Roads, Intersections, and Open Areas

At the highest level, a city is divided into three primary components for the AI map: roads, intersections, and open areas (see Figure 1). Most of this AI map is composed of roads (line segments) that connect intersections. For our purposes, an intersection is defined as a 2D area in which various roads join. Shortcuts are just like roads, except they are overlaid on top of the three main component types. Shortcuts are used to help the opponents navigate through the various open areas, which by definition have no visible roads or intersections. Each of these physical objects is reflected in a software object.

Figure 1. The AI map elements appear as green and blue line segments for roads and sidewalks, 2D areas for intersections, and additional line segments for shortcuts across open areas.

The road object contains all the data representing a street, in terms of lists of 3D vertices. The main
definition of the road includes the left/right boundary data, the road's centerline, and orientation
vectors defined for each vertex in the definition. Other important road data includes the traffic lane
definitions, the pedestrian sidewalk definition, road segment lengths, and lane width data. A minimum of four 3D vertices is used to define a road, and each list of vertices (for example, center vertices, boundary vertices, and so on) has the same number of vertices.
The intersection object contains a pointer to each connected shortcut and road segment. At
initialization, these pointers are sorted in clockwise order. The sorting is necessary for helping the
ambient traffic decide which is the correct road to turn onto when traversing an intersection. The
intersection object also contains a pointer to a "traffic light set" object, which, as you might guess, is
responsible for controlling the light's sequence between green and red. Other important tasks for this
object include obstacle management and stop-sign control.
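
As a concrete sketch, the road and intersection objects just described might be shaped roughly like this in C++. Every field name here is invented for illustration; this is not Angel Studios' actual code:

#include <vector>

struct Vec3 { float x, y, z; };
struct TrafficLightSet;  // sequences the lights between green and red

// Illustrative road object: parallel vertex lists plus per-road data.
struct Road {
    std::vector<Vec3> center, leftBoundary, rightBoundary; // same length each
    std::vector<Vec3> orientation;        // one orientation vector per vertex
    std::vector<std::vector<Vec3>> trafficLanes;
    std::vector<Vec3> sidewalk;
    std::vector<float> segmentLengths, laneWidths;
    float speedLimit;
};

// Illustrative intersection object: connected roads sorted clockwise.
struct Intersection {
    std::vector<Road*> roads;             // sorted clockwise at init
    std::vector<Road*> shortcuts;
    TrafficLightSet* lights;              // plus obstacle and stop-sign control
};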
Big-city solutions: leveraging the City Tool and GenBAI Tool. Angel's method for creating
extremely large cities uses a very sophisticated in-house tool called the City Tool. Not only does this
tool create the physical representation of the city, but it also produces the raw data necessary for the
AI to work. The City Tool allows the regeneration of the city database on a daily basis. Hence, the city
can be customized very quickly to accommodate new gameplay elements that are discovered in
prototyping, and to help resolve any issues that may emerge with the AI algorithms.
The GenBAI Tool is a separate tool that processes the raw data generated from the City Tool into the
format that the run-time code needs. Other essential tasks that this GenBAI Tool performs include the
creation of the ambient and pedestrian population bubbles and the correlation of cull rooms (discrete
regions of the city) to the components of the road map.

Based on the available AI performance budget and the immense size of the cities, it's impossible to
simulate an entire city at once. The solution is to define a "bubble" that contains a list of all the road
components on the city map that are visible from each cull room in the city, for the purpose of culling
the simulation of traffic and pedestrians beyond a certain distance. This collection of road components
essentially becomes the bubbles for ambient traffic and pedestrians.
The last function of the GenBAI tool is to create a binary version of the data that allows for superfast
load times, because binary data can be directly mapped into the structures.
Data files: setting up races. The AI for each race event in the game is defined using one of two
files: the city-based AI map data file or the race-based AI map data file. The city file contains defaults
to use for all the necessary AI settings at a city level. Each race event in the city includes a race-based
AI map data file. This race file contains replacement values to use instead of the city values. This
approach turns out to be a powerful design feature, because it allows the game designer to set
defaults at a city level, and then easily override these values with new settings for each race.

Some examples of what is defined in these files are the number and definition of the race's opponents,
cops, and hook men. Also defined here are the models for the pedestrians and ambient vehicles to use
for a specific race event. Finally, exceptions to the road data can be included to change the population
fill density and speed limits.
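
A minimal sketch of that default-plus-override lookup, assuming the two data files have already been parsed into simple key/value maps (the names are hypothetical):

#include <map>
#include <string>

// Race values win when present; otherwise fall back to the city default.
float GetAISetting(const std::map<std::string, float>& cityDefaults,
                   const std::map<std::string, float>& raceOverrides,
                   const std::string& key)
{
    auto it = raceOverrides.find(key);
    if (it != raceOverrides.end()) return it->second;
    return cityDefaults.at(key);   // assumes the city file defines every key
}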


Curves Ahead: Creating Traffic

Following rails and cubic spline curves. During normal driving conditions, all the ambient vehicles are positioned and oriented by a 2D spline curve. This curve defines the exact route the ambient traffic will drive in the XZ-plane. We used Hermite curves because the defining parameters, the start and end positions and the directional vectors, are easy to calculate and readily available.

Since the lanes for ambient vehicles on each road are defined by a list of vertices, a road subsegment can easily be created between each vertex in the list. When the ambient vehicle moves from one segment to the next, a new spline is calculated to define the path the vehicle will take. Splines are also used for creating recovery routes back to the main rail data. These recovery routes are necessary for recovering the path after a collision or a player-avoidance action has sent the ambient vehicle off the rail. Using splines enables the ambient vehicles to drive smoothly through curves typically made up of many small road segments and intersections.
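
To make the curve concrete, here is a minimal 2D Hermite evaluation of the kind described, written with illustrative names; the defining parameters are exactly the start/end positions and tangents mentioned above:

struct Vec2 { float x, z; };

// Evaluate a Hermite curve at t in [0, 1] given endpoints p0, p1 and
// tangent (directional) vectors t0, t1.
Vec2 Hermite(Vec2 p0, Vec2 p1, Vec2 t0, Vec2 t1, float t)
{
    float t2 = t * t, t3 = t2 * t;
    float h00 =  2*t3 - 3*t2 + 1;   // weight of the start position
    float h10 =      t3 - 2*t2 + t; // weight of the start tangent
    float h01 = -2*t3 + 3*t2;       // weight of the end position
    float h11 =      t3 - t2;       // weight of the end tangent
    return { h00*p0.x + h10*t0.x + h01*p1.x + h11*t1.x,
             h00*p0.z + h10*t0.z + h01*p1.z + h11*t1.z };
}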
Setting the road velocity: the need for speed. Each road in the AI map has a speed-limit parameter for determining how fast ambient vehicles are allowed to drive on that road. In addition, each ambient vehicle has a random value for determining the amount it will drive over or under the road's speed limit. This value can be negative or positive to allow the ambient vehicles to travel at different speeds relative to each other.

When a vehicle needs to accelerate, it uses a randomly selected value between 5 and 8 m/s^2. At other times, when an ambient vehicle needs to decelerate, perhaps because of a stop sign or red light, the vehicle calculates a deceleration value based on attaining the desired speed in 1 second. The deceleration is calculated by

a = (V^2 - V0^2) / (2(X - X0))

where V is the target velocity, V0 is the current velocity, and (X - X0) is the distance required to perform the deceleration.
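
In code, that deceleration rule is one line; a sketch with illustrative names:

// Returns the (negative) acceleration needed to hit targetSpeed over the
// remaining distance; distance must be positive.
float DecelToReach(float targetSpeed, float currentSpeed, float distance)
{
    return (targetSpeed * targetSpeed - currentSpeed * currentSpeed)
           / (2.0f * distance);
}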
Detecting collisions. With performance times being so critical, each ambient vehicle can't test all the other ambient vehicles in its obstacle grid cell. As a compromise between speed and comprehensiveness, each ambient vehicle contains only a pointer to the next ambient vehicle directly in front of it in the same lane. On each frame, the ambient checks whether the next ambient vehicle is too close. If it is, the ambient in back will slow down to the speed of the ambient in front. Later, when the ambient in front gets far enough away, the one in back will try to resume a different speed based on the current road's speed limit.

By itself, this simplification creates a problem with multi-car pileups. The problem can be solved by stopping the ambient vehicles at the intersections preceding the crash scene.
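
A sketch of that per-frame follow check, with illustrative names and thresholds:

struct Vehicle {
    Vehicle* nextInLane;    // car directly ahead in the same lane
    float roadDistance;     // distance travelled along the current road
    float speed, targetSpeed;
    float speedLimit;       // from the current road
    float speedBias;        // this car's random over/under-limit amount
};

void UpdateFollow(Vehicle& v, float minGap, float resumeGap)
{
    if (!v.nextInLane) return;
    float gap = v.nextInLane->roadDistance - v.roadDistance;
    if (gap < minGap)
        v.targetSpeed = v.nextInLane->speed;        // match the car ahead
    else if (gap > resumeGap)
        v.targetSpeed = v.speedLimit + v.speedBias; // resume own pace
}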
Crossing the intersection. Once an ambient vehicle reaches the end of a road, it must traverse an
intersection. To do this, each vehicle needs to successfully gain approval from the following four
functional groups.
First, the ambient vehicle must get approval from the intersection governing that road's "traffic
control." Each road entering an intersection contains information that describes the traffic control for
that road. Applicable control types are NoStop, AllwaysStop, TrafficLight, and StopSign (see
Figure 2). If NoStop is set, then the ambient vehicle gets immediate approval to proceed through the
intersection. If AllwaysStop is set, the ambient never gets approval to enter the intersection. If
TrafficLight is set, the ambient is given approval whenever its direction has a green light. If
StopSign is set, the ambient vehicle that has been waiting the longest time is approved to traverse
the intersection.

Figure 2. In this case, the TrafficLight class is set to red for some vehicles, which stop and wait. The other vehicles with green/yellow lights get permission to cross the intersection. The vehicle crossing in the left lane decides to turn left, while the vehicle in the right lane goes straight.

The second approval group is the accident manager. The accident manager keeps track of all the
ambient vehicles in the intersection and the next upcoming road segment. If there are any accidents
present in these AI map components, then approval to traverse the intersection is denied. Otherwise,
the ambient vehicle is approved and moves on to the third stage.
The third stage requires that the road which the ambient is going to be on after traversing the
intersection has the road capacity to accept the ambient vehicle's entire length, with no part of the
vehicle sticking into the intersection.
The fourth and final approval comes from a check to see if there are any other ambient vehicles trying
to cross at the same time. An example of why this check is necessary is when an ambient vehicle is
turning from a road controlled by a stop sign onto a main road controlled by a traffic light. Since the
approval of the stop sign is based on the wait time at the intersection, the vehicle that's been waiting
longest would have permission to cross the intersection -- but in reality that vehicle needs to wait until
the cars that have been given permission by the traffic light get out of the way.
Selecting the next road. When an ambient vehicle reaches the end of the intersection, the next
decision the vehicle must make is which direction to take. Depending on its current lane assignment,
the ambient vehicle selects the next road based on the following rules (see Figure 2):
● If a vehicle is in the far-left lane, it can go either left or straight.
● If it's in the far-right lane, it can go either right or straight.
● If it's in any of the center lanes, then it must go straight.
● If it's on a one-way road, then it picks randomly from any of the outgoing roads.
● If it's on a freeway intersection where on-ramps merge with the main freeway traffic, then it
must always go right.
● U-turns are never allowed, mostly because a splined curve in this situation would not look
natural.

Since the roads are sorted in clockwise order, this simplifies selection of the correct road. For example,
to select the road to the left, just add 1 to the current road's intersection index value (the ID number
of that road in the intersection road array). To pick the straight road, add 2. To go right, just subtract
1 from the road's intersection index value.
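
That index arithmetic might look like the following sketch; the +1/+2/-1 offsets come straight from the text, while the modulo wrap (for a typical four-way intersection) is an assumption of mine:

enum class Turn { Left, Straight, Right };

int NextRoadIndex(int currentIndex, int numRoads, Turn turn)
{
    switch (turn) {
        case Turn::Left:     return (currentIndex + 1) % numRoads;
        case Turn::Straight: return (currentIndex + 2) % numRoads;
        case Turn::Right:    return (currentIndex - 1 + numRoads) % numRoads;
    }
    return currentIndex;
}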
Changing lanes. On roads that are long enough, the ambient vehicles will change lanes in order to
load an equal number of vehicles into each lane of the road. When the vehicle has traveled to the point
that triggers the lane change (usually set at 25 percent of the total road length), the vehicle will
calculate a spline that will take it smoothly from its current lane to the destination lane.

The difficulty here is in setting the next-vehicle pointer for collision detection. The solution is to have a
next-vehicle pointer for each possible lane of the road. During this state, the vehicle is assigned to two
separate lanes and therefore is actually able to detect collision for both traffic lanes.
Once a vehicle completes the lane change, it makes another decision as to which road it wants to turn
onto after traversing the upcoming intersection. This decision is necessary because the vehicle is in a new lane and may not be able to get to the previously selected road from its new lane assignment.
Orienting the car. As the ambient traffic vehicles drive around the city, they are constantly driving
over an arbitrary set of polygons forming the roads and intersections. One of the challenges for the AI
is orienting the ambient vehicles to match the contour of the road and surfaces of open areas. Because
there are hills, banked road surfaces, curbs separating roads and sidewalks, and uneven open terrain,
the obvious way to orient the vehicles is to shoot a probe straight down the Y-axis from the front-left,
front-right, and rear-left corners of the vehicle. First, get the XZ position of the vehicle from the
calculated spline position and determine the three corner positions in respect to the center point of the
vehicle. Then, shoot probes at the three corners to get their Y positions.
Once you know the three corner positions, you can calculate the car's orientation vectors. This
approach works very well, but even caching the last polygon isn't fast enough to do all the time for
every car in the traffic bubble. One way to enhance performance is to mark every road as being either
flat or not. If an ambient vehicle drives on a flat road, it doesn't need to do the full probe method.
Instead, this vehicle could use just the Y value from the road's rail data. Another performance
enhancement is to orient the vehicles that are far enough from the player using only the road's
rail-orientation vectors. This approach works well when small vehicle-orientation pops are not
noticeable.
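
A sketch of building the orientation from the three probed corners (illustrative names; assumes right = +X, forward = +Z in a right-handed, Y-up system):

struct Vec3 { float x, y, z; };

Vec3 Sub(Vec3 a, Vec3 b)   { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
Vec3 Cross(Vec3 a, Vec3 b) { return { a.y*b.z - a.z*b.y,
                                      a.z*b.x - a.x*b.z,
                                      a.x*b.y - a.y*b.x }; }

// frontLeft, frontRight, rearLeft are the probed corner positions.
void OrientCar(Vec3 frontLeft, Vec3 frontRight, Vec3 rearLeft,
               Vec3& right, Vec3& up, Vec3& forward)
{
    right   = Sub(frontRight, frontLeft);  // across the hood
    forward = Sub(frontLeft, rearLeft);    // along the body
    up      = Cross(forward, right);       // +Y for the axes assumed above
    // Normalize all three before use (omitted for brevity).
}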
Managing the collision state. When an ambient vehicle collides with the player, or with a dynamic
or static obstacle in the city, the ambient vehicle switches from using a partially simulated physics
model to a fully simulated physics model. The fully simulated model allows the ambient vehicle to act
correctly in collisions.
A vehicle manager controls the activities of all the vehicles transitioning between physics models. A
collision manager handles the collision itself. For example, once a vehicle has come to rest, the vehicle
manager resets it back to the partially simulated physics model. At this point, the ambient vehicle
attempts to plot a spline back to the road rail. As it proceeds along the rail, the vehicle will not
perform any obstacle detection, and will collide with anything in its way. A collision then sends the
vehicle back to the collision manager. This loop will repeat for a definable number of tries. If the
maximum number of tries is reached, the ambient vehicle gives up and remains in its current location
until the population manager places it back into the active bubble of the ambient vehicle pool.
Using an obstacle-avoidance grid. Every AI entity in the game is assigned to a cell in the
obstacle-avoidance grid. This assignment allows fully simulated physics vehicles to perform faster
obstacle avoidance.
Since the road is defined by a list of vertices, these vertices make natural separation points between
obstacle-avoidance buckets. Together, these buckets divide the city into a grid that limits the scope of
collision detection. As an ambient vehicle moves along its rail, crossing a boundary between buckets
causes the vehicle to be removed from the previous bucket and added to the new bucket. The
intersection is also considered an obstacle bucket.
Simulation bubbles for ambient traffic. A run-time parameter specifies the total number of
ambient vehicles to create in the city. After being created, each ambient vehicle is placed into an
ambient pool from which the ambients around the player are populated. This fully simulated region
around the player is the simulation bubble. Regions of the city remote from the player are outside of the simulation bubble and are not fully simulated.
When a player moves from one cull room to another, the population manager compares the vertex list
of the new cull room against the list for the old one. From these two lists, three new lists are created:
New Roads, Obsolete Roads, and No Change Roads. First, the obsolete roads are removed from the
active road list, and the ambient vehicles on them are placed into the ambient pool. Next, the new
roads are populated with a vehicle density equal to the total vehicle length divided by the total road
length. The vehicle density value is set to the default value based on the road type, or an exception
value set through the definition of the race AI map file.
As the ambient vehicles randomly drive around the city, they sometimes come to the edge of the
simulation bubble. When this happens, the ambient vehicles have two choices. First, if the road type is
two-way (that is, ambient vehicles can drive in both directions), then the vehicle is repositioned at the
beginning of the current road's opposite direction. Alternatively, if the ambient vehicle reaches the end
of a one-way road, the vehicle is removed from the road and placed into the pool and thereby
becomes available to populate other bubbles.
Driving in London: left becomes right. London drivers use the left side of the road instead of the
right. To accommodate this situation, some changes have to be made to the raw road data. First, all of
the right lane data must be copied to the left lane data, and vice versa. The order of each lane's vertex
data must then be reversed so that the first vertex becomes the last, and the lane order reversed so
that what was the lane closest to the road's centerline becomes the lane farthest from the center.
Given these changes, the rest of the AI entities and the ambient vehicle logic will work the same
regardless of which side of the road the traffic drives on. This architecture gave us the flexibility to
allow left- or right-side driving in any city.
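
A sketch of that mirroring pass over the raw data, with illustrative types:

#include <algorithm>
#include <vector>

struct Vert { float x, y, z; };
struct Lane { std::vector<Vert> verts; };
struct RoadLanes { std::vector<Lane> left, right; };

void MirrorForLeftDriving(RoadLanes& road)
{
    std::swap(road.left, road.right);          // copy left <-> right
    for (auto* side : { &road.left, &road.right }) {
        for (auto& lane : *side)               // first vertex becomes last
            std::reverse(lane.verts.begin(), lane.verts.end());
        std::reverse(side->begin(), side->end()); // nearest lane becomes farthest
    }
}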

City People: Simulating Pedestrians

In real cities, pedestrians are on nearly every street corner. They walk and go about their business, so it should be no different in the cities we create in our games. The pedestrians wander along the sidewalks and sometimes cross streets. They avoid static obstacles such as mailboxes, streetlights, and parking meters, and also dynamic obstacles such as other pedestrians and the vehicles controlled by the players. And no, players can't run over the pedestrians, or get points for trying! Even so, interacting with these "peds" makes the player's experience as a city driver much more realistic and immersive.

Simulation bubbles for pedestrians. Just as the ambient traffic has a simulation bubble, so do the pedestrians. And while the pedestrian bubble has a much smaller radius, both types are handled similarly. During initialization, the pedestrians are created and inserted into the pedestrian pool. When the player is inserted into the city, the pedestrians are populated around him. During population, one pedestrian is added to each road in the bubble, round-robin style, until all the pedestrians in the pool are exhausted.

Pedestrians are initialized with a random road distance and side distance based on an offset to the center of the sidewalk. They are also assigned a direction in which to travel and a side of the street on which to start. As the pedestrians get to the edge of the population bubble, they simply turn around and walk back in the opposite direction from which they came.

Wandering the city. When walking the streets, the pedestrians use splines to smooth out the angles created by the road subsegments. All the spline calculations are done in 2D to increase the performance of the pedestrians. The Y value for the splines is calculated by probing the polygon the pedestrian is walking on in order to give the appearance that the pedestrian is actually walking on the terrain underneath its feet.

Each pedestrian has a target point for it to head toward. This target point is calculated by solving for
the location on the spline path three meters ahead of the pedestrian. In walking, the ped will turn
toward the target point a little bit each frame, while moving forward and sideways at a rate based on
the parameters that control the animation speed. As the pedestrian walks down the road, the ped
object calculates a new spline every time it passes a sidewalk vertex.
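
A sketch of that per-frame step, assuming the spline point three meters ahead has already been looked up (names and the angle convention are illustrative):

#include <cmath>

struct Vec2 { float x, z; };

// Turn a limited amount toward 'target', then step forward.
void StepToward(Vec2 target, Vec2& pos, float& heading,
                float walkSpeed, float turnRate, float dt)
{
    float desired = std::atan2(target.x - pos.x, target.z - pos.z);
    float delta = std::remainder(desired - heading, 6.2831853f); // to [-pi, pi]
    float maxTurn = turnRate * dt;   // only turn a little each frame
    if (delta >  maxTurn) delta =  maxTurn;
    if (delta < -maxTurn) delta = -maxTurn;
    heading += delta;
    pos.x += std::sin(heading) * walkSpeed * dt;
    pos.z += std::cos(heading) * walkSpeed * dt;
}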
Crossing the street. When a pedestrian gets to the end of the street, it has a decision to make. The
ped either follows the sidewalk to the next street or crosses the street. If the ped decides to cross the
street, then it must decide which street to cross: the current or the next. Four states control ped
navigation on the streets: Wander, PreCrossStreet, WaitToCrossStreet, and CrossStreet (see
Figure 3). The first of these, Wander, is described in the previous section, "Wandering the City."
PreCrossStreet takes the pedestrian from the end of the street to a position closer to the street curb,
WaitToCrossStreet tells the pedestrian waiting for the traffic light that it's time to cross the street,
and CrossStreet handles the actual walking or running of the pedestrian to the curb on the other side
of the street.

Figure 3. In this situation, the PreCrossStreet state has moved the pedestrians next to the street curb, and now the WaitToCrossStreet state is holding the peds in place until the light turns green.

Animating actions. The core animation system for the pedestrians is skeleton-based. Specifically,
animations are created in 3D Studio Max at 30FPS, and then downloaded using Angel's proprietary
exporter. The animation system accounts for the nonconstant nature of the frame rate.

For each type of pedestrian model, a data file identifies the animation sequences. Since all the
translation information is removed from the animations, the data file also specifies the amount of
translation necessary in the forward and sideways directions. To move the pedestrian, the ped object
simply adds the total distance multiplied by the frame time for both the forward and sideways
directions. (Most animation sequences have zero side-to-side movement.)
Two functions of the animation system are particularly useful. The Start function immediately starts
the animation sequence specified as a parameter to the function, and the Schedule function starts the
desired animation sequence as soon as the current sequence finishes.
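
The two calls might be wrapped like this minimal sketch (an illustrative API, not Angel's actual system):

#include <string>

class AnimController {
    std::string current, pending;
public:
    void Start(const std::string& seq) {    // interrupt and play now
        current = seq;
        pending.clear();
    }
    void Schedule(const std::string& seq) { // play when the current one ends
        pending = seq;
    }
    void OnSequenceFinished() {             // called by the animation update
        if (!pending.empty()) {
            std::string next = pending;
            Start(next);
        }
    }
};

Typical use would be Start("Dive") for an immediate reaction, then Schedule("StandUp") so the recovery plays as soon as the dive finishes (sequence names hypothetical).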
Avoiding the speeding player. The main rule for the pedestrians is to always avoid being hit. We
accomplish this in two ways. First, if the pedestrian is near a wall, then the ped runs to the wall, puts
its back against it, and stands flush up against it until the threatening vehicle moves away (see Figure
4).

Figure 4. When the oncoming player vehicle threatens these pedestrians, they decide to hug the wall after sending out a probe and finding a wall nearby.

Alternatively, if no wall is nearby, the ped turns to face the oncoming vehicle, waits until the vehicle is
close enough, and then dives to the left or right at the very last moment (see Figure 5).

Figure 5. The pink lines visualize the direction the peds intend to walk. When a player vehicle introduces a threat, the pedestrians decide to dive right or left at the last moment, since no wall is nearby.

The pedestrian object determines that an oncoming vehicle is a threat by taking the forward
directional vector of the vehicle and performing a dot product with the vector defined by the ped's
position minus the vehicle's position. This calculation measures the side distance. If the side distance
is less than half the width of the vehicle, then a collision is imminent.

The next calculation is the time it will take the approaching vehicle to collide with the pedestrian. In
this context, two distance zones are defined: a far and a near. In the far zone, the pedestrian turns to
face the vehicle and then goes into an "anticipate" behavior, which results in a choice between shaking
with fear and running away. The near zone activates the "avoid" behavior, which causes the pedestrian
to look for a wall to hug. To locate a wall, the pedestrian object shoots a probe perpendicular to the
sidewalk for ten meters from its current location. If a wall is found, the pedestrian runs to it.
Otherwise, the ped dives in the opposite direction of the vehicle's rotational momentum. (Sometimes
the vehicle is going so fast, a superhuman boost in dive speed is needed to avoid a collision.)
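
A sketch of that threat test with illustrative names. The article measures the side distance with a dot product against the vehicle's forward vector; the version below gets a lateral offset by projecting onto the forward axis and removing that component, which is one way to realize the same test:

#include <cmath>

struct Vec2 { float x, z; };

bool CollisionImminent(Vec2 vehPos, Vec2 vehForward /* unit length */,
                       float vehSpeed, float halfWidth,
                       Vec2 pedPos, float nearTime)
{
    Vec2 toPed = { pedPos.x - vehPos.x, pedPos.z - vehPos.z };
    float ahead = toPed.x * vehForward.x + toPed.z * vehForward.z; // dot product
    if (ahead < 0.0f) return false;          // ped is behind the car
    float sideX = toPed.x - ahead * vehForward.x;
    float sideZ = toPed.z - ahead * vehForward.z;
    float side = std::sqrt(sideX * sideX + sideZ * sideZ);
    if (side > halfWidth) return false;      // car will miss the ped
    // Time to collision picks the far ("anticipate") or near ("avoid") zone.
    return vehSpeed > 0.0f && (ahead / vehSpeed) < nearTime;
}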
Avoiding obstacles. As the pedestrians walk blissfully down the street, they come to obstacles in the
road. The obstacles fall into one of three categories: other wandering pedestrians; props such as trash
cans, mailboxes, and streetlights; or the player's vehicle parked on the sidewalk.
In order to avoid other pedestrians, each ped checks all the pedestrians inside its obstacle grid cell. To
detect a collision among this group, the ped performs a couple of calculations. First, it determines the
side distance from the centerline of the sidewalk to itself and the other pedestrian. The ped's radius is
then added to and subtracted from this distance. A collision is imminent if there is any overlap
between the two pedestrians.
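
That test is a simple interval overlap; a sketch with illustrative names:

// Each ped spans [side - radius, side + radius] across the sidewalk.
bool PedsOverlap(float sideA, float radiusA, float sideB, float radiusB)
{
    return (sideA + radiusA) >= (sideB - radiusB) &&
           (sideB + radiusB) >= (sideA - radiusA);
}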
In order to help them avoid each other, one of the pedestrians can stop while the other one passes. One way to do this is to make the pedestrian with the lower identification number stop, while the other ped sets its target point far enough to the left or right to miss the stopped ped. The moving ped will always choose left if that point is within the sidewalk boundary; otherwise it will go to the right. If the right target point is also past the edge of the sidewalk, then the pedestrian will turn around and continue on its way. Calculations similar to this pedestrian detection and avoidance are performed to detect and avoid the props and the player's vehicle.

Simulating Vehicles with Full Physics

The full physics simulation object, VehiclePhysics, is a base class with the logic for navigating the city. The different entities in the city are derived from this base class, including the RouteRacer object (some of the opponents) and the PoliceOfficer object (cops). These child classes supply the additional logic necessary for performing higher-level behaviors. We use the term "full-physics vehicles" because the car being controlled for this category behaves within the laws of physics. These cars have code for simulating the engine, transmission, and wheels, and are controlled by setting values for steering, brake, and throttle. Additionally, the VehiclePhysics class contains two key public methods, RegisterRoute and DriveRoute.

Registering a route. The first thing that the navigation algorithm needs is a route. The route can either be created dynamically in real time or defined in a file as a list of intersection IDs. The real-time method always returns the shortest route. The file method is created by the Race Editor, another proprietary in-house tool that allows the game designer to look down on the city in 2D and select the intersections that make up the route. The game designer can thereby create very specific routes for opponents. Also, the file method eliminates the need for some of the AI entities to calculate their routes in real time, which in turn saves processing time.

Planning the route. Once a route to a final destination has been specified, a little more detailed planning is needed for handling immediate situations. We used a road cache for this purpose, which stores the most immediate three roads the vehicle is on or needs to drive down next (see Figure 6).

Figure 6. The route is defined by the roads connecting intersections 1 to 5, in order. Vehicle A is on road 2-3, which is the "hint road." Vehicle B has accidentally been knocked onto road 6-2. The immediate target is intersection 3 for both vehicles. Thus, Vehicle A's cache consists of roads 2-3, 3-4, and 4-5. Vehicle B's cache consists of roads 6-2, 2-3, and 3-4.

At any given moment, the vehicle knows the next intersection it is trying to get to (the immediate target), so the vehicle can identify the road connecting this target intersection with the intersection immediately before the target. If the vehicle is already on this "hint road," then the cache is filled with the hint road and the next two roads in the route.
If the vehicle isn't on the hint road, it has gotten knocked off course. In this situation, the vehicle
looks at all the roads that connect with the intersection immediately before the target. If the vehicle is
on one of these roads, then the cache is filled with this road and the next two roads the vehicle needs
to take in order to get back on track. If the vehicle isn't on any of these roads, then it dynamically
plots a new route to the target intersection.
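
A sketch of that cache refill, using the intersections of Figure 6 as the test case (illustrative names; assumes the target index is at least 1):

#include <vector>

struct RoadId { int from, to; };

// 'route' lists intersection IDs; 'target' indexes the immediate target.
std::vector<RoadId> RefillCache(const std::vector<int>& route, int target,
                                RoadId current)
{
    RoadId hint = { route[target - 1], route[target] };
    std::vector<RoadId> cache;
    if (current.from == hint.from && current.to == hint.to) {
        cache.push_back(hint);                 // on course
        for (int i = target; i + 1 < (int)route.size() && cache.size() < 3; ++i)
            cache.push_back({ route[i], route[i + 1] });
    } else if (current.to == route[target - 1]) {
        cache.push_back(current);              // side road back to the route
        for (int i = target - 1; i + 1 < (int)route.size() && cache.size() < 3; ++i)
            cache.push_back({ route[i], route[i + 1] });
    }
    // Otherwise the caller re-plans a route to the target (not shown).
    return cache;
}

With route {1,2,3,4,5} and target index 2, a vehicle on road 2-3 gets the cache 2-3, 3-4, 4-5, and a vehicle knocked onto road 6-2 gets 6-2, 2-3, 3-4, matching Figure 6.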
Determining multiple routes. If there are no ambient vehicles in the city, then there is only one
route necessary to give to an opponent (the computer-controlled player, or CCP), the best route. In
general, however, there is ambient traffic everywhere that must be avoided if the CCP is to remain
competitive. The choice then becomes which path to pick to avoid the obstacles. At any given
moment, this choice comes down to going left or right to avoid an upcoming obstacle. As the CCP
plans ahead, it determines two additional routes for each and every obstacle, until it reaches the
required planning distance. This process produces a tree of routes to choose from (see Figure 7).

Figure 7. The purple lines on the road show the tree of possible routes that this opponent vehicle is considering. The orange line shows the best route, which is typically the one that isn't blocked, stays on the road, and goes as straight as possible.

Choosing the best route. When all the possible routes have been enumerated, the best route for the
CCP can be determined. Sometimes one or more of the routes will take the vehicle onto the sidewalk.
Taking the sidewalk is a negative, so these routes are less attractive than those which stay on the
road. Also, some routes will become completely blocked, with no way around the obstacles present,
making those less attractive as well. The last criterion is minimizing the amount of turning required to
drive a path. Taking all these criteria into account, the best route is usually the one that isn't blocked,
stays on the road, and goes as straight as possible.
Setting the steering. The CCP vehicle simulated with full physics uses the same driving model that
the player's vehicle uses. For example, both vehicles take a steering parameter between -1.0 and 1.0.
This parameter is input from the control pad for the player's vehicle, but the CCP must calculate its
steering parameter in real time to avoid obstacles and reach its final destination. Rather than planning
its entire route in advance, the CCP simplifies the problem by calculating a series of Steering Target
Points (STPs), one per frame in real time as gameplay progresses. Each STP is simply the next point
the CCP needs to steer towards to get one frame closer to its final destination. Each point is calculated
with due consideration to navigating the road, navigating sharp turns, and avoiding obstacles.
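
One plausible way to turn an STP into that [-1, 1] steering value (a sketch; the clamping scheme is my assumption, not necessarily Angel's):

#include <cmath>

// 2D, heading in radians; full lock is reached at maxSteerAngle of error.
float SteeringToward(float carX, float carZ, float heading,
                     float stpX, float stpZ, float maxSteerAngle)
{
    float desired = std::atan2(stpX - carX, stpZ - carZ);
    float error = std::remainder(desired - heading, 6.2831853f); // [-pi, pi]
    float s = error / maxSteerAngle;
    if (s >  1.0f) s =  1.0f;
    if (s < -1.0f) s = -1.0f;
    return s;
}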
Setting the throttle. Most of the time a CCP wants to go as fast as possible. There are two
exceptions to this rule: traversing sharp turns and reaching the end of a race. Sharp turns are defined
as those in which the angle between two road subsegments is greater than 45 degrees, and can occur
anywhere along the road or when traversing an intersection. Since the route through a sharp turn is
circular, it is easy to calculate the maximum velocity through the turn by the formula

V = sqrt(u * g * R)

where V is equal to the velocity, u is the coefficient of friction for the road surface, g is the value of gravity, and R is the radius of our turn. Once the velocity is known, all that the CCP has to do is slow down to the correct speed before entering the turn.
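
In code the rule is tiny (illustrative names); combined with the deceleration formula from the traffic section, it tells the CCP where braking must begin:

#include <cmath>

float MaxTurnSpeed(float friction, float radius)
{
    const float g = 9.81f;                   // m/s^2
    return std::sqrt(friction * g * radius); // V = sqrt(u * g * R)
}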
Getting stuck. Unfortunately, even the best CCP can occasionally get stuck, just like the player does.
When a CCP gets stuck, it throws its car into reverse, realigns with the road target, and then goes
back into drive and resumes the race.

The Road Ahead


In the wake of the original Midtown Madness, we wanted open city racing to give players much more
than the ability to drive on any street and across any open area. In order for a city to feel and play in
the most immersive and fun way possible, many interactive entities of real cities need to be simulated
convincingly. These entities include racing opponents, tenacious cops, ambient traffic, and pedestrians,
all of which require powerful and adaptive AI to bring them to life. Midtown Madness 2 and Midnight
Club expand on the capabilities of these entities, which in turn raises the bar of players' expectations
even further.
The future of open city racing is wide open -- literally. Angel Studios and I are planning even more
enhancements to the AI in any future games of this type that we do. Some ideas I'm planning to
investigate in the future include enhancing the opponent navigation skills of all AI entities, and
creating AI opponents that learn from the players. Additionally, I'd like to create more player
interaction with the city pedestrians, and have more interaction between AI entities. Anyone wanna
race?
Joe Adzima has been an AI programmer at Angel Studios for three years. During that time, he
architected and implemented the entire AI system for Midtown Madness 1 and 2 for PC and Midnight
Club for Playstation 2. Joe thanks Robert Bacon, Angel Studios' technical writer, for the exceptional
editorial efforts Robert has applied to this article.


AI Uncertainty

Newsgroups: comp.ai.games
From: smt@cs.monash.edu.au (Scott M Thomson)
Subject: Uncertainty in AI
Summary: This posting will hopefully publicise Bayesian networks, which provide a formalism for modelling and an inference mechanism for reasoning under uncertainty
Keywords: AI
Date: Fri, 31 Mar 1995 04:56:58 GMT
Have you ever played peek-a-boo with a small child? Why is it that it works? What is it that engages the
child's delight? Why doesn't it work with older people?

The game of peek-a-boo takes advantage of the limited cognitive development of the child: when we hide ourselves behind an object, the child's mind no longer registers our presence. When we pop out from hiding, the child's mind is delirious at the magic of our rematerialization.

A complicated challenge for artificial intelligence since its inception has been knowledge representation in
problems with uncertain domains. What a system can't see is, nonetheless, of possible importance to its
reasoning mechanisms. What is unknown is also often still vital to common sense reasoning. This posting
will hopefully publicise Bayesian networks, which provide a formalism for modelling and an inference
mechanism for reasoning under uncertainty and initiate discussion about uncertainty problems and
probabilistic reasoning in game AIs.

Sun Tzu was a Chinese general who lived approximately 2400 years ago. His work, "The Art of War",
describes the relationships between warfare, politics, economics, diplomacy, geography and astronomy.
Such modern generals as Mao Zedong have used his work as a strategic reference.

Sun Tzu's philosophy on war can be summed up in this statement, "to win one hundred victories in one
hundred battles is not the acme of skill. To subdue the enemy without fighting is the supreme excellence"
[11]. In computer games that use cheats to toughen the computer AI, there is no skill in the computer
player's victory. If a computer player can beat a human on even terms, then we may start to discuss the
skill of the AI designer, and any human victory is that much more appreciated.

The difficulty in representing uncertainty in any game AI is in the vast numbers of combinations of actions,
strategies and defences available to each player. What we are left with is virtually impossible to represent in
tables or rules applicable to more than a few circumstances. Amongst the strategies expounded by Sun Tzu
are enemy knowledge, concealment and position[11].

Enemy knowledge is our most obvious resource. Another player's units or pieces inform us about possible
future actions or weaknesses by location, numbers and present vectored movement. They suggest
possibilities for defensive correction, offensive action and bluffing. Sun Tzu states that we should, "Analyse
the enemy's plans so that we will know his shortcomings as well as strong points. Agitate him in order to
ascertain the pattern of his movement" [11].

Concealment may be viewed as the art of both hiding one's own strategy and divining one's opponent's. By
considering our opponent's past history and placing our current situation in that context we hope to
discover something about what is hidden in their mind. Conversely, our actions must be designed to convey
as little as possible about the true strength or weakness of our positions.

The position of units refers to their terrain placement in the game. Terrain that grants defensive or
offensive bonuses to a computer player's units should be utilised to best advantage. In addition, computer
units should strike where the enemy is weakest and where the most damage can be inflicted at the least
loss. Impaling units on heavily fortified positions for nominal gain is best left to real generals in real wars and
is not a benchmark of intelligent behaviour.

To combine everything we need to play a good game in the face of a deceptive and hostile opponent is not
a trivial task. Sun Tzu believed, "as water has no constant form, there are no constant conditions in war.
One able to win the victory by modifying his tactics in accordance with the enemy situation may be called a
divine!" [11]. Our aim in designing game AIs is to obtain a mechanism for moderate strategic competence,
not a program with a claim to god-hood.

Debate on the mechanism for the representation of uncertainty has settled into two basic philosophies,
extensional and intensional systems [19, p3]. Extensional systems deal with uncertainty in a context free
manner, treating uncertainty as a truth value attached to logic rules. Being context free they do not
consider interdependencies between their variables. Intensional systems deal with uncertainty in a context
sensitive manner. They try to model the interdependencies and relevance relationships of the variables in
the system.

If an extensional system has the rule,

if A then B with some certainty factor m,

and observes A in its database, it will infer something about the state of B regardless of any other
knowledge available to it. Specifically, on seeing A it would update the uncertainty of B by some function of
the rule strength m.
If an intensional system were to consider the same rule, it would interpret it as the conditional probability
expression P(B|A) = m. What we believe about B in this system depends on our whole view of the
problem and how relevant information interacts.

The difference between these two systems boils down to a trade-off between semantic accuracy and
computational feasibility. Extensional systems are computationally efficient but semantically clumsy.
Intensional systems, on the other hand, were thought by some to be computationally intractable even though
they are semantically clear.

Both MYCIN (1984) and PROSPECTOR (1978) are examples of extensional systems. MUNIN (1987) is an
example of an intensional system.

MYCIN is an expert system which diagnoses bacterial infections and recommends prescriptions for their
cure. It uses certainty factor calculus to manipulate generalised truth values which represent the certainty
of particular formulae. The certainty of a formula is calculated as some function of the certainty of its
subformulae.
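
To make the calculus concrete, here is a minimal sketch of the textbook certainty-factor combinators
(this is the standard formulation, not MYCIN's actual source): conjunctions take the minimum,
disjunctions the maximum, and two positive factors supporting the same conclusion combine so that
evidence accumulates toward 1.

// textbook certainty-factor calculus for factors in [0,1]
float cf_and(float a, float b) { return a < b ? a : b; }  // conjunction: min
float cf_or(float a, float b)  { return a > b ? a : b; }  // disjunction: max

// two rules supporting the same conclusion with positive factors
float cf_combine(float cf1, float cf2)
{
    return cf1 + cf2 * (1.0f - cf1);
}

Note that cf_combine is a strictly local computation; it is exactly this locality that forces the severe
independence assumptions criticised below.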
MUNIN is an expert system which diagnoses neuromuscular disorders in the upper limbs of humans. It uses
a causal probabilistic network to model the conditional probabilities for the pathophysiological features of a
patient[1].

Some of the stochastic infidelity of extensional systems arises in their failure to handle predictive or
abductive inference. For instance, there is a saying, "where there's smoke there's fire". We know that fire
causes smoke but it is definitely not true that smoke causes fire. How then do we derive the second from
the first? Quite simply, smoke is considered evidence for fire, therefore if we see smoke we may be led to
believe that there is a fire nearby.

In an extensional approach to uncertainty it would be necessary to state the rule that smoke causes fire in
order to obtain this inferencing ability. This may cause cyclic updating which leads to an over confidence in
the belief of both fire and smoke, from a simple cigarette. To avoid this dilemma most extensional systems
do not allow predictive inferencing. An example of predictive inferencing in a strategic game is the
consideration of a player's move in reasoning about their overall strategy.
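
To make the contrast concrete, here is a minimal sketch of the intensional treatment of this example,
with every probability invented for illustration. Bayes' rule recovers the diagnostic direction
P(fire|smoke) from the causal quantities alone, so no "smoke causes fire" rule, and hence no cyclic
updating, is needed.

#include <stdio.h>

int main(void)
{
    // illustrative numbers only -- not measured data
    double p_fire          = 0.01;   // P(fire)
    double p_smoke_fire    = 0.90;   // P(smoke | fire)
    double p_smoke_no_fire = 0.05;   // P(smoke | no fire): cigarettes, fog...

    // total probability of observing smoke
    double p_smoke = p_smoke_fire * p_fire
                   + p_smoke_no_fire * (1.0 - p_fire);

    // Bayes' rule: belief in fire after observing smoke
    double p_fire_smoke = p_smoke_fire * p_fire / p_smoke;

    printf("P(fire | smoke) = %.3f\n", p_fire_smoke);   // about 0.154
    return 0;
}

A cigarette that produces a little smoke raises the belief in fire only as far as the numbers warrant, and
no further.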

Even those authors that support extensional systems as a means for reasoning under uncertainty
acknowledge their semantic failures. "There is unfortunately a fundamental conflict between the demands of
computational tractability and semantic expressiveness. The modularity of simple rule-based systems aid
efficient data update procedures. However, severe evidence independence assumptions have to be made for
uncertainties to be combined and propagated using strictly local calculations"[5].

Although computationally feasible, these systems lack the stochastic reliability of plausible reasoning. The
problem with certainty factors or truth values being attached to formulae is that certainty measures visible
facts, whereas uncertainty is related to what is unseen, that which is not covered by the formulae.

The semantic merits of intensional systems are also the reason for their computational complexity. In the
example,
if P(B|A) = m,


we cannot assert anything about B even if given complete knowledge about A. The rule says only that if A is
true and is the only thing that is known to be relevant to B, then the probability of B is 'm'. When we
discover new information relevant to B we must revoke our previous beliefs and calculate P(B|A,K), where K
is the new knowledge. The stochastic fidelity of intensional systems leaves them impotent unless they can
determine the relevance relationships between the variables in their domain. It is necessary to use a
formalism for articulating the conditions under which variables are considered relevant to each other, given
what is already known. Using rule-based systems we quickly get bogged down in the unwieldy consideration
of all possible probabilistic interactions. This leads to complex and computationally infeasible solutions.

Bayesian networks are a mechanism for achieving computational efficiency with a semantically accurate
intensional system. They have been used for such purposes as sensor validation [9], medical diagnosis [1,
2], forecasting [3], text understanding [6] and naval vessel classification [7].

The challenge is to encode the knowledge in such a way as to make the ignorable quickly identifiable and
readily accessible. Bayesian networks provide a mathematically sound formalism for encoding the
dependencies and independencies in a set of domain variables. A full discussion is given in texts devoted to
this topic [10].

Bayesian networks are directed acyclic graphs in which the nodes represent stochastic variables. These
variables can be considered as a set of exhaustive and mutually exclusive states. The directed arcs within
the structure represent probabilistic relationships between the variables. That is, their conditional
dependencies and by default their conditional independencies.
We have then, a mechanism for encoding a full joint probability distribution, graphically, as an appropriate
set of marginal and conditional distributions over the variables involved. When our graphical representation
is sparsely connected we require a much smaller set of probabilities than would be required to store a full
joint distribution.
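To put a number on that saving (my example, not the poster's): with 10 binary variables, a full joint
distribution requires 2^10 - 1 = 1023 independent probabilities, whereas if each node in the network has
at most two parents, each node needs at most 2^2 = 4 conditional probabilities, for at most 10 x 4 = 40
numbers in total.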

Each root node within a Bayesian network has a prior probability associated with each of its states. Each
other node in the network has a conditional probability matrix representing probabilities, for that variable,
conditioned on the values of its parents.

After a network has been initialised according to the prior probabilities of its root nodes and the conditional
probabilities of its other variables, it is possible to instantiate variables to certain states within the network.
Even before instantiation, the network already has posteriors associated with each node as a result of the
propagation performed during initialisation. Instantiation leads to a further propagation of probabilities
through the network, giving updated posterior beliefs about the states of the variables represented by the graph.

In conclusion, I am not proposing that Bayesian networks are some god-given solution to all of AI's
problems. It is quite plain that many problems push the bounds of computational feasibility even for
Bayesian networks. It is my hope that by posting this I may play some game in the future that "reasons" in
a remotely intelligent way about strategies for victory. Perhaps incorporating the concepts of probabilistic
reasoning into a hybrid system is a feasible solution to a competent strategic AI.

Here is a list of some references I used in my Honours thesis. Numbers 8 and 10 are texts devoted to
Bayesian Networks.

[1]
Andreassen, S; et al.
"MUNIN - an Expert EMG Assistant."
{Computer-Aided Electromyography and Expert Systems}, 1989.

[2]
Berzuini, C; Bellazzi, R; Spiegelhalter, D.
"Bayesian Networks Applied to Therapy Monitoring.",
{Uncertainty in Artificial Intelligence},
Proceedings of the Seventh Conference (1991) p35.

[3]
Dagum, P; Galper, A; Horvitz, E.
"Dynamic Network Models for Forecasting."
{Uncertainty in Artificial Intelligence},
Proceedings of the Eighth Conference (1992) p41.


[4]
Findler, N.
"Studies in Machine Cognition using th Game of Poker."
{Communications of the ACM}, v20, April 1977, p230.

[5]
Fox, J; Krause, P.
"Symbolic Decision Theory and Autonomous Systems."
{Uncertainty in Artificial Intelligence},
Proceedings of the Seventh Conference (1991) p103.

[6]
Goldman, R; Charniak, E.
"A Probabilistic Approach to Language Understanding."
{Tech Rep CS-90-34}, Dept Comp Sci, Brown University 1990.

[7]
Musman, SA; Chang, LW.
"A Study of Scaling In Bayesian Networks for Ship Classification."
{Uncertainty in Artificial Intelligence},
Proceedings of the Ninth Conference (1993) p32.

[8]
Neapolitan, RE.
{"Probabilistic Reasoning in Expert Systems, Theory and Algorithms."}
John Wiley and Sons, 1989.

[9]
Nicholson, AE; Brady, JM.
"Sensor Validation using Dynamic Belief Networks."
{Uncertainty in Artificial Intelligence},
Proceedings of the Eighth Conference (1992) p207.

[10]
Pearl, J.
{"Probabilistic Reasoning in Intelligent Systems, Networks of Plausible Inference."}
Morgan Kaufmann Publishers, Inc, 1988.

[11]
Wordsworth Reference.
{"Sun Tzu, The Art of War."}
Sterling Publishing Co Inc, 1990.

I hope this has been helpful,

Scott Thomson

###############################################################################
Scott M Thomson \|/ ^^^ \|/
smt@bruce.cs.monash.edu.au -O-[/@ @\]-O-
\ | > | /
| |___| |
\ \ U / /
---
"Cognito cognito ergo cognito sum cognito"
"I think I think therfore I think I am I think?"
(pardon the grammar)
###############################################################################



Gamasutra - Features - "Artificial Emotion: Simulating Mood and Personality" [05.07.99]

| | | |

Features
By Ian Wilson
Gamasutra
May 7, 1999
Artificial Emotion:
Simulating Mood and Personality
Characters that display emotion are critical to a rich and believable simulated environment, especially
when those characters interact with real people possessing real emotions. Emotion is the essential element
that creates the difference between robotic behavior and lifelike, engaging behavior. Traditionally,
animators have painstakingly created these behaviors for prerendered animations. This approach, however,
is not possible when we wish to use autonomous, interactive characters that possess their own unique
personalities and moods. Truly interactive characters must generate their behavior autonomously through
techniques based upon what I call artificial emotion (AE).
Why do we have real emotion?

As human beings, we have an innate understanding of what emotions are. However, outside of
academia, we rarely hear discussions on how emotions are produced and, more importantly, on why
we have emotions. Within academia, these issues are subject to much contention and debate. That
said, allow me to offer my own thoughts on these issues.
When attempting to simulate natural systems, we first need to ask, "What is the nature of this system
and what is its purpose or reason for being?" Very few, if any, systems in the natural world exist for no
reason.


Emotions are an integral part of our decision-making systems. Emotions tune our decisions according
to our personalities, moods, and momentary emotions to give us unique responses to situations
presented by our environment. But why do we need unique responses to situations? Why don’t we all
have the same responses? To answer this question, we need to look beyond the individual at humanity
as a group or society of individuals. I believe personality has evolved as a problem-solving mechanism.
Our unique personalities determine that we all think and hence solve problems in unique and different
ways. In an evolutionary sense, this diverse method of solving problems is highly effective. If we had
only one method of problem solving there would be a large, if not infinite, number of solutions that
would be outside of our problem solving capabilities. So personality has evolved as a way of attacking
problems from many different angles: from bold high-risk solutions to cautious and precise
incremental solutions; from solutions discovered through deep thought and reflection to solutions
discovered by gaining knowledge from others (socializing).
Emotion is, to a large degree, an emergent system. Its use must be looked at in terms of its
interaction with society rather than in isolation to gain a better understanding of its reason for being.

We can look at a corporation as an example. Here, at the top of the hierarchy, we have a CEO who is
bold and fearless, making broad decisions with little regard to details. At the other end of the
hierarchy, we might find someone who is fearful of the unknown, is timid, and has great respect for
details. The organization, and in a greater sense society, needs both types of people and the many
others in between to function efficiently. Imagine if we all had identical decision-making systems,
which gave us all the same responses to situations, but those responses were wrong. We wouldn’t last
very long as a species.

Layers of emotion

Fundamental to our AE-based behavior system is the notion that emotions comprise three layers of
behavior. At the top level are what we term momentary emotions; these are the behaviors that we display
briefly in reaction to events. For example, momentary emotions occur when we smile or laugh at a joke or
when we are surprised to see an old friend unexpectedly. At the next level are moods. Moods are prolonged
emotional states caused by the cumulative effect of momentary emotions. Underlying both of these layers
and always present is our personality; this is the behavior that we generally display when no momentary
emotion or mood overrides it (Figure 1).


Figure 1. The three layers of emotional behavior.

These levels have an order of priority. Momentary emotions have priority over mood when determining
which behavior to display. One’s mood, in turn, has priority over one’s personality (Figure 2).


Figure 2. Emotional priorities.

Figures 1 and 2 show the various layers of emotional behavior. Momentary emotions are brief
reactions to events that assume the highest priority when we select our behavior. These momentary
behaviors are short-lived and decay quickly. Moods are produced by momentary emotions, usually by
the cumulative effects of a series of momentary emotions. Moods can gradually increase in prominence
even after the momentary emotions have subsided. The development of moods depends on whether
the momentary emotions are positive or negative (punishments or rewards in a reinforcement sense).
If a character were to receive a stream of negative momentary emotions, then the mood would
obviously be bad and would decay slowly. The personality layer is always present and has a consistent
level of prominence.
The behavior that a character displays depends upon each emotional layer’s prominence. The more
prominent the layer, the higher the probability of that behavior being selected.
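
One minimal sketch of such a layered selector (the ranges, decay rates, and thresholds here are my own
assumptions, not the author's system):

// prominence of each emotional layer, in [0,1]
typedef struct {
    float momentary;    // brief reaction to an event; decays quickly
    float mood;         // cumulative effect of momentary emotions; decays slowly
    float personality;  // always present at a constant level
} EmotionLayers;

// called once per event: spike the momentary layer and nudge the mood
// (intensity is the event's strength, in [0,1])
void React(EmotionLayers *e, float intensity)
{
    e->momentary = intensity;
    e->mood += 0.1f * intensity;
    if (e->mood > 1.0f) e->mood = 1.0f;
}

// called once per frame: momentary emotions decay fast, moods slowly
void Decay(EmotionLayers *e)
{
    e->momentary *= 0.90f;
    e->mood      *= 0.995f;
}

// the highest-priority layer that is prominent enough drives the behavior
int Select_Layer(const EmotionLayers *e)
{
    if (e->momentary > 0.25f) return 0;   // momentary emotion
    if (e->mood      > 0.25f) return 1;   // mood
    return 2;                             // personality
}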
Where can we use AE?

With the notable exceptions of P.F. Magic’s Catz and Dogz series, Fujitsu’s fin fin, and Cyberlife’s
Creatures series, autonomous AE of any significant depth is rarely seen in the world of interactive
entertainment. Why is this the case?
The field of interactive entertainment is dominated by genres that require the user to either conquer
and/or to kill everything in his or her path. Little emotion is required by the opposition, besides
perhaps a little hard-coded fear or aggression that manifests itself in simple movement patterns.
Emotion primarily serves a social function in interactive entertainment. Emotional responses are used
to make the characters that we encounter believable and engaging. For example, if we were to walk
into a virtual bar and all of the characters in the bar had distinct personalities, the scene would be a
very immersive and believable social situation. If the characters showed no emotion, our suspension of
disbelief would be immediately broken and we would be reminded that we were in a
computer-generated simulation rather than in our own fantasy world. Of course, if all of the bar’s
customers had guns and our sole purpose was to dispatch them to a simulated afterlife, then this
really wouldn’t constitute a social situation and emotion might not be required.
A key to the use of AE, then, is the context of situations in which it is used. An important area of
growth is in the field of girls’ entertainment, pioneered by Purple Moon and its friendship adventures
built on Brenda Laurel’s excellent research into girls’ play behavior and girls and sport. For more
information on Ms. Laurel’s research, see
http://www.purple-moon.com/cb/laslink/pm?stat+corp+play_behavior and
http://www.purple-moon.com/cb/laslink/pm?stat+corp+girl_sport.

Social cooperation is a key element in this area and as such is an ideal place to use autonomous
characters with AE. In these situations, the characters’ emotional states and their emotional responses
to the players’ actions are what make the experience enjoyable, interesting, and entertaining. After
playing the first of Purple Moon’s titles, I was a little disappointed to find that it used only static
animations, which limited its sense of immersion. A full, living, 3D world would have increased its
impact (and cost) dramatically.
Of course, processor overhead is always a problem with an element as computationally complex as AE.
The reason that Catz, Dogz, and Creatures succeed in displaying characters with believable emotional

behavior is that this element is generally the games’ sole
area of concern. Graphics and other elements are kept to
an acceptable minimum so that maximum resources can
be devoted to behavior generation. As we’re not yet at the
stage where we can throw unlimited resources at
character AE, we should learn from those titles that
employ it successfully and design our simulations
intelligently with these constraints in mind. In other
words, fight the battles you can win.
Still, a significant amount of ingenuity and optimization
would certainly contribute to the use and availability of
AE. Consider the graphics technique of LOD (level of
detail), in which objects farther from the viewer are
displayed in progressively lower and lower levels of detail.
Using LOE (level of emotion), characters farther away
from the viewer would generate and display progressively
lower and lower levels of emotion. If a character is out of
sight, we generally don’t care about its emotional state. In
addition, one can also be careful in the choice of
characters to use. Using human characters necessarily
implies that their behavior is deep and complex.
Unfortunately, because we are most attuned to
recognizing human emotion, we are also very well attuned at recognizing flawed human behavior,
which can break the illusion of an otherwise well-constructed simulated environment. One way to
attack this problem is to use nonhuman characters. Cats, dogs, and Norns all show engaging levels of
interactive emotional behavior that maintains the illusion of life without having, or needing, the
complexity of human emotional responses.
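
The LOE idea reduces to a distance test. A sketch, with the thresholds invented for illustration:

// level-of-emotion, by analogy with graphics LOD: spend less AE
// computation on characters the viewer cannot scrutinise
int Emotion_Detail(float distance, int visible)
{
    if (!visible)          return 0;   // out of sight: no emotion update
    if (distance > 100.0f) return 1;   // far away: mood and personality only
    if (distance > 25.0f)  return 2;   // mid-range: add coarse body gestures
    return 3;                          // close up: full facial gesture set
}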
An important point to reiterate here is that we’re specifically dealing with autonomous interactive
characters. These characters have responses and behaviors that cannot be prescripted or predefined
to any great degree and must instead employ systems that are able to produce behavior in response
to changes in the environment and interactions with the user.

How can we use AE?

Artificial emotion produces two fundamental components as output: gestures and actions. Actions are a
general category and are dependent upon the context of the situation in which the character exists. A
simulation’s movement system uses AE to select and/or modify an action. When selecting an action, AE
indicates what actions are appropriate to the character’s personality and current mood. So a timid
character is unlikely to do anything aggressive, for example. When modifying an action, AE can help to
determine how an action is carried out. An outgoing, extroverted character might perform an action
enthusiastically, although this probably wouldn’t be the case for an extreme introvert. Our primary use of
AE, however, is in driving gestures, namely hand, body, and facial gestures. Gestures are the way in which
we communicate our emotions to the outside world. Without them, we would seem cold, flat, and
unemotional — rather like a computer character. These AE-driven gestures are tied directly to our
characters’ personalities and moods and follow definite patterns.
This body language adds an extra dimension to a
character’s behavior, giving life and depth to
simulations populated by autonomous characters
that now posses unique personalities. We are all
used to seeing environments populated by
characters that all have identical motions or body
language. They all stand stiffly upright and move
like clockwork toys. Would it not be refreshing to
see a sad looking fellow, shoulders hunched over,
arms hanging limply and walking slowly as he
makes his way through our environment? This idea immediately introduces all sorts of theatrical and
cinematic possibilities, such as populating our
environment with a whole cast of unique characters. Our viewer’s experience would be enriched as
well. "Who is that guy? Why does he look so sad? What’s his story? Should I go and ask him?" The
kinds of questions that occur to the viewer of a truly interactive experience are simply irrelevant
without AE.
(It should be noted that I could also substitute the acting term character in place of my term
personality. Character might be a more appropriate term, but could confuse the reader because I’m
using character to indicate an autonomous agent in this article. The terms are, however,
interchangeable.)
The future of AE

I can imagine a scene; I’m searching for a lost city in a wild, remote jungle with my trusted
autonomous companion Aeida. Suddenly, we find the entrance to the city and walk in. It’s still
inhabited. The inhabitants’ body language changes when they see us, reacting to our sudden intrusion.
Some become fearful, backing away and curling into a nonthreatening posture. Others do the
opposite, standing upright, shoulders back, chest out, and fists clenched — looks like trouble. We
stand motionless for a time, until a very jovial character smiles broadly at us, laughs, then comes over
to greet us, telling the other inhabitants to do likewise. The inhabitants’ interactive behavior, and more
importantly their individual behavior, creates a living world for us to explore and within which to
entertain ourselves. This environment would be socially oriented; our decisions and actions would be
based upon the personalities and moods of the characters that we encounter. Essentially, the
characters’ decisions and actions would be interactively based upon ours; nothing would be prescripted
(unless the designer of the experience wished it that way, as in interactive theatre).
Such a world would require that designers spend a good deal of time designing their characters for
deep and engaging roles. Designers will need to add the skills of scriptwriting and storytelling to their
growing repertoire of talents. Interactive theatre and cinema is a relatively new area that is emerging
around autonomous characters. Those who are interested in participating in its development would be
wise to start their reading now. A great place to start looking is the web site composed by Andrew

Stern of P.F. Magic at http://pw2.netcom.com/~apstern/index.html. Here you’ll find links to just about
every conceivable source in these fields and many more besides.
The convergence of many factors — processor speed, market awareness, and the maturation of the
entertainment field to name a few — will revolutionize the way in which we use characters in
simulations. Whole new avenues and genres will open up before us. The timing of these developments
may not be a moment too soon, considering the growing (and plausible) perception that videogames
turn kids into desensitized and violent members of society. Designing experiences around social
interaction may not push your buttons, but society at large will probably thank you (and give you lots
of free press). This subject is full of real emotion, which could, ironically, be averted through the use of
artificial emotion.
Visionary, life designer, philosopher, creative genius, egomaniac, and legend in his own
daydreams. Ian (Gamasutra Profile) aims to be the father of believable, emotional, virtual
characters in the emerging arena of entertainment simulation. He is currently trying to
establish artificial emotion as a separate field of study from AI (which is, according to
Hollywood, going to take over the world and enslave us all!). He can be reached at
ianw@artificial-emotion.com.



Building Brains into Your Games


by André LaMothe
August, 1995
Game developers have always pushed the limits of the hardware when it comes to graphics and sound, but I think we
all agree that when it's time to implement artificial intelligence for a game, AI always gets the short end of the stick!
In this article, we are going to study a potpourri of AI topics ranging from the simple to the complex.
Along the way, we are going to try out a few demos that use a very rudimentary graphics interface to illustrate some
of the simpler concepts. However, most of our discussion will be quasi-theoretical and abstract. This is because AI is
not as simple as an algorithm, a data structure, or similar things. Artificial intelligence is a fluid concept that must be
shaped by the game it is to be used on. Granted, you may use the same fundamental techniques on myriad games, but
the form and implementation may be radically different.
Let's begin our discussion with some simple statements that define what AI is in the context of games. Artificial
intelligence in the arena of computer games implies that the computer-controlled opponents and game objects seem
to show some kind of cognitive process when taking actions or reacting to the player's actions. These actions may be
implemented in a million different ways, but the bottom line, from an observer's point of view, is that they seem to
show intelligence.
This brings us to the fundamental definition of intelligence. For our purposes, intelligence is simply the ability to
survive and perform tasks in an environment. The tasks may be to hunt down and destroy the player, find food,
navigate an asteroid field, or whatever. Nevertheless, this will be our loose definition of intelligence.
Now that we have an idea of what we are trying to accomplish, where on earth should we begin? We will begin by
using humans as our models of intelligence because they seem to be reasonably intelligent for carbon units. If we
observe a human in an environment, we can extrapolate a few key behaviors of intelligence that we can model using
fairly simple computer algorithms and techniques.
These behaviors are blind reflexes, random selection, use of known patterns, environmental analysis, memory-based
selections and sequential behaviors that may encompass some or all of the other behaviors. We'll take a look at all of
these behaviors and explore how we might implement them in a computer game, but first let's talk about the graphics
module we are going to use for some of the demos.

The Graphics Module


Half the world uses Microsoft C and C++ compilers and the other half uses Borland C and C++ compilers--so it's
always a problem publishing demos that depend on the use of either. Hence, we are going to write C code that is
totally compiler independent, based on a graphics interface that we are going to write ourselves, and that will work
on both compilers. The graphics interface will be based on graphics mode 13h, which is 320 by 200 pixels with 256
colors as shown in Figure 1. For the simple demos we are going to write, all we want to do is place the VGA/SVGA
card in mode 13h and plot single pixels on the screen. Thus we need two functions:
Set_Video_Mode(int mode);
and


Plot_Pixel(int x, int y, unsigned char color);


We will use the video BIOS function 10h to set the video mode, but how can we plot pixels? Plotting pixels in mode
13h is very simple because the graphics are fully memory mapped. Basically, mode 13h is a totally linear array of
memory that represents each pixel with a single byte. Further, this video memory starts at location A000:0000.
Figure 1 shows that there are 200 rows and 320 columns. Therefore, to compute the address of any pixel at (x,y) we
simply multiply the Y component by 320 and add the X. Or in other words:
memory offset = y*320+x;
Adding this memory offset to A000:0000 gives us the final memory location to access the desired screen pixel.
Hence, if we alias a FAR pointer to the video memory like this:
unsigned char far* video_buffer = (unsigned char far*)0xA0000000L;
Then we can access the video memory using a syntax like:
video_buffer[y*320+x] = color;
And that's it. So, using that information, we can then write a simple pixel-plotting function and graphics-mode
function. These two functions should be added to each demo so that the demos can perform the graphics-related
functions without help from the compiler-dependent graphics library. We're also going to add a little time-delay
function based on the PC's internal timer. The function is called Time_Delay() and takes a single parameter, which is
the number of clicks to wait for. Listing 1 shows the complete graphics interface named GMOD.H for the demos
contained within this article. Simply include the code of the graphics module with each demo and everything should
work fine. Now that we have the software we need to do graphics, let's begin our discussion of AI.
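
Since Listing 1 (GMOD.H) is not reproduced in this archive, here is a sketch of the two functions just
described, assuming a 16-bit DOS compiler with dos.h as the article does (it will not build on a modern
32-bit toolchain):

#include <dos.h>

#define VGA256 0x13    // mode 13h: 320x200, 256 colors

unsigned char far *video_buffer = (unsigned char far *)0xA0000000L;

// call interrupt 10h with AH=0 to set the video mode via the BIOS
void Set_Video_Mode(int mode)
{
    union REGS regs;
    regs.h.ah = 0;
    regs.h.al = (unsigned char)mode;
    int86(0x10, &regs, &regs);
}

// mode 13h is linear, one byte per pixel, so the offset is y*320+x
void Plot_Pixel(int x, int y, unsigned char color)
{
    video_buffer[y * 320 + x] = color;
}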

Deterministic Algorithms
Deterministic algorithms are the simplest of the AI techniques used in games. These algorithms use a set of variables
as the input and then use some simple rules to drive the computer- controlled enemies or game objects based on these
inputs. We can think of deterministic algorithms as reflexes or very low-level instincts. Activated by some set of
conditions in the environment, the algorithms then perform the desired behavior relentlessly without concern for the
outcome, the past, or future events.
The chase algorithm is a classic example of a deterministic algorithm. The chase algorithm is basically a method of
intelligence used to hunt down the player or some other object of interest in a game by applying the spatial
coordinates of the computer-controlled object and the object to be tracked. Figure 2 illustrates a good example of
this. It depicts a top view of a typical battleground, on which three computer-controlled bad guys and one player are
fighting. The question is, how can we make the computer-controlled bad guys track and move toward the player?
One way is to use the coordinates of the bad guys and the coordinates of the player as inputs into a deterministic
algorithm that outputs direction changes or direction vectors for the bad guys in real time.
Let's use bad guy one as the example. We see that he is located at coordinates (bx1,by1) and the player is located at
coordinates (px,py). Therefore, a simple algorithm to make the bad guy move toward the player would be:

// process x-coords
if (px>bx1) bx1++;
else if (px<bx1) bx1--;

// process y-coords
if (py>by1) by1++;
else if (py<by1) by1--;


That's all there is to it. If we wanted to reverse the logic and make the bad guy run away, then either the
conditional tests or the increment operators could be inverted. As an example of deterministic logic, Listing 2 is a
complete program that will make a little computer-controlled dot chase a player-controlled dot. Use the numeric
keypad to control your player and press ESC to exit the program.
Now let's move on to another typical behavior, which we can categorize as random logic.

Random Logic
Sometimes an intelligent creature exhibits almost random behaviors. These random behaviors may be the result of
any one of a number of internal processes, but there are two main ones that we should touch upon--lack of
information and desired randomness.
The first premise is an obvious one. Many times an intelligent creature does not have enough information to make a
decision or may not have any information at all. The creature then simply does the best it can, which is to select a
random behavior in hopes that it might be the correct one for the situation. For example, let's say you were dropped
into a dungeon and presented with four identical doors. Knowing that all but one meant certain death, you would
simply have to randomly select one!
The second premise that brings on a random selection is intentional. For example, say you are a spy trying to make a
getaway after acquiring some secret documents (this happens to me all the time). Now, imagine you have been seen,
and the bad guys start shooting at you! If you run in a straight line, chances are you are going to get shot. However,
if during your escape you make many random direction changes and zigzag a bit, you will get away every time!
What we learn from that example is that many times random logic and selections are good because it makes it harder
for the player to determine what the bad guys are going to do next, and it's a good way to help the bad guys make a
selection when there isn't enough information to use a deterministic algorithm. Motion control is a typical place to
apply random logic in bad-guy AI. You can use a random number or probability to select a new direction for the bad
guy as a function of time. Let's envision a multiplayer game with a single, computer-controlled bad guy surrounded
by four human players. This is a great place to apply random motion, using the following logic:

// select a random translation for X axis
bx1 = bx1 + rand()%11 - 5;

// select a random translation for Y axis
by1 = by1 + rand()%11 - 5;

The position of the bad guy is translated by a random amount in both X and Y, which in this case is +/-5 pixels or
units.
Of course, we can use random logic for a lot of other things besides direction changes. Starting positions, power
levels, and probability of firing weapons are all good places to apply random logic. It's definitely a good technique
that adds a bit of unpredictability to game AI. Listing 3 is a demo of random logic used to control motion. The demo
creates an array of flies and uses random logic to move them around. Press ESC to exit the demo.
Now let's talk about patterns.


Encoded List Processing


Many intelligent creatures have prerecorded patterns or lists of behaviors that they have either learned from
experience or are instinctive. We can think of a pattern as a sequence of steps we perform to accomplish a task.
Granted, this sequence may be interrupted if something happens during the sequence that needs attention. But in
general, if we forget about interruptions then we can think of patterns as a list of encoded instructions that an
intelligent creature consumes to accomplish some task.
For example, when you drive to work, school, or your girlfriend's or boyfriend's house, you are following a pattern.
You get into your car, start it, drive to the destination, stop the car, turn it off, get out, and finally do whatever it is
you're going to do. This is a pattern of behavior. Although during the entire experience a billion things may have
gone through your head, the observed behavior was actually very simple. Hence, patterns are a good way to
implement seemingly complex thought processes in game AI. In fact, many games today still use patterns for much
of the game logic.
So how can we implement patterns for game AI? Simply by using an input array to a list processor. The output of the
processor is the control of a game object or bad guy. In this case, the encoded list has the following set of valid
instructions:
● Turn right

● Turn left

● Move forward

● Move backward

● Sit still

● Fire weapon

Even though we only have six selections, we can construct quite a few patterns with a short input list of 16 elements
as in the example. In fact there are 6^16 different possible patterns, or roughly 2.8 trillion different behaviors. I think
that's enough to make something look intelligent! So how can we use encoded lists and patterns in a game for the
AI? One solid way is to use them to control the motion of a bad guy or game object. For example, a deterministic
algorithm might decide it's time to make a bad guy perform some complex motion that would be difficult if we used
standard conditional logic. Thus, we could use that pattern, which simply reads an encoded list directing the bad guy
to make some tricky moves. For example, we might have a simple algorithm like this:

int move_x[16] = {-2,0,0,0,3,3,2,1,0,-2,-2,0,1,2,3,4};
int move_y[16] = {0,0,0,1,1,1,0,0,-1,-1,2,3,4,0,0,-1};

// encoded pattern logic for a 16 element list
for (index=0; index<16; index++)
{
    bx1 += move_x[index];
    by1 += move_y[index];
} // end for index

You'll notice that the encoded pattern is made up simply of X and Y translations. The pattern could just as well have
contained complex records with a multitude of data fields. I've written detailed code that will create an example of
patterns and list processing, a demo of an ant that can process one of four patterns selected by the keys 1-4.
Unfortunately, it's too long to print here. Go to the Game Developer ftp site, though (ftp://ftp.mfi.com/gdmag/src),
and you can download it there.


Now we're starting to get somewhere, but we need an overall control unit with some form of memory, and we must
select the appropriate types of behaviors.

Finite State Machines


Finite state machines, or FSMs, are abstract models that can be implemented either in hardware or software. FSMs
are not really machines in the mechanical sense of the word, but rather abstract models with a set of "states" that can
be traversed. Within these states, the FSM not only has a special set of outputs but remembers the state and can
transition to another state, if and only if a set of inputs or premises are met. Figure 3 is a typical depiction of a finite
state machine. We see that there is a set of states labeled, S0, S1, S2, and Sn. We also see that there is a set of
connecting edges or arcs. These are called transition arcs and are the premises that must be met for the FSM to move
from state to state. Finally, within each state is a set of outputs. These outputs can be anything we wish-- from
motion controls for a game's bad guys to hard disk commands.
So how do we model an FSM in software and use it to control the game AI? Let's begin with the first question.
We can model an FSM with a single variable and a set of logical conditions used to make state transitions along with
the output for each state. For example, let's actually build a simple software state machine that controls a computer
bad guy differently based on the bad guy's distance to the player. The state machine will have the following four
states:
● State 0: Select new state = STATE_NEW

● State 1: Move randomly = STATE_RANDOM

● State 2: Track player = STATE_TRACK

● State 3: Use a pattern = STATE_PATTERN

The FSM's transition diagram is shown in Figure 4. We can see that if the bad guy is within 50 units of the player,
then the bad guy moves into State 2 and simply attacks. If the bad guy is in the range of 51 to 100 units from the
player, then the bad guy goes into State 3 and moves in a pattern. Finally, if the bad guy is farther than 100 units
from the player then chances are the bad guy can't even see the player (in the imaginary computer universe). In that
case, the bad guy moves into State 1, which is random motion.
So how can we implement this simple FSM machine? All we need is a variable to record the current state and some
conditional logic to perform the state transitions and outputs. Listing 4 shows a rough algorithm that will do all this.
Note that S0 (the new state) does not trigger any behavior on the part of the opponent. Rather, it acts as a state
"switchbox," to which all states (except itself) transition. This allows you to localize in a single control block all the
decision making about transitions.
Although this requires two cycles through the FSM loop to create one behavior, it's well worth it. In the case of a
small FSM, the entire loop can stay in the cache, and in the case of a large FSM loop, the localization of the
transition logic will more than pay for the performance penalty. If you absolutely refuse to double-loop, you can
handcraft the transitions between states. A finite-state machine diagram will vividly illustrate, in the form of
spaghetti transitions, when your transition logic is out of control.
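Listing 4 itself is not reproduced in this archive, but a minimal sketch of the switchbox structure just
described might look like this (the three movement routines stand in for the logic of the earlier
sections):

#define STATE_NEW     0
#define STATE_RANDOM  1
#define STATE_TRACK   2
#define STATE_PATTERN 3

// movement routines implemented elsewhere (see the earlier sections)
void Track_Player(void);
void Run_Pattern(void);
void Move_Randomly(void);

int state = STATE_NEW;

// one call per game cycle; STATE_NEW is the "switchbox" that holds all
// of the transition logic in a single block
void Bad_Guy_AI(int dist_to_player)
{
    switch (state)
    {
        case STATE_NEW:
            if (dist_to_player <= 50)       state = STATE_TRACK;
            else if (dist_to_player <= 100) state = STATE_PATTERN;
            else                            state = STATE_RANDOM;
            break;

        case STATE_TRACK:   Track_Player();  state = STATE_NEW; break;
        case STATE_PATTERN: Run_Pattern();   state = STATE_NEW; break;
        case STATE_RANDOM:  Move_Randomly(); state = STATE_NEW; break;
    } // end switch
}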
Now that we have an overall thought controller, that is, an FSM, we should discuss simulating sensory excitation in a
virtual world.


Environmental Sensing
One problem that plagues AI game programming is that it can be very unfair -- at least to the player. The reason for
this is that the player can only "see" what's on the computer screen, whereas the computer AI system has access to all
variables and data that the player can't access.
This brings us to the concept of simulated sensory organs for the bad guys and game objects. For example, in a three-
dimensional tank game that takes place on a flat plain, the player can only see so far based on his or her field of
view. Further, the player can't see through rocks, buildings, and obstacles. However, because the game logic has
access to all the system variables and data structures, it is tempting for it to use this extra data to help with the AI for
the bad guys.
The question is, is this fair to the player? Well, of course not. So how can we make sure we supply the AI engine of
the bad guys and game objects with the same information the player has? We must use simulated sensory inputs such
as vision, hearing, vibration, and the like. Figure 5 is an example of one such imaginary tank game. Notice that each
opponent and the player has a cone of vision associated with it. Both the bad guys and the player can only see objects
within this cone. The player can only see within this cone as a function of the 3D graphics engine, but the bad guys
can only see within this cone as a function of their AI program. Let's be a little more specific about this.
Since we know that we must be fair to the player, what we can do is write a simple algorithm that scans the area in
front of each bad guy and determines if the player is within view. This scanning is similar to the player viewing the
viewport or looking out the virtual window. Of course, we don't need to perform a full three-dimensional scan with
ray tracing or the like -- we can simply make sure the player is within the view angle of the bad guy in question by
using trigonometry or any technique we wish.
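
A sketch of such a scan (my own helper, not code from the article): take the angle from the bad guy to
the player with atan2 and compare it against the bad guy's heading and cone half-angle, all in radians.

#include <math.h>

#ifndef M_PI
#define M_PI 3.14159265358979
#endif

// returns 1 if the player at (px,py) is inside the cone of vision of a
// bad guy at (bx,by) facing along 'heading', 0 otherwise
int Player_In_View(float bx, float by, float heading, float half_fov,
                   float px, float py)
{
    float diff = (float)atan2(py - by, px - bx) - heading;

    // wrap the angular difference into [-PI, PI]
    while (diff >  M_PI) diff -= 2.0f * (float)M_PI;
    while (diff < -M_PI) diff += 2.0f * (float)M_PI;

    return fabs(diff) <= half_fov;
}

A real engine would layer a line-of-sight test against rocks and buildings on top of this, but even the
angle test alone keeps the bad guys from seeing out of the backs of their heads.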
Based on the information obtained from each bad guy scan, the proper AI decision can be made in a more uniform
and fair manner. Of course, we may want to give the computer-controlled AI system more advantage than the human
player to make up for the AI system itself being rather primitive when compared to the 100-billion-cell neural
network it is competing against, but you get the idea.
Finally, we might ask, "Can we perform other kinds of sensing?" Yes. We can create simulated light detectors, sound
detectors, and so forth. I have been experimenting with an underwater game engine, and in total darkness the only
way the enemy creatures can "see" you is to listen to your propulsion units. Based on the power level of the player's
engines the game AI determines the sound level that the bad guys hear and moves them toward the sound source or
sources.

Memory and Learning


The final topic we're going to touch upon is memory and learning. Memory is easy enough to understand, but
learning is a bit more nebulous. Learning as far as we are concerned is the ability to interact in an environment in
such a way that behaviors that seem to work better than others under certain conditions are "memorized" and used
more often. In essence, learning is based on memory of past actions being good or bad or whatever. Imagine that we
have written a fairly complex game composed of computer-controlled aliens. These aliens use an FSM- based AI
engine and environmental sensing. The problem is that one of the resources in the game is energion cubes and the
player and aliens must compete for these cubes.
As the player is moving around in the environment, he or she can create a mental map of where energion cubes seem
to be plentiful, but the alien creatures have no such ability; they can only stand there and are at a disadvantage. Can we
give them a memory and teach them where these energion cubes are? Of course we can, we are cybergods!
One such implementation would work as follows: We could use a simple data structure that would track the number
of times an alien found energion in each geographical region of the game. (Figure 6 illustrates one such memory
map.) Then, when an alien was power hungry, instead of randomly bouncing around, the alien would refer to this
memory data structure and select the geographical region with the highest probability of finding energion and set its
trajectory for this region.
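
A sketch of one such memory map (the region granularity and names are mine):

#define REGIONS_X 8
#define REGIONS_Y 8

// how many times an alien has found energion in each region
int energion_found[REGIONS_Y][REGIONS_X];

// the learning step: record a successful find
void Remember_Find(int rx, int ry)
{
    energion_found[ry][rx]++;
}

// the recall step: return the region with the most recorded finds,
// the place a power-hungry alien should head for
void Best_Region(int *out_rx, int *out_ry)
{
    int x, y, best = -1;
    for (y = 0; y < REGIONS_Y; y++)
        for (x = 0; x < REGIONS_X; x++)
            if (energion_found[y][x] > best)
            {
                best = energion_found[y][x];
                *out_rx = x;
                *out_ry = y;
            }
}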
The previous example is a simple one, but as we can see, memory and learning are actually very easy to implement.
Moreover, we can make the computer AI learn much more than where energion is. It could learn the most common
defensive moves of the player and use this information against the player.
Well that's enough for basic AI techniques. Let's take a quick look at how we can put it all together.

Building Monsters from the Id


We have quite a repertoire of computer AI tricks at our fingertips, so how should we use it all? Basically, when you
write a game and are implementing the AI, you should list the types of behaviors that each game object or bad guy
needs to exhibit. Simple creatures should use deterministic logic, randomness, and patterns. Complex creatures that
will interact with the player should use an FSM-based AI engine. And the main game objects that harass and test the
player should use an FSM and sensory inputs and memory. Figure 7 illustrates a final model of the most advanced
AI engine we can construct with what we have to work with.

The Future
I see AI as the next frontier to explore. Without a doubt, most game programmers have focused so much on graphics
that AI hasn't been researched much. The irony is that researchers have been making leaps and bounds in AI research
and Artificial Life or A-Life.
I'm sure you've heard the common terms "genetic algorithms" and "neural networks." Genetic algorithms are simply
a method of representing some aspect of a computer-based AI model with a set of "genes," which can represent
whatever we wish--aggressiveness, maximum speed, maximum vision distance, and so on. Then, a population of
creatures is generated using an algorithm that adds a little randomness in each of the output creatures' genes.
Our game world is then populated with these gene-based creatures. As the creatures interact in the environment, they
are killed, survive, and are reborn. The biological analog comes into play during the rebirthing phase. Either
manually or by some other means, the computer AI engine "mates" various pairs of creatures and mixes their genes.
The resulting offspring then survive another generation and the process continues. This causes the creatures to
evolve so that they are most adapted for the given environment.
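As a sketch of the rebirthing phase (the gene layout and mutation rate are invented for illustration),
mating can be as simple as a single-point crossover plus occasional mutation:

#include <stdlib.h>

#define NUM_GENES 4   // e.g. aggressiveness, speed, vision range, fire rate

// single-point crossover of two parents, with a small chance of mutation
void Mate(const int mom[NUM_GENES], const int dad[NUM_GENES],
          int child[NUM_GENES])
{
    int i, cut = rand() % NUM_GENES;

    for (i = 0; i < NUM_GENES; i++)
        child[i] = (i < cut) ? mom[i] : dad[i];

    // mutate one gene about 10% of the time to keep the gene pool varied
    if (rand() % 10 == 0)
        child[rand() % NUM_GENES] += rand() % 5 - 2;
}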
Neural networks, on the other hand, are computer abstractions of a collection of brain cells that have firing
thresholds. You can enhance or diminish these thresholds and the connections between cells. By "teaching" a neural
network or strengthening and weakening these connections, the neural net can learn something. So we can use these
nets to help make decisions and even come up with new methods.
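A single such cell reduces to a weighted sum compared against a firing threshold; "teaching" the net
means adjusting the weights. A minimal sketch:

#define NUM_INPUTS 3

// one artificial neuron: fires (returns 1) when the weighted sum of its
// inputs crosses the threshold; learning strengthens or weakens weights
int Neuron_Fire(const float inputs[NUM_INPUTS],
                const float weights[NUM_INPUTS], float threshold)
{
    int i;
    float sum = 0.0f;

    for (i = 0; i < NUM_INPUTS; i++)
        sum += inputs[i] * weights[i];

    return sum >= threshold;
}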

Andre LaMothe is the author of the best-selling Tricks of the Game Programming Gurus (SAMS Publishing, 1994)
and Teach Yourself Game Programming in 21 Days (SAMS Publishing, 1994). His latest creation is the Black Art of
3D Game Programming (Waite Group Press, 1995).


Chess Programming Part I: Getting Started


by François Dominic Laramée
This is the first article in a six-part series about programming computers to play chess, and by extension other similar
strategy games of perfect information.
Chess has been described as the Drosophila melanogaster of artificial intelligence, in the sense that the game has
spawned a great deal of successful research (including a match victory against the current world champion and
arguably the best player of all time, Garry Kasparov), much like many of the discoveries in genetics over the years
have been made by scientists studying the tiny fruit fly. This article series will describe some of the state-of-the-art
techniques employed by the most successful programs in the world, including Deep Blue.
Note that by the time the series is completed (in October), I will have written a simple implementation of the game in
Java, and the source code will be freely available for download on my web site. So if you want to see more code
samples, be patient; I'll give you plenty in due time!

Games of Perfect Information


Chess is defined as a game of "perfect information", because both players are aware of the entire state of the game
world at all times: just by looking at the board, you can see which pieces are alive and where they are located.
Checkers, Go, Go-Moku, Backgammon and Othello are other members of the category, but stud poker is not (you
don't know which cards your opponent is holding in his hand).
Most of the techniques described in this series will apply more or less equally to all games of perfect information,
although the details will vary from game to game. Obviously, while a search algorithm is a search algorithm no
matter what the domain, move generation and position evaluation will depend completely on the rules of the game
being played!

What We Need
In order to play chess, a computer needs a certain number of software components. At the very least, these include:
● Some way to represent a chess board in memory, so that it knows what the state of the game is.

● Rules to determine how to generate legal moves, so that it can play without cheating (and verify that its human
opponent is not trying to pull a fast one on it!)
● A technique to choose the move to make amongst all legal possibilities, so that the program plays deliberately
instead of picking a move at random.
● A way to compare moves and positions, so that it makes intelligent choices.

● Some sort of user interface.

This series will cover all of the above, except the user interface, which is essentially a 2D game like any other. The
rest of this article describes the major issues related to each component and introduces some of the concepts to be
explored in the series.

Board Representations
In the early days of chess programming, memory was extremely limited (some programs ran in 8K or less) and the
simplest, least expensive representations were the most effective. A typical chessboard was implemented as an 8x8
array, with each square represented by a single byte: an empty square was allocated value 0, a black king could be
represented by the number 1, etc.
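In Java--the language the author adopts later in this series--such a representation might be sketched as follows (the
piece constants are illustrative):

    public class SimpleBoard {
        public static final byte EMPTY = 0;
        public static final byte BLACK_KING = 1;   // ... one constant per piece type
        public static final byte WHITE_KING = 7;

        // One byte per square; index = rank * 8 + file.
        private final byte[] squares = new byte[64];

        public byte pieceAt(int rank, int file) {
            return squares[rank * 8 + file];
        }
    }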
When chess programmers started working on 64-bit workstations and mainframes, more elaborate board
representations based on "bitboards" appeared. Apparently invented in the Soviet Union in the late 1960's, the bit
board is a 64-bit word containing information about one aspect of the game state, at a rate of 1 bit per square. For
example, a bitboard might contain "the set of squares occupied by black pawns", another "the set of squares to which
a queen on e3 can move", and another, "the set of white pieces currently attacked by black knights". Bitboards are
versatile and allow fast processing, because many operations that are repeated very often in the course of a chess
game can be implemented as 1-cycle logic operations on bitboards.
Part II of this series covers board representations in detail.

Move Generation
The rules of the game determine which moves (if any) the side to play is allowed to make. In some games, it is easy
to look at the board and determine the legal moves: for example, in tic-tac-toe, any empty square is a legal move.
For chess, however, things are more complicated: each piece has its own movement rules, pawns capture diagonally
and move along a file, it is illegal to leave a king in check, and the "en passant" captures, pawn promotions and
castling moves require very specific conditions to be legal.
In fact, it turns out that move generation is one of the most computationally expensive and complicated aspects of
chess programming. Fortunately, the rules of the game allow quite a bit of pre-processing, and I will describe a set
of data structures which can speed up move generation significantly.
Part III of this series covers this topic.

Search Techniques
To a computer, it is far from obvious which of many legal moves are "good" and which are "bad". The best way to
discriminate between the two is to look at their consequences (i.e., search sequences of moves, say four for each side,
and look at the results). And to make sure that we make as few mistakes as possible, we will assume that the opponent is
just as good as we are. This is the basic principle underlying the minimax search algorithm, which is at the root of
all chess programs.
Unfortunately, minimax' complexity is O(b^n), where b (the "branching factor") is the number of legal moves available
on average at any given time and n (the depth) is the number of "plies" you look ahead, where one ply is one move
by one side. This number grows impossibly fast, so a considerable amount of work has been done to develop
algorithms that minimize the effort expended on search for a given depth. Iterative-deepening Alphabeta, NegaScout
and MTD(f) are among the most successful of these algorithms, and they will be described in Part IV, along with the
data structures and heuristics which make strong play possible, such as transposition tables and the history/killer
heuristic.
Another major source of headaches for chess programmers is the "horizon effect", first described by Hans Berliner.

Suppose that your program searches to a depth of 8-ply, and that it discovers to its horror that the opponent will
capture its queen at ply 6. Left to its own devices, the program will then proceed to throw its bishops to the wolves
so that it will delay the queen capture to ply 10, which it can't see because its search ends at ply 8. From the
program's point of view, the queen is "saved", because the capture is no longer visible... But it has lost a bishop, and
the queen capture reappears during the next move's search. It turns out that finding a position where a program can
reason correctly about the relative strength of the forces in presence is not a trivial task at all, and that searching
every line of play to the same depth is tantamount to suicide. Numerous techniques have been developed to defeat
the horizon effect; quiescence search and Deep Blue's singular extensions are among the topics covered in Part V on
advanced search.

Evaluation
Finally, the program must have some way of assessing whether a given position means that it is ahead or that it has
lost the game. This evaluation depends heavily upon the rules of the game: while "material balance" (i.e., the
number and value of the pieces on the board) is the dominant factor in chess, because being ahead by as little as a
single pawn can often guarantee a victory for a strong player, it is of no significance in Go-Moku and downright
misleading in Othello, where you are often better off with fewer pieces on the board until the very last moment.
Developing a useful evaluation function is a difficult and sometimes frustrating task. Part VI of this series covers the
efforts made in that area by the developers of some of the most successful chess programs of all time, including
Chess 4.5, Cray Blitz and Belle.

Conclusion
Now that we know which pieces we will need to complete the puzzle, it is time to get started on that first corner.
Next month, I will describe the most popular techniques used to represent chess boards in current games. See you
there!

François Dominic Laramée, April 2000


Chess Programming Part II: Data Structures


by François Dominic Laramée
Last month, I presented the major building blocks required to write a program to play chess, or any other two-player
game of perfect information. Today, I will discuss in a bit more detail the most fundamental of these building
blocks: the internal representation of the game board.
You may be surprised to notice that, for all intents and purposes, the state of the art in this area has not changed in
thirty years. This is due to a combination of ingenuity (i.e., smart people made smart choices very early in the field's
history) and necessity (i.e., good data structures were a pre-requisite to everything else, and without these effective
techniques, not much would have been achieved.)
While we're at it, I will also present three support data structures which, although not absolutely required to make the
computer play, are invaluable if you want it to play well. Of these, two (one of which consumes ungodly amounts of
memory) are designed to accelerate search through the game tree, while the third is used to speed up move
generation.
Before we go any further, a word to the wise: in chess as in any other game, the simplest data structure that will get
the job done is usually the one you should use. While chess programmers have developed numerous clever data
representation tricks to make their programs go faster, very simple stuff is quite sufficient in many other games. If
you are a novice working on a game for which there is limited literature, start with something easy, encapsulate it
well, and you can experiment with more advanced representations once your program works.

Basic Board Representations


Back in the 1970's, personal computer memory was at a premium (that's an understatement if there ever was one!),
so the more compact the board representation, the better. There is a lot to be said for the most self-evident scheme:
a 64-byte array, where each byte represents a single square on the board and contains an integer constant
representing the piece located in that square. (Any chess board data structure also needs a few bytes of storage to
track down en passant pawn capture opportunities and castling privileges, but we'll ignore that for now, since this is
usually implemented separately, and pretty much always the same way.)
A few refinements on this technique soon became popular:
● The original SARGON extended the 64-byte array by surrounding it with two layers of "bogus squares"
containing sentinel values marking the squares as illegal (see the sketch after this list). This trick accelerated
move generation: for example, a bishop would generate moves by sliding one square at a time until it reached an
illegal square, then stop. No need for complicated a priori computations to make sure that a move would not take a
piece out of the memory area associated with the board. The second layer of fake squares is required by knight
moves: for example, a knight on a corner square might try to jump out of bounds by two columns, so a single
protection layer would be no protection at all!
● MYCHESS reversed the process and represented the board in only 32 bytes, each of which was associated
with a single piece (i.e., the white king, the black King's Knight's pawn, etc.) and contained the number of the
square where that piece was located, or a sentinel value if the piece had been captured. This technique had a
serious drawback: it was impossible to promote a pawn to a piece which had not already been captured. Later
versions of the program fixed this problem.
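Here is a sketch, in Java, of the board layout the SARGON trick implies (sizes and names are illustrative): the real
8x8 board sits inside a 12x12 array whose border squares hold a sentinel value, so a sliding piece simply stops when
it reads OFF_BOARD.

    public class MailboxBoard {
        public static final byte OFF_BOARD = -1;
        public static final byte EMPTY = 0;

        // 12x12: the 8x8 board surrounded by two layers of sentinel squares.
        // Index of a real square = (rank + 2) * 12 + (file + 2).
        private final byte[] squares = new byte[144];

        public MailboxBoard() {
            java.util.Arrays.fill(squares, OFF_BOARD);
            for (int rank = 0; rank < 8; rank++)
                for (int file = 0; file < 8; file++)
                    squares[(rank + 2) * 12 + (file + 2)] = EMPTY;
        }

        // Slide from 'from' in direction 'delta' (e.g., +13 for one diagonal)
        // until we run off the board or hit a piece; no bounds checking needed.
        public void slide(int from, int delta) {
            for (int sq = from + delta; squares[sq] != OFF_BOARD; sq += delta) {
                // ... record a move to sq here ...
                if (squares[sq] != EMPTY) break;   // capture or blocked: stop
            }
        }
    }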


Today, this type of super-miserly structure would only be useful (maybe) if you wrote a chess program for a Palm
Pilot, a cell phone or a set-top box where 80-90% of resources are consumed by the operating system and non-game
services. However, for some other types of games, there is really no alternative!
For more information on vintage chess programs, read David Welsh's book, Computer Chess, published in 1984.

Bit Boards
For many games, it is hard to imagine better representations than the simple one-square, one-slot array. However,
for chess, checkers and other games played on a 64-square board, a clever trick was developed (apparently by the
KAISSA team in the Soviet Union) in the late 60's: the bit board.
KAISSA ran on a mainframe equipped with a 64-bit processor. Now, 64 happens to be the number of squares on a
chess board, so it was possible to use a single memory word to represent a yes-or-no or true-or-false predicate for the
whole board. For example, one bitboard might contain the answer to "Is there a white piece here?" for each square
of the board.
Therefore, the state of a chess game could be completely represented by 12 bitboards: one each for the presence of
white pawns, white rooks, black pawns, etc. Adding two bitboards for "all white pieces" and "all black pieces"
might accelerate further computations. You might also want to hold a database of bitboards representing the squares
attacked by a certain piece on a certain square, etc.; these constants come in handy at move generation time.
The main justification for bit boards is that a lot of useful operations can be performed using the processor's
instruction set's 1-cycle logical operators. For example, suppose you need to verify whether the white queen is
checking the black king. With a simple square-array representation, you would need to:
● Find the queen's position, which requires a linear search of the array and may take 64 load-test cycles.

● Examine the squares to which it is able to move, in all eight directions, until you either find the king or run out
of possible moves.
This process is always time-consuming, more so when the queen happens to be located near the end of the array, and
even more so when there is no check to be found, which is almost always the case!
With a bitboard representation, you would:
● Load the "white queen position" bitboard.

● Use it to index the database of bitboards representing squares attacked by queens.

● Logical-AND that bitboard with the one for "black king position".

If the result is non-zero, then the white queen is checking the black king. Assuming that the attack bitboard database
is in cache memory, the entire operation has consumed 3-4 clock cycles!
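In Java, whose long type conveniently holds exactly 64 bits, the test reads almost exactly like the description above
(a sketch: queenAttacks is a hypothetical pre-computed table, and blockers are ignored for simplicity):

    public class BitboardTricks {
        // Hypothetical pre-computed database: squares attacked by a queen
        // from each of the 64 squares, one bitboard per square.
        static final long[] queenAttacks = new long[64];

        // True if the white queen on 'queenSquare' attacks the black king.
        static boolean queenChecksKing(int queenSquare, long blackKingBitboard) {
            return (queenAttacks[queenSquare] & blackKingBitboard) != 0L;
        }

        // The knight case described next: legal destinations are the attacked
        // squares minus those occupied by our own pieces.
        static long knightDestinations(long knightAttacks, long allWhitePieces) {
            return knightAttacks & ~allWhitePieces;
        }
    }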
Another example: if you need to generate the moves of the white knights currently on the board, just find the attack
bitboards associated with the positions occupied by the knights and AND them with the logical complement of the
bitboard representing "all squares occupied by white pieces" (i.e, apply the logical NOT operator to the bitboard),
because the only restriction on knights is that they can not capture their own pieces!
For a (slightly) more detailed discussion of bitboards, see the article describing the CHESS 4.5 program developed at
Northwestern University, in Peter Frey's book Chess Skill in Man and Machine; there are at least two editions of this
book, published in 1977 and 1981.
Note: To this day, few personal computers use true 64-bit processors, so at least some of the speed advantages
associated with bitboards are lost. Still, the technique is pervasive, and quite useful.

Transposition Tables
In chess, there are often many ways to reach the same position. For example, it doesn't matter whether you play 1.
P-K4 ... 2. P-Q4 or 1. P-Q4... 2. P-K4; the game ends up in the same state. Achieving identical positions in different
ways is called transposing.
Now, of course, if your program has just spent considerable effort searching and evaluating the position resulting
from 1. P-K4 ... 2. P-Q4, it would be nice if it were able to remember the results and avoid repeating this tedious
work for 1. P-Q4... 2. P-K4. This is why all chess programs, since at least Richard Greenblatt's Mac Hack VI in the
late 1960's, have incorporated a transposition table.
A transposition table is a repository of past search results, usually implemented as a hash dictionary or similar
structure to achieve maximum speed. When a position has been searched, the results (i.e., evaluation, depth of the
search performed from this position, best move, etc.) are stored in the table. Then, when new positions have to be
searched, we query the table first: if suitable results already exist for a specific position, we use them and bypass the
search entirely.
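A bare-bones version might look like this in Java (a minimal sketch; real programs use a fixed-size array indexed by
the hash key rather than a garbage-collected map, and the field names are illustrative):

    import java.util.HashMap;
    import java.util.Map;

    public class TranspositionTable {
        public static class Entry {
            public int evaluation;   // score computed for the position
            public int depth;        // depth of the search that produced it
            public int bestMove;     // best move found from the position
        }

        private final Map<Long, Entry> table = new HashMap<>();

        public void store(long hashKey, Entry entry) {
            table.put(hashKey, entry);
        }

        // Only reuse a stored result if it came from a search at least as
        // deep as the one we are about to perform.
        public Entry probe(long hashKey, int requiredDepth) {
            Entry entry = table.get(hashKey);
            return (entry != null && entry.depth >= requiredDepth) ? entry : null;
        }
    }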
There are numerous advantages to this process, including:
● Speed. In situations where there are lots of possible transpositions (i.e., in the endgame, when there are few
pieces on the board), the table quickly fills up with useful results and 90% or more of all positions generated
will be found in it.
● Free depth. Suppose you need to search a given position to a certain depth; say, four-ply (i.e., two moves for
each player) ahead. If the transposition table already contains a six-ply result for this position, not only do you
avoid the search, but you get more accurate results than you would have if you had been forced to do the
work!
● Versatility. Every chess program has an "opening book" of some sort, i.e., a list of well-known positions and
best moves selected from the chess literature and fed to the program to prevent it from making a fool out of
itself (and its programmer) at the very beginning of the game. Since the opening book's modus operandi is
identical to the transposition table (i.e., look up the position, and spit out the results if there are any), why not
initialize the table with the opening book's content at the beginning of the game? This way, if the flow of the
game ever leaves the opening book and later transposes back into a position that was in it, there is a chance that
the transposition table will still contain the appropriate information and be able to use it.
The only real drawback of the transposition table mechanism is its voracity in terms of memory. To be of any use
whatsoever, the table must contain several thousand entries; a million or more is even better. At 16 bytes or so per
entry, this can become a problem in memory-starved environments.
Other uses of transposition tables
CHESS 4.5 also employed hash tables to store the results of then-expensive computations which rarely changed in
value or alternated between a small number of possible choices:
● Pawn structure. Indexed only on the positions of pawns, this table requires little storage, and since there are
comparatively few possible pawn moves, it changes so rarely that 99% of positions result in hash table hits.
● Material balance, i.e., the relative strength of the forces on the board, which only changes after a capture or a
pawn promotion.
This may not be as useful in these days of plentiful CPU cycles, but the lesson is a valuable one: some measure of
pre-processing can save a lot of computation at the cost of a little memory. Study your game carefully; there may be
room for improvement here.

Generating Hash Keys for Chess Boards


The transposition tables described above are usually implemented as hash dictionaries of some sort, which brings up
the following topic: how do you generate hashing keys from chess boards, quickly and efficiently?
The following scheme was described by Zobrist in 1970:
● Generate 12x64 N-bit random numbers (where the transposition table has 2^N entries) and store them in a
table. Each random number is associated with a given piece on a given square (i.e., black rook on H4, etc.)
An empty square is represented by a null word.
● Start with a null hash key.

● Scan the board; when you encounter a piece, XOR its random number to the current hash key. Repeat until
the entire board has been examined.
An interesting side effect of the scheme is that it will be very easy to update the hash value after a move, without
re-scanning the entire board. Remember the old XOR-graphics? The way you XOR'ed a bitmap on top of a
background to make it appear (usually in distorted colors), and XOR'ed it again to make it go away and restore the
background to its original state? This works similarly. Say, for example, that a white rook on H1 captures a black
pawn on H4. To update the hash key, XOR the "white rook on H1" random number once again (to "erase" its
effects), the "black pawn on H4" (to destroy it) and the "white rook on H4" (to add a contribution from the new rook
position).
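A Java sketch of the scheme, including the incremental update for the rook-takes-pawn example above (the piece and
square encodings are illustrative):

    import java.util.Random;

    public class Zobrist {
        // One random number per (piece type, square): 12 x 64 entries.
        static final long[][] RANDOMS = new long[12][64];
        static {
            Random rng = new Random(42);   // fixed seed: keys are reproducible
            for (int piece = 0; piece < 12; piece++)
                for (int square = 0; square < 64; square++)
                    RANDOMS[piece][square] = rng.nextLong();
        }

        // Full scan: XOR the numbers of every piece on the board together.
        // pieceOnSquare[sq] holds a piece code, or -1 for an empty square.
        static long fullKey(int[] pieceOnSquare) {
            long key = 0L;
            for (int sq = 0; sq < 64; sq++)
                if (pieceOnSquare[sq] >= 0)
                    key ^= RANDOMS[pieceOnSquare[sq]][sq];
            return key;
        }

        // Incremental update for a capture, e.g. white rook on H1 takes the
        // black pawn on H4: three XORs instead of a full board scan.
        static long updateForCapture(long key, int mover, int from, int victim, int to) {
            key ^= RANDOMS[mover][from];   // erase the rook from H1
            key ^= RANDOMS[victim][to];    // remove the captured pawn
            key ^= RANDOMS[mover][to];     // add the rook on H4
            return key;
        }
    }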
Use the exact same method, with different random numbers, to generate a second key (or "hash lock") to store in the
transposition table along with the truly useful information. This is used to detect and avoid collisions: if, by chance,
two boards hash to the exact same key and collide in the transposition table, odds are extremely low that they will
also hash to the same lock!

History Tables
The "history heuristic" is a descendant of the "killer move" technique. A thorough explanation belongs in the article
on search; for now, it will be sufficient to say that a history table should be maintained to note which moves have
had interesting results in the past (i.e., which ones have cut-off search quickly along a continuation) and should
therefore be tried again at a later time. The history table is a simple 64x64 array of integer counters; when the search
algorithm decides that a certain move has proven useful, it will ask the history table to increase its value. The values
stored in the table will then be used to sort moves and make sure that more "historically powerful" ones will be tried
first.
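The table really is that simple; a sketch in Java:

    public class HistoryTable {
        // One counter per (from, to) square pair.
        private final int[][] counters = new int[64][64];

        // Called by the search when a move causes a quick cutoff.
        public void recordSuccess(int from, int to) {
            counters[from][to]++;
        }

        // Used as a sort key: "historically powerful" moves are tried first.
        public int score(int from, int to) {
            return counters[from][to];
        }
    }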

Pre-processing move generation


Move generation (i.e., deciding which moves are legal given a specific position) is, with position evaluation, the
most computationally expensive part of chess programming. Therefore, a bit of pre-processing in this area can go a
long way towards speeding up the entire game.
The scheme presented by Jean Goulet in his 1984 thesis Data Structures for Chess (McGill University) is a personal
favorite. In a nutshell:
● For move generation purposes, piece color is irrelevant except for pawns which move in opposite directions.

● There are 64 x 5 = 320 combinations of major piece and square from which to move, 48 squares on which a
black pawn can be located (they can never retreat to the back rank, and they get promoted as soon as they
reach the eighth rank), and 48 where a white pawn can be located.
● Let us define a "ray" of moves as a sequence of moves by a piece, from a certain square, in the same
direction. For example, all queen moves towards the "north" of the board from square H3 make up a ray.
● For each piece on each square, there are a certain number of rays along which movement might be possible.
For example, a king in the middle of the board may be able to move in 8 different directions, while a bishop
trapped in a corner only has one ray of escape possible.
● Prior to the game, compute a database of all rays for all pieces on all squares, assuming an empty board (i.e.,
movement is limited only by the edges and not by other pieces).
● When you generate moves for a piece on a square, scan each of its rays until you either reach the end of the
ray or hit a piece. If it is an enemy piece, this last move is a capture. If it is a friendly piece, this last move is
impossible.
With a properly designed database, move generation is reduced to a simple, mostly linear lookup; it requires virtually
no computation at all. And the entire thing holds within a few dozen kilobytes; mere chump change compared to the
transposition table!
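A sketch of that lookup in Java, assuming the ray database has been pre-computed as described (all names are
illustrative):

    import java.util.List;

    public class RayMoveGenerator {
        // Pre-computed before the game on an empty board: RAYS[piece][square]
        // is an array of rays, each ray an ordered array of destination squares.
        static int[][][][] RAYS;

        // board[sq]: 0 = empty, > 0 = friendly piece, < 0 = enemy piece.
        static void generate(int piece, int square, int[] board, List<Integer> moves) {
            for (int[] ray : RAYS[piece][square]) {
                for (int dest : ray) {
                    if (board[dest] == 0) { moves.add(dest); continue; } // quiet move
                    if (board[dest] < 0) moves.add(dest);                // capture
                    break;                                               // ray blocked
                }
            }
        }
    }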
All of the techniques described above (bit boards, history, transposition table, pre-processed move database) will be
illustrated in my own chess program, to be posted when I finish writing this series. Next month, I will examine
move generation in more detail.
François Dominic Laramée, June 2000
Chess Programming Part III: Move Generation


by François Dominic Laramée
Last month, I finished Part II of this series by introducing a data structure which pre-processes and stores most of the
work related to chess move generation. This time, I will return to the topic in more detail, examining the two major
move generation strategies and explaining how to choose between them for a given application.

The Dilemma
No matter how you slice it, chess is a complicated game, especially for a computer.
In any given situation, a player may have 30 or more legal moves to choose from, some good, some suicidal. For
trained humans, it is easy to characterize the majority of these moves as foolish or pointless: even beginners learn
that they had better come up with a solid plan before leaving their queen in a position where she can be captured, and
masters know (more through instinctive pattern matching than by conscious effort) which 1-2 moves are likely to be
the strongest in the position.
However, coding this information (especially the unconscious type!) into a computer has proven spectacularly
difficult, and the strongest programs (except, to some extent, Hans Berliner's Hitech and its siblings) have given up
on this approach, instead relying on "brute force": if you can analyze all possible moves fast enough and predict their
consequences far enough down the road, it doesn't matter whether or not you start with a clear idea of what you are
trying to accomplish, because you'll discover a good move eventually. Therefore, move generation and search
should be made as fast as possible, so as to minimize the loss of effort required by the brute force method.
Search will be discussed in Parts IV and V of this series; this month, we will concentrate on move generation.
Historically, three major strategies have been used in this area:
● Selective generation: Examine the board, come up with a small number of "likely" moves and discard
everything else.
● Incremental generation: Generate a few moves, hoping that one will prove so good or so bad that search along
the current line of play can be terminated before generating the others.
● Complete generation: Generate all moves, hoping that the transposition table (discussed in Part II) will contain
relevant information on one of them and that there will be no need to search anything at all.
Selective generation (and its associated search technique, called forward pruning) have all but disappeared since the
mid 1970's. As for the other two, they represent two sides of the same coin, trading off effort in move generation vs
search. In games where move generation is easy and/or there are lots of ways to transpose into the same positions
(i.e., Othello and GoMoku), complete generation may be most efficient, while in games where move generation rules
are complicated, incremental generation will usually work faster. Both strategies are sound and viable, however.

The Early Days: Forward Pruning


In 1949 (yes, really), Claude Shannon described two ways to build a chess-playing algorithm:
● Look at all possible moves, and all the possible moves resulting from each, recursively.

● Only examine the "best" moves, as determined from a detailed analysis of a position, and then only the "best"
replies to each, recursively.
At first, the second alternative seemed more likely to succeed. After all, this is how human players do it, and it
seems logical to assume that looking at only a few moves at each node will be faster than looking at all of them.
Unfortunately, the results disproved the theory: programs using selective search just didn't play very well. At best,
they achieved low to mid-level club player ratings, often committing humiliating blunders at the worst possible
time. Beating a world champion (or even playing reasonably well on a consistent basis) was beyond their reach.
The problem is that, for a "best move generator" to be any good, it has to be almost perfect. Suppose that a program
is equipped with a function that looks for the 5 best moves in a position, and that the objective best move is among
those 5 at least 95% of the time. (That, by the way, is a big assumption.) This means that the probability that the
generator's list will always contain the best choice at all times, during a 40-move game, is less than 13%. Even a
god-like generator with 99% accuracy will blunder at least once in about a third of its games, while a more
reasonable 90%-accurate function will play an entire game without messing up less than 1.5% of the time!
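The arithmetic behind these numbers is simple compound probability: if the generator's list contains the objectively
best move with probability p at each turn, the chance of getting through a 40-move game without a single miss is p^40:

    0.95^40 ≈ 0.13     (the "95% accurate" generator survives only 13% of games)
    0.99^40 ≈ 0.67     (even 99% accuracy blunders in about a third of games)
    0.90^40 ≈ 0.015    (90% accuracy survives a mere 1.5% of games)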
In the mid 1970's, the legendary Northwestern team of Slate and Atkin decided to do away with the complicated
best-move generator; it turned out that the time they saved in avoiding costly analysis during move generation was
enough to cover the added expense of a full-width search (i.e., examining all possible moves). To all intents and
purposes, this discovery buried forward pruning for good.
Botvinnik's work
An extreme example of a forward pruning algorithm was developed in the Soviet Union, in the 1970's and early
1980's, under the tutelage of former World chess champion Mikhail Botvinnik. Botvinnik was convinced that the
only way for a computer to ever play grandmaster-level chess was to play like a grandmaster, i.e., examine only a
few moves, but in great depth and detail. His program sought to identify and implement the sort of high-level plans
and patterns which a world-class player might come up with during a game. While it led to some fascinating books,
revealing insights into the master's mind which only Botvinnik could provide, this work unfortunately did not reach
its lofty goals.

Generating All Moves Up-Front


Once forward pruning is eliminated from consideration, the most straightforward way to implement full-width
searching consists of:
● Finding all of the legal moves available in a position.

● Ordering them in some way, hopefully speeding up search by picking an advantageous order.

● Searching them all one at a time, until all moves have been examined or a cutoff occurs.

Early programs, for example Sargon, did this by scanning the board one square at a time, looking for pieces of the
moving side, and computing possible move destinations on the fly. Memory being at a premium, the expenditure of
CPU time required to re-compute these moves every time was a necessary evil.
These days, a pre-processed data structure like the one I described last month can avoid a considerable amount of
computation and code complexity, at the cost of a few dozen kilobytes. When this super-fast move generation is
combined with transposition tables, an added bonus may fall into the programmer's lap: if even one of the moves has
already been searched before, and if its evaluation (as retrieved from the table) is such that it triggers a cutoff, there
will be no need to search any of the moves at all! Obviously, the larger the transposition table, and the higher the
probability of a transposition given the rules of the game, the bigger the average payoff.
Not only is this technique conceptually simple, it is also the most "universal": while there are easy ways to segregate
chess moves into different categories (i.e., captures and non-captures), other games like Othello do not provide such
convenient tools to work with. Therefore, if you intend your program to play more than one game, you should
probably pick this technique instead of the one described in the next section.

One Move At A Time


Sophisticated chess programs since at least CHESS 4.5 have adopted the opposite strategy: generate a few moves at
a time, search them, and if a cutoff can be caused, there will be no need to generate the rest of the moves.
A combination of factors has made this technique popular:
● Search does not require much memory. Programs of the 1970's had to make do with small transposition tables
and could ill afford pre-processed move generation data structures, limiting the usefulness of the complete
generation scheme described above.
● Move generation is more complicated in chess than in most other games, with castling, en passant pawn
captures, and different rules for each piece type to contend with.
● Very often, a refutation (i.e., a move that will cause a cutoff) is a capture. For example, if Player A leaves a
queen "en prise", the opponent will quickly grab it and wreck A's game. Since there are usually few possible
captures in a position, and since generating captures separately is relatively easy, computing captures is often
enough to trigger a cutoff at little expense.
● Captures are usually one of the few (if not the only) type of move examined during quiescence search (which
will be discussed in a later article), so generating them separately is doubly useful.
Therefore, many programs will generate captures first, often starting with those of highly valuable pieces, and look
for a quick cutoff. A number of techniques have been developed to speed up piece-meal move generation, most of
them involving bitboards:
● CHESS 4.5 maintains two sets of 64 bitboards, with one bitboard per square on the board. One contains the
squares attacked by the piece, if any, located on that square; the other is the transpose, containing all squares
occupied by pieces attacking that square. Therefore, if the program is looking for moves that would capture
Black's queen, it looks for its position in the basic bitboard database, uses these new data structures to identify
the pieces which attack the queen's position, and only generates moves for these pieces.
● Maintaining these "attack bitboards" after each move requires some rather involved technical wizardry, but a
tool called a rotated bitboard can accelerate the work significantly. The best discussion of rotated bitboards I
have seen was written by James F. Swafford, and can be found on the web at
http://sr5.xoom.com/_XMCM/jswaff/chessprg/rotated.htm

Ordering Moves To Speed Up Search


As we will see next time, search efficiency depends on the order in which moves are searched. The gains and losses
related to good or poor move ordering are not trivial: a good ordering, defined as one which will cause a large
number of cutoffs, will result in a search tree about the square root of the size of the tree associated with the worst
possible ordering!
Unfortunately, it turns out that the best possible ordering is simply defined by trying the best move first. And of
course, if you knew which moves are best, you wouldn't be searching in the first place. Still, there are ways to
"guess" which moves are more likely to be good than others. For example, you might start with captures, pawn
promotions (which dramatically change material balance on the board), or checks (which often allow few legal
responses); follow with moves which caused recent cutoffs at the same depth in the tree (so-called "killer moves"),
and then look at the rest. This is the justification for iterative deepening alphabeta, which we will discuss in detail
next month, as well as the history table we talked about last time. Note that these techniques do not constitute
forward pruning: all moves will be examined eventually; those which appear bad are only delayed, not eliminated
from consideration.
A final note: in chess, some moves may be illegal because they leave the King in check. However, such an
occurrence is quite rare, and it turns out that validating moves during generation would cost a tremendous amount of
effort. It is more efficient to delay the check until the move is actually searched: for example, if capturing the King
would be a valid reply to Move X, then Move X is illegal and search should be terminated. Of course, if search is
cut off before the move has to be examined, validation never has to take place.

My Choice
For my chess program, I have chosen to generate all moves at the same time. These are only some of the reasons
why:
● I intend to use the program as a basis for several other games, most of which have no direct counterparts to
chess captures.
● I have plenty of memory to play with.

● The code required to implement this technique is simpler to write and to understand; you will thank me when
you see it.
● There are several freeware programs that implement piece-meal move generation; the curious reader should
look at Crafty, for example, as well as James Swafford's Galahad.
While overall performance may be slightly less stellar than otherwise, my program (written in Java, no less) wouldn't
exactly provide a challenge to Deep Blue even in the best case, so I won't feel too bad!

Next Month
Now, we are ready to delve into the brains of a chess-playing program, with search techniques. This is such a large
topic that it will require two articles. We will begin with the basic search algorithms common to all games, before
continuing with new developments and chess-specific optimizations in the next installment.

François Dominic Laramée, July 2000


Chess Programming Part IV: Basic Search


by François Dominic Laramée
Fourth article in this complicated, code-deprived, dry-as-Metamucil series, and you're still reading? Drop me an
email if you are, so that I know I'm writing these for a reason!
Anyway, this month's installment focuses on the basics of two-agent search in strategy games: why it is useful, how
to do it, and what it implies for the computer's style of play. The techniques I will discuss apply equally well to most
two-player games, but by themselves, they are not quite sufficient to play good chess; next month, I will describe
advanced techniques which significantly increase playing strength and search efficiency, usually at the same time.

Why Search?
Well, basically, because we are not smart enough to do without it.
A really bright program might be able to look at a board position and determine who is ahead, by how much, and
what sort of plan should be implemented to drive the advantage to fruition. Unfortunately, there are so many patterns
to discern, so many rules and so many exceptions, that even the cleverest programs just aren't very good at this sort
of thing. What they are good at, however, is computing fast. Therefore, instead of trying to figure out good moves
just by looking at a board, chess programs use their brute force to do it: look at every move, then at every possible
countermove by the opponent, etc., until the processor melts down.
Deep searches are an easy way to "teach" the machine about relatively complicated tactics. For example, consider
the knight fork, a move which places a knight on a square from which it can attack two different pieces (say, a rook
and the queen). Finding a way to represent this type of position logically would require some effort, more so if we
also had to determine whether the knight was itself protected from capture. However, a plain dumb 3-ply search will
"learn" the value of a fork on its own: it will eventually try to move the knight to the forking square, will test all
replies to this attack, and then capture one of the undefended pieces, changing the board's material balance. And
since a full-width search looks at everything, it will never miss an opportunity: if there is a 5-move combination,
however obscure, that leads to checkmate or to a queen capture, the machine will see it if its search is deep enough.
Therefore, the deeper the search, the more complicated the "plans" which the machine can stumble upon.

Grandpa MiniMax
The basic idea underlying all two-agent search algorithms is Minimax. It dates back to the Dark Ages; I believe
Von Neumann himself first described it over 60 years ago.
Minimax can be defined as follows:
● Assume that there is a way to evaluate a board position so that we know whether Player 1 (whom we will call
Max) is going to win, whether his opponent (Min) will, or whether the position will lead to a draw. This
evaluation takes the form of a number: a positive number indicates that Max is leading, a negative number,
that Min is ahead, and a zero, that nobody has acquired an advantage.
● Max's job is to make moves which will increase the board's evaluation (i.e., he will try to maximize the
evaluation).
● Min's job is to make moves which decrease the board's evaluation (i.e., he will try to minimize it).
● Assume that both players play flawlessly, i.e., that they never make any mistakes and always make the moves
that improve their respective positions the most.
How does this work? Well, suppose that there is a simple game which consists of exactly one move for each player,
and that each has only two possible choices to make in a given situation. The evaluation function is only run on the
final board positions, which result from a combination of moves by Min and Max.
Max Move    Min Move    Evaluation
A           C               12
A           D               -2
B           C                5
B           D                6

Max assumes that Min will always play perfectly. Therefore, he knows that, if he makes move A, his opponent will
reply with D, resulting in a final evaluation of -2 (i.e., a win for Min). However, if Max plays B, he is sure to win,
because Min's best move still results in a positive final value of 5. So, by the Minimax algorithm, Max will always
choose to play B, even though he would score a bigger victory if he played A and Min made a mistake!
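In code, the algorithm is a short recursive function. Here is a minimal Java sketch (the Position and Move types are
hypothetical stand-ins; the author's real program appears later in the series):

    interface Move {}   // opaque move token

    interface Position {
        boolean isGameOver();
        int evaluate();                        // positive = good for Max
        java.util.List<Move> legalMoves();
        Position play(Move move);              // returns the resulting position
    }

    class Minimax {
        // Returns the value of 'position' assuming both sides play flawlessly.
        static int minimax(Position position, int depth, boolean maxToMove) {
            if (depth == 0 || position.isGameOver())
                return position.evaluate();
            int best = maxToMove ? Integer.MIN_VALUE : Integer.MAX_VALUE;
            for (Move move : position.legalMoves()) {
                int score = minimax(position.play(move), depth - 1, !maxToMove);
                best = maxToMove ? Math.max(best, score) : Math.min(best, score);
            }
            return best;
        }
    }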
The trouble with Minimax, which may not be immediately obvious from such a small example, is that there is an
exponential number of possible paths which must be examined. This means that effort grows dramatically with:
● The number of possible moves by each player, called the branching factor and noted B.

● The depth of the look-ahead, noted d, and usually described as "N-ply", where N is an integer number and
"ply" means one move by one player. For example, the mini-game described above is searched to a depth of
2-ply, one move per player.
In Chess, for example, a typical branching factor in the middle game would be about 35 moves; in Othello, around 8.
Since Minimax' complexity is O( B^d ), even a 4-ply search of a typical chess position would need to explore about
1.5 million possible paths (35^4)! That is a LOT of work. Adding a fifth ply would make the tree balloon to about 50
million nodes, and a sixth, to an impossible 1.8 billion!
Luckily, there are ways to cut the effort by a wide margin without sacrificing accuracy.

Alphabeta: Making Minimax Feasible (a little)


Suppose that you have already searched Max' move B in the mini-game above. Therefore, you know that Max can
guarantee himself a score of 5 by playing it.
Now, suppose that you begin searching move A, and that you start with the path A-D. This path results in a score of
-2. For Max, this is terrible: if he plays A, he is sure to finish with, at best, -2, because Min plays perfectly; if A-C
results in a score higher than A-D's, Min will play A-D, and if A-C should be even worse (say, -20), Min would take
that path instead. Therefore, there is no need to look at A-C, or at any other path resulting from move A: Max must
play B, because the search has already proven that A will end up being a worse choice no matter what happens.
This is the basic idea behind the alphabeta algorithm: once you have a good move, you can quickly eliminate
alternatives that lead to disaster. And there are a lot of those! When combined with the transposition table we
discussed earlier in the series, and which saves us from examining positions twice if they can be reached by different
combinations of moves, alphabeta turns on the Warp drive: in the best case, it will only need to examine roughly
twice the square root of the number of nodes searched by pure Minimax, which is about 2,500 instead of 1.5 million
in the example above.
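Alphabeta adds only a few lines to the minimax sketch shown earlier (same hypothetical Position and Move types; the
first call uses the widest possible window, alpha = -infinity and beta = +infinity):

    class AlphaBeta {
        // Same value as minimax, but branches that cannot influence the final
        // choice are pruned as soon as alpha (Max's guarantee) meets beta
        // (Min's guarantee).
        static int alphabeta(Position pos, int depth, int alpha, int beta, boolean maxToMove) {
            if (depth == 0 || pos.isGameOver())
                return pos.evaluate();
            for (Move move : pos.legalMoves()) {
                int score = alphabeta(pos.play(move), depth - 1, alpha, beta, !maxToMove);
                if (maxToMove) alpha = Math.max(alpha, score);
                else           beta  = Math.min(beta, score);
                if (alpha >= beta) break;   // cutoff: this line is already refuted
            }
            return maxToMove ? alpha : beta;
        }
    }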

Ordering Moves to Optimize Alphabeta


But how can we achieve this best case scenario? Do we even need to?
Not really. It turns out that Alphabeta is always very efficient at pruning the search tree, as long as it can quickly find
a pretty good move to compare others to. This means that it is important to search a good move first; the best case
happens when we always look at the best possible moves before any others. In the worst possible case, however, the
moves are searched in increasing order of value, so that each one is always better than anything examined before; in
this situation, alphabeta can't prune anything and the search degenerates into pure, wasteful Minimax.
Ordering the moves before search is therefore very important. Picking moves at random just won't do; we need a
"smarter" way to do the job. Unfortunately, if there was an easy way to know what the best move would turn out to
be, there would be no need to search in the first place! So we have to make do with a "best guess".
Several techniques have been developed to order the possible moves in as close to an optimal sequence as possible:
● Apply the evaluation function to the positions resulting from the moves, and sort them. Intuitively, this makes
sense, and the better the evaluation function, the more effective this method should be. Unfortunately, it
doesn't work well at all for chess, because as we will see next month, many positions just can't be evaluated
accurately!
● Try to find a move which results in a position already stored in the transposition table; if its value is good
enough to cause a cutoff, no search effort needs to be expended.
● Try certain types of moves first. For example, having your queen captured is rarely a smart idea, so checking
for captures first may reveal "bonehead" moves rapidly.
● Extend this idea to any move which has recently caused a cutoff at the same level in the search tree. This
"killer heuristic" is based on the fact that many moves are inconsequential: if your queen is en prise, it doesn't
matter whether you advance your pawn at H2 by one or two squares; the opponent will still take the queen.
Therefore, if the move "bishop takes queen" has caused a cutoff during the examination of move H2-H3, it
might also cause one during the examination of H2-H4, and should be tried first.
● Extend the killer heuristic into a history table. If, during the course of the game's recent development, moving
a piece from G2 to E4 has proven effective, then it is likely that doing so now would still be useful (even if the
old piece was a bishop and has been replaced by a queen), because conditions elsewhere on the board probably
haven't changed that much. The history heuristic is laughably simple to implement, needing only a pair of
64x64 arrays of integer counters, and yields very interesting results.
Having said all that about "reasonable ideas", it turns out that the most effective method is one which goes against
every single bit of human intuition: iterative deepening.

Iterative Deepening AlphaBeta


If you are searching a position to depth 6, the ideal move ordering would be the one yielded by a prior search of the
same position to the same depth. Since that is obviously impossible, how about using the results of a shallower
search, say of depth 5?
This is the idea behind iterative deepening: begin by searching all moves arising from the position to depth 2, use the
scores to reorder the moves, search again to depth 3, reorder, etc., until you have reached the appropriate depth.
This technique flies in the face of common sense: a tremendous amount of effort is duplicated, possibly 8-10 times or
more. Or is it?
Consider the size of a search tree of depth d with branching factor B. The tree has B nodes at depth 1, B*B at depth
2, B*B*B at depth 3, etc. Therefore, searching to depth (d-1) yields a tree B times smaller than searching to depth d!
If B is large (and remember that it is about 35 during the middle game in chess), the overwhelming majority of the
effort expended during search is devoted to the very last ply. Duplicating a search to depth (d-1) is a trifling matter:
in fact, even if it yielded no advantages whatsoever, iterative deepening would only cost less than 4% extra effort on
a typical chess position!
However, the advantages are there, and they are enormous: using the results of a shallower search to order the moves
prior to a deeper one produces a spectacular increase in the cutoff rate. Therefore, IDAB actually examines far fewer
nodes, on average, than a straight AB search to the same depth using any other technique for move ordering! When a
transposition table enters the equation, the gain is even more impressive: the extra work performed to duplicate the
shallow parts of the search drops to nothing because the results are already stored in the table and need not be
computed again.
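The driver for iterative deepening is almost trivial; a sketch, reusing the alphabeta function from the earlier
example (re-sorting the root moves by the previous pass's scores is the whole trick):

    import java.util.*;

    class IterativeDeepening {
        static Move bestMove(Position root, int maxDepth) {
            List<Move> moves = new ArrayList<>(root.legalMoves());
            Map<Move, Integer> scores = new HashMap<>();
            for (int depth = 2; depth <= maxDepth; depth++) {
                for (Move move : moves)
                    scores.put(move, AlphaBeta.alphabeta(root.play(move), depth - 1,
                            Integer.MIN_VALUE, Integer.MAX_VALUE, false));
                // Re-sort so the next, deeper pass tries the best moves first,
                // maximizing alphabeta cutoffs.
                moves.sort((a, b) -> Integer.compare(scores.get(b), scores.get(a)));
            }
            return moves.get(0);
        }
    }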

Computer Playing Style


Iterative deepening alphabeta combined with a transposition table (and a history table to kickstart the effort) allows
the computer to search every position relatively deeply and to play a reasonable game of chess. That being said, its
Minimax ancestor imposes a very definite playing style on the computer, one which is not exactly the most
spectacular in the world.
For example, suppose that the machine searches a position to depth 8. While looking at a certain move, it sees that
every possible response by the opponent would let it win the game in dazzling manner... Except for a single opaque,
difficult, obfuscated and almost maddeningly counter-intuitive sequence, which (if followed to perfection) would
allow the opponent to salvage a small chance of eventually winning the game. Against a human player (even a
Grandmaster), such a trap might turn the game into one for the history books.
However, if the machine then finds a boring move which always forces a draw, it will immediately discard the trap,
because it assumes that the opponent would find the perfect counter, no matter how improbable that is, and that the
draw is the best it can hope for!
As a result, you might say that machines play an overly cautious game, as if every opponent was a potential world
champion. Combined with the fact that computers often can't search deep enough to detect the traps which human
players devise against them, this allows very skilled humans to "waste time" and confuse the machine into making a
blunder which they can exploit. (Human players also study their opponent's styles for weaknesses; if Kasparov had
been given access to, say, a hundred games played by Deep Blue before their match, he might have been able to find
the flaws in its game and beat it. But we'll never know for sure.)

Next Month
In Part V, we will discuss the limitations of straight, fixed-depth alphabeta search, and how to improve playing
strength using techniques like the null-move heuristic, quiescence search, aspiration search and MTD(f), and the
"singular extensions" which made Deep Blue famous. Hold on, we're almost done!
François Dominic Laramée, August 2000
Chess Programming Part V: Advanced Search


by François Dominic Laramée
Hey, it looks like there are dozens (and dozens) of you reading this series! I'm tickled pink!
In this next-to-last article, we will examine advanced search-related techniques which can speed up and/or strengthen
your chess-playing program. In most cases, the concepts (if not the actual code) also apply to a variety of other
2-player games; adapting them to your particular problem, however, shall remain an exercise for the proverbial
reader.

Why Bother?
So far, all of the search algorithms we have looked at examine a position's consequences to a fixed "depth".
However, this is rarely a good thing. For example, suppose that your program uses an iterative-deepening alpha-beta
algorithm with maximum depth 5-ply. Now look at these cases:
● Along a certain line of play, you discover a position where one of the players is checkmated or stalemated at
depth 3. Obviously, you don't want to keep searching, because the final state of the game has been resolved.
Not only would searching to depth 5 be a colossal waste of effort, it may also allow the machine to finagle its
way into an illegal solution!
● Now, suppose that, at depth 5, you capture a pawn. The program would be likely to score this position in a
favorable light, and your program might decide that the continuation leading to it is a useful one. However, if
you had looked one ply further, you might have discovered that capturing the pawn left your queen
undefended. Oops.
● Finally, suppose that your queen is trapped. No matter what you do, she will be captured by the opponent at
ply 4, except for one specific case where her death will happen at ply 6. If you search to depth 5, the
continuations where the queen is captured at ply 4 will be examined accurately, and scored as likely disasters.
However, the unique line of play where the queen is only captured at ply 6 (outside of the search tree) doesn't
reveal the capture to the machine, which thinks that the queen is safe and gives it a much better score! Now, if
all you have to do to push the queen capture outside of the search tree is delay the opponent with a diversion,
doing so may be worth the risk: although it could damage your position, it might also cause the opponent to
make a mistake and allow the queen to escape. But what if you can only delay the queen capture by
sacrificing a rook? To the machine, losing a rook at ply 4 is less damaging than losing a queen, so it will
merrily throw its good piece away and "hide" the too-horrible-to-mention queen capture beyond its search
horizon. (During its next turn, of course, the machine will discover that the queen must now be captured at ply
4 in all cases, and that it has wasted a rook for no gain.) Hans Berliner described this "horizon effect" a long
time ago, and it is the most effective justification for the "quiescence search" described in the next section.
The bottom line is this: a great many positions in chess (and in other games as well) are just too chaotic to be
evaluated properly. An evaluation function can only be applied effectively to "quiet" positions where not much of
importance is likely to happen in the immediate future. How to identify these is our next topic.


Quiet, Please!
There are two ways to assess a position's value: dynamic evaluation (i.e., look at what it may lead to) and static
evaluation (i.e., see what it looks like on its own, irrespective of consequences). Dynamic evaluation is performed
through search; as we have just mentioned, static evaluation is only feasible when the position is not likely to
undergo an overwhelming change of balance in the near future. Such relatively stable positions are called "quiet" or
"quiescent", and they are identified via "quiescence search".
The basic concept of Quiescence Search is the following: once the program has searched everything to a fixed depth
(say, 6-ply), we continue each line of play selectively, by searching "non-quiescent" moves only, until we find a
quiescent position, and only then apply the evaluator.
Finding a quiet position requires some knowledge about the game. For example, which moves are likely to cause a
drastic change in the balance of power on the board? For chess, material balance tends to be the overwhelming
consideration in the evaluator, so anything that changes material is fair game: captures (especially those of major
pieces) and pawn promotions certainly qualify, while checks may also be worth a look (just in case they might lead
to checkmate). In checkers, captures and promotions also seem like reasonable choices. In Othello, every single
move is a capture, and "material balance" can change so much in so little time that it might be argued that there are
no quiet positions at all!
My own program uses a simple quiescence search which extends all lines of play (after a full-width search to depth
X) by looking exclusively at captures. Since there are usually not that many legal captures in a given position, the
branching factor in the quiescence search tends to be small (4-6 on average, and quickly converging to 0 as pieces
are eaten on both sides). Nevertheless, the quiescence search algorithm is called on a LOT of positions, and so it
tends to swallow 50% or more of the entire processing time. Make sure that you need such a scheme in your own
game before committing to it.
Only when no capture is possible does my program apply its evaluator. The result is a selectively-extended search
tree which is anything but fixed-depth, and which defeats most of the nasty consequences of the "horizon effect".
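
To make the idea concrete, here is a minimal Java sketch of such a capture-only quiescence search. The Board and Move types and their methods (evaluate, generateCaptures, makeMove, unmakeMove) are hypothetical stand-ins, and the "stand pat" bound is a common refinement of the scheme rather than a transcription of my exact code:

    // Capture-only quiescence search (negamax form). All Board/Move
    // methods are assumed helpers, not part of any real library.
    int quiesce(Board board, int alpha, int beta) {
        // "Stand pat": the side to move may decline every capture, so the
        // static evaluation is a lower bound on this node's value.
        int standPat = board.evaluate();
        if (standPat >= beta) return beta;
        if (standPat > alpha) alpha = standPat;

        // Extend the line along captures only; the branching factor here
        // is small and shrinks as pieces come off the board.
        for (Move capture : board.generateCaptures()) {
            board.makeMove(capture);
            int score = -quiesce(board, -beta, -alpha);
            board.unmakeMove(capture);
            if (score >= beta) return beta;    // opponent refutes this line
            if (score > alpha) alpha = score;  // best capture so far
        }
        return alpha;  // quiet: no capture improved on the static score
    }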

The All-Important Null-Move


One of the most effective ways to speed up a chess program is to introduce the concept of a null move into the
equation.
The null move consists, quite simply, of skipping a turn and letting the opponent play two moves in a row. In the
overwhelming majority of positions, doing nothing is a bone-head idea: you should (almost) always be able to do
*something* to improve your lot. (To be honest, there are a few "damned if I do, damned if I don't" positions where
the null move would actually be your best bet, and the computer will not play them correctly, but such "zugzwang"
positions are hopeless anyway, so the loss of performance is not very traumatic.)
Allowing the computer to try a null move during search has several advantages related to speed and accuracy. For
example:
● Suppose that a position is so overwhelmingly in your favor that, even if you skipped your turn, the opponent
couldn't respond with anything that would help. (In program terms, you would get a beta cutoff even without
making a move.) Suppose further that this position is scheduled to be searched to depth N. The null move, in
effect, takes out an entire ply of the search tree (you are searching only the null move instead of all your legal
ones) and if your branching factor is B, searching the null move is equivalent to looking at a single depth N-1
subtree instead of B of them. With B=35 as in the typical chess middlegame, null-move search may only
consume 3% of the resources required by a full depth-N examination. If the null move search reveals that you
are still too strong even without playing (i.e., it creates a cutoff), you have saved 97% of your effort; if not,
you must examine your own legal moves as usual, and have only wasted an extra 3%. On average, the gain is
enormous.
● Now, suppose that, during quiescence search, you reach a position where your only legal capture is
rook-takes-pawn, which is immediately followed by the opponent's knight-takes-rook. You'd be a lot better
off not making the capture, and playing any other non-capture move, right? You can simulate this situation by
inserting the null move into the quiescence search: if, in a given position during quiescence search, it is
revealed that the null move is better than any capture, you can assume that continuing with captures from this
position is a bad idea, and that since the best move is a quiet one, this is a position where the evaluation
function itself should be applied!
Overall, the null-move heuristic can save between 20% and 75% of the effort required by a given search. Well worth
the effort, especially when you consider that adding it to a program is a simple matter of changing the "side to play"
flag and adding less than a dozen lines of code in the quiescence search algorithm!
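
Here is a sketch of how the null move might slot into an alphabeta-style search, again using a hypothetical Board API; makeNullMove() does nothing more than flip the side-to-play flag, as described above:

    // Alphabeta with a null-move test bolted on (hypothetical Board API).
    int search(Board board, int depth, int alpha, int beta) {
        if (depth == 0) return quiesce(board, alpha, beta);

        // Try "doing nothing" first: hand the move to the opponent and
        // search the single resulting subtree. If we are still above beta
        // even after skipping a turn, any real move can only do better.
        if (board.nullMoveAllowed()) {        // e.g., skip when in check
            board.makeNullMove();             // just flips the side-to-play flag
            int score = -search(board, depth - 1, -beta, -beta + 1);
            board.unmakeNullMove();
            if (score >= beta) return beta;   // cheap cutoff, huge average saving
        }

        for (Move m : board.generateMoves()) {
            board.makeMove(m);
            int score = -search(board, depth - 1, -beta, -alpha);
            board.unmakeMove(m);
            if (score >= beta) return beta;
            if (score > alpha) alpha = score;
        }
        return alpha;
    }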

Aspirated Search and MTD(f)


Plain old alphabeta assumes nothing about a position's ultimate minimax value. It looks at *everything*, no matter
how preposterous. However, if you have a pretty good idea of what the value will turn out to be (for example,
because you are running an iterative-deepening scheme and have received the previous iteration's results), you might
be able to identify lines of play that are so out of whack with your expectations that they can't possibly be right, and
cut them off pre-emptively.
For example, suppose that you have reason to believe that a position's value will be close to 0, because it is very well
balanced. Now, suppose that an internal node's preliminary evaluation is at +20,000. You can cut off with
reasonable confidence.
This is the idea behind "aspiration search", a variant of alphabeta in which, instead of using +INFINITY and
-INFINITY as the initial bounds of the search, you set a small window around the expected value instead. If the
actual value happens to fall within the window, you win: you'll get it without error, and faster than you would
otherwise (because of the many extra cutoffs). If not, the algorithm will fail, but the error will be easy to detect
(because the minimax value you'll receive will be equal to one of the bounds); you'll have to waste a bit of time
re-searching with a wider window. If the former case happens more often than the latter, you win on average.
Obviously, the better your initial guess of the expected value, the more useful this technique is.
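
In code, aspiration search is a thin wrapper around plain alphabeta. A minimal Java sketch, assuming an existing alphabeta() routine and an INFINITY constant (both hypothetical names here); a failed window simply triggers a re-search:

    // Search with a small window around an expected value; re-search
    // wide open if the true value falls outside the window.
    int aspirationSearch(Board board, int depth, int guess, int window) {
        int alpha = guess - window;
        int beta  = guess + window;
        int value = alphabeta(board, depth, alpha, beta);
        // A value pinned to either bound means the window failed; a smarter
        // driver would widen only toward the side that failed.
        if (value <= alpha || value >= beta)
            value = alphabeta(board, depth, -INFINITY, INFINITY);
        return value;
    }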
In the mid 1990's, researcher Aske Plaat extended aspiration search to its logical conclusion: what if you called an
aspirated alphabeta with a search window of width equal to zero? It would fail all the time, of course... But it would
do so *very quickly*, because it would cut off every path almost immediately. Now, if the failure indicates that the
actual value is lower than your estimate, you can try again, with another zero-width window around a smaller
estimate, etc. In a sense, you could then use alphabeta to perform a binary search into the space of all possible
minimax values, until you reach the only call which will *not* fail because the zero-width window will be centered
on the position's actual value!
This brilliant idea, presented in a paper available on the web at http://theory.lcs.mit.edu/~plaat/mtdf.html, has been
embodied in the MTD(f) search algorithm, which is all of 10 lines long. Tacked on top of an alphabeta
implementation equipped with a transposition table, MTD(f) is incredibly efficient and highly parallel-friendly. It
also works better with "coarse-grain" (and therefore probably simpler and faster) evaluators: it is easy to see that it
takes fewer probes to zero in on the actual value in a binary search if the smallest "atom" of value is equal to, say,
0.1 pawns rather than 0.001 pawns.
There are other alphabeta variants in wider use (namely, the infamous NegaScout; I would rather teach General
Relativity to orangutans than get into that mess) but Plaat insists that MTD(f) is the most efficient algorithm in
existence today and I'll take his word for it. My own program uses MTD(f); you'll be able to marvel at the
algorithm's simplicity very shortly!
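
For reference, here is the algorithm in Java, following the pseudocode in Plaat's paper; alphabeta() is assumed to be a fail-soft implementation backed by a transposition table, without which the repeated calls would be ruinously slow:

    // MTD(f): repeated zero-width alphabeta calls that binary-search
    // the space of minimax values, converging on the true score.
    int mtdf(Board board, int firstGuess, int depth) {
        int g = firstGuess;
        int lowerBound = -INFINITY;
        int upperBound = +INFINITY;
        while (lowerBound < upperBound) {
            int beta = (g == lowerBound) ? g + 1 : g;  // zero-width window
            g = alphabeta(board, depth, beta - 1, beta);
            if (g < beta) upperBound = g;  // failed low: value is at most g
            else          lowerBound = g;  // failed high: value is at least g
        }
        return g;
    }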

Singular Extensions
One last thing before we leave the topic of search: in chess, some moves are obviously better than others, and it may
not be necessary to waste too much time searching for alternatives.
For example, suppose that after running your iterative algorithm to depth N-1, you discover that one of your moves
is worth +9000 (i.e., a capture of the opponent's queen) and all others are below 0. If saving time is a consideration,
like in tournaments, you may want to bypass the whole depth N search and only look at the best move to depth N
instead: if this extra ply does not lower its evaluation much, then you assume that the other moves won't be able to
catch up, and you stop searching early. (Remember: if there are 35 valid moves at each ply on average, you may
have just saved 97% of your total effort!)
Deep Blue's team has pushed this idea one step further and implemented the concept of "singular extensions". If, at
some point in the search, a move seems to be a lot better than all of the alternatives, it will be searched an extra ply
just to make sure that there are no hidden traps there. (This is a vast oversimplification of the whole process, of
course, but that's the basic idea.) Singular extensions are costly: adding an extra ply to a node roughly doubles the
number of leaves in the tree, causing a commensurate increase in the number of calls to the evaluator; in other words,
Deep Blue's specialized hardware can afford it, my cranky Java code can't. But it's hard to argue with the results,
isn't it?
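
The root-level time-saving shortcut might look like the following Java sketch (all helper names hypothetical; the margin is arbitrary). Note that this is only the tournament trick described above, not Deep Blue's hardware implementation of singular extensions:

    // After the depth N-1 iteration, if bestMove towers over the others,
    // verify it alone at depth N instead of re-searching every root move.
    boolean canStopEarly(Board board, Move bestMove, int secondBestScore,
                         int depth, int margin) {
        board.makeMove(bestMove);
        int deeperScore = -alphabeta(board, depth - 1, -INFINITY, INFINITY);
        board.unmakeMove(bestMove);
        // If one extra ply doesn't drop the move near its closest rival,
        // assume the alternatives can't catch up and play it now.
        return deeperScore >= secondBestScore + margin;
    }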

Next Month
In Part VI, we wrap up the series with a discussion of evaluation functions, the code which actually tells your
program whether a given board position is good or bad. This is an immense topic, and people can (and do) spend
years refining their own evaluators, so we will have to content ourselves with a rather high-level discussion of the
types of features which should be examined and their relative importance. If everything goes according to plan, I
should also have some Java code for you to sink your teeth into at about that time, so stick around, won't you?
François Dominic Laramée, September 2000


Chess Programming Part VI: Evaluation Functions


by François Dominic Laramée
It's been six months, and I know sometimes it must have felt like I would never shut up, but here we are: the sixth
and last installment of my chess programming series. Better yet: my Java chess program, primitive though it may be,
is now complete, and I shipped it to Gamedev along with this, which proves that I know (a little bit of) what I've
been talking about.
This month's topic, the evaluation function, is unique in a very real sense: while search techniques are pretty much
universal and move generation can be deduced from a game's rules and no more, evaluation requires a deep and
thorough analysis of strategy. As you can well imagine, it is impossible to assess a position's relative strengths if we
don't know what features are likely to lead to victory for one player or the other. Therefore, many of the concepts
discussed here may apply to other games in very different fashion, or not at all; it is your job as programmer to know
your game and decide what the evaluator should take into consideration.
Without further delay, let us take a look at some board evaluation metrics and at how they can be used to evaluate a
chess position.

Material Balance
To put it simply, material balance is an account of which pieces are on the board for each side. According to chess
literature, a queen may be worth 900 points, a rook 500, a bishop 325, a knight 300 and a pawn 100; the king has
infinite value. Computing material balance is therefore a simple matter: a side's material value is equal to
MB = Sum( Np * Vp )
where Np is the number of pieces of a certain type on the board and Vp is that piece's value. If you have more
material on the board than your opponent, you are in good shape.
Sounds simple, doesn't it? Yet, it is by far the overwhelming factor in any chess board evaluation function. CHESS
4.5's creators estimate that an enormous advantage in position, mobility and safety is worth less than 1.5 pawns. In
fact, it is quite possible to play decent chess without considering anything else!
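
In code, the formula amounts to a few lines. Here is a Java sketch using the values quoted above, with a hypothetical count() method returning how many pieces of a given type a side has on the board:

    // MB = Sum(Np * Vp) with the classical piece values (the king is not
    // counted: its loss ends the game rather than the accounting).
    static final int[] VALUE = { 100, 300, 325, 500, 900 }; // P, N, B, R, Q

    int materialBalance(Board board, int side) {
        int total = 0;
        for (int piece = 0; piece < VALUE.length; piece++)
            total += board.count(side, piece) * VALUE[piece];
        return total;  // subtract the opponent's total for the balance
    }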
Sure, there are positions where you should sacrifice a piece (sometimes even your queen) in exchange for an
advantage in momentum. These, however, are best discovered through search: if a queen sacrifice leads to mate in 3
moves, your search function will find the mate (provided it looks deep enough) without requiring special code. Think
of the nightmares if you were forced to write special-case code in your evaluator to determine when a sacrifice is
worth the trouble!
Few programs use a material evaluation function as primitive as the one I indicated earlier. Since the computation is
so easy, it is tempting to add a few more features into it, and most people do it all the time. For example, it is a
well-known fact that once you are ahead on material, exchanging pieces of equal value is advantageous. Exchanging
a single pawn is often a good idea, because it opens up the board for your rooks, but you would still like to keep
most of your pawns on the board until the endgame to provide defense and an opportunity for queening. Finally, you
don't want your program to panic if it plays a gambit (i.e., sacrifices a pawn) from its opening book, and therefore
you may want to build a "contempt factor" into the material balance evaluation; this allows your program to think it's
ahead even though it is behind by 150 points of material or more, for example.
Note that, while material balance is highly valuable in chess and in checkers, it is deceiving in Othello. Sure, you
must control more squares than the opponent at the end of the game to win, but it is often better to limit his options
by having as few pieces on the board as possible during the middlegame. And in other games, like Go-Moku and all
other Connect-N variations, material balance is irrelevant because no pieces are ever captured.

Mobility and Board Control


One of the characteristics of checkmate is that the victim has no legal moves left. Intuitively, it also seems better to
have a lot of options available: a player is more likely to be able to find a good line of play if he has 30 legal moves
to choose from than if he is limited to 3.
In chess, mobility is easily assessed: count the number of legal moves for each side given the position, and you are
done. However, this statistic has proven to be of little value. Why? For one thing, many chess moves are pointless. It
may be legal to make your rook patrol the back rank one square at a time, but it is rarely productive. For another,
trying to limit the opponent's mobility at all costs may lead the program to destroy its own defensive position in
search of "pointless checks": since there are rarely more than 3-4 legal ways to evade check in any given position, a
mobility-oriented program would be likely to make incautious moves to put the opponent in check, and after a while,
it may discover that it has accomplished nothing and has dispersed its forces all over the board.
Still, a few sophisticated mobility evaluation features are always a good thing. My program penalizes "bad bishops",
i.e., bishops whose movement is hampered by a large number of pawns on squares of the same color, as well as
knights sitting too close to the edges of the board. As another example, rooks are more valuable on open and
semi-open files, i.e., files where there are no pawns (or at least no friendly ones).
A close relative of mobility is board control. In chess, a side controls a square if it has more pieces attacking it than
the opponent. It is usually safe to move to a controlled square, and hazardous to move to one controlled by the
opponent. (There are exceptions: moving your queen to a square attacked by an enemy pawn is rarely a good idea, no
matter how many ways you can capture that pawn once it has done its damage. You may also voluntarily put a piece
in harm's way to lead a defender away from an even more valuable spot.) In chess, control of the center is a
fundamental goal of the opening. However, control is somewhat difficult to compute: it requires maintaining a
database of all squares attacked by all pieces on the board at any given time. Many programs do it; mine doesn't.
While mobility is less important than it seems to the chess programmer, it is extremely important in Othello (where
the side with the fewest legal moves in the endgame is usually in deep trouble). As for board control, it is the victory
condition of Go.

Development
An age-old maxim of chess playing is that minor pieces (bishops and knights) should be brought into the battle as
quickly as possible, that the King should castle early and that rooks and queens should stay quiet until it is time for a
decisive attack. There are many reasons for this: knights and bishops (and pawns) can take control of the center,
support the queen's attacks, and moving them out of the way frees the back rank for the more potent rooks. Later on
in the game, a rook running amok on the seventh rank (i.e., the base of operations for the opponent's pawns) can
cause a tremendous amount of damage.
My program uses several factors to measure development. First, it penalizes any position in which the King's and
Queen's pawns have not moved at all. It also penalizes knights and bishops located on the back rank where they
hinder rook movement, tries to prevent the queen from moving until all other pieces have, and gives a big bonus to
positions where the king has safely castled (and smaller bonuses to cases where it hasn't castled yet but hasn't lost the
right to do so) when the opponent has a queen on the board. As you can see, the development factor is important in
the opening but quickly loses much of its relevance; after 10 moves or so, just about everything that can be measured
here has already happened.
Note that favoring development can be dangerous in a game like Checkers. In fact, the first player to vacate one of
the squares on his back rank is usually in trouble; avoiding development of these important defensive pieces is a very
good idea.

Pawn Formations
Chess grandmasters often say that pawns are the soul of the game. While this is far from obvious to the neophyte, the
fact that great players often resign over the loss of a single pawn clearly indicates that they mean it!
Chess literature mentions several types of pawn features, some valuable, some negative. My program looks at the
following:
● Doubled or tripled pawns. Two or more pawns of the same color on the same file are usually bad, because
they hinder each other's movement.
● Pawn rams. Two opposing pawns "butting heads" and blocking each other's forward movement constitute an
annoying obstacle.
● Passed pawns. Pawns which have advanced so far that they can no longer be attacked or rammed by enemy
pawns are very strong, because they threaten to reach the back rank and achieve promotion.
● Isolated pawns. A pawn which has no friendly pawns on either side is vulnerable to attack and should seek
protection.
● Eight pawns. Having too many pawns on the board restricts mobility; opening at least one file for rook
movement is a good idea.
A final note on pawn formations: a passed pawn is extremely dangerous if it has a rook standing behind it, because
any piece that would capture the pawn is dead meat. My program therefore scores a passed pawn as even more
valuable if there is a rook on the same file and behind the pawn.
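
As an illustration of how two of these features might be scored, here is a small Java sketch working from per-file pawn counts; the penalty values are invented for the example, not taken from my program:

    // Score doubled/tripled and isolated pawns from one side's pawn
    // counts per file (index 0-7). Penalties are illustrative only.
    int pawnStructureScore(int[] pawnsOnFile) {
        final int DOUBLED_PENALTY = 15;
        final int ISOLATED_PENALTY = 10;
        int score = 0;
        for (int file = 0; file < 8; file++) {
            int n = pawnsOnFile[file];
            if (n == 0) continue;
            if (n > 1) score -= DOUBLED_PENALTY * (n - 1);  // doubled or tripled
            boolean noLeft  = (file == 0) || pawnsOnFile[file - 1] == 0;
            boolean noRight = (file == 7) || pawnsOnFile[file + 1] == 0;
            if (noLeft && noRight) score -= ISOLATED_PENALTY * n;  // isolated
        }
        return score;
    }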

King Safety and Tropism


We have already touched on king safety earlier: in the opening and middle game, protecting the king is paramount,
and castling is the best way to do it.
However, in the endgame, most of the pieces on both sides are gone, and the king suddenly becomes one of your
most effective offensive assets! Leaving him behind a wall of pawns is a waste of resources.
As for "tropism", it is a measure of how easy it is for a piece to attack the opposing king, and is usually measured in
terms of distance. The exact rules used to compute tropism vary by piece type, but they all amount to this: the closer
you are to the opposing king, the more pressure you put on it.


Picking the Right Weights


Now that we have identified the features we would like to measure, how do we assign relative weights to each?
That, my friends, is a million-dollar question. People can (and do) spend years fiddling with the linear combination
of features in their evaluation functions, sometimes giving a little more importance to mobility, sometimes
de-emphasizing safety, etc. I wish there were an absolute solution, but there isn't. This is still very much a matter for
trial and error. If your program plays well enough, great. If not, try something else, and play it against the old
version; if it wins most of the time, your new function is an improvement.
Three things you may want to keep in mind:
● It is very difficult to refine an evaluation function enough to gain as much performance as you would from an
extra ply of search. When in doubt, keep the evaluator simple and leave as much processing power as possible
to alphabeta.
● Unless you are after the World Championship, your evaluator does not need to be all-powerful!
● If your game plays really fast, you may want to try evolving an appropriate evaluator using a genetic
algorithm. For chess, though, this would likely require thousands of years!

Next Month
Well, there ain't no next month. This is it.
If I wanted to drag this series even longer, I could write about opening books, endgame libraries, specialized chess
hardware and a zillion other things. I could, I could. But I won't. Some of these topics I reserve for the book chapter I
will be writing on this very topic later this Fall. Others I just don't know enough about to contribute anything useful.
And mostly, I'm just too lazy to bother.
Still, I hope you enjoyed reading this stuff, and that you learned a useful thing or two or three. If you did, look me up
next year at the GDC or at E3, and praise me to whomever grants freelance design and programming contracts in
your company, will ya?
Cheers!
François Dominic Laramée, October 2000


Designing Need-Based AI for Virtual Gorillas
by Ernest Adams
December 22, 2000

Now that I'm freelance, I get quite a variety of projects to work on. One of the most interesting involves updating the Virtual Gorilla exhibit at Zoo Atlanta, in Georgia. Longtime readers of The Designer's Notebook might remember that I wrote about this project in "The VR Gorilla/Rhino Test" a couple of years ago. When you use the exhibit, you put on a VR headset and experience the zoo's gorilla enclosure as if you were a gorilla yourself. Now the zoo's management has asked me to help them port the exhibit over to new hardware and incorporate new AI for the virtual gorillas.

One of the virtual gorillas visitors may encounter at Zoo Atlanta's exhibit.

At the moment, the gorillas don't do much; walking around and exhibiting dominance behavior is
about the extent of it. The dominance behavior occurs when a low-status gorilla wanders into a
higher-status gorilla's personal space, and it consists of an escalating series of aggressive displays
until the low-status gorilla is scared off. It's accurate as far as it goes, but we'd like to extend the
behavior model to involve a variety of actions: eating, drinking, sleeping, playing. This will involve
creating a model of "gorilla needs" which are fulfilled by these activities in a reasonably realistic
manner. That got me thinking about need mechanisms, and they're the subject of this month's
column.
A need mechanism consists of several elements. The main one is a variable which describes how much
of a given needed object or substance the organism has at the moment. Various events in the world,
or activities on the part of the organism, can cause this variable to go up and down. A very simple
example is the remaining ammunition in a first-person shooter. Firing your weapon consumes
ammunition and lowers the amount remaining; picking up clips off the floor raises it again.


Ammunition level in a first-person shooter is one simple need mechanism.

In designing an autonomous organism, such as a virtual gorilla or a 'bot in a first-person shooter, we want it to act on these needs: to do something to satisfy a need when the variable gets low enough. This requires establishing a threshold point at which the organism will start to exhibit the behavior - getting something to eat (for a hungry gorilla) or going to look for ammunition (for a space marine). The behavior is invoked whenever the need drops below the trigger threshold.

However, this creates a problem. Suppose you establish a threshold for remaining ammunition of
10%. Below that threshold our space marine will go look for some ammo; above it he'll continue
exploring. What will happen is that as soon as his ammunition drops below it, he'll go pick up a clip,
but once he's got that clip, he'll go back to exploring again, even if there is a second clip right there.
After he's fired a few bullets the count will drop and he'll start looking around for another clip. He'll be
in a very tight behavioral loop, always using clips and picking up new ones one at a time. Similarly, a
hungry gorilla would eat one bite, wander off for a few minutes, then come back and eat one more
bite, and so on. That's not the way gorillas, or space marines, behave.

What's needed is a second threshold—higher than the first—that tells when to stop the fulfillment
behavior. We want the gorilla to continue to eat until she is sated, not just until she is no longer
hungry. Similarly we want the space marine to go on picking up ammunition until he's sure he's got
enough to last him a while. This mechanism with the two thresholds, one to trigger the behavior and
one to inhibit it, occurs quite often in the natural world and in other kinds of devices as well. It's called
hysteresis, and it's the reason that the furnace in your house doesn't start and stop every 30 seconds.
When you set the thermostat at 68 degrees, the furnace comes on at 68, but it actually goes off at 72.
The thermostat doesn't display the inhibitory threshold, but it's built into the machinery.
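
A minimal sketch of such a two-threshold mechanism, written here in Java with invented names and values:

    // Hysteresis: start the fulfillment behavior below the trigger level,
    // and keep doing it until the (higher) inhibit level is reached.
    class Need {
        final double trigger;   // e.g., the "hungry" level
        final double inhibit;   // e.g., the "sated" level; inhibit > trigger
        double level;           // current amount of the needed resource
        boolean fulfilling = false;

        Need(double trigger, double inhibit, double level) {
            this.trigger = trigger;
            this.inhibit = inhibit;
            this.level = level;
        }

        boolean shouldFulfill() {
            if (!fulfilling && level < trigger) fulfilling = true;       // start eating
            else if (fulfilling && level >= inhibit) fulfilling = false; // stop: sated
            return fulfilling;
        }
    }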
But now we have another problem. Suppose our space marine has picked up all the available clips, or
our gorilla has eaten all the available food. She's above her hunger threshold, but not yet up to her
satiety threshold. With the current mechanism, she'll sit there forever, waiting for more food to come
along. Our space marine will continue to search for clips endlessly, even though he's got them all. We
need to place an artificial limit on the fulfillment behavior if it's no longer successful.
Exactly how this is done depends on how the needed item is distributed, and how smart you want the
organism to be. With human beings who know for a fact that there's only so much of the item around,
they should stop immediately when it runs out - if you eat all the food in the fridge, you don't hang
around the refrigerator hoping it'll magically get some more in it somehow. In the case of the gorilla,
though, I would expect her to wait hopefully by the food distribution point for a little while, maybe
searching around a bit before giving up. In the case of the space marine, although he may have found
all the clips he can see, he also knows that clips tend to be hidden in a variety of places. He shouldn't
necessarily stop looking for them as soon as he's picked up all the ones in a room; he should continue
to hunt around a little longer, and give up only after he hasn't found any for a while. In each case, we
want to set a timer (I'll call it the "persistence timer") every time the fulfillment occurs - the gorilla
eats something, or the space marine picks up a clip. It starts to run down while they look for more. If
they haven't obtained any by the time the timer runs out, then the behavior is apparently unsuccessful
and we interrupt it and return to other things.
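
In code, the persistence timer is a small addition. Here is a Java sketch with invented names; per the next paragraph, the persistence value could also be scaled by the urgency of the need:

    // Abandon a fulfillment behavior once it stops paying off: every
    // success resets the timer; when the timer runs out, give up.
    class PersistenceTimer {
        final double persistence;  // how long to keep trying after a success
        double remaining;

        PersistenceTimer(double persistence) {
            this.persistence = persistence;
            this.remaining = persistence;
        }

        void onSuccess() { remaining = persistence; }  // ate a bite, found a clip

        // Called once per update; false means the search is unsuccessful
        // and the organism should go do something else.
        boolean keepTrying(double deltaSeconds) {
            remaining -= deltaSeconds;
            return remaining > 0;
        }
    }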


Different types of NPCs require different fulfillment behaviors.

However, even this isn't completely straightforward; it depends on the urgency of the need. A gorilla
who's eaten enough to not be hungry any more, but is not yet sated, should probably give up and
wander off fairly soon. But a gorilla who's still hungry might search for longer, and a starving gorilla
might search indefinitely. Depending on the importance of the item, you should set the length of the
persistence timer proportionately to the urgency of the need. A space marine who's completely out of
ammo should make finding more his first priority, as long as he's not actually under attack.
You can also set additional trigger thresholds to initiate a variety of different behaviors at different
levels of urgency. A hungry gorilla will go and eat; a very hungry gorilla might try to steal food away
from another one, and a starving gorilla might openly attack another one to get its food. The utmost
extreme, at least in humans, is cannibalism.

This brings me to another point: not everybody behaves the same way. Some people would rather die
than become cannibals; others have no such compunction. I have long felt that we need to make more
of an effort to create unique individuals in computer games. The positions of these trigger thresholds
and the length of the persistence timer should be partially randomized for each individual. Some
marines are risk-takers, and won't start to look for ammo until they're down to their last bullet. Others
are cautious and start looking early. These differences add richness to your game at a trivial cost in
code and CPU time, and I definitely intend to add them to the virtual gorilla exhibit. There's a more
detailed discussion of this in another early column of mine, "Not Just Another Scary Face."

It's also possible to have too much of a good thing. Just as there are thresholds and behaviors for
acquiring more of a needed item, we can also implement thresholds and behaviors for getting rid of
excess, at the top end of the scale. In role-playing games, for example, there's often a penalty for
carrying too much weight. When this occurs, we want our NPC to dump low-value items out of his
backpack until the weight is back down to a manageable level. Choosing which items to dump is of
course very tricky, but it's the right response in principle.
The amount of ammunition you have in a shooter is normally only modified by two actions: firing
reduces it; picking up clips raises it again. Otherwise it doesn't change. In the VR gorilla simulator,
we're going to want the gorillas' needs to change automatically over time - gorillas gradually get more
and more hungry regardless of what else happens. Similarly, different activities should affect the
needs at different rates. Gorillas that are very active should get hungry faster than gorillas that are
sedentary. This whole system is of course how The Sims works, quite explicitly and openly to the
player. In the case of the virtual gorillas we're not going to try to train them, so it can all be hidden.


The Sims have a queue of things to do, ordered by the urgency of the need.

The Sims also nicely illustrates another issue: needs interactions. In The Sims, the simulated people
have several needs: food, sleep, entertainment, hygiene, the toilet and so on, but they can only do
one thing at a time. The needs are all competing for the sim's attention, and they have to get them all
fulfilled or they become unhappy - or worse. To manage this they have a queue of things to do, and
it's ordered by the urgency of the need. Using the toilet is at the top, entertainment is at the bottom.
Because everything in the game takes a long time, often the Sims never get around to having any fun,
and get stressed out as a result.
With multiple needs, you don't necessarily have to wait until the inhibition threshold is reached to stop
the current behavior, or until the persistence timer runs out. If all the behaviors can interrupt one
another and they can all last an indefinite length of time, you can recompute the urgency of each need
every few seconds, and choose a new behavior whenever a new need is more urgent than the one
you're currently fulfilling. However, that could again lead to odd behaviors - a person might eat three
bites, sleep for five minutes, go to the bathroom, sleep for another five minutes, go eat another three
bites, and so on. In The Sims each behavior consists of a series of animations that take a certain
minimum amount of time, and they don't generally interrupt the current behavior until it's finished.
And of course it's all further complicated by the fact that you can give them instructions which override
their own instincts.
The virtual gorilla exhibit won't be as complex as The Sims, but we'll probably want a queue of
behaviors as well, sorted by urgency, and a certain minimum amount of time that any behavior can be
performed. The urgency is a combination of two factors: how far below the trigger threshold the
variable is, and a weighting for the type of need itself. For example, playing is intrinsically less
important than eating, but not in every possible circumstance. I'm sure as a child you've had the
experience of playing for a long time and having so much fun that you didn't notice you were hungry
until there was some interruption. For juvenile gorillas we'll give extra weight to the need to play,
which will cause it to be higher in the queue than eating until the gorilla gets really hungry.
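
A minimal Java sketch of that urgency computation, with invented names; the weight is the per-need, per-individual factor described above:

    import java.util.List;

    // Urgency combines how far a need is below its trigger threshold
    // with an intrinsic (and individually randomized) weight.
    class WeightedNeed {
        final String name;
        final double trigger;
        final double weight;   // e.g., raised for "play" in juvenile gorillas
        double level;

        WeightedNeed(String name, double trigger, double weight, double level) {
            this.name = name;
            this.trigger = trigger;
            this.weight = weight;
            this.level = level;
        }

        double urgency() {
            return Math.max(0, trigger - level) * weight;
        }

        // Pick the behavior to run next; callers should still honor a
        // minimum behavior duration so the animal doesn't thrash.
        static WeightedNeed mostUrgent(List<WeightedNeed> needs) {
            WeightedNeed best = null;
            for (WeightedNeed n : needs)
                if (best == null || n.urgency() > best.urgency())
                    best = n;
            return best;
        }
    }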
One of the classic design questions for computer games is whether urgent needs should interfere with
performance. Clearly, if you're out of ammo, you can't fire your weapons, so you can't shoot anyone
and take his ammunition. If that were the only way to get any, it would create a deadlock. Most
shooter games break the deadlock by providing caches of ammunition that you don't have to shoot
anybody to get, or letting you use non-firearm weapons to dispatch enemies without needing any
ammunition.
With health, however, the situation is different. Obviously somebody who's near death from bullet
wounds shouldn't really be able to run around and fight at top speed, but most shooter games simply
ignore this. If you create a performance penalty for damage taken in an action game, negative
feedback sets in much too quickly and you don't have a chance. In a war game, on the other hand,
things move more slowly and you can often compensate for damaged forces with sound strategy or
efficient production. Besides, damage is supposed to convey an advantage to the other side; that's the
point of causing it.


The effect of health on performance creates a problematic feedback loop.

I don't anticipate having any such problems with the virtual gorillas. For one thing, they don't kill each
other - their dominance behaviors, while dramatic, are not life-threatening. The only resource the
gorillas can't generate for themselves is food, and we'll make sure it's provided in abundance so that
we don't get any deadlocks. I'd like to create a feedback loop in which a well-rested gorilla becomes a
playful one; a long-playing gorilla becomes a hungry one; and a well-fed gorilla becomes a sleepy one,
so that we'll see a regular round of activities. We'll have to keep a sharp eye out to make sure that it's
a stable loop, though. Too much feedback and they start playing, eating and sleeping faster and
faster; too little and they slow down and do nothing.
It's going to be a fun project.



Game AI: The State of the Industry
by Steve Woodcock
August 20, 1999
This article originally appeared in the August 1999 issue of Game Developer magazine.

It's been nearly a year since my first article outlining the then-current trends in the game development industry regarding game AI ("Game AI: The State of the Industry," October 1998). Since that time, another Christmas season's worth of releases has come and gone and another Game Developers Conference (GDC) has provided yet another opportunity for AI developers to exchange ideas. While polls taken at the 1999 GDC indicate that most developers (myself included) felt that the last year had seen incremental, rather than revolutionary, advances in the field of game AI, it seemed that enough interesting new developments had taken place to make an update to my previous article seem natural.

I'm very pleased to say that good game AI is growing in importance within the industry, with both developers and marketeers seeing the value in building better and more capable computer opponents. The fears that multiplayer options on games would make good computer AIs obsolete appear to have blown over in the face of one very practical consideration — sometimes, you just don't have time to play with anybody else. The incredible pace of development in 3D graphics cards and game engines has made awesome graphics an expected feature, not an added one. Developers have found that one discriminator in a crowded marketplace is good computer AI.

As with last year's article, many of the insights presented herein flow directly from the AI roundtable discussions at the 1999 GDC. This interaction with my fellow developers has proven invaluable in the past, and the 1999 AI roundtables proved to be every bit as useful in gaining insight into what other developers are doing, the problems they're facing, and where they're going. I'll touch on some of the topics and concerns broached by developers at the 1999 roundtables. I'll also discuss what AI techniques and developments seem to be gaining favor among developers, the academic world's take on the game AI field, and where some developers think game AI will be headed in the coming year or two.
Is The Resource Battle Over?
Last year there were signs that development teams were beginning to take game AI much more seriously than they had in the past. Developers were getting involved in the design of the AI earlier in the design cycle, and many projects were beginning to dedicate one or more programmers exclusively to AI development. Polls from the AI roundtables showed a substantial increase in the number of developers devoted exclusively to AI programming (see Figure 1).

Figure 1: Resources dedicated to AI development.

It was very apparent at the 1999 GDC that this trend has continued at a healthy clip, with 60 percent
of the attendees at my roundtables reporting that their projects included one or more dedicated AI
programmers. This number is up from approximately 24 percent in 1997 and 46 percent in 1998 and
shows a growing desire on the part of development houses to make AI a more important part of their
game design. If the trend continues, we'll see dedicated AI developers become as routine as dedicated
3D engine or sound developers.
AI specialists continue to be a viable alternative for many companies that lack internal resources to
dedicate developers exclusively to AI development. Several developers and producers present at the
roundtables indicated that they had used independent contractors to roll the AI portions of their
process with varying degrees of success. The primary complaints about using contract help were
perhaps the universal ones — you never really know what you're getting, and maintaining good
communication is, at best, a chore.
The most interesting comments, however, concerned CPU resources available to the AI developers
(Figure 1). None of the developers answering the poll questions regarding CPU resources felt that they
had too little CPU available. Everybody felt they could use more if they had it, but nobody said that
they were having to fight tooth and nail for resources as they had in the past. This is an amazing turn
of events, which is in stark contrast to previous years when AI developers complained often and
bitterly of fighting the graphics engine guys for CPU cycles. The overall percentages of CPU cycles
most developers felt they were getting didn't really change, but developers were feeling much less
pinched than they had been in the past. When asked why this was the case, there were a variety of
theories. Most developers felt that this was, quite simply, due to the fact that faster hardware is now
standard on both PCs and consoles — 5 percent of a 400Mhz Pentium III is a heck of a lot more
horsepower than 5 percent of a 200Mhz Pentium I. Others thought that the availability of faster 3D
hardware, combined with greater expertise of the 3D engine manufacturers, had simply made 3D
engines more efficient than they had been and thus freed up more CPU resources for other tasks.
Whatever the reasons, everybody was happy about it, and they thought it would only get better as
hardware got faster.
The one great problem mentioned by all was the impending-shipping-date-syndrome. Christmas hasn't
moved from its place as an almost magical date for targeting new releases, and the increasing
complexity of games in general hasn't made meeting deadlines any easier. While there are more
programmers dedicated exclusively to the AI portion of game development now than there had been in
the past, most developers felt that the task itself had become more difficult.
Part of the reason for this, of course, is the increasing importance of game AI itself — having made the
case that good game AI is important in increasing the odds of a game's success, developers must now
actually deliver better game AI. Quite simply, that takes time. When coupled with the fact that most AI
testing can't really begin until substantial portions of the game's engine are up and running, you've got
a situation wherein dedicated AI developers find themselves making compromises in the face of
impending shipping dates.
Some developers also professed that part of the problem was the advances made in competing
products. For example, after one real-time strategy (RTS) game introduced production queues, players
started looking for all RTS games to do the same, and that means additional AI development for
handling such things. There is also a desire on the part of most developers to avoid doing the "same
old thing" in a new release.



Game AI: The State of the Industry
by Steven Woodcock
November 1, 2000
This article originally appeared in the August 2000 issue of Game Developer magazine.

In the first of this two-part report on the state of game AI, Steven Woodcock shares what issues came up while moderating the AI roundtables at the 2000 Game Developers Conference. Next week, in Part Two, John E. Laird will discuss how academics and developers can better share information with each other, and Ensemble Studios' Dave Pottinger will peer into the future of game AI.

One thing was made clear in the aftermath of this year's Game Developers Conference: game AI has finally "made it" in the minds of developers, producers, and management. It is recognized as an important part of the game design process. No longer is it relegated to the backwater of the schedule, something to be done by a part-time intern over the summer. For many people, crafting a game's AI has become every bit as important as the features the game's graphics engine will sport. In other words, game AI is now a "checklist" item, and the response to both our AI roundtables at this year's GDC and various polls on my game AI web site (www.gameai.com) bear witness to the fact that developers are aggressively seeking new and better ways to make their AI stand out from that of other games.

The technical level and quality of the GDC AI roundtable discussions continues to increase. More important, however, was that our "AI for Beginners" session was packed. There seem to be a lot of developers, producers, and artists who want to understand the basics of AI, whether it's so they can go forth and write the next great game AI or just so they can understand what their programmers are telling them.

As I've done in years past, I'll use this article to touch on some of the insights I gleaned from the roundtable discussions that Neil Kirby, Eric Dybsand, and I conducted. These forums are invaluable for discovering the problems developers face, what techniques they're using, and where they think the industry is going. I'll also discuss some of the poll results taken over the past year on my web site, some of which also provided interesting grist for the roundtable discussions.
Resources: The Big Non-issue

Last year's article (Game AI: The State of the Industry) mentioned that AI developers were (finally)
becoming more involved in the game design process and using their involvement to help craft better
AI opponents. I also noted that more projects were devoting more programmers to game AI, and AI
programmers were getting a bigger chunk of the overall CPU resources as well.
This year's roundtables revealed that, for the most part, the resource battle is over (Figure 1). Nearly
80 percent of the developers attending the roundtables reported at least one person working full-time
on AI on either a current or previous project; roughly one-third of those reported that two or more
developers were working full-time on AI. This rapid increase in programming resources has been
evident over the last few years in the overall increase in AI quality throughout the industry, and is
probably close to the maximum one could reasonably expect a team to devote to AI given the realities
of the industry and the marketplace.


Figure 1: AI poll results from the GDC 2000 roundtables

Even more interesting was the amount of CPU resources that developers say they're getting. On
average, developers say they now get a whopping 25 percent of the CPU's cycles, which is a 250
percent increase over the average amount of CPU resources developers said they were getting at the
1999 roundtables. When you factor in the increase in CPU power year after year, this trend becomes
even more remarkable.
Many developers also reported that general attitudes toward game AI have shifted. In prior years the
mantra was "as long as it doesn't affect the frame rate," but this year people reported that there is a
growing recognition by entire development teams that AI is as important as other aspects of the
game. Believe it or not, a few programmers actually reported the incredible luxury of being able to say
to their team, "New graphics features are fine, so long as they don't slow down the AI." If that isn't a
sign of how seriously game AI is now being taken, I don't know what is.
Developers didn't feel pressured by resources, either. Some developers (mostly those working on
turn-based games) continued to gleefully remind everyone that they devoted practically 100 percent
of the computer's resources for computer-opponent AI, but they also admitted that this generally
allowed deeper play, but not always better play. (It's interesting to note that all of the turn-based
developers at the roundtables were doing strategy games of some kind -- more than other genres,
that market has remained the most resistant to the lure of real-time play.) Nearly every developer was
making heavy use of threads for their AIs in one fashion or another, in part to better utilize the CPU
but also often just to help isolate AI processes from the rest of the game engine.
AI developers continued to credit 3D graphics chips for their increased use of CPU resources. Graphics
programmers simply don't need as much of the CPU as they once did.
Trends Since Last Year

A number of AI technologies noted at the 1998 and 1999 GDCs have continued to grow and accelerate over the last year. The number of games released in recent months that emphasize interesting AI --
and which actually deliver on their promise -- is a testament to the rising level of expertise in the
industry. Here's a look at some trends.
Artificial life. Perhaps the most obvious trend since the 1999 GDC was the wave of games using
artificial life (A-Life) techniques of one kind or another. From Maxis's The Sims to CogniToy's Mind
Rover, developers are finding that A-Life techniques provide them with flexible ways to create realistic,
lifelike behavior in their game characters.


Figure 2: A smart rover navigates a maze in CogniToy's Mind Rover

The power of A-Life techniques stems from their roots in the study of real-world living organisms. A-Life seeks to emulate that behavior through a variety of methods that can use hard-coded rules, genetic algorithms, flocking algorithms, and so on. Rather than try to code up a huge variety of extremely complex behaviors (similar to cooking a big meal), developers can break down the problem into smaller pieces (for example, open refrigerator, grab a dinner, put it in the microwave). These behaviors are then linked in some kind of decision-making hierarchy that the game characters use (in conjunction with motivating emotions, if any) to determine what actions they need to take to satisfy their needs. The interactions that occur between the low-level, explicitly coded behaviors and the motivations/needs of the characters cause higher-level, more "intelligent" behaviors to emerge without any explicit, complex programming.

The simplicity of this approach combined with the amazing resultant behaviors has proved irresistible
to a number of developers over the last year, and a number of games have made use of the
technique. The Sims is probably the best known of these. That game makes use of a technique that
Maxis co-founder and Sims designer Will Wright has dubbed "smart terrain." In the game, all
characters have various motivations and needs, and the terrain offers various ways to satisfy those
needs. Each piece of terrain broadcasts to nearby characters what it has to offer. For example, when a
hungry character walks near a refrigerator, the refrigerator's "I have food" broadcast allows the
character to decide to get some food from it. The food itself broadcasts that it needs cooking, and the
microwave broadcasts that it can cook food. Thus the character is guided from action to action
realistically, driven only by simple, object-level programming.

Figure 3: The Sims made ample use of A-Life technology
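
A rough Java sketch of the broadcast idea, with all names invented for illustration: objects advertise the need they satisfy, and a character simply heads for the best advertisement matching its most urgent need.

    import java.util.List;

    // "Smart terrain" sketch: the intelligence lives in the objects.
    interface SmartObject {
        String advertises();                    // e.g., "food" for the refrigerator
        double distanceTo(double x, double y);  // cost of going there
    }

    class SimCharacter {
        double x, y;
        String mostUrgentNeed;  // e.g., "food", chosen by the need system

        SmartObject chooseTarget(List<SmartObject> world) {
            SmartObject best = null;
            for (SmartObject obj : world) {
                if (!obj.advertises().equals(mostUrgentNeed)) continue;  // wrong offer
                if (best == null || obj.distanceTo(x, y) < best.distanceTo(x, y))
                    best = obj;  // nearest object making the right broadcast
            }
            return best;  // null: nothing nearby satisfies the need
        }
    }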

Developers were definitely taken with the possibilities of this approach, and there was much discussion
about it at the roundtables. The idea has obvious possibilities for other game genres as well. Imagine a
first-person shooter, for example, in which a given room that has seen lots of frags "broadcasts" this
fact to the NPCs assisting your player's character. The NPC could then get nervous and anxious, and
have a "bad feeling" about the room -- all of which would serve to heighten the playing experience and
make it more realistic and entertaining. Several developers took copious notes on this technique, so

http://www.gamasutra.com/features/20001101/woodcock_01.htm (3 of 5) [25/06/2002 2:32:27 PM]


Gamasutra - Features - "Game AI: The State of the Industry" [11.01.00]
we'll probably be seeing even more A-Life in games in the future.
Pathfinding. In a remarkable departure from the roundtables of previous years, developers really
didn't have much to ask or say about pathfinding at this year's GDC roundtables. The A* algorithm (for
more details, see Bryan Stout's excellent article Smart Moves: Intelligent Path-Finding) continues to
reign as the preferred pathfinding algorithm, although everybody has their own variations and
adaptations for their particular project. Every developer present who had needed pathfinding in their
game had used some form of the A* algorithm. Most had also used influence maps, attractor-repulsor
systems, and flocking to one degree or another. Generally speaking, the game community has this
problem well in hand and is now focusing on particular implementations for specific games (such as
pathfinding in 3D space, doing real-time path-granularity adjustments, efficiently recognizing when
paths were blocked, and so on).
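
For readers meeting it for the first time, the core of A* is small. The sketch below is a
deliberately bare-bones grid version (4-connected map, unit step costs, Manhattan heuristic, lazy
re-insertion instead of a proper open-list update); the per-project variations mentioned above get
layered on top of something like this. A full implementation would also record parent links to
recover the actual path rather than just its cost.

    #include <cstdlib>
    #include <queue>
    #include <vector>

    static const int W = 64, H = 64;   // map dimensions, for the example
    bool blocked[H][W];                // true where terrain is impassable

    struct Node { int x, y, f; };
    struct Cmp  {
        bool operator()(const Node& a, const Node& b) const { return a.f > b.f; }
    };

    // Manhattan distance: admissible on a 4-connected grid with unit costs.
    int Heuristic(int x, int y, int gx, int gy)
    {
        return std::abs(x - gx) + std::abs(y - gy);
    }

    // Returns the cost of the shortest path from (sx,sy) to (gx,gy), or -1.
    int AStar(int sx, int sy, int gx, int gy)
    {
        std::vector<int> g(W * H, -1);   // best known cost to each cell
        std::priority_queue<Node, std::vector<Node>, Cmp> open;
        g[sy * W + sx] = 0;
        Node start = { sx, sy, Heuristic(sx, sy, gx, gy) };
        open.push(start);
        const int dx[4] = { 1, -1, 0, 0 }, dy[4] = { 0, 0, 1, -1 };
        while (!open.empty()) {
            Node n = open.top(); open.pop();
            if (n.x == gx && n.y == gy) return g[n.y * W + n.x];
            for (int i = 0; i < 4; ++i) {
                int nx = n.x + dx[i], ny = n.y + dy[i];
                if (nx < 0 || ny < 0 || nx >= W || ny >= H || blocked[ny][nx])
                    continue;
                int cost = g[n.y * W + n.x] + 1;
                if (g[ny * W + nx] == -1 || cost < g[ny * W + nx]) {
                    g[ny * W + nx] = cost;
                    Node m = { nx, ny, cost + Heuristic(nx, ny, gx, gy) };
                    open.push(m);
                }
            }
        }
        return -1;  // no path exists
    }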

Figure 4: Ensemble Studios revamped the pathfinding in Age of Empires II: The Age of Kings by
including terrain analysis

As developers become more comfortable with their pathfinding tools, we are beginning to see complex
pathfinding coupled with terrain analysis. Terrain analysis is a much tougher problem than simple
pathfinding in that the AI must study the terrain and look for various natural features -- choke-points,
ambush locations, and the like. Good terrain analysis can provide a game's AI with multiple
"resolutions" of information about the game map that are well tuned for solving complex pathfinding
problems. Terrain analysis also helps make the AI's knowledge of the map more location-based, which
(as we've seen in the example of The Sims) can simplify many of the AI's tasks. Unfortunately, terrain
analysis is made somewhat harder when randomly generated maps are used, a feature which is
popular in today's games. Randomly generating terrain precludes developers from "pre-analyzing"
maps by hand and loading the results directly into the game's AI.
Several games released in the past year have made attempts at terrain analysis. For example,
Ensemble Studios completely revamped the pathfinding approach used in Age of Empires for its
successor, Age of Kings, which uses some fairly sophisticated terrain-analysis capabilities. Influence
maps were used to identify important locations such as gold mines and ideal locations for building
placement relative to them. They're also used to identify staging areas and routes for attacks: the AI
plots out all the influences of known enemy buildings so that it can find a route into an enemy's
domain that avoids any possible early warning.
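
The specifics of Ensemble's implementation aren't spelled out here, but the general influence-map
idea is simple enough to sketch: each side stamps a decaying influence around its known units and
buildings onto a coarse grid, and cells where enemy influence is near zero become candidates for
staging areas and quiet approach routes. The types, falloff, and threshold below are invented for
illustration.

    #include <algorithm>
    #include <cmath>
    #include <vector>

    // A coarse influence grid; one cell might cover several map tiles.
    struct InfluenceMap {
        int w, h;
        std::vector<float> cell;
        InfluenceMap(int w_, int h_) : w(w_), h(h_), cell(w_ * h_, 0.0f) {}

        // Stamp a unit's influence, falling off linearly with distance.
        void AddUnit(int ux, int uy, float strength, int radius) {
            for (int y = std::max(0, uy - radius);
                 y <= std::min(h - 1, uy + radius); ++y)
                for (int x = std::max(0, ux - radius);
                     x <= std::min(w - 1, ux + radius); ++x) {
                    float d = std::sqrt(float((x - ux) * (x - ux) +
                                              (y - uy) * (y - uy)));
                    if (d <= radius)
                        cell[y * w + x] += strength * (1.0f - d / radius);
                }
        }
    };

    // A cell is a candidate staging area if enemy influence there is
    // negligible; a route planner can then prefer chains of such cells.
    bool SafeToStage(const InfluenceMap& enemy, int x, int y)
    {
        return enemy.cell[y * enemy.w + x] < 0.05f;  // threshold is arbitrary
    }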

Another game that makes interesting use of terrain analysis is Red Storm's Force 21. The developers
used a visibility graph (see "Visibility Graphs" sidebar) to break down the game's terrain into distinct
but interconnected areas; the AI can then use these larger areas for higher-level pathfinding and
vehicle direction. By cleanly dividing maps into "areas I can go" and "areas I can't get to," the AI is
able to issue higher-level movement orders to its units and leave the implementation issues (such as
not running into things, deciding whether to go over the bridge or through the stream, and so on) to
the units themselves. This in turn has an additional benefit: the units can make use of the A*
algorithm to solve smaller, local problems, thus leaving more of the CPU for other AI activity.
Formations. Closely related to the subject of pathfinding in general is that of unit formations --
techniques used by developers to make groups of military units behave realistically. While only a few
developers present at this year's roundtables had actually needed to use formations in their games,
the topic sparked quite a bit of interest (probably due to the recent spate of games with this feature).
Most of those who had implemented formations had used some form of flocking with a strict overlying
rules-based system to ensure that units stayed where they were supposed to. One developer, who was
working on a sports game, said he was investigating using a "playbook" approach (similar to that used
by a football coach) to tell his units where to go.

State machines and hierarchical AIs. The simple rules-based finite- and fuzzy-state machines (FSMs
and FuSMs) continue to be the tools of choice for developers, overshadowing more "academic"
technologies such as neural networks and genetic algorithms. Developers find that their simplicity
makes these approaches far easier to understand and debug, and they work well in combination with
the types of encapsulation seen in games using A-Life techniques.
Developers are looking for new ways to use these tools. For many of the same reasons A-Life
techniques are being used to break down and simplify complex AI decisions into a series of small,
easily defined steps, developers are taking more of a layered, hierarchical approach to AI design.
Interplay's Starfleet Command and Red Storm's Force 21 take such an approach, using higher-level
strategic "admirals" or "generals" to issue general movement and attack orders to tactical groups of
units under their command. In Force 21 these units are organized at a tactical level into platoons;
each platoon has a "tactician" who interprets the orders the platoon has received and turns them into
specific movement and attack orders for individual vehicles.
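
Stripped of game specifics, the layered pattern is just orders flowing down a chain of increasingly
concrete interpreters. The sketch below is a generic illustration, not code from either game; all
the class names are invented.

    #include <cstddef>
    #include <vector>

    // An abstract order flows down the hierarchy, becoming more concrete
    // at each level.
    struct Order { float targetX, targetY; };

    class Vehicle {
        float destX, destY;
    public:
        Vehicle() : destX(0), destY(0) {}
        // Local pathfinding and steering toward the destination go here.
        void MoveTo(float x, float y) { destX = x; destY = y; }
    };

    // The "tactician": turns a platoon-level order into per-vehicle moves.
    class Platoon {
        std::vector<Vehicle*> vehicles;
    public:
        void Execute(const Order& o) {
            // Fan the vehicles out around the objective; a real tactician
            // would pick formations, arcs of fire, and so on.
            for (std::size_t i = 0; i < vehicles.size(); ++i)
                vehicles[i]->MoveTo(o.targetX + float(i) * 2.0f, o.targetY);
        }
    };

    // The "general": issues broad orders, never touching individual units.
    class General {
        std::vector<Platoon*> platoons;
    public:
        void AttackObjective(float x, float y) {
            Order o = { x, y };
            for (std::size_t i = 0; i < platoons.size(); ++i)
                platoons[i]->Execute(o);
        }
    };

One practical payoff of this separation, as noted above, is that each layer can be debugged -- and
exposed for user customization -- independently of the layers below it.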
Most developers at the roundtables who were working on strategy games reported that they were
either planning to implement or already had used this type of layered approach to their AI engines.
Not only was it a more realistic representation, but it made debugging simpler. Most of those who used
this design also liked the way it allowed them to add hooks at the strategic level to allow for user
customization of AIs, building strategies, and so on, while isolating the lower-level "get the job done"
AI from anything untoward that the user might accidentally do to it. This is another trend we're seeing
in strategy games that players find quite enjoyable -- witness the various "empire mods" for games
such as Stars!, Empire of the Fading Suns and Alpha Centauri.
________________________________________________________
Can AI SDKs Help?

The single biggest topic of discussion at the GDC 2000 roundtables was the feasibility of AI SDKs.
There are at least three software development kits currently available to AI developers:

● Mathématiques Appliquées' DirectIA, an agent-based toolkit that uses state machines to build up
emergent behaviors.
● Louder Than A Bomb's Spark!, a fuzzy-logic editor intended for AI engine developers.
● The Motion Factory's Motivate, which can provide some fairly sophisticated action/reaction state
machine capabilities for animating characters. It was used in Red Orb's Prince of Persia 3D, among
others.

Many developers (especially those at the "AI for Beginners" session) were relatively unaware of
these toolkits and hence were very interested in their capabilities. It didn't seem, however, that
many of the more experienced developers thought these toolkits would be all that useful, though a
quick poll did reveal that one or two developers were in the process of evaluating the DirectIA
toolkit. Most expressed the opinion that one or more SDKs would come to market that would prove
them wrong.

Figure 5: Red Orb's Prince of Persia 3D used The Motion Factory's Motivate SDK

In discussing possible features, most felt that an SDK that provided simple flocking or pathfinding
functions might best meet their needs. One developer said he'd like to see some kind of
standardized "bot-like" language for AI scripts, though there didn't seem to be any widespread
enthusiasm for this idea (probably because of fears it would limit creativity). Also discussed
briefly in conjunction with this topic was the matter of what developers would be willing to pay
for such an SDK, should a useful one actually be available. Most felt that price was not a
particular object; developers today are used to paying (or convincing their bosses to pay)
thousands of dollars for toolkits, SDKs, models, and the like. This indicates that if somebody can
develop an AI SDK flexible enough to meet the demands of developers, they should be able to pay
the rent.
Technologies on the Wane

It's become clearer since last year's roundtables that the influence of the more "nontraditional"
AI techniques, such as neural networks and genetic algorithms (GAs), is continuing to wane. Whereas
in previous years developers had many stories to tell of exploring these and other technologies
during their design and development efforts, at this year's sessions there was much more focus on
making the more traditional approaches (state machines, rules-based AIs, and so on) work better.
The reasons for this varied, but essentially boiled down to variations on the fact that these
approaches are better understood and work "well enough." Developers seemed to want to focus much
more on how to make them work better and leave exploration of theory to the academic field.

Genetic algorithms have taken a particularly hard hit in the past year. There wasn't a single
developer at any of the roundtables who reported using them in any current projects, and most felt
that their appeal was overrated. While last year's group had expressed some interest in
experimenting with using GAs to help with game tuning, the developers who had tried reported this
year that they hadn't found this to be very useful. Nobody could think of much use for GAs outside
of the well-known "life simulators" such as the Creatures and Petz series.

The one exception to this, as previously noted, is the continued use of A-Life techniques. From
flocking algorithms that help guide unit formations (Force 21, Age of Kings, Homeworld) to
object-oriented desire/satisfaction approaches (The Sims), developers are finding that these
techniques make their games much more lifelike and "predictably unpredictable" than ever before.

Where We're Headed

Always interesting at the roundtables are the inevitable discussions of where the industry in
general, and game AI in particular, is headed. As usual, we got almost as many opinions as there
were attendees, but some common trends could be seen emerging down the road.

Everybody thought that game AI would continue to be an important part of most games. The recent
advances were unlikely to be lost to a new wave of "gee-whiz" 3D graphics engines, and the
continued increase in CPU and 3D card capabilities was only going to continue to give AI developers
more horsepower. There was the same feeling as last year that the industry would continue to move
slowly away from monolithic and rigid rules-based approaches to more purpose-oriented, flexible AIs
built using a variety of approaches. It seems safe to assume that extensible AIs will continue to
enjoy some popularity and support among developers, mostly in the first-person shooter arena but
also in more sophisticated strategy games.

Figure 6: Relic Entertainment's Homeworld used flocking techniques

Academia and the defense establishment continue to influence the game AI field (see John Laird's
"Bridging the Gap Between Developers and Researchers," to be published in Part Two next week),
though it sometimes seems that the academic world learns more from game developers than the other
way around. For the most part, developers seem to feel that the academic study of AI is interesting
but won't really help them ship their product, while researchers from the academic field find the
rapid progress of the game industry enviable even if the techniques aren't all that well
documented.

There can be no doubt that the game AI field continues to be one of the most innovative areas of
game development. We know what works, and tools are beginning to appear to help us do our jobs.
With CPU constraints essentially eliminated and the possibilities of good game AI now part of the
design process, AI developers can look forward to a bright future of innovation and
experimentation.

Sidebar: Visibility Graphs

One of the interesting areas that game AI is beginning to explore is the realm of terrain analysis.
Terrain analysis takes the relatively simple task of path-finding across a map to its next logical
step, which is to get the AI to recognize the strategic and tactical value of various terrain
features such as hills, ridges, choke-points, and so on, and incorporate this knowledge into its
planning. One tool that offers much promise for dealing with this task is the visibility graph.

Visibility graphs are fairly simple constructs originally developed for the field of robotics
motion. They work as follows: Assume you are looking down at a map that has a hill in the center
and a pasture with clumps of trees all around it. Let appropriately shaped polygons represent the
hill and the trees. The visibility graph for this scene uses the vertices of the polygons for the
vertices in the graph, and builds the edges of the graph between the vertices wherever there is a
clear (unobstructed) path between the corresponding polygon vertices. The weight of each connecting
line equals the distance between the two corresponding polygon vertices. This gives you a
simplified map against which you can run a pathfinding algorithm to traverse the map while avoiding
the obstacles.

Figure: Visibility graphs were used in Red Storm Entertainment's Force 21

There are some problems with visibility graphs, however. They only give raw connection information,
and paths built using them tend to look a little mechanical. Also, the developer needs to do some
additional work to prevent all but the smallest units from colliding with polygon (graph) edges as
they move, since the path generated from a visibility graph doesn't take unit size into account at
all. Still, they're a straightforward way to break down terrain into simplified areas, and they
have uses in pathfinding, setting up ambushes (the unobstructed graph edges are natural ambush
points), and terrain generation.
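
To make the sidebar's recipe concrete, here is a sketch of the graph-building step. It connects
every pair of obstacle vertices whose joining segment properly crosses no obstacle edge; a
production version would also reject segments that cut through a polygon's interior and handle
collinear touches, both of which this simple test ignores. All names are invented for the example.

    #include <cmath>
    #include <cstddef>
    #include <vector>

    struct Pt   { float x, y; };
    struct Edge { int a, b; float length; };  // indices into the vertex list

    // 2D cross product sign, used by the segment intersection test.
    static float Cross(Pt o, Pt p, Pt q)
    {
        return (p.x - o.x) * (q.y - o.y) - (p.y - o.y) * (q.x - o.x);
    }

    // True if segments pq and rs properly cross each other. Shared
    // endpoints (cross product of zero) do not count as a crossing.
    static bool SegmentsCross(Pt p, Pt q, Pt r, Pt s)
    {
        float d1 = Cross(p, q, r), d2 = Cross(p, q, s);
        float d3 = Cross(r, s, p), d4 = Cross(r, s, q);
        return ((d1 > 0) != (d2 > 0)) && ((d3 > 0) != (d4 > 0));
    }

    // Obstacles are closed polygons given as vertex loops.
    typedef std::vector<Pt> Polygon;

    // Build the visibility graph; O(n^2 * edges) as written, which is
    // fine for small maps and can be precomputed offline anyway.
    std::vector<Edge> BuildVisibilityGraph(const std::vector<Polygon>& obs)
    {
        std::vector<Pt> verts;
        for (std::size_t i = 0; i < obs.size(); ++i)
            verts.insert(verts.end(), obs[i].begin(), obs[i].end());

        std::vector<Edge> edges;
        for (std::size_t a = 0; a < verts.size(); ++a)
            for (std::size_t b = a + 1; b < verts.size(); ++b) {
                bool clear = true;
                for (std::size_t i = 0; i < obs.size() && clear; ++i)
                    for (std::size_t j = 0; j < obs[i].size() && clear; ++j) {
                        Pt r = obs[i][j], s = obs[i][(j + 1) % obs[i].size()];
                        if (SegmentsCross(verts[a], verts[b], r, s))
                            clear = false;
                    }
                if (clear) {
                    Edge e;
                    e.a = int(a); e.b = int(b);
                    float dx = verts[a].x - verts[b].x;
                    float dy = verts[a].y - verts[b].y;
                    e.length = std::sqrt(dx * dx + dy * dy);  // weight = distance
                    edges.push_back(e);
                }
            }
        return edges;
    }

The resulting edge list is exactly the "simplified map" the sidebar describes: feed it to any
graph-search routine (A*, for instance) and you have obstacle-avoiding paths.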
For More Information

Web Sites
Far and away the best place to find out more about any aspect of
game AI is the Internet. There are more excellent web sites filled with
tutorials, information, sample code, and so on, than anybody could possibly list in one place. Some of
the recommended ones include:
www.gameai.com
Steven Woodcock's site, dedicated to all things game-AI-related. Provides links to other AI resources,
reviews on AI implementations in games already on the market, and archives of various Usenet
threads.
www.gamedev.net
Another excellent site dedicated to all aspects of game development; it offers an extensive list of
resources and an active discussion group on the topic.
www.red.com/cwr.boids.html
This site remains the single best source for any information about flocking and related A-Life
technologies.
www.pcai.com/pcai
PC AI magazine has a marvelous web site crammed with all kinds of useful AI resources. From sample
applications to research papers, you can find it here.
http://ai.eecs.umich.edu/people/laird/gamesresearch.html
John E. Laird's site
www.aaai.org
American Association for Artificial Intelligence
Newsgroups
Of course Usenet continues to be a great place to do research on a variety of AI-related topics. The
best newsgroups for this purpose remain comp.ai.games, comp.ai, and rec.games.programmers.
Papers
Laird, J. E., and M. van Lent. "Interactive Computer Games: Human-Level AI's Killer Application."
Proceedings of the AAAI National Conference on Artificial Intelligence, August 2000.
Laird, J. E. "It Knows What You're Going to Do: Adding Anticipation to a Quakebot." Proceedings of the
AAAI 2000 Spring Symposium Series: Artificial Intelligence and Interactive Entertainment, March 2000
(AAAI technical report #SS-00-02).
Books
Unfortunately, there really aren't very many books that discuss game AI. Probably the best
comprehensive reference remains:
Russell, Stuart J., and Peter Norvig. Artificial Intelligence: A Modern Approach. Upper Saddle River, N.
J.: Prentice Hall, 1995.

________________________________________________________

Game AI: The State of the Industry, Part Two
by Dave C. Pottinger
November 8, 2000

Last week in Part One of this article, Steven Woodcock took inventory of the current state of game
AI, based on the roundtables he led at the 2000 Game Developers Conference. Now in Part Two,
Ensemble Studios' Dave Pottinger looks at what the future holds for game AI, and University of
Michigan Professor John E. Laird discusses bridging the gap between AI researchers and game
developers.

As I slowly reclined back into the seat of the last E3 bus this spring, I was certain of two
things: some really great games were coming out in the next year and my feet hurt like hell. A lot
of the games that created a buzz featured excellent AI. Since my fellow Ensembleites assured me
(repeatedly) that no one really cared to hear about my feet, I thought I'd use this space to talk
about some of the games coming out in the next 18 months and the new and improved AI technology
that will be in them.
Better AI Development Processes and Tools

AI has traditionally been slapped together at the eleventh hour in a product's development cycle.
Most programmers know that the really good computer-player (CP) AI has to come at the end because
it's darn near impossible to develop CP AI until you know how the game is going to be played. As
the use of AI in games has matured, we're starting to see more time and energy spent on developing
AI systems that are modular and built in a way that allows them to be tweaked and changed easily as
the gameplay changes. This allows the AI development to start sooner, resulting in better AI in the
final product. A key component in improving the AI development process is building better tools to
go along with the actual AI.
For Ensemble's third real-time strategy (RTS) game, creatively code-named RTS3, we've spent almost
a full man-year so far developing a completely new expert system for the CP AI. It's been a lot of work
taking the expert system (named, also creatively, XS) from the in-depth requirements discussions with
designers to the point where it's ready to pay off. We've finally hit that payoff and have a very robust,
extensible scripting language.

Interesting new entity AI features will be a key component in Cataclysm.

The language has been so solid and reusable that, in addition to using it to write the CP AI content,

we're using it for console and UI command processing, cinematic control, and the extensive trigger
system. We also expect to use XS to write complicated conditional and prerequisite checking for the
technology tree; this way, the designers can add off-the-wall prerequisites for research nodes without
programmer intervention. Finally, we will also use the XS foundation to write the script code that
controls the random map generation for RTS3. The exciting aspect of XS from a tools standpoint is
that we will have XS debugging integrated with RTS3's execution. For fans who used the Age of
Empires II: The Age of Kings (AoK) expert-system debugging (a display table of 40 or so integer
values), this is a huge step up, since XS will significantly increase the ease with which players can
create AI personalities.
Better NPC Behavior

In the early days of first-person shooters, non-player characters (NPCs) had the intelligence of nicely
rounded rocks. But they've been getting much better lately -- look no further than Half-Life's
storytelling NPCs and Unreal Tournament's excellent bot AI. The market success of titles such as these
has prompted developers to put more effort into AI, so it looks as if smarter NPCs will continue to
show up in games.

Grey Matter Studios showed some really impressive technology at E3 with Return to Castle
Wolfenstein. When a player throws grenades at Nazi guards, those guards are able to pick up the
grenades and throw them back at the player, adding a simple but very effective new wrinkle to NPC
interactivity. A neat gameplay mechanic that arises out of this feature is the player's incentive to hold
on to grenades long enough so they explode before the guards have a chance to throw them back.
Thankfully, Grey Matter thought of this and has already made the guards smart enough not to throw
the grenades back if there's no time to do so.

Watch out for those tricky grenade-throwing guards when you get off of the gondola in Return to
Castle Wolfenstein.

More developers are coupling their AI to their animation/simulation systems to generate characters
which move with more realism and accuracy. Irrational did this with System Shock 2 and other
developers have done the same for their projects. The developers at Raven are doing similar things
with their NPC AI for Star Trek: Elite Force. They created a completely new NPC AI system that's
integrated into their Icarus animation system. Elite Force's animations are smoothly integrated into the
character behavior, which prevents pops and enables smooth transitions between animations. The
result is a significant improvement to the look and feel of the game. I believe that as the use of
inverse kinematics in animation increases, games will rely on advanced AI state machines to control
and generate even more of the animations. As a side benefit, coupling AI to animation also yields
more code reuse and memory savings.
Better Communication Using AI

Since the days of Eliza and HAL, people have wanted to talk with their computers. While real-time
voice recognition and language processing are still several years off, greater strides are being made to
let players better communicate with their computer opponents and allies.
For example, in our upcoming Age of Empires II: The Age of Kings expansion pack, The Conquerors,
we've enabled a chat communication system that lets you command any computer player simply by
sending a chat message or selecting messages from a menu. Combined with AoK's ability to let you
script your own CP AI, this lets you craft a computer ally that plays on its own and lets you have
conversational exchanges with it in random-map games. This is a small step toward the eventual goal
of having players talk to their computer allies in the same way as to humans. Unfortunately, we still
have to wait a while for technology to catch up to our desire.
________________________________________________________



Game Developers Conference 2001: An AI Perspective
by Eric Dybsand
April 23, 2001

With each year that I attend the conference (I have attended 14 of the 16 conferences that have
been held so far) there are conflicts with sessions, networking opportunities and just the sheer
magnitude of the conference, which prevents me from attending all the great sessions that I am
interested in. I am most disappointed about being unable to attend "Design Plunder" (lecture by
Will Wright) and "Those Darn Sims: What Makes Them Tick?" (lecture by Jamie Doornbos), both of
which discussed The Sims game AI.

That being said, I was able to attend many other excellent computer game AI related sessions. The
following represents the perspective I obtained from the computer game AI related sessions I did
attend during the recent GDC 2001 in San Jose.
Tuesday, March 20, Tutorial: "Artificial Life for Computer Games"
This tutorial was an update of the tutorial by the same name, first presented at last year's GDC.
The same speakers were present: Bruce Blumberg, John Funge, Craig Reynolds and Demetri Terzopoulos.
With its focus on applications of artificial life techniques, this tutorial offered the
new-to-ALife attendee a comprehensive look at some of the research work of these noted ALife
experts.

Duncan, a virtual dog that behaves autonomously, created by the Synthetic Characters Group at the
MIT Media Lab.

Since I had sat through the same tutorial in its entirety last year, attended only the first hour
of this year's version, and had a conflict with another tutorial at the same time, I can only
comment on one speaker's presentation. That speaker was Bruce Blumberg, who described the latest
status of the work on virtual creatures done by the Synthetic Characters Group at the MIT Media
Lab. Specifically, Blumberg reviewed the status of the research and development on Duncan, a
virtual dog that behaves autonomously. In a session held later in the conference (which I discuss
later in this report), two of Blumberg's students presented more detail regarding the architecture
and development of Duncan.


Because I had to dash off to the conflicting tutorial, I was not able to attend the presentations by
Funge, Reynolds or Terzopoulos. So, I can only speculate that Reynolds offered an update of his work
with steering behaviors and flocking (for which he is credited with being the "Father of Flocking") and
probably demonstrated his hockey game application of these low level behaviors. Also, I would
suggest that Terzopoulos provided an update of his work with physics based locomotion learning.

Tuesday, March 20, Tutorial: "Cutting Edge Techniques for Modeling & Simulation III"
This is the tutorial that conflicted with the Artificial Life for Computer Games tutorial; it was a
presentation by Roger Smith on the status of techniques used in military simulations. I had missed
the GDC 2000 version of this tutorial because I had sat through the complete Artificial Life for
Computer Games tutorial at GDC 2000. [As I mentioned in the beginning of this report, the sheer
magnitude of the GDCs means that conflicts arise in what sessions a person wants to see.]
Much of the first part of this tutorial was more relevant to game design (primarily war game,
flight sim and FPS game design), as Smith went into some detail regarding the history and design of
military training simulations. In doing so, Smith presented some interesting parallels between game
and simulation development. Smith further reviewed some code models, interface specifications and
object declarations found in use in today's military simulations.
As Smith discussed modeling concepts, many of his examples came from The Sims computer game by
Maxis, which was the most successful sim game released last year. And as he presented an AI Vehicle
Movement Model, I found myself relating to my own current work in developing an artificial driver for a
soon-to-be-released Trans-Am car racing game.

Many of Smith's examples came from The Sims.

Probably the part of the presentation that related most widely to computer game AI was when Smith
reviewed behavioral modeling. The design needs of an intelligent agent for a military simulation
are very much like those for most computer games (where behavior is expected to appear
intelligent). During this part of the presentation, Smith reviewed simple reflex agents, goal-based
agents and utility-based agents (all of which I personally have seen in use in computer game AI
implementations). Smith further discussed finite state machines (both in singular and hierarchical
usage), expert systems, Markov chains, constraint satisfaction and fuzzy logic. All of these
concepts are widely used in computer game AI development. (Well, maybe not hidden Markov chains.)
Perhaps the best take-away for an AI programmer attending this tutorial was the opportunity to see
a variety of techniques -- and how they were being used -- in the way the military designs and
implements its training simulations. As a result, this was certainly a tutorial well worth
attending, for this AI programmer.

Tuesday, March 20: 4th Annual AI Programmers Dinner


Fast becoming a GDC institution and tradition is the Annual AI Programmers Dinner, hosted by Neil
Kirby, Steve Woodcock and myself. Four years ago, at the only GDC held in Long Beach, we (the
"Three AI Guys") decided that an informal gathering of AI programmers and those game developers
interested in AI was needed, and so the AI Programmers Dinner was born. From that first gathering,
each dinner has been fun and enlightening. Complete AI systems have been designed and dissected on
napkins and tablecloths during the event. All manner of computer game AI questions have been posed
and answered and then refuted during the dinner, all in a good natured and community spirited
manner. At this fourth AI Programmers Dinner, we enjoyed Italian/American cuisine at Milos
Restaurant in the Crowne Plaza Hotel with 42 developers and AI programmers. And rumor has it that
the fifth AI Programmers Dinner, during GDC 2002, will be held at a popular San Jose Chinese
restaurant. So, if you are an AI programmer or interested in informal discussion about computer
game AI (and can pay your share of the bill) then make your plans to attend this event during GDC
2002. I know I will!


The 4th Annual AI Programmers Dinner.

Wednesday, March 21, Tutorial: "Artificial Intelligence: Tactical Decision-Making Techniques"
This tutorial by John Laird and Michael van Lent was, all by itself, worth the price of admission
for the new computer game AI programmer. The first part of the session was a review of many of the
traditional AI decision-making techniques: finite state machines, decision trees, fuzzy logic, neural
networks, genetic algorithms and others. Even though I'm familiar with all of these techniques and
have used most of them for one aspect or another of a computer game AI or test bed, it is still good
for me to attend a review such as this. For the new computer game AI programmer or someone just
becoming familiar with computer game AI, this review would be an invaluable summary of all the tools
available to be used in developing computer game AI.
Using the work Laird and van Lent have done for developing a SOAR-based NPC control process for
bots, using Quake II as a backdrop, the speakers presented these various techniques and how they
related to computer game AI, then analyzed each technique for its strengths and weaknesses and
provided references to more information on each technique.

The section of the tutorial that covered planning was especially interesting to me, as the speakers
described the planning process of a bot playing Quake II. Various selection criteria were reviewed, and
multi-step look-ahead techniques were suggested. Since my AI development tends to produce agents
that are very goal-oriented (a component of planning) this section was very relevant to me.
The speakers concluded the tutorial with a presentation of components of their SOAR-based Quake bot
work. During this section a variety of 'bot behaviors were described and the SOAR approach
presented. What stood out for me, within this section, was the approach by the SOAR-based bot to
anticipate its enemy's actions.
While this tutorial was interesting and enlightening, I am sure I still would be fragged if I encountered
the SOAR-based bot in a death match, despite now knowing more about the processes that it uses to
make decisions.

________________________________________________________

"The Basics of Team AI"

join | contact us | advertise | write | my profile


news | features | companies | jobs | resumes | education | product guide | projects | store

Copyright © 2002 CMP Media LLC. All rights reserved.


privacy policy | terms of service

http://www.gamasutra.com/features/20010423/dybsand_01.htm (3 of 3) [25/06/2002 2:37:20 PM]


Gamasutra - Features - "More AI in Less Processor Time: 'Egocentric' AI" [06.19.00]

| | | |

Features
by Ian Wright and
James Marshall
Gamasutra
June 19, 2000 More AI in Less Processor Time: 'Egocentric'
Printer Friendly
Version
AI
The design brief for the new game's AI has just been handed to you, and to call it optimistic would
be an understatement. You are charged with developing a real-time living, breathing city, populated
by thousands of pedestrians, hundreds of cars, and dozens of non-player characters. The
'incidental' pedestrians and traffic need to react convincingly to each other and to your actions,
while the NPCs absolutely positively must act in a believable manner when you encounter them. It's
going to be computationally expensive, but you've only been given 20% of the processor time each
frame, and if you exceed that and the game frames out, you've failed.

Modern games will increasingly make such demands on hardware and programmers. Fortunately help is
at hand with techniques to control and manage real-time AI execution, techniques that open up the
possibility of future hardware acceleration of AI.
Games Should Be Fun

Games should be fun. This requirement has many consequences. One important consequence is that
games that allow player input at any moment ("arcade-style" games) should run in real-time,
presenting events that occur sufficiently fast to challenge the player's reactions. Lower frame-rates
look bad, reduce the opportunity for interaction, increase player frustration, and do not reflect the
speed of events in the real world. With this firmly in mind, we set out to design a framework for the
general execution of AI code.
The latter stages of a game project involve optimising parts of the game code for processing-time reductions.
This includes AI code, which, depending on the type of game, can take up more or less of the available
CPU time. Given this, an important requirement for general AI execution is that (a) it conforms to the
timing constraint of the overall game frame rate. A consequence of (a) is that the AI never exceeds a
maximum per-frame processing time.
AI requires the execution of arbitrarily complex and heterogeneous pieces of code, often grouped
together as behavioural "rules" or "behavioursets" for various game objects or agents, such as the
AI code for a trapdoor, obstacle, spring, or the code for an adversary, racing vehicle or
character.
Therefore, a further requirement for general AI execution is that (b) it makes no assumptions about
the exact nature of the AI code, including assumptions about how long the code will take to execute.
Rendering code normally has to execute every frame in order to construct the visual scene. The
situation is different for AI code. Consider a soccer player, who may need to check for passing and
shooting opportunities every frame, but only need check its position against the team's formation
every other frame, or only in a dead-ball situation. AI code involves a wide range of execution
frequencies compared to non-AI game code. If all AI code is fully executed every frame when this is
not required then the resulting code is inefficient. Also, some games require different execution
frequencies for objects and agents, in addition to controlling the execution frequencies of their internal
processes. For example, a very slow moving tortoise need not be processed every frame, whereas the
hare may need to be. Hence, a further requirement for general AI execution is (c) it allows different
execution frequencies to be specified both for agents and their constitutive internal processes.
Finally we realised that some AI processes can be extensively time-sliced across many frames,
particularly if the results of the process are not immediately required. For example, if a strategy game
agent needs to plan a route through a terrain, then the planning can potentially take place over many
frames before the agent actually begins to traverse the deduced route. Time slicing allows
computationally expensive processes to be 'smeared' across many frames thereby reducing the per
frame CPU hit. Therefore, a final requirement for general AI execution is (d) it allows AI processes to
be dynamically suspended and reactivated.

There are no general methods for supporting different execution frequencies of parts of AI code and
time-slicing non-urgent AI processes. If these techniques are employed they are employed in a
project-specific, ad-hoc manner. There is no 'AI operating system' that allows programmers to control
these factors. This represents an important missed opportunity for the development of more complex
AI in games. If all AI code were executed through a common AI operating system or engine, with
mechanisms for specifying execution frequencies, upper bounds on CPU time, time-slicing, and
suspension and reactivation of processes, then it would be possible to get more AI for the same CPU
power.
To recap, here are four main requirements for general-purpose AI execution:
1. Conformity to the timing constraint of the overall game frame rate despite variable AI load;
2. No assumptions about the exact nature of AI code;
3. Allow different AI processes to have different execution frequencies, both for agents and their
constitutive internal processes;
4. Allow AI processes to be dynamically suspended and reactivated.
By now you may have realised that (a) asks the impossible: AI algorithms that take the same amount
of time even when asked to do more work. However, games must entertain the player, not implement
a perfect simulation. In the next section we'll look at why we can partially satisfy requirement (a).
Believability Vs Accuracy

An arcade-style game is somewhat like the real world, consisting of both active and passive agents
and events that unfold over time. But the game need not process every object, agent and event in the
virtual world in order to present a believable, entertaining experience. For example, if a truck is within
the player's field of view when planting a mine then the game necessarily needs to process the truck
movement and the mine drop, and the rendering code necessarily needs to draw this event to the
screen. However, if the truck is 'off-screen' the rendering code need not be run, and the AI code
controlling the truck could simply assert the existence of a mine on the road at a certain time, rather
than processing the fine-grained movement of the truck. Virtual game worlds need to present a
believable world to the player, and not necessarily present an accurate simulation of the real world.
Events not 'interactively close' to the human player need not be fully processed. Therefore,
requirement (a) can be satisfied if some AI processes need only be "believable" rather than "accurate".
These kinds of processes can be time-sliced over many frames, executed at a lower frequency, be
replaced with computationally less expensive "default" behaviours, or simply postponed. Furthermore,
what may need to be "accurate" at one time may need to be only "believable" at another, depending
on the current activities of the human player. We call the idea of prioritising the update of parts of the
game world currently most relevant to the player "egocentric processing". Our Process Manager
implements this idea.
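
Before looking at the Process Manager itself, here is a minimal sketch of what such a component
might look like, with invented names throughout: each registered process declares an execution
period in frames and can be suspended, and the manager runs due processes until a per-frame
millisecond budget is spent -- covering, in spirit, requirements 1 through 4 from the list above.

    #include <cstddef>
    #include <ctime>
    #include <vector>

    // Coarse wall-clock substitute for the example; a real engine would
    // use its own high-resolution timer.
    static float TimeNowMs()
    {
        return 1000.0f * std::clock() / CLOCKS_PER_SEC;
    }

    // Base class for any AI process: a trapdoor's rule, a soccer player's
    // formation check, a strategy agent's route planner, and so on.
    class AIProcess {
    public:
        int  periodFrames;   // run every N frames (requirement 3)
        bool suspended;      // dynamic suspend/resume (requirement 4)
        explicit AIProcess(int period) : periodFrames(period), suspended(false) {}
        virtual ~AIProcess() {}
        // Do one slice of work; return false when finished, so expensive
        // jobs can be smeared across many frames.
        virtual bool Step() = 0;
    };

    class ProcessManager {
        std::vector<AIProcess*> procs;
        long frame;
    public:
        ProcessManager() : frame(0) {}
        void Add(AIProcess* p) { procs.push_back(p); }

        // Run at most budgetMs of AI this frame (requirement 1). The
        // processes are opaque to the manager (requirement 2): it only
        // measures time around them.
        void Update(float budgetMs) {
            ++frame;
            float start = TimeNowMs();
            for (std::size_t i = 0; i < procs.size(); ++i) {
                AIProcess* p = procs[i];
                if (p->suspended || frame % p->periodFrames != 0) continue;
                p->Step();
                if (TimeNowMs() - start >= budgetMs) break;  // budget spent
            }
            // A fuller version would rotate the starting index so the
            // same processes aren't starved every frame.
        }
    };

Note that the budget can only be enforced between process slices, not within one -- which is
exactly why the slices themselves need to be kept small or time-sliced, as described above.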

________________________________________________________


Recognizing Strategic Dispositions thread

compiled by Steve Woodcock

Hello Everybody: 7/15/95

Here's the second summarization of the excellent "Recognizing
Strategic Dispositions" thread, a followup containing posts made since
my original summarization posted on 6/12/95. As before, if I missed
any posts I apologize; let me know and I'll fix it in future summarizations
(if there's any demand for such).

The posts are presented essentially as is, with some *minor* editing
on my part for formatting.

Here are the e-mail addresses for those contributors whom I have.
Again, my profound apologies if I missed anybody; PLEASE let me know
and I'll correct this forthwith:

Uri Bruck (bruck@actcom.co.il)
Marc Carrier (mcarrier@bnr.ca)
Richard Cooley (pixel@gnu.ai.mit.edu, pixel@usa1.com)
Owen Coughlan (biggles@gmgate.vircom.com)
Dennis W. Disney (disney@mcnc.org)
Graham Healey (grahamh@oslonett.no)
Neil Kirby (nak@archie.cb.att.com)
Alexander Marc (903022@student.canberra.edu.au)
Andrae Muys (a.muys@mailbox.uq.oz.au, ccamuys@dingo.cc.uq.oz.au)
.oO FactorY Oo. (cs583953@lux.latrobe.edu.au)
Christopher Spencer (clspence@iac.net)
Viktor Szathmary (szviktor@inf.bme.hu)
Daniele Terdina (sistest@ictp.trieste.it)
Robert A. Uhl (ruhl@phoebe.cair.du.edu)
Will Uther (will@cs.su.oz.au)
Steven Woodcock (swoodcoc@cris.com, woodcock@escmail.orl.mmc.com)

Hopefully this will spark a renewal of the original thread and prove to
be informative to all concerned. I know that *I* have found this to be
the most illustrative and informative thread I've ever seen on the Net;
this is truly where this medium shines.

Enjoy!

Steven

Andrae Muys (ccamuys@dingo.cc.uq.oz.au) started off this thread on May 9, 1995....


==============================================================================
I am currently trying to write a game that will provide a computer
opponent in a computer wargame. I intend eventually to incorporate
relatively complex game mechanics such as can be found in commercial table
top rules systems. However the current project is nowhere near as
extensive, with *extremely* basic rules and mechanics. However, as each
unit may still move once each turn, and over a number of distances, the
branching factor puts the CHESS/GO thread to shame. In fact a lower
bound calculated from rules far simpler than I intend to use ended with a
branching factor of 2.6^8. A simple PLY search is therefore out of the
question. Brainstorming suggested that abstracting the battlefield
into three frontal sections and maybe a reserve leaves a basic set of
rules with a branching factor of approx. 16 (much more manageable).
However, to implement this the AI needs to be able to recognise what
constitutes a flank/centre/rear/front etc... From practical wargaming
experience this is something that is normally arrived at by intuition.
Surely it is a problem which has been faced before, and I was wondering if
there was any theory or code examples which I might use to build
such an engine.
In the meantime I intend to start on the AI assuming that a way can be
found to recognise such strategic dispositions.

Thanks in Advance.

Andrae Muys.

==============================================================================

Andrae Muys (ccamuys@dingo.cc.uq.oz.au) wrote:


: I am currently trying to write a game that will provide a computer
: opponent in a computer wargame. I intend eventually to incorporate
: relatively complex game mechanics such as can be found in commercial table
: top rules systems. However the current project is nowhere near as
: extensive, with *extremely* basic rules and mechanics. However, as each
: unit may still move once each turn, and over a number of distances, the
: branching factor puts the CHESS/GO thread to shame. In fact a lower
: bound calculated from rules far simpler than I intend to use ended with a
: branching factor of 2.6^8. A simple PLY search is therefore out of the
: question. Brainstorming suggested that abstracting the battlefield
: into three frontal sections and maybe a reserve leaves a basic set of
: rules with a branching factor of approx. 16 (much more manageable).
: However, to implement this the AI needs to be able to recognise what
: constitutes a flank/centre/rear/front etc... From practical wargaming
: experience this is something that is normally arrived at by intuition.
: Surely it is a problem which has been faced before, and I was wondering if
: there was any theory or code examples which I might use to build
: such an engine.
: In the meantime I intend to start on the AI assuming that a way can be
: found to recognise such strategic dispositions.

: Thanks in Advance.
: Andrae Muys.
Andrae:


Glad to see I'm not the only one wrestling with this problem! ;)

Your approach to break things down into 'front', 'flank', and
'rear' makes sense and seems like a reasonable simplification of the problem.
A first-order definition of each might be:

front -- Where the mass of the enemy units are known to be. The
direction I want to attack.

flank -- Any area in which there are fewer (1/3 ?) as many enemy
units as the area designated as 'front'. (Note this is
purely arbitrary, based as much on prior experience as
anything else.) Perhaps also selected based on natural
defensive terrain (i.e., oceans or mountains).

rear -- Any area in which no (known) enemy units are operating,
or an area completely surrounded and controlled by me.

These definitions work by identifying the front, then extrapolating
from that. As enemy units move around, become detected, attack, etc.,
the 'size' of the front will likely grow and shrink, forcing similar changes
to the flanks (especially) and perhaps to the rear areas as well.
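
These first-order definitions translate almost directly into code. A sketch follows; the sector
grid, the unit counts, and the 1/3 threshold are all illustrative assumptions, not part of the
original post.

    #include <cstddef>
    #include <vector>

    enum Disposition { FRONT, FLANK, REAR };

    struct Sector {
        int  knownEnemies;  // enemy units detected in this map sector
        bool controlled;    // completely surrounded/controlled by us
    };

    // Classify each sector relative to the densest one, using the 1/3
    // rule of thumb from the definitions above.
    std::vector<Disposition> Classify(const std::vector<Sector>& sectors)
    {
        int maxEnemies = 0;
        for (std::size_t i = 0; i < sectors.size(); ++i)
            if (sectors[i].knownEnemies > maxEnemies)
                maxEnemies = sectors[i].knownEnemies;

        std::vector<Disposition> out(sectors.size(), REAR);
        for (std::size_t i = 0; i < sectors.size(); ++i) {
            const Sector& s = sectors[i];
            if (s.knownEnemies == 0 || s.controlled)
                out[i] = REAR;                  // no (known) enemy activity
            else if (3 * s.knownEnemies < maxEnemies)
                out[i] = FLANK;                 // well under 1/3 of the front's mass
            else
                out[i] = FRONT;                 // where the enemy mass is
        }
        return out;
    }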

One problem I can think of off the top of my head is how to handle
multiple front situations; there's at least some possibility of overlapping
definitions, meaning that some precedence order must be established.
Special exceptions will also have to be made for overflying enemy aircraft
and incursions by enemy units of various types. (Example: If the enemy
drops some paratroopers into your 'rear' area, does it automatically become
a 'front'?)

In extreme situations of mass attack, I could see virtually the entire
play area being designated as a 'front' (imagine the Eastern Front in
WWII), which of course makes your branching problem worse. On the other
hand, attempts to continually minimize the size of the front will cut down
on the branching options, but might result in poor strategic and tactical
choices (i.e., the entire German army focuses on capturing Malta, rather than
overrunning Malta on its way to North Africa).

More brainstorming as I come up with ideas..............

Steven

==============================================================================

In article <3onpj8$lbc@theopolis.orl.mmc.com>,
woodcock@escmail.orl.mmc.com (Steve Woodcock) wrote:

> Andrae Muys (ccamuys@dingo.cc.uq.oz.au) wrote:


> : I am currently trying to write a game that will provide a computer
> : opponent in a computer wargame. I intend eventually to incorporate
> : relatively complex game mechanics such as can be found in commercial table
> : top rules systems. However the current project is nowhere near as
> : extensive, with *extremely* basic rules and mechanics. However, as each
> : unit may still move once each turn, and over a number of distances, the
> : branching factor puts the CHESS/GO thread to shame. In fact a lower
> : bound calculated from rules far simpler than I intend to use ended with a
> : branching factor of 2.6^8. A simple PLY search is therefore out of the
> : question. Brainstorming suggested that abstracting the battlefield
> : into three frontal sections and maybe a reserve leaves a basic set of
> : rules with a branching factor of approx. 16 (much more manageable).
> : However, to implement this the AI needs to be able to recognise what
> : constitutes a flank/centre/rear/front etc... From practical wargaming
> : experience this is something that is normally arrived at by intuition.
> : Surely it is a problem which has been faced before, and I was wondering if
> : there was any theory or code examples which I might use to build
> : such an engine.
> : In the meantime I intend to start on the AI assuming that a way can be
> : found to recognise such strategic dispositions.
>
> : Thanks in Advance.
> : Andrae Muys.
>
> Andrae:
>
>
> Glad to see I'm not the only one wrestling with this problem! ;)

Seems like there's a couple of us....

> Your approach to break things down into 'front', 'flank', and
> 'rear' makes sense and seems like a reasonable simplification of the problem.
> A first-order definition of each might be:

A friend here (richard@cs.su.oz.au) with whom I was discussing it actually came
up with a different solution, based on two tanks on a board moving
pieces about - the game Bolo
(http://www.cs.su.oz.au/~will/bolo/brains.html).

You define your 'center', their 'center' and the board 'center'. Anything
closer to your center than theirs is yours and can be taken. If they take
it back then it takes you less time to re-take than it takes them - if
they bother to take it they lose more time than you.

The idea is to
i) move your center towards the center of the board, then
ii) move your center towards their center, always keeping it between
their center and the center of the board. This should push them off the
board.

The definition of 'center' is tricky. It's more a 'focus' than a
center. At the moment we have a couple we're looking at:

- The modal position of all your pieces. (The mean would be horrible if
you get split in two somehow.)
- The average position of your tank.

There is also one other interesting piece of info we've come across. It

http://www.gamedev.net/reference/articles/article1085.asp (4 of 52) [25/06/2002 2:55:43 PM]


GameDev.net - Recognizing Strategic Dispositions thread

involves finding neighbours.

Most people would say that the 'Y' in the following diagram is not between
the 'X's:

X X Y

Whereas, they would say that the Y is between these two X's:

X X
 Y

The definition we found for 'between' or 'blocking' (which hence lets you
find neighbours - nothing is between them) is as follows:
i) Draw a circle between the two items such that the items sit on either
end of a diameter of the circle.
ii) If there is anything inside that circle, it is between the two items;
otherwise it's not.
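
In code the circle test reduces to one comparison: a point lies inside the circle whose diameter is
the segment XY exactly when it is closer to the segment's midpoint than half the segment's length.
A sketch (the types are invented for illustration):

    #include <cstddef>
    #include <vector>

    struct Vec { float x, y; };

    // True if p lies inside the circle whose diameter runs from a to b --
    // i.e. p is "between" a and b in the sense above, blocking the pair
    // from being neighbours.
    bool Between(Vec a, Vec b, Vec p)
    {
        Vec mid = { (a.x + b.x) * 0.5f, (a.y + b.y) * 0.5f };
        float dx = p.x - mid.x,  dy = p.y - mid.y;     // p relative to centre
        float rx = a.x - mid.x,  ry = a.y - mid.y;     // radius vector
        return dx * dx + dy * dy < rx * rx + ry * ry;  // compare squared distances
    }

    // a and b are neighbours if no other piece is between them.
    bool Neighbours(Vec a, Vec b, const std::vector<Vec>& others)
    {
        for (std::size_t i = 0; i < others.size(); ++i)
            if (Between(a, b, others[i])) return false;
        return true;
    }

Incidentally, this diametral-circle criterion is what computational geometry calls the Gabriel
graph neighbour test.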

We thought of defining front as that section 'between' the two centers.
This doesn't really handle multiple fronts, but then the game only really
has two moving pieces, which makes multiple fronts difficult anyway.

Looking forward to comments,

\x/ill :-}

William Uther will@cs.su.oz.au

==============================================================================

Andrae Muys (ccamuys@dingo.cc.uq.oz.au) wrote:


: I am currently trying to write a game that will provide a computer
: opponent in a computer wargame. I intend eventually to incorporate
: relatively complex game mechanics such as can be found in commercial table
: top rules systems. However the current project is nowhere near as
: extensive, with *extremely* basic rules and mechanics.
<<>>

I thought it might be appropriate if I quickly outlined the game
mechanics I intend on using for an *extremely* basic game.
I have decided to create a Napoleonic era game. The reasons for this are:
a) Personal knowledge, and possession of a number of rule systems for the
era which may be adapted for computer play.
b) The rigid formations and formal tactics of the era should make the
pattern recognition easier.
c) The principles of combined arms are easy to grasp and formalise
heuristically in this era.
d) Only a small number of troop types have to be provided to make a
reasonable game.

The rules which I intend on implementing to date consist of three troop types:
INF: slow, strong, short range;
CAV: fast, strong, zero range;
ART: slow, weak, long range;
and three extra terrain types (other than clear):
WOODS: defence from melee and ranged attacks, slows movement;
ROUGH (HILLS): defence from melee, slows movement, aids attack;
ROAD: no defence, faster movement.

Of course the principles of pattern recognition and strategic dispositions
apply to any game in any era, and also to more abstract forms of the
wargame such as CHESS, GO, and CHECKERS. I am using these rules because
I believe they provide the basic elements of wargames, and are necessary
and sufficient for the application of traditional military strategy and
tactics. Therefore if I can produce a computer player which successfully
plays a sound strategic game and which has a 'grasp' of tactics, it
should be able to be applied to any Napoleonic rule system with little
modification, and to other eras without too much difficulty.

Andrae.

==============================================================================

Steve Woodcock (woodcock@escmail.orl.mmc.com) wrote:

: Andrae Muys (ccamuys@dingo.cc.uq.oz.au) wrote:
: : I am currently trying to write a game that will provide a computer
: : opponent in a computer wargame.

: Your approach of breaking things down into 'front', 'flank', and
: 'rear' makes sense and seems like a reasonable simplification of the problem.
: A first-order definition of each might be:

: front -- Where the mass of the enemy units are known to be. The
:          direction I want to attack.

: flank -- Any area in which there are fewer (1/3?) as many enemy
:          units as the area designated as 'front'. (Note this is
:          purely arbitrary, based as much on prior experience as
:          anything else.) Perhaps also selected based on natural
:          defensive terrain (i.e., oceans or mountains).

: rear  -- Any area in which no (known) enemy units are operating,
:          or an area completely surrounded and controlled by me.

: These definitions work by identifying the front, then extrapolating
: from that. As enemy units move around, become detected, attack, etc.,
: the 'size' of the front will likely grow and shrink, forcing similar
: changes to the flanks (especially) and perhaps to the rear areas as well.
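
A first cut at these definitions could classify map sectors by counting
known enemy units in each one. A sketch, where the sector abstraction
and the 1/3 threshold are the arbitrary choices described above:

    def classify_sectors(enemy_counts, flank_ratio=1/3):
        # enemy_counts maps a sector id to the number of known enemy units.
        # Densest sector(s) are 'front', thinly held ones 'flank', empty 'rear'.
        front_size = max(enemy_counts.values())
        labels = {}
        for sector, n in enemy_counts.items():
            if n == 0:
                labels[sector] = 'rear'
            elif n >= flank_ratio * front_size:
                labels[sector] = 'front'
            else:
                labels[sector] = 'flank'
        return labels

    # e.g. classify_sectors({'north': 12, 'east': 3, 'south': 0})
    #      -> {'north': 'front', 'east': 'flank', 'south': 'rear'}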

Identifying the front first and then defining the rest w.r.t. it would
seem to simplify the problem further. I hadn't thought of that; it looks
like a good idea. However, one question to contemplate: where are the
fronts in the following position? {X - YOURS, Y - THEIRS}

           YY        Now by any standards X is in a bad way. It has been
          Y          completely outflanked and his left flank is already
 XXX    YY           overrun. Intuitively his front is now perpendicular
 XX XXX XX XY Y      to Y's. I think we may need a concept such as
 X X      Y          contact point, which in this case is Y's centre, and
          Y          X's left flank. Naturally in most battles there would
         YYY         be multiple contact points. Personally I would draw the
          Y          fronts as follows.

              |
              |      What do you think?
      --------C|
              |
              |

: One problem I can think of off the top of my head is how to handle
: multiple front situations; there's at least some possibility of overlapping
: definitions, meaning that some precedence order must be established.
: Special exceptions will also have to be made for overflying enemy aircraft
: and incursions by enemy units of various types. (Example: If the enemy
: drops some paratroopers into your 'rear' area, does it automatically become
: a 'front'?)

This is why I am using a very basic set of game mechanics, and using a
different era (see other post). This way the only way troops can reach
your rear is to march there. Also there are very few battles in this era
with multiple fronts, although allowance must be made for bent and
twisted fronts, the hinge being a very critical point in an extended line.

: In extreme situations of mass attack, I could see virtually the entire
: play area being designated as a 'front' (imagine the Eastern Front in
: WWII), which of course makes your branching problem worse. On the other
: hand, attempts to continually minimize the size of the front will cut down
: on the branching options, but might result in poor strategic and tactical
: choices (i.e., the entire German army focuses on capturing Malta, rather than
: overrunning Malta on its way to North Africa).

In the rules I have in mind, in most cases you will only have mass
attacks, or at least dense fronts. One problem you do have if you try to
model a high-echelon game such as the Eastern Front (WWII) is what
happened next. The Russian front fragmented, and from one dense front
you ended up with hundreds of small localised fronts, the resulting loss
of cohesion being one of the greatest advantages of blitzkrieg. Because
cohesion is so much more important at a grand strategic level (not that
it isn't in strategies at an operational/tactical level) I feel that a
search for a front may be counterproductive. My gut feeling is that it
would be better to consider area controlled by your forces, controlled
by their forces, and contested, with an emphasis on your forces
maintaining unbroken contact between spheres of influence. So the
insertion of forces 'behind the lines' would only alter the balance of
control in the local area. A domino effect would be possible, where
strategically inserted forces would weaken a unit's control of an area,
weakening a unit relying on it for its 'cohesive link', weakening its
control of another area, and so on. However, this is what happens in
real life, so if anything it suggests that it may be a good approach.

: More brainstorming as I come up with ideas..............

ditto.

: Steven

Andrae

==============================================================================

On 11 May 1995, Andrae Muys wrote:


> one of the greatest advantages of blitzkrieg. Because cohesion is so
> much more important at a grand strategic level (not that it isn't in
> strategies at an operational/tactical level) I feel that a search for a
> front may be counterproductive. My gut feeling is that it would be
> better to consider area controlled by your forces, controlled by their
> forces, and contested, with an emphasis on your forces maintaining
> unbroken contact between spheres of influence. So the insertion of
> forces 'behind the lines' would only alter the balance of control in the
> local area. A domino effect would be possible, where strategically
> inserted forces would weaken a unit's control of an area, weakening a
> unit relying on it for its 'cohesive link', weakening its control of
> another area, and so on. However, this is what happens in real life, so
> if anything it suggests that it may be a good approach.

In real life, I would imagine one of the main targets in any campaign to
be supply lines. For example, The Dambusters is a movie about some
special bombers with special bombs designed to destroy dams, with the aim
of crippling Germany's iron/steel industry. General Custer was in trouble
because he was surrounded and cut off from supplies and reinforcements
(yes, my knowledge is very sketchy).

Another approach to defining a front is that it is where you want it to
be! Perhaps call your front where you are trying to hold back/push back
enemy forces. Incursions don't necessarily happen on a front - you may
quick-march or drop forces into enemy territory next to a vital supply
line, or sneak in saboteurs to destroy strategic bridges ("A Bridge Too
Far").

I heard someone claim once that war is about economic ruin rather than
outright carnage. Is there any way your AI can calculate the move that
will cause most damage to industry and support, rather than shoot the
most enemy? Of course, these strategies apply to wars, not battles...

Just a thought...
-Alex

==============================================================================

Satrapa / Alexander Marc (ISE) (u903022@student.canberra.edu.au) wrote:

: On 11 May 1995, Andrae Muys wrote:
<<>>
: > front may be counterproductive. My gut feeling is that it would be
: > better to consider area controlled by your forces, controlled by their
: > forces, and contested, with an emphasis on your forces maintaining
: > unbroken contact between spheres of influence. So the insertion of

<<>>

: In real life, I would imagine one of the main targets in any campaign to
: be supply lines. For example, The Dambusters is a movie about some
: special bombers with special bombs designed to destroy dams, with the aim
: of crippling Germany's iron/steel industry. General Custer was in trouble
: because he was surrounded and cut off from supplies and reinforcements
: (yes, my knowledge is very sketchy).

Yes, you are right: one of the major considerations at a strategic level
is supply; how do I attack yours, how do I protect mine? One point
concerning General Custer, however: his problem wasn't so much that his
supply lines were cut as that he was surrounded with no avenue of
retreat. This is a position which is so inherently poor that any AI
should automatically avoid it without any requirement for a 'special case'.

: Another approach to defining a front is that it is where you want it to
: be! Perhaps call your front where you are trying to hold back/push back
: enemy forces. Incursions don't necessarily happen on a front - you may
: quick-march or drop forces into enemy territory next to a vital supply
: line, or sneak in saboteurs to destroy strategic bridges ("A Bridge Too
: Far").

With the game mechanics we have been considering of late, the AI won't
have to be concerned with most of these problems. I personally can't see
how defining a front 'where you want it to be' is useful, although this
is probably more me not thinking it through properly than a problem with
the idea. What do you mean by it, and is it in any way related to the
concept of critical point/contact point currently being discussed?

: I heard someone claim once that war is about economic ruin rather than
: outright carnage. Is there any way your AI can calculate the move that
: will cause most damage to industry and support, rather than shoot the
: most enemy? Of course, these strategies apply to wars, not battles...

Personally I prefer Sun Tzu's philosophy. Basically it holds that to win
without fighting is best, and the aim of war is to capture territory
without damaging it.

BTW: does anyone know if there is a e-text version of The Art of War
anywhere?

Andrae.

==============================================================================

Andrae Muys (ccamuys@dingo.cc.uq.oz.au) wrote:

: With the game mechanics we have been considering of late, the AI won't
: have to be concerned with most of these problems. I personally can't see
: how defining a front 'where you want it to be' is useful, although this
: is probably more me not thinking it through properly than a problem with
: the idea. What do you mean by it, and is it in any way related to the
: concept of critical point/contact point currently being discussed?

That's right; at the moment we're sort of focusing on a roughly
Napoleonic-era level of combat for the sake of simplicity. Paratroopers
will be added in AI v. 2.0. ;)

: : I heard someone claim once that war is about economic ruin rather than
: : outright carnage. Is there any way your AI can calculate the move that
: : will cause most damage to industry and support, rather than shoot the
: : most enemy? Of course, these strategies apply to wars, not battles...
: Personally I prefer Sun Tzu's philosophy. Basically it holds that to win
: without fighting is best, and the aim of war is to capture territory
: without damaging it.

If we hold to the concept of specifying various objectives (as discussed
way back at the start of this thread), then I would think moves designed
to inflict economic damage would flow naturally out of that. Oil wells,
ports, rail lines, etc. would all be natural objectives, and as the AI
considers its moves and attempts to seize objectives they would naturally
be overrun.

: BTW: does anyone know if there is a e-text version of The Art of War
: anywhere?

Good question...I'd like to know myself.

Steven

==============================================================================

ccamuys@dingo.cc.uq.oz.au (Andrae Muys) writes:


>Personally I prefer Sun Tzu's philosophy. Basically it holds that to win
>without fighting is best, and the aim of war is to capture territory
>without damaging it.
>BTW: does anyone know if there is a e-text version of The Art of War
>anywhere?
>
>Andrae.

http://timpwrmac.clh.icnet.uk/Docs/suntzu/szcontents.html
for the Art of War (not the 1960s translation, an older one), and

http://fermi.clas.virginia.edu/~gl8f/paradoxes.html
for George Silver's Paradoxes of Defence, which is probably
in a similar vein, but I have not got around to reading it yet.

Anybody who does any type of strategic or tactical stuff should
read at least the Art of War; very good stuff indeed.

==============================================================================

William Uther (will@cs.su.oz.au) wrote:

<<>>

: There is also one other interesting piece of info we've come across. It
: involves finding neighbours.

: Most people would say that the 'Y' in the following diagram is not between
: the 'X's:

: X       X


:     Y

However Y is definitely influencing the connection between the X's. The
right X should be able to support the left; however, it is possible that
the reverse is impossible. Of course terrain could modify this.

: Whereas, they would say that the Y is between these two X's:

: X       X
:     Y

Here I would consider Y to have interdicted the link; or the X's are
still neighbours, but they have outflanked, maybe even overrun, Y.

: The definition we found for 'between' or 'blocking' (and hence for
: 'neighbour': nothing is between) is as follows:
: i) Draw a circle such that the two items sit at either end of a
: diameter of the circle.
: ii) If there is anything inside that circle, it is between the two
: items; otherwise it's not.

An alternative (I haven't played Bolo so I don't know if it's
appropriate) would be to draw a circle of radius Y's maximum EFFECTIVE
range (you decide what that is); if it cuts a line drawn between the two
X's, the link is cut.
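
This alternative reduces to a point-to-segment distance test: the X1-X2
link is cut when Y lies within its effective range of the segment joining
the two X's. A sketch, again assuming 2D tuples:

    import math

    def link_is_cut(x1, x2, y, effective_range):
        # True if y is within effective_range of the segment x1-x2.
        (ax, ay), (bx, by), (px, py) = x1, x2, y
        dx, dy = bx - ax, by - ay
        seg_len_sq = dx * dx + dy * dy
        if seg_len_sq == 0:              # degenerate case: the X's coincide
            return math.dist(x1, y) <= effective_range
        # Project y onto the segment, clamping to the endpoints.
        t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
        closest = (ax + t * dx, ay + t * dy)
        return math.dist(closest, y) <= effective_range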

: We thought of defining the front as that section 'between' the two
: centers. This doesn't really handle multiple fronts, but then the game
: only really has two moving pieces, which makes multiple fronts difficult
: anyway.

Andrae.
==============================================================================

Will Uther (will@cs.su.oz.au) wrote:

: In article <3onpj8$lbc@theopolis.orl.mmc.com>,

: A friend here (richard@cs.su.oz.au) I was discussing it with actually came
: up with a different solution (based on two tanks on a board moving
: pieces about - the game Bolo:
: http://www.cs.su.oz.au/~will/bolo/brains.html)

: You define your 'center', their 'center' and the board 'center'. Anything
: closer to your center than theirs is yours and can be taken. If they take
: it back then it takes you less time to re-take than it takes them - if
: they bother to take it they lose more time than you.

: The idea is to
: i) move your center towards the center of the board
: then ii) move your center towards their center, always keeping it between
: their center and the center of the board. This should push them off the
: board.

: The definition of 'center' is tricky. It's more a 'focus' than a
: center. At the moment we have a couple we're looking at:

: - The modal position of all your pieces. (The mean would be horrible if
:   you get split in two somehow.)
: - The average position of your tank

: There is also one other interesting piece of info we've come across. It
: involves finding neighbours.

: Most people would say that the 'Y' in the following diagram is not between
: the 'X's:

: X       X


:     Y

: Whereas, they would say that the Y is between these two X's:

: X       X
:     Y

: The definition we found for 'between' or 'blocking' (and hence for
: 'neighbour': nothing is between) is as follows:
: i) Draw a circle such that the two items sit at either end of a
: diameter of the circle.
: ii) If there is anything inside that circle, it is between the two
: items; otherwise it's not.

: We thought of defining the front as that section 'between' the two
: centers. This doesn't really handle multiple fronts, but then the game
: only really has two moving pieces, which makes multiple fronts difficult
: anyway.

: Looking forward to comments,

: \x/ill :-}

Hmmm...this does have some merit to it. I like the idea of the 'center'
being arrived at via this circle method; it has an elegance to it that is
also somewhat intuitive.

The only potential problem I can see with this approach is that it will
tend towards great massive brute force engagements by migrating the bulk
of both forces towards a common 'center'. This is fine in the case of
two combatants (i.e., two Bolo tanks) but not so good for two armies.

I think we could solve the multiple front problem if we generalized the
problem to find SEVERAL 'localized centers', thus allowing for multiple axes
of advance along a somewhat more fluid 'front'. In the case of two armies,
you might get something like this:

x x x

y1 y2 y3 y4

x x x

In this case, each of the x-x pairs makes INDEPENDENT determinations
of what lies 'between' them. Then, based on the relative combat strengths
and other factors, you could issue separate orders for each section of
the battlefield. This effectively sets up a variety of 'mini-centers'
(using our terminology from above) and more realistically (IMHO) emulates
real-world operations (i.e., lots of mini-objectives, the possibility for
overlapping objectives, etc.).

Opinions? Comments?

Steven

==============================================================================

>"Satrapa / Alexander Marc (ISE)" wrote:


>
> On Wed, 10 May 1995, Will Uther wrote:
> [bigsnip]
> > you define your 'center', their 'center' and the board 'center'. Anything
> > closer to your center than theirs is yours and can be taken. If they take
> [snip]
> > The idea is to
> > i) move your center towards the center of the board
> > then ii) move your center towards their center, always keeping it between
> > their center and the center of the board. This should push them off the
> > board.
>
> One problem - any enemy who knows this strategy may use it against you
> (and therefore both sides end up fighting over the center while the edge
> of the board is unused), or may act like water and flow - so you get
> through to their centre, but find that the edges of the board have
> suddenly caved in on you. So their center is your center, but their
> distribution is wider than yours... while you're attacking them over
> here, they're stealing your bases/terrain over there.

In the type of combat I was thinking of, you each have only one active
piece that moves around the board moving the other 'static' pieces. If
Player A surrounds Player B, but Player B has supplies 'inside', then
Player B has the advantage. E.g.:

1           2

   3     4

   Supplies

   5     6

7           8

Assume that 1, 2, 7 and 8 are owned by A and effectively 'surround' B, who
owns 3, 4, 5 and 6 (and some supplies that mean he never has to leave his
fort). B can attack 1. When A moves there to defend, B can break off and
attack 8. A has a long way to go to get to 8 (A has to go right around
whereas B can go through the centre) and so will probably lose that
piece. Being more spread out than the opposition can be a big problem.

I agree that this is probably not the type of combat you were thinking
about. For an example of this type of combat, look at the Mac game Bolo.

\x/ill :-}

P.S. As a side note, Sun Tzu (sp?) in his 'Art of War' recommends against
sieges, which is effectively the situation you have above.

William Uther will@cs.su.oz.au

==============================================================================

Steve Woodcock wrote:


>
> Hmmm...this does have some merit to it. I like the idea of the 'center'
>being arrived at via this circle-method; it has an elegance to it that also
>is somewhat intuitive.
>
> The only potential problem I can see with this approach is that it will
>tend towards great massive brute force engagements by migrating the bulk
>of both forces towards a common 'center'. This is fine in the case of
>two combatants (i.e., two Bolo tanks) but not so good for two armies.

Actually, Bolo can be played by as many as 16 players, with as many
as 16 different sides. But most people play against bots only when
they cannot get any human competition, and so play against only one.

> I think we could solve the multiple front problem if we generalized the
>problem to find SEVERAL 'localized centers', thus allowing for multiple axes
>of advance along a somewhat more fluid 'front'. In the case of two armies,
>you might get something like this:
>
>
> x x x
>
> y1 y2 y3 y4
>
> x x x
>
> In this case, each of the x-x pairs make INDEPENDENT determinations
>of what lies 'between' them. Then, based on the relative combat strengths
>and other factors, you could issue separate orders for each section of
>the battlefield. This effectively sets up a variety of 'mini-centers'
>(using our terminology from above) and more realistically (IMHO) emulates
>realworld operations (i.e., lots of mini-objectives, the possibility for
>overlapping objectives, etc.).

Hmmm...

I assume that this implies a single master strategist giving orders
to the units. I would think that the master should group neighboring
units as one, in order to save time when calculating the following:
calculate a center between every piece and every other. When this is
all done, choose the minimum number of centers which take care of the
maximum number of enemies. This should allow certain interesting
effects over varying terrain, if said terrain is taken into account as
it should be.

Otherwise, it sounds pretty good. If I ever get my set of Bolo
brains working, I'll turn 'em loose and see what happens.

Robert Uhl

==============================================================================

First off, let me just say that I think this is a *great* thread, easily
one of the more interesting I've seen in this newsgroup. *This* is the kind
of brainstorming the Net was made for....

Andrae Muys (ccamuys@dingo.cc.uq.oz.au) wrote:

: Identifying the front first and then defining the rest w.r.t. it would
: seem to simplify the problem further. I hadn't thought of that; it looks
: like a good idea. However, one question to contemplate: where are the
: fronts in the following position? {X - YOURS, Y - THEIRS}

:            YY        Now by any standards X is in a bad way. It has been
:           Y          completely outflanked and his left flank is already
:  XXX    YY           overrun. Intuitively his front is now perpendicular
:  XX XXX XX XY Y      to Y's. I think we may need a concept such as
:  X X      Y          contact point, which in this case is Y's centre, and
:           Y          X's left flank. Naturally in most battles there would
:          YYY         be multiple contact points. Personally I would draw the
:           Y          fronts as follows.
:
:               |
:               |      What do you think?
:       --------C|
:               |
:               |

I would agree with your assessment of the situation and your breakdown
of the forces into a front. Obviously in this case, X either a.) needs
to rapidly execute a turn of his front or b.) is in the midst of a brilliant
plan that will prove to be Y's undoing. (The challenge, of course, is to
get a computer AI to execute 'b' more often than 'a'.)

If we define a contact point, then does that give us a natural focus
towards which to direct our forces and our strategic 'thinking'? It
would seem to.

: : One problem I can think of off the top of my head is how to handle
: : multiple front situations; there's at least some possibility of overlapping
: : definitions, meaning that some precedence order must be established.
: : Special exceptions will also have to be made for overflying enemy aircraft
: : and incursions by enemy units of various types. (Example: If the enemy
: : drops some paratroopers into your 'rear' area, does it automatically become
: : a 'front'?)
: This is why I am using a very basic set of game mechanics, and using a
: different era (see other post). This way the only way troops can reach
: your rear is to march there. Also there are very few battles in this era
: with multiple fronts, although allowance must be made for bent and
: twisted fronts, the hinge being a very critical point in an extended line.

Okay; let's go with that simplification for now. It'll certainly make this
easier to think about, and we can always make the AI smarter in Rev 2.0! ;)

: In the rules I have in mind, in most cases you will only have mass
: attacks, or at least dense fronts. One problem you do have if you try to
: model a high-echelon game such as the Eastern Front (WWII) is what
: happened next. The Russian front fragmented, and from one dense front
: you ended up with hundreds of small localised fronts, the resulting loss
: of cohesion being one of the greatest advantages of blitzkrieg. Because
: cohesion is so much more important at a grand strategic level (not that
: it isn't in strategies at an operational/tactical level) I feel that a
: search for a front may be counterproductive. My gut feeling is that it
: would be better to consider area controlled by your forces, controlled
: by their forces, and contested, with an emphasis on your forces
: maintaining unbroken contact between spheres of influence. So the
: insertion of forces 'behind the lines' would only alter the balance of
: control in the local area. A domino effect would be possible, where
: strategically inserted forces would weaken a unit's control of an area,
: weakening a unit relying on it for its 'cohesive link', weakening its
: control of another area, and so on. However, this is what happens in
: real life, so if anything it suggests that it may be a good approach.

Okay then, fronts are out. Spheres of influence are in. They do seem
to better reflect the 'domino effect', as you suggest.

If we use your previous suggestion for identifying centers, combined with
the above-mentioned contact points, then this may lead us towards a more
natural way of handling the above situation. Based on what we've discussed
so far, I would envision an AI's logic train going something like this:

Pass #1: Identify natural 'groups' of X-Y forces using the
'circle method' discussed earlier, perhaps taking into account
the possibilities of influence and interdiction as a previous
poster suggested.

Pass #2: Having identified these natural groupings, identify
contact points amongst the forces within each group. These
will serve as natural foci for our planning process.

Pass #3: Having identified natural groupings and focus points,
we now begin thinking about steps needed to link up our groups,
minimize the size of enemy-held areas, eliminate enemy units,
etc.
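
A rough, self-contained sketch of that three-pass train, reusing the
circle test from earlier in the thread; the pairing, focus and ordering
rules here are crude stand-ins chosen only to make the pipeline concrete:

    import math

    def is_between(a, b, p):
        # The thread's circle test: p inside the circle with diameter a-b.
        mid = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
        return math.dist(mid, p) < math.dist(a, b) / 2

    def plan_turn(mine, theirs):
        everyone = mine + theirs

        # Pass 1: natural groupings -- pair each of my units with the enemy
        # units it 'neighbours' (no third unit in between).
        groups = {m: [e for e in theirs
                      if not any(is_between(m, e, p) for p in everyone
                                 if p is not m and p is not e)]
                  for m in mine}

        # Pass 2: contact points -- the midpoint between a unit and each
        # neighbouring enemy serves as a planning focus.
        foci = {m: [((m[0] + e[0]) / 2, (m[1] + e[1]) / 2) for e in es]
                for m, es in groups.items()}

        # Pass 3: crude orders -- each unit heads for its nearest focus;
        # unengaged units head for any existing focus at all.
        all_foci = [f for fs in foci.values() for f in fs] or [(0.0, 0.0)]
        return {m: min(foci[m] or all_foci, key=lambda f: math.dist(m, f))
                for m in mine}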

One thing we do want to avoid, of course, is TOO much reliance on
maintaining unbroken contact between spheres of influence. As you know,
in the Napoleonic era cavalry forces on the same side were often miles
apart and in some cases didn't even know of each other's existence! In this
case the AI would need to know to treat the two forces independently and
NOT make linking them a high priority.

Steven

==============================================================================

Steve Woodcock (woodcock@escmail.orl.mmc.com) wrote:


: Andrae Muys (ccamuys@dingo.cc.uq.oz.au) wrote:

: :            YY        Now by any standards X is in a bad way. It has been
: :           Y          completely outflanked and his left flank is already
: :  XXX    YY           overrun. Intuitively his front is now perpendicular
: :  XX XXX XX XY Y      to Y's. I think we may need a concept such as
: :  X X      Y          contact point, which in this case is Y's centre, and
: :           Y          X's left flank. Naturally in most battles there would
: :          YYY         be multiple contact points. Personally I would draw the
: :           Y          fronts as follows.
: :
: :               |
: :               |      What do you think?
: :       --------C|
: :               |
: :               |

: I would agree with your assessment of the situation and your breakdown
: of the forces into a front. Obviously in this case, X either a.) needs
: to rapidly execute a turn of his front or b.) is in the midst of a brilliant
: plan that will prove to be Y's undoing. (The challenge, of course, is to
: get a computer AI to execute 'b' more often than 'a'.)

Of course the ultimate AI wouldn't find itself in such a dangerous
position, a bit like the ultimate General wouldn't. But if it does, it
must extricate itself well. Just one more 'special' situation to test
the AI's ability.

: If we define a contact point, then does that give us a natural focus
: towards which to direct our forces and our strategic 'thinking'? It
: would seem to.

Well, this thread is useful. The idea of contact points should radically
prune any decision tree. (And sooner or later the AI will have to make a
choice.) Of course at the strategic/grand strategic levels we may need a
modified definition of contact point, but at the level I am interested in,
contact points appear to be a good way to look at things. In fact, now
that I think about it, contact points are how **I** allocate **MY**
consideration. This approach, however, leads us to consider how to
recognise potential contact points and how to evaluate the relative
benefits of creating/avoiding specific contact points. E.g. in the
example above X should avoid contact with Y until he has rotated his
front. (It looks like we may still need to consider fronts as well.)

: : This is why I am using a very basic set of game mechanics, and using a
: : different era (see other post). This way the only way troops can reach
: : your rear is to march there. Also there are very few battles in this era
: : with multiple fronts, although allowance must be made for bent and
: : twisted fronts, the hinge being a very critical point in an extended line.
: Okay; let's go with that simplification for now. It'll certainly make this
: easier to think about, and we can always make the AI smarter in Rev 2.0! ;)

My thoughts exactly.

<<<>>>
: : front may be counterproductive. My gut feeling is that it would be
: : better to consider area controlled by your forces, controlled by their
: : forces, and contested, with an emphasis on your forces maintaining
: : unbroken contact between spheres of influence. So the insertion of
: : forces 'behind the lines' would only alter the balance of control in the
: : local area. A domino effect would be possible, where strategically
: : inserted forces would weaken a unit's control of an area, weakening a
: : unit relying on it for its 'cohesive link', weakening its control of
: : another area, and so on. However, this is what happens in real life, so
: : if anything it suggests that it may be a good approach.

: Okay then, fronts are out. Spheres of influence are in. They do seem
: to better reflect the 'domino effect', as you suggest.

However, as discussed above, at a lower level fronts again become
important. IMHO this is because tactical considerations require physical
cohesion, while strategic levels utilise logical cohesion. As you later
note, cavalry detachments regularly operated independently of other
units. However (excluding irregulars) many or most operated in
conjunction with armies; their presence in a particular area important
to enemy forces was deliberate and planned, even though they operated
independently.

: If we use your previous suggestion for identifying centers, combined with
: the above-mentioned contact points, then this may lead us towards a more
: natural way of handling the above situation. Based on what we've discussed
: so far, I would envision an AI's logic train going something like this:

: Pass #1: Identify natural 'groups' of X-Y forces using the
: 'circle method' discussed earlier, perhaps taking into account
: the possibilities of influence and interdiction as a previous
: poster suggested.

: Pass #2: Having identified these natural groupings, identify
: contact points amongst the forces within each group. These
: will serve as natural foci for our planning process.

: Pass #3: Having identified natural groupings and focus points,
: we now begin thinking about steps needed to link up our groups,
: minimize the size of enemy-held areas, eliminate enemy units,
: etc.
This sounds like it should work reasonably well; however, I have a feeling
there may be problems with the way it handles many sparse strategic
situations. I'll think about it and we can discuss it further when I have
clarified my concerns.

: One thing we do want to avoid, of course, is TOO much reliance on
: maintaining unbroken contact between spheres of influence. As you know,
: in the Napoleonic era cavalry forces on the same side were often miles
: apart and in some cases didn't even know of each other's existence! In this
: case the AI would need to know to treat the two forces independently and
: NOT make linking them a high priority.

One more thing about cavalry is that the other arms (INF/ART) can't
operate this way, as it requires a level of speed they lack.

One thing I am considering is cross-posting some questions regarding
real-life treatment of these concerns to rec.games.miniatures after they
finish their reorganisation.

Andrae.

==============================================================================

Steve Woodcock wrote:


>
>Andrae Muys (ccamuys@dingo.cc.uq.oz.au) wrote:
>
>: Identifying the front first and then defining the rest w.r.t. it would
>: seem to simplify the problem further. I hadn't thought of that; it looks
>: like a good idea. However, one question to contemplate: where are the
>: fronts in the following position? {X - YOURS, Y - THEIRS}
>
>:            YY        Now by any standards X is in a bad way. It has been
>:           Y          completely outflanked and his left flank is already
>:  XXX    YY           overrun. Intuitively his front is now perpendicular
>:  XX XXX XX XY Y      to Y's. I think we may need a concept such as
>:  X X      Y          contact point, which in this case is Y's centre, and
>:           Y          X's left flank. Naturally in most battles there would
>:          YYY         be multiple contact points. Personally I would draw the
>:           Y          fronts as follows.
>:
>:               |
>:               |      What do you think?
>:       --------C|
>:               |
>:               |
>
>
>
> I would agree with your assessment of the situation and your breakdown
>of the forces into a front. Obviously in this case, X either a.) needs
>to rapidly execute a turn of his front or b.) is in the midst of a brilliant
>plan that will prove to be Y's undoing. (The challenge, of course, is to
>get a computer AI to execute 'b' more often than 'a'.)

Heh. Actually, if X is driving through Y, in order to create two
fronts along which Y must fight, thereby forcing Y to regroup, then
'b' has been executed.

> If we define a contact point, then does that give us a natural focus
>towards which to direct our forces and our strategic 'thinking'? They would
>seem to.

Yes. The AI should attempt to create as many contact points as
possible. Why? These points mean that he is actually fighting the
enemy and hopefully killing the same. That is why two parallel lines
are a good formation. With a good lookahead, though, the T formation
is also good, because it will enable many points of contact _over
time_, and hurt Y's ability to create them. Part of the battle is
controlling contact. In fact, if the AI can do this controlling, then
it is my belief that it will have an extremely good chance at winning.

> Okay then, fronts are out. Spheres of influence are in. They do seem
>to better reflect the 'domino effect', as you suggest.

The sphere of influence idea seems to work well with the points of
contact idea. Perhaps each unit has a sphere wherein it can contact
within a certain number of moves (say 3), and its job is to contact
the fewest enemies at once but the most over time. IOW, it doesn't
want to be outnumbered but it wants to see action.

And two units have a greater sphere of influence, say 8 moves, than
just one. This would help reflect the greater power of two. Controlled
territory would be defined as that surrounded by my own units and
without enemy units, once again utilizing the SoIs. Contested would,
of course, be that which neither side controls. Perhaps a 'strength'
value would be attached to areas, indicating the number of units
controlling it, time for more to get there, &c. This would provide an
incentive for the AI to encircle enemy territory.
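
A sketch of that controlled/contested bookkeeping on a square grid, where
the sphere of influence is simply 'reachable within N moves' (Chebyshev
distance standing in for real movement rules, and the reach value an
assumption):

    def control_map(width, height, mine, theirs, reach=3):
        # A cell is 'mine' if only my units can reach it within `reach`
        # moves, 'theirs' if only enemy units can, 'contested' if both,
        # and 'open' if neither.
        def covered(units, x, y):
            return any(max(abs(x - ux), abs(y - uy)) <= reach
                       for ux, uy in units)

        grid = {}
        for x in range(width):
            for y in range(height):
                m, t = covered(mine, x, y), covered(theirs, x, y)
                grid[(x, y)] = ('contested' if m and t else
                                'mine' if m else
                                'theirs' if t else 'open')
        return grid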

Robert Uhl

==============================================================================

Robert A. Uhl (ruhl@phoebe.cair.du.edu) wrote:

: Heh. Actually, if X is driving through Y, in order to create two
: fronts along which Y must fight, thereby forcing Y to regroup, then
: 'b' has been executed.

Good point. Yet another example of how a human can spot an opportunity
in the face of the most daunting situations....

: > If we define a contact point, then does that give us a natural focus
: >towards which to direct our forces and our strategic 'thinking'? They would
: >seem to.

: Yes. The AI should attempt to create as many contact points as
: possible. Why? These points mean that he is actually fighting the
: enemy and hopefully killing the same. That is why two parallel lines
: are a good formation. With a good lookahead, though, the T formation
: is also good, because it will enable many points of contact _over
: time_, and hurt Y's ability to create them. Part of the battle is
: controlling contact. In fact, if the AI can do this controlling, then
: it is my belief that it will have an extremely good chance at winning.

I'm not sure I agree that I would want the AI to maximize its number
of contacts with the enemy; I would agree that it should seek to control
them.

Maximization in itself will only lead to toe-to-toe WWI slugfests, and
basically leads to the AI playing a war of attrition. That's perhaps one
of the defining characteristics of most AIs today--if they don't cheat
somehow, then they tend to fight wars of attrition.

: The sphere of influence idea seems to work well with the points of
: contact idea. Perhaps each unit has a sphere wherein it can contact
: within a certain number fo moves (say 3), and its job is to contact
: the fewest enemies at once but the most over time. IOW, it doesn't
: want to be outnumbered but it wants to see action.

Yes indeed. That's sort of where we were headed, I believe.

I do like the idea of factoring time into the equation somehow; the AI
ought to be willing to have NO enemy contact for two turns if it's busy
moving forces around for MAJOR enemy contact (say, the Normandy invasion)
on the third turn. That does make sense intuitively. Perhaps we can tie
time-weighted values to multi-turn engagement decisions?

: And two units have a greater sphere of influence, say 8 moves, than
: just one. This would help reflect the greater power of two. Controlled
: territory would be defined as that surrounded by my own units and
: without enemy units, once again utilizing the SoIs. Contested would,
: of course, be that which neither side controls. Perhaps a 'strength'
: value would be attached to areas, indicating the number of units
: controlling it, time for more to get there &c. Thiswould provide an
: incentive for the AI to encircle enemy territory.

Agreed. This is very similar to the idea of using a 'fire distribution'
map that was presented by Daniele, actually, just a different method of
solving the same problem. We'll want to do SOMETHING like this in order
to properly consider the limiting effects of enemy weaponry and the
local terrain.
Steven
==============================================================================

On 15 May 1995, Andrae Muys wrote:

> Steve Woodcock (woodcock@escmail.orl.mmc.com) wrote:

> : Andrae Muys (ccamuys@dingo.cc.uq.oz.au) wrote:
>
> : :            YY        Now by any standards X is in a bad way. It has been
> : :           Y          completely outflanked and his left flank is already
> : :  XXX    YY           overrun. Intuitively his front is now perpendicular
> : :  XX XXX XX XY Y      to Y's. I think we may need a concept such as
> : :  X X      Y          contact point, which in this case is Y's centre, and
> : :           Y          X's left flank. Naturally in most battles there would
> : :          YYY         be multiple contact points. Personally I would draw the
> : :           Y          fronts as follows.
[snip]
> benefits of creating/avoiding specific contact points. E.g. in the
> example above X should avoid contact with Y until he has rotated his
> front. (It looks like we may still need to consider fronts as well.)

Two options I can see here - either X moves its forces into Y's "front"
to create as much damage as possible (in a concentrated strike or
"blitzkrieg"-style attack) or X moves its "front" back, forcing Y to make
the next move (allowing X the advantage in defence).

Is X aggressive or defensive? Y's forces are rather spread out, so X can
"spear" through the front and attack from both sides (like an inverse of
the "pincer" movement).

Just a thought...
-Alex
==============================================================================

I apologise for the excessive quoting, but without the diagram any reply
is awkward.

Satrapa / Alexander Marc (ISE) (u903022@student.canberra.edu.au) wrote:

: On 15 May 1995, Andrae Muys wrote:

: > Steve Woodcock (woodcock@escmail.orl.mmc.com) wrote:

: > : Andrae Muys (ccamuys@dingo.cc.uq.oz.au) wrote:
: >
: > : :            YY        Now by any standards X is in a bad way. It has been
: > : :           Y          completely outflanked and his left flank is already
: > : :  XXX    YY           overrun. Intuitively his front is now perpendicular
: > : :  XX XXX XX XY Y      to Y's. I think we may need a concept such as
: > : :  X X      Y          contact point, which in this case is Y's centre, and
: > : :           Y          X's left flank. Naturally in most battles there would
: > : :          YYY         be multiple contact points. Personally I would draw the
: > : :           Y          fronts as follows.
: [snip]
: > benefits of creating/avoiding specific contact points. E.g. in the
: > example above X should avoid contact with Y until he has rotated his
: > front. (It looks like we may still need to consider fronts as well.)

: Two options I can see here - either X moves its forces into Y's "front"
: to create as much damage as possible (in a concentrated strike or
: "blitzkrieg"-style attack) or X moves its "front" back, forcing Y to make
: the next move (allowing X the advantage in defence).

IMO the first of your options is probably going to lead to disaster. X
is not only outflanked but overrun. From this position any attack he
makes is going to be piecemeal, and forces committed to an attack
piecemeal are destroyed piecemeal to very little effect. IMHO the second
option is the only option available to X; the question for X is how far
back to regroup, and what to do next. Some of his options as I see them
involve pulling all units at the contact point back and all units not at
the contact point forward, to form an offensive formation and try to break
Y's, by now disordered, centre. Or maybe send forces to delay Y's flank,
utilise defence in depth at the contact point to buy time, and either
prepare an attack, prepare a defence, or use the manoeuvre to start a
rearguard action.

: Is X aggressive or defensive? Y's forces are rather spread out, so X can
: "spear" through the front and attack from both sides (like an inverse of
: the "pincer" movement).

Personally I would quite enjoy gaming Y against an X attacking from this
position.

Andrae.
==============================================================================

Andrae Muys (ccamuys@dingo.cc.uq.oz.au) wrote:

: Of course the ultimate AI wouldn't find itself in such a dangerous
: position, a bit like the ultimate General wouldn't. But if it does, it
: must extricate itself well. Just one more 'special' situation to test
: the AI's ability.

War appears to be full of special cases, eh?

;)

: : If we define a contact point, then does that give us a natural focus
: : towards which to direct our forces and our strategic 'thinking'? It
: : would seem to.

: Well, this thread is useful. The idea of contact points should radically
: prune any decision tree. (And sooner or later the AI will have to make a
: choice.) Of course at the strategic/grand strategic levels we may need a
: modified definition of contact point, but at the level I am interested in,
: contact points appear to be a good way to look at things. In fact, now
: that I think about it, contact points are how **I** allocate **MY**
: consideration. This approach, however, leads us to consider how to
: recognise potential contact points and how to evaluate the relative
: benefits of creating/avoiding specific contact points. E.g. in the
: example above X should avoid contact with Y until he has rotated his
: front. (It looks like we may still need to consider fronts as well.)

Yes, I'd come to that conclusion as well. By identifying contact
points and making them the focus of (at least some) activity, we've
pruned our number of decisions fairly radically.

One concern I have with relying on them exclusively, however, is
that this makes the AI very reactive without some additional logic to
force it to 'seek' the enemy. Two obvious ways to handle this would be
a.) random movement of forces until contact with the enemy has been
achieved and b.) deliberate movement of forces to attain certain objectives
(which are, in effect, 'artificial' contact points, if you will).

It would be simple enough (I'm still talking Napoleonic-level for
simplicity's sake) to provide the AI with whatever its objectives are
for a given scenario; that's easy. The hard part is how to make it
approach those objectives in some fashion that makes sense rather than
to blindly throw its units down a road leading towards said objective
(perhaps the main failure of the AIs in Perfect General and Empire Deluxe,
to name two examples).

As I recall, Napoleonic-era tacticians were trained to recognize
'classic' battle formations (one such example being the 'T' you presented
earlier) and react accordingly. Pattern recognition is easy enough
to do on a computer via a variety of methods, ranging from simple table
lookup to full-blown neural nets.

: : If we use your previous suggestion for identifying centers, combined with
: : the above-mentioned contact points, then this may lead us towards a more
: : natural way of handling the above situation. Based on what we've discussed
: : so far, I would envision an AI's logic train going something like this:

: : Pass #1: Identify natural 'groups' of X-Y forces using the
: : 'circle method' discussed earlier, perhaps taking into account
: : the possibilities of influence and interdiction as a previous
: : poster suggested.

: : Pass #2: Having identified these natural groupings, identify
: : contact points amongst the forces within each group. These
: : will serve as natural foci for our planning process.

: : Pass #3: Having identified natural groupings and focus points,
: : we now begin thinking about steps needed to link up our groups,
: : minimize the size of enemy-held areas, eliminate enemy units,
: : etc.
: This sounds like it should work reasonably well; however, I have a feeling
: there may be problems with the way it handles many sparse strategic
: situations. I'll think about it and we can discuss it further when I have
: clarified my concerns.

Agreed, it may. However, based on the above idea of using pattern
recognition to maneuver our units in something resembling a smart fashion,
we can now elaborate on Pass #3:

Pass #3: Having identified natural groupings and focus points,
run a pattern-matching algorithm to identify most likely tactical
situations and reasonable responses. Randomly pick from the best
2 or 3 to avoid predictability. For each turn a course of
action is in place, make some evaluation of its effectiveness
and/or success (i.e., has the pattern changed significantly enough
to warrant a new course of action? have we attained the objective
in question? have we lost contact with the enemy?).

In the case of a sparse strategic situation, the AI defaults towards
trying to attain known and logical objectives (i.e., moving towards Paris).
Once contact is made, the situation naturally becomes less sparse, and
the AI begins to make moves based on sound engagement philosophies.

The AI will end up being a bit 'bookish', if you will, but certainly
ought to surprise you once in a while.

: However, as discussed above, at a lower level fronts again become
: important. IMHO this is because tactical considerations require physical
: cohesion, while strategic levels utilise logical cohesion.

I'll buy that. That's a good definition, particularly for this
problem (Napoleonic-era combat), and better than many I've seen.

: [regarding comment that we post this to rec.games.miniatures]

Actually, I've already seen one thread on comp.ai recommending THIS
thread as 'worthwhile'. ;)

Steven

==============================================================================

In article <3oro0c$egf@dingo.cc.uq.oz.au>, ccamuys@dingo.cc.uq.oz.au
(Andrae Muys) wrote:

> : One problem I can think of off the top of my head is how to handle
> : multiple front situations; there's at least some possibility of overlapping
> : definitions, meaning that some precedence order must be established.
> : Special exceptions will also have to be made for overflying enemy aircraft
> : and incursions by enemy units of various types. (Example: If the enemy
> : drops some paratroopers into your 'rear' area, does it automatically become
> : a 'front'?)
> This is why I am using a very basic set of game mechanics, and using a
> different era (see other post). This way the only way troops can reach
> your rear is to march there. Also there are very few battles in this era
> with multiple fronts, although allowance must be made for bent and
> twisted fronts, the hinge being a very critical point in an extended line.

I don't know if it can help; however, when I was thinking about how to
build a program for a computer opponent for Squad Leader (an Avalon Hill
boardgame) I didn't consider directly recognizing the troop patterns. I
would build intermediate maps describing entities like density of fire,
distances in unit movement points (rather than linear distance),
probability of passing a zone without damage, etc., and base the
reasoning on these intermediate maps. It turns out that looking (as an
example) at the fire distribution makes it clearer whether some aligned
troops make a real front or not. Many sub-tactical problems could be
solved by looking for the shortest path in some of these maps: from what
side should I attack that hill? Find the longest path from your units to
the hill in the percentage-of-survival map, where the path length is the
product of the percentages of the zones passed!
Sure, it is not so simple, because the environment is highly dynamic and
there are a lot of interdependencies, but it is a good starting point.
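
The 'product of percentages' path can be found with an ordinary
shortest-path search by weighting each zone with -log of its survival
probability, since maximising a product of probabilities is the same as
minimising the sum of their negative logs. A sketch on a plain grid,
assuming survival values in (0, 1]:

    import heapq, math

    def safest_path(survival, start, goal):
        # survival: dict mapping (x, y) -> probability of crossing the
        # zone unharmed. Plain Dijkstra on -log(probability) weights.
        frontier = [(0.0, start, [start])]
        best = {start: 0.0}
        while frontier:
            cost, cell, path = heapq.heappop(frontier)
            if cell == goal:
                return math.exp(-cost), path   # (survival prob, route)
            x, y = cell
            for nxt in ((x+1, y), (x-1, y), (x, y+1), (x, y-1)):
                if nxt not in survival:
                    continue
                c = cost - math.log(survival[nxt])
                if c < best.get(nxt, float('inf')):
                    best[nxt] = c
                    heapq.heappush(frontier, (c, nxt, path + [nxt]))
        return 0.0, None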

Please answer by e-mail

Daniele Terdina
sistest@ictp.trieste.it
==============================================================================

Daniele had so many good points that I did in fact respond to her by
e-mail, but thought everybody here (hopefully) would be interested
as well.....

Daniele Terdina (sistest@ictp.trieste.it) wrote:


: I don't know if it can help; however, when I was thinking about how to
: build a program for a computer opponent for Squad Leader (an Avalon Hill
: boardgame) I didn't consider directly recognizing the troop patterns. I
: would build intermediate maps describing entities like density of fire,
: distances in unit movement points (rather than linear distance),
: probability of passing a zone without damage, etc., and base the
: reasoning on these intermediate maps. It turns out that looking (as an
: example) at the fire distribution makes it clearer whether some aligned
: troops make a real front or not. Many sub-tactical problems could be
: solved by looking for the shortest path in some of these maps: from what
: side should I attack that hill? Find the longest path from your units to
: the hill in the percentage-of-survival map, where the path length is the
: product of the percentages of the zones passed!
: Sure, it is not so simple, because the environment is highly dynamic and
: there are a lot of interdependencies, but it is a good starting point.

: Please answer by e-mail

: Daniele Terdina
: sistest@ictp.trieste.it

Hello Daniele:

*Very* interesting approach...I don't believe I've seen it presented
here before.

It wouldn't be too hard to generate a density map such as you
described and use it in conjunction with a route-finding algorithm to
manage unit movement. That certainly has the added advantage of having
units attempt to 'avoid' enemy fire (very much a realistic behavior!)
as well as still focusing on objectives. And your point concerning
whether a front is REALLY a front based on fire distribution is a good
one; we'd been assuming almost toe-to-toe contact, I think, in the above
thread, but the inclusion of any type of ranged weapon muddies that up.

Question: how would we build a fire distribution map if a given
unit has several different types of weaponry? An example of that might
be a machine-gun crew from Squad Leader; the individuals in the squad
all have fairly short-range sidearms and such, but they're all supporting
the longer-range MG. I guess the worst-case situation is the one you
have to assume, even though you have no idea that they'll shoot at you
(it's that "fog of war" thing again).

This could very well be an interesting solution to the sub-tactical,
ordering-units-to-shoot-at-something part of the problem.

Steven

==============================================================================
==============================================================================
At least one article from myself is apparently missing here. It may have
been an e-mail from myself to Daniele, or somebody else may have posted it
but I missed snagging it. If somebody should happen to have it I'd
appreciate a copy.
==============================================================================
==============================================================================

woodcock@escmail.orl.mmc.com (Steve Woodcock) wrote:


> We'd want to add some kind of thresholding function, though, to prevent
>the AI making some REALLY poor moves. If we only use the above approach,
>then I could see a situation in which NO move is a particularly good one
>(i.e., I'm in a building with all 4 streets covered by heavy enemy MG fire),
>but of course the AI would pick the 'best' one and proceed to get mowed down
>crossing the street. We'd want to build in some type of calculation that
>would take into account that not moving at all is smarter than trying to
>obtain whatever the objective is. That's easily enough done, I should
>think, using a standard costing function.

These really poor moves should be avoided by the use of a planner. The idea
is very rough because I haven't had time to try things out. In the first
scenario (all buildings) the Russians have to capture some buildings
initially occupied by the Germans. The planner would perform reasoning of the
following sort:

- to seize a building I'd better start from another building in front of
it. If I don't have one, this is a sub-goal.
- if an attack can't be attempted because it is too risky (the threshold
you mention) locate the key enemy defensive position (the units that cause
most of the fire density in the area) and make 'concentrate fire on key
unit' a sub-goal.
- and so on

The net of sub-goals is a graph with weighted arcs. The weight is
proportional to the difficulty of achieving the goal, i.e. of reaching the
node, and inversely proportional to its usefulness, which is given by the
importance of the other goals accessible from that node. So again the
starting point to decide which goals should actually be activated is a
shortest path problem on the planning graph. I like this idea very much;
however, it's easier said than implemented.

Daniele Terdina e-mail: sistest@ictp.trieste.it
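
A rough Python sketch of that weighted sub-goal graph (the goal names,
the weights, and the weighting rule are all assumptions for
illustration). Arc weight grows with the difficulty of reaching a goal
and shrinks with its usefulness, and deciding which goals to activate
becomes a shortest-path query over the graph:

import heapq

def goal_weight(difficulty, usefulness):
    """Arc weight proportional to difficulty, inversely proportional
    to usefulness (both positive; the exact scaling is arbitrary)."""
    return difficulty / usefulness

def plan(graph, start, objective):
    """Dijkstra over the sub-goal graph; returns the chain of
    sub-goals to activate. `graph` maps goal -> [(next_goal, weight)]."""
    frontier = [(0.0, start, [start])]
    seen = set()
    while frontier:
        cost, goal, chain = heapq.heappop(frontier)
        if goal == objective:
            return chain
        if goal in seen:
            continue
        seen.add(goal)
        for nxt, w in graph.get(goal, []):
            heapq.heappush(frontier, (cost + w, nxt, chain + [nxt]))
    return None

# invented goals for the first (all-buildings) scenario:
graph = {
    'start': [('take building B', goal_weight(4, 2)),
              ('suppress key MG', goal_weight(2, 3))],
    'suppress key MG': [('take building B', goal_weight(1, 2))],
    'take building B': [('seize objective', goal_weight(2, 5))],
}
print(plan(graph, 'start', 'seize objective'))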


==============================================================================
==============================================================================
Again, at least one post is missing here. If anybody knows what should go
in here please send it along.
==============================================================================
==============================================================================

woodcock@escmail.orl.mmc.com (Steve Woodcock) wrote:

> Robert A. Uhl (ruhl@phoebe.cair.du.edu) wrote:


> : Yes. The AI should attempt to create as many contact points as
> : possible. Why? These points mean that he is actually fighting the
> : enemy and hopefully killing the same. That is why two parallel lines
> : are a good formation. With a good lookahead, though, the T formation
> : is also good, because it will enable many points of contact _over
> : time_, and hurt Y's ability to create them. Part of the battle is
> : controlling contact. In fact, if the AI can do this controlling, then
> : it is my belief that it will have an extremely good chance at winning.
>
> I'm not sure I agree that I would want the AI to maximize its number
> of contacts with the enemy; I would agree that it should seek to control
> them.
>
> Maximization in itself will only lead to toe-to-toe WWI slugfests, and
> basically leads to the AI playing a war of attrition. That's perhaps one
> of the defining characteristics of most AIs today--if they don't cheat
> somehow, then they tend to fight wars of attrition.

Very good point.

> : The sphere of influence idea seems to work well with the points of
> : contact idea. Perhaps each unit has a sphere wherein it can contact
> : within a certain number of moves (say 3), and its job is to contact
> : the fewest enemies at once but the most over time. IOW, it doesn't
> : want to be outnumbered but it wants to see action.


This is where the determination of neighbours came in before. Assume the
following:

            1

         2     3

            A

  A is attacking and could normally reach 1, 2 or 3. However most humans
would rule out attacking 1 because 2 and 3 are 'in the way' - 1 is not a
neighbour of A.
This is assuming normal distances are used. If you penalise paths for
travelling close to an enemy then the shortest path A -> 1 may be around
the outside of 2 or 3 - making it 'out of range' anyway. You have to
search for shortest paths in this case though.

\x/ill :-}

William Uther will@cs.su.oz.au

==============================================================================

In article ,
Will Uther wrote:
>In article <3pd6nc$lf7@theopolis.orl.mmc.com>,
>woodcock@escmail.orl.mmc.com (Steve Woodcock) wrote:
>
>> Robert A. Uhl (ruhl@phoebe.cair.du.edu) wrote:
>>
>> Maximization in itself will only lead to toe-to-toe WWI slugfests, and
>> basically leads to the AI playing a war of attrition. That's perhaps one
>> of the defining characteristics of most AIs today--if they don't cheat
>> somehow, then they tend to fight wars of attrition.
>
>Very good point.

Well, to minimize the possibility of a WWI-type war, something needs
to be done to keep the AI from planning one (gee, that was
brilliant :-). There are two possibilities: allow it to do so, but give
it the long-range planning skills necessary to make it realize that
such a war is undesirable, or keep it from planning such a war. I will
focus on preventing this.

First of all, the unit must attempt to maximize contact with weak
units _over time_. It must seek to minimize contact with any units,
esp. strong ones, at once. To do so, it can be given either planning
skills, or merely given a function which will do so. Once again, I
will concentrate on the simpler and faster way. Planning takes CPU
time, something which most 'puters lack. Planning would take the form
of a path finder which would decide the path for a unit which would
bring it in contact with units individually.

The function must first seek to find nearby units. The first
criterion for 'nearness' would simply be physical nearness. This would


choose, say, 10 or 20 units. These would be sorted by strength, then by
friendly units in the rough direction of the enemy. The idea is that a
unit doesn't know where its allies are headed, but if they are near an
enemy, the odds are that it is a melee situation. After sorting these
the second time, the AI would choose the 'nearest' unit. In close
quarters, it would most likely choose the physically nearest one. But
from a distance, it would tend towards other criteria, such as unit
strength, friendlies in the area, &c.
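
A small sketch of that two-stage 'nearness' function (all unit fields
and blend weights are invented; a real game would tune them):

def pick_target(me, enemies, allies, k=10):
    """Shortlist the k physically nearest enemies, then re-rank them
    by a blended 'nearness' that also favours weak enemies and
    enemies with friendly units already headed their way."""
    def dist(a, b):
        return abs(a[0] - b[0]) + abs(a[1] - b[1])

    shortlist = sorted(enemies, key=lambda e: dist(me['pos'], e['pos']))[:k]

    def score(e):
        support = sum(1 for a in allies if dist(a['pos'], e['pos']) <= 3)
        # lower is better: close, weak, already-contested enemies win
        return dist(me['pos'], e['pos']) + 2.0 * e['strength'] - 1.5 * support

    return min(shortlist, key=score) if shortlist else None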

The one problem is that such an AI might be too chaotic at first,
sending units out willy-nilly. These first units would be likely to go
one way, then the other. But soon, I believe that some order would
arise. Units would have a tendency to conglomerate in an area and
fight as one. This is just my own thought, though. It is quite
possible that it would be entirely too chaotic.

Also, it must be made possible for them to retreat. With this, your
units will be akin to men; they don't like to be hit too much.
Perhaps they would be craven or berserk. That would be neat.

>> : The sphere of influence idea seems to work well with the points of
>> : contact idea. Perhaps each unit has a sphere wherein it can contact
>> : within a certain number of moves (say 3), and its job is to contact
>> : the fewest enemies at once but the most over time. IOW, it doesn't
>> : want to be outnumbered but it wants to see action.
>
>This is where the determination of neighbours came in before. Assume the
>following:
>
> 1
>
> 2 3
>
> A
>
> A is attacking and could normally reach 1, 2 or 3. However most humans
>would rule out attacking 1 because 2 and 3 are 'in the way' - 1 is not a
>neighbour of A.
> This is assuming normal distances are used. If you penalise paths for
>travelling close to an enemy then the shortest path A -> 1 may be around
>the outside of 2 or 3 - making it 'out of range' anyway. You have to
>search for shortest paths in this case though.

Hmm. I would suggest that the AI measure the distances. For the sake
of argument A->1 = 3, A->2 = 2, A->3 = 2.

The AI takes its 5 closest neighbours, that is 1, 2, 3, null and
null. Of these, the closest is 2 (since it is numerically before 3, which
is the same distance away). In real life, this would be adjusted with
another function, perhaps measuring the relative strength of the unit
in question. It opts to attack 2. After attacking 2, it opts to
attack 1. After 1, it takes on 3. Of course, it may be too weak by
then and therefore leaves the field or waits for reinforcements.

Simplistic, and probably not the best method. In fact, I know that
it is rather bad, but it works, and that is what counts for the
moment. Perhaps it would only be good for individual movement. In
fact, I think that it would mimic a single man rather well. The man
takes on the closest, weakest enemy.

Robert Uhl

==============================================================================

Robert A. Uhl (ruhl@phoebe.cair.du.edu) wrote:

: The function must first seek to find nearby units. The first
: criterion for 'nearness' would simply be physical nearness. This would
: choose, say 10 or 20 units. These would be sorted by strength, then by
: friendly units in the rough direction of the enemy. The idea is that a
: unit doesn't know where its allies are headed, but if they are near an
: enemy, the odds are that it is a melee situation. After sorting these
: the second time, the AI would choose the 'nearest' unit. In close
: quarters, it would most likely choose the physically nearest one. But
: from a distance, it would tend towards other criteria, such as unit
: strength, friendlies in the area, &c.

Okay. Earlier in the thread a mechanism for 'grouping' enemy and
allied units was proposed that involved drawing circles based variously
on movement capabilities and/or firepower.

: The one problem is that such an AI might be too chaotic at first,
: sending units out willy-nilly. These first units would be likely to go
: one way, then the other. But soon, I believe that some order would
: arise. Units would have a tendency to conglomerate in an area and
: fight as one. This is just my own thought, though. It is quite
: possible that it would be entirely too chaotic.

I wonder if some variant of this isn't what many games already do?
Most of them do tend to exhibit a nasty tendency to trickle units into
a battle.

: >
: > 1
: >
: > 2 3
: >
: > A
: >

: Hmm. I would suggest that the AI measure the distances. For the sake
: of argument A->1 = 3, A->2 = 2, A->3 = 2.

: The AI takes its 5 closest neighbours, that is 1, 2, 3, null and
: null. Of these, the closest is 2 (since it is numerically before 3, which
: is the same distance away). In real life, this would be adjusted with
: another function, perhaps measuring the relative strength of the unit
: in question. It opts to attack 2. After attacking 2, it opts to
: attack 1. After 1, it takes on 3. Of course, it may be too weak by
: then and therefore leaves the field or waits for reinforcements.


: Simplistic, and probably not the best method. In fact, I know that
: it is rather bad, but it works, and that is what counts for the
: moment. Perhaps it would only be good for individual movement. In
: fact, I think that it would mimic a single man rather well. The man
: takes on the closest, weakest enemy.

As you say, this solution is fast but sub-optimal. There's no provision
or mechanism for combining the firepower of several allied units against
a single enemy unit. That's a situation that occurs in both the strategic
arena and the tactical.

On a related subject, I wonder how one would best determine how many
units to allocate to the attack against a given enemy unit? I can think
of two approaches:

a.) The AI operates on a battle-to-battle basis, always strives to
achieve a certain ratio (3:1) or certain probability of kill (say, 90%)
and allocates units (depending on their proximity to the battle) to
achieve this;
b.) The AI works on a more 'big picture' basis, attempting to maximize
OVERALL probability of kill across several battles (i.e., willing to
take a 20% chance of success in one battle for three others at 90%).

Intellectually, option 'b' is more appealing, and more reflective of
how most gamers actually play. You balance your needs and are sometimes
willing to sacrifice one unit to save several others.

There's no need, nor is it terribly desirable, for the AI to attempt to make
this tradeoff across the entire battlefield. I think we could tie this in
with the influence mapping suggested earlier to easily identify sub-areas
of the battlefield that the AI could make the tradeoffs in, perhaps by
using some type of edge detection across the values of the hexes?

Steven
==============================================================================

I have been paying close attention to this thread and the Influence Mapping
one, and there are two observations I have made:

1) First, the influence mapping algorithm proposed (not the priority queue
but the previous one) seemed very effective but is computationally intensive.
However, I expect it would not have to be computed on every game turn and that
instead, modifications could be made to an already computed influence map to
obtain a sufficiently good approximation. (eg. When your troops move, you increase
your influence in one hex and decrease it in the other, but your influence in the
area, ie. adjacent hexes, remains approximately unchanged.) Only when units are
destroyed or after a number of game turns would you have to go through the big
expense of applying the influence mapping algorithm to get an accurate map.
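
A minimal sketch of that incremental update, assuming a plain 2D array
for the map and a simple distance falloff for a unit's local influence
(both assumptions for illustration):

def apply_move(influence, unit_strength, src, dst, radius=1):
    """Cheap incremental update when a unit moves: shift its local
    contribution from src to dst instead of recomputing the whole
    map. A full recompute is only run every few turns or when a
    unit is destroyed."""
    def spread(center, sign):
        cx, cy = center
        for dx in range(-radius, radius + 1):
            for dy in range(-radius, radius + 1):
                x, y = cx + dx, cy + dy
                if 0 <= y < len(influence) and 0 <= x < len(influence[0]):
                    falloff = 1 + max(abs(dx), abs(dy))
                    influence[y][x] += sign * unit_strength / falloff
    spread(src, -1)   # remove old contribution
    spread(dst, +1)   # add new contribution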

2) The search for fronts is extremely important in planning your strategy, but
looking for the front only gives you half the picture. Let me expand:

When I play war games, my first priority is to defend at fronts so my troops are
not overrun. Maintaining sufficient fire power to DEFEND at the front is my
number one priority (excluding exceptions here and there). Forget about an
offensive for the moment. That comes later.

My second priority is finding, creating and assaulting "targets of opportunity".
These are enemy units or objectives which are not well defended and usually
isolated. Unfortunately, the influence mapping algorithm would show the hexes
where these targets of opportunity are as under my control if they were
surrounded by my troops. This enemy unit may be far from the front but is
a unit that everyone would attack immediately (excluding exceptions here
and there.)

My last objective, once the destruction of all targets of opportunity has been
addressed, is to ATTACK on the fronts where I have superiority.

My beef about all this is that the current discussion does not include detecting
these targets of opportunity. I think what needs to be done is to compute
two influence maps:

a) One map computed the standard way which defines fronts and zones of control.
b) One map computed the standard way except that the influence of opposing forces
does not affect the value of a hex occupied by a friendly unit. Hence, a hex
occupied by a friendly is always positive, a hex occupied by a foe is always
negative.

Map b) would show targets of opportunity since one or a few hexes would have negative
values while being surrounded by large positive values (large derivatives would
identify these areas) and map a) would show that you have a large influence
on that enemy-occupied hex and hence could destroy it easily.

Map b) would also identify which of your units are about to be destroyed because
they are greatly outnumbered (the opposite situation) and some strategic decision
would have to be made about sacrificing, regrouping or reinforcing the unit.

Map b) would also indicate where on the front, identified with map a), you should
strike since it would identify where the enemy is the weakest and you the strongest.
ie. map a) identifies fronts as hexes where the influence value is zero but the
derivative is large. Map b) values at the front hexes will indicate which side
is on the defensive or offensive along each portion of the front. This is crucial
to 1) DEFEND along the front, and 2) start an offensive where the enemy is the
weakest on the front.

Marc Carrier
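
A sketch of how map b) could be scanned for those targets of opportunity
(the threshold, the neighbourhood, and the map representation are
assumptions): an enemy-held hex whose neighbours are all strongly
positive has exactly the large local derivative described above, and
map a) confirms the hex is within reach:

def targets_of_opportunity(map_a, map_b, threshold=5.0):
    """Find isolated enemies: cells of map b) whose own value is
    negative (enemy-held) while every neighbour is strongly positive
    (friendly-dominated), and which map a) shows we influence."""
    hits = []
    h, w = len(map_b), len(map_b[0])
    for y in range(h):
        for x in range(w):
            if map_b[y][x] >= 0:
                continue  # not enemy-held
            ring = [map_b[ny][nx]
                    for ny in range(max(0, y-1), min(h, y+2))
                    for nx in range(max(0, x-1), min(w, x+2))
                    if (nx, ny) != (x, y)]
            # large positive surroundings == large local derivative
            if ring and min(ring) > threshold and map_a[y][x] > 0:
                hits.append((x, y))
    return hits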
==============================================================================
==============================================================================
I apparently missed a few articles here as well. Judging from the next
post I have, there appear to have been several responses (at least 3) to
Marc Carrier's comment above. Again, if anybody has the originals I'd
love to get a copy to incorporate here.
==============================================================================
==============================================================================

Marc Carrier (mcarrier@bnr.ca) wrote:

: In article <3q44ru$52r@hasle.oslonett.no>, grahamh@oslonett.no (Graham Healey)
: writes:
: |> In article <3ptn8e$o9g@bmtlh10.bnr.ca>, Marc Carrier wrote:
: |> >
: (SNIP...)
: |> >My second priority is finding, creating and assaulting "targets of
: |> >opportunity". These are enemy units or objectives which are not well
: |> >defended and usually isolated. Unfortunately, the influence mapping
: |> >algorithm would show the hexes where these targets of opportunity are
: |> >as under my control if they were surrounded by my troops. This enemy
: |> >unit may be far from the front but is a unit that everyone would attack
: |> >immediately (excluding exceptions here and there.)
: |> >
: (SNIP...)
: |>
: |> I think you are overdoing the targets of opportunity (TOP's). Going for
<<<>>>
: I disagree!
:
: First, I agree I might have over-emphasised the "creating" targets of
: opportunity part of my statement. But when one exists, taking advantage of
: it should be considered prior to carrying on your move towards your
: objectives. A single enemy unit behind your lines or out on the flanks
: can cause major damage later on if not addressed.
: Example: In Clash Of Steel, I win the game on the eastern front by
: sending single tanks around and behind enemy lines to cut supplies, then
: I destroy the unsupplied units. Whether I play the AXIS or the
: ALLIES, the strategy usually works since the COS AI is not very good at
: recognizing that enemy units are about to cut his supply routes (In
: fact, the COS AI is not very good, period).

OK, now it's my turn to disagree. In my games I have found that when I
allow myself to be distracted from the overall picture to try and take
some trivial advantage (e.g. some poor soul stuck behind my lines without
supply and/or support) I lose. IMHO one of the main reasons behind this
thread is the realisation that taking time out from the pursuit of your
strategic objectives to claim a tactical victory is poor play. One
mustn't forget that strength is only one of many factors which decide a
battle, in fact one of the less important IMO. Time, morale, position,
cohesion, leadership, etc. are IMHO more important than mere manpower.
This is shown time and time again, from Alexander the Great, to Hannibal,
to William the Conqueror (an example of cohesion), Pitt, Frederick the
Great, Napoleon, et al., WWI, Vietnam, Korea, the Hundred Years War as
examples of the effects of neglecting those other factors. (I can
explain my reasoning behind each of the examples but will do so only by
email for brevity). Oh, and Israel and the WWII German occupation of France
as examples of time. In each case victory was ensured against
significant numerical odds by the recognition of the relative
unimportance of overall strength. It is this realisation which is
proving hard to teach a computer.

: In chess, unless you can see that you can check-mate your opponent in a
: number of moves, you rarely pass the opportunity to capture one of his
: pieces if you will not lose anything. What you have to be careful about
: is that doing so is not a trap by your opponent to open up your defenses
: and that you will not end up in a less desirable state.


I would here remind you of the gambit. And the sacrifice. I have won
many games of chess simply because I had a three or four move lead in my
development, bought with a couple of pawns, which I used wisely. I
personally consider this a good analogy to taking the time, and effort,
to pick off an insignificant force left behind the lines.

: The point about my previous post was that there is a distinction between
: being on the defensive and offensive on a front. And maximizing contact
: points over time is something you usually do only if you are on the
: offensive. So far, many good algorithms have been proposed to identify
: fronts. Now I am turning my attention to identifying the state of the
: troops on that front (Offensive/Defensive). Furthermore, if troops
: are on the defensive, can they stand or must they fall back and regroup?
: If they are on the offensive, can they attack now or should they wait to
: group and concentrate fire power instead of trickling into combat?

Good questions.

One question of my own is how you might represent time on an influence
map? I have a feeling that if we could somehow represent the map
as a 2D vector field we might be able to quantify the effects of 'attacking
now' vs 'attacking in strength'. I seem to remember someone mentioning
that they were using Electro-Magnetic Field theory to aid in producing a
good graph. As an electrical engineering student I know some theory
regarding vector fields; however, representing them on a computer is
something I don't know anything about. I was thinking, though, if you
considered the level of control to be the angle of the vector, say PI/2 =
completely under my control, and -PI/2 = completely under their control,
you would be left with the magnitude of the vector to represent some
parameter w.r.t. time. So if the resulting vector was in the 1st or 2nd
quadrant I would have some sort of advantage w.r.t. time at the location,
and a disadvantage in the 3rd/4th quadrants. The diagram below might help.

|
|
Enemy Control | My Control
My time adv | My time adv
|
|
==============================
|
|
My Control | Enemy Control
Enemy time adv| Enemy time adv
|
|

Now of course you could swap them and represent the time variable by the
angle, but I was thinking that if you represent the strength by an angle
then you can have no poles (points going to infinity) in the final
function. This should help with array bounds or the like. Still, I have
no idea how you might form the time function (the strength function should
require only a little modification to apply, and I will take a shot at it
after exams).


Well, what do you think? And please no posts telling me how
computationally intensive it's going to be; I already know.

Andrae.

==============================================================================

In article <3qmg26$pv0@dingo.cc.uq.oz.au>, ccamuys@dingo.cc.uq.oz.au (Andrae Muys)
writes:
|> One question of my own is how you might represent time on an influence
|> map? I have a feeling that if we could somehow represent the map
|> as a 2D vector field we might be able to quantify the effects of 'attacking
|> now' vs 'attacking in strength'. I seem to remember someone mentioning
|> that they were using Electro-Magnetic Field theory to aid in producing a
|> good graph. As an electrical engineering student I know some theory
|> regarding vector fields; however, representing them on a computer is
|> something I don't know anything about. I was thinking, though, if you
|> considered the level of control to be the angle of the vector, say PI/2 =
|> completely under my control, and -PI/2 = completely under their control,
|> you would be left with the magnitude of the vector to represent some
|> parameter w.r.t. time. So if the resulting vector was in the 1st or 2nd
|> quadrant I would have some sort of advantage w.r.t. time at the location,
|> and a disadvantage in the 3rd/4th quadrants. The diagram below might help.
|>
|> |
|> |
|> Enemy Control | My Control
|> My time adv | My time adv
|> |
|> |
|> ==============================
|> |
|> |
|> My Control | Enemy Control
|> Enemy time adv| Enemy time adv
|> |
|> |
|>
|> Now of course you could swap them and represent the time variable by the
|> angle, but I was thinking that if you represent the strength by an angle
|> then you can have no poles (points going to infinity) in the final
|> function. This should help with array bounds or the like. Still, I have
|> no idea how you might form the time function (the strength function should
|> require only a little modification to apply, and I will take a shot at it
|> after exams).
|>
|> Well, what do you think? And please no posts telling me how
|> computationally intensive it's going to be; I already know.
|>
|> Andrae.

This was somewhat the idea I had w.r.t. using two influence maps, since there is more
than just the soil you control that determines your posture. However, I had not
considered time as a second dimension, but it makes a lot more sense, and the
simple matrix you presented with its four quadrants can simply summarize the four
behaviors I mentioned earlier:

1) retreat and regroup
2) stand and defend
3) stand and group / prepare for attack
4) attack now (attack whom is another question though.)

On the subject of how to represent a 2D vector field on the computer: use a
rectangular coordinate system, such as with imaginary numbers in the form a + ib,
instead of angles and magnitudes. You can store it as two different maps (arrays)
or as an array of records/structures with two fields. I am sure one of these two
will be better in terms of computational efficiency than the other because of the
indexing involved, but I do not know which one off the top of my head.

Now, how do we represent time? One method that came to mind would be to use the
normal influence mapping algorithm proposed and compare the results after n
iterations and 2*n iterations, where n < the number of iterations required for
values to stabilize and converge. Basically, compute the map with the influence
of the units propagated n times, which shows your more immediate influence, and
compare it to the map with the strength propagated 2*n times (for example), which
is your influence in the more distant future. (I'm not too sure this thinking is
correct, but I will do some simulation this weekend to try it out.)

An example: After n iterations, the influence map could show that the ground
where a friendly unit is located is under enemy control (ie. lots of enemy units
close by) however, after 2*n iterations, the influence value might turn positive,
indicating that a lot more friendly units are close enough to reinforce the
position.
(I want to simulate this to see if this can happen with the influence mapping
algorithm.)

One problem I see with this is that the group of friendly units that could reinforce
the position will have its influence spread all around and many friendly units in
enemy territory may count on that reinforcement when the 2*n map is considered.
Unfortunately, this method does not help to make the decision of where the
reinforcement should go, and the units in enemy territory that do not get
the reinforcement will die if they stand.

Marc

==============================================================================

I've read, with some interest, all of the proposals for wargame
AI, and I haven't seen this technique postulated. This is the technique
I intend to use for my Fantasy Wargame Engine.

First, there are two objectives that are possible in any tactical
level combat: 1) Offensive, 2) Defensive.

Offensive: Destroy, or severely damage, the enemy's capability to hold the
field, OR taking an objective.


Defensive: Protect an objective or 'hold the line' against enemy attack.

Both of these rely on two things: concentration of force and
taking and holding good ground (ground that gives you the best offensive
and defensive capabilities).
The first thing the AI must do is to look for good ground on the
battlefield. These points are used as anchors for any offensive or
defensive operations. No units actually have to *occupy* the ground,
they just have to protect against the opponent gaining it.
This is done by having the AI look for any enemy units near to
the ground every turn. If an enemy unit approaches the AI's 'good
ground', the AI calculates how much force it must use to defeat the
enemy unit. Then it searches through all of its units to find the ones
that are least engaged (see below in concentration of force) and
detaches them to attack the enemy unit(s). Then simply move the units
chosen to the best ground in which to attack the enemy unit.

Concentration of force:
The *MOST* important thing in a battle is to decide where the
enemy is weakest and kick his ass there. This is why flank attacks are
so effective. Flanks are usually weaker than the center and are *not*
mutually supportable. Whereas if you attack the center, it can be
reinforced by the flanks (or rear) if need be.
Breaking the center will destroy an army because it isolates the
flanks (therefore they can't be mutually supporting), but breaking the
center is *hard* because it's constantly being reinforced by the flanks
and rear.

So, with that little military theory out of the way...

First you have to find the weak points in the enemy line. This
can simply be done by taking each enemy unit, adding up its attack and
defense strengths, then adding up the attack & defense strengths of the units
1 move away, and adding them to the unit's total. The unit with the lowest
score will be the weakest.
Now this won't help if that unit happens to have 8 million
archers between the AI troops and itself. So the computer also has to
take into account "spheres of influence".
Basically, each unit has a sphere of influence, that is, an area
of the battlefield where it can inflict damage on the enemy. The enemy
units must have a sphere of influence as well; where those spheres
overlap is the contested area. Now, the idea is to try to get into a
position where the AI's sphere of influence is larger than the human
player's unit's. This way the contested area is farther away from the
computer's units than the human's. Note that the AI units and the Human
units might well be out of range of each other, but that doesn't matter.
Right now all the AI is trying to do is to control the field.
Then all the computer has to do is to move its units so that if
it's losing the strength battle in a contested area, it either moves the
outnumbered unit towards its other units (combining their contested
areas), or moves a reserve unit to the understrength unit, thereby
reinforcing its contested area.
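
A minimal sketch of the weak-point test just described (the unit fields
are invented; distance is one move on a square grid for simplicity):

def weakness_scores(enemies, move_range=1):
    """A unit's score is its own attack + defense plus that of every
    enemy within one move; the lowest total marks the weakest point
    in the line."""
    def dist(a, b):
        return max(abs(a['pos'][0] - b['pos'][0]),
                   abs(a['pos'][1] - b['pos'][1]))
    scores = {}
    for u in enemies:
        total = u['attack'] + u['defense']
        total += sum(v['attack'] + v['defense']
                     for v in enemies
                     if v is not u and dist(u, v) <= move_range)
        scores[u['name']] = total
    # returns the weakest unit's name plus all the totals
    return min(scores, key=scores.get), scores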

Now, here's where the fun begins...


The above model works fairly well with units making independent
decisions about where they want to go and who they want to attack. But
this is just the starting point!
The description up above mentioned a few things like detaching
units to defend a piece of ground, or reinforcement of in-danger units.
This requires a "General" level of AI. The General has to make overall
decisions about the course of the battle.
Luckily the General only has to make two types of decisions.

1) Where is the enemy line weakest so that it can concentrate its forces
on that area.

2) Where is its line weakest so that it can reinforce the area with more
units.

Don't forget to have the general define some units at the
beginning of the battle as reserve troops. This can easily be
calculated. Just set back the units that have the greatest sphere of
influence and/or greatest strength *AFTER* it assigns units to control
the good ground.

Well, I've typed enough. If anyone has any questions, feel free
to E-Mail me (as I don't read this newsgroup often).

Chris.

==============================================================================

My apologies to all for the lengthy quotes, but I didn't want any
quotes or responses to be out of context.

Christopher Spencer (clspence@iac.net) wrote:


: I've read, with some interest, all of the proposals for wargame
: AI, and I haven't seen this technique postulated. This is the technique
: I intend to use for my Fantasy Wargame Engine.

Oooohhh....sounds interesting! ;)

: Both of these rely on two things: concentration of force and
: taking and holding good ground (ground that gives you the best offensive
: and defensive capabilities).
: The first thing the AI must do is to look for good ground on the
: battlefield. These points are used as anchors for any offensive or
: defensive operations. No units actually have to *occupy* the ground,
: they just have to protect against the opponent gaining it.
: This is done by having the AI look for any enemy units near to
: the ground every turn. If an enemy unit approaches the AI's 'good
: ground', the AI calculates how much force it must use to defeat the
: enemy unit. Then it searches through all of its units to find the ones
: that are least engaged (see below in concentration of force) and
: detaches them to attack the enemy unit(s). Then simply move the units
: chosen to the best ground in which to attack the enemy unit.


You do address this somewhat farther on, but I'll go ahead and bring
this up here: I am somewhat concerned over the effects of finding those
units which are 'least engaged' and then detaching them to attack the
identified weaker enemy units. First, there may be some penalty for detaching
from combat, which must be taken into account in any such weighting of
possible options. Second, while it may be desirable to attack the enemy
in question I *may* want to finish off the unit I'm currently engaging
first, even if it is in the middle of a nest of enemy units. The enemy
I'm currently engaged with may only need one more hit/attack/whatever to
finish off, while the enemy unit I would *like* to engage may be a turn
or two away from its (presumed) objective. In other words, sometimes it's
better to take the bird in hand than the one in the bush.

This is easily solved by adding an additional 'cost' for disengaging
and/or adding value to the destruction of the weak enemy unit or units
currently being engaged.

: Concentration of force:
: The *MOST* important thing in a battle is to decide where the
: enemy is weakest and kick his ass there. This is why flank attacks are
: so effective. Flanks are usually weaker than the center and are *not*
: mutually supportable. Whereas if you attack the center, it can be
: reinforced by the flanks (or rear) if need be.
: Breaking the center will destroy an army because it isolates the
: flanks (therefore they can't be mutually supporting), but breaking the
: center is *hard* because it's constantly being reinforced by the flanks
: and rear.

Good summation....

: So, with that little military theory out of the way...

: First you have to find the weak points in the enemy line. This
: can simply be done by taking each enemy unit, adding up its attack and
: defense strengths, then adding up the attack & defense strengths of the units
: 1 move away, and adding them to the unit's total. The unit with the lowest
: score will be the weakest.
: Now this won't help if that unit happens to have 8 million
: archers between the AI troops and itself. So the computer also has to
: take into account "spheres of influence".
: Basically, each unit has a sphere of influence, that is, an area
: of the battlefield where it can inflict damage on the enemy. The enemy
: units must have a sphere of influence as well; where those spheres
: overlap is the contested area. Now, the idea is to try to get into a
: position where the AI's sphere of influence is larger than the human
: player's unit's. This way the contested area is farther away from the
: computer's units than the human's. Note that the AI units and the Human
: units might well be out of range of each other, but that doesn't matter.
: Right now all the AI is trying to do is to control the field.
: Then all the computer has to do is to move its units so that if
: it's losing the strength battle in a contested area, it either moves the
: outnumbered unit towards its other units (combining their contested
: areas), or moves a reserve unit to the understrength unit, thereby
: reinforcing its contested area.


Okay, that's an interesting approach. Isn't it somewhat similar,
however, to the 'mesh analysis' approach suggested in the earlier thread?
In this case, rather than spreading the influence of a unit across the
map, it merely spreads to adjacent hexes. This is certainly simpler,
but don't you lose the ability to readily identify fronts and lines of
control?

On the other hand, it may be a good 'sub-system' approach for determining
actual engagement strategy. That is, having first used the mesh analysis
approach for mapping out the battlefield zones of control, one could then
use this methodology for picking out individual weak points.

: The above model works fairly well with units making independent
: decisions about where they want to go and who they want to attack. But
: this is just the starting point!
: The description up above mentioned a few things like detaching
: units to defend a piece of ground, or reinforcement of in-danger units.
: This requires a "General" level of AI. The General has to make overall
: decisions about the course of the battle.
: Luckily the General only has to make two types of decisions.

: 1) Where is the enemy line weakest so that it can concentrate its forces
: on that area.

: 2) Where is its line weakest so that it can reinforce the area with more
: units.

This begins to approach something mentioned earlier: the concept
of breaking the AI into a 'General' and a 'Sergeant' makes a lot of
sense. The General draws up the overall battle plan and determines
what objectives to take. The Sergeant determines the best way to do it.
Beyond Squad Leader will reportedly use a similar approach.

: Don't forget to have the general define some units at the
: beginning of the battle as reserve troops. This can easily be
: calculated. Just set back the units that have the greatest sphere of
: influence and/or greatest strength *AFTER* it assigns units to control
: the good ground.

Reserves are something we never even really talked about. Good point.

Steve
==============================================================================
Steven Woodcock _
Senior Software Engineer, Gameware _____C .._.
Lockheed Martin Information Systems Group ____/ \___/
Phone: 407-826-6986 <____/\_---\_\ "Ferretman"
E-mail: woodcock@gate.net (Home)
swoodcoc@oldcolo.com (Alternate Home)
woodcock@escmail.orl.mmc.com (Work)
Disclaimer: My opinions in NO way reflect the opinions of the
Lockheed Martin Information Systems Group, although
(like Rush Limbaugh) they should. ;)


Motto:
"...Men will awake presently and be Men again, and colour and laughter and
splendid living will return to a grey civilization. But that will only come
true because a few Men will believe in it, and fight for it, and fight in its
name against everything that sneers and snarls at that ideal..."

-- Leslie Charteris
THE LAST HERO

==============================================================================

Steve Woodcock proclaimed:


>Christopher Spencer (clspence@iac.net) wrote:
>: detaching them to attack the enemy unit(s). Then simply move the units
>: chosen to the best ground in which to attack the enemy unit.
>
> You do address this somewhat farther on, but I'll go ahead and bring
>this up here: I am somewhat concerned over the effects of finding those
>units which are 'least engaged' and then detaching them to attack the
>identified weaker enemy units. First, there may be some penalty for detaching
>from combat, which must be taken into account in any such weighting of
>possible options. Second, while it may be desirable to attack the enemy
>in question I *may* want to finish off the unit I'm currently engaging
>first, even if it is in the middle of a nest of enemy units. The enemy
>I'm currently engaged with may only need one more hit/attack/whatever to
>finish off, while the enemy unit I would *like* to engage may be a turn
>or two away from its (presumed) objective. In other words, sometimes it's
>better to take the bird in hand than the one in the bush.

The best way of calculating this is to first apply the reserve
(if it can get there in time). If the reserve is committed, or if it
can't get there in time, then check the units that have the greatest
positive imbalance in their areas of contention.
The second method will (somewhat) violate the rules of
concentration of force, but it's better than losing the good ground.

>: Then all the computer has to do is to move its units so that if
>: it's losing the strength battle in a contested area, it either moves the
>: outnumbered unit towards its other units (combining their contested
>: areas), or moves a reserve unit to the understrength unit, thereby
>: reinforcing its contested area.
>
> Okay, that's an interesting approach. Isn't it somewhat similar,
>however, to the 'mesh analysis' approach suggested in the earlier thread?

Hmmm....could be. If I'm correct in understanding which method
you are talking about, then it's very similar except that the only values
that need to be calculated are the contested area values. The
calculation of the contested area, then comparing the strength of the two
opposing units in that area, *should* take much less time than the mesh
analysis approach since you are calculating only a small percentage
of the battlefield.

>In this case, rather than spreading the influence of a unit across the
>map, it merely spreads to adjacent hexes. This is certainly simpler,
>but don't you lose the ability to readily identify fronts and lines of
>control?

In the words of Robert E. Lee: "In front, behind, the direction
does not matter. We will fight them wherever they are."
A front is an artificial construct devised to show the general
area of conflict between two forces. As a General, it is your *only*
consideration to take objectives, hold good ground, and protect your
supply line (a concept that also includes protecting industrial centers
and transportation infrastructure).
When the AI finds good ground, it moves to occupy it or deny it
to the enemy. It pushes out from there by extending the army group's
sphere of influence thereby denying the enemy room to maneuver or
deploy.
So, you see, the front will take care of itself. And the lines
of control will also take care of themselves if the computer maintains a
contested area strength >= the human player's contested area strength.
Also, as implied by this scheme, if the computer is losing the
strength contest in the contested area, it will automatically shorten its
own line by moving the understrength units towards other units.

>
> On the other hand, it may be a good 'sub-system' approach for determining
>actual engagement strategy. That is, having first used the mesh analysis
>approach for mapping out the battlefield zones of control, one could then
>use this methodology for picking out individual weak points.

That adds another step to the process and more "thinking"
time for the AI (not bad, but even *I* get bored at some wargames and
start shouting at the screen: "Come on!!"). Also, the extra step is not
needed as the sphere of control scheme, by reflex (implied by the
concept), will automatically take care of finding the front, and
maintaining good lines of control.
Think of it this way: Two amoebae (sp?) are fighting with their
cilia (the units). They're stuck in the same test tube in a limited area
(the battlefield). Each feels the genetic need to grow to fill the test
tube and can only do that by killing the other amoeba's cilia and pushing
its cell wall back.
Initially, the two amoebae are separated by a certain distance.
However, they feel the need to expand. The growth is instinctive (AI is
programmed to do this, the human player needs to do this to gain control
of the battle).
The AI amoeba looks out with its sensors to the maximum range
that its cilia can attack (with deadly morphogens!!). Suddenly it sees
the contested area for a few of its cilia, and the Human amoeba has more
cilia in that area. Well, the DNA looks for nearby cilia that have a low
area of contention or none at all, and moves them to strengthen the weak
area.
The advance continues.

I hope this analogy brings across the basic theory behind what
I'm saying.

> This begins to approach something mentioned earlier: the concept
>of breaking the AI into a 'General' and a 'Sergeant' makes a lot of
>sense. The General draws up the overall battle plan and determines
>what objectives to take. The Sergeant determines the best way to do it.
>Beyond Squad Leader will reportedly use a similar approach.

Precisely. It's how actual armies work, and there's a good
reason for that....why? Because it works!

>
>: Don't forget to have the general define some units at the
>: beginning of the battle as reserve troops. This can easily be
>: calculated. Just set back the units that have the greatest sphere of
>: influence and/or greatest strength *AFTER* it assigns units to control
>: the good ground.
>
> Reserves are something we never even really talked about. Good point.

Yah. I've beaten more AIs by probing the enemy line with my
front line units while holding back a sizable reserve. When I identify a
weakness in the enemy attack (or defense), I send my reserves in to bust
open the enemy line and kick 'em in the ass!
I want my AI to be able to do that.

>Steve

Chris.

==============================================================================

Christopher Spencer (clspence@iac.net) wrote:

: The best way of calculating this is to first apply the reserve
: (if it can get there in time). If the reserve is committed, or if it
: can't get there in time, then check the units that have the greatest
: positive imbalance in their areas of contention.
: The second method will (somewhat) violate the rules of
: concentration of force, but it's better than losing the good ground.

True, using the reserve is a logical approach, but I'm still a bit
worried that this overall technique will tend to force units to run
back and forth around the battlefield, engaging the weakest enemy unit
they see and/or grabbing the most valuable piece of real estate in the
immediate vicinity. Without some kind of factoring in the value of
an 'attack in progress', so to speak, I'm not sure that units using
this technique will ever finish the job.

: In the words of Robert E. Lee: "In front, behind, the direction
: does not matter. We will fight them wherever they are."
: A front is an artificial construct devised to show the general
: area of conflict between two forces. As a General, it is your *only*
: consideration to take objectives, hold good ground, and protect your
: supply line (a concept that also includes protecting industrial centers
: and transportation infrastructure).
: When the AI finds good ground, it moves to occupy it or deny it
: to the enemy. It pushes out from there by extending the army group's
: sphere of influence thereby denying the enemy room to maneuver or
: deploy.
: So, you see, the front will take care of itself. And the lines
: of control will also take care of themselves if the computer maintains a
: contested area strength >= the human player's contested area strength.
: Also, as implied by this scheme, if the computer is losing the
: strength contest in the contested area, it will automatically shorten its
: own line by moving the understrength units towards other units.

Hmmm.

While I readily admit that it may all be due to a difference in
semantics, I still think this is a derivation of the approach discussed
earlier. The influence of the units is only propagated to adjacent hexes
(which I admit is MUCH faster and may be 'good enough' to do good tactical
maneuvers on, mind you) and you're choosing to examine the battlefield
in a more piecemeal fashion.

I assume, although I didn't explicitly see it stated, that objectives
are induced by marking those hexes/sites/whatever as the equivalent of
'extra good ground'?

: Think of it this way: Two amoebae (sp?) are fighting with their
: cilia (the units). They're stuck in the same test tube in a limited area
: (the battlefield). Each feels the genetic need to grow to fill the test
: tube and can only do that by killing the other amoeba's cilia and pushing
: its cell wall back.
: Initially, the two amoebae are separated by a certain distance.
: However, they feel the need to expand. The growth is instinctive (AI is
: programmed to do this, the human player needs to do this to gain control
: of the battle).
: The AI amoeba looks out with its sensors to the maximum range
: that its cilia can attack (with deadly morphogens!!). Suddenly it sees
: the contested area for a few of its cilia, and the Human amoeba has more
: cilia in that area. Well, the DNA looks for nearby cilia that have a low
: area of contention or none at all, and moves them to strengthen the weak
: area.
: The advance continues.
: I hope this analogy brings across the basic theory behind what
: I'm saying.

This *does* help somewhat, actually. I'm still somewhat concerned
over the focus of the technique though. Without actually coding it
up (maybe I'll have time this weekend to try that experiment), it feels
to me as if units will rush to and fro.

: > This begins to approach something mentioned earlier: the concept
: >of breaking the AI into a 'General' and a 'Sergeant' makes a lot of
: >sense. The General draws up the overall battle plan and determines
: >what objectives to take. The Sergeant determines the best way to do it.
: >Beyond Squad Leader will reportedly use a similar approach.

: Precisely. It's how actual armies work, and there's a good
: reason for that....why? Because it works!


But that's my point, I think. This approach seems aimed purely at the
General's side of things--"these are important objectives to seize, these
are units I'd like to see killed"--without consideration of the 'practical'
aspects of the problem (the Sergeant's job, in other words). Please don't
misunderstand; I think this is valuable IF being presented as an approach
for the General side of things.

Steve

==============================================================================
Steven Woodcock _
Senior Software Engineer, Gameware _____C .._.
Lockheed Martin Information Systems Group ____/ \___/
Phone: 407-826-6986 <____/\_---\_\ "Ferretman"
E-mail: woodcock@gate.net (Home)
swoodcoc@oldcolo.com (Alternate Home)
woodcock@escmail.orl.mmc.com (Work)
Disclaimer: My opinions in NO way reflect the opinions of the
Lockheed Martin Information Systems Group, although
(like Rush Limbaugh) they should. ;)
Motto:
"...Men will awake presently and be Men again, and colour and laughter and
splendid living will return to a grey civilization. But that will only come
true because a few Men will believe in it, and fight for it, and fight in its
name against everything that sneers and snarls at that ideal..."

-- Leslie Charteris
THE LAST HERO

==============================================================================

From: bruck@actcom.co.il (Uri Bruck)
Newsgroups: comp.ai.games
Subject: Re: Influence Mapping: Strategic . . .
Organization: ACTCOM - Internet Services in Israel
Date: Fri, 16 Jun 1995 13:05:10 GMT

I think I can add something to the thread about influence mapping, using
the already mentioned idea of a General/Sergeant algorithm.
This may seem obvious to many of you, but it's a point worth mentioning.

The hierarchical division into General/Sergeant can be used to save a lot of
computation time if each level sees the map in a different resolution.


Personally I prefer having four levels (I am currently trying to implement
something using four levels of units) where the two lower levels
are actually represented as field units, and are therefore structurally
similar; the two higher levels are command levels.

This design (with more or fewer levels) becomes effective if the General
only sees the map in a lower resolution than the levels below it.
(Perhaps I should also mention that I like to use Cartesian coordinates rather
than hexes, since I have no pre-computer war game experience; this doesn't
mean using squares - like Dune II apparently does - but continuous coordinates.)
Influence mapping can still be done this way.

Suppose we have different maps of the playing field, in different
resolutions; each level sees the map in an appropriate resolution.
The General (HQ) sees the entire map in low resolution; it will see groups
of units as areas of influence. We could also add heading and velocity of
units to determine how dangerous they are to areas that the AI wants to
defend. A faster group of units would be more dangerous because it would
reach the AI area in less time. The AI knows it has several 2nd level
commanders at its disposal: it can assign one to defend a specific area,
and it can assign a couple of others to secure a position that will threaten
an enemy installation. This will be done after the low-resolution analysis
is done, using perhaps one of the previously posted methods of influence mapping.
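
A sketch of that multi-resolution idea, assuming the detailed map is a
plain 2D array (the block-sum rule is one plausible choice, not the
only one):

def downsample(influence, factor):
    """Give the General a coarse view: each low-resolution cell is
    the sum of a factor x factor block of the detailed map, so
    groups of units show up as single blobs of influence. Detailed
    maps are then built only for the areas a Colonel is ordered
    into."""
    h, w = len(influence), len(influence[0])
    coarse = []
    for y in range(0, h, factor):
        row = []
        for x in range(0, w, factor):
            block = [influence[yy][xx]
                     for yy in range(y, min(y + factor, h))
                     for xx in range(x, min(x + factor, w))]
            row.append(sum(block))
        coarse.append(row)
    return coarse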

The 2nd-level commanders - Colonels - receive their instructions, such as
'attack a certain area'. They would need a more detailed map of that area, and
perhaps of the way to get there. It is not necessary to make the detailed map
for the entire playing field, only for those areas which the Colonels need
information about; if two Colonels need information about the same area, they
can share the same piece of map. Colonels have different kinds of units they
can use to carry out their mission. The units are grouped under Sergeants.
I see two possible kinds of groups: homogeneous groups, or mixed groups,
provided the units in the mixed groups can travel the same types of terrain
at more or less the same speeds.
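
A small sketch of the shared-map idea: detailed sub-maps are built lazily,
only for the areas a Colonel asks about, and cached so that two Colonels
interested in the same area share one copy (the tile-keyed cache is an
illustrative assumption, not something the post prescribes):

    class DetailMapCache:
        def __init__(self, build_tile):
            self._build_tile = build_tile   # function (tile_x, tile_y) -> detailed map
            self._tiles = {}

        def get(self, tile_x, tile_y):
            # Build each detailed tile at most once; later requests,
            # possibly from a different Colonel, reuse the cached copy.
            key = (tile_x, tile_y)
            if key not in self._tiles:
                self._tiles[key] = self._build_tile(tile_x, tile_y)
            return self._tiles[key]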

Colonels need to find the method that will give the best chance of success
in their mission. They can use one of the methods mentioned under several names
in this thread, like 'flowing' through the influence map to determine
the shortest route, find the weakest spot, etc.
They can try to determine which tactics would have the best chance of
success.

If they 'know' (pre-programmed knowledge) that destroying a certain defended
installation can be done in one of two ways:
1. head-on attack
2. long-range artillery first - short-range attack later
each of these tactics would state some requirements.
(For instance, no. 2 would require that the Colonel can use its long-range units
to shell the destination while the other forces maneuver into position
to attack - it would need to calculate the approximate amount of time
it would take to carry this out, and it could also test the possibility of
using different balances.)
In short, the Colonel would have access to many generalized tactical
scripts and for each command would have to choose the one with the highest
probability of success.
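
One way to sketch that tactical-script selection in Python (the Tactic
interface and its method names are illustrative assumptions; the post does
not prescribe an implementation):

    class Tactic:
        """One generalized tactical script, e.g. 'head-on attack' or
        'long-range artillery first, short-range attack later'."""
        def feasible(self, colonel, target):
            raise NotImplementedError   # e.g. "do I have long-range units?"
        def success_estimate(self, colonel, target):
            raise NotImplementedError   # estimated probability, 0.0 .. 1.0

    def choose_tactic(colonel, target, tactics):
        # Keep only scripts whose requirements are met, then pick the
        # one with the highest estimated probability of success.
        candidates = [t for t in tactics if t.feasible(colonel, target)]
        if not candidates:
            return None
        return max(candidates, key=lambda t: t.success_estimate(colonel, target))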

Sergeants are simpler - they receive one of the basic commands from the Colonel
and distribute it among their units, basic commands like move, attack, stand,
and hold fire. All the lower-level stuff like changing direction, updating
position etc. is handled by the unit itself. The Sergeant may receive a command
to attack a group of units, and assign each of its units to one of the units in
the group. Most of the units in the group can be pretty dumb - the Sergeant can
be used to check out the surrounding area and adjust the movement orders if
necessary.
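
A sketch of a Sergeant distributing an "attack this group" command (the
greedy nearest-enemy pairing is an illustrative choice, not something
specified in the post):

    def assign_targets(my_units, enemy_group):
        # Pair each friendly unit with the closest enemy in the target
        # group; the units themselves handle steering and firing.
        orders = {}
        for unit in my_units:
            target = min(enemy_group,
                         key=lambda e: (e.x - unit.x) ** 2 + (e.y - unit.y) ** 2)
            orders[unit] = target
        return orders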

For instance, if they are being shot at while en route, it would be up to
the Sergeant to decide whether they should try to evade the attack and still
try to make it to their assigned destination, or dispatch some of the units
to take care of the immediate danger while the others continue on their
main mission.

This may sound complicated to do at every turn - but then I was not thinking
of a turn-based game in the usual sense, but of something more along the lines
of Dune II, which runs continuously.

What is left is determining how often each level of command should be updated.
Units that actually move should be continuously updated. The higher the level,
the less updating one needs; what the higher-level units should do is mostly
check on the progress of the current mission and on things that may need
immediate attention. This can be done both by using the maps and the reports.
(In my implementation I update the maps about every six program cycles; for
a turn-based game this sounds too slow, but my design was
a continuous-play game, and to prevent jerkiness I update one sixth of the
general map every turn.)
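
A sketch of that staggered update (the row-band slicing is an illustrative
choice; the post only says one sixth of the map is refreshed per cycle):

    def update_general_map(general_map, full_map, cycle, slices=6):
        # Refresh one horizontal band per program cycle, so a complete
        # refresh of the General's map takes `slices` cycles.
        rows = len(general_map)
        band = cycle % slices
        for y in range(band * rows // slices, (band + 1) * rows // slices):
            general_map[y] = list(full_map[y])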

Reports - just as commands flow down the command hierarchy, reports should
flow upwards. Information from reports can sometimes greatly enhance the
information received from control maps, letting each command level know
the exact status of the units one level below. Thus it can determine whether
its plans are being carried out successfully, whether it is necessary to call
on reserves or change the plan of action, and it can give commands at the proper time.
(This assumes flawless communications - it would be interesting to watch
what happens if we allow communications to falter.)

The AI should also try to guess where the enemy intends to attack and recognize
concentrations of force before they happen. This is possible by
extrapolating the movement vectors of groups of units. This isn't precise, but
it gives the AI a general idea of where the enemy might converge, and it could send
some forces to be in the vicinity, so they can at least slow down the
enemy forces.
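
A sketch of that extrapolation (the look-ahead horizon and clustering radius
are illustrative assumptions):

    def predicted_positions(groups, horizon=10):
        # Project each enemy group along its movement vector.
        return [(g.x + g.vx * horizon, g.y + g.vy * horizon) for g in groups]

    def convergence_points(groups, horizon=10, radius=5.0):
        # Rough clusters of predicted positions; forces sent near these
        # points can at least slow the enemy down.
        clusters = []
        for p in predicted_positions(groups, horizon):
            for c in clusters:
                if (c[0] - p[0]) ** 2 + (c[1] - p[1]) ** 2 < radius ** 2:
                    break   # close to an existing cluster; fold it in
            else:
                clusters.append(p)
        return clusters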

Uri Bruck

==============================================================================

Christopher Spencer wrote:

> I've read, with some interest, all of the proposals for wargame
>AI, and I haven't seen this technique postulated. This is the technique
>I intend to use for my Fantasy Wargame Engine.
>:
>:Lots of stuff
>:
I would prefer a General/Sergeant AI over this. The General AI should:


1. Look at the types of troops you have
2. Look at the types of troops the opponent has
3. Determine a strategy that best suits the troop types
4. Given the constraints of the geography, select the best setup and
determine objectives.

Types of troops I would break into three categories (see the sketch after
this list):

1. Mobile
2. Infantry or ground
3. Missile
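
A rough Python sketch of steps 1-3 above (the unit fields, thresholds, and
strategy names are illustrative assumptions; the historical examples below
motivate the choices):

    def composition(army):
        # Fraction of the army in each of the three categories.
        counts = {"mobile": 0, "ground": 0, "missile": 0}
        for unit in army:
            counts[unit.category] += 1
        total = max(1, len(army))
        return {k: v / total for k, v in counts.items()}

    def pick_strategy(mine, theirs):
        us, them = composition(mine), composition(theirs)
        if us["missile"] > them["missile"] and us["mobile"] < them["mobile"]:
            return "defend a chokepoint"     # funnel their mobile troops into our fire
        if us["mobile"] > them["mobile"]:
            return "harass from a distance"  # stay out of reach, wear them down
        return "hold level, clear ground"    # keep the battle line cohesive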

Let me cite some historical examples of why I think this would work best:
1. At Crecy, the English had superior missile strength and
inferior mobile troops. Neither force had much infantry. Now, if the
English had made their stand on a hill in open country, the French would
have been able to threaten all sides and then concentrate their forces on a
weak point. Instead, the English chose to defend a gap between woods,
which funneled the French mobile troops into the English fire and enabled
the English to set up a short defensive line to repel what French made
it through the fire.
2. During the era of phalanx combat, high ground was a disadvantage. Missile
fire at the time was not very deadly, and was more useful for disrupting
formations. The general's goal was to find an area where his phalanx
would be on level, clear ground while they were fighting, so that their
line would be the most cohesive.
3. When Mark Antony invaded Parthia, he had very little cavalry and the
Parthians had a lot of effective horse bowmen. They eventually forced
Mark Antony's retreat not by holding any ground but by taking advantage
of the fact that they could damage him from a distance and that the Romans were
not mobile enough to catch them.

You also have to consider the bigger picture. Let's say the battle is between
a small, mobile force and your larger force of infantry and artillery. Logic
would say that you should use your artillery to pin the opponent down, move up
infantry in a deliberate fashion to prevent the units from being disorganized
when the mobile units could possibly attack, then overwhelm the mobile units.
But what if the mobile units are holding a bridge and are soon going to be
relieved by an army much larger than yours? Now, you need a whole new attack
method so that you can destroy the bridge as quickly as possible.

Dennis W. Disney
disney@mcnc.org

==============================================================================

In article <3rsl65$ade@stingray.mcnc.org>,
disney@mcnc.org (Dennis W. Disney) wrote:
>Christopher Spencer wrote:
>> I've read, with some interest, all of the proposals for wargame
>>AI, and I haven't seen this technique postulated. This is the technique
>>I intend to use for my Fantasy Wargame Engine.


>>:
>>:Lots of stuff
>>:
>
>I would prefer a General/Sergeant AI over this. The General AI should:
>1. Look at the types of troops you have
>2. Look at the types of troops the opponent has
>3. Determine a strategy that best suits the troop types
>4. Given the constraints of the geography, select the best setup
> and determine objectives.
>
[lots of interesting and intelligent stuff deleted]

>You also have to consider the bigger picture. Let's say the battle is between
>a small, mobile force and your larger force of infantry and artillery. Logic
>would say that you should use your artillery to pin the opponent down, move up
>infantry in a deliberate fashion to prevent the units from being disorganized
>when the mobile units could possibly attack, then overwhelm the mobile units.
>But what if the mobile units are holding a bridge and are soon going to be
>relieved by an army much larger than yours? Now, you need a whole new attack
>method so that you can destroy the bridge as quickly as possible.
>
>Dennis W. Disney
>disney@mcnc.org

I'm not a programmer but I know a little about strategy. You might try for a
"personality" algorithm of some kind -- e.g. an aggressive, defensive, or
conservative algorithm which could be varied according to circumstances.

Two possible examples:

1 - The "Commanders" are allocated their algorithm at the start of the game and
make all moves accordingly.

2 - The "Commanders" "adopt" an algorithm according to the intelligence
"available" to them.

Re: 1 -- A commander with a conservative offensive algorithm would lose the
bridge in the example above.

Re: 2 --

One historical example that comes to mind is Montgomery's defeat at Arnhem
bridge. According to Cornelius Ryan, Montgomery's strategy was completely
"out of character". He made the attack in a daring move based on available
intelligence. In a game AI scenario this could be used as an example of a
personality template. The "Player", if he understands the weaknesses of a
particular opponent, can devise a strategy taking advantage of these
weaknesses -- the example of the bridge above could work this way. In other
words, the Player gets to "play against the man".
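
A minimal sketch of such a personality template (the option names and the
multipliers are illustrative assumptions):

    PERSONALITIES = {
        "aggressive":   {"attack": 1.4, "hold": 0.8, "retreat": 0.6},
        "conservative": {"attack": 0.7, "hold": 1.2, "retreat": 1.1},
        "defensive":    {"attack": 0.6, "hold": 1.4, "retreat": 1.0},
    }

    def choose_action(personality, base_scores):
        # base_scores maps each option to an objective estimate of its
        # value; the personality skews that estimate before choosing.
        weights = PERSONALITIES[personality]
        return max(base_scores, key=lambda a: base_scores[a] * weights.get(a, 1.0))

Given base scores of {"attack": 0.5, "hold": 0.6}, an aggressive commander
attacks (0.5 * 1.4 beats 0.6 * 0.8) while a conservative one holds -- and a
Player who learns an opponent's template can exploit it, as described above.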

This may be of no use to anyone, but the idea of a game with "personality
algorithms" affecting strategies is intriguing to me. I mean, would anyone
really consider the Charge of the Light Brigade good strategy? It was an act
of personality and ego -- and cost too many lives.

Owen Coughlan

PS, I don't want to argue about Montgomery, I'm just citing it as a tenuous
example.

==============================================================================

Dennis W. Disney (disney@mcnc.org) wrote:

: I would prefer a General/Sergeant AI over this. The General AI should:


: 1. Look at the types of troops you have
: 2. Look at the types of troops the opponent has
: 3. Determine a strategy that best suits the troop types
: 4. Given the constraints of the geography, select the best setup
:    and determine objectives.

: Types of troops I would break into three categories:


: 1. Mobile
: 2. Infantry or ground
: 3. Missile

Okay. Reasonable enough partitioning.

: Let me cite some historical examples of why I think this would work best:
: 1. At Crecy, the English had superior missile strength and
:    inferior mobile troops. Neither force had much infantry. Now, if the
:    English had made their stand on a hill in open country, the French would
:    have been able to threaten all sides and then concentrate their forces on a
:    weak point. Instead, the English chose to defend a gap between woods,
:    which funneled the French mobile troops into the English fire and enabled
:    the English to set up a short defensive line to repel what French made
:    it through the fire.
: 2. During the era of phalanx combat, high ground was a disadvantage. Missile
:    fire at the time was not very deadly, and was more useful for disrupting
:    formations. The general's goal was to find an area where his phalanx
:    would be on level, clear ground while they were fighting, so that their
:    line would be the most cohesive.
: 3. When Mark Antony invaded Parthia, he had very little cavalry and the
:    Parthians had a lot of effective horse bowmen. They eventually forced
:    Mark Antony's retreat not by holding any ground but by taking advantage
:    of the fact that they could damage him from a distance and that the Romans were
:    not mobile enough to catch them.

Good examples. We have perhaps been lax on using historical examples
while considering the feasibility of these 'theoretical' AIs.

: You also have to consider the bigger picture. Let's say the battle is between
: a small, mobile force and your larger force of infantry and artillery. Logic
: would say that you should use your artillery to pin the opponent down, move up
: infantry in a deliberate fashion to prevent the units from being disorganized

: when the mobile units could possibly attack, then overwhelm the mobile units.
: But what if the mobile units are holding a bridge and are soon going to be
: relieved by an army much larger than yours? Now, you need a whole new attack
: method so that you can destroy the bridge as quickly as possible.

This is the point I was trying to make earlier with regards to deciding
which units to use. Most of the algorithms we've discussed fail to 'weight'
their decision making based on what they're doing AT THE MOMENT. If already
engaged in battle, for example, they may very well be ABLE to run over and
nuke some isolated enemy unit, but they might be better off standing where
they are to finish off the unit or units they're presently engaged with.

Consideration of time in the equation is also a factor which we've
tended to overlook, as you point out. If we use an idea previously presented
regarding the computation of movement vectors, then the AI can use that
information to 'look ahead' and see that the enemy is maneuvering units
towards reinforcement of the bridge. That information could be used to
weight its decision making towards taking the bridge now while it can
vs. waiting for more reinforcements. The odds might be lower, but the
price of success LATER is so much higher that it may make sense to attack
now.
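
A sketch of that time-weighted comparison (the simple expected-value form
and the example numbers are illustrative assumptions):

    def attack_now(p_now, p_later, value, delay_penalty):
        # Attack immediately if the expected gain now beats the expected
        # gain after the enemy reinforces, when success costs much more.
        return p_now * value >= p_later * (value - delay_penalty)

    # e.g. attack_now(0.4, 0.7, 100, 60) is True: a 40% chance now
    # outweighs a 70% chance at a much higher price later.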

Steven


Application of Genetic Programming to the "Snake Game"


by Tobin Ehlis
http://www.rexall.com/tobin

Abstract: This paper describes the evolution of a genetic program to optimize a problem featuring task prioritization in a
dynamic, randomly updated environment. The specific problem approached is the "snake game" in which a snake confined to a
rectangular board attempts to avoid the walls and its own body while eating pieces of food. The problem is particularly interesting
because as the snake eats the food, its body grows, causing the space through which the snake can navigate to become more
confined. Furthermore, with each piece of food eaten, a new piece of food is generated in a random location in the playing field,
adding an element of uncertainty to the program. This paper will focus on the development and analysis of a successful function
set that will allow the evolution of a genetic program that causes the snake to eat the maximum possible pieces of food.

Introduction and Overview


Artificial intelligence (AI) techniques have been proven highly successful at the problems of navigation, task prioritization, and
problem avoidance. Traditionally, humans have encoded rule-based AIs to create the behaviors necessary to allow an automaton
to achieve a specific task or set of tasks. Genetic programming (GP), however, has been proven to allow a computer to create
human-competitive results. Specifically, examples such as the wall-following robot (Koza 1992) and Pac Man® (Koza 1992)
demonstrate the effectiveness of GP at evolving programs capable of navigation and task prioritization behaviors which are
competitive with human-produced results.
In an original approach to demonstrating the effectiveness of GP at producing human-competitive results, this paper describes the
evolution of a genetic program that can successfully achieve the maximum possible score in the "snake game." The problem
posed by the snake game is of particular interest for two main reasons. First, the size and shape of the area through which the
main game character, the snake, can move is constantly changing as the game progresses. Second, as the snake eats the single
available piece of food on the game board, a new piece is generated in a random location. Because of these two factors, the snake
game presents a unique challenge in developing a function and terminal set that allows GP to evolve an optimal solution
generalized across successive runs of the snake game.
The "Background" section of this paper outlines the rules and discusses the specific details of the "snake game." Next, "Statement
of the Problem" explains the problem being addressed by this paper. The "Methods" section provides the GP specifics of how the
problem was approached. The "Results" section gives numerous examples of results produced by the GP runs along with a
discussion and analysis of those results. The "Conclusion" section summarizes the ultimate results achieved by the paper. The
"Future Work" section discusses potential for further study in line with the work discussed in this paper. Finally, the "References"
section provides a bibliography for the paper.

Background
The "snake game" has been in existence for over a decade and seen incarnations on nearly every popular computing platform. The
game begins with a snake having a fixed number of body segments confined to a rectangular board. With each time step that
passes, the snake can either change direction to the right or left, or move forward. Hence the snake is always moving. Within the
game board there is always one piece of food available. If the snake is able to maneuver its head onto the food, its tail will then
grow by a single body segment and another piece of food will randomly appear in an open portion of the game board during the
next time step. The game ends when the snake’s head advances into a game square that is filled with either a snake body segment,
or a section of the wall surrounding the game board. From a task prioritization standpoint, then, the snake’s primary goal is to
avoid running into an occupied square. To the extent that this first priority is being achieved, its second priority is to pursue the
food.

The version of the game used for this paper, shown in figure 1, is a replica of the game as it currently exists on Nokia cell phones.
In this version, which is available for play online at www.nokia.com/snake, the game board is made up of 220 total squares, 20
horizontal and 11 vertical, and the food begins in position (11,6) on the game board, represented by a diamond in the figure. The
snake is initially made up of 9 body segments, occupying
positions (1,11)-(9,11) on the board, with the head in position
(9,11) and the snake moving to the right, represented by the
arrow in the figure. The maximum number of pieces of food that
can be eaten is the size of the game board minus the initial size
of the snake. With the given parameters, then, this equates to
220-9=211 pieces of food. This is because with each piece of
food eaten, the snake grows by a body segment, reducing the
amount of free space in which it can move. Hence when it has
eaten 211 pieces of food, its body will fill the entire game board,
rendering any further movement impossible. One critical piece
of information is whether or not it is even possible for the snake
to eat the maximum amount of food. Indeed it is conceivable
that after eating a certain amount of food, the snake will have
grown so large that it restricts itself from access to a portion of
the board. Upon close inspection, however, the reader will note
that by tracing certain patterns repeatedly over the board, it is
possible for the snake to cover every square exactly once and return to its initial position. One such pattern is shown in figure 2,
which features a snake of 210 body segments about to eat the final piece of food. Hence by continually tracing the pattern shown,
the snake can eat the maximum possible pieces of food.
By evolving a genetic program that successfully eats the maximum
amount of food, a human-competitive solution, in terms of score,
will have been obtained. With that in mind, there are some
important differences in the game when it is played by a
human as opposed to a computer-generated program.
For a human player, the fact that the snake is always moving
adds an element of pressure, forcing him/her to make decisions
in a timely manner. When using a computer to play the game,
this is not a concern, as the computer will have the time between
each move to parse through a program tree and determine the
next move. The nearest equivalent to "pressure" for a computer
is any limitation imposed on the size and depth of the genetic
program’s function tree. These limitations restrict the possible
number of decision trees that can be generated, thereby ensuring
that the computer will have a finite amount of time in which to
determine the next move for the snake. The particular function
tree limitations imposed for this problem will be discussed in the following "methods" section.

Statement of the Problem


The fundamental problem of the snake game is to eat the maximum number of food pieces before "dying" by running into either a
wall or a segment of the snake’s body. The problem being addressed in this paper is to provide a function and terminal set that
will allow for the evolution of a GP that will maximize the number of food pieces eaten by the snake. The maximum goal for the
particular configuration of the snake game used in this paper is 211 pieces of food.


Methods
Table 1 provides the tableau for the initial runs of the snake game. Following over twenty initial runs of the program, the
maximum score that had been achieved was 123 hits. As it was apparent that a maximum solution would not be obtained using
the initial function set, the function set was expanded to enhance the snake’s movement and environment sensing capabilities. For
the remainder of the paper, any GP runs performed with the function and terminal sets given in Table 1 will be referred to as a run
made with the "initial" function set. Any run made with the enhanced function set, which includes the complete initial function
set as a subset, will be referred to as having been made with the "final" function set. A discussion of both the initial and final
function sets follows.
Table 1. Tableau for Snake-Game Problem

    Objective:             Find a computer program that eats the maximum
                           possible pieces of food.
    Terminal set:          (forward), (left), (right)
    Function set:          ifFoodAhead, ifDangerAhead, ifDangerRight,
                           ifDangerLeft, progn2
    Fitness cases:         One fitness case.
    Raw fitness:           Pieces of food eaten.
    Standardized fitness:  Maximum possible pieces of food eaten (211)
                           minus the raw fitness.
    Hits:                  Total pieces of food eaten during a run of the
                           program; same as raw fitness.
    Wrapper:               None.
    Parameters:            M = 10000. G = 500.
    Success predicate:     A program scores 211 hits.

Terminals: The terminal set chosen for the problem was right, left, and forward. Each terminal was a macro that would cause the
snake to take the corresponding action during a time step as follows:
Right: the snake would change its current direction, making a move to the right
Left: the snake would change its current direction, making a move to the left
Forward: the snake would maintain its current direction, and move forward. This is the same as a no-op, as the snake must make
a move during each time step.
These three terminals represent the minimal terminal set with which the snake can effectively navigate its surroundings. While
some problems consisting of navigation in a two-dimensional grid can be successfully navigated with only one direction-changing
terminal, that is impractical for the snake game: because the game board is enclosed and the snake has an extended, impassable
body, the snake must be able to turn in either direction to avoid death. More advanced terminals, such as one moving the snake
along the shortest path to the food, were not implemented. Rather, the function set was constructed so that the GP could evolve
the necessary capabilities to achieve the maximum score.
Functions: Initially the snake was given very limited functionality. One function gave it information about the location of the
food, three other functions gave it information about any immediately accessible danger, and progn2 was provided as connective
"glue" to allow a function tree to make multiple moves in a single pass. All functions were implemented as macros of arity two,
and therefore would only execute one of their arguments depending on the current state of the game, except for progn2, which
would execute both of its arguments. Even though no expressions evolved from this initial function and terminal set were able to
achieve the optimum score of 211 pieces of food, this set served as a baseline by which to evaluate progress and determine
enhancements that would lead to the eventual optimal solution. Following is a description of the initial function set:
ifFoodAhead: If there is food in line with the snake’s current direction, this function will execute its first argument, otherwise it
will execute the second argument. This was the only initial function that gave the snake information beyond its immediate
surroundings.
ifDangerAhead: If the game square immediately in front of the snake is occupied with either a snake body segment or the wall,
this function will execute its first argument, otherwise it will execute its second argument.
ifDangerRight: If the game square immediately to the right of the snake is occupied with either a snake body segment or the wall,
this function will execute its first argument, otherwise it will execute its second argument.
ifDangerLeft: If the game square immediately to the left of the snake is occupied with either a snake body segment or the wall,
this function will execute its first argument, otherwise it will execute its second argument.
progn2: This is a connectivity function that will first execute its right argument, then its left. It is the only function that allows
execution of more than one terminal in a single parse of the function tree. Although this function will always execute both of its
arguments, it was necessary to implement it as a macro because of the way that the software used to make GP runs, Dave’s
Genetic Programming in C (DGPC), evaluated functions vs. macros. To avoid unnecessary modification of DGPC, implementing
progn2 as a macro proved the simplest option.
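
As an illustration of how such a function tree might be interpreted, here is
a minimal Python sketch (the nested-tuple tree representation and the Game
interface are assumptions made for this sketch; the paper's actual runs used
DGPC, not this code):

    def evaluate(node, game):
        # Terminals perform a move; functions are arity-2 conditionals,
        # except progn2, which executes both of its arguments.
        if node == "forward":
            game.move_forward()
        elif node == "left":
            game.turn_left()
        elif node == "right":
            game.turn_right()
        else:
            op, a, b = node
            if op == "progn2":
                evaluate(a, game)
                evaluate(b, game)
            elif op == "ifFoodAhead":
                evaluate(a if game.food_ahead() else b, game)
            elif op == "ifDangerAhead":
                evaluate(a if game.danger_ahead() else b, game)
            elif op == "ifDangerRight":
                evaluate(a if game.danger_right() else b, game)
            elif op == "ifDangerLeft":
                evaluate(a if game.danger_left() else b, game)

    # e.g. the 7-point schema discussed under "Schemata" below:
    # tree = ("ifFoodAhead", ("ifDangerAhead", "right", "forward"),
    #                        ("ifDangerRight", "left", "right"))

The final function set's additional conditionals (ifDangerTwoAhead, the
ifFood* and ifMoving* functions) would be handled the same way.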
As mentioned previously, no GP runs performed with the initial function set were able to score greater than 123 hits. In order to
increase the probability of evolving a function tree capable of achieving the maximum number of hits, the initial function set was
enhanced. Functions were added to extend the snake's capabilities for detecting food and danger, as well as functions that were
conditional on the snake’s current movement direction. Following is a discussion of the additional functions that, along with the
initial function set, make up the final function set.
Additional Functions, all of arity 2:
ifDangerTwoAhead: If the game square two spaces immediately in front of the snake is occupied by either the wall or a segment
of the snake’s body, this function will execute the first parameter, otherwise it will execute the second.
ifFoodUp: If the current piece of food on the board is closer to the top of the game board than the snake’s head, then the first
parameter of this function will be executed, otherwise the second parameter will be executed.
ifFoodRight: If the current piece of food on the board is further to the right of the game board than the snake’s head, then the first
parameter of this function will be executed, otherwise the second parameter will be executed.
ifMovingRight: If the snake is moving right, then the first parameter of this function will be executed, otherwise the second
parameter will be executed.
ifMovingLeft: If the snake is moving left, then the first parameter of this function will be executed, otherwise the second
parameter will be executed.
ifMovingUp: If the snake is moving upward, then the first parameter of this function will be executed, otherwise the second
parameter will be executed.
ifMovingDown: If the snake is moving downward, then the first parameter of this function will be executed, otherwise the second
parameter will be executed.
There are two characteristics of the final function set that should be given special attention. First, note that the "ifFoodUp" and
"ifFoodRight" functions are direction independent, meaning that the direction in which the snake is moving has no impact on the
function’s behavior. This is in contrast to the initial set of functions, such as "ifDangerAhead", in which the direction that the
snake was traveling would have an impact on the return value of the function. The reason for the difference is to maintain
simplicity in the function set. The snake can potentially be surrounded by danger, but there will only be one piece of food on the
board at any one time. If the "ifDanger*" functions were direction-independent, then two significant complexities would be added
to the problem.
1. An additional function would be required, as there would need to be one for all cardinal directions in order to account for
all possible surrounding dangers. An added downfall of this complexity is that one of the "ifDanger*" functions will be
virtually meaningless depending on the direction of the snake's travel, since the snake's neck segment adjacent to the snake's
head is always an adjacent danger, although not one of any consequence to the snake, since it is unable to move back on
itself.


2. Anytime an "ifDanger*" function was used, it would need the aid of a helper function, such as the new "ifMoving*"
functions in order to make intelligent moves based on an assessment of the danger.
Taking the second complexity into account, the reader may now note that the same disadvantage is true of the two new functions,
"ifFoodUp" and "ifFoodRight." Indeed this is true, but an important difference between the role of food and the role of danger in
the game makes for a worthwhile tradeoff. The difference is that there will only be one piece of food on the board at any time.
This allows the new "ifFood*" functions to serve as two functions each. To clarify, consider the ifFoodUp function. When not
true, it is indicating that the food is either down, or on the same horizontal plane as the snake’s head. Now consider a hypothetical
"ifDangerUp" function. If this function were not true, it would tell nothing about whether or not danger is down, because it can be
anywhere simultaneously. Likewise it would not even tell whether existing danger that was "up" posed an immediate threat to the
snake, as the further information of the snake’s current moving direction would need to be known, as discussed earlier. For the
second special characteristic of the new functions, consider the new "ifMoving*" functions. These functions can be used as helper
functions with the two new "ifFood*" functions to create beneficial schemata.
As an example of a beneficial schema, consider "ifFoodUp(ifMovingRight(left, ifMovingUp(fwd, right)))", which will orient
the snake to pursue food that is upward. As will be seen in the results section, not only does the GP learn how to use these
functions in conjunction with the two new "ifFood*" functions, but they also prove useful in helping the snake discover patterns
that greatly extend its life. Discussion of other schemata is given below in the description of schemata, and specific examples are
given in the "Results" section.
Fitness Cases: For initial runs of the problem, only a single fitness case was used to determine the fitness for each individual.
Because the food placement is random both during a single run, and from one run to another, occasionally individuals would
score a number of hits because of fortuitous placement of the food, and not as much on the merit of their function tree.
To better ensure that the most successful individuals achieved high fitness measures primarily on the basis of their function tree,
new GP runs were often made featuring a "primed" population in which the fitness was measured as the average of four runs of
an individual. The procedure for this is as follows: once a run had completed without obtaining a solution, or if a run had stalled
on a single individual for a large number (100 or more) of generations, a new run was begun with this final individual as one of
the initial individuals. For this new run, however, the fitness was taken as the average fitness of an individual over four runs
instead of merely a single run. The averaging of the fitness over four runs helped eliminate the possibility of an individual having
a high fitness due simply to lucky placement of the food. This averaging method of determining fitness was only used in
primed populations because it increased the time of a GP run fourfold. Furthermore, it was common for the generations that timed
out to feature an individual who had scored a high fitness as a result of a lucky run. By beginning a new run with this individual
in the initial population, it not only assured a more realistic fitness measure, but it introduced an entirely new mix of randomly
generated schemata that could potentially benefit the stalled individual. Details of results produced by primed runs are given in
the results section.
Fitness Measure: The fitness measure used is the maximum possible pieces of food eaten, 211, minus the actual number of
pieces of food eaten. Furthermore, if the snake was unsuccessful at eating any food the fitness would be penalized by the number
of board squares that it was from the food. This additional criterion was added to favor individuals who moved toward the food in
early generations of snakes that were unable to eat any food.
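
A short sketch of this fitness computation, including the four-run averaging
used for primed runs (the helper names are assumptions of the sketch):

    MAX_FOOD = 211

    def standardized_fitness(food_eaten, final_distance_to_food):
        fitness = MAX_FOOD - food_eaten
        if food_eaten == 0:
            # Penalize by board distance so early snakes that at least
            # approach the food are favored.
            fitness += final_distance_to_food
        return fitness

    def primed_fitness(run_individual, individual, runs=4):
        # Average over several runs to damp lucky food placement.
        return sum(run_individual(individual) for _ in range(runs)) / runs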
Parameters: Population was set to 10000. The maximum number of generations was set to 500. The size of a function tree was
limited to 150 points. These parameters were chosen mainly based on available computer resources, covered in computer
equipment and run-time explanation below.
Designating a result and criterion for terminating a run: The best of generation individual will be the one that is able to eat
the most pieces of food. A run will end when one of three termination criteria are met:
1. The snake runs into a section of the game board occupied by a wall

2. The snake runs into a section of the game board occupied by a segment of the snake’s body

3. The number of moves made by the snake exceeds a set limit. This limit was set to 300, slightly larger than the size of the
game board. This will prevent a snake from wandering aimlessly around a small portion of the board.
The reader may note that there is no termination criterion for the completely successful snake. That is because upon eating the
final piece of food, the snake's tail will grow onto its head, causing it to satisfy termination criterion 2 above. Hence even the
optimal solution will end in death for the snake.
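
The run loop implied by these criteria might look like the following sketch
(the Game interface is an assumption carried over from the earlier evaluate()
sketch):

    MOVE_LIMIT = 300   # slightly larger than the 220-square board

    def score_individual(tree, game):
        # Criteria 1 and 2 are both 'crashes'; criterion 3 is the move cap.
        while not game.crashed() and game.moves < MOVE_LIMIT:
            evaluate(tree, game)   # one parse of the tree; may make several moves
        return game.food_eaten     # hits (raw fitness)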
Crossover, mutation rates: Crossover of nodes was the primary genetic operator employed during the GP runs. The crossover
fraction for leaves was set to .10; the crossover fraction for a node was set to .80; the mutation fraction was set to 0. Additionally,
primed GP runs were used to improve genetic diversity, as described above in the description of fitness cases.
Computer equipment and run time: The majority of the computer runs were performed on a 550MHz Intel® Celeron Processor
running Microsoft® Windows 98 SE Operating System. The software used was Version 2.0 of Dave’s Genetic Programming In
C, and Microsoft® Visual C++ 5.0. In addition, a stand-alone simulation of the snake game was created that was able to read in
the function trees produced by DGPC and graphically display a run of a particular function tree. This utility proved invaluable, as
it provided a fast, visual method to determine the overall optimization strategy represented by the function tree. The alternative of
hand-evaluating each function tree would have proven not only more time consuming, but much less conclusive. A complete run
of 500 generations took around 20 hours to complete. Because of the length of time for each run, many runs were farmed out to
separate computers, all with approximately equivalent computer power.
Schemata: Given the initial function set, there were a few highly desirable sub-tree schemata that could be produced. First,
considering a minimal sub-tree of 3 points, any sub-tree that would evade impending danger by changing directions is certainly
the key to survival of an individual. One such sub-tree is "ifDangerAhead(right, forward)." Secondly, a basic sub-tree that will
avoid changing directions into impending danger is purely beneficial to an individual. One example is "ifDangerRight(forward,
right)." The reader will note that anytime a change in direction is about to be undertaken, it would be wise to have such a check
before making the move. Thirdly, a 3-pointed sub-tree that aims at pursuing the food, and modifying directions if no food is
ahead, is required to give the individual more than a random opportunity to eat the food pieces. One such individual is
"ifFoodAhead(forward, right)."
As explained previously, the "ifFoodAhead" function will return true for a piece of food any number of squares in front of the
snake. Therefore, in addition to seeking the food, it would also be desirable for the individual to continually scan for impending
danger while the food is being sought. Hence a final example of a desirable schema is any combination of the above three
examples that effectively combines the goals of each. For example, consider the following function tree of 7 points:
ifFoodAhead(ifDangerAhead(right, forward), ifDangerRight(left, right)). This schema will cause the snake to pursue food ahead
as long as no immediate threat is observed. If, however, there is a threat or no food ahead, the sub-tree will cause the individual to
change direction avoiding any observed danger, or pursuing a new vector to find food. Specific examples of the emergence of
such schemata will be given in the results section.
In addition to the potential beneficial schemata, touched on above, there are also "detrimental" schemata. The detrimental
schemata would be any function branch whose primary goal is to either seek danger or avoid food. Examples of detrimental
schemata are essentially the converse of the previously outlined beneficial schemata, and their further consideration is left to the
reader.
Certainly not all schemata are strictly beneficial or detrimental; those that are neither will be called "neutral schemata."
Consider, for example, the simple subtree "ifDangerRight(left, forward)." This function will turn left if danger is present to the
right, and continue forward otherwise. This schema makes either a left or forward move without having any apparent knowledge
of what lies in those directions. This could certainly prove to be detrimental, but the move to the left when danger is right is at
least avoiding the danger to the right. Schemata such as this can actually prove beneficial when placed in the context of a
complete function tree. An examination of actual schemata produced during the GP runs in question follows in the results section.

Results
As mentioned in the methods section, there were three types of GP runs made in an attempt to evolve a solution to the snake
game: runs using the initial function set, the final function set, and primed runs, also using the final function set. The highest
number of hits generated by a run using the initial function set was 123. Three separate solutions were generated using the final
function set, although none of them were found to consistently generate a solution. The number of hits achieved by each solution
depended on the placement of the food. It was not until the method of "priming" a run, described in the methods section, was used
that a consistent solution was generated. Of ten primed runs, using various initial seeds, exactly five of them evolved a solution,
all of which were consistent solutions over multiple runs.
Comparatively, over twenty runs using the full function set were
made, and only three of them produced solutions, none consistent.
A summary of the overall results achieved in each type of run is
given in figure 3. Each line on the graph is the average of ten runs.
Note that the initial and final function sets produce a roughly
equivalent maximum number of hits until about generation fifty.
At this point the final function set continues to improve while the
initial function set levels off. By generation 200, the initial
function set has virtually no improvement, while the final function
set continues improving past generation 400. Because the final
function set is both more complete and larger, new and more
successful individuals continue to evolve while individuals
produced by the initial function set max out around 100 hits.
Another feature to note in figure 3 is the impressive results
achieved by the primed runs. All primed runs were begun with an
individual from a final function set run who had achieved at least 150 hits. When taken as the average over four runs, however,
these individuals are only able to achieve about 50 hits, as shown in the first generation of the primed runs. These individuals
jumpstart the population to great success, and by generation 25 the maximum number of hits has more than tripled to around 160.
By generation 150 the primed runs level off to about 200 hits. Following is an evaluation of some of the most prominent
strategies evolved during the various GP runs. Specific examples of individuals from each type of run are presented and analyzed,
and all function trees are reduced for simplicity sake.
Zig-zagger: One strategy that was prevalent in individuals across multiple runs is what will be referred to as the "zig-zagger."
These individuals would trace the board diagonally in a stair-stepper pattern until they either reached a wall or had lined the
direction of their movement up with the food. Upon reaching a wall, they would change their direction as if bouncing off of the
wall, and continue diagonally tracing the board in a new direction. If they were successful in aligning their movement with the
piece of food, they would typically head directly toward the food, perhaps avoiding danger depending on the particular individual.
Variations in zig-zaggers occurred between which directions they would head when hitting the wall, how often they would seek
the food, and how they would react in enclosed situations, such as corners or heading towards food that was blocked by their
body. Obviously the more successful individuals evolved traits that allowed them to avoid danger in close quarters and dodge
their body when it blocked progress toward the food. One example of a zig-zagger, who was able to score a maximum of 33 hits
in one particular run, is given below:

(ifFoodAhead (ifDangerLeft (right )(ifDangerRight (1forward )(ifDangerAhead (left )(


1forward ))))
(ifDangerAhead (ifDangerLeft (right )(left ))
(ifDangerLeft (ifDangerRight (forward )( 2right ))
(progn2 (left )(right )))))

Consider initially the rightmost sub-tree of the function tree, which is given on the last line as progn2 (left)(right). This is the
branch executed initially and for the majority of this zig-zagger’s run. When executed repeatedly, this sub-tree will cause the
snake to move left then right, progressing diagonally across the board. For this example, the sub-tree is executed whenever there
is no food ahead of the snake’s line of movement, and there is no danger in front of or to the left of the snake’s head. This
continuous zig-zagging motion allows the snake to examine successive rows or columns of the board in search of the food.
Because both branches of the progn2 are executed before returning to the beginning of the function tree, however, the snake will
only detect the food if the second argument of the progn2, right, leaves the snake’s head in line with the food.

Once the food is directly in line with the movement of the
snake’s head, the left-hand sub-tree, given on the first line
above, is executed. As noted with a "1" above, the snake will
continue forward if there is no danger to the left, and either there
is danger to the right, or there is no danger to the right or ahead.
Unfortunately for the snake, if there is no danger to the left, but
danger to the right and ahead, this function tree will lead it
directly into the danger ahead, noted with the first "1" above.
This is exactly what happened to the snake in figure 4, shown
one time step before its demise after having eaten 24 pieces of
food. This snake, whose head is in (14,5), began moving towards
the food in (4,5) after having released from the wall.
Finally, note that when the "right" portion of the "progn2"
sub-tree causes the snake to be either facing or next to a wall, the
sub-trees on the second and third line above will be executed
respectively. Further investigation reveals that each of these
sub-trees will cause the snake to move away from the wall in a direction that avoids danger, even in corners. In this fashion, the
snake appears to "bounce" from the walls and proceed to zig-zag in an alternate direction. Two examples of this are seen at
positions (17,1) and (20,5) of figure 4. In both of these cases the snake made the right turn noted with a "2" above in order to
avoid the wall.
Wall-slitherer: The strategy that scored the highest out of all individuals using the initial function and terminal set is what will be
referred to as the "wall-slitherer." These individuals would follow along the wall, not simply moving forward, but rather slithering
back and forth between the two squares closest to the wall. Once able to align its head with the food, the individual would move
away from the wall in a straight line to obtain the food. Then, when the food was eaten, successful wall-slitherers would either
double-back along their own body and head for the wall or head in a random direction toward a wall. Variations on wall-slitherers
occurred in the direction they would take around the wall and when they would leave the wall to pursue the food. One highly
successful wall-slitherer is shown below. This individual scored a maximum of 107 hits in one particular run, and an evaluation
of its important characteristics follows:

(ifFoodAhead (ifDangerAhead (left )(forward ))


(1ifDangerAhead (ifDangerRight (left )(progn2 (right )
(ifFoodAhead (ifDangerRight (forward )(right )))(ifDangerRight (forward
)(right ))))
(2ifDangerRight (ifDangerLeft (forward )(left ))
(3ifDangerLeft (right )
(4progn2 (left )(ifFoodAhead (ifDangerLeft (right )(left ))(ifDangerRight
(left )
( 5progn2 (ifDangerAhead (right )(ifDangerLeft (right )(left )))
(ifDangerRight (forward )(right ))))))))))

In evaluating this individual, first consider the root, which consists of the "ifFoodAhead" function. For any case in which there is
food ahead, the very simple left sub-tree is executed. This subtree simply checks for danger ahead and attempts to avoid it to the
left if present, otherwise the snake will continue along its current movement path towards the food. While this sub-tree proves
both simple and effective, the fact is clear that the individual spends the majority of its run without the food immediately ahead,
which is handled by the much larger right-hand sub-tree.
While it appears much more complicated than the left-hand sub-tree, the fundamental strategy of the right-hand sub-tree is to
avoid danger. This strategy is executed impressively by the three different "ifDanger*" functions, noted with 1, 2, and 3. These
functions provide the roots for the three sub-trees along the right-hand side of the main function tree. The reader can verify that
each of these three sub-trees contains schemata that are highly effective at avoiding any impending danger to the snake. Having
already taken precautions to pursue food and avoid danger, the final sub-tree provides the snake with its wall-slithering motion, in
which it spends the majority of its time.
The final sub-tree, noted with a "4" above, is rooted with a progn2. This indicates that multiple actions will be carried out every
time this sub-tree is reached, which proves to be very frequently. Initially the branch will make a move to the left, which is
already known to be safe. Following this move, if there is no food ahead and no danger to the right, then the second progn2, noted
with a "5", is reached, making for a total of three moves to be executed on this single pass of the function tree. This three-move
sequence is both common and highly beneficial to the success of this wall-slitherer. In figure 5, note the snake’s body segment at
(19,8). At this point in the past, the snake was facing downward and a new parse of the function tree was beginning. As no danger
was immediately present, sub-tree 4 was reached and the snake turned left towards the wall. Needing to complete the second
argument of the progn2, and with no food ahead or danger to the right, the progn2 at 5 was reached, which caused the snake to
turn right twice, leading to the next parse of the function tree. Looking back in history through the illustration, note that the same
pattern was carried out at points (19,6), (19,4), (19,2), and numerous other times in the brief portion of the snake's run
demonstrated here. This repeated slithering pattern served to maximize the amount of ground covered by the snake while
minimizing the danger that its body would pose to itself. A time step prior to a fatal flaw in the snake’s movement, however, is
illustrated.
As the food is in front of the snake’s head, the simple sub-tree
on the left is entered. Since there is danger ahead of the snake, it
will simply turn left. As shown in the illustration, this turn will
lead to the snake’s death, as it hits it’s own body after having
eaten 61 pieces of food. While it may seem surprising that this
flaw in the left-hand function tree was not encountered sooner,
the snake survives by keeping its body along the walls as much
as possible. In the illustration, it is clear that the snake left the
wall 57 time steps earlier in order to pursue a piece of food
across the board. Once the food was eaten, the snake resumed its
slither pattern clockwise around the edge of the board.
Unfortunately, its body had grown so long that by the time its
head was in line with the food at position (19,9) its body was
still blocking its path to the food. The snake's evasion tactic of
going to the left when danger is encountered with the food ahead
had saved it in previous similar situations because once a single
successful left was made, the snake was no longer in line with
the food and it would continue any necessary evasive maneuvers via the much more robust sub-trees, 1, 2, and 3. In this final,
fatal case, however, the combination of the snake’s long body, its previous cross-board pursuit of the food, and the placement of
the next piece of food three board squares off of the wall caused the evasive left to lead the snake directly into its own body.
Circler: After the function set was enhanced to include the further food-sensing capabilities of "ifFoodUp" and "ifFoodRight" as
well as the four "ifMoving*" functions, a new strategy of behavior that evolved is what will be referred to as the "circler." These
individuals would follow along the outside of the wall in a circular pattern and only leave the wall to get the food. Once they
reached the food they would continue forward until they reached the wall, then they’d start to circle again. Typically they would
only attempt to eat the food while moving in one particular direction. While similar to the wall-slitherer, they differ in two key
ways. The first is that the circler will always remain directly next to the wall and not move back and forth like the wall-slitherer.
The second is that the circler will typically only leave the wall while headed in one direction. Both of these differences are a
direct result of the new functions. Before further discussion, consider the following circler, who scored a maximum of 80 hits
over a single run:

(ifDangerAhead (ifDangerTwoAhead (ifDangerLeft (right )(left ))


(ifFoodUp (ifDangerLeft (right )(2left ))(forward )))
(ifMovingUp (ifDangerRight (ifFoodUp (forward )(3left ))(right ))
(1forward )))

First note the leftmost branch of the function tree, in which the snake will primarily avoid danger to both the front and the left.
Certainly the left-hand sub-tree, though simple, proves highly effective at achieving the snake’s primary goal of avoiding danger.
Secondly, take note of the right-hand sub-tree, which is parsed whenever danger is not immediately ahead of the snake. If the
snake is not moving upwards, it simply continues forward, which is already known to be a safe move. This proves to be the move
that snake most commonly makes. If, however, the snake is moving upwards, and there is danger to the right, then it will turn left
as soon as the food is no longer above it. The primary moves of this snake, then, are to continue forward around the outside of the
board until either there is danger ahead and it turns left, or the snake is moving upwards and there is food to the left, when it turns
left. Note that these moves are marked 1, 2, 3 respectively in the function tree above. Hence when seen in action the snake will
make a counterclockwise circular motion around the outside of the board with the top of the circle determined by the current
piece of food.
Pattern Following Solution: As a final example of an evolved strategy, an individual that was able to score the maximum
number of hits, 211, will be considered. All individuals who were able to score the maximum number of hits demonstrated some
pattern similar to that shown in figure 2. All of these individuals took little to no consideration of where the food was on the
board, but rather followed a set pattern that would cover the entire board, eventually causing them to eat the food. Furthermore,
the pattern they followed would be continuous, meaning that their head would eventually reach its original starting position,
allowing the pattern to continue indefinitely. One such pattern follower, produced in generation 27 of a "primed" run, is given
below:

(ifDangerRight (ifDangerAhead (ifDangerTwoAhead (8left )(forward ))


(ifMovingRight (6left )( 4,7forward )))
(ifDangerAhead (ifDangerLeft (ifFoodUp (right )(right ))(ifDangerTwoAhead (left
)(forward )))
(ifMovingUp (ifDangerTwoAhead (ifFoodAhead (ifMovingRight (3right )( 9forward ))
(ifMovingDown (left )( 2right )))
(1progn2 (forward )(forward )))
(ifMovingRight (right )(forward )))))

This individual followed a pattern exactly the same as that shown in figure 2. There were only a few minor deviations from the
pattern that would occur during very infrequent states of the game board. Before considering any such deviations an examination
of the major pattern following steps will be made.
The overall pattern followed by the individual above is as follows, with the movement steps noted by superscripts on the
individual. To simplify the analysis consider that the snake has already eaten enough food to be as long as the board is high, 11
segments, and that the snake is currently moving upward with its head at position (2,10) of the board:
1. While moving upward, if there is not danger two ahead, move forward twice.
2. Once there is danger two ahead, turn right; snake now moving right one row from the top of the board.
3. Turn right again, to begin heading downward.
4. Continue moving downward until there is danger directly ahead.
5. Once there is danger ahead, turn left; snake now moving right at the bottom of the board.
6. Turn left again and return to step one until there is danger to the right of the snake.
7. Danger to the right indicates the final right-hand column, so the snake now moves up until there is danger one ahead.
8. Once there is danger ahead, turn left to follow the top row of the board (4,7) while moving left; repeat this same step to
move down the left-hand side of the board, and when the bottom of the board is reached, return to step 5.
While it is clear that by repeatedly following this pattern the snake will continually trace the whole board, causing it to eat at least
one piece of food on each pass of the board, there is one notable exception from the pattern that is made whenever the food is in
the top row of the board and the snake is moving upward toward it. In this rare case, when step 2 of the pattern is reached, rather
than turning right, the snake will continue forward to eat the food, as noted with a "9" in the function tree. When this case occurs
the snake will resume the pattern to the right following its consumption of the food. If, however, this case occurs too far to the
right and the snake's body is long enough, the snake can trap itself on the right side of the board, causing it to die. This is the only way that the individual shown above can fail to eat all 211 pieces of food.


Conclusion
This paper has presented the development and evaluation of a function set capable of evolving an optimal solution to the snake
game. An initial function set was presented and evaluated, but proved unsuccessful at evolving an optimal solution. The initial
function set was then expanded upon to create the successful final function set, and consistently optimal solutions were generated
using primed GP runs. A comparison was made of the results achieved by each function set, as well as by the primed GP runs.
Examples of commonly evolved strategies were presented and evaluated, and a final analysis of a consistently successful optimal
solution was given.

Future Work
The work presented in this paper provides innumerable opportunities for further investigation into the evolution of a task
prioritization scheme within a dynamically changing, randomly updated environment. Specific to the snake problem,
modifications can be made to create completely new and interesting problems, such as a non-rectangular game board, obstacles
within the game board, or multiple pieces of food. Multiple snakes could be co-evolved to competitively pursue the food. The
function set could be modified to feature enhanced detection capabilities and more advanced navigational options. The techniques
used for navigating the snake could be generalized to apply to various other problems of interest. Possibilities include automated
navigation of multiple robots through a crowded workspace, an automaton for tracking fleeing police suspects through harsh
environments, or a control scheme for an exploratory vehicle seeking a particular goal on a harsh alien planet. The possibilities
are only limited by the imagination.

References
Koza, John R. 1992. Genetic Programming: On the Programming of Computers by Means of Natural Selection. Cambridge,
Massachusetts: The MIT Press.

Date this article was posted to GameDev.net: 8/10/2000





GA Playground - Java Genetic Algorithms Toolkit

A general GA toolkit implemented in Java, for experimenting with genetic algorithms and
handling optimization problems

Contents
The GAA Applet/Application
● Overview
● Browser Requirements and Loading Times
● General Notes
❍ Alphabet
❍ Problem Definition - Definition Files
❍ Problem Definition - Source Modifications
❍ Special GA Mechanisms
■ Automatic 'Kick'
■ Kin-competition compensating factor
■ Memory
■ Pre & Post Breed Functions
❍ Interactively (on-the-fly) defined functions
❍ Continuous Reporting
❍ Graphic Display
❍ User-Initiated Logging
❍ File Input/Output
■ Application Mode IO
■ Applet Mode IO
❍ Online Help
❍ Documentation


● Download and Installation


❍ Applet Mode
❍ Application Mode
❍ Classpath (Application mode)
❍ Application Batch File
❍ Javadoc Files
● Limitations and Request for feedback

Examples and Test Problems


● General
● Important Notes
● List of demo problems
❍ Multiple Problems Applets
■ All demo problems
❍ TSP
■ TspDemo - All cities on a circle
■ TspBayg29 - 29 cities in Bavaria
■ TspAtt48 - 48 capitals of the US
❍ Knapsack Problems
■ A single knapsack problem
■ A Multiple knapsack problem (Weing1)
■ A Multiple knapsack problem (Weish01)
❍ Bin Packing Problems
■ Binpack1 u120_00 (120 Objects)
■ Binpack5 t60_19 (60 Objects in triplets)
❍ Facility Allocation
■ Steiner - All cities on a circle
■ Steiner - Cities coordinates read from a file
❍ Multi-Modal Functions
■ Ackley's function
■ Rosenbrock's function
■ Schwefel's function
■ Rastrigin's function
■ Griewank's function


❍ Simple Function Optimization


■ SphereModel
■ Single Variable Minimization
■ Multiple Variable Minimization
■ Simpleton - A trivial 10 variables problem

Overview
The GA Playground is a general purpose genetic algorithm toolkit where the user can define and run his
own optimization problems. The toolkit is implemented in the Java language, and requires (when used as
an application, in its full mode), a Java compiler and a very basic programming knowledge (just enough
for coding a fitness function). Defining a problem consists of creating an Ascii definition file in a format
similar to Windows Ini files, and modifying the fitness function in the GaaFunction source file. In
addition, other methods can (optionally) be overwritten (e.g. the drawing method), other classes can be
extended or replaced, and additional input can be supplied through Ascii files.
The GA Playground is primarily designed to be used as an application and not as an applet, since it
requires re-compiling of at least one class and use of local file I/O. In addition, it is a little heavy as an
applet, taking a relatively long loading time over the net. However, although its use as an applet does not
enable defining new problems with new fitness functions, it enables extensive playing with many
variations of an already existing problem type, by opening every aspect of the problem definition to the
user. For example, any TSP test problem can be loaded through the 'Parameters' module. Used as an
applet, the toolkit takes advantage of the Java cross-platform nature and the cross-world nature of the
Internet, to bring a GA Playground to anyone interested in experimenting with genetic algorithms.

Browser Requirements and Loading Times


The applet is written in JDK 1.1.5 and uses the new event model. Therefore it requires a browser that
supports AWT 1.1. Currently (June 1998) this means MSIE 4.01, Netscape Communicator 4.05 Preview
Release (The regular Communicator 4.05 supports only version 1.1.2) or Netscape Communicator 4.0+
with the JDK 1.1 patch. Other options are HotJava 1.1 or using the Java Plugin.
Download sites (all are free):
● Netscape Communicator 4.05 Preview Release

● Microsoft Internet Explorer 4.01


● Sun's Java Plugin
● Hot Java

The applet is large and takes a relatively long time to load. It uses four jar files, the first is about 100K
and the other three about 50K each. Please be patient until the loading process is finished. When the GA
Playground is used as an application (the program's natural mode), and loaded locally and not from the
Internet, the loading time is obviously very short.


General Notes
Alphabet

The implementation of the genetic algorithm uses a high alphabet to encode the chromosome's genes. In
this implementation, each locus on the chromosome stands for a complete gene or variable. Since Java
uses Unicode internally, the available range for the alleles of each gene is 64K, which provides an
adequate resolution for most cases.

Problem Definition - Definition Files

The input is defined by a Problem Definition file, an Ascii file formatted like a Windows Ini file. Two
additional input files are optional: An alleles-definition file and a mapping file. These files are optional
since in many cases the input they specify can be created automatically by the program.

Problem Definition - Source Modifications

The design of the program is modular, and each module is packaged separately as an easily replaceable
(or extendable) class. Of the many classes that make up the toolkit, only one needs to be handled by the user when defining a new problem. This is the GaaFunction class, which contains three methods: getValue, draw
and createAllelesMap. The getValue method calculates the individual's fitness, and should obviously be
written explicitly for each problem. The draw method is optional, and should be overwritten only if a
graphic output is needed. The createAllelesMap method is required only when a mapping file is not supplied and the mapping table must be generated by the program.
Besides these, any class can be extended or re-written to create a different genetic algorithm (e.g. the GaaMutation class or the GaaCrossover class).

Special GA Mechanisms

The implementation of the genetic algorithm follows the standard GA structure but it incorporates
several less standard mechanisms:
1. An automatic 'Kick': A sensor in the program monitors the evolutionary process, and when it finds
that there has not been any advance in the recent N generations (N is user definable), it gives the
population a 'Kick' and scrambles it a little (in a user-defined manner). This mechanism helps counter the GA's tendency to get stuck at a mediocre local optimum.
2. A kin-competition compensating factor: If a population contains identical individuals, only one of
them receives the nominally calculated fitness. The others are assigned decreased fitness values. This
helps to maintain diversity in the population, and reduces the danger of the whole population being taken
by a single, relatively superior, individual. The evolutionary justification for this mechanism (a justification is not really needed, but anyway) is that identical individuals compete over the same niche, so that although each might possess good genes, the very existence of the others makes survival more difficult for each of them.
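
The idea is easy to sketch in code. A minimal illustration (GA Playground's actual decay scheme is not documented here, so the penalty factor below is an assumption):

#include <map>
#include <string>
#include <vector>

// Kin-competition compensation: the first copy of each identical
// chromosome keeps its nominal fitness; every further copy is decayed.
// The 0.8 penalty factor is an assumed value, not GA Playground's.
void compensateKin(const std::vector<std::string>& chroms,
                   std::vector<double>& fitness) {
    std::map<std::string, int> copies;
    for (size_t i = 0; i < chroms.size(); ++i) {
        int seen = copies[chroms[i]]++;   // identical individuals seen so far
        for (int k = 0; k < seen; ++k)
            fitness[i] *= 0.8;            // each extra copy decays fitness
    }
}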
3. Memory: Each individual in a population owns both a chromosome (a solution string) and a memory string, where history data can be recorded. The use of the memory string is optional. If required, the
memory maintenance code can be included in the fitness function.
4. Pre & Post Breed Functions: These two functions (empty by default) are activated just before and just after breeding takes place, respectively. This makes it possible to add extra processing to the old population (preBreed) or to the new population (postBreed) during the creation of a new generation.

Interactively (on-the-fly) defined functions

The GA Playground supports optimization of functions and expressions defined on-the-fly. The user can
enter an expression of arbitrary length and complexity, with up to 20 variables, define range constraints
for each variable, and let the program search for an optimized (minimum or maximum) solution. This
option is available only in application mode (it cannot be run in a browser).
The interactive function optimization code is based on the Java Expressions Library (JEL), an amazing
library that enables fast evaluation of dynamic string expressions. The library was written by Konstantin
Metlov (metlov@fzu.cz), and its site is at http://galaxy.fzu.cz/JEL/.

Continuous Reporting

The applet user-interface supports three tools for monitoring the evolution process. Each is switchable,
the trade off being performance (switched off) versus information (switched on). They are: The text
window, where textual information is continuously displayed (when the text window is enabled), the
graphic window that supports the draw method of the GaaFunction class when relevant (e.g. in TSP
problems), and the Log window, that can gather information in the background and be displayed on
request. All these tools can be toggled on or off anytime, including during execution.

Graphic Display

The program's Gui provides a graphic window which can optionally be used for displaying graphic
representation of the evolutionary process. Problems of geometrical nature, such as TSP or resource
allocation problems, are natural candidates for taking use of this option. To use the graphic option it is
required to override the "draw" function in the user-modifiable GaaFunction file. The graphic display can
be turned on or off at any time, including in the course of the evolution process.

User-Initiated Logging

In addition to the switchable continuous reporting, the user can output information to the log file at any
time. While the evolutionary process is going on, it is possible to log current chunks of data, such as a list
of the current population (list of current chromosome strings), or a description of the current mating
process (in the format: Father + Mother => Kid => Mutated Kid). The population list can be printed in
several formats, some more suited for short chromosomes, other for longer ones. All logging functions
are accessible through the Log menu. Saving the log file to disk is possible only in application mode,
while displaying the log file on screen is available in any mode. The log file can be viewed (or saved)
either during the calculation process or afterwards, and can be used to analyze what happened during the
evolutionary process.


File Input/Output

1. Application Mode IO: Both input and output files can be saved to and loaded from disk. Parameters
that were modified interactively can be saved to a file, and loaded later as a new problem definition.
Parameter files (as well as all other Ascii files) can be edited in a text editor, saved (optionally under
different names) and later used to define different problems.
Populations of strings can be saved, together with their function and fitness values, to be studied and
analyzed. Saved population files can be loaded (completely or partially) into the program, to define a
specific population with known individuals. The population file can be edited in order to modify strings
(chromosomes) or create new ones.
Finally, the log screen that optionally stores information about the evolutionary process, can also be
saved to disk for subsequent examination.
2. Applet Mode IO: When used as an applet, the GA Playground cannot access user's local disk.
Therefore input can only be modified through the multi-tabbed Parameters module. This module supports
editing of any aspect of the problem definition, as long as it uses the same fitness function. Output is
limited to the screen, but it can be copied and pasted into another application through the clipboard (I am
talking Windows here).

Online Help

The GA Playground assumes some familiarity with genetic algorithm programs, and the help is relatively
concise. There are three help utilities:
1. Online Help Screens: Two basic help screens (General Help and Input Files Help) are accessible
from within the program (Help menu).
In addition there are two compact context-sensitive help mechanisms, which are particularly relevant to
the parameters (input) panels:
2. Automatic Tool-Tip: A short help string is automatically displayed in the status bar when the cursor
is over a particular component (Textfield, label or button). This automatic mechanism can be toggled on
or off through the Options menu.
3. Right-Mouse-Button click: An additional short help string can be displayed in the status bar by
clicking the right mouse button when the cursor is over a particular component. This second help tip
complements the automatic help tip described above.
Documentation: Currently there is only an empty, skeleton Javadoc documentation. While I have every intention of filling it with real documentation, it might take some time. Meanwhile it can be used, with some intuition and possible trial-and-error, as a basis for extending and subclassing GA Playground classes. The classes can be retrieved by expanding the gaa.jar file.

Download and Installation

The GA-Playground can be downloaded to be run from local hard disks. The program can be run either
as an applet (from any browser supporting JDK 1.1.5), or as an application (running on a JDK 1.1.5 VM).
Please download the GaPlayground.zip (about 470K) and unzip it to any directory. The same files are
used for both applet and application modes. Sample code for the GaaFunction class (this class should be
modified when defining new problems) can be downloaded from the author's web site.

Applet Mode: Just load any of the Html problem files into your (JDK 1.1.5 capable) browser. The file
gaa.html (this file) contains an index of all the demo problems, and can be used to load any of them.
However, any specific problem can be loaded directly into the browser (e.g. TspDemo.html). In any case,
once the applet is loaded, any of the demo problems can be loaded through the Ga/Entry Screen menu.
Application Mode: The application is activated by the command:
java GaaApplet [Parameters File].
The Parameter file name is optional: When not given the "All Demos" version (All.par) will be activated
by default.
Examples:
To run the "All Demos" version: java GaaApplet
To run the TSP demo of Bavarian cities: java GaaApplet bayg29.par
Classpath (Application mode): When used as application the three jar files of the GA Playground
should be defined in the CLASSPATH variable. Each one should be entered separately.
Example:
If you unzipped the GaPlayground.zip file into C:\GAPL directory, your Classpath assignment should be
similar to the following:
set CLASSPATH=.;C:\GAPL\gaa.jar;C:\GAPL\ScsGrid.jar;C:\GAPL\tabsplitter.jar; C:\jel.jar
Application Batch File: On Windows you can create an icon that activates a batch file for running GA
Playground as an application.
The batch file should be similar to the following:
set CLASSPATH=.;tabsplitter.jar;ScsGrid.jar;jel.jar;gaa.jar
java GaaApplet %1
Assuming the batch file is named RunGaa.bat, entering "RunGaa" at the command prompt will activate
GA Playground with the default (problem selection) mode, while entering e.g. "RunGaa TspDemo.par"
will run the specific TspDemo problem.
Javadoc files: The "skeleton" Javadoc file set is packaged as a zip file and can be downloaded. It is a little thin, but it might be useful.

Limitations and Request for feedback

The GA Playground is currently at a preliminary stage, still under construction. The options for
definitions of new problems are not fully implemented yet, and there is no Java documentation, so it is currently limited to playing with the predefined problems. When playing with the parameters (or preparing new parameter files in application mode), please be careful with your input parameters, as there is no protection against illegal entries.


I shall be glad to receive any comments or suggestions. Please use my mailbox for any sort of feedback.

Examples and Test Problems


General

If you look at the source of any of the following demo Html pages (via the View Source function in your browser), you will see that each activates the same Java class file (GaaApplet.class); the only difference is the Ascii definition file that is given as a parameter to the applet. In each case, all the class files used are the same. The only exception is GaaFunction.class, which should either be specific to each problem definition, or contain several alternative functions that are activated according to the specific problem code.
Once you load a specific example, you can change the problem definition (by modifying problem
attributes, variable ranges and variable mapped values) and also experiment with the genetic algorithm
by playing with any of the GA parameters.

Important Notes:

● The applet requires a browser that supports JDK 1.1.5 or above (Communicator 4.05 (Preview
Release), MSIE 4.01, HotJava 1.1).
● If your browser does not support JDK 1.1.5 you can download the free Sun Java Plugin:
http://www.javasoft.com/products/plugin/ (good for Netscape 3.0 and above, MSIE 3.02 and
above).
● The applet has a relatively long loading time, especially when loaded for the first time.
● Once the applet is loaded (with any of the demo problems), it is possible to load any other problem
from within the applet, through the 'GA/Entry Screen' menu command.
● The applet is best viewed in 800x600 (or higher) screen resolution

List of Demo Problems

Multiple Problems Applets:
● All Demos - All the demo problems: in this configuration you can select any of the examples listed below and switch between them. Selection is done from the applet's 'Entry Screen' (GA menu).

TSP:
● TSP on circle - A TSP where all cities are located on a circle. The number of cities is user definable.
● TSP Bayg29 - TSP of 29 cities in Bavaria (Groetschel, Juenger, Reinelt).
● TSP Att48 - TSP of 48 capitals of the US (Padberg/Rinaldi).

Knapsack Problems:
● Single 0/1-Knapsack - A single knapsack problem with 50 objects.
● Weing1: multiple 0/1-Knapsack - A multiple knapsack problem with 2 knapsacks and 28 objects.
● Weish01: multiple 0/1-Knapsack - A multiple knapsack problem with 5 knapsacks and 30 objects.

Bin Packing Problems:
● Binpack1 u120_00 - 120 objects uniformly distributed in (20,100), bins of size 150.
● Binpack5 t60_19 - 60 objects in 'triplets' of items from (25,50), bins of size 100.

Facility Allocation:
● Steiner on circle - A facility allocation problem (where all cities are on a circle).
● Steiner by file - A facility allocation problem (where city coordinates are read from a file).

Multi-Modal Functions:
● Ackley's Function - Minimize f(x) = 20 + e - 20*exp(-0.2*sqrt((1/n)*sum(x(i)^2))) - exp((1/n)*sum(cos(2*Pi*x(i))))
● Rosenbrock's Function - Minimize f(x) = sum(100*(x(i) - x(i-1)^2)^2 + (1 - x(i-1))^2)
● Schwefel's Function - Minimize f(x) = 418.9829*n + sum(-x(i)*sin(sqrt(abs(x(i)))))
● Rastrigin's Function - Minimize f(x) = 10.0*n + sum(x(i)^2 - 10.0*cos(2*Pi*x(i)))
● Griewank's Function - Minimize f(x) = (1/4000)*sum((x(i)-100)^2) - prod(cos((x(i)-100)/sqrt(i))) + 1

Simple Function Optimization:
● Sphere Model - A well-known test function: Minimize f(x) = sum((x(i)-1)^2)
● Single Variable Minimization - Minimize f(x) = x^4 - 12*x^3 + 15*x^2 + 56*x - 60
● Multiple Variable Minimization - Minimize f(x1,x2,x3,x4,x5) = x1*sin(x1) + 1.7*x2*sin(x1) - 1.5*x3 - 0.1*x4*cos(x4+x5-x1) + (0.2*x5^2 - x2) - 1
● Simpleton - A trivial 10-variable maximization problem: Maximize (x1*x2*x3*x4*x5)/(x6*x7*x8*x9*x10) where (x1..x10) = [1..10]


Ariel Dolan
aridolan@netvision.net.il
Tel. 972-3-7526264
Fax. 972-3-5752173
Last modified on: Thursday, 10 September, 1998.




Genetic Algorithm Example: Diophantine Equation


Make sure that you have read the genetic algorithms essay before reading this example. You also must have a working knowledge of
C++ and object-oriented programming to utilize the classes and code examples provided.

Genetic Algorithm Example


Let us consider a Diophantine (integer solutions only) equation: a+2b+3c+4d=30, where a,b,c,d are positive integers. Using a genetic algorithm, all that is needed is a little time to reach a solution (a,b,c,d). Of course you could ask: why not just use a brute-force method (plug in every possible value for a,b,c,d given the constraints 1 <= a,b,c,d <= 30)? The architecture of GA systems allows a solution to be reached more quickly, since "better" solutions have a better chance of surviving and procreating, as opposed to randomly throwing out solutions and seeing which ones work.
Let's start from the beginning. First we will choose 5 random initial solution sets, with constraints 1 <= a,b,c,d <= 30. (Note that we can choose smaller constraints for b,c,d, but for the sake of simplicity we shall use 30.)
Chromosome (a,b,c,d)
1 (1,28,15,3)
2 (14,9,2,4)
3 (13,5,7,3)
4 (23,8,16,19)
5 (9,13,5,2)

Table 1: 1st Generation Chromosomes and Their Contents


To calculate the fitness values, plug each solution set into the expression a+2b+3c+4d. Then calculate the absolute value of each result's difference from 30; this is our fitness value.
Chromosome Fitness Value
1 |114-30|=84
2 |54-30|=24
3 |56-30|=26
4 |163-30|=133
5 |58-30|=28

Table 2: Fitness Values of 1st Generation Chromosomes (Solution sets)
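
In code, this fitness measure is tiny. A minimal sketch (an illustrative helper, not part of the article's CDiophantine class):

// Fitness of one candidate (a,b,c,d) for a+2b+3c+4d=30: the absolute
// deviation from 30, so a fitness of 0 means an exact solution.
int fitness(int a, int b, int c, int d) {
    int diff = (a + 2*b + 3*c + 4*d) - 30;
    return diff < 0 ? -diff : diff;
}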


Since values that are lower are closer to the desired answer (30), these values are more desirable. In this case, higher fitness values
are not desirable, while lower ones are. In order to create a system where chromosomes with more desirable fitness values are more
likely to be chosen as parents, we must first calculate the percentages that each chromosome has of being picked. One solution is to
take the sum of the multiplicative inverses of the fitness values (0.135266), and calculate the percentages from there. (Note: all simulations were created using a random number generator.)
Chromosome Likelihood
1 (1/84)/0.135266 = 8.80%
2 (1/24)/0.135266 = 30.8%
3 (1/26)/0.135266 = 28.4%
4 (1/133)/0.135266 = 5.56%
5 (1/28)/0.135266 = 26.4%

Table 3: Parent Selection by Percentages


In order to pick our 5 pairs of parents (each of which will have 1 offspring, and thus 5 new solutions sets total), imagine that we had a
10000 sided die, and on 880 of those sides, chromosome 1 was labeled, and on 3080 of those sides, chromosome 2 was labeled, and
on 2640 of those sides, chromosome 3 was labeled, and on 556 of those sides, chromosome 4 was labeled, and on 2640 of those
sides, chromosome 5 was labeled. To choose our first pair we roll the die twice, and take those chromosomes to be our first two
parents. Continuing in this fashion we have the following parents:
Father Chromosome Mother Chromosome
3 1
5 2
3 5
2 5
5 3

Table 4: Simulated Selection of Parents
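
The die-rolling procedure above is ordinary fitness-proportionate (roulette-wheel) selection, applied here to inverse fitness values because lower fitness is better. A minimal sketch (names and signature are illustrative, not from the article's class):

#include <cstdlib>

// Spin a wheel whose slice for individual i is (1/fitness[i]).
// invSum is the sum of all inverse fitness values (0.135266 above).
int selectParent(const int fitness[], int n, double invSum) {
    double spin = (rand() / (double)RAND_MAX) * invSum;
    double running = 0.0;
    for (int i = 0; i < n; ++i) {
        running += 1.0 / fitness[i];
        if (spin <= running) return i;
    }
    return n - 1;   // guard against floating-point rounding
}

For the table above you would call selectParent with n = 5 and invSum = 0.135266.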


The offspring of each pair of parents contains the genetic information of both father and mother. How this is combined is largely arbitrary; in this case we use something called a "cross-over". If a mother has the solution set a1,b1,c1,d1 and a father has the solution set a2,b2,c2,d2, then there are six possible cross-overs (| = dividing line):
Father Chromosome Mother Chromosome Offspring Chromosome
a1 | b1,c1,d1 a2 | b2,c2,d2 a1,b2,c2,d2 or a2,b1,c1,d1
a1,b1 | c1,d1 a2,b2 | c2,d2 a1,b1,c2,d2 or a2,b2,c1,d1
a1,b1,c1 | d1 a2,b2,c2 | d2 a1,b1,c1,d2 or a2,b2,c2,d1

Table 5: Generalization of Cross-overs Between Parents


There are many other ways in which parents can trade genetic information to create an offspring, crossing over is just one way.
Where the dividing line would be located is completely arbitrary, and so is whether or not the father or mother will contribute to the
left or right of the dividing line. Now let's apply this to our offspring:
Father Chromosome Mother Chromosome Offspring Chromosome
(13 | 5,7,3) (1 | 28,15,3) (13,28,15,3)
(9,13 | 5,2) (14,9 | 2,4) (9,13,2,4)
(13,5,7 | 3) (9,13,5 | 2) (13,5,7,2)
(14 | 9,2,4) (9 | 13,5,2) (14,13,5,2)
(13,5 | 7, 3) (9,13 | 5, 2) (13,5,5,2)

Table 6: Simulated Crossovers from Parent Chromosomes
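
One-point crossover of this kind takes only a few lines. A minimal sketch (illustrative; the article's class encapsulates this differently):

#include <cstdlib>

// One-point crossover of two four-gene chromosomes (a,b,c,d). Both the
// cut point and which parent supplies the left part are chosen randomly,
// mirroring the arbitrariness noted above.
void crossover(const int dad[4], const int mom[4], int child[4]) {
    int cut = 1 + rand() % 3;           // dividing line after gene 1, 2 or 3
    bool dadLeft = (rand() % 2 == 0);   // who contributes the left part
    for (int i = 0; i < 4; ++i)
        child[i] = ((i < cut) == dadLeft) ? dad[i] : mom[i];
}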


Now we can calculate the fitness values for the new generation of offspring.
Offspring Chromosome Fitness Value
(13,28,15,3) |126-30|=96
(9,13,2,4) |57-30|=27
(13,5,7,2) |52-30|=22
(14,13,5,2) |63-30|=33
(13,5,5,2) |46-30|=16

Table 7: Fitness Values of Offspring Chromosomes


The average fitness value of the offspring chromosomes was 38.8, while the average fitness value of the parent chromosomes was 59. Of course, the next generation (the offspring) is supposed to mutate; that is, for example, we can change one of the values in the ordered quadruple of each chromosome to some random integer between 1 and 30. Progressing at this rate, one chromosome should eventually reach a fitness level of 0, at which point a solution has been found. If you try to simulate this yourself, then depending on the mutations you may actually get a fitness average that is higher, but in the long run the fitness levels will decrease. For systems where the population is larger (say 50 instead of 5), the fitness levels should approach the desired level (0) more steadily and stably.
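
The mutation step described above is just as small. A minimal sketch (illustrative only):

#include <cstdlib>

// Mutate a chromosome by replacing one randomly chosen gene with a
// random integer in [1,30], as described above.
void mutate(int child[4]) {
    child[rand() % 4] = 1 + rand() % 30;
}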

C++ Class Code


The C++ class is an all-encompassing class that takes 5 values upon construction: the 4 coefficients and a result. So for our above example the class is initialized like this:
CDiophantine dp(1,2,3,4,30);
Then, to solve the equation, call the Solve() function; it returns the index of the gene containing the solution set. Call GetGene() to get the gene with the correct values for a, b, c and d. A typical main.cpp using the class should therefore look like:

#include <iostream.h>
#include "diophantine.h"

int main() {
    CDiophantine dp(1,2,3,4,30);

    int ans = dp.Solve();
    if (ans == -1) {
        cout << "No solution found." << endl;
    } else {
        gene gn = dp.GetGene(ans);

        cout << "The solution set to a+2b+3c+4d=30 is:\n";
        cout << "a = " << gn.alleles[0] << "." << endl;
        cout << "b = " << gn.alleles[1] << "." << endl;
        cout << "c = " << gn.alleles[2] << "." << endl;
        cout << "d = " << gn.alleles[3] << "." << endl;
    }
    return 0;
}
For a complete breakdown of the class, see here, or to just download the code (complete with Visual C++ 5.0 project file) here.

● Genetic Algorithm Essays - Many essays on theory and applications of GAs.


● Genetic Algorithm Programs - Full source code included.
● Genetic Algorithm Books
● Genetic Algorithm Interviews
● Genetic Algorithm Software
● AISolutions




An Introduction to Genetic Algorithm and Genetic Programming
After scientists became disillusioned with classical and neo-classical attempts at modelling intelligence, they looked in other
directions. Two prominent fields arose: connectionism (neural networking, parallel processing) and evolutionary computing. It is the
latter that this essay deals with - genetic algorithms and genetic programming.

Symbolic AI vs Genetic Algorithms


Most symbolic AI systems are very static. Most of them can usually only solve one specific problem, since their architecture was designed for that specific problem in the first place. Thus, if the given problem were somehow changed, these systems could have a hard time adapting, since the algorithm that originally arrived at the solution may now be either incorrect or less efficient. Genetic algorithms (GAs) were created to combat these problems. They are essentially algorithms based on natural biological evolution, and the architecture of systems that implement them adapts to a wide range of problems. A GA functions by generating a large set of possible solutions to a given problem. It then evaluates each of those solutions and assigns a "fitness level" (you may recall the phrase "survival of the fittest") to each solution set. These solutions then breed new solutions. The parent solutions that were more "fit" are more likely to reproduce, while those that were less "fit" are less likely to do so. In essence, solutions are evolved over time, narrowing the search toward regions of the solution space where a solution can be found. Genetic algorithms can be incredibly efficient if programmed correctly.

General Algorithm for Genetic Algorithms


Genetic algorithms are not too hard to program or understand, since they are biologically based. Thinking in terms of real-life evolution may help you understand. Here is the general algorithm for a GA:
Create a Random Initial State
An initial population is created from a random selection of solutions (which are analogous to chromosomes). This is unlike the
situation for Symbolic AI systems, where the initial state in a problem is already given instead.
Evaluate Fitness
A value for fitness is assigned to each solution (chromosome) depending on how close it actually is to solving the problem (thus
arriving to the answer of the desired problem). (These "solutions" are not to be confused with "answers" to the problem, think of
them as possible characteristics that the system would employ in order to reach the answer.)
Reproduce (& Children Mutate)
Those chromosomes with a higher fitness value are more likely to reproduce offspring (which can mutate after reproduction). The offspring are a product of the father and mother, and their composition consists of a combination of genes from both (this process is known as "crossing over").
Next Generation
If the new generation contains a solution that produces an output that is close enough or equal to the desired answer then the problem
has been solved. If this is not the case, then the new generation will go through the same process as their parents did. This will
continue until a solution is reached.


Applications of GAs
The possible applications of genetic algorithms are immense. Any problem that has a large search domain could be suitably tackled by GAs. A popular, growing field is genetic programming (GP).

Genetic Programming
In programming languages such as LISP and Scheme, mathematical expressions are not written in standard (infix) notation, but in prefix notation. Some examples of this:

+ 2 1 : 2 + 1
* + 2 1 2 : 2 * (2+1)
* + - 2 1 4 9 : 9 * ((2 - 1) + 4)
Notice the difference between the left-hand side and the right? Apart from the order being different, there are no parentheses! The prefix method makes life a lot easier for programmers and compilers alike, because operator precedence is not an issue. You can build expression trees out of these strings that can then be easily evaluated; for example, here are the trees for the above three expressions.

  +        *          *
 / \      / \        / \
2   1    +   2      +   9
        / \        / \
       2   1      -   4
                 / \
                2   1
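
A tiny recursive evaluator makes the point concrete. A sketch (assuming single-digit operands and the operators + - * only):

#include <cctype>
#include <cstdio>

// Evaluate a prefix expression such as "* + 2 1 2" by recursion:
// a digit is a leaf; an operator evaluates its two subtrees.
int evalPrefix(const char*& s) {
    while (*s == ' ') ++s;
    char c = *s++;
    if (isdigit(c)) return c - '0';
    int left = evalPrefix(s);
    int right = evalPrefix(s);
    if (c == '+') return left + right;
    if (c == '-') return left - right;
    return left * right;   // '*'
}

int main() {
    const char* exprs[] = { "+ 2 1", "* + 2 1 2", "* + - 2 1 4 9" };
    for (int i = 0; i < 3; ++i) {
        const char* p = exprs[i];
        printf("%s = %d\n", exprs[i], evalPrefix(p));
    }
    return 0;
}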
You can see how expression evaluation is thus a lot easier. What does this have to do with GAs? Suppose, for example, you have numerical data and 'answers', but no expression joining the data to the answers. A genetic algorithm can be used to 'evolve' an expression tree that fits the data very closely. By 'splicing' and 'grafting' the trees, evaluating the resulting expression on the data and testing it against the answers, the fitness function can return how close the expression is. The limitation of genetic programming lies in the huge search space the GA has to explore - an infinite number of possible equations. Therefore, before running a GA to search for an equation, the user normally tells the program which operators and numerical ranges to search under. Uses of genetic programming include stock market prediction, advanced mathematics and military applications (for more information on military applications, see the interview with Steve Smith).

Evolving Neural Networks


GAs have successfully been used to evolve various aspects of neural networks - either the connection weights, the architecture, or the learning function. You can see why GAs are perfect for evolving the weights of a neural network - there is an immense number of possibilities that standard learning techniques such as back-propagation would take thousands upon thousands of iterations to converge to. GAs could (given the appropriate direction) evolve working weights within a hundred or so iterations.
Evolving the architecture of a neural network is slightly more complicated, and there have been several ways of doing it. For small nets, a simple matrix represents which neuron connects to which, and then this matrix is, in turn, converted into the necessary 'genes', and various combinations of these are evolved.
Many would think that a learning function could be evolved via genetic programming.
Unfortunately, genetic programming combined with neural networks could be incredibly
slow, thus impractical. As with many problems, you have to constrain what you are
attempting to create. For example, in 1990, David Chalmers attempted to evolve a function as good as the delta rule. He did this by creating a general equation based upon the
delta rule with 8 unknowns, which the genetic algorithm then evolved.

Other Areas
Genetic Algorithms can be applied to virtually any problem that has a large search space.
Al Biles uses genetic algorithms to filter out 'good' and 'bad' riffs for jazz improvisation; the military uses GAs to evolve equations to differentiate between different radar returns; and stock companies use GA-powered programs to predict the stock market.




An Introduction to Artificial Intelligence


Computers can do many wonderful things. They can perform calculations millions or billions of times faster than human beings. Yet,
is this all that computers can do? Just crunch numbers? Certainly present digital computers are capable of much more. In the past,
computer scientists have created a great many programs that could perform tasks that people wouldn't have otherwise believed a
computer could do. This is not limited to only playing chess, or proving theorems, but also programs that can hold a regular
conversation with humans, understand stories and perform many other human-like tasks. Yet, there have also been very many
legitimate questions whether or not the intelligence that these programs exhibit can be comparable to human intelligence. The
problem lies with architecture, the way our programs are structured. Alan Turing, one of the fathers of AI, proved that all computers (he used the term Turing machines, which can be likened to digital computers) can compute anything that is computable. If creating a human being via a digital computer is ever to be possible, we must first answer some philosophical questions concerning whether emotion or consciousness is computable.
There were some scientists who questioned the capability of digital computers. Arguing that the architecture of digital computers was a poor approach to emulating human intelligence, they strove instead to create artificial neurons, modelled on the architecture of our own biological brains. The idea failed for the time being, since very little was known about the subject (the same is largely true today). The digital computer proved to be the only sufficient medium for carrying out artificial intelligence (AI).
There then came another question. People doubted whether or not symbolic AI programs, which encompassed just about every
program in the 1950s, could exhibit true intelligence. Some examples of Symbolic AI programs are chess-playing programs, expert
systems such as theorem provers, ELIZA, STUDENT (which solved algebra word problems), etc. Just about any task-centered AI program is a symbolic AI program. These programs operate by manipulating symbols. The critics claimed that the architecture of these programs was not sufficient for intelligence. In essence, they have no "common sense" and can rarely perform tasks other than the
tasks that were assigned to them. One of the early natural language programs could translate Russian to English, and vice versa.
When it converted the English phrase "The spirit is willing but the flesh is weak" to Russian, then back to English, it came up with
"The vodka is good but the meat is rotten". Of course, the machine translation technology that we have today has certainly improved,
but many "common sense" mistakes like this are common. You may actually want to try this with the SYSTRAN translator. These
critics of Symbolic AI systems were the connectionists. They in turn created the neural network architecture. Neural networks are
able to draw links between meanings and thus exhibit some form of "common sense" in some situations. More generally they are
based on the architecture of neurons, synapses and dendrites in brains. As much as these systems have been hyped, they have not
nearly been able to replace symbolic AI systems. On the other hand, they have been very useful for things such as image recognition.
Adaptability and learning has since been almost essential to many AI programs. This follows the goal that one day our machines will
function completely free of their masters, able to learn and adapt freely from the environment that they live in. Humans, and animals
can adapt to their environments, so why not machines? For example, Sam Hsiung's IQATS (Intelligent Question and Answer Test Summarizer) is a program that, given an article or essay, asks questions and provides answers (for test-making purposes). It can learn to ask new questions and compose new answers by memorizing the patterns of other questions and answers that could be asked (but are not already in its knowledge base). We cannot expect a program to innately understand that a falling glass of water will break, nor can we expect to teach a program every single detail of this universe. The only plausible solution is
for our machines to learn.
Whatever the approach, we are sure that although our programs may not exhibit exactly human-like intelligence, they are nevertheless intelligent. Simply looking at what our computers can accomplish today leaves no question of this. If our machines are getting smarter and smarter, will there one day come a time when they take over the earth? Marvin Minsky, a greatly respected
scientist believes so. According to Minsky, one day, our nanotechnology may even make us immortal. We'll be able to store our
human brain's composition inside artificial brains. Many other scientists share this view. Will there one day come a time when our robots have evolved so far that the earth is inhabited by nothing but robots? Or is it just a science-fiction fantasy?

● Artificial Intelligence and the Skepticism.


● Aspects of Artificial Intelligence Systems.
● Interview with Marvin Minsky.



An Introduction to Neural Networks


Prof. Leslie Smith
Centre for Cognitive and Computational Neuroscience
Department of Computing and Mathematics
University of Stirling.
lss@cs.stir.ac.uk
last major update: 25 October 1996; minor updates 22 April 1998 and 12 September 2001 (links updated - they were really out of date).
This document is a roughly HTML-ised version of a talk given at the NSYN meeting in Edinburgh, Scotland, on 28 February 1996, then updated a few times in
response to comments received. Please email me comments, but remember that this was originally just the slides from an introductory talk!

Overview:
Why would anyone want a `new' sort of computer?
What is a neural network?

Some algorithms and architectures.


Where have they been applied?
What new applications are likely?
Some useful sources of information.
Some comments added Sept 2001

Why would anyone want a `new' sort of computer?


What are (everyday) computer systems good at... .....and not so good at?
Good at:
● Fast arithmetic
● Doing precisely what the programmer programs them to do
Not so good at:
● Interacting with noisy data or data from the environment
● Massive parallelism
● Fault tolerance
● Adapting to circumstances

Where can neural network systems help?


● where we can't formulate an algorithmic solution.

● where we can get lots of examples of the behaviour we require.


● where we need to pick out the structure from existing data.

What is a neural network?


Neural Networks are a different paradigm for computing:
● von Neumann machines are based on the processing/memory abstraction of human information processing.

● neural networks are based on the parallel architecture of animal brains.

Neural networks are a form of multiprocessor computer system, with


● simple processing elements

● a high degree of interconnection

● simple scalar messages

● adaptive interaction between elements

A biological neuron may have as many as 10,000 different inputs, and may send its output (the presence or absence of a short-duration spike) to many other
neurons. Neurons are wired up in a 3-dimensional pattern.
Real brains, however, are orders of magnitude more complex than any artificial neural network so far considered.
Example: A simple single unit adaptive network:

The network has 2 inputs, and one output. All are binary. The output is
1 if W0*I0 + W1*I1 + Wb > 0
0 if W0*I0 + W1*I1 + Wb <= 0

We want it to learn simple OR: output a 1 if either I0 or I1 is 1.

Algorithms and Architectures.


The simple Perceptron:

The network adapts as follows: change the weight by an amount proportional to the difference between the desired
output and the actual output.
As an equation:
ΔWi = η * (D - Y) * Ii

where η is the learning rate, D is the desired output, and Y is the actual output.
This is called the Perceptron Learning Rule, and goes back to the early 1960's.
We expose the net to the patterns:
I0 I1 Desired output

0 0 0
0 1 1
1 0 1
1 1 1

We train the network on these examples, recording the weights after each epoch (one exposure to the complete set of patterns). [The original slide showed the weight values epoch by epoch.] By epoch 8 the network has finished learning. Since (D-Y)=0 for all patterns, the weights cease adapting. Single perceptrons are limited in what they can learn:
If we have two inputs, the decision surface is a line, and its equation is

I1 = -(W0/W1)*I0 - (Wb/W1)

In general, they implement a simple hyperplane decision surface


This restricts the possible mappings available.

Developments from the simple perceptron:

Back-Propagated Delta Rule Networks (BP) (sometimes known as multi-layer perceptrons (MLPs)) and Radial Basis Function Networks (RBF) are both well-known developments of the Delta rule for single-layer networks (itself a development of the Perceptron Learning Rule). Both can learn arbitrary mappings or classifications. Further, the inputs (and outputs) can have real values.

Back-Propagated Delta Rule Networks (BP)

is a development from the simple Delta rule in which extra hidden layers (layers additional to the input and output layers, not connected externally) are added. The
network topology is constrained to be feedforward: i.e. loop-free - generally connections are allowed from the input layer to the first (and possibly only) hidden layer; from the first hidden layer to the second, ..., and from the last hidden layer to the output layer.

Typical BP network architecture:

The hidden layer learns to recode (or to provide a representation for) the
inputs. More than one hidden layer can be used.
The architecture is more powerful than single-layer networks: it can be
shown that any mapping can be learned, given two hidden layers (of units).
The units are a little more complex than those in the original perceptron: their
input/output graph is

As a function:

Y = 1 / (1 + exp(-k * sum(Win * Xin)))

[The original figure plotted this sigmoid for k = 0.5, 1, and 10, as the activation varies from -10 to 10.]

Training BP Networks

The weight change rule is a development of the perceptron learning rule. Weights are changed by an amount proportional to the error at that unit times the output
of the unit feeding into the weight.
Running the network consists of
Forward pass:
the outputs are calculated and the error at the output units calculated.
Backward pass:
The output unit error is used to alter weights on the output units. Then the error at the hidden nodes is calculated (by back-propagating the error at the
output units through the weights), and the weights on the hidden nodes altered using these values.
For each data pair to be learned a forward pass and backwards pass is performed. This is repeated over and over again until the error is at a low enough level (or
we give up).
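
The forward-pass/backward-pass cycle can be made concrete with a small program. A minimal sketch (not the talk's code): a 2-2-1 network learning XOR with the sigmoid unit shown earlier. The learning rate, seed and stopping threshold are arbitrary choices, and an unlucky random initialisation may need a different seed:

#include <cmath>
#include <cstdio>
#include <cstdlib>

double sigmoid(double x) { return 1.0 / (1.0 + exp(-x)); }

int main() {
    // 2 hidden units, each with 2 input weights + bias; output unit likewise.
    double wh[2][3], wo[3];
    srand(1);
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 3; ++j)
            wh[i][j] = rand() / (double)RAND_MAX - 0.5;
    for (int j = 0; j < 3; ++j)
        wo[j] = rand() / (double)RAND_MAX - 0.5;

    const double X[4][2] = {{0,0},{0,1},{1,0},{1,1}};
    const double T[4]    = {0, 1, 1, 0};   // XOR targets
    const double eta = 0.5;

    for (int epoch = 0; epoch < 100000; ++epoch) {
        double err = 0;
        for (int p = 0; p < 4; ++p) {
            // Forward pass: hidden activations, then the output.
            double h[2];
            for (int i = 0; i < 2; ++i)
                h[i] = sigmoid(wh[i][0]*X[p][0] + wh[i][1]*X[p][1] + wh[i][2]);
            double y = sigmoid(wo[0]*h[0] + wo[1]*h[1] + wo[2]);
            err += (T[p] - y) * (T[p] - y);

            // Backward pass: output delta, then back-propagated hidden deltas.
            double dy = (T[p] - y) * y * (1 - y);
            double dh[2];
            for (int i = 0; i < 2; ++i)
                dh[i] = dy * wo[i] * h[i] * (1 - h[i]);

            // Weight changes: error at the unit times the feeding output.
            for (int i = 0; i < 2; ++i) {
                wo[i] += eta * dy * h[i];
                wh[i][0] += eta * dh[i] * X[p][0];
                wh[i][1] += eta * dh[i] * X[p][1];
                wh[i][2] += eta * dh[i];   // bias input is 1
            }
            wo[2] += eta * dy;             // output bias
        }
        if (err < 0.001) { printf("converged after %d epochs\n", epoch); break; }
    }
    return 0;
}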

Radial Basis function Networks

Radial basis function networks are also feedforward, but have only one hidden layer.

Typical RBF architecture:

Like BP, RBF nets can learn arbitrary mappings: the primary difference is in
the hidden layer.
RBF hidden layer units have a receptive field which has a centre: that is, a
particular input value at which they have a maximal output. Their output tails off as the input moves away from this point.
Generally, the hidden unit function is a Gaussian of the distance between the input and the unit's centre, e.g. exp(-||x - c||^2 / (2*sigma^2)). [The original figure showed Gaussians with three different standard deviations.]

Training RBF Networks.

RBF networks are trained by


● deciding on how many hidden units there should be

● deciding on their centres and the sharpnesses (standard deviation) of their Gaussians

● training up the output layer.

Generally, the centres and SDs are decided on first by examining the vectors in the training data. The output layer weights are then trained using the Delta rule. BP
is the most widely applied neural network technique. RBFs are gaining in popularity.
Nets can be
● trained on classification data (each output represents one class), and then used directly as classifiers of new data.
● trained on (x,f(x)) points of an unknown function f, and then used to interpolate.

RBFs have the advantage that one can add extra units with centres near parts of the input which are difficult to classify. Both BP and RBFs can also be used for
processing time-varying data: one can consider a window on the data:

Networks of this form (finite-impulse response) have been used in many applications.
There are also networks whose architectures are specialised for processing time-series.

Unsupervised networks:
Simple Perceptrons, BP, and RBF networks need a teacher to tell the network what the desired output
should be. These are supervised networks.
In an unsupervised net, the network adapts purely in response to its inputs. Such networks can learn to
pick out structure in their input.

Applications for unsupervised nets

clustering data:
exactly one of a small number of output units comes on in response to an input.
reducing the dimensionality of data:
data with high dimension (a large number of input units) is compressed into a lower dimension
(small number of output units).
Although learning in these nets can be slow, running the trained net is very fast - even on a computer simulation of a neural net.

Kohonen clustering Algorithm:

- takes a high-dimensional input, and clusters it, but retaining some topological ordering of the output.

After training, an input will cause the output units in some area to become active.
Such clustering (and dimensionality reduction) is very useful as a preprocessing
stage, whether for further neural network data processing, or for more traditional
techniques.

Where are Neural Networks applicable?


..... or are they just a solution in search of a problem?
Neural networks cannot do anything that cannot be done using traditional
computing techniques, BUT they can do some things which would otherwise be
very difficult.

In particular, they can form a model from their training data (or possibly input data) alone.
This is particularly useful with sensory data, or with data from a complex (e.g. chemical, manufacturing, or commercial) process. There may be an algorithm, but
it is not known, or has too many variables. It is easier to let the network learn from examples.

Neural networks are being used:

in investment analysis:
to attempt to predict the movement of stocks currencies etc., from previous data. There, they are replacing earlier simpler linear models.
in signature analysis:
as a mechanism for comparing signatures made (e.g. in a bank) with those stored. This is one of the first large-scale applications of neural networks in the
USA, and is also one of the first to use a neural network chip.
in process control:
there are clearly applications to be made here: most processes cannot be determined as computable algorithms. Newcastle University Chemical Engineering
Department is working with industrial partners (such as Zeneca and BP) in this area.
in monitoring:
networks have been used to monitor
❍ the state of aircraft engines. By monitoring vibration levels and sound, early warning of engine problems can be given.

❍ British Rail have also been testing a similar application monitoring diesel engines.

in marketing:
networks have been used to improve marketing mailshots. One technique is to run a test mailshot, and look at the pattern of returns from this. The idea is to
find a predictive mapping from the data known about the clients to how they have responded. This mapping is then used to direct further mailshots.

To probe further:
A rather longer introduction (which is more commercially oriented) is hosted by StatSoft, Inc.
The Natural Computing Applications Forum runs meetings (with attendees from industry, commerce and academe) on applications of Neural Networks. Contact
NCAF through their website, by telephone +44 (0)1332 246989, or by fax +44 (0)1332 247129
Internet addresses: NeuroNet which was at Kings College, London, was a European Network of Excellence in Neural Networks which finished in March 2001.
However, their website remains a very useful source of information.
IEEE Neural Networks Council http://www.ieee.org/nnc/index.html

The NRC Institute for Information Technology Artificial Intelligence subject index has a useful entry on Neural Networks.
The newsgroup comp.ai.neural-nets has a very useful set of frequently asked questions (FAQs), available as a WWW document at: ftp://ftp.sas.com/pub/neural/FAQ.html

Courses

Quite a few organisations run courses: we used to run a 1-year Masters course in Neural Computation; unfortunately, this course is in abeyance. We can even run courses to suit you. We are about to start up a centre in Computational Intelligence, called INCITE.

More Specialised Information

Some further information about applications can be found from the book Stimulation Initiative for European Neural Applications (SIENA) pages, and there is an
interesting and useful page about applications.
For more information on Neural Networks in the Process Industries, try A. Bulsari's home page .
The company BrainMaker has a nice list of references on applications of its software package that shows the breadth of applications areas.

Journals.

The best journal for application-oriented information is


Neural Computing and Applications, Springer-Verlag. (address: Sweetapple Ho, Catteshall Rd., Godalming, GU7 3DJ)

Books.

There are a lot of books on Neural Computing. See the FAQ above for a much longer list.
For a not-too-mathematical introduction, try
Fausett L., Fundamentals of Neural Networks, Prentice-Hall, 1994. ISBN 0 13 042250 9 or
Gurney K., An Introduction to Neural Networks, UCL Press, 1997, ISBN 1 85728 503 4
Haykin S., Neural Networks , 2nd Edition, Prentice Hall, 1999, ISBN 0 13 273350 1 is a more detailed book, with excellent coverage of the whole subject.

Where are neural networks going?


A great deal of research is going on in neural networks worldwide.
This ranges from basic research into new and more efficient learning algorithms, to networks which can respond to temporally varying patterns (both ongoing at
Stirling), to techniques for implementing neural networks directly in silicon. One commercially available chip already exists, but it does not include adaptation.
Edinburgh University have implemented a neural network chip, and are working on the learning problem.
Production of a learning chip would allow the application of this technology to a whole range of problems where the price of a PC and software cannot be
justified.
There is particular interest in sensory and sensing applications: nets which learn to interpret real-world sensors and learn about their environment.

New Application areas:

Pen PCs
PCs where one can write on a tablet, and the writing will be recognised and translated into (ASCII) text.
Speech and Vision recognition systems

Not new, but Neural Networks are becoming increasingly part of such systems. They are used as a system component, in conjunction with traditional
computers.
White goods and toys
As Neural Network chips become available, the possibility of simple cheap systems which have learned to recognise simple entities (e.g. walls looming, or
simple commands like Go, or Stop), may lead to their incorporation in toys and washing machines etc. Already the Japanese are using a related technology,
fuzzy logic, in this way. There is considerable interest in the combination of fuzzy and neural technologies.

Some comments (added September 2001)


Reading this through, it is a bit outdated: not that there's anything incorrect above, but the world has moved on. Neural Networks should be seen as part of a larger
field sometimes called Soft Computing or Natural Computing. In the last few years, there has been a real movement of the discipline in three different directions:
Neural networks, statistics, generative models, Bayesian inference
There is a sense in which these fields are coalescing. The real problem is making conclusions from incomplete, noisy data, and all of these fields offer
something in this area. Developments in the mathematics underlying these fields have shown that there are real similarities in the techniques used. Chris
Bishop's book Neural Networks for Pattern Recognition, Oxford University Press is a good start on this area.
Neuromorphic Systems
Existing neural network (and indeed other soft computing) systems are generally software models for solving static problems on PCs. But why not free the
concept from the workstation? The area of neuromorphic systems is concerned with real-time implementations of neurally inspired systems, generally
implemented directly in silicon, for sensory and motor tasks. Another aspect is direct implementation of detailed aspects of neurons in silicon (see
Biological Neural Networks below). The main centres worldwide are at the Institute of Neuroinformatics in Zurich, and at the Center for Neuromorphic
Systems Engineering at Caltech. There are also some useful links at this page (from a UK EPSRC Network Project on Silicon and Neurobiology)
Biological Neural Networks
There is real interest in how neural network research and neurophysiology can come together. The pattern recognition aspects of Artificial Neural Networks
don't really explain too much about how real brains actually work. The field called Computational Neuroscience has taken inspiration from both artificial
neural networks and neurophysiology, and attempts to put the two together.

generation5.org - Robotics


An Introduction to Robotics
Most of Artificial Intelligence will eventually lead to robotics. Most neural networking, natural language processing, image
recognition, and speech recognition/synthesis research aims at eventually incorporating its technology into the epitome of robotics -
the creation of a fully humanoid robot.
The field of robotics has been around nearly as long as Artificial Intelligence - but the field has made little progress. This is only
natural, since the field not only attempts to conquer intelligence, but also the body that embodies it - a formidable task indeed!
Robotics, though, is not just about humanoid robots, but also about their commercial applications in manufacturing, safety and
hundreds of other fields. Let us backtrack, though, and look at what constitutes a robot.

What is a Robot?
According to the Oxford Dictionary, a robot is an "apparently human automaton, intelligent and
obedient but impersonal machine". Indeed, the word robot comes from robota, Czech for
'forced labour'. Yet, as robotics advances this definition is rapidly becoming old. Basically, a
robot is a machine designed to do a human job (excluding research robots) that is either tedious,
slow or hazardous. It is only relatively recently that robots have started to employ a degree of
Artificial Intelligence in their work - many robots required human operators, or precise guidance
throughout their missions. Slowly, robots are becoming more and more autonomous.
The difference between robots and machinery is the presence of autonomy, flexibility and
precision. Indeed, many robots are mere extensions of machinery - but as the field advances,
the current 'fine line' will widen more and more. To understand more of what robots can include, let us look at some
examples.

Current Projects
There are many interesting robot projects. RoboMonkey is a great example - a robot built to emulate the gibbon. This incredibly agile
robot can swing from bar to bar (fixed distance) by using its body to give it the correct momentum. The robot learns from its
mistakes, and will adapt accordingly. But for me, the most interesting projects are being conducted at MIT's robotics laboratory. Two
major robot projects are in progress (and have been so for many years) - Cog and Kismet.

http://www.generation5.org/robotics.shtml (1 of 3) [25/06/2002 3:07:37 PM]


generation5.org - Robotics

Cog
When MIT started the Cog project, their primary ideology was that the robot must be built from the
bottom-up. The robot would be taught all the necessary data it needed - with very little explicit
programming. Cog will eventually be a complete humanoid robot; as it currently stands (sits) it has a
head, torso and arms all with proportional degrees of freedom.
As of yet, Cog does not perform any higher-level functions of a human; nevertheless the current robot is
quite incredible. Cog can track targets smoothly with his head (smooth pursuit), and he can move his head if
a stimulus is provided somewhere in his field of vision (saccade to motion). Cog even has a
vestibulo-ocular reflex (the ability to keep the eyes in a fixed position when the head moves). Cog can
recognize faces and can detect whether eye contact has been established by using his high-resolution
cameras. The robot has also learnt how to reach for targets through trial and error and can even imitate
simple facial movements.
The philosophy behind Cog is essential to its success, and to its continued success. The following passage really demonstrates how the
Cog team feels:
"…We believe that classical and neo-classical AI make a fundamental error: both approaches make the mistake of assuming that because a description of reasoning/behavior/learning is
possible at some level, then that description must be made explicit and internal to any system that carries out the reasoning/behavior/learning. This introspective confusion between surface
observations and deep structure has led AI away from its original goals of building complex, versatile, intelligent systems and toward the construction of systems capable of performing
only within limited problem domains and in extremely constrained environmental conditions.…We believe that [our] abilities are a direct result of four intertwining key human attributes:
developmental organization, social interaction, embodiment and physical coupling, and multimodal integration of the system…"

Kismet
Kismet is aimed at researching social interaction between robots and human beings. Kismet is only
a head, but much more detailed than its Cog counterpart. Kismet's head consists of two eyes (with
embedded cameras), ears, a mouth, and eyebrows. These can be combined to produce numerous
facial expressions denoting Kismet's feelings and emotions. Kismet can be stimulated using toys
such as a slinky - over-stimulation will upset him, under-stimulation will bore him, but just the right
amount will make him happy.
The Cog team believes that social interaction is key to helping robots learn and grow - just like
babies. Therefore, Kismet is a precursor to teaching Cog all about life! Kismet has a dedicated
'trainer' that plays and interacts with Kismet, and watches his output (Kismet is controlled by a
Pentium processor) on the monitor to gain a further understanding of the robot's internal states.

Robots at Home
Robotics is slowly making its way into the home - either through leisure, or actual commercial home-based bots. Recently, Probotics
released the world's first true personal robot - Cye. Cye allows its human operator to create a map of the environment (using a
Windows interface) and download it via an IR link to the robot. The robot will then be able to navigate the area doing various tasks -
including vacuuming! Consumer robots, though, have not yet made a big impact; so-called leisure robots, however, are.
Recently, Tiger released a furry little toy called a Furby. These toys were touted to contain some impressive artificial intelligence,
plus the ability to communicate, talk and learn from other Furbies. Furbies have an array of sensors - light sensors, tilt sensors, a
microphone, various strategically-placed buttons, and an IR communication device. Furbies have their own language, but can
apparently learn any other language. Once they have learnt the language, they will start to teach it to other Furbies in their vicinity.
Furbies are extremely well-priced now, and a robotics version, The Gigabot, is now available. Sony released their own version - a
much-hyped, expensive electronic dog, Aibo. The dog does various tricks, has a proximity sensor to avoid obstacles, and can even
pick itself up when it falls over! The unit was released in several short bursts, with Sony releasing about 10,000 units, each valued
at about $2,500!
[Pictures: (left to right) the Cye SR, a Furby, and Aibo.]

Conclusion
Robotics is an absolutely fascinating field that interests most people - AI buff or not. As research from more serious robotics projects
such as Cog and Kismet filters down into the commercial arena, we should look forward to some very interesting (and cheap) virtual
pets like Aibo and the Furbies. Hopefully, commercial home-based robots will also be available for a price not more than an expensive
vacuum cleaner. With computers becoming more and more powerful, interfacing home robots with your computer will become a
reality, and housework will (hopefully!) disappear.

● Hardware Reviews - Reviews for robots like the Sony AIBO, LEGO Mindstorms and more!
● Introduction to Robotics - The basics of robots.
● Problems with Machine Vision - An intro to the problems that image recognition faces.
● AISolutions - Many robotics articles.



generation5.org - Artificial Intelligence and Skepticism


Artificial Intelligence and Skepticism


What else could be more exciting than having intellectuals (us) creating intellect? Artificial Intelligence (AI) is a scientific study
encompassing philosophy, computer science, mathematics, and even history, biology or engineering. Perhaps the most far-reaching
goal in AI is to build an artificial human being. Unfortunately (or maybe fortunately) we have not nearly reached this level. Perhaps
the most important purpose of AI is to increase human understanding of learning, reasoning, and other cognitive processes. One day
we may be able to answer the important philosophical questions that were once unanswerable. What is intelligence? Are machines
intelligent? Are machines capable of consciousness? Emotion? Or even aesthetics? And if so, how can we build machines that have
these animal/human characteristics of intelligence? These considerations must be taken very seriously.

The Skepticism: It Can't Be Done!


Many people believe that only people and animals can possess intelligence; almost all science fiction movies portray computers as
zombie-like objects that can only deal with 1s and 0s. Machines can be intelligent. The intelligence that a chess-playing program
exhibits may not be the same as a human chess-player's, but it would be ludicrous to say that the program did not possess
intelligence in terms of playing chess. However, it is not uncommon to believe that machines can only
"crunch numbers", and that if they ever do anything that would exhibit human intelligence, it is always superficial and menial. This
misconception usually lies in the paradox that although our computers can make calculations millions of times faster than any average
human being, they are as yet unable to perform many simple tasks that humans can accomplish. We can build chess-playing programs
that will take even the highest of chess grandmasters down, but we have yet to create a machine that can talk as proficiently as a
human of average intelligence. There are expert systems that exhibit problem-solving skills and other aspects of human intelligence.
MOLGEN (created by M. Stefik at Stanford University in 1979) is a program that plans scientific experiments to help molecular
geneticists. This program actually developed a method for producing insulin in bacteria. It works by manipulating its huge
knowledge base of molecular genetics to generate sequences of small steps (all of which are constrained by certain rules) that
constitute plans that help it arrive at a solution.
This is not to say that if something so mechanical and "dead" can possess intelligence, then human beings are no longer special. All
machines are subservient to us; we are far from the point of creating machines that will surpass the status of human beings.
Some AI programs can be very flexible. They are able to adapt to certain situations, as well as perform a variety of tasks, very much
like human beings. Usually, though, most AI programs - specifically Symbolic AI systems - are not very flexible. These programs are
usually created to accomplish specific tasks for the betterment of our lives. If we wish to build a machine that can emulate human
thinking, a Symbolic AI program which operates under a rigid set of rules is definitely not the right approach. Human reason cannot
be boiled down to just a simple set of conditions and subsequent if-then actions, although in a top-down, general fashion this may be
true. Suppose that you want to go to the mall, but if it were raining you would change your mind; it turns out that it is not
raining, so you go. A system that operates under a set of rules will have great difficulty creating totally different, "out of the blue"
rules that may be necessary for learning and adaptation. A story understander such as SAM (Script Applier Mechanism) that utilizes
conceptual representation may be told by a script that JOHN is a PERSON. Yet, does it really know that JOHN is really a PERSON,
with legs, arms, a brain, etc.? This is a problem that has perplexed many AI researchers. The solution to this problem was a more
flexible approach to reasoning: the connectionist approach. In this system, meanings are interconnected and linked. Each definition of
something gives a definition to something else. JOHN could be a PERSON but he could also be a DOCTOR.

Where do these ideas come from? The answers to machine intelligence lie not in the thinking of machines, but in the way we think.

Learning Machines?
Back in the 50s, AI researchers were never concerned with learning; they were very content with their chess-playing programs and
theorem provers. As more research was done on understanding, it was discovered that learning could become a very
powerful tool for intelligent systems. Many types of systems present different approaches to learning (see Aspects of Artificial
Intelligence Systems). Generally speaking, learning is very important and almost essential if we expect a machine to adapt to its
ever-changing environment. It is impossible to endow a machine with all the thoughts and concepts that humans possess. If instead
we created a system that could learn these things by itself, then our task would be made much simpler. Through simple observations,
or experimenting through trial and error, our machines can learn.

A Definition of Understanding
In AI, we can say, in the most general terms, that a system understands if it behaves accordingly. For example, if a robot is told to
sweep the floor, it understands if it follows the command; if not, then it does not understand. This poses a problem. Suppose a
system "accidentally" behaves in a way that would imply understanding; conceptually, it doesn't really understand, does it? And in
some other cases, a system doesn't necessarily have to behave physically in order to show that it understands. For example, we can
understand that 2+2=4 without having to write it out on paper. For these cases we must seek a deeper definition. A system
understands something if it makes changes to its internal conceptual structure, or appropriately modifies its knowledge base. This
definition is very ambiguous, but generality is not always bad. It's true that SAM (Script Applier Mechanism) can only understand
the letters in words, but not necessarily what those words would imply in the real world (SAM wouldn't know what JOHN looks like).
Nevertheless, it understands something about the nature of words, and their syntax and semantics.


generation5.org - Artificial Life


Artificial Life
Artificial Life, or ALife, is the attempt to get computers to accurately model the ways and practices of nature. As you can tell from
this definition, not only is ALife a large domain, it overlaps very much with Artificial Intelligence. So much so, in fact, that ALife
deals with various aspects of AI (such as genetic algorithms), and AI deals with various aspects of ALife (such as flocking).

Introduction to ALife
ALife and AI coexist together very well. ALife looks at algorithms that can mimic nature and its ways -- cellular automata, simulation
of group behaviour (ants, for example) -- whereas AI tends to look at mimicking (creating?) human intelligence.

Cellular Automata
A prime area of ALife is that of Cellular Automata (CA). The best definition of CA I've found so far is as follows:
"...A regular spatial lattice of "cells", each of which can have any one of a finite number of states. The state of all cells
in the lattice are updated simultaneously and the state of the entire lattice advances in discrete time steps. The state of
each cell in the lattice is updated according to a local rule which may depend on the state of the cell and its neighbors at
the previous time step..." From FOLDOC.
There are two very good examples of CAs. Wolfram's 1D CA, and Conway's Life, a 2D CA.

Wolfram's 1D-CA
Wolfram was a genius for his age, and he showed how incredible complexity could come out of simple rules and
simple structures. Wolfram created a program where each of the cells was either on or off (dead or alive). It started
off with an initial configuration (either one cell, or a random stream of them), then the cells beneath the initial line
are determined by the previous line. There are eight possible combinations for Line A that will determine Line B -
000, 001, 010, 011, 100, 101, 110, 111. The results are dependent on the rule set (there are 256 possibilities).
Recently (23/8/99), I created a Windows 95 program that allows you to create your own 1D CAs. The program can generate some
very interesting results, and can generate results both from a point and a line. This program replaced the old Wolfram Pascal program
I'd created.
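The rule-lookup idea is small enough to sketch in a few lines of Python (the rule number and row width below are arbitrary illustrations, not taken from the program described above):

    def step_1d_ca(cells, rule=30):
        # One generation of a 1D binary CA under a Wolfram rule number (0-255).
        # Each new cell is the rule's bit for its three-cell neighbourhood.
        n = len(cells)
        out = []
        for i in range(n):
            left, centre, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
            index = (left << 2) | (centre << 1) | right  # one of the 8 combinations
            out.append((rule >> index) & 1)
        return out

    # Start from a single live cell and print a few generations.
    row = [0] * 31
    row[15] = 1
    for _ in range(8):
        print("".join("#" if c else "." for c in row))
        row = step_1d_ca(row)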

Conway's Life
A good example of ALife is the classic program, Conway's Life. The rules to Conway's Life are simple - each pixel represents a cell.
A cell cannot live with either more than 3 or fewer than 2 other cells immediately adjacent to it. If the number of cells around an
empty space is exactly 3, then a new cell is born. These simple rules lead to rather complex behaviour. The Life system will
eventually stabilize, either by dying out completely, or by an equilibrium point being established.
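Those birth and survival rules translate almost directly into code. A minimal Python sketch, storing live cells as a set of coordinates:

    from collections import Counter

    def life_step(alive):
        # One generation of Conway's Life. A live cell survives with 2 or 3
        # live neighbours; an empty space with exactly 3 becomes a new cell.
        counts = Counter((x + dx, y + dy)
                         for (x, y) in alive
                         for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                         if (dx, dy) != (0, 0))
        return {cell for cell, n in counts.items()
                if n == 3 or (n == 2 and cell in alive)}

    # A five-cell glider: after four generations it reappears, shifted.
    cells = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
    for _ in range(4):
        cells = life_step(cells)
    print(sorted(cells))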

Conway's Life has a few more interesting aspects to it, though. There is a certain pattern of cells called a glider. Gliders
are cells that will, after a few life-cycles, repeat themselves, but in a different position. Enter the common glider pattern
into the Multilife program and you will see the recurring pattern. What's so
important about gliders? They do math! The explanation behind all of this is rather complicated, and beyond the scope
of this essay. But gliders can be aligned in such a way that they can perform bit-wise operations like AND, OR, NOT,
XOR etc. More complicated functions often require very complicated alignments, but they are possible. So, a simulation following
extremely simple rules can yield incredibly complicated results. This is the ALife equivalent of Turing Machines.

For an interactive Java simulation of life, check out Multilife.

Behaviour
Another area ALife covers is that of animal behaviours - finding mathematical formulas behind behaviour. The most famous study of
behaviour was Craig Reynolds' Boids - a program that simulated flocking to an incredibly realistic extent. Again, the program used 3
very simple rules, yet yielded incredibly realistic behaviour. The rules were:
● Separation: Steer to avoid crowding local flockmates.

● Alignment: Steer towards the average heading of local flockmates.

● Cohesion: Steer to move towards average position of local flockmates.

You can find a more complicated example of flocking based on Boids here.
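A minimal Python sketch of one update step under those three rules (assuming numpy; the radii and rule weights are invented for illustration and are not Reynolds' original parameters):

    import numpy as np

    def boids_step(pos, vel, dt=0.1, near=2.0, too_close=0.5):
        # One flocking update: separation, alignment and cohesion, each
        # computed from the boid's local flockmates only.
        acc = np.zeros_like(pos)
        for i in range(len(pos)):
            offsets = pos - pos[i]
            dist = np.linalg.norm(offsets, axis=1)
            mates = (dist < near) & (dist > 0)
            if mates.any():
                acc[i] += offsets[mates].mean(axis=0)        # cohesion
                acc[i] += vel[mates].mean(axis=0) - vel[i]   # alignment
            crowded = (dist < too_close) & (dist > 0)
            if crowded.any():
                acc[i] -= 1.5 * offsets[crowded].sum(axis=0) # separation
        vel = vel + dt * acc
        return pos + dt * vel, vel

    rng = np.random.default_rng(0)
    pos, vel = rng.normal(size=(30, 2)), rng.normal(size=(30, 2))
    pos, vel = boids_step(pos, vel)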

The Philosophies
Just where does ALife begin and Artificial Intelligence end? Will Artificial Intelligence come naturally if Artificial Life is
successfully created? Does Artificial Intelligence constitute Artificial Life? These are three questions that have been with ALife and
AI since they were first conceived.
Will Artificial Intelligence come naturally if ALife is created? Perhaps, perhaps not. 'Successfully created' is so subjective, and the question of
intelligence has been a philosophical question that has plagued AI since its foundations.
The question of whether AI is ALife is interesting. Imagine an artificially intelligent program that doesn't simulate any of the natural
processes of life, yet shows the ability to cognitively understand human speech, make completely autonomous decisions, possibly
even simulate emotions - would that constitute life? Some would say yes, some no. These areas are where ALife and AI meet
morals, ethics and politics.

● ALife Essays
● ALife Programs - Windows-based, with full source code!
● ALife Books
● ALife Software
● AISolutions



generation5.org - Essays


Aspects of Artificial Intelligence Systems


Symbolic Artificial Intelligence
Symbolic AI systems are designed and programmed, rather than trained or evolved. They are algorithmic in nature, yet powerful;
they function under rules. Symbolic AI systems are typically confined to a narrow task such as chess playing, or theorem
proving. Thus, they tend to be very fragile, and are rarely effective outside of their assigned domain; for example, a chess-playing
program would perform poorly, if at all, at diagnosing malaria compared to a disease-diagnosing expert system.
See Production Systems, Turing Machines.

Artificial Neural Networks


Artificial Neural Networks (ANNs) are computational-cognitive models based on the structure of the nervous system. The difference,
or maybe the advantage, of ANNs over expert systems is that they are trained rather than programmed. They learn and evolve to fit their
environment, beyond the care and attention of their creator. Although expert systems are capable of pattern-matching and learning to
some degree, the amount of learning that an ANN can undergo is greater as well as more flexible. For example, consider the
knowledge that human beings possess. It would be quite impossible to straightforwardly program a system that would store and
manipulate information to that capacity (you would have to specify everything a human being knew - manually). The problem would
be made much more feasible if we created a "learning machine". Learning is an important prerequisite for artificial minds. ANNs are
most widely used for pattern recognition or classification problems; however, in theory, anything any computer can do can be
accomplished by an ANN.
See Artificial Neural Networks.

Computational Neuroethology
Computational neuroethology (CN) is the study of a system's behavior within an environment. It is concerned with the modeling of
the behavior of these systems, as well as their neural substrates (which is what neuroethology is concerned with). CN systems
perceive their environments directly; i.e., their environments are not stored in some global database created through human input. They
work in a "closed-loop" environment, free from outside interactions. Their actions are based solely on what they conclude from the
state of their environment as well as their prior actions. For example, if we wish to simulate a robot in a closed-loop
environment, then it must act not based on whatever semantics or clues could be provided by a human, but simply on the
changes (or the state) of the environment that it is in.
CN systems learn neurally and evolve genetically. They are adaptive, and act based on the circumstances that they face in their
environment. The drawback with CN systems is that they require enormous computational resources. However, many CN
advocates insist that much more can be gained from building complex models of simple animals (systems) than from building
simple models of complex animals (which is the traditional, and more direct approach). Following this idea, the ultimate goal of AI is
to create a human being. Yet to accomplish this, we must first create a baby, not a full-grown adult.


Artificial Life
Artificial Life (AL) is the study of artificial systems that behave like natural living systems. One example of AL is Reynolds'
Boids (link to Java adaptation of Boids - Flozoids). This is a computer model of flocking behavior in animals (such as birds or fish).
One of its characteristics is that the flock (which is made up of many "boids") will always reassemble if it passes through an obstacle
(which causes it to scatter). How does it accomplish this? Well, each boid follows a few rules, such as: don't fall behind, keep up with
nearby boids, keep a minimum distance from your neighbors and obstacles, and move towards what seems to be the center of
mass of nearby boids. While these rules may seem very simple, the result is a bunch of boids behaving like a real flock. If you would
like to learn more about boids, see our exclusive interview with Craig Reynolds or Artificial Life.

Another example of AL is, surprisingly, computer viruses. Computer viruses exhibit reproductive behavior, usually with the
intention of making trouble, emulating their biological counterparts.
Artificial life systems are mainly concerned with the formal basis of life. It's not important how an AL system was created, but how it
acts and behaves in its environment. They attempt to emulate lifelike behavior. Many AL systems often evolve and coevolve,
simulating evolutionary processes for adaptation to an environment (luckily, computer viruses cannot evolve; hopefully this will
remain true for decades to come). However, there are limitations to these simulations, because of the simple fact that
everything in the physical universe can't be fully detailed. The more accurate (and the more numerous) the details
given in the simulation, the more likely it is to be successful.

● ALife Essays
● ALife Programs - Windows-based, with full source code!
● ALife Books
● ALife Software
● AISolutions



Basics of Game AI
By Geoff Howland

Game AI (artificial intelligence) is the subject of a lot of discussion recently, with good cause. As games forge ahead in the
areas of graphics and sounds it becomes increasingly obvious when the actions of the game controlled players are not
functioning in an "intelligent" manner.

More important than the game-controlled player's "intelligence" is really the game-controlled player's stupidity. Most
game players are not expecting to load up the newest first-person shooter and face off against Professor Moriarty with a
nail gun. They do expect that they will not be playing against an inbred lobotomy patient that is confounded by the
complexity of a corner.

Unit Behavioral AI
Game AI is not necessarily AI in the standard meaning of the word. Game AI for units is really an attempt to program
life-like attributes to provide a challenge or an appearance of reality.

A guard in a game that stands in one place never moving can seem very unrealistic. However, if you create a routine to
make that guard look around in different directions from time to time, or change his posture, he can start to look more
alive. By creating a situation where another guard who walks in a pre-made path occasionally stops in front of the standing
guard and appears to speak to him, the appearance of reality can be increased greatly.

There are two categories of actions in unit AI: reactionary and spontaneous.

Units act in a reactionary manner whenever they are responding to a change in their environment. If an enemy spots you
and starts to run towards you and shoot, then they have acted as a reaction to seeing you.

Units act in a spontaneous manner when they perform an action that is not based on any change in their environment. A
unit that decides to move from his standing guard post to a walking sentry around the base has made a spontaneous action.

By using both types of unit behaviors in your game, you can create the illusion that you have "intelligent" units who are
autonomous, and not necessarily simple machines.

Reactionary AI
Reactionary inputs should always be based on the senses of the units. When modeling AI after human characters you need
to take into consideration their range and distance of sight, their sense of hearing and if applicable, smell.

Creating levels of alertness is a good way to handle the input of different senses. If a unit sees an enemy directly in its
sights, then it should switch to an alert mode that corresponds to how it would react to confronting an enemy. If the unit
does not see the enemy but hears footsteps or a gunshot, then the unit should go to a corresponding level of alert that
would apply to non-direct knowledge of the situation.

In the case of a unit that is a guard, hearing a gunshot would make the guard move to investigate the area the shot was
fired in. Hearing footsteps may make the guard lie in wait to ambush the walking unit. All of these different kinds of
actions and alerts can be set up in a rule-based or fuzzy logic system so that you can interpret each sound or sighting by
every unit and have them react in an appropriate manner, as in the sketch below.
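A minimal Python sketch of such a rule table (the event names, alert levels and actions are invented for illustration, not taken from any particular game):

    def update_alert(guard, event):
        # Map a sensory event to an alert level and an action. Direct sightings
        # produce the strongest response; indirect cues a weaker one.
        rules = {
            "sees_enemy":      ("combat",        "attack"),
            "hears_gunshot":   ("investigating", "move_toward_sound"),
            "hears_footsteps": ("suspicious",    "wait_in_ambush"),
        }
        if event in rules:
            guard["alert"], guard["action"] = rules[event]
        elif guard["alert"] == "calm":
            guard["action"] = "patrol"
        return guard

    guard = {"alert": "calm", "action": "patrol"}
    print(update_alert(guard, "hears_gunshot"))   # investigating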

A general alert is also an important factor in the appearance of reality and intelligence in a game. If you have been running
around shooting up a base full of people and keep encountering new enemies who are completely clueless to the fact there
has been non-stop gun fire for the past ten minutes, it is going to really seem out of place. By creating an alert system for
all the units, and perhaps an alert plan, you can strengthen the appearance of reality in your game world.


An alert plan would consist of rules that units would follow if there were an alert, as opposed to rules for non-alert
situations. For instance, if there was an alert you could have all units that are in non-critical areas run to the entrances of
the base to assist in the defense.

Spontaneous AI
Spontaneous AI is extremely important in creating a sense of life in your game worlds. If everyone you meet is just
standing there waiting for you to talk to them or kill them, or almost worse, wandering around aimlessly, it is not going to
be very convincing for the player.

Some methods of breaking up the standing-around problem are to give each of the units a set of non-alert goals. These
could include walking pre-set paths, walking to pre-set areas randomly, stopping next to other units occasionally when
passing them, and walking with other units to pre-set destinations. I have said pre-set in all of these situations because
unless you come up with a very good algorithm for setting destinations or paths, your units are going to look like they are
wandering aimlessly. A simple goal picker is sketched below.
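A minimal Python sketch of such a goal picker (the goal names are placeholders):

    import random

    NON_ALERT_GOALS = ["walk_patrol_path", "walk_to_random_area",
                       "stop_and_chat", "walk_with_other_unit"]

    def pick_idle_goal(unit):
        # Give an idle unit a spontaneous, non-alert goal from a pre-set list.
        # Avoid repeating the last goal so the unit looks less mechanical.
        choices = [g for g in NON_ALERT_GOALS if g != unit.get("goal")]
        unit["goal"] = random.choice(choices)
        return unit["goal"]

    unit = {"goal": None}
    print(pick_idle_goal(unit))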

Unit Actions
What really makes a game unit look intelligent is its actions. If it moves in a way that the player might, or performs an
action, such as ducking, in a situation where the player might, then the unit can look intelligent. You do not necessarily need
a lot of actions to create an illusion of intelligent actions; you just need enough to cover whatever basic situations your
units will be involved in.

The more areas you cover, if you cover them well, the greater the chance that your players will believe your units are
acting "intelligently". Put yourself in the place of your units. What would you do in their situation? How would you
respond to various forms of attack or enemy encounters? What would you be doing if nothing was going on at all?

If you answer these, and correctly implement them for each situation your units will encounter, you will maximize your
chances of having intelligent-looking units, and that is the first step to creating a good, solid game AI.



GameDev.net - Comp.AI.Games FAQ


See Also:
Artificial Intelligence:Introduction

COMP.AI.GAMES
Frequently Asked Questions
8th January 2000
Version 2.01
Currently maintained by David J Burbage (dave@blueleg.co.uk)
Originally found at http://www.geocities.com/cagfaq/cf1.htm

Table of Contents
0 Introductory Stuff
0.1 Copyright
0.2 Disclaimer
0.3 Work in Progress
0.4 Netiquette and related topics
0.5 Whom to blame
0.6 What's New in this Release?

1 Overview of comp.ai.games
1.1 What is the purpose of comp.ai.games?
1.2 What are appropriate topics on c.a.g?
1.3 What are _not_ appropriate topics on c.a.g?
1.4 Is there an archive for this group?

2 Overview of Artificial Intelligence


2.1 What is Artificial Intelligence?
2.2 What constitutes AI in games? What doesn't?
2.3 Should computer opponents be considered AI at all?
2.4 What languages and tools are available for developing AI applications?
2.5 What development methods have been shown to work?
2.5A Choose the least amount of AI that will get the job done
2.5B Start Small

3 Solved and Unsolved Problems in AI for Games


3.1 What game AI problems have been solved?
3.1A Finding the shortest path from point A to point B
3.1B Traversing a maze
3.1C Exhaustive search of sequences of a limited set of moves (gametrees - minimax)
3.1D Database knowledge of information to apply
3.2 What game-applicable AI problems have not been solved yet?
3.3 What games have been "solved"


4 Promising areas of AI research in gaming

5 Fact Sheets and descriptions of gaming AI projects

6 Game AI in the larger world


6.1 Who is sponsoring research into AI in games?
6.2 How can I get a job or assistanceship writing games?
6.3 What commercial games use AI?
6.4 What commercial games advertise AI but don't really use it?
6.5 Are there any AI competitions?

7 Further information
7.1 Where is Web information on XXX?
7.2 What are some related news groups?
7.3 Where can I FTP text and binaries?
7.4 What books are available on AI and gaming topics?
8 Source Code
9 Glossary
10 Acknowledgements
Note: You can most easily move from section to section below by searching for lines of "-".
---------------------------------------------------------------------------
Proposed FAQ for comp.ai.games

0 Introductory Stuff
0.1 Copyright
Portions copyright 1999 David J Burbage. All rights reserved.
Portions copyright 1996 Sunir Shah. All rights reserved.
Portions copyright (c) 1995 by Steve Furlong. All rights reserved.
Portions copyright 1995 by Doug Walker and others.
Sunir Shah currently has permission to edit sections copyright Steve Furlong, Doug Walker and Rob Uhl.
This FAQ may be freely redistributed in its entirety without modification provided that this copyright notice is not
removed. It may not be sold for profit or incorporated in commercial documents (e.g., published for sale on CD-ROM,
floppy disks, books, magazines, or other print form) without the prior written permission of the copyright holder.
Permission is expressly granted for this document to be made available for file transfer from installations offering
unrestricted anonymous file transfer on the Internet.

0.2 Disclaimer
This FAQ is provided AS IS without any express or implied warranty. While the copyright holder and others have made
every effort to ensure that the information contained herein is accurate, they cannot be held liable for any damages arising
from its use.


0.3 Work in Progress


As the new maintainer of this FAQ, it needs a spring clean. So it may be changing somewhat over the next few weeks and
months. I'm going to introduce a version number so it can be tracked historically. All versions previous to this can be
considered as 1.x where x is a changing date... there is a new revision history at the bottom of this text.
The FAQ has been maintained rather sporadically in the past, I hope to do at least a bit better. So nag me if you don't see
updates. Especially if you've submitted new information for inclusion.

0.4 Netiquette and Related Topics


See Section 1 for the introduction to comp.ai.games. This is the official FAQ.
The newsgroup is rather good insofar as there are not thousands of messages every day, and there is a good mix of
expertise from people in the games industry, people who are interested in the topic but work in other computing fields, and
people who are simply interested in the topic for itself. This makes for a better newsgroup than some. I'd like to see it kept
that way.
We don't see much flaming, there is (of course) some spamming but far better than in some other newsgroups. In general,
there are several broad areas which can harm the signal-to-noise ratio in a newsgroup. By taking a little extra care in
posting, we can all help keep the noise down.
Some guidelines to keeping everyone sane and happy are as follows:
First, do not post questions which are answered in the FAQ. The FAQ will be posted regularly and will also be available
by www. DO, however, send pertinent and contributory information to the keeper of the FAQ, dave@blueleg.co.uk . That
will help us keep the FAQ useful. The reason some questions are frequently asked is that the FAQ has dated information
and just asking often provides much more information than reading the FAQ.
Second, do not respond to flames, flame-bait, and other obnoxious posts. Often, trollers (in the UK, known as wind-up
merchants) will post false or inflammatory messages and attempt to whip up a flamewar by working both sides of the
debate. Let them flame each other, there's no need to help them out.
Third, use KEYWORDS. If your post is about chess, put "CHESS:" at the start of your subject line. This will enable those
with smart newsreaders to automatically select or kill your article. If it's a question, add "Q:" to the subject line. A larger
list of suggested keywords will be included in the FAQ. Any keywording conventions which arise in the course of the
group's development will also be added.
Fourth, please don't cross-post. If an article is earth-shatteringly important to more than one group, fine. When following
up an article which is excessively cross-posted, remove the cross-posts.
There was a three-month flamewar/debate over the merits of Ada which would have died if the original troll hadn't been
cross-posted to every comp.lang group in existence. What finally killed it was a lot of people removing the excessive
cross-posts. If it starts here, let's put it out of our misery quickly. You can set a Follow-Up To: newsgroup in your article,
to let folks who are interested follow the discussion without taking up more than one newsgroup. If your software allows
it, of course.
Finally, if someone is posting in the wrong group, tell them via mail. There are a lot of newbies out there, and there are
more on the way. If we let a flamewar erupt every time a newbie posts in the wrong place, we'll be all noise and no signal
in no time.


0.5 Whom to blame


If you have comments or concerns about this FAQ, please direct them to Dave Burbage (dave@blueleg.co.uk).

0.6 What's new in this release?


(2.00) - Inclusion of explanation of some important algorithms. Newer links to the interesting sites on the Web. Additional
book references. An online version with hyperlinks will be available soon. Again, nag me if it doesn't happen soon.
(2.01) - Corrections from Sunir, Bjorn and others.
---------------------------------------------------------------------------

1 Overview of comp.ai.games
1.1 What is the purpose of comp.ai.games?
CHARTER for comp.ai.games
The newsgroup comp.ai.games will provide a forum for the discussion of artificial intelligence in games and
game-playing. The group will explore practical and theoretical aspects of applying artificial intelligence concepts to the
domain of game-playing. In addition to the traditional game areas such as chess and go, the group will also welcome those
seeking to bring artificial intelligence into other computer games. Computer games in this context would consist of games
played by humans against computers, and by computers (including robots) against each other. This newsgroup is not an
appropriate place for the discussion of machine specific coding problems, nor is it the proper place to discuss strategies for
defeating computer opponents in existing games. There are other newsgroups already in existence to answer these
questions.
It should be pointed out that comp.ai.games is *not* just for professional or academic discussion of artificial intelligence
in games. Amateurs and hobbyist game developers will find themselves welcome here as well.

1.2 What are appropriate topics on comp.ai.games?


There are plenty. Frequently occurring topics include
● pathfinding

● minimax algorithms

● neural nets

● boardgame logic

● algorithm discussions and approaches

It is certainly in order to post links to samples, demos and downloads for purposes of discussion of AI in games. There are
a fair number on the Net already, and many can be listed in this FAQ. Send me your URLs.
1.2.1 Game Designers - If you are/have/will be working on a commercially released game, please feel free to discuss how the
AI was approached for the game(s) in question. Of course, we don't want to know anything which is commercially
sensitive, but there's a distinct lack of knowledge in this area (how is the AI in well-known games approached?).


1.3 What are _not_ appropriate topics for comp.ai.games?


● Discussion of basic 'hints and tips' for games. Where a game has poor or strong AI, on the other hand, let us know.
● Discussion of why you liked or didn't like a particular game, except as it used AI.
● Endless "is not/is too" discussions of what constitutes AI, of why academics do or do not have anything to show for
all their work, or of why Game A's purported AI is better than Game B's.
● Veering into basic programming issues, apart from approaches to the AI development. There are many other
newsgroups which are better places for these discussions!

1.4 Is there an archive for this group?


Older ftp archive (up to 16th August 1997): ftp://ftp.cs.cmu.edu:/user/ai/pubs/news/comp.ai.games/
New archive : use DejaNews (http://www.dejanews.com).
If there is a desire to keep an independent (not in the hands of DejaNews) archive then let me know and it can be planned
accordingly. If anyone knows of another source of the newsgroup archive, let me know and I will point to it in this FAQ.
---------------------------------------------------------------------------

2 Overview of Artificial Intelligence


2.1 What is Artificial Intelligence?
Unfortunately there's no easy answer to this, as it means different things to different people.

2.2 What constitutes AI in games? What doesn't constitute AI?
What it is: code which looks at game information and produces apparently 'intelligent' responses. Whether this is the
result of a Finite State Machine, a Fuzzy State Machine, genetic algorithms, scripting rules, minimax, database lookup or
plain cheating, it can all be called AI with varying degrees of honesty.
What can't be called AI: random responses (with few exceptions), and fixed scripts - e.g. the gate opens after 10 seconds, a
predetermined happening.

2.3 Should computer opponents be considered AI at all?


Some games can be attacked by mathematical means instead of AI. While analyzing a game ahead of time and coding in
an optimal algorithm might not be considered AI, mathematical analysis -- when possible! -- usually achieves much better
results than AI. For analysis on hundreds of such games, see "Winning Ways for your mathematical plays" volume 1 & 2,
by Elwyn R. Berlekamp, John H. Conway, and Richard K. Guy. ISBN 0-12-031101-9 and 0-12-091102-7.

2.4 What languages and tools are available for developing AI applications?
2.4A "Pure" AI Languages


2.4A1 LISP
2.4A2 PROLOG
Games in these languages will probably not be commercially released, but who knows? There are LISP-like scripting
elements in some commercial games.
2.4B Conventional Languages
2.4B1 C and Pascal
2.4B2 C++ and Object Pascal
2.4B3 Visual Basic or other BASICs
All these can be summarised at once... AI techniques are algorithms. The language they are implemented in is beside the
point. So, if you can code most algorithms in a language, you're all set. The intelligence is in the algorithm, not the
language.
2.4C 4th Generation Languages
A very high-level language. May use natural English or visual constructs. Algorithms or data structures may be chosen by
the compiler.
The best example is Smalltalk. These are not known for their applicability to games.
2.4D Third-party Libraries
"AISEARCH--C++ Search Class Library", Victor Volkman, C/C++ Users Journal, November 1994. Describes a library
with several search algorithms. The library is available in binary from the C Users Group Library as CUG volume 42.
2.4E Development Environments

2.5 What development methods have been shown to work?


2.5A Choose the least amount of AI that will get the job done
Q#: I'm writing a game, and want to use artificial intelligence for the computer opponent. What's the best way to go about
that?
A#: Start simple. Make the computer opponent do something. You can always build on the success of the bits that work,
and clear out the bits that don't. Evolutionary programming...
There are many different approaches :
A#a: Fixed Responses
You can easily write a computer opponent which behaves in a set fashion. A 'Finite State Machine' would be an example
of this. For instance, the Space Invaders aliens move in a fixed pattern and drop bombs at random. The PacMan ghosts
always moved toward your guy, subject to the constraints of the walls. These examples are very dated, but this should
ensure that the examples are those which almost everyone will have seen, and which used the most basic computer
opponent methods.
The computer opponent either takes no notice of the player's activities (eg, Space Invaders) or treats them in a simplistic
manner (eg, PacMan). This makes for easier programming, with "memory" and "reasoning" taking almost no part in
planning the computer's actions.
The problem with these methods, of course, is that they aren't too satisfying to the player. The only way for the computer
to beat a non-klutz in all of the older and most of the newer action games is to throw so many opponents that the player is
overwhelmed, or speed up to the point that human reflexes just can't keep up, or simply wear him out. This can make for
an exciting, playable game even today, but it shouldn't be confused with AI.
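A fixed-response opponent of this kind is often just a small finite state machine, i.e. a transition table. A minimal Python sketch (the states and transitions are invented for illustration):

    def opponent_update(state, player_visible):
        # A tiny finite state machine: the next state depends only on the
        # current state and one observation. Unlisted pairs keep the state.
        transitions = {
            ("patrol", True):  "chase",    # saw the player: start chasing
            ("chase",  False): "search",   # lost sight: look around
            ("search", True):  "chase",
            ("search", False): "patrol",   # give up, resume the fixed route
        }
        return transitions.get((state, player_visible), state)

    state = "patrol"
    for seen in [False, True, True, False, False]:
        state = opponent_update(state, seen)
    print(state)   # back to "patrol"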


A#b: Fixed Rules


The next rung on the complexity ladder is giving the computer opponent some "judgement". "Judgement" is in quotes
because here we're talking about decisions based on an algorithm determined when the program is written, not a truly
intelligent judgement worthy of a human being with a brain worthy of the name. Appropriate techniques might include
having a space craft choose a weapon based on the distance to and velocity of the target, or even deciding to run away.
The decision on which weapon to fire can be taken from tables generated by the programmer, based on his walk-through
experience, or mathematical calculation, or simply his best guess as to the most appropriate choice in different
circumstances.
More generally, the computer opponent can choose among alternatives by using an arbitrarily complex fixed algorithm
and an arbitrarily large database. This leads to a good simulation of intelligence which will satisfy most players. In fact,
many of the early posters to comp.ai.games referred to this method in their quest for intelligent opponents. A computer
playing backgammon decides its next move based on the current board position, the dice roll, and the value of the
doubling cube. Similarly, a chess program uses an algorithm which assigns values to the different pieces, and usually to
positions as well. The computer will determine the value of each position some number of moves down the road, assuming
(at this level) perfect play by the human, and will choose the move which leads to the position with the maximum value.
This cannot really be called "intelligence" on the part of the computer opponent, because the computer is obeying rules set
forth when the program was written. The computer also is not analyzing the user's style. A computer opponent could,
supposedly, analyze the human player's style and use that as another variable in a fixed algorithm or lookup table. That
could lead to a combinatorial explosion in table size.
A variation on this method is to have several computer opponents, which have different styles and possibly different
capabilities. The computer opponent can be chosen by the player or at random.
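As a concrete illustration of fixed-rule "judgement", here is a minimal C++ sketch of a weapon choice driven by distance, target speed, and energy. The weapons and the threshold values are invented; in a real game they would come from the programmer's tables, calculations, or play-testing.

enum class Weapon { Laser, Missile, Flee };

// Fixed-rule "judgement": the thresholds are the programmer's best
// guesses, baked in when the program is written.
Weapon chooseWeapon(double distance, double targetSpeed, double energy) {
    if (energy < 10.0)     return Weapon::Flee;     // too weak to fight
    if (distance < 50.0)   return Weapon::Laser;    // hitscan wins up close
    if (targetSpeed < 5.0) return Weapon::Missile;  // slow targets are easy to track
    return Weapon::Flee;                            // fast and far away: disengage
}
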
A#c: AI-Generated Lookup Tables
"Fixed Rules", when characterized as non-intelligent, it didn't mean to diminish the amount of work that goes into
developing the algorithms or the lookup tables. In the case of backgammon, for instance, the computer has to choose two
or four moves from possibly several dozen moves, as well as decide whether to double or resign. The programmer could
conceivably write a lookup table for every possible piece position and dice roll, but that requires too much storage to be
practical.
A practical backgammon program (to continue the example) assigns weights to different combinations of positions and
calculates the value of each position reachable from the current position and dice roll. The programmer might develop a
weighting system to evaluate the worth of each potential position, but that's a very difficult proposition even in an easy
game like backgammon or chess. In a hundred-unit wargame, with every unit having unique characteristics, and the terrain
and random factors complicating the issue still further, the optimal system of weights cannot be determined by a person,
possibly cannot theoretically be determined, and certainly couldn't be determined in any reasonable amount of time, even
with lots of big computers working on it.
The solution is to settle for "good enough" rather than "best". This could be characterised as a 'Fuzzy Logic' state machine,
as opposed to a Finite State Machine. One of the best ways to get the "good enough" weights and algorithms is to set up
the computer to play both sides in a game, with a lot of variation in the algorithms and weights. Let the two computer
"players" duke it out and see which set wins most often. Modify the programs and try it again until you're happy or the
game needs to ship.
An even better refinement of the above method is to let the computer do its own modifications to find the best winning
technique. A convenient way to look at this is that there is a top-level program which is trying to develop a winning set of
algorithms and values. The top-level program constantly fiddles with the "player" programs to find the best one. If the
human programmer feels that the algorithms are good enough, then only the weights assigned to each position or piece or
whatever need to be optimized. Choosing the weights at random is not the most effective way to find the best set, but it
has the benefit of easy implementation. Development of algorithms and more sophisticated assignment of weights can be
performed with "genetic algorithms".
In any event, the programmer sets up the program which will determine a good set of parameters, then lets it run for a
while. When he comes back, he takes the best algorithm and weights that were generated, sticks that into his program as a
fixed algorithm and lookup table, and goes to market.
This "AI During Development" method has both the worst and the best of both worlds. On the "worst" side of the ledger,
the programmer is going to the effort of learning and using real AI techniques during development, but is distributing a
fixed routine for the real game. On the "best" side, a program written using a procedural language and algorithms will run
much faster and use fewer resources than a program using real AI techniques.
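The develop-time tuning loop described above can be sketched as follows. Everything here is an assumption for illustration: the weight vector, the random perturbation, and especially playMatch, which in a real project would run the actual game engine rather than the toy stub shown.

#include <cmath>
#include <random>
#include <vector>

// Stub standing in for a full game: pretends that weights whose sum is
// nearer some hidden ideal play better. Replace with the real engine.
int playMatch(const std::vector<double>& a, const std::vector<double>& b) {
    auto score = [](const std::vector<double>& w) {
        double s = 0.0;
        for (double v : w) s += v;
        return -std::abs(s - 3.0);
    };
    return score(a) > score(b) ? +1 : -1;
}

// Crude self-play tuner: perturb the champion's weights at random and
// keep the challenger whenever it wins the majority of a short match.
// "Good enough", not "best".
std::vector<double> tuneWeights(std::vector<double> champion, int rounds) {
    std::mt19937 rng(12345);
    std::normal_distribution<double> noise(0.0, 0.1);
    for (int r = 0; r < rounds; ++r) {
        std::vector<double> challenger = champion;
        for (double& w : challenger) w += noise(rng);      // random variation
        int wins = 0;
        for (int g = 0; g < 20; ++g)                       // let them duke it out
            wins += (playMatch(challenger, champion) > 0);
        if (wins > 10) champion = challenger;              // keep the better set
    }
    return champion;    // ship these as the fixed weights/lookup table
}

A genetic algorithm replaces the single champion with a whole population, adding crossover between winners, but the shape of the loop is the same.
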
A#d: Flexible Algorithms
Next we come to adaptive algorithms and data tables for the computer opponent. This class of methods uses algorithms
and weights which are adjusted at runtime to get better results. For instance, in a complex wargame, the computer
opponent can start out using a given method of moving units and attacking under different circumstances. Or, better yet, it
initially uses a variety of techniques. The program monitors the effectiveness of each algorithm and set of weights, and
gradually weeds out the least effective.
This method greatly resembles the one described above, except that the modification of algorithms and weights happens
while the game is running, rather than during development. Re-read the list of the drawbacks of run-time artificial
intelligence toolkits.
The use of genetic algorithms and other methods by which the program modifies itself is considered by some to be artificial
intelligence, but it doesn't quite make the grade. For one thing, something this simple would never pass the Turing
test.
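A minimal sketch of the runtime version: each candidate tactic carries a running effectiveness score, and the weakest is weeded out once there is enough data. The structure and thresholds are invented for illustration.

#include <algorithm>
#include <vector>

// Each tactic tracks its average payoff as the game runs.
struct Tactic {
    int id;
    double score = 0.0;
    int trials = 0;
};

void recordResult(Tactic& t, double payoff) {
    ++t.trials;
    t.score += (payoff - t.score) / t.trials;    // running average payoff
}

// Gradually weed out the least effective tactic, but only after
// everything has been given a fair number of trials.
void weedOut(std::vector<Tactic>& pool) {
    if (pool.size() <= 1) return;
    for (const Tactic& t : pool)
        if (t.trials < 10) return;               // not enough data yet
    pool.erase(std::min_element(pool.begin(), pool.end(),
        [](const Tactic& a, const Tactic& b) { return a.score < b.score; }));
}
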
A#e: Analyzing the Human's Actions
The next step up is analysis of the human player's actions. In this context, "analysis" means just that: identification, study,
and classification of elements of the human's style. We've now reached an area of serious AI research. At the least, a
program at this level will need strong pattern recognition capabilities.
Computer pattern recognition, though much progress has been made recently, doesn't come anywhere near the ability of a
human being or most house pets. Think about a shooting game. If your opponent always dodges left when you shoot at
him, you, the human, will probably catch on pretty quickly and learn to fire a quick second shot to his left. A computer, on
the other hand, might see only that sometimes the human moves 14 units over, and sometimes 15; no pattern there!
Pattern recognition is much more effective as the number of cases grows. In neural-net terms, the training session can
continue forever, even if the net needs to give results before forever is over. And being able to divide the experiences into
different buckets can help even more. For this reason, asking the player to identify himself and permanently storing as much
information as possible will greatly increase the game's apparent intelligence.
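A sketch of the bucketing idea, using the shooting-game example: raw movement is classified into coarse categories so that "moved 14 units" and "moved 15 units" both land in the same bucket, "dodged left". Names and thresholds are invented.

#include <map>

// Bucket raw observations into coarse categories.
enum class Dodge { Left, Right, None };

Dodge classify(double lateralMove) {
    if (lateralMove < -2.0) return Dodge::Left;
    if (lateralMove >  2.0) return Dodge::Right;
    return Dodge::None;
}

// Persisted per player name, so the profile survives between sessions.
struct PlayerProfile {
    std::map<Dodge, int> dodgeCount;

    void observe(double lateralMove) { ++dodgeCount[classify(lateralMove)]; }

    // The AI fires its quick second shot at the favourite escape route.
    Dodge mostLikelyDodge() const {
        Dodge best = Dodge::None;
        int n = -1;
        for (const auto& [d, c] : dodgeCount)
            if (c > n) { n = c; best = d; }
        return best;
    }
};
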
A#f: Sub-Goal Selection
Sub-goal selection has a pretty obvious meaning: the choice of goals to accomplish, each of which leads to the
accomplishment of the overall goal. For instance, in the space shoot-em-up we've used several times in this answer, let's
say the computer checks energy levels and basic abilities and determines that the human has a definite advantage:
a higher energy level, but otherwise equal capabilities.
human's energy without giving up any advantage. To do that, the computer could decide to zip in, attract fire, and dash off
without being hit. The actions the computer would perform in this scenario could have been pre-programmed to crop up in
the proper circumstances, assuming that the programmer was infinitely far-sighted. In a proper AI system, the computer
would somehow recognize a need, sift through a large number of possible goals and actions, and choose the one most
likely to succeed. So far as I know, no game on the market uses anything approaching this technique.
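Though no shipped game may do full goal reasoning, the sifting step itself is easy to sketch: score each candidate sub-goal against the current situation and pick the highest. The goals and utility functions here are invented for illustration.

#include <functional>
#include <vector>

// A candidate sub-goal with a situation-dependent utility estimate.
struct Goal {
    const char* name;
    std::function<double(double myEnergy, double theirEnergy)> utility;
};

// Recognize a need, sift the goals, choose the most promising.
const Goal* pickSubGoal(const std::vector<Goal>& goals,
                        double myEnergy, double theirEnergy) {
    const Goal* best = nullptr;
    double bestUtility = -1e300;
    for (const Goal& g : goals) {
        double u = g.utility(myEnergy, theirEnergy);
        if (u > bestUtility) { bestUtility = u; best = &g; }
    }
    return best;
}

// Usage sketch: "deplete their energy" scores highest exactly when the
// enemy's only edge is a bigger battery, as in the scenario above.
// std::vector<Goal> goals = {
//     {"deplete energy",  [](double mine, double theirs) { return theirs - mine; }},
//     {"attack directly", [](double mine, double theirs) { return mine - theirs; }},
// };
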
A#g: Be creative
Combine, change, alter, come up with new, mutate, evolve, destroy, reconstruct, burn, spray paint algorithms to come up
with a solution.
So now, to return to the original question, how should you add AI to your game? You first need to decide what level of
computer response will suit your needs, abilities, and available time. If your main intent is to write a game which will keep
human players entertained and challenged, go with the lowest workable level, as defined above. Do research,
and lift whatever you can from public-domain sources. The job of adding convincing responses to your game will
probably dwarf any estimates you make, so anything you can do to minimize the work is effort well spent. The
best implemented AIs have been shown to use a combination of many approaches.

2.5B Start Small


Many posters in c.a.g state that they wish to develop a World War II simulation with an intelligent computer opponent, or
some similarly ambitious goal. That's fine for a long-term goal, but the details are likely to swamp all but the true code
gods. I don't mean to sound defeatist here; I just want to get y'all thinking about what you're biting off.
That being said, what is the best way to get started? Why, in small steps, of course. Rather than writing the data structures
and movement rules for an armored division and all subordinate units, and then wondering how to have the lower units
find their way from point A to point B, start with a small, simple map or grid or generalized graph and simple movement
rules. Write code to get a single unit from point A to point B. Then add complications piece by piece and get a good
algorithm for each step. Your final product may well be general enough to handle any conditions your real game will
need, but you may want to hang on to the earlier algorithms as well; they should be faster for their specific domains than a
more general routine.
Another way to start small is to write an opponent for a simple game. Many simple games exist; just see any book of kids'
games from the pre-electronic days. Fox and Geese or an Awari variant should give a good starting point. Learn the
Minimax algorithm for board game variants, and the A* pathfinding algorithm for moving units in a Real Time strategy
game. Learn how to write a finite state machine for use in running bots in a 3D shoot-em-up. It can all get very
complicated very quickly!

3 Solved and Unsolved Problems in AI for Games

Steve Woodcock keeps his finger on the pulse of current AI evolution. You can usually get some good leads on all things
comp.ai.games by visiting his excellent site at:
http://www.gameai.com
He has a page about "solved" *games* - that is, games for which computer analysis has evaluated enough combinations to know
whether a win is possible by moving first (or second, or whatever). Just as you know that O's and X's is always a
draw with optimum play, it has been shown that, for example, Connect 4 is always a win for the first player, again with
optimum play. Chess hasn't been solved...

3.1 What game AI problems have been solved well enough that I can build them into my next game?

3.1A Finding the shortest path from point A to point B.
The A* Algorithm
This is the best known and most widely implemented shortest path algorithm. Starting from point 'A', it spreads a frontier
of potential paths outward, always extending the most promising candidate first. Each candidate is weighted by the cost
already paid to reach it plus an estimate of the distance remaining, so point 'B' is found without exhaustively exploring
every path. There are lots of shortest path algorithms (SPAs) around, and there are some animations of them on the internet.
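Here is a minimal C++ sketch of A* on a 4-connected grid, with a Manhattan-distance heuristic and uniform step costs. All names are my own, and a real game would also record parent links to recover the path itself, not just its cost.

#include <cstdlib>
#include <queue>
#include <vector>

struct Node { int x, y; double f; };
struct Worse { bool operator()(const Node& a, const Node& b) const { return a.f > b.f; } };

// Returns the cost of the shortest path, or -1 if the goal is unreachable.
// f = g (cost paid so far) + h (heuristic estimate of the cost remaining);
// the open set always expands the node with the lowest f first.
double astar(const std::vector<std::vector<int>>& grid,   // 1 = wall, 0 = open
             int sx, int sy, int gx, int gy) {
    const int H = (int)grid.size(), W = (int)grid[0].size();
    auto h = [&](int x, int y) { return std::abs(x - gx) + std::abs(y - gy); };
    std::vector<std::vector<double>> g(H, std::vector<double>(W, 1e18));
    std::priority_queue<Node, std::vector<Node>, Worse> open;
    g[sy][sx] = 0.0;
    open.push({sx, sy, (double)h(sx, sy)});
    const int dx[4] = {1, -1, 0, 0}, dy[4] = {0, 0, 1, -1};
    while (!open.empty()) {
        Node n = open.top(); open.pop();
        if (n.x == gx && n.y == gy) return g[n.y][n.x];    // reached point B
        if (n.f > g[n.y][n.x] + h(n.x, n.y)) continue;     // stale queue entry
        for (int i = 0; i < 4; ++i) {
            int nx = n.x + dx[i], ny = n.y + dy[i];
            if (nx < 0 || ny < 0 || nx >= W || ny >= H || grid[ny][nx]) continue;
            double ng = g[n.y][n.x] + 1.0;                 // uniform step cost
            if (ng < g[ny][nx]) {
                g[ny][nx] = ng;
                open.push({nx, ny, ng + h(nx, ny)});
            }
        }
    }
    return -1.0;
}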


A good page is Smart Unit Navigation:


http://home.sol.no/~johncl/shorpath.htm
Another, academically focussed page is
http://home1.stofanet.dk/breese/aaai99.html
A complete 3D RTS (Real Time Strategy) game which uses A* has been written by Keith Mukai. It used to be
downloadable and described online, but the page has disappeared. Keith, are you there?
Finally, there is much information on A* and variants at Amit's Game Programming site:
http://theory.stanford.edu/~amitp/GameProgramming/
James Matthews has written a demo app with source code which can be found at
http://www.generation5.org/

3.1B Traversing a maze.


Traversing a maze is a variant of the above. For our purposes here, "traversing a maze" involves moving from position to
position toward an unknown but specific goal, with only partial knowledge of the maze. The effort to move from one
position to another is usually constant.
A Monte Carlo algorithm, randomly choosing a direction at each intersection, will in principle eventually solve any maze.
And an infinite number of monkeys typing randomly will eventually recreate the works of Shakespeare.
A brute force algorithm, such as following the left wall until you find the exit, will solve most mazes, but can be defeated
by special maze design.
A more sophisticated algorithm will remember that part of the maze which has already been travelled or seen, and will
avoid aimless retravel of known segments. The algorithm should attempt to fill in blank spaces by directed searching.
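The brute-force wall follower mentioned above fits in a few lines. This C++ sketch uses the left-hand rule on a character-grid maze ('#' = wall); the representation is invented for illustration. It solves simply-connected mazes but, as noted, can be defeated by special maze design.

#include <string>
#include <vector>

struct Walker {
    int x, y;
    int dir;    // 0 = north, 1 = east, 2 = south, 3 = west
};

bool isOpen(const std::vector<std::string>& maze, int x, int y) {
    return y >= 0 && y < (int)maze.size() &&
           x >= 0 && x < (int)maze[y].size() && maze[y][x] != '#';
}

// One step of the left-hand rule: try left, then straight, then right,
// then back, relative to the current heading.
void step(const std::vector<std::string>& maze, Walker& w) {
    static const int dx[4] = {0, 1, 0, -1}, dy[4] = {-1, 0, 1, 0};
    for (int turn : {3, 0, 1, 2}) {            // +3 mod 4 is a left turn
        int d = (w.dir + turn) % 4;
        if (isOpen(maze, w.x + dx[d], w.y + dy[d])) {
            w.dir = d;
            w.x += dx[d];
            w.y += dy[d];
            return;
        }
    }
}

The "more sophisticated algorithm" of the last paragraph would add a visited-map alongside the walker and prefer unexplored cells over retravelled ones.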

3.1C Exhaustive search of sequences of a limited set of moves (game trees)


Known as Minimax [eg, chess].
[Note that the practicality of building this into your game depends on the power of your processor, amount of memory,
time constraints, number of possible moves at each step, and number of moves you wish to search.]
Such games are often represented as minimax trees---we assume the first player is trying to maximize some payoff for
himself, and the second player is trying to minimize the payoff to the first player. At the end of each sequence of tried
moves, we assign a number (the payoff) to the reached position using an "evaluation function".
>>>> [example here to follow]
Alpha-beta pruning is a method that (usually) allows minimax trees twice as deep (i.e., move sequences twice as long) to
be searched in the same time. Alpha-beta is described in almost any introductory AI text book. It is an optimisation of the
tree search itself rather than of the evaluation function: branches which provably cannot affect the final minimax value are
discarded as soon as that is known, rather than evaluating every node and leaf before comparing them minimax style.
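Standing in for the example promised above, here is a minimal generic C++ sketch of minimax with alpha-beta pruning. The GameState interface (moves(), play(), evaluate()) is an assumption for illustration, not a real library; evaluate() is the evaluation function, returning larger payoffs for positions better for the maximizing player.

#include <algorithm>
#include <vector>

template <typename GameState>
int minimax(const GameState& s, int depth, int alpha, int beta, bool maximizing) {
    std::vector<typename GameState::Move> moves = s.moves();
    if (depth == 0 || moves.empty())
        return s.evaluate();                      // payoff at the leaf
    if (maximizing) {
        int best = -1000000;
        for (const auto& m : moves) {
            best = std::max(best, minimax(s.play(m), depth - 1, alpha, beta, false));
            alpha = std::max(alpha, best);
            if (alpha >= beta) break;    // prune: the minimizer will never allow this line
        }
        return best;
    } else {
        int best = 1000000;
        for (const auto& m : moves) {
            best = std::min(best, minimax(s.play(m), depth - 1, alpha, beta, true));
            beta = std::min(beta, best);
            if (alpha >= beta) break;    // prune: the maximizer would never choose this line
        }
        return best;
    }
}

The two break statements are the whole of alpha-beta: once a branch is provably no better than an alternative already in hand, its remaining moves go unexamined.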

3.1D Database knowledge of information to apply


Larger game AI developments, in particular chess and backgammon, have libraries of 'openings' for which a huge amount of
historical analysis has already been done and stored, to help find the best next move. A good online example of how a database
can mimic human intelligence is the "Twenty Questions" game on the Web at http://come.to/20Q - and it's great fun!


3.2 What game-applicable AI problems have not been solved yet?

The current best backgammon player, TD-Gammon, by Gerald Tesauro, was trained using temporal-difference learning on
a neural net (ref: March 1995 CACM).

3.3 What games have been "solved"?


First off, the following games have been solved by analysis and exhaustive searches, not by "artificial intelligence".
However, they are included here so you won't put a lot of time into writing an AI program to beat someone at these games.
(This isn't to discourage you from writing the games, just to advise you that AI, per se, won't be needed.)
The meaning of "solved" is subject to some debate. In some games (e.g., Hex) we know who will win if the game is
played perfectly (the "game theoretic value"), but not how to play perfectly. The following are games for which computer
opponents are available that do achieve the game theoretic value.
[ If anyone has code snippets or entire programs for these, or pointers to such, please drop me a line. ]
● Connect 4. First player wins.

● Gomoku (5 in a row) on a 15x15 board. First player wins. The game is solved both with and without overlines (six
or more in a row) being counted as a win.
● 3-D Tic-Tac-Toe (3x3x3). First player wins.

● Qubic (4x4x4 three-dimensional tic-tac-toe). First player wins.

● Nine Men's Morris (also called Mill). The game is a draw.

● Awari. The general name of "Awari" covers a huge number of variants. Some of these have been solved analytically
and some have not.
● Sprouts, for some numbers of points, anyway. Sorry, I don't have the figures on hand, but they were given in the
original Scientific American article back in the '70s. I haven't heard of any further investigation.
---------------------------------------------------------------------------

4 What are some promising areas of research in AI, as applied to games?

Case-Based Reasoning (CBR), sometimes (not necessarily correctly) known as knowledge-based systems or expert
systems.
Neural Nets.
Steve Woodcock's roundtable discussions at game developer gatherings have sometimes touched on this. Many programmers
cite neural nets as the next great thing to implement, but few have actually succeeded. The problem is one of complexity and
practicality.
Fuzzy Logic.
Monitoring and adapting to the opponent's style of play. A set of rules which can be adapted, graded, modified and
measured is an excellent vehicle for computer learning.
---------------------------------------------------------------------------


5 Fact Sheets and descriptions of Gaming AI projects

Projects come and go, but some of the latest ones can be found at
http://dir.yahoo.com/Recreation/Games/Computer_Games/Programming/Projects/
A subject much in discussion during the latter half of 1999 was the 'Rock, Scissors, Paper' (or 'RoShamBo') competition,
along with poker AI across the rec.games.poker newsgroup. The competition has now completed, but the discussion goes on.
Find out more about it at http://www.cs.ualberta.ca/~darse/rsbpc.html and play a bot at
http://www.sportsrocket.com/cgi-bin/roshambo/roshambot.cgi
---------------------------------------------------------------------------

6 Summaries of useful c.a.g threads


This note is here because we can refer to Steve's pages again. He has these threads archived, not summarized:
http://www.gameai.com
---------------------------------------------------------------------------

7 Game AI in the larger world


7.1 Where can I find information on XXX game programming?
This section lists game-specific links which serve as good starting points for those interested in resources and approaches
for programming game AI in each specific game. Please email the maintainer with new links for other games, or better
links if you can find them.

7.1.1 Chess
A good, though not recently updated, guide is at http://www.xs4all.nl/~verhelst/chess/programming.html

7.1.2 GO
A good opening for AI discussion of GO is at: http://www.usgo.org/computer/

7.1.3 Backgammon
A fine starting point for Backgammon, including a discussion of the various AI programs available and their capabilities:
http://www.statslab.cam.ac.uk/~sret1/backgammon/

7.1.4 Othello / Reversi


There doesn't appear to be a specific programming page around, but a good starting point would be http://www.iioa.org


7.1.5 Checkers / Draughts


Again, no specific programming AI page, but this is a good place to start: http://www.mcn.net/~jimloy/checkers.html

7.1.6 3D shootemups AI
Bots, and the programming of both server and client, can be found at http://www.planetquake.com/botshop, with many more at
http://www.botepidemic.com/

7.1.7 Other game AI links


The list below will give you many hours of AI surfing:
http://www.gameai.com/ai.html
http://www.gamedev.net
http://www.gamasutra.com
http://www.programmersheaven.com
Grant Castillou has put together some downloads which cover Gomoku, Checkers and Othello. You can get them at:
http://skyscraper.fortunecity.com/apple/230/
This section (7.1) needs additional resources for AI in card games, board games, driving games, fighting/arcade and AI for
other computer hosted / simulated games. Let me know if you have any.

7.2 Who is sponsoring research into AI in games?


Lots of people. Professional game developers range from those who just want to get the game out the door to those truly
dedicated to the AI, whether or not their code ends up included! University research projects are usually focused on the
techniques used rather than the domain.

7.3 How can I get a job or an assistantship writing games?


Learn C or C++. Buy a book on games. Write a simple game. Write a more complicated game that's enjoyable to play.
Send the demo to a company which wants talented, useful game programmers.
You could offer to work for free, of course, if the bank balance allows, and you're confident enough.
Further details of the game industry from the developer's viewpoint can be found at http://www.godgames.com - especially
check out 'The Oracle', where pretty much every question a game developer might ask is asked (and answered).

7.4 What commercial games use AI?


Pretty much all of them. You will see people complaining in the related games newsgroups about poor AI if it hasn't been
implemented coherently.

7.5 What commercial games advertise AI but don't really use it?

This is getting into the definition of AI.


7.6 Are there any AI competitions?


The Loebner Prize, based on a fund of over $100,000 established by New York businessman Hugh G. Loebner, is awarded
annually for the computer program that best emulates natural human behavior. During the contest, a panel of independent
judges attempts to determine whether the responses on a computer terminal are being produced by a computer or a person,
along the lines of the Turing Test. The designers of the best program each year win a cash award and a medal. If a
program passes the test in all its particulars, then the entire fund will be paid to the program's designer and the fund
abolished. For further information about the Loebner Prize, write Dr. Robert Epstein, Executive Director, Cambridge
Center for Behavioral Studies, 11 Waterhouse Street, Cambridge, MA 02138, or call 617-491-9020.
The International Computer Chess Association presents an annual prize for the best computer-generated annotation of a
chess game. The output should be reminiscent of that appearing in newspaper chess columns, and will be judged on both
the correctness and depth of the variations and also on the quality of the program's written output. The deadline is
December 31, 1994. For more information, write to Tony Marsland tony@cs.ualberta.ca, ICCA President, Computing
Science Department, University of Alberta, Edmonton, Canada T6G 2H1, call 403-492-3971, or fax 403-492-1071.
---------------------------------------------------------------------------

8 Further information
8.1 What are some related news groups?
The entire comp.ai.* hierarchy. Particular topics people seem interested in include comp.ai.fuzzy, comp.ai.genetic, and
comp.ai.neural-net. The rec.games.* and alt.games.* hierarchies.
The comp.ai FAQ in particular has an enormous fund of information. I plan to incorporate as much of it as I can without
being accused of plagiarism, but for now, just look it over. This FAQ has a long list of reference books and articles
covering topics of interest to AI gamers. http://www.cs.uu.nl/wais/html/na-dir/ai-faq/general/.html

8.2 Where can I go to FTP text and binaries?


Some info can be found in the x2ftp.oulu.fi:/pub/msdos/programming/ directory. Please note that the "msdos" part of the path
does NOT mean everything is DOS-related.

x2ftp.oulu.fi:/pub/books/game/
3d-books.320 3D graphics books reviewed by Brian Hook - 3.20
aaa_set.toc Action Arcade Adventure Set - Diana Gruber 1994 (FastGraph)
cgames_1.txt Computer Games I - Levy 1988
cgames_2.txt Computer Games II - Levy 1988
explorer.toc PC Game Programming Explorer - Dave Roberts 1994
playgod.zip Playing God, Creating Virtual Worlds - Roehl 1994 (TOC/errata)
tricks.rev Tricks of the Game Programming Gurus - LaMothe/Ratcliff 1994
tricks.toc Tricks of the Game Programming Gurus - LaMothe/Ratcliff 1994
x2ftp.oulu.fi:/pub/msdos/programming/ai/
x2ftp.oulu.fi:/pub/msdos/programming/theory/
Especially the latter directory has some excellent documents.
BOLO: the 'official' archive seems to be down. I suggest starting at the home page, though development seems to have
died: http://www.lgm.com/bolo/


8.3 What books and magazines are available on AI and gaming topics?

_Neural Networks and Fuzzy Logic Applications in C/C++_, Stephen T. Welstead, Wiley Professional Computing, 1994.
Discusses the topics and gives library and application source code for Borland C++.
Another resource is Sunir Shah's "Programmers' Booklist," a list of, you got it, books (plus magazines, e-mags, web sites,
and files) of interest to the programmer. The list is indexed hypertext; the web site is at
http://www.intranet.ca/~sshah/booklist.html, and there is a "binary searchable" version at
http://www.intranet.ca/~sshah/booklist. It is a very comprehensive and inclusive list.
---------------------------------------------------------------------------

10 Glossary
Case-based Reasoning:
Technique whereby "cases" similar to the current problem are retrieved and their "solutions" modified to work on the
current problem.
Fuzzy Logic:
In Fuzzy Logic, truth values are real values in the closed interval [0..1]. The definitions of the boolean operators are
extended to fit this continuous domain. By avoiding discrete truth-values, Fuzzy Logic avoids some of the problems
inherent in either-or judgments and yields natural interpretations of utterances like "very hot". Fuzzy Logic has
applications in control theory.
SPA:
Shortest Path Algorithm.
---------------------------------------------------------------------------

11 Acknowledgements
[Dave Burbage's]
My thanks to:
Sunir Shah - for handing me the FAQ.
Bjorn Reese - for not bullying me too much in getting this version out at all!
[Sunir Shah's]
My thanks to:
David Burbage for taking over the FAQ.
Steve Furlong
Doug Walker for conferring the copyright on me.
Robert Uhl
Lukas Bradley for HTMLizing the FAQ.
Steven Woodcock for putting up http://www.gameai.com
Everyone else for putting up with me.
[Steve Furlong's]
My thanks to:

Dan Thies for the charter, and for creating this group


Doug Walker, the original FAQ maintainer


Mark Kantrowitz, maintainer of the comp.ai FAQ
Jouni Miettunen
Paul Colley
Frits Daalmans
Robert Uhl
Mitchel Timin
Jessie Montrose
Will Uther
Timothy Huang
and everyone else that submitted questions or answers or pointers to answers
[document ends]



Introduction to Learning in Games


By Geoff Howland

To create a fun and successful game you need to be able to challenge your players. They need to feel that they are
overcoming something by beating your game. One way to achieve this is to have your game learn from their actions. Have
it analyze what they are doing and try to come up with counter-attacks, providing a challenge and creating the illusion of
more intelligent opponents.

Pattern Matching
There are a lot of legitimate artificial intelligence algorithms for finding patterns in data and for choosing reactions to
them based on different success criteria. However, for the purposes of game design many of these are currently
overkill and, more importantly, not tuned to the scope of the problem.

You are not interested in creating a perfect reactionary machine in a game enemy; you are interested in providing a challenge
for the player. Any game already gives you a big advantage as the designer, since it is your creation and you know the limits
of the game. You can therefore build your own pre-made patterns and test for them by checking the player's input or different
aspects of how they are playing.

For instance, in a fighting game such as Street Fighter 2, the player has six buttons to choose from. By capturing
when the player hits these buttons, along with the distance to the enemy and whether the enemy is in the air, you can find
certain patterns of play. The player may often try to punch and then move in for a throw when close enough. The player may
always try an uppercut when their enemy has jumped. By recording different inputs and game information at the time
of input, you can build a map of likely actions to use in the game's AI. In doing so you can "learn" the
player's moves and then try to counter them.
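As a sketch of the capture step, the C++ structure below counts which button the player presses in each coarse situation (a range bucket plus whether the enemy is airborne) and looks up their favourite move so the AI can pre-arm a counter. All names are illustrative, not from any actual game.

#include <map>
#include <utility>

enum class Button { LP, MP, HP, LK, MK, HK };    // the six attack buttons
enum class Range  { Close, Mid, Far };

struct MoveMap {
    // (situation) -> (button -> how often the player chose it there)
    std::map<std::pair<Range, bool /*enemyAirborne*/>,
             std::map<Button, int>> counts;

    void record(Range r, bool airborne, Button b) { ++counts[{r, airborne}][b]; }

    // The player's most common choice in this situation (LP if unseen).
    Button predict(Range r, bool airborne) const {
        auto it = counts.find({r, airborne});
        Button best = Button::LP;
        int n = -1;
        if (it != counts.end())
            for (const auto& [b, c] : it->second)
                if (c > n) { n = c; best = b; }
        return best;
    }
};

If predict(Range::Close, false) keeps returning the punch-then-throw setup, the AI can start countering the throw the moment the player closes in.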

Real-time strategy (RTS) games have a much more complex system of attack, because raw mouse clicks tell you little by
themselves. To learn what your player is attempting in an RTS game, you will have to abstract the data of the
player's actions to find a common pattern. This is a totally game-dependent process, but as an example let's use Command
& Conquer (C&C).

In C&C the objective of an average mission is to build troops and a base to defend yourself, then destroy the enemy and
their base. There are two necessary points of learning: how the player interacts with enemy units and how the player builds
his base. To keep this example in focus we will only explore the first learning objective, although the second would be
crucial for counter-attacks.

Contact between C&C's units is very limited, when they are close enough together then they will begin to fight each other.
The first type data you will need to search on is the player's preferred unit types. The player may prefer doing tank rushes;
in this case, you will need to build defenses that specialize in defeating tanks. If they prefer making mini-gunner units then
you will adjust your defenses against that type of attack.

The player could have a preference for attacking the harvesters versus attacking the base directly. This can be recorded and
used so that you can send out troops to guard the harvesters or build more protection around the main base. To create a
good learning system you need to find the most common methods of attack and then figure out how you can determine
whether they are occurring.

Storing and Searching


A learning system, as with any kind of database, is only as good as its ability to find useful information. The actual
content of any learning system is game dependent, but there are some basics to small databases that are universal.

When you search for a pattern, you want to search on one or more criteria. To do this, you should save your
data in a way that lets you access it quickly. This means that, at the time you save your data, you need to plan for
the way it will be accessed. If you wish to save each occurrence of data as a separate element, you will need to save
them in an order that is quickly searchable. To use the C&C example, you can save them starting with unit type. By
creating a table for each unit type you avoid having to search through all the records to collect those containing the
appropriate unit types.

Another method is to store all the data in one table of ratios: the ratio of attacks on harvesters versus attacks on the base,
for instance, or the ratio of using one type of unit over another. This makes a good in-game search method, as there are no
records to retrieve and analyze. Single-entry records could still be saved and analyzed outside the game. You could also
weight the latest actions as more important than previous actions, as an attempt to cover for any change
in tactics.
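A sketch of the single-table-of-ratios idea, with the recency weighting folded in: instead of raw totals, the counts decay each time a new attack is recorded, so the latest actions dominate. The 0.9 decay factor and the structure are arbitrary choices for illustration.

struct AttackStats {
    double harvesterHits = 0.0;    // decayed counts, not raw totals
    double baseHits = 0.0;

    void recordAttack(bool onHarvester) {
        harvesterHits *= 0.9;                    // older evidence fades
        baseHits      *= 0.9;
        (onHarvester ? harvesterHits : baseHits) += 1.0;
    }

    // Fraction of recent attacks aimed at harvesters; 0.5 when no data.
    double harvesterPreference() const {
        double total = harvesterHits + baseHits;
        return total > 0.0 ? harvesterHits / total : 0.5;
    }
};

If harvesterPreference() drifts above, say, 0.7, send escorts out with the harvesters; if it falls below 0.3, reinforce the base perimeter instead.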

Overview
Creating learning methods, like any other type of game AI, is going to take a lot of sitting down and thinking through
situations. It will also take a lot of play testing. Players will come up with methods of play that you can't be expected to
think of beforehand, so you need to build your learning database to be flexible enough to add in more situations.

Learning the player's styles and preferences is not a key to creating an unbeatable opponent or the ultimate AI; it is a
method for creating a challenge for your players. By never letting them develop a tactic that will constantly work against
the computer, you will ultimately extend the life of your game and keep it fresh and challenging.




The History of AI
Mankind has long been curious about how the mind works and fascinated by intelligent machines. From Talos, the bronze giant of
Greek myth, and Pinocchio, the fairy-tale wooden puppet acting like a real boy, to the early debates on the nature of mind and
thought by European philosophers and mathematicians, we can see people's desire to understand and even to create intelligence.

The Birth of AI (1945-56)


However, it wasn't until the postwar period (1945-1956) that Artificial Intelligence emerged as a widely-discussed field. What
propelled the birth of Artificial Intelligence were the arrival of modern computer technology and the rise of a critical mass
of researchers. Pioneers such as Marvin Minsky, John McCarthy, Allen Newell, and Herbert Simon led their students in defining
the new and promising field. The development of modern computer technology affected AI research tremendously. Many pioneers
of AI broke away from the traditional approach of artificial neurons and decided that human thought could be more efficiently
emulated with modern digital computers. Those who did not accept digital computers as the new approach stayed in the parallel
field of neural networks.

The Dawning Age of AI (1956-63)


The Dartmouth Conference of 1956 brought AI into a new era. 1956-1963 represents the dawning of an intensive AI wave. During
this period, major AI research centers such as Carnegie Mellon, MIT and its Lincoln Laboratory, Stanford, and IBM concentrated
their work on two main themes. First, the attempt to limit the breadth of searches in trial-and-error problems led to
projects such as Logic Theorist (considered the first AI program), Geometry Theorem Prover, and SAINT. Second, the study of
computer learning included projects on chess, checkers, and pattern recognition. Specialized list-processing AI languages
such as LISP were also developed at MIT and other places in 1958.

The Maturation of AI (1963-70)


By the mid-60's, AI had become the common goal of thousands of different studies. AI researchers applied their programming
techniques and the improved computers of the day to various projects. However, computer memories during these years were
still very limited, and perception and knowledge representation in computers became the themes of much AI research. One
representative project was the Blocks Micro World project carried out at MIT. Facing a collection of pure geometric shapes,
the robots looked through cameras and interpreted what they had seen; they would then move about, manipulate blocks, and
express their perceptions, activities, and motivations. As AI boomed, the rival science of artificial neural networks faced
its downfall, especially after Minsky and Papert exposed basic flaws of the approach in "Perceptrons".

The Specialization of Various AI Studies (1970's)


Different AI-related studies developed into recognizable specialties during the 70's. Edward Feigenbaum pioneered research
on expert systems; Roger Schank promoted language analysis with a new way of interpreting the meaning of words; Marvin Minsky
propelled the field of knowledge representation a step further with his new structures for representing mental constructs;
Douglas Lenat explored automatic learning and the nature of heuristics; David Marr improved computer vision; and the authors
of the PROLOG language presented a convenient higher-level language for AI research. The specialization of AI in the 70's
greatly strengthened the backbone of AI theories. However, AI applications were still few and premature.

The Unfulfilled Expectations (1980's)


The 1980's was a rollercoaster period for AI. The public's anti-science mood softened greatly following the
appearance of the Star Wars movies and the new popularity of personal computers. XCON, the first expert system employed in
the industrial world, symbolized the budding of real AI applications. Within four years, XCON had grown tenfold, with an
investment of fifty person-years in the program and savings of about forty million dollars in testing and manufacturing costs
for its industrial clients. Following this brilliant success came the AI boom. The number of AI groups increased tremendously,
and in 1985, 150 companies spent about $1 billion altogether on internal AI groups. However, the fundamental AI algorithms
were still unsatisfying. As Marvin Minsky warned the over-confident public: these seemingly intelligent programs simply make
dumb decisions faster. Indeed, the warning foreshadowed the downfall of the AI industry in the late 80's. The replacement of
LISP machines by standard microcomputers running AI software in the popular C language in 1987, together with the instability
of expert systems, caused a painful transition in the expert system industry; the computer vision industry also suffered a
big setback when Machine Vision International crashed in 1988; one other major loss was the failure of the Autonomous Land
Vehicle project (AI drivers + Robotics). The AI industry started recovering at the end of the 80's but, learning from past
experience, the public has assumed a much more conservative view of AI ever since. Another notable event was the revisiting
of neural networks through the work done by the Parallel Distributed Processing study group. In 1989, about three hundred
companies were founded to compete for the predicted $1 billion market for neural nets by the end of the century.

AI Being Incorporated in War (early 1990's)


The Persian Gulf War in the early 90's proved the importance of AI research for military use. Tasks as simple as packing a
transport plane and as complicated as the timing and coordination of Operation Desert Storm were assisted by AI-oriented
expert systems. Advanced weapons such as cruise missiles were equipped with technologies previously studied in AI-related
fields such as Robotics and Machine Vision. Two projects succeeding the Autonomous Land Vehicle project were the Pilot's
Associate project (an electronic copilot) and the Battle Management System project (military expert systems).

New AI Applications (late 1990's)


The victory of Deep Blue over chess champion Garry Kasparov (a first game won in 1996, and the full match in 1997) led to a
new summit of AI gaming. A new branch of expert systems is expected to prosper as genetic engineering matures: manipulating
a knowledge base as gigantic as the map of human DNA (bioinformatics) will require very specialized algorithms and AI research.

An Overview
AI is a result of the merging of philosophy, mathematics, psychology, neurology, linguistics, computer science, and many
other fields. Furthermore, the applications of AI relate to almost any field. This variety gives AI an endless potential. A
relatively young science, AI has made much progress in 50 years. Though fast-growing, AI has never actually caught up with
all the expectations imposed on it. There are two reasons for the public's over-confidence in AI. First, AI theories are
often ingenious and subtle, even fictional-sounding, suggesting futuristic applications. Second, AI, being tied to computer
technology, is often expected to progress as fast as computer technology does. In conclusion, AI is a young, energetic, and
attractive science.

● Applications in the Military


● Artificial Intelligence: An Introduction




An Introduction to Natural Language Processing


A natural language is a language used in human culture, such as Chinese, English or Bulgarian; natural languages can be
either spoken or written. So, what is Natural Language Processing?

What is Natural Language Processing?


NLP (Natural Language Processing) is the field of AI involving anything that processes natural language. Extensive research
in NLP over the past decade has brought us one of the most useful applications of AI: machine translation. If we could one
day create a program that could translate (for example) English text to Japanese and vice versa without the need for polishing
by a professional translator, then bridges of communication could be significantly widened. Our current translation programs
have not yet reached this level, but they may do so very soon. NLP research also deals with speech recognition: programs that
convert spoken speech into text are now widely used and fairly dependable. Other fields of study in NLP are story
understanders, as well as chatterbots. During the 1960s Joseph Weizenbaum created ELIZA. ELIZA created a storm of public
interest in AI, as it seemingly helped thousands overcome their personal problems. ELIZA played a psychiatrist, one that
posed an analytical question for every answer the user gave it. Though its questions may sometimes have seemed ambiguous,
people actually felt ELIZA could take care of their needs just as well as any other therapist. They became emotionally
involved with ELIZA; even Weizenbaum's secretary demanded to be left alone with the program. The following is a transcript
of ELIZA chatting with another program, PARRY:

Parry: I don't understand your motives.


Eliza: Why don't you understand my motives?
Parry: Shouldn't I be?
Eliza: What did you say just now?
Parry: That's just the way it is.
Eliza: Suppose that's just the way it were not.
Parry: So?
Eliza: I am not sure I understand you fully.
Parry: You harp on the same questions too much.
Eliza: What does this similarity suggest to you?
Parry: I don't get you.
Eliza: Do you wish that I you did get I?
Parry: It's about time you showed some interest in my feelings.
Eliza: Oh? I showed some interest in your feelings?
When people started calling ELIZA intelligent, Joseph Weizenbaum went into an uproar. Technically, ELIZA was
unable to understand people's personal problems to the depth that any human being can. ELIZA could only manipulate syntax
(grammar) and check for some key words. Certainly, if someone had no knowledge of ELIZA being a program, one could easily
conclude that it behaved like a human conversing, although it never really understood anything to the depth that humans
do.
Coincidentally, ELIZA creates questions to help with people's personal problems, while IQATS (Intelligent Question and Answer
Test Summarizer), a program written by Sam Hsiung (Generation5 staff member), creates questions for test-making purposes.
Unlike ELIZA, IQATS is able to learn how to ask new questions if it is given a sample question and answer. Yet, like ELIZA,
it knows and will learn only how to manipulate syntax. It will be able to ask a question about what the capital of Saudi
Arabia is; however, if it were given something a bit more complex, such as Martin Luther King's 'I have a dream...' speech,
it would not be able to come up with questions that force people to draw inferences (e.g., In what context was this speech
given?); neither does it really understand what it is asking.
Many researchers realized this limitation, and as a result conceptual dependency (CD) theory was created. CD systems such as
SAM (Script Applier Mechanism) are story understanders. When SAM is given a story, and later asked questions about it, it
will answer many of those questions accurately (thus showing that it "understands"). It can even infer. It accomplishes this
through the use of scripts. Scripts designate a sequence of actions that are performed in chronological fashion for a certain
situation: a restaurant script would say that you need to sit down at a table before you are served dinner.
The following is a small example of SAM (Script Applier Mechanism) paraphrasing a story (notice the inferences):
Input: John went to a restaurant. He sat down. He got mad. He left.
Paraphrase: JOHN WAS HUNGRY. HE DECIDED TO GO TO A RESTAURANT. HE WENT TO ONE. HE SAT DOWN IN A
CHAIR. A WAITER DID NOT GO TO THE TABLE. JOHN BECAME UPSET. HE DECIDED HE WAS GOING TO LEAVE
THE RESTAURANT. HE LEFT IT.
Scripts allow CD systems to draw links and inferences between things. They are also able to classify and distinguish
primitive actions. Kicking someone, for example, could be a physical action that institutes 'hurt', while loving could be an
emotional expression that implies 'affection'.

The Legendary Turing Test and Its Weaknesses


Let's move on to a controversial subject involving understanding in natural language systems. How can we be sure whether or
not a machine actually understands something? In 1950, Dr. Alan Turing, a British mathematician now considered the father of
AI, proposed the Turing Test for intelligence. Rather simply, the Turing Test boils down to the question: "Can this machine
trick a human into thinking that it is human?" Specifically, the machine is a natural language system that converses with
human subjects. In the Turing Test, a human judge is placed in one room, and the machine (or another human) is placed in
another. The judge may ask questions, or answer questions posed by the computer or the other human. All communication is
done through a terminal; input is done by typing. The judge is not told before the conversation begins whether the subject
he or she is talking to is a human or a computer. Supposing the judge is conversing with a computer, during and after the
conversation he or she must be "fooled" into thinking that the machine is a human in order for the machine to pass the
Turing Test. There are actually many pitfalls to the Turing Test, and it is, in fact, not very widely accepted as a test
for true intelligence.
Today, the Loebner Prize is a modern version of the Turing Test. The criticisms surrounding the Loebner Prize deal with how
the Turing Test is carried out. The goal of a contestant is to fool or trick the judge into thinking that his program is a
human, a prospect which does not encourage the advancement of AI. For example, messages are transmitted via text, and as
the subject (human or computer) types, the judge sees the text that is being typed, live. Thus, many contestants have been
forced to emulate the typing behaviour of humans: output text comes out at varied speeds, words are sometimes misspelled
and corrected, incorrect punctuation is often used, and so on. Even then, the programs in the contest usually talk about
only one subject (to talk about everything present in our culture is simply impossible - at least for a natural language
system that understands only words, syntax and semantics, and not really what things look like or what some objects really
do, which will be discussed later in other essays). If the judge picks another subject to discuss, the programs usually
try to divert the judge's attention. Programs have even tried to use vulgarity or an element of surprise to get the judge
excited (truly, no computer could be vulgar or unpredictable, could it?). For example, you may want to see a transcript of
Jason Hutchens' program which competed for the Loebner Prize. You can also read an interview with Robby Glen Garner, winner
of the 98/99 Loebner Prize.

In summary, the outcome of the test is too dependent on human involvement, and so also is the question of whether a certain
system is really intelligent or not. Such a question is actually quite trivial and shallow. As Tantomito puts it, "We should
be asking about the kinds, quality and quantity of knowledge in a system, the kinds of inference that it can make with this
knowledge, how well-directed its search procedure is, and what means of automatic knowledge acquisition are provided. There
are many dimensions of intelligence, and these interact with one another."

● NLP Essays - Many essays on the theory and applications of NLP.


● NLP Programs - Full source code included.

● AISolutions




An Introduction to Natural Language Theory


This tutorial provides a brief introduction to the content and methods of natural language processing (NLP). More detailed
expositions on NLP will follow soon. A natural language processing program may consist of the following subprocesses:

Syntactic Understanding
● Acquiring knowledge about the grammar and structure of words and sentences.
● Effective representation and implementation of this allows effective manipulation of language in respect to grammar.
● This is usually implemented through a parser.

The Parser
A parser assigns phrase markers (grammatical categories) to words: verbs, adverbs, nouns, etc. It breaks down sentences into
grammatical objects. For example, the IQATS parser is context-free; that is, recursion is allowed in the parsing of words, so
multiple levels of embedding are allowed for grouping words with their respective phrase markers. This makes much more
intricate parsing possible.
A parsed sentence can easily be represented in tree form, or equivalently as nested lists:
(subject (Jack) (predicate (verb (ate)) (direct-object (a frog))))
(s (np (jack)) (vp (verb (ate)) (np (a frog))))
These list representations show how the data structures for parsed sentences might look in a list-
processing language like LISP.
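A sketch of how such a parse might be held in a language other than LISP; the node type and printer below are my own invention, mirroring the nested-list notation above.

#include <iostream>
#include <memory>
#include <string>
#include <vector>

// A phrase marker ("s", "np", "vp", ...) or, when children is empty,
// a literal word at a leaf of the parse tree.
struct ParseNode {
    std::string marker;
    std::vector<std::unique_ptr<ParseNode>> children;
};

// Prints the tree back out in the list form shown above.
void print(const ParseNode& n) {
    if (n.children.empty()) { std::cout << n.marker; return; }
    std::cout << '(' << n.marker;
    for (const auto& c : n.children) { std::cout << ' '; print(*c); }
    std::cout << ')';
}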

Semantic Understanding
● This includes the literal meaning of words in language.
● The inferences they can make.
● The conclusions we can draw from them.
● Generally the most difficult process to develop in NLP.

Theory for Acquiring Semantic Knowledge


Semantic Memory

● Associates and defines objects with other interconnected objects in a tree structure.
                mammal
               /      \
            bird      sealife
           /    \          \
       parrot  sparrow    whale
● For example, from this simple tree structure we can define a parrot as a bird and a mammal, but not as a sparrow or a fish
or a goldfish. Most natural language researchers nowadays do not consider the semantic memory model an adequate way to
emulate human understanding; however, it is certainly very convenient in terms of development.
● The IQATS implementation of representing semantics is closest to this.
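A minimal sketch of semantic memory as an is-a hierarchy: each concept stores its parent, and membership questions are answered by walking up the tree. The class and method names are invented for illustration.

#include <map>
#include <string>

struct SemanticMemory {
    std::map<std::string, std::string> parent;    // child -> parent concept

    void define(const std::string& child, const std::string& par) {
        parent[child] = par;
    }

    // "Is a parrot a mammal?": walk parrot -> bird -> mammal -> yes.
    bool isA(std::string item, const std::string& category) const {
        while (true) {
            if (item == category) return true;
            auto it = parent.find(item);
            if (it == parent.end()) return false;    // reached a root
            item = it->second;
        }
    }
};

// Loaded with the tree above:
//   m.define("bird", "mammal"); m.define("sealife", "mammal");
//   m.define("parrot", "bird"); m.define("sparrow", "bird");
//   m.define("whale", "sealife");
// isA("parrot", "mammal") is true; isA("goldfish", "bird") is false.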

Conceptual Dependency (CD)


● CD relies on a more intricate network of primitives: representations for actions, states, etc. CD is dependent on scripts
for its 'real-world' knowledge base.
● CD has been implemented in script-appliers (which can be thought of as story understanders) such as SAM (Script Applier
Mechanism).

● Programs implementing CD can infer, or "read between the lines" of, sentences (see the example below) by drawing
real-world information from scripts. The primitive actions defining the actions of the real world are chained together in
scripts. A restaurant script, for example, may describe that a person should pay a tip before leaving, or that if he is not
satisfied with the service, he should not be expected to pay a tip. A script on reading books may stipulate that one must
lay one's eyes on a book before actually drawing information from its words. Scripts play an essential role in enabling
programs using CD to draw inferences.
Here is a small example of SAM (Script Applier Mechanism) paraphrasing a story (notice the inferences):
Input: John went to a restaurant. He sat down. He got mad. He left.
Paraphrase: JOHN WAS HUNGRY. HE DECIDED TO GO TO A RESTAURANT. HE WENT TO ONE. HE SAT DOWN IN A
CHAIR. A WAITER DID NOT GO TO THE TABLE. JOHN BECAME UPSET. HE DECIDED HE WAS GOING TO LEAVE
THE RESTAURANT. HE LEFT IT.

● NLP Essays - Many essays on the theory and applications of NLP.


● NLP Programs - Full source code included.
● AISolutions




Conceptual Representation and Scripting


Recall a famous story: perhaps the birth of Jesus, "The Boy Who Cried Wolf", "Little Red Riding Hood", or any story that you
know. Now, tell that story to a friend. After that, tell it to another friend. Did you tell the story in exactly the same way
each time? It is highly likely you didn't. Why not? The answer lies in the way you remember the story: you do not remember
the story word for word, you store the ideas and the concepts of the story in your head. In Artificial Intelligence, this is
called conceptual representation (CR).

Applications of CR
What does this have to do with Artificial Intelligence? Well, imagine the potential of a program that could parse information
and store it as a string of concepts. Here are just a few applications that CR can be applied to:
● Translation: Translation programs are notorious for their incredibly sketchy translations, because they often take the
most literal, or most common, meaning of a word when translating. For example, a program might take the English phrase, "Mum,
please don't hassle me, I've gotta fly to school now, I woke up late." and translate the word fly as to take a plane. If CR
were used, first a parser would parse the sentence, then a conceptual-representor would create the necessary data structures
(see CR Structures), and then a translator would translate those concepts into the target language, without the complications
that arose before.
● Paraphrasing: When you paraphrase something, you take the information you were given and then recreate your own shorter
version. Microsoft Word's AutoSummarize feature uses a mathematical approach to this; here is how it does it:
How does AutoSummarize determine what the key points are? AutoSummarize analyzes the document and
assigns a score to each sentence. (For example, it gives a higher score to sentences that contain words used
frequently in the document.) You then choose a percentage of the highest-scoring sentences to display in the
summary.
(Extract from Microsoft Word Online Help)

Whilst this is very effective when you would like to take a large document and read only the most important parts, such a
mathematical method produces output that does not always make sense as a whole.
If a CR approach is used, the overall summary will not only highlight the most important parts, it will also make
grammatical sense as a piece of text. Such a program would be great for radio stations and other networks that receive
information from press networks: information could be paraphrased by the very computers receiving it, making the job of
making news reports presentable a lot easier.
● Story creation: Apart from the field of computer arts, other applications of story creation could perhaps be in gaming, where
the story is altered and reconstructed dynamically according to how the player changes the game world; imagine the long-term
playability of that!
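Here is a rough, hypothetical C sketch of the frequency-scoring idea from the paraphrasing example above. It is not Microsoft's implementation; the sentence data and every function name are made up for illustration. It counts word frequencies over the whole "document", then scores each sentence by summing the frequencies of its words:

#include <stdio.h>
#include <string.h>
#include <ctype.h>

#define MAXV 128          /* vocabulary capacity (sketch-sized) */

static char vocab[MAXV][32];
static int  freq[MAXV];
static int  nvocab = 0;

/* find a word in the vocabulary, adding it if new; returns its index */
static int lookup(const char *w)
{
    int i;
    for (i = 0; i < nvocab; i++)
        if (strcmp(vocab[i], w) == 0)
            return i;
    strcpy(vocab[nvocab], w);
    freq[nvocab] = 0;
    return nvocab++;
}

/* extract the next lowercase word from *ps into w; returns 0 at end of string */
static int next_word(const char **ps, char *w)
{
    const char *s = *ps;
    int n = 0;
    while (*s && !isalpha((unsigned char)*s)) s++;
    while (*s && isalpha((unsigned char)*s) && n < 31)
        w[n++] = (char)tolower((unsigned char)*s++);
    w[n] = '\0';
    *ps = s;
    return n > 0;
}

int main(void)
{
    const char *sent[3] = {
        "Neural nets map inputs to outputs",
        "The weather was pleasant today",
        "Neural nets learn a mapping from inputs to outputs",
    };
    char w[32];
    const char *p;
    int i;

    /* pass 1: count how often each word occurs in the whole "document" */
    for (i = 0; i < 3; i++)
        for (p = sent[i]; next_word(&p, w); )
            freq[lookup(w)]++;

    /* pass 2: score each sentence by summing its words' document frequencies */
    for (i = 0; i < 3; i++) {
        int score = 0;
        for (p = sent[i]; next_word(&p, w); )
            score += freq[lookup(w)];
        printf("score %2d : %s\n", score, sent[i]);
    }
    /* a summary would keep only the highest-scoring sentences */
    return 0;
}

Note how the sentences sharing frequent words ("neural", "nets", "inputs") score highest; exactly as the essay argues, nothing in this scoring guarantees the selected sentences cohere as a text.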
CR Structures (CRS)
After seeing three examples of how CR can be used, you might wonder why such programs aren't out yet. Creating a
successful CR program is incredibly difficult. Let's look at one possible CRS. This CRS was devised at Yale for, as they put it,
"possession-changing actions". They named it ATRANS, for Abstract TRANSfer of possession. The example I am going to give is a very
simple one; for more complicated diagrams, see Does the Top-Down or Bottom-Up Approach Best Model the Human Brain.

Basically, a CRS (like ATRANS) represents a very simple, base action that many other actions can be built from. ATRANS
can be used to represent give, trade, buy, exchange, sell and many, many more. Structurally, a CR is a series of 'slots', or
expectations, that the computer fills as it parses and interprets the sentence. Take the input "John sold Mary a book". Through the
representation that would be created, paraphrases such as these could be generated:
● Mary gave John some money and John gave her a book.
● Mary bought John's book.
● Mary paid John for a book.
Now, look at some of the inferences that the program could make:
● Mary has a book.
● John has money.
● Mary wanted John's book.
● John didn't want the book.
● John had already read the book.
● Mary will read the book.
● John needed the money.
This is based on an example in The Cognitive Computer, page 100.
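To make the slot idea concrete, here is a hypothetical C rendering of an ATRANS-style structure. The slot names below are illustrative guesses, not the exact Yale representation:

#include <stdio.h>

/* a hypothetical ATRANS frame: one possession transfer with four slots */
struct atrans {
    const char *actor;    /* who performs the transfer  */
    const char *object;   /* what changes hands         */
    const char *from;     /* previous owner             */
    const char *to;       /* new owner                  */
};

int main(void)
{
    /* "John sold Mary a book" fills two linked ATRANS frames:
       the book goes from John to Mary, and money goes from Mary to John */
    struct atrans book  = { "John", "book",  "John", "Mary" };
    struct atrans money = { "Mary", "money", "Mary", "John" };

    /* a trivial "inference": after the transfer, the 'to' slot owns the object */
    printf("%s has the %s\n", book.to,  book.object);    /* Mary has the book  */
    printf("%s has the %s\n", money.to, money.object);   /* John has the money */
    return 0;
}

The two frames together capture "sell"; swapping which frame the parser fills first would capture "buy", which is exactly the economy the essay describes.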

Obviously, for these types of inferences and paraphrases to be made, prior knowledge of various aspects must have been programmed
into the application. Also, the inferences made are the everyday type, the kind that would be made if the program also utilized
scripting. For example, John could have sold the book to Mary because it was a cookery book for Tahitian food, which John doesn't
really like; therefore, he never read the book, but the program would infer that he had. Perhaps with more advanced CR primitives,
and a more powerful inference engine that checks for such possibilities before inferring them, such cases could be reduced or
overcome completely.

Scripting
How does the program above create such inferences? I said that it uses prior knowledge; this prior knowledge will often come in the
form of a script. Schank describes a script like this:
"...Scripts are prepackaged set of expectations, inferences, and knowledge that are applied in common situations, like a
blueprint for action without the detail filled in..."
Cognitive Computer, pg 114.

Scripts are used by humans, in a sense. Imagine you hear this story: "Bob went to the shops. Ten minutes later, he walked out with
his shopping and went home." You make a few assumptions: that Bob bought the shopping, that Bob was short of a few items, etc.
The reason you know this is that you unconsciously follow a script in your head. You know the basic outline of shopping (due to
experience), so you can fill in the details and make assumptions about the rest. Let's look at another story: "Bob went to the
gardeners. He asked the waiter for a BMW and left." Now, this story makes no sense whatsoever to the normal person! This is
because it does not follow the "gardeners script": gardeners don't have waiters, nor do they sell BMWs!

Example of a Script.
Here is an incredibly simple example of a script, based on how to turn on a computer:
Script: COMPUTER-ON.
Track: Computer Room.
Props: Computer. On-button. Keyboard.
Roles: User.
Entry Conditions: User needs computer. Computer is off.
Results: User can use computer. Computer is on.

Scene 1: Locating On Button.
    If PC, look on front (or back) of computer for on-switch.
    Wait.

Scene 2: Booting Up.
    Computer starts to boot up.
    If computer starts OK, go to (Operating System).
    Else:
        Look at error.
        Fix problem. (Another script.)
        Turn off computer.
        Go to (Locating On Button).

Scene 3: Operating System.
    Computer has started; run program needed.
    Use computer.
To see an example of scripts at work, look at SAM.

CR Programs
Having said that CR programs are incredibly difficult to write, that doesn't mean such programs don't exist, although all have been
demonstration, proof-of-concept programs. I will look at two: one called SAM and another called IPP.

SAM
Perhaps one of the most famous AI programs, SAM (Script Applier Mechanism) was developed in 1975 by Richard Cullingford,
Wendy Lehnert, Anatole Gershman and Jaime Carbonell. It was designed to read stories that followed basic scripts, output
summaries in several languages, and answer questions based on the text.
SAM had 4 basic modules: a parser and generator based on a previous program, then the main module, the Script Applier (by
Cullingford), the question-answer module (Lehnert), and the Russian and Spanish generators (Gershman and Carbonell,
respectively).
Here is some sample output from SAM:

Input:

Friday evening a car swerved off Route 69. The vehicle struck a tree.
The passenger, a New Jersey man, was killed. David Hall, 27, was
pronounced dead at the scene by Dr. Dana Blanchard, medical examiner.
Frank Miller, 32, of 592 Foxon Rd., the driver, was taken to Milford
Hospital by Flanagan Ambulance. He was treated and released. No
charges were made. Patrolman Robert Onofrio investigated the accident.

English Summary:

AN AUTOMOBILE HIT A TREE NEAR HIGHWAY 69 FOUR DAYS AGO. DAVID HALL,
AGE 27, RESIDENCE IN NEW JERSEY, THE PASSENGER, DIED. FRANK MILLER,
AGE 32, RESIDENCE AT 593 FOXON ROAD IN NEW HAVEN, CONNECTICUT, THE
DRIVER, WAS SLIGHTLY INJURED. THE POLICE DEPARTMENT DID NOT FILE
CHARGES.

Spanish Summary:

UN AUTO CHOCO CONTRA UN ARBOL CERCA DE LA CARRETERA 69 EL VIERNES AL


ANOCHECER. DAVID HALL, DE 27 ANOS, QUE VIVIA EN NEW JERSEY, EL
PASAGERO, SE MURIO. FRANK MILLER, DE 32 ANOS, QUE VIVIA EN 593 CALLE
FOXON EN NEW HAVEN, CONNECTICUT, EL CHOFER, QUEDO UN POCO HERIDO. EL
DEPARTAMENTO DE POLICIA NO ACUSO A NADIE.

Question-Answering Output:

Q1: Was anyone killed?


A1: YES, DAVID HALL DIED

Q2: Was anyone hurt?


A2: YES, FRANK MILLER WAS SLIGHTLY INJURED.

Q3: Why was Miller hurt?


A3: BECAUSE THE AUTOMOBILE HIT THE TREE.

Q4: Did Miller go to the hospital?


A4: YES, HE WAS IN THE MILFORD HOSPITAL.

Q5: How did Miller get to the hospital?


A5: THE FLANAGAN AMBULANCE COMPANY TOOK HIM TO THE MILFORD HOSPITAL.
SAM had a few shortcomings, though. If a story digressed from a script, SAM would have a hard time. A program that handled
stories with more complicated plots and characters would need more complicated structures. Five years and several programs later,
IPP was developed.

IPP
IPP was developed in 1980 by Michael Lebowitz. IPP used slightly more advanced techniques than SAM: in addition to CR
primitives and scripts, it used plans and goals too (beyond the scope of this essay). IPP was built to look at newspaper articles in a
specific domain, and to make generalizations about the information it read and remembered. IPP was important because it could
update and expand its own memory structures.
Here is some sample output from the program, reading articles about Basque terrorism:
*(PARSE S1-7)

(10 9 79) SPAIN

(STEPPING UP EFFORTS TO DETAIL A BASQUE HOME RULE STATUTE


THAT WILL BE PUT TO A REFERENDUM THIS MONTH BASQUE GUNMEN
IN SAN SEBASTIAN SPRAYED A BAR FREQUENTED BY POLICEMEN WITH
GUNFIRE WOUNDING 11 PERSONS)

(IN PAMPLONA ANOTHER BASQUE CITY TERRORISTS MURDERED A


POLICE INSPECTOR *COMMA* KILLING HIM AS HE DREW HIS OWN
WEAPON IN SELF-DEFENSE)

>>> Beginning final memory inspection...

Feature analysis: EV1 (S-DESTRUCTIVE-ATTACK)


RESULTS HEALTH -10
AU HURT-PERSON
HEALTH -5
VICTIM NUMBER MANY
ROLE AUTHORITY
TARGET PLACE BAR
ACTOR NATIONALITY BASQUE
METHODS AU $SHOOT-ATTACK
LOCATION AREA WESTERN-EUROPE
NATION SPAIN

Indexing EV1 (S1-7) as variant of S-DESTRUCTIVE ATTACK

>>> Memory incorporation complete.

(5 15 80) SPAIN

(A BASQUE SEPARATIST GUERILLA SHOT TO DEATH THREE


YOUNG NATIONAL POLICEMEN AT POINT BLANK RANGE THURSDAY
AS THEY DRANK THEIR MORNING COFFEE IN A BAR)

>>> Beginning final memory incorporation...

Feature analysis: EV5 (S-DESTRUCTIVE-ATTACK)


TARGET PLACE BAR
VICTIM GENDER MALE
ROLE AUTHORITY
ACTOR NATIONALITY BASQUE
DEMAND-TYPE SEPARATISM
METHODS AU $SHOOT-ATTACK
LOCATION AREA WESTERN-EUROPE
NATION SPAIN

Creating more specific S-DESTRUCTIVE-ATTACK


(G1-1 : BASQUE-GEN) from events EV1 (S1-7)
EV5 (S1-6) with features:

VICTIM (1) GENDER MALE


ROLE AUTHORITY
ACTOR (1) NATIONALITY BASQUE
METHODS (1) AU $SHOOT-ATTACK
LOCATION (1) AREA WESTERN-EUROPE
NATION SPAIN
TARGET (1) PLACE BAR

>>>Memory incorporation complete

"Terrorist attacks in Spain are often shootings


of policement in bars by Basques"
This program obviously had its limitations: often the conclusions it made were a little misleading, and the domain was limited.
Nevertheless, this program was a definite landmark in CR programs. Other CR programs included MARGIE, PAM, POLITICS,
FRUMP, BORIS and the very impressive CYRUS. For more information on these programs, refer to pages 138-163 of Schank's
Cognitive Computer.

Addendum: Problems with Conceptual Representation


This essay in its original form was one of the first I wrote for Generation5. I had the early AI researcher's enthusiasm; since then, I
have become rather disenchanted with the entire top-down approach to Artificial Intelligence. (Readers unsure about the concepts
of 'top-down' and 'bottom-up' should see Does the Top-down or the Bottom-Up Approach More Closely Model the Brain, the
essay I wrote that started the whole disillusionment process!)
As I explained above, the problems with natural language processing as a whole are huge, and Conceptual Representation is no exception
to any of them. Imagine coding what we know as 'common sense' into computer code. In fact, an ongoing project in Texas, called
the Cyc Project, is attempting exactly that. After 14 years, they have a million or so facts down in computable form. While this is impressive, even more
impressive will be the search algorithm needed to go through them! I think that while the goals of the Cyc Project are genuine, once finished
(if ever) it will be of little use to anyone.
The lack of discourse knowledge is just one problem (albeit the biggest). Another area of critique is the huge amount of domain
knowledge required on top of 'common sense'. The lack of biological plausibility is another: manipulating structures in a serial
fashion is not how the brain works. Serial processing has yet another problem: speed!
Now, I believe that natural language processing will be one of the last (if not THE last) fields of Artificial Intelligence to fully mature.
Robots such as Cog may one day be able to come up with methods of learning language by themselves, without having to be
explicitly taught. This is the goal of robotics after all: implicit learning.
Neural Netware
by Andre' LaMothe - Xtreme Games

And There Was Light...


The funny thing about high technology is that sometimes it's hundreds of years old! For example, calculus was independently
invented by both Newton and Leibniz over 300 years ago. What used to be magic is now well known. And of course we all know
that geometry was invented by Euclid a couple of thousand years ago. The point is that it often takes years for something to come
into "vogue". Neural nets are a prime example. We have all heard about neural nets and about what they promise, but we don't
really see many real-world applications of them, as we do for ActiveX or the bubble sort. The reason for this is that the true nature of
neural nets is extremely mathematical, and understanding and proving the theorems that govern them takes calculus, probability
theory, and combinatorial analysis, not to mention physiology and neurology.
The key to unlocking any technology is for a person or persons to create a killer app for it. We all know how DOOM works by now,
i.e. by using BSP trees. However, John Carmack didn't invent them; he read about them in a paper written in the 1960's that
described BSP technology. John took the next step and realized what BSP trees could be used for, and DOOM was born. I suspect that
neural nets may have the same revelation in the next few years. Computers are fast enough to simulate them, VLSI designers are
building them right into the silicon, and there are hundreds of books that have been published about them. And since neural nets are
more mathematical entities than anything else, they are not tied to any physical representation; we can create them with software or
create actual physical models of them in silicon. The key is that neural nets are abstractions or models.
In many ways the computational limits of digital computers have been reached. Sure, we will keep making them faster, smaller and
cheaper, but digital computers will always process digital information, since they are based on deterministic binary models of
computation. Neural nets, on the other hand, are based on different models of computation: highly parallel,
distributed, probabilistic models that don't necessarily model a solution to a problem the way a computer program does, but model a
network of cells that can find, ascertain, or correlate possible solutions to a problem in a more biological way, by solving the problem
in little pieces and putting the results together. This article is a whirlwind tour of what neural nets are and how they work, in as much
detail as can be covered in a few pages. I know that a few pages doesn't do the topic justice, but maybe we can talk the management
into a small series???
Figure 1.0 - A Basic Biological Neuron.
Biological Analogs
Neural nets were inspired by our own brains. Literally, some brain in someone's head said, "I wonder how I work?" and then
proceeded to create a simple model of itself. Weird, huh? The model of the standard neurode is based on a simplified model of a
human neuron invented over 50 years ago. Take a look at Figure 1.0. As you can see, there are 3 main parts to a neuron:
● Dendrite(s): responsible for collecting incoming signals.
● Soma: responsible for the main processing and summation of signals.
● Axon: responsible for transmitting signals to other dendrites.
The average human brain has about 100,000,000,000, or 10^11, neurons, and each neuron has up to 10,000 connections via the
dendrites. The signals are passed via electro-chemical processes based on Na (sodium), K (potassium), and Cl (chloride) ions.
Signals are transferred by accumulation and potential differences caused by these ions; the chemistry is unimportant, but the signals
can be thought of as simple electrical impulses that travel from axon to dendrite. The connections from one dendrite to an axon are called
synapses, and these are the basic signal transfer points.
So how does a neuron work? Well, that doesn't have a simple answer, but for our purposes the following explanation will suffice.
The dendrites collect the signals received from other neurons, then the soma performs a summation of sorts and, based on the result,
causes the axon to fire and transmit the signal. The firing is contingent upon a number of factors, but we can model it as a transfer
function that takes the summed inputs, processes them, and then creates an output if the properties of the transfer function are met. In
addition, the output is non-linear in real neurons; that is, signals aren't digital, they are analog. In fact, neurons are constantly
receiving and sending signals, and the real model of them is frequency dependent and must be analyzed in the S-domain (the
frequency domain). The real transfer function of a simple biological neuron has, in fact, been derived, and it fills a number of
chalkboards up.
Now that we have some idea of what neurons are and what we are trying to model, let's digress for a moment and talk about what we
can use neural nets for in video games.

Applications to Games
Neural nets seem to be the answer that we are all looking for. If we could just give the characters in our games a little brain, imagine
how cool a game would be! Well, this is possible, in a sense. Neural nets model the structure of neurons in a crude way, but not the
high-level functionality of reason and deduction, at least in the classical sense of the words. It takes a bit of thought to come up with
ways to apply neural net technology to game AI, but once you get the hang of it, you can use it in conjunction with deterministic
algorithms, fuzzy logic, and genetic algorithms to create very robust thinking models for your games, without a doubt better than
anything you can do with hundreds of if-then statements or scripted logic. Neural nets can be used for such things as:
Environmental Scanning and Classification - A neural net can be fed information that could be interpreted as visual or auditory
input. This information can then be used to select an output response or to teach the net. These responses can be learned in
real time and updated to optimize the response.
Memory - A neural net can be used by game creatures as a form of memory. The neural net can learn through experience a set of
responses; then, when a new experience occurs, the net can respond with its best guess at what should be done.
Behavioral Control - The output of a neural net can be used to control the actions of a game creature. The inputs can be various
variables in the game engine. The net can then control the behavior of the creature.
Response Mapping - Neural nets are really good at "association", which is the mapping of one space to another. Association comes in
two flavors: autoassociation, which is the mapping of an input with itself, and heteroassociation, which is the mapping of an input with
something else. Response mapping uses a neural net at the back end, or output, to create another layer of indirection in the control or
behavior of an object. Basically, we might have a number of control variables, but we only have crisp responses for a certain
number of combinations that we can teach the net with. However, using a neural net on the output, we can obtain other responses that are
in the same ballpark as our well-defined ones.
The above examples may seem a little fuzzy, and they are. The point is that neural nets are tools that we can use in whatever way we
like. The key is to use them in cool ways that make our AI programming simpler and make game creatures respond more
intelligently.
Neural Nets 101


In this section we're going to cover the basic terminology and concepts used in neural net discussions. This isn't easy, since neural
nets are really the work of a number of different disciplines, and each discipline has created its own vocabulary. The
vocabulary that we will learn is a good intersection of all the well-known vocabularies and should suffice. In addition, neural network
theory is replete with redundant research, meaning that many people re-invent the wheel. This has had the effect of creating a
number of named neural net architectures. I will try to keep things as generic as possible, so that we don't get caught up in
naming conventions. Later in the article we will cover some nets that are distinct enough that we will refer to them by their proper
names. As you read, don't be too alarmed if you don't make the "connections" with all of the concepts; just read them, and we will cover
most of them again in full context in the remainder of the article. Let's begin...
Figure 2.0 - A Single Neurode with n Inputs.

Now that we have seen the wetware version of a neuron, let's take a look at the basic artificial neuron to base our discussions on.
Figure 2.0 is a graphic of a standard "neurode" or "artificial neuron". As you can see, it has a number of inputs labeled X1 - Xn and B.
These inputs each have an associated weight w1 - wn, and b, attached to them. In addition, there is a summing junction Y and a single
output y. The output y of the neurode is based on a transfer or "activation" function, which is a function of the net input to the
neurode. The inputs come from the Xi's and from B, which is a bias node. Think of B as a "past history", "memory", or "inclination".
The basic operation of the neurode is as follows: the inputs Xi are each multiplied by their associated weights and summed. The
output of the summing is referred to as the input activation Ya. The activation is then fed to the activation function fa(x) and the final
output is y. The equations for this are:
Eq. 1.0
Ya = B*b + Σ(i=1 to n) Xi*wi
and
y = fa(Ya)

The various forms of fa(x) will be covered in a moment.

Before we move on, we need to talk about the inputs Xi, the weights wi, and their respective domains. In most cases, inputs consist of
the positive and negative integers in the set (-∞, +∞). However, many neural nets use simpler bivalent values (meaning
that they have only two values). The reason for using such a simple input scheme is that ultimately all
complex inputs are converted to pure binary or bipolar representations anyway. In addition, many times we are trying to solve
computer problems, such as image or voice recognition, which lend themselves to bivalent representations. Nevertheless, this is not etched in
stone. In any case, the values used in bivalent systems are primarily 0 and 1 in a binary system or -1 and 1 in a bipolar system. Both
systems are similar, except that bipolar representations turn out to be mathematically better than binary ones. The weights wi on each
input are typically in the range (-∞, +∞) and are referred to as excitatory and inhibitory for positive and negative values
respectively. The extra input B, which is called the bias, is always 1.0 and is scaled or multiplied by b; that is, b is its weight in a sense.
This is illustrated in Eq. 1.0 by the leading term.
Continuing with our analysis, once the activation Ya is found for a neurode, it is applied to the activation function and the output
y can be computed. There are a number of activation functions and they have different uses. The basic activation functions fa(x) are:

Step:        Fs(x) = 1 if x ≥ θ, else 0
Linear:      Fl(x) = x, for all x
Exponential: Fe(x) = 1/(1 + e^(-σx))

The equations for each are fairly simple, but each is derived to model or fit various properties.
The step function is used in a number of neural nets and models a neuron firing when a critical input signal is reached. This is the
purpose of the factor θ: it models the critical input level, or threshold, that the neurode should fire at. The linear activation function is
used when we want the output of the neurode to more closely follow the input activation. This kind of activation function would be
used in modeling linear systems, such as basic motion with constant velocity. Finally, the exponential activation function is used to
create a non-linear response, which is the only possible way to create neural nets that model non-linear
processes. The exponential activation function is key in advanced neural nets, since the composition of linear and step activation
functions is always linear or step; without it we would never be able to create a net that has a non-linear response. Therefore, we need the
exponential activation function to address the non-linear problems that we want to solve with neural nets. However, we are not
locked into using the exponential function; hyperbolic, logarithmic, and transcendental functions can be used as well, depending on
the desired properties of the net. Finally, we can scale and shift all of these functions if we need to.
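As a quick illustration, here is a minimal C sketch of the three activation functions and the Eq. 1.0 summation; the function names are mine, not from any particular library:

#include <stdio.h>
#include <math.h>

/* step activation: fires (1.0) when the input reaches the threshold theta */
double f_step(double x, double theta) { return (x >= theta) ? 1.0 : 0.0; }

/* linear activation: the output follows the input activation directly */
double f_linear(double x) { return x; }

/* exponential (sigmoid) activation: Fe(x) = 1/(1 + e^(-sigma*x)) */
double f_exp(double x, double sigma) { return 1.0 / (1.0 + exp(-sigma * x)); }

/* the neurode summation of Eq. 1.0: Ya = B*b + sum of Xi*wi, with B = 1.0 */
double activation(const double x[], const double w[], int n, double b)
{
    double ya = 1.0 * b;
    int i;
    for (i = 0; i < n; i++)
        ya += x[i] * w[i];
    return ya;
}

int main(void)
{
    double x[2] = { 1.0, 1.0 }, w[2] = { 1.0, 1.0 };
    double ya = activation(x, w, 2, 0.0);
    printf("Ya=%f step=%f linear=%f sigmoid=%f\n",
           ya, f_step(ya, 2.0), f_linear(ya), f_exp(ya, 1.0));
    return 0;
}

(Compile with the math library, e.g. cc act.c -lm.)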
Figure 3.0 - A 4 Input, 3 Neurode, Single-Layer Neural Net.
Figure 4.0 - A 2 Layer Neural Network.

As you can imagine, a single neurode isn't going to do a lot for us, so we need to take a group of them and create a layer of neurodes,
as shown in Figure 3.0, which illustrates a single layer neural network. The neural net in Figure 3.0 has a number of inputs
and a number of output nodes. By convention this is a single layer net, since the input layer is not counted unless it is the only layer in
the network. In this case, the input layer is also the output layer, and hence there is one layer. Figure 4.0 shows a two layer neural net.
Notice that the input layer is still not counted and the internal layer is referred to as "hidden". The output layer is referred to as the
output or response layer. Theoretically, there is no limit to the number of layers a neural net can have; however, it may be difficult to
derive the relationships of the various layers and come up with tractable training methods. The best way to create multilayer neural
nets is to make each network one or two layers and then connect them as components or functional blocks.
All right, now let's talk about temporal, or time-related, topics. We all know that our brains are fairly slow compared to a digital
computer. In fact, our brains have cycle times in the millisecond range, whereas digital computers have cycle times in the nanosecond
(and soon sub-nanosecond) range. This means that signals take time to travel from neuron to neuron. This is also modeled by artificial
neurons, in the sense that we perform the computations layer by layer and transmit the results sequentially. This helps to better model
the time lag involved in signal transmission in biological systems such as us.
We are almost done with the preliminaries, so let's talk about some high-level concepts and then finish up with a couple more terms. The
question that you should be asking is, "what the heck do neural nets do?" This is a good question, and it's a hard one to answer
definitively. The better question is, "what do you want to try and make them do?" They are basically mapping devices that help map
one space to another space. In essence, they are a type of memory, and like any memory we can use some familiar terms to describe
them. Neural nets have both STM (Short Term Memory) and LTM (Long Term Memory). STM is the ability of a neural net to
remember something it just learned, whereas LTM is the ability of a neural net to remember something it learned some time ago
amidst its new learning. This leads us to the concept of plasticity, or in other words how a neural net deals with new information or
training. Can a neural net learn more information and still recall previously stored information correctly? If so, does the neural net
become unstable because it is holding so much information that the data starts to overlap or has common intersections? This is
referred to as stability. The bottom line is that we want a neural net to have a good LTM, a good STM, be plastic (in most cases) and
exhibit stability. Of course, some neural nets have no analog to memory; they are more for functional mapping, so these concepts
don't apply as-is, but you get the idea. Now that we know about these memory-related concepts, let's finish up by
talking about some of the mathematical factors that help measure and understand these properties.
One of the main uses for neural nets is as memories that can process input that is either incomplete or noisy and return a response. The
response may be the input itself (autoassociation) or another output that is totally different from the input (heteroassociation). Also,
the mapping may be from an n-dimensional space to an m-dimensional space, and non-linear to boot. The bottom line is that we want to
somehow store information in the neural net so that inputs (perfect as well as noisy) can be processed in parallel. This means that a
neural net is a kind of hyperdimensional memory unit, since it can associate an input n-tuple with an output m-tuple, where m can
equal n but doesn't have to.
What neural nets do in essence is partition an n-dimensional space into regions that uniquely map the input to the output, or classify
the input into distinct classes, like a funnel of sorts. Now, as the number of input values (vectors) in the input data set, which
we will refer to as S, increases, it logically follows that the neural net is going to have a harder time separating the information. And as a neural
net is filled with information, the input values that are to be recalled will overlap, since the input space can no longer keep everything
partitioned in a finite number of dimensions. This overlap results in crosstalk, meaning that some inputs are not as distinct as they
could be. This may or may not be desired. Although this problem isn't a concern in all cases, it is a concern in associative memory
neural nets, so to illustrate the concept let's assume that we are trying to associate n-tuple input vectors with some output set. The
output set isn't as much of a concern to proper functioning as the input set S is.
If a set of inputs S is straight binary then we are looking at sequences of the form 1101010...10110. Let's say that our input bit vectors
are only 3 bits each; the entire input space then consists of the vectors:
v0 = (0,0,0), v1 = (0,0,1), v2 = (0,1,0), v3 = (0,1,1), v4 = (1,0,0), v5 = (1,0,1), v6 = (1,1,0), v7 = (1,1,1)

To be more precise, the basis for this set of vectors is:

v = (1,0,0)*b2 + (0,1,0)*b1 + (0,0,1)*b0, where bi can take on the values 0 or 1.

For example, if we let b2=1, b1=0, and b0=1, then we get the vector:

v = (1,0,0)*1 + (0,1,0)*0 + (0,0,1)*1 = (1,0,0) + (0,0,0) + (0,0,1) = (1,0,1), which is v5 in our possible input set.

A basis is a special vector summation that describes a set of vectors in a space, so v describes all the vectors in our space. Now, to
make a long story short, the more orthogonal the vectors in the input set are, the better they will distribute in a neural net and the
better they can be recalled. Orthogonality refers to the independence of the vectors; in other words, if two vectors are orthogonal then
their dot product is 0, their projection onto one another is 0, and they can't be written in terms of one another. In the set v there are a
lot of orthogonal vectors, but they come in small groups. For example, v0 is orthogonal to all the vectors, so we can always include it;
but if we include v1 in our set S, then the only other vectors that will fit and maintain orthogonality are v2 and v4, giving the set:

v0 = (0,0,0), v1 = (0,0,1), v2 = (0,1,0), v4 = (1,0,0)

Why? Because vi • vj is equal to 0 for every pair of vectors in this set; since the dot product of every pair is 0, they must all
be mutually orthogonal. Therefore, this set will do very well in a neural net as input vectors. However, the set:
v6 = (1,1,0), v7 = (1,1,1)

will potentially do poorly as inputs, since v6 • v7 is non-zero. The next question is, "can we measure this
orthogonality?" The answer is yes. In a binary vector system there is a measure called hamming distance, which measures the
n-dimensional distance between binary bit vectors. It is simply the number of bits that differ between two vectors. For
example, the vectors:

v0 = (0,0,0), v1 = (0,0,1)

have a hamming distance of 1, while the vectors

v2 = (0,1,0), v4 = (1,0,0)

have a hamming distance of 2.


We can use hamming distance as the measure of orthogonality in binary bit vector systems, and this can help us determine if our
input vectors are going to have a lot of overlap. Determining orthogonality with general vector inputs is harder, but the concept is the
same. That's all the time we have for concepts and terminology, so let's jump right in and see some actual neural nets that do
something; hopefully by the end of the article you will be able to use them in your game's AI. We are going to cover neural nets
used to perform logic functions, classify inputs, and associate inputs with outputs.
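Here is a small C sketch of the hamming distance measure, reproducing the two examples above (the function name is illustrative):

#include <stdio.h>

/* number of differing bits between two binary vectors of k elements */
int hamming(const int *a, const int *b, int k)
{
    int i, d = 0;
    for (i = 0; i < k; i++)
        if (a[i] != b[i])
            d++;
    return d;
}

int main(void)
{
    int v0[3] = {0,0,0}, v1[3] = {0,0,1}, v2[3] = {0,1,0}, v4[3] = {1,0,0};
    printf("d(v0,v1) = %d\n", hamming(v0, v1, 3));  /* prints 1 */
    printf("d(v2,v4) = %d\n", hamming(v2, v4, 3));  /* prints 2 */
    return 0;
}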
Figure 5.0 - The McCulloch-Pitts Neurode.

Pure Logic Mr. Spock


The first artificial neural networks were created in 1943 by McCulloch and Pitts. These neural networks were composed of a number of
neurodes and were typically used to compute simple logic functions such as AND, OR, XOR, and combinations of them. Figure 5.0 is
a representation of a basic McCulloch-Pitts neurode with 2 inputs. If you are an electrical engineer, you will immediately see a
close resemblance between McCulloch-Pitts neurodes and transistors or MOSFETs. In any case, McCulloch-Pitts neurodes do not
have biases and have the simple activation function fmp(x):
Eq. 5.0
fmp(x) = 1, if x ≥ θ
         0, if x < θ
The MP (McCulloch-Pitts) neurode functions by summing the product of the inputs Xi and weights wi and applying the result Ya to
the activation function fmp(x). The early research of McCulloch and Pitts focused on creating complex logical circuitry with the neurode
models. In addition, one of the rules of the neurode model is that it takes one time step for a signal to travel from neurode to neurode.
This helps model the biological nature of neurons more closely. Let's take a look at some examples of MP neural nets that implement
basic logic functions. The logical AND function has the following truth table:
Table 1.0 - Truth Table for Logical AND.
X1 X2 Output
0  0  0
0  1  0
1  0  0
1  1  1

Figure 6.0 - Basic Logic Functions Implemented with McCulloch-Pitts Nets.

We can model this with a two-input MP neural net with weights w1=1, w2=1, and θ = 2. This neural net is shown in Figure 6.0a. As
you can see, all input combinations work correctly. For example, if we try inputs X1=1, X2=0, then the activation will be:
X1*w1 + X2*w2 = (1)*(1) + (0)*(1) = 1.0

If we apply 1.0 to the activation function fmp(x), the result is 0, which is correct. As another example, if we try inputs X1=1, X2=1,
then the activation will be:
X1*w1 + X2*w2 = (1)*(1) + (1)*(1) = 2.0

If we input 2.0 to the activation function fmp(x), the result is 1.0, which is correct. The other cases will work also. The OR function
is similar, but its threshold θ is changed to 1.0 instead of the 2.0 used for the AND. You can try running through the truth table
yourself to see the results.
The XOR network is a little different because it really has 2 layers in a sense: the results of the pre-processing are further
processed in the output neuron. This is a good example of why a neural net needs more than one layer to solve certain problems. The
XOR is a common problem used to test a neural net's performance. In any case, XOR is not linearly separable in
a single layer; it must be broken down into smaller problems and the results added together. Let's take a look at XOR as the final
example of MP neural networks. The truth table for XOR is as follows:
Table 2.0 - Truth Table for Logical XOR.
X1 X2 Output
0 0 0
0 1 1
1 0 1
1 1 0

Figure 7.0 - Using the XOR Function to Illustrate Linear Separability.

XOR is only true when the inputs are different; this is a problem since two different input pairs map to the same output. XOR is not linearly
separable, as shown in Figure 7.0: there is no way to separate the proper responses with a single straight line. The point
is that we can separate the proper responses with 2 lines, and this is just what 2 layers do. The first layer pre-processes or solves part
of the problem, and the remaining layer finishes up. Referring to Figure 6.0c, we see that the weights are w1=1, w2=-1, w3=1, w4=-1,
w5=1, w6=1. The network works as follows: layer one computes in parallel whether X1 and X2 are opposites; the results for either case, (0,1)
or (1,0), are fed to layer two, which sums them up and fires if either is true. In essence we have created the logic function:
z = ((X1 AND NOT X2) OR (NOT X1 AND X2))
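To make the two-layer idea concrete, here is a minimal C sketch of an MP net computing XOR. Since Figure 6.0c is not reproduced here, the exact weight-to-neurode wiring is an assumption; the sketch uses the decomposition above with a threshold of 1 on every neurode:

#include <stdio.h>

/* a single MP neurode with 2 inputs: fires when the weighted sum reaches theta */
int mp(double x1, double w1, double x2, double w2, double theta)
{
    return (x1 * w1 + x2 * w2 >= theta) ? 1 : 0;
}

int main(void)
{
    int x1, x2;
    for (x1 = 0; x1 <= 1; x1++)
        for (x2 = 0; x2 <= 1; x2++) {
            int h1 = mp(x1,  1.0, x2, -1.0, 1.0);  /* layer 1: X1 AND NOT X2 */
            int h2 = mp(x1, -1.0, x2,  1.0, 1.0);  /* layer 1: NOT X1 AND X2 */
            int z  = mp(h1,  1.0, h2,  1.0, 1.0);  /* layer 2: h1 OR h2      */
            printf("%d XOR %d = %d\n", x1, x2, z);
        }
    return 0;
}

Running it reproduces the XOR truth table, which a single MP neurode cannot do; this is the linear separability argument in miniature.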
If you would like to experiment with the basic McCulloch-Pitts neurode, Listing 1.0 is a complete 2-input, single-neurode simulator
that you can experiment with.
Listing 1.0 - A McCulloch-Pitts Logic Neurode Simulator

// MCCULLOCH-PITTS SIMULATOR


/////////////////////////////////////////////////////

// INCLUDES
/////////////////////////////////////////////////////
#include <stdio.h>
#include <ctype.h>

// MAIN
/////////////////////////////////////////////////////

int main(void)
{
float threshold, // this is the theta term used to threshold the summation
w1,w2, // these hold the weights
x1,x2, // inputs to the neurode
y_in, // summed input activation
y_out; // final output of neurode
char ans[8]; // holds the user's Y/N answer at the bottom of the loop

printf("\nMcCulloch-Pitts Single Neurode Simulator.\n");


printf("\nPlease Enter Threshold?");
scanf("%f",&threshold);

printf("\nEnter value for weight w1?");


scanf("%f",&w1);

printf("\nEnter value for weight w2?");


scanf("%f",&w2);

printf("\n\nBegining Simulation:");

// enter main event loop

while(1)
{
printf("\n\nSimulation Parms: threshold=%f,
W=(%f,%f)\n",threshold,w1,w2);

// request inputs from user


printf("\nEnter input for X1?");
scanf("%f",&x1);

printf("\nEnter input for X2?");


scanf("%f",&x2);

// compute activation
y_in = x1*w1 + x2*w2;

// input result to activation function (simple binary step)


if(y_in >= threshold)
y_out = (float)1.0;
else
y_out = (float)0.0;

// print out result
printf("\nNeurode Output is %f\n",y_out);

// try again
printf("\nDo you wish to continue Y or N?");
scanf("%s",ans);

if(toupper(ans[0])!='Y')
break;
} // end while

printf("\n\nSimulation Complete.\n");
} // end main
That finishes up our discussion of the basic building block invented by McCulloch and Pitts; now let's move on to more contemporary
neural nets, such as those used to classify input vectors.
Figure 8.0 - The Basic Neural Net Model Used for Discussion.

Classification and "Image" Recognition


At this point we are ready to start looking at real neural nets that have some girth to them! To segue into the following discussions of
Hebbian and Hopfield neural nets, we are going to analyze a generic neural net structure that illustrates a number of concepts,
such as linear separability, bipolar representations, and the analogy between neural nets and memories. Let's begin by taking a look
at Figure 8.0, which shows the basic neural net model we are going to use. As you can see, it is a single-node net with 3 inputs, including
the bias, and a single output. We are going to see if we can use this network to solve the logical AND function that we solved so
easily with McCulloch-Pitts neurodes.
Let's start by using bipolar representations, so all 0's are replaced with -1's and 1's are left alone. The truth table for logical AND
using bipolar inputs and outputs is shown below:
Table 3.0 - Truth Table for Logical AND in Bipolar Format.
X1 X2 Output
-1 -1 -1
-1 1 -1
1 -1 -1
1 1 1

And here is the activation function fc(x) that we will use:


Eq. 6.0
fc(x) = 1, if x ≥ θ
-1, if x < θ


Notice that the function is a step with bipolar outputs. Before we continue, let me plant a seed in your mind: the bias and threshold end
up doing the same thing; they give us another degree of freedom in our neurons that makes the neurons respond in ways that can't be
achieved without them. You will see this shortly.
The single neurode net in Figure 8.0 is going to perform a classification for us. It is going to tell us if our input is in one class or
another. For example, is this image a tree or not a tree? Or, in our case, is this input (which just happens to be the logic for an AND) in
the +1 or -1 class? This is the basis of most neural nets and the reason I was belaboring linear separability. We need to come up with
a linear partitioning of space that maps our inputs and outputs so that there is a solid delineation of space that separates them. Thus,
we need to come up with the correct weights and a bias that will do this for us. But how do we do this? Do we just use trial and error,
or is there a methodology? The answer is that there are a number of training methods to teach a neural net. These training methods
work on various mathematical premises and can be proven, but for now we're just going to pull some values out of the hat that work.
These exercises will lead us into the learning algorithms and more complex nets that follow.
All right, we are trying to find weights wi and bias b that give us the correct result when the various inputs are fed to our network
with the given activation function fc(x). Let's write down the activation summation of our neurode and see if we can infer any
relationship between the weights and the inputs that might help us. Given the inputs X1 and X2 with weights w1 and w2, along with
B=1 and bias b, we have the following formula:
Eq. 7.0
X1*w1 + X2*w2 + B*b = θ

Since B is always equal to 1.0, the equation simplifies to:

X1*w1 + X2*w2 + b = θ

X2 = -X1*w1/w2 + (θ - b)/w2 (solving in terms of X2)

Figure 9.0 - Mathematical Decision Boundaries Generated by Weights, Bias, and θ

What is this entity? It's a line! And if the left-hand side, (X1*w1 + X2*w2 + b), is greater than or equal to θ, then the neurode
will fire and output 1; otherwise the neurode will output -1. So the line is a decision boundary. Figure 9.0a illustrates this. Referring

to the figure, you can see that the slope of the line is -w1/w2 and the X2 intercept is (θ - b)/w2. Now can you see why we can get rid of
θ? It is part of a constant, and we can always scale b to take up any loss, so we will assume that θ = 0; the resulting equation is:
X2 = -X1*w1/w2 - b/w2

What we want to find are weights w1 and w2 and bias b that separate our outputs, classifying them into singular partitions
without overlap. This is the key to linear separability. Figure 9.0b shows a number of decision boundaries that will suffice, so we can
pick any of them. Let's pick the simplest values, which would be:
w1 = w2 = 1

b = -1
With these values our decision boundary becomes:
X2 = -X1*w1/w2 - b/w2 -> X2 = -1*X1 + 1

The slope is -1 and the X2 intercept is 1. If we plug the input vectors for the logical AND into this equation and use the fc(x) activation
function, we get the correct outputs: if X2 + X1 - 1 ≥ 0, fire the neurode (output 1), else output -1. Let's try it with our
AND inputs and see what we come up with:
Table 4.0 - Truth Table for Bipolar AND with Decision Boundary.
X1  X2  X1 + X2 - 1                Output
-1  -1  (-1) + (-1) - 1 = -3 < 0   don't fire, output -1
-1   1  (-1) + (1) - 1  = -1 < 0   don't fire, output -1
 1  -1  (1) + (-1) - 1  = -1 < 0   don't fire, output -1
 1   1  (1) + (1) - 1   =  1 ≥ 0   fire, output 1

As you can see, the neural network with the proper weights and bias solves the problem perfectly. Moreover, there is a whole family
of weights that will do just as well (sliding the decision boundary in a direction perpendicular to itself). However, there is an
important point here: without the bias or threshold, only lines through the origin would be possible, since the X2 intercept would have
to be 0. This is very important and the very basis for using a bias or threshold, so this example has proven to be an important one, since it
has flushed this fact out. So, are we closer to seeing how to algorithmically find weights? Yes; we now have a geometrical analogy,
and this is the beginning of finding an algorithm.

The Ebb of Hebbian


Now we are ready to see the first learning algorithm and its application to a neural net. One of the simplest learning algorithms was
invented by Donald Hebb, and it is based on using the input vectors to modify the weights so that the weights create the best
possible linear separation of the inputs and outputs. Alas, the algorithm works just OK. Actually, for inputs that are orthogonal it is
perfect, but for non-orthogonal inputs the algorithm falls apart. Even though the algorithm doesn't arrive at correct weights for all
inputs, it is the basis of most learning algorithms, so we will start here.
Before we see the algorithm, remember that it is for a single-neurode, single-layer neural net. You can, of course, place a number of
neurodes in the layer, but they will all work in parallel and can be taught in parallel. Are you starting to see the massive parallelization
that neural nets exhibit? Instead of using a single weight vector, a multi-neurode net uses a weight matrix. Anyway, the algorithm is
simple; it goes something like this:
Given:
● Input vectors are in bipolar form I = (-1, 1, ..., -1, 1) and contain k elements.
● There are n input vectors; we will refer to the set as I and the jth vector as Ij.
● Outputs will be referred to as yj, and there are n of them, one for each input vector Ij.
● The weights w1 - wk are contained in a single vector w = (w1, w2, ..., wk).

Step 1. Initialize all your weights to 0, and let them be contained in a vector w that has k entries. Also initialize the bias b to 0.

Step 2. For j = 1 to n do
    b = b + yj (where yj is the desired output)
    w = w + Ij * yj (remember this is a vector operation)
end do
The algorithm is nothing more than an "accumulator" of sorts, shifting the decision boundary based on the changes in the input and
output. The only problem is that it sometimes can't move the boundary fast enough (or at all), and "learning" doesn't take place.
So how do we use Hebbian learning? The same way as the previous network, except that now we have an algorithmic
method to teach the net with; thus we refer to the net as a Hebb, or Hebbian, net. As an example, let's take our trusty logical AND
function and see if the algorithm can find the proper weights and bias to solve the problem. The following summation is equivalent to
running the algorithm:
w = [I1*y1] + [I2*y2] + [I3*y3] + [I4*y4] = [(-1, -1)*(-1)] + [(-1, 1)*(-1)] + [( 1, -1)*(-1)] + [(1, 1)*(1)] = (2,2)

b = y1 + y2 + y3 + y4 = (-1) + (-1) + (-1) + (1) = -2

Therefore, w1=2, w2=2, and b=-2. These are simply scaled versions of the values w1=1, w2=1, b=-1 that we derived geometrically in
the previous section. Killer, huh? With this simple learning algorithm we can train a neural net (consisting of a single neurode) to
respond to a set of inputs and classify the input as true or false, 1 or -1. Now if we were to array these neurodes together to
create a network of neurodes, then instead of simply classifying the inputs as on or off, we could associate patterns with the inputs. This
is one of the foundations of the next neural net structure: the Hopfield net. One more thing: the activation function used for
a Hebb net is a step with a threshold of 0.0 and bipolar outputs 1 and -1.
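Here is a minimal C sketch of the Hebbian rule applied to the bipolar AND example above. It is just the two-line update loop, not the full simulator of Listing 2.0, and it should reproduce the hand calculation:

#include <stdio.h>

int main(void)
{
    /* bipolar AND training set: 4 input vectors of k = 2 elements each */
    double I[4][2] = { {-1,-1}, {-1,1}, {1,-1}, {1,1} };
    double y[4]    = { -1, -1, -1, 1 };   /* desired outputs */
    double w[2]    = { 0, 0 };            /* Step 1: weights start at 0 */
    double b       = 0;                   /* Step 1: bias starts at 0   */
    int j, i;

    for (j = 0; j < 4; j++) {             /* Step 2: the Hebb update loop */
        b += y[j];                        /* b = b + yj                   */
        for (i = 0; i < 2; i++)
            w[i] += I[j][i] * y[j];       /* w = w + Ij * yj              */
    }
    printf("w = (%g, %g), b = %g\n", w[0], w[1], b);  /* prints (2, 2), -2 */
    return 0;
}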
To get a feel for Hebbian learning and how to implement an actual Hebb net, Listing 2.0 contains a complete Hebbian neural net
simulator. You can create networks with up to 16 inputs and 16 neurodes (outputs). The program is self-explanatory, but there are a
couple of interesting properties: you can select 1 of 3 activation functions, and you can input any kind of data you wish. Normally,
we would stick to the step activation function, and inputs/outputs would be binary or bipolar. However, in the light of discovery,
maybe you will find something interesting with these added degrees of freedom. I suggest, though, that you begin with the step
function and all-bipolar inputs and outputs.
Listing 2.0 - A Hebb Net Simulator (in neuralnet.zip).

Playing the Hopfield


Figure 10.0 - A 4 Node Hopfield Autoassociative Neural Net.
John Hopfield is a physicist who likes to play with neural nets (which is good for us). He came up with a simple (in structure, at least)
but effective neural network called the Hopfield net. It is used for autoassociation: you input a vector x and you get x back
(hopefully). A Hopfield net is shown in Figure 10.0. It is a single-layer network with a number of neurodes equal to the number of
inputs Xi. The network is fully connected, meaning that every neurode is connected to every other neurode, and the inputs are also the
outputs. This should strike you as weird, since there is feedback. Feedback is one of the key features of the Hopfield net, and this
feedback is the basis for the convergence to the correct result.
The Hopfield network is an iterative autoassociative memory. This means that it may take one or more cycles to return the correct
result (if at all). Let me clarify: the Hopfield network takes an input and then feeds it back; the resulting output may or may not be the
desired input. This feedback cycle may occur a number of times before the input vector is returned. Hence, the functional
sequence of a Hopfield network is: first we determine the weights based on the input vectors that we want to autoassociate; then we input a
vector and see what the activations produce. If the result is the same as our original input then we are done; if not, we take
the result vector and feed it back through the network. Now let's take a look at the weight matrix and learning algorithm used for
Hopfield nets.
The learning algorithm for Hopfield nets is based on the Hebbian rule and is simply a summation of outer products. However, since the
Hopfield network has a number of input neurons, the weights are no longer a single array or vector, but a collection of vectors which
are most compactly contained in a single matrix. Thus the weight matrix W for a Hopfield net is created based on this equation:
Given:
● Input vectors are in bipolar form I = (-1, 1, ..., -1, 1) and contain k elements.
● There are n input vectors; we will refer to the set as I and the jth vector as Ij.
● Outputs will be referred to as yj, and there are n of them, one for each input vector Ij.
● The weight matrix W is square and has dimension k x k, since there are k inputs.
Eq. 8.0
W(k x k) = Σ(i=1 to n) Ii^t x Ii
Note: each outer product will have dimension k x k, since we are multiplying a column vector by a row vector.
Also, Wii = 0, for all i.

Notice that there are no bias terms and that the main diagonal of W must be all zeros. The weight matrix is simply the sum of the matrices
generated by multiplying the transpose Ii^t by Ii for all i from 1 to n. This is almost identical to the Hebbian algorithm for a single
neurode, except that instead of multiplying the input by the output, the input is multiplied by itself, which is equivalent to the output
in the case of autoassociation. Finally, the activation function fh(x) is shown below:
Eq. 9.0
fh(x) = 1, if x ≥ 0
        0, if x < 0
fh(x) is a step function with a binary output. This means that the inputs must be binary, but we already said that inputs are bipolar?
Well, they are, and they aren't. When the weight matrix is generated, we convert all input vectors to bipolar, but for normal operation
we use the binary versions of the inputs, and the output of the Hopfield net will also be binary. This convention is not necessary, but it
makes the network discussion a little simpler. Anyway, let's move on to an example. Say we want to create a four-node Hopfield net
and we want it to recall these vectors:
I1 = (0,0,1,0), I2 = (1,0,0,0), I3 = (0,1,0,1) (note: they are all orthogonal)

Converting to bipolar (denoted by *), we have:

I1* = (-1,-1,1,-1), I2* = (1,-1,-1,-1), I3* = (-1,1,-1,1)

Now we need to compute W1, W2, W3, where Wi is the outer product of the transpose of each input with itself.

W1 = [I1*t x I1*] = (-1,-1,1,-1)t x (-1,-1,1,-1) =

 1  1 -1  1
 1  1 -1  1
-1 -1  1 -1
 1  1 -1  1

W2 = [I2*t x I2*] = (1,-1,-1,-1)t x (1,-1,-1,-1) =

 1 -1 -1 -1
-1  1  1  1
-1  1  1  1
-1  1  1  1

W3 = [I3*t x I3*] = (-1,1,-1,1)t x (-1,1,-1,1) =

 1 -1  1 -1
-1  1 -1  1
 1 -1  1 -1
-1  1 -1  1

Then we add W1 + W2 + W3, resulting in:

W(1+2+3) =

 3 -1 -1 -1
-1  3 -1  3
-1 -1  3 -1
-1  3 -1  3

Zeroing out the main diagonal gives us the final weight matrix:

W =

 0 -1 -1 -1
-1  0 -1  3
-1 -1  0 -1
-1  3 -1  0

That's it; now we are ready to rock. Let's input our original vectors and see the results. To do this we simply matrix-multiply
the input by W and then process each output value with our activation function fh(x). Here are the results:

I1 x W = (-1,-1,0,-1) and fh((-1,-1,0,-1)) = (0,0,1,0)

I2 x W = (0,-1,-1,-1) and fh((0,-1,-1,-1)) = (1,0,0,0)

I3 x W = (-2,3,-2,3) and fh((-2,3,-2,3)) = (0,1,0,1)

The inputs were perfectly recalled, and they should be, since they are all orthogonal. As a final example, let's assume that our input
(vision, auditory, etc.) is a little noisy and has a single error in it. Let's take I3 = (0,1,0,1) and add some noise to it, resulting
in I3noise = (0,1,1,1). Now let's see what happens if we input this noisy vector to the Hopfield net:

I3noise x W = (-3,2,-2,2) and fh((-3,2,-2,2)) = (0,1,0,1)

Amazingly enough, the original vector is recalled. This is very cool. So we might have a memory that is filled with bit patterns that
look like trees, (oaks, weeping willow, spruce, redwood etc.) then if we input another tree that is similar to say a weeping willow, but
hasn't been entered into the net, our net will (hopefully) output a weeping willow indicating that this is what it "thinks" it looks like.
This is one of the strengths of associative memories, we don't have to teach it every possible input, but just enough to give it a good
idea. Then inputs that are "close" will usually converge to an actual trained input. This is the basis for image, and voice recognition
systems. Don't ask me where the heck the "tree" analogy came from. Anyway, to complete our study of neural nets, I have included a
final Hopfield autoassociative simulator that allows you to create nets with up to 16 neurodes. It is similar to the Hebb Net, but you
must use a step activation function and your inputs exemplars must be in bipolar while training and binary while associating
(running). Listing 3.0 contains the code for the simulator.
Listing 3.0 - A Hopfield Autoassociative Memory Simulator (in neuralnet.zip).
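If you just want the core idea without the full simulator, here is a minimal C++ sketch of the training and recall steps walked
through above. This is not the Listing 3.0 code from neuralnet.zip; the vector-of-vectors representation and the function names
are my own.

    #include <vector>

    typedef std::vector< std::vector<int> > Matrix;

    // Build W by summing the outer products of the bipolar exemplars
    // (entries -1/+1), zeroing the main diagonal as described above.
    Matrix Train( const std::vector< std::vector<int> > &exemplars, int n )
    {
        Matrix W( n, std::vector<int>( n, 0 ) );
        for( size_t k = 0; k < exemplars.size(); k++ )
            for( int i = 0; i < n; i++ )
                for( int j = 0; j < n; j++ )
                    if( i != j )
                        W[i][j] += exemplars[k][i] * exemplars[k][j];
        return W;
    }

    // Recall: multiply a binary (0/1) input vector by W, then apply fh
    // (a sum >= 0 maps to 1, per Eq. 9.0).
    std::vector<int> Recall( const std::vector<int> &in, const Matrix &W )
    {
        int n = (int)W.size();
        std::vector<int> out( n );
        for( int j = 0; j < n; j++ )
        {
            int sum = 0;
            for( int i = 0; i < n; i++ )
                sum += in[i] * W[i][j];
            out[j] = ( sum >= 0 ) ? 1 : 0;
        }
        return out;
    }

Feeding the worked example through this sketch reproduces the results above: training on the three bipolar exemplars and
recalling with I3noise = (0,1,1,1) yields (0,1,0,1).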

Brain Dead...
Well, that's all we have time for. I was hoping to get to the Perceptron network, but oh well. I hope that you now have an idea of what
neural nets are and how to create some working computer programs to model them. We covered basic terminology and concepts,
some mathematical foundations, and finished up with some of the more prevalent neural net structures. However, there is still so
much more to learn about neural nets. We need to cover Perceptrons, Fuzzy Associative Memories or FAMs, Bidirectional
Associative Memories or BAMs, Kohonen Maps, Adalines, Madalines, Backpropagation networks, Adaptive Resonance Theory
networks, "Brain State in a Box", and a lot more. Well, that's it; my neural net wants to play N64!
Download: neuralnet.zip (Source & EXE)



The Artificial Neuron

B1.1 Neurons and neural networks: the most abstract view

Michael A Arbib

Abstract
See the abstract for Chapter B1.

There are many types of artificial neuron, but most of them can be captured as formal objects of the kind
shown in figure B1.1.1. There is a set X of signals which can be carried on the multiple input lines x1 ,
. . . , xn and single output line y. In addition, the neuron has an internal state s belonging to some state
set S.

Figure B1.1.1. A ‘generic’ neuron, with inputs x1 , . . . , xn , output y, and internal state s.

A neuron may be either discrete-time or continuous-time. In other words, the input values, state and
output may be given at discrete times t ∈ Z = {0, 1, 2, 3, . . .}, say, or may be given at all times t in some
interval contained in the real line R. A discrete-time neuron is then specified by two functions which
specify (i) how the new state is determined by the immediately preceding inputs and (in some neuron
models, but by no means all) the previous state, and (ii) how the current output is to be ‘read out’ from
the current state:
The next-state function f : X^n × S → S, with s(t) = f(x1(t-1), . . . , xn(t-1), s(t-1)); and
The output function g : S → Y, with y(t) = g(s(t)).
As we shall see in later sections, popular choices take the signal-set X to be either a binary set—{0, 1}
is the ‘classical choice’, though physicists, inspired by the ‘spin-glass’ analogy, often use the spin-down,
spin-up set denoted by {−1, +1}—or an interval of the real line, such as [0, 1]; while the state-set is often
taken to be R itself. A continuous-time neuron is also specified by two functions f : X^n × S → S and
g : S → Y, y(t) = g(s(t)), but now f serves to define the rate of change of the state, that is, it provides
the right-hand side of the differential equation which defines the state dynamics:

ds(t)/dt = f(x1(t), . . . , xn(t), s(t)).
Clearly, S at least can no longer be a discrete set. A popular choice is to take the signal-set X to be
an interval of the real line, such as [0, 1], and the state-set to be R itself.
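As a deliberately simple illustration of the discrete-time formalism, here is a C++ sketch. The weighted-sum state and the
threshold read-out are an assumption made here for illustration, anticipating the specific neuron models defined later in the
chapter; the definitions above allow far more general choices of f and g.

    #include <vector>

    struct DiscreteNeuron
    {
        std::vector<double> w;   // one weight per input line
        double s;                // internal state

        // Next-state function f : X^n x S -> S (this choice ignores the old state).
        void Step( const std::vector<double> &x )
        {
            double sum = 0.0;
            for( size_t i = 0; i < w.size(); i++ )
                sum += w[i] * x[i];
            s = sum;
        }

        // Output function g : S -> Y, here a threshold read-out into {0, 1}.
        double Output() const { return ( s >= 0.0 ) ? 1.0 : 0.0; }
    };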
The focus of this chapter will be on motivating and defining some of the best known forms for f
and g. But first it is worth noting that the subject of neural computation is not interested in neurons as



Figure B1.1.2. A neural network viewed as a system (continuous-time case) or automaton (discrete-time
case). The input at time t is the pattern on the input lines, the output is the pattern on the output lines; and
the internal state is the vector of states of all neurons of the network.

ends in themselves but rather in neurons as units which can be composed into networks. Thus, both as
background for later chapters and as a framework for the focused discussion of individual neurons in this
chapter, we briefly introduce the idea of a neural network.
We first show how a neural network comprised of continuous-time neurons can also be seen as a
continuous-time system in this sense. As typified in figure B1.1.2, we characterize a neural network by
selecting N neurons and by taking the output line of each neuron, which may be split into several branches
carrying identical output signals, and either connecting each branch to a unique input line of another neuron
or feeding it outside the network to provide one of the N_L network output lines. Then every input to a
given neuron must be connected either to an output of another neuron or to one of the (possibly split)
N_1 input lines of the network. Then the input set X of the entire network is R^(N_1), the state set Q = R^N,
and the output set Y = R^(N_L). If the ith output line comes from the j th neuron, then the output function
is determined by the fact that the ith component of the output at time t is the output gj (sj (t)) of the j th
neuron at time t. The state transition function for the neural network follows from the state transition
functions of each of the N neurons
dsj (t)
= fj (x1j (t), . . . , xnj j (t), sj (t))
dt
as soon as we specify whether xij (t) is the output of the kth neuron or the value currently being applied
on the lth input line of the overall network.
Turning to the discrete-time case, we first note that, in computer science, an automaton is a discrete-
time system with discrete input, output and state spaces. Formally, we describe an automaton by the sets X,
Y and Q of inputs, outputs and states, respectively, together with the next-state function δ : Q × X → Q
and the output function β : Q → Y . If the automaton is in state q and receives input x at time t,
then its next state will be δ(q, x) and its next output will be β(q). It should be clear that a network
like that shown in figure B1.1.2, but now a discrete-time network made up solely from discrete-time
neurons, functions like a finite automaton, as each neuron changes state synchronously on each tick of
the time-scale t = 0, 1, 2, 3, . . . . Conversely, it can be shown (see e.g. Arbib 1987, Chapter 2; the
result was essentially, though inscrutably, due to McCulloch and Pitts 1943) that any finite automaton
can be simulated by a suitable network of discrete-time neurons (even those of the ‘McCulloch–Pitts
type’ defined below). Although we can define a neural network for the very general notion of ‘neuron’
shown in figure B1.1.1, most artificial neurons are of the kind shown in figure B1.1.3 in which the input
lines are parametrized by real numbers. The parameter attached to an input line to neuron i that comes
from the output of neuron j is often denoted by wij , and is referred to by such terms as the strength or
synaptic weight for the connection from neuron j to neuron i. Much of the study of neural computation
is then devoted to finding settings for these weights which will get a given neural network to approximate
some desired behavior. The weights may either be set on the basis of some explicit design principles,
or ‘discovered’ through the use of learning rules whereby the weight settings are automatically adjusted
‘on the basis of experience’. But all this is meat for later chapters, and we now return to our focal aim:
introducing a number of the basic models of single neurons which ‘fill in the details’ in figure B1.1.3. As
described in Section A1.2, there are radically different types of neurons in the human brain, and further
variations in neuron types of other species.

Figure B1.1.3. A neuron in which each input xi passes through a ‘synaptic weight’ or ‘connection strength’
wi .

Figure B1.1.4. The ‘basic’ neuron. The soma and dendrites act as the input surface; the axon carries the
output signals. The tips of the branches of the axon form synapses upon other neurons or upon effectors.
The arrows indicate the direction of information flow from inputs to outputs.

In neural computation, the artificial neurons are designed as variations on the abstractions of brain
theory and implemented in software, VLSI, or other media. Figure B1.1.4 indicates the main features
needed to visualize biological neurons. We divide the neuron into three parts: the dendrites, the soma
(cell body) and a long fiber called the axon whose branches form the axonal arborization. The soma
and dendrites act as input surface for signals from other neurons and/or input devices (sensors). The
axon carries ‘spikes’ from the neuron to other neurons and/or effectors (motors, etc). Towards a first
approximation, we may think of a ‘spike’ as an all-or-none (binary) event; each neuron has a ‘refractory
period’ such that at most one spike can be triggered per refractory period. The locus of interaction between
an axon terminal and the cell upon which it impinges is called a synapse, and we say that the cell with
the terminal synapses upon the cell with which the connection is made.

References
Arbib M A 1987 Brains, Machines and Mathematics 2nd edn (Berlin: Springer)
McCulloch W S and Pitts W H 1943 A logical calculus of the ideas immanent in nervous activity Bull. Math. Biophys.
5 115–33


Justin Heyes-Jones personal web pages - A* Tutorial

Update history

September 6 2001: Added Macintosh OS9 pathfinder link.
September 2 2001: Complete revision. This is a complete rewrite of the tutorial. The old one can be found at oldastar.html, but hopefully you will find that the new one is better.
August 31 2001: Now includes example source code (C++).
July 23 2001: I have updated the programmers book shelf page.

A* algorithm tutorial
Introduction

This document contains a description of the AI algorithm known as A*. The downloads
section also has full source code for an easy to use extendable implementation of the
algorithm, and two example problems.

Previously I felt that it would be wrong of me to provide source code, because I wanted to
focus on teaching the reader how to implement the algorithm rather than just supplying a
ready made package. I have now changed my mind, as I get many emails from people
struggling to get something working. The example code is written in Standard C++ and
uses STL, and does not do anything machine or operating system specific, so hopefully it
will be quite useful to a wide audience.
State space search

A* is a type of search algorithm. Some problems can be solved by representing the world
in the initial state, and then for each action we can perform on the world we generate states
for what the world would be like if we did so. If you do this until the world is in the state
that we specified as a solution, then the route from the start to this goal state is the solution
to your problem.

In this tutorial I will look at the use of state space search to find the shortest path between
two points (pathfinding), and also to solve a simple sliding tile puzzle (the 8-puzzle). Let's
look at some of the terms used in Artificial Intelligence when describing this state space
search.

Some terminology

A node is a state that the problem's world can be in. In pathfinding a node would be just a
2d coordinate of where we are at the present time. In the 8-puzzle it is the positions of all
the tiles.
Next all the nodes are arranged in a graph where links between nodes represent valid steps
in solving the problem. These links are known as edges. In the 8-puzzle diagram the edges
are shown as blue lines. See figure 1 below.
State space search, then, is solving a problem by beginning with the start state, and then
for each node we expand all the nodes beneath it in the graph by applying all the possible
moves that can be made at each point.

Heuristics and Algorithms

At this point we introduce an important concept, the heuristic. This is like an algorithm,
but with a key difference. An algorithm is a set of steps which you can follow to solve a
problem, which always works for valid input. For example you could probably write an
algorithm yourself for multiplying two numbers together on paper. A heuristic is not
guaranteed to work but is useful in that it may solve a problem for which there is no algorithm.
We need a heuristic to help us cut down on this huge search problem. What we need is to
use our heuristic at each node to make an estimate of how far we are from the goal. In
pathfinding we know exactly how far we are, because we know how far we can move each
step, and we can calculate the exact distance to the goal.
But the 8-puzzle is more difficult. There is no known algorithm for calculating from a
given position how many moves it will take to get to the goal state. So various heuristics
have been devised. The best one that I know of is known as the Nilsson score which leads
fairly directly to the goal most of the time, as we shall see.

Cost

When looking at each node in the graph, we now have an idea of a heuristic, which can
estimate how close the state is to the goal. Another important consideration is the cost of
getting to where we are. In the case of pathfinding we often assign a movement cost to
each square. If the cost is the same everywhere, then the cost of each square is one. If we wanted to
differentiate between terrain types we may give higher costs to grass and mud than to
newly made road. When looking at a node we want to add up the cost of what it took to
get here, and this is simply the sum of the cost of this node and all those that are above it
in the graph.

8 Puzzle

Let's look at the 8 puzzle in more detail. This is a simple sliding tile puzzle on a 3*3 grid
where one tile is missing and you can move the other tiles into the gap until you get the
puzzle into the goal position. See figure 1.

Figure 1 : The 8-Puzzle state space for a very simple example

There are 362,880 different states that the puzzle can be in, and to find a solution the
search has to find a route through them. From most positions of the search the number of
edges (that's the blue lines) is two. That means that the number of nodes you have in each
level of the search is 2^d where d is the depth. If the number of steps to solve a particular
state is 18, then that’s 262,144 nodes just at that level.

The 8 puzzle game state is as simple as representing a list of the 9 squares and what's in
them. Here are two states for example; the last one is the GOAL state, at which point
we've found the solution. The first is a jumbled up example that you may start from.

Start state: SPACE, A, C, H, B, D, G, F, E

Goal state: A, B, C, H, SPACE, D, G, F, E

The rules that you can apply to the puzzle are also simple. If there is a blank tile above,
below, to the left or to the right of a given tile, then you can move that tile into the space.
To solve the puzzle you need to find the path from the start state, through the graph down
to the goal state.

There is example code to solve the 8-puzzle on the downloads page.

Pathfinding

In a video game, or some other pathfinding scenario, you want to search a state space and
find out how to get from somewhere you are to somewhere you want to be, without
bumping into walls or going too far. For reasons we will see later, the A* algorithm will
not only find a path, if there is one, but it will find the shortest path. A state in pathfinding
is simply a position in the world. In the example of a maze game like Pacman you can
represent where everything is using a simple 2d grid. The start state for a ghost say, would
be the 2d coordinate of where the ghost is at the start of the search. The goal state would
be where pacman is so we can go and eat him. There is also example code to do
pathfinding on the downloads page.

Figure 2 : The first three steps of a pathfinding state space

Implementing A*

We are now ready to look at the operation of the A* algorithm. What we need to do is start
with the start state and then generate the graph downwards from there. Let's take the
8-puzzle in figure 1. We ask how many moves can we make from the start state? The
answer is 2: there are two directions we can move the blank tile, and so our graph expands.
If we were just to continue blindly generating successors to each node, we could
potentially fill the computer's memory before we found the goal node. Obviously we need
to remember the best nodes and search those first. We also need to remember the nodes
that we have expanded already, so that we don't expand the same state repeatedly.
Let's start with the OPEN list. This is where we will remember which nodes we haven't yet
expanded. When the algorithm begins the start state is placed on the open list, it is the only
state we know about and we have not expanded it. So we will expand the nodes from the
start and put those on the OPEN list too. Now we are done with the start node and we will
put that on the CLOSED list. The CLOSED list is a list of nodes that we have expanded.

f=g+h

Using the OPEN and CLOSED list lets us be more selective about what we look at next in
the search. We want to look at the best nodes first. We will give each node a score on how
good we think it is. This score should be thought of as the cost of getting from the node to
the goal plus the cost of getting to where we are. Traditionally this has been represented by
the letters f, g and h. 'g' is the sum of all the costs it took to get here, 'h' is our heuristic
function, the estimate of what it will take to get to the goal. 'f' is the sum of these two. We
will store each of these in our nodes.
Using the f, g and h values the A* algorithm will be directed, subject to conditions we will
look at further on, towards the goal and will find it in the shortest route possible.

So far we have looked at the components of A*; let's see how they all fit together to
make the algorithm:

Pseudocode

Hopefully the ideas we looked at in the preceding paragraphs will now click into place as
we look at the A* algorithm pseudocode. You may find it helpful to print this out or leave
the window open while we discuss it.
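Since the pseudocode page itself is not reproduced here, the following is a hedged summary of the standard A* loop. OPEN,
CLOSED, f, g and h are the tutorial's own terms; the rest of the wording is mine.

    add the start node to OPEN
    while OPEN is not empty:
        remove the node n with the lowest f from OPEN
        if n is the goal, walk the parent pointers back to the start; done
        for each successor s of n:
            s.g = n.g + cost of moving from n to s
            s.h = heuristic estimate of the distance from s to the goal
            s.f = s.g + s.h
            if a node with the same state as s is already on OPEN or
                CLOSED with a lower or equal f, skip s
            otherwise set s's parent to n and add s to OPEN
        add n to CLOSED
    if OPEN empties before the goal is found, there is no solution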

To help make the operation of the algorithm clear we will look again at the 8-puzzle
problem in figure 1 above. Figure 3 below shows the f,g and h scores for each of the tiles.

Figure 3 : 8-Puzzle state space showing f,g,h scores

First of all look at the g score for each node. This is the cost of what it took to get from the
start to that node. So in the picture the center number is g. As you can see it increases by
one at each level. In some problems the cost may vary for different state changes. For
example in pathfinding there is sometimes a type of terrain that costs more than other
types.
Next look at the last number in each triple. This is h, the heuristic score. As I mentioned
above I am using a heuristic known as Nilsson's Sequence, which converges quickly to a
correct solution in many cases. Here is how you calculate this score for a given 8-puzzle
state :

Nilsson's sequence score


A tile in the center scores 1 (since it should be empty). For each tile not in the center, if the tile
clockwise to it is not the one that should be clockwise to it, then score 2. Multiply this sequence
score by three, and finally add the total distance you need to move each tile back to its correct
position. Reading the source code should make this clearer.

Looking at the picture you should satisfy yourself that the h scores are correct according to
this algorithm.
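For reference, here is a hedged C++ sketch of that calculation. The board representation and the helper tables are my own,
built from the goal layout used in this tutorial (A B C / H space D / G F E); the tutorial's actual version is in 8puzzle.cpp.

    #include <cstdlib>

    // Edge squares of the 3x3 board, walked clockwise (row-major indices).
    static const int kClockwise[8] = { 0, 1, 2, 5, 8, 7, 6, 3 };
    // Goal tile for each square; 0 represents the space.
    static const char kGoal[9] = { 'A','B','C','H', 0 ,'D','G','F','E' };

    int NilssonScore( const char board[9] )
    {
        int sequence = 0;

        // A tile in the center scores 1 (it should be empty).
        if( board[4] != 0 ) sequence += 1;

        // Each edge tile scores 2 if the tile clockwise from it is not
        // its proper clockwise successor (A..H in the goal, wrapping H->A).
        for( int i = 0; i < 8; i++ )
        {
            char tile = board[ kClockwise[i] ];
            if( tile == 0 ) continue;
            char next = board[ kClockwise[(i + 1) % 8] ];
            char wanted = ( tile == 'H' ) ? 'A' : char(tile + 1);
            if( next != wanted ) sequence += 2;
        }

        // Total Manhattan distance of every tile from its goal square.
        int distance = 0;
        for( int i = 0; i < 9; i++ )
        {
            char tile = board[i];
            if( tile == 0 ) continue;
            for( int j = 0; j < 9; j++ )
                if( kGoal[j] == tile )
                {
                    distance += abs( i/3 - j/3 ) + abs( i%3 - j%3 );
                    break;
                }
        }

        return distance + 3 * sequence;
    }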

Finally look at the digit on the left, the f score. This is the sum of g and h, and by tracking
the lowest f down through the state space you are doing what the A* algorithm would be
doing during its search.

Let me now look at the example source code provided with the tutorial, for although the
algorithm at this stage may be clear in your mind, the implementation is a little
complicated. The language of choice for this kind of algorithm is really Lisp or Prolog, and
most universities use these when teaching. This effectively lets students focus on the
algorithm rather than implementation details such as memory and data structures. For
our purposes, however, I will refer to my example source code. This is in C++ and uses
standard library and STL data structures.

C++ implementation details

If you intend on compiling and running the example code then you can get it on the
downloads page. I have not put any project, workspace or makefiles in the archive, but
compilation and linking should be straightforward; the programs run from a command
line. As we will see, the A* algorithm is in a header file, since it is implemented as a
template class, so to compile you need only compile one of the example files, 8puzzle.cpp
or findpath.cpp.

There are comments throughout the source, and I hope it is clear and readable. What
follows then is a very brief summary for how it works, and the basic design ideas.

The main class is called AStarSearch, and is a template class. I chose to use templates
because this enables the user to specialise the AStarSearch class to their user state in an
efficient way. Originally I used inheritance from a virtual base class, but that led to the
use of type casts in many places to convert from the base Node to the user's node. Also,
templates are resolved at compile time rather than runtime, which makes them more
efficient and use less memory.

You pass in a type which represents the state part of the problem. That type must contain
the data you need to represent each state, and also several member functions which get
called during the search. These are described below:

float GoalDistanceEstimate( PuzzleState &nodeGoal );
    Return the estimated cost to goal from this node.

bool IsGoal( PuzzleState &nodeGoal );
    Return true if this node is the goal.

void GetSuccessors( AStarSearch<PuzzleState> *astarsearch );
    For each successor to this state, call the AStarSearch's AddSuccessor call to add each one to the current search.

float GetCost( PuzzleState *successor );
    Return the cost of moving from this state to the state of successor.

bool IsSameState( PuzzleState &rhs );
    Return true if the provided state is the same as this state.

The idea is that you should easily be able to implement different problems. All you need
do is create a class to represent a state in your problem, and then fill out the functions
above.
Once you have done that, you create a search class instance like this:

AStarSearch<PuzzleState> astarsearch;

Then create the start and goal states and pass them to the algorithm to initialize the
search:

astarsearch.SetStartAndGoalStates( nodeStart, nodeEnd );

Each step (a step is getting the best node and expanding its successors) you call:

SearchState = astarsearch.SearchStep();

This returns a status that lets you know whether the search succeeded, failed, or is
still going.
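Putting these calls together, a typical driver loop looks something like this. This is a sketch; I am assuming the status
constants are named along the lines of SEARCH_STATE_SEARCHING, so check stlastar.h in the downloaded source for
the exact names.

    AStarSearch<PuzzleState> astarsearch;
    astarsearch.SetStartAndGoalStates( nodeStart, nodeEnd );

    unsigned int SearchState;
    do
    {
        // One step = take the best node off OPEN and expand its successors.
        SearchState = astarsearch.SearchStep();
    }
    while( SearchState == AStarSearch<PuzzleState>::SEARCH_STATE_SEARCHING );

    // SearchState now indicates success or failure; on success, walk the
    // solution with the GetSolutionStart/GetSolutionNext calls described below.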

Once your search succeeds you need to be able to display it to the user, or use it in your
program. To facilitate this I have added functions to allow movement through the solution.

UserState *GetSolutionStart();
UserState *GetSolutionNext();

UserState *GetSolutionEnd();
UserState *GetSolutionPrev();

You use these to move an internal iterator through the solution. The most typical use
would be to call GetSolutionStart (the start state) and then iterate through each node using
GetSolutionNext. For debugging purposes, or for some problems, you may need to iterate
through the solution backwards, and the second two functions allow that.

Debugging and Educational functions

Let's say you decide to display the OPEN and CLOSED lists at each step of the solution.
This is a common debug feature whilst getting the algorithm working. Further, for the
student it is often easier to see what is going on this way. Using the following calls you
can display the lists during the search process...

UserState *GetOpenListStart( float &f, float &g, float &h );


UserState *GetOpenListNext( float &f, float &g, float &h );
UserState *GetClosedListStart( float &f, float &g, float &h );
UserState *GetClosedListNext( float &f, float &g, float &h );

As you see, these calls take references to float values for f, g and h, so if your debugging or
learning needs involve looking at these then you can pass floats in to store the results. If
you don't care, these are optional arguments.

Examples of how you use these features are present in both the findpath.cpp and
8puzzle.cpp example files.

I hope that at this point you will understand the key concepts you need, and by reading and
experimenting with the example code (stepping through it with a debugger is very
instructive) you hopefully will fully grasp the A* Algorithm. To complete this
introduction I will briefly cover Admissibility and Optimization issues.

Admissibility

Any graph search algorithm is said to be admissible if it always returns an optimal solution,
that is, the one with the lowest cost, if a solution exists at all.
However, A* is only admissible if the heuristic you use, h', never over-estimates the
distance to the goal. In other words, if you knew a heuristic h which always gave the exact
distance to the goal, then to be admissible h' must be less than or equal to h.
For this reason, when choosing a heuristic you should always try to ensure that it does not
over-estimate the distance to the goal. In practice this may be impossible. Look at the
8-puzzle, for example; with our heuristic above it is possible that we may get an estimated
cost to goal that is higher than is really necessary. But it does help you to be aware of this
theory. If you set the heuristic to return zero, you will never over-estimate the distance to
the goal, but what you will get is a simple search of every node generated at each step
(breadth-first search).
One final note about admissibility: there is a corollary to this theory called the Graceful
Decay of Admissibility, which states that if your heuristic rarely over-estimates the real
distance to goal by more than a certain value (let's call it E), then the algorithm will rarely
find a solution which costs more than E over the cost of the optimal solution.

Optimization

A good source of optimizations for A* can be found in Steve Rabin's chapters in Game
Gems, which is on the books page. The forthcoming book AI Wisdom by the same
publisher is going to have several chapters on optimization of A*. These of course focus
on pathfinding, which is the ubiquitous use of A* in games.
Optimizing pathfinding is a whole subject in itself and I only want to target the A*
algorithm for general use, but there are some obvious optimizations you will want to make
for most problems. After testing my example code with VTune I found the two main
bottlenecks were searching the OPEN and CLOSED lists for a new node, and managing
new nodes. A simple but very effective optimization was to write a simpler memory
allocator than the C++ std new uses. I have provided the code for this class and you can
enable it in stlastar.h. I may write a tutorial on it in the future if there is sufficient interest.
Since you always want to get the node with the lowest 'f' score off the OPEN list each
search loop, you can use a data structure called a priority queue. This enables you to
organise your data in a way in which the best (or worst, depending on how you set it up)
item can always be removed efficiently. Steve Rabin's chapter in the book above shows
how to use an STL vector along with heap operations to get this behaviour. My source
code uses this technique.
If you are interested in priority queues, follow the link above to my old A* tutorial, as I
implemented one from scratch in C, and the source code has been used in public projects
such as FreeCell Solver.
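As an illustration of the vector-plus-heap technique, here is a hedged C++ sketch; the Node type and the function names
are mine, not taken from the tutorial's source.

    #include <vector>
    #include <algorithm>

    struct Node { float f, g, h; /* plus your user state */ };

    // Orders the heap so that the node with the lowest f is removed first.
    struct NodeCompare
    {
        bool operator()( const Node *a, const Node *b ) const
        {
            return a->f > b->f;
        }
    };

    std::vector<Node*> open;

    void PushOpen( Node *n )
    {
        open.push_back( n );
        std::push_heap( open.begin(), open.end(), NodeCompare() );
    }

    Node *PopBestOpen()
    {
        std::pop_heap( open.begin(), open.end(), NodeCompare() );   // best moves to the back
        Node *best = open.back();
        open.pop_back();
        return best;
    }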
Another optimization is that instead of searching the lists you should use a hash table. This
will prevent you having to do a linear search. A third optimization is that you never need
to backtrack in a graph search. If you look at pathfinding, for example, you will never be
nearer to the goal if you step back to where you came from. So when you write your code
to generate the successors of a node, you can check the generated ones and eliminate any
states that are the same as the parent. Although this makes no difference to the operation
of the algorithm, it does make the search quicker.
The key to optimization is not to do it until you have your code working and you know
that your problem is correctly represented. Only then can you start to optimize the data
structures to work better for your own problem. Using VTune or True Time, or whatever
profiler you have available is the next step. In some problems checking to see if something
is the goal or not may be costly, whilst in others generating the successor nodes at each
step may be a significant bottleneck. Profiling takes the guesswork out of finding where
the bottleneck is, so that you can target the key problems in your application.

For more information

● The original tutorial was inspired by Brian Stout's pathfinding tutorial in Game
Developer:
http://www.gamasutra.com/features/19990212/sm_01.htm
● I also recommend Stephen Woodcock's website:
http://www.gameai.com
● Books See the books page; the AI section contains two good text books which include
good sections on the A* algorithm.

Other implementations on the web

If this list is out of date, or you would like to add an implementation link, I would be grateful for
your email.

● Thomas Grubb's Delphi Pathfinding demo
http://www.riversoftavg.com/downloads.htm
● Visual Basic (Dijkstra not A*) by Julien Lecomte
http://amanitamuscaria.phidji.com/files/downloadcode.asp?file=tile
● Director (using Lingo) pathfinding demo
http://www.spritelab.dk/beta/Astar2.html
● James MacGill, Leeds University, has a useful Java based demo
http://www.ccg.leeds.ac.uk/james/aStar
● For those of you at University or in love with Lisp for your own reasons here is a great
lisp link
http://yoda.cis.temple.edu:8080/UGAIWWW/resources/search-resources.html
● Geert-Jan van Opdorp's Java A* Implementation
http://www.gameai.com/javastar.html
● Blair Robertson's A* pathfinder in C (Mac OS9)
http://www.idevgames.com/fileshow.php3?showid=179

Copyright 1999,2001 Justin Heyes-Jones


All rights reserved.


Chess Tree Search

Last modified: 16 November 1997



Tree search is one of the central algorithms of any game playing program. The term is based on looking
at all possible game positions as a tree, with the legal game moves forming the branches of this tree. The
leaves of the tree are all final positions, where the outcome of the game is known. The problem for most
interesting games is that the size of this tree is tremendously huge, something like W^D, where W is the
average number of moves per position and D is the depth of the tree. Searching the whole tree is
impossible, mainly due to lack of time, even on the fastest computers. All practical search algorithms are
approximations of doing such a full tree search.
These pages give an overview of traditional, fixed-depth minimax search, with various refinements such
as selective extensions and pruning, as used in most modern chess programs. There are other, more
experimental, game tree search techniques that take a different approach, like e.g. B* and conspiracy
numbers, which I hope to describe at a later time.
This overview covers the following subjects:
● MiniMax and NegaMax

● Alpha-Beta search
● Aspiration search
● Transposition table
● Iterative Deepening
● Principal Variation Search
● Memory Enhanced Test
● Enhanced Transposition Cutoff
● Killer heuristic
● History heuristic
● Null move heuristic
● Quiescence search
● Selective extensions
The various search algorithms are illustrated in a compact pseudo-C. The variables and functions used
have the following meaning:

pos - A position in a chess game.
depth - The number of levels in the tree to be searched.
Evaluate - A function that determines a value for a position as seen for the side to move. In practice such a function will be composed of the difference in material values and a large number of positional terms. Results lie between -INFINITY and +INFINITY.
best - The best value seen while searching the next level in the tree.
Successors - A function that determines the set of all positions that can be reached from a position in one move (move generation).
succ - The set of positions reachable from the input position by doing one move.

MiniMax and NegaMax


Finding the best move for some position on the chess board means searching through a tree of positions.
At the root of the tree we search for the best successor position for the player to move, at the next level
we search for the best successor position from the standpoint of the opponent, and so on. Chess tree search
is an alternation between maximizing and minimizing the value of the positions in the tree; this is often
abbreviated to minimaxing. To remove the distinction between own and opponent positions, the value of a
position is always evaluated from the standpoint of the player to move, i.e. by negating the value as seen
by the opponent; this is called negamaxing. This is illustrated by the following C-like pseudo code:

int NegaMax (pos, depth)


{
if (depth == 0) return Evaluate(pos);
best = -INFINITY;
succ = Successors(pos);
while (not Empty(succ))
{
pos = RemoveOne(succ);
value = -NegaMax(pos, depth-1);
if (value > best) best = value;
}
return best;
}
The number of positions that has to be searched by this algorithm is W^D, where W is the width of the
tree (average number of moves possible in each position) and D is the depth of the tree (^ indicates
exponentiation). This is extremely inefficient and would even hold back a supercomputer from reaching
greater depths.

Alpha-Beta search
Alpha-Beta search is the first major refinement for reducing the number of positions that has to be
searched, thus making greater depths possible in the same amount of time. The idea is that in large
parts of the tree we are not interested in the exact value of a position, but are just interested in whether it is better
or worse than what we have found before. Only the value of the position along the principal variation has
to be determined exactly (the principal variation is the alternation of best own moves and best opponent
moves from the root to the depth of the tree).
The AlphaBeta search procedure gets two additional arguments which indicate the bounds between
which we are interested in exact values for a position:

int AlphaBeta (pos, depth, alpha, beta)


{
if (depth == 0) return Evaluate(pos);
best = -INFINITY;
succ = Successors(pos);
while (not Empty(succ) && best < beta)
{
pos = RemoveOne(succ);
if (best > alpha) alpha = best;
value = -AlphaBeta(pos, depth-1, -beta, -alpha);
if (value > best) best = value;
}
return best;
}
The gain from AlphaBeta will come from the earlier exit from the while loop; a value of best that
equals or exceeds beta is called a cutoff. These cutoffs are completely safe because they mean that this
branch of the tree is worse than the principal variation. The largest gain is reached when at each level of
the tree the best successor position is searched first, because this position will either be part of the
principal variation (which we want to establish as early as possible) or it will cause a cutoff as early
as possible.
Under optimal circumstances AlphaBeta still has to search W^((D+1)/2) + W^(D/2) - 1 positions. This
is much less than MiniMax, but still exponential. It allows the search to reach about twice the depth in the same
amount of time. More positions will have to be searched if move ordering is not perfect.
Note: The version of AlphaBeta shown above is also known as fail-soft alpha-beta. It can return
values outside the range alpha...beta, which can be used as upper or lower bounds if a re-search
has to be done.

Aspiration search
Aspiration search is a small improvement on Alpha-Beta search. Normally the top level call would be
AlphaBeta(pos, depth, -INFINITY, +INFINITY). Aspiration search changes this to
AlphaBeta(pos, depth, value-window, value+window), where value is an estimate
for the expected result and window is a measure for the deviations we expect from this value.
Aspiration search will search fewer positions because it uses alpha/beta limits already at the root of the
tree. The danger is that the search result will fall outside the aspiration window, in which case a re-search
has to be done. A good choice of the window variable will still give an average net gain.
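As an illustration, an aspiration driver might look as follows, in the same compact pseudo-C as the listings above. The use of
the fail-soft return value as the bound for the re-search is one option (see the note under Alpha-Beta search), not a requirement.

int AspirationSearch (pos, depth, value, window)
{
    best = AlphaBeta(pos, depth, value - window, value + window);
    if (best >= value + window)         /* fail high: re-search with open upper bound */
        best = AlphaBeta(pos, depth, best, +INFINITY);
    else if (best <= value - window)    /* fail low: re-search with open lower bound */
        best = AlphaBeta(pos, depth, -INFINITY, best);
    return best;
}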


Transposition table
The transposition table is a hashing scheme to detect positions in different branches of the search tree
that are identical. If a search arrives at a position that has been reached before and if the value obtained
can be used, the position does not have to be searched again. If the value cannot be used, it is still
possible to use the best move that was used previously at that position to improve the move ordering.
A transposition table is a safe optimization that can save much time. The only danger is that mistakes can
be made with respect to draw by repetition of moves because two positions will not share the same move
history.
A transposition table can save up to a factor of 4 in tree size and thus in search time. Because of the
exponential nature of tree growth, this means that maybe one level deeper can be searched in the same
amount of time.
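To make this concrete, a minimal transposition-table entry and lookup might look as follows, again in compact pseudo-C. The
64-bit Zobrist-style key, the bound flag, and the power-of-two table size are common conventions assumed here, not anything
prescribed by this page.

struct TTEntry
{
    uint64 key;    /* full hash key, to detect index collisions */
    int depth;     /* depth to which the stored value was searched */
    int value;     /* stored search value */
    int flag;      /* EXACT, LOWER_BOUND or UPPER_BOUND */
    Move best;     /* best move found earlier, for move ordering */
};

struct TTEntry table[TABLE_SIZE];   /* TABLE_SIZE a power of two */

struct TTEntry *Probe (uint64 key)
{
    struct TTEntry *e = &table[key & (TABLE_SIZE - 1)];
    return (e->key == key) ? e : NULL;
}

/* A stored value is only usable if e->depth is at least the remaining
   search depth and e->flag is compatible with the current alpha-beta
   window; otherwise e->best can still improve the move ordering. */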

Iterative Deepening
Iterative deepening means repeatedly calling a fixed depth search routine with increasing depth until a
time limit is exceeded or maximum search depth has been reached. The advantage of doing this is that
you do not have to choose a search depth in advance; you can always use the result of the last completed
search. Also because many position evaluations and best moves are stored in the transposition table, the
deeper search trees can have a much better move ordering than when starting immediately searching at a
deep level. Also the values returned from each search can be used to adjust the aspiration search window
of the next search, if this technique is used.
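A driver for this is only a few lines. Here is a sketch in the same pseudo-C; TimeUp() and the reuse of the previous value as
the next aspiration estimate are assumptions for illustration.

int IterativeDeepening (pos, maxdepth, window)
{
    value = Evaluate(pos);
    for (depth = 1; depth <= maxdepth; depth++)
    {
        if (TimeUp()) break;
        value = AspirationSearch(pos, depth, value, window);
    }
    return value;
}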

Principal Variation Search


Principal variation search is a variation of alpha-beta search where all nodes outside the principal
variation are searched with a minimal window beta = alpha + 1. The idea is that with perfect move
ordering all moves outside the principal variation will be worse than the principal variation; this can be
proven by the minimal window search failing low. If the move ordering is imperfect, a fail high may be
encountered, and in such a case a re-search has to be done with the full alpha-beta window. The
expectation is that the gain of the minimal window search is higher than the loss from these occasional
re-searches.
Has this been proven somewhere?

int PrincipalVariation (pos, depth, alpha, beta)


{
if (depth == 0) return Evaluate(pos);
succ = Successors(pos);
pos = RemoveOne(succ);
best = -PrincipalVariation(pos, depth-1, -beta, -alpha);
while (not Empty(succ) && best < beta)
{
pos = RemoveOne(succ);
if (best > alpha) alpha = best;
value = -PrincipalVariation(pos, depth-1, -alpha-1, -alpha);
if (value > alpha && value < beta)
best = -PrincipalVariation(pos, depth-1, -beta, -value);
else if (value > best)
best = value;
}
return best;
}
A further refinement of this is known as NegaScout. See Alexander Reinefeld's on-line description.

Memory Enhanced Test


Memory enhanced test is a family of search algorithms that have in common that at the top level an
alpha-beta search is done with a minimal window beta = alpha+1. Differences can be found in the
sequence of alpha-beta values that is tried. Because the top level search is called repeatedly with different
alpha-beta parameters and the same depth, it is important to have a large transposition table in order to
re-use partial search results from previous searches. See [TS95c] or Aske Plaat's on-line description.

Enhanced Transposition Cutoff


Move ordering is important in tree search because it increases the chance of getting a cutoff on the first
successor position searched. This is not always optimal; there may be several successors causing a cutoff
and we want to use the one with the smallest search tree. One idea that has been tried is to look at all
successor positions and see if they are in the transposition table and cause a cutoff. If one such position is
found, no further search has to be done. This can save about 20-25% in total tree size.

Killer heuristic
The killer heuristic is used to improve the move ordering. The idea is that a good move in one branch of
the tree is also good at another branch at the same depth. For this purpose at each ply we maintain one or
two killer moves that are searched before other moves are searched. A successful cutoff by a non-killer
move overwrites one of the killer moves for that ply.

History heuristic
The history heuristic is another improvement method for the move ordering. In a table indexed by from
and to squares, statistics are maintained of good cutoff moves. This table is used in the move ordering sort
(together with other information such as capture gains/losses).
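Both heuristics amount to a little bookkeeping at the point where a cutoff occurs. Here is a sketch in the same pseudo-C; the
two-killers-per-ply convention and the depth-squared history bonus are common choices, not taken from this page.

Move killer[MAX_PLY][2];
int history[64][64];

void RecordCutoff (Move m, int ply, int depth)
{
    if (m != killer[ply][0])          /* keep two distinct killers per ply */
    {
        killer[ply][1] = killer[ply][0];
        killer[ply][0] = m;
    }
    history[From(m)][To(m)] += depth * depth;
}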


Null move heuristic


The null move heuristic is a method of skipping searches in parts of the tree where the position is good
enough. This is tested by doing a null move (i.e. passing, doing no move at all) and then searching with
reduced depth. If the result of this is higher than beta, no further search is done; if the result is lower than
beta we do a normal search.
The null move heuristic has big dangers because it can fail to detect deep combinations. On the other
hand it can save a lot of time by skipping large parts of the search tree.

Quiescence search
Instead of calling Evaluate when depth = 0, it is customary to call a quiescence search routine. Its purpose
is to prevent horizon effects, where a bad move hides an even worse threat because the threat is pushed
beyond the search horizon. This is done by making sure that evaluations are done at stable positions, i.e.
positions where there are no direct threats (e.g. hanging pieces, checkmate, promotion). A quiescence
search does not take all possible moves into account, but restricts itself e.g. to captures, checks, check
evasions, and promotion threats. The art is to restrict the quiescence search in such a way that it does not
add too much to the search time. Major debates are possible about whether it is better to have one more
level in the full width search tree at the risk of overlooking deeper threats in the quiescence search.

Selective extensions
In the full width part of the tree, search depth can be increased in forced variations. Different criteria can
be used to decide if a variation is forced; examples are check evasions, capturing a piece that has just
captured another piece, promotion threats. The danger if used carelessly is an explosion in tree size.
A special case of selective extensions is the singular extension heuristic introduced in the Deep Thought
chess program. The idea here is to detect forced variations by one successor position being significantly
better than the others. Implementation is tricky because in alpha-beta search exact evaluations are often
not available.
It is said that the commercial chess programs use fractional depth increments to distinguish the quality of
different moves; moves with a high probability of being good get a lower depth increment than moves
that seem bad. I have no direct references to this; the commercial chess programmers do not publish their
techniques.

Comments to: Paul Verhelst (verhelst@xs4all.nl)


Gamasutra - Features - Coordinated Unit Movement - Introduction

Coordinated Unit Movement
By Dave C. Pottinger
Gamasutra, January 22, 1999
Originally published in Game Developer Magazine, January 1999.

How many times have you been sitting in rush-hour traffic thinking, "Hey, I know where I want to go. And I'm sure everyone around me knows where they want to go, too. If we could just work together, I'll bet we would all get where we wanted to go a lot easier, faster, and without rear-ending each other"? As your frustration rises, you realize that impatient commuters aren't the most cooperative people. However, if you're a game player, uncooperative resource gatherers and infantry are probably even more frustrating than a real-life traffic jam. Figuring out how to get hundreds of units moving around a complex game map in real time - commonly referred to as pathfinding - is a tough task. While pathfinding is a hot industry buzzword, it's only half of the solution. Movement, the execution of a given path, is the other half of the solution. For real-time strategy games, this movement goes hand in hand with pathfinding. An axeman certainly needs a plan (as in, a path) for how he's going to get from one side of his town to the other to help stave off the enemy invasion. If he doesn't execute that plan using a good movement system, however, all may be lost.

Game Developer has already visited the topic of pathfinding in such past articles as "Smart Move: Path-Finding" by Brian Stout (October/November 1996) and "Real-Time Pathfinding for Multiple Objects" by Swen Vincke (June 1997). Rather than go over the same material, I'll approach the problem from the other side by examining the ways to execute a path that's already been found. In this article, I'll cover the basic components of an effective movement system. In a companion article in next month's Game Developer, I'll extend these basic concepts to cover higher-order movement and implementation. Though the examples in these articles focus mainly on a real-time strategy game, the methods I'll describe can easily be applied to other genres.


Movement Issues Facing Game Developers

Before we dive into coordinated unit movement, let's take a look at some of the movement issues facing game developers today. Most of these have to do with minimizing CPU load versus maximizing the accuracy and intelligence of the movement.

Moving one unit versus moving multiple units. Moving one unit is generally pretty simple, but methods that work well for one unit rarely scale up effortlessly for application to hundreds of units. If you're designing a system for hundreds of units, it will need to be very conservative in its CPU use.

Some movement features are CPU intensive. Very few games that move hundreds of units support advanced behavior such as modeling the acceleration and deceleration of these units. The movement of large ships and heavily armored units has a lot more realism with acceleration and deceleration, but that realism comes at a high cost in terms of extra CPU usage. The actual movement calculation becomes more complicated because you have to apply the time differential to the acceleration to create the new velocity. As we extend our movement system to handle prediction, we'll see that acceleration and deceleration complicate these calculations as well. Modeling a turn radius is also difficult because many pathfinding algorithms are not able to take turn radii into account at all. Thus, even though a unit can find a path, it may not be able to follow that path because of turn radius restrictions. Most systems overcome this deficiency by slowing the unit down to make a sharp turn, but this involves an extra set of calculations.

Different lengths for the main game update loop. Most games use the length of the last pass through the update loop as some indication of how much time to simulate during the next update pass. But such a solution creates a problem for unit movement systems because these lengths vary from one update to the next (see Figure 1 below). Unit movement algorithms work much better with nice, consistent simulation intervals. A good update smoothing system can alleviate this problem quite a bit.


Figure 1. Varied update lengths cause units to move differing distances each update.

Sorting out unit collisions. Once units come into contact with one another, how do you get them apart
again? The naïve solution is just never to allow units to collide in the first place. In practice, though,
this requirement enforces exacting code that is difficult to write. No matter how much code you write,
your units will always find a way to overlap. More importantly, this solution simply isn't practical for
good game play; in many cases, units should be allowed to overlap a little. Hand-to-hand combat in
Ensemble Studios' recent title Age of Empires should have been just such a case. The restriction for
zero collision overlap often makes units walk well out of their way to fight other units, exposing them
to needless (not to mention frustrating) additional damage. You'll have to decide how much collision
overlap is acceptable for your game and resolve accordingly.

Map complexity. The more complex the map is, the more complicated and difficult good movement will
be to create. As game worlds and maps are only getting more intricate and realistic, the requirement
for movement that can handle those worlds goes up, too.

Random maps or controlled scenarios? Because you can't hard-code feasible paths, random maps are
obviously more difficult to deal with in many cases, including pathfinding. When pathfinding becomes
too CPU intensive, the only choice (aside from reducing map complexity or removing random maps) is
to decrease the quality of the pathfinding. As the quality of the pathfinding decreases, the quality of
the movement system needs to increase to pick up the slack.

Maximum object density. This issue, more than anything, dictates how accurate the movement system
must be. If your game has only a handful of moving objects that never really come into contact with
one another (as is the case with most any first-person shooter), then you can get away with a
relatively simple movement system. However, if you have hundreds of moving objects that need to
have collision and movement resolution on the scale of the smallest object (for example, a unit can
walk through a small gap between two other units), then the quality and accuracy requirements of
your movement system are dramatically raised.



Simple Movement Algorithm

Let's start with some pseudo code for a simple, state-based movement algorithm (Listing 1). While this algorithm doesn't do much more than follow a path and decide to find a new path when a collision is found, it does work equally well for both 2D and 3D games. We'll start in a given state and iterate until we can find a waypoint to move towards. Once we find that point, we break out of the loop and do the movement. There are three states: WaitingForPath, ReachedGoal, and IncrementWaypoint. The movement state for a unit is preserved across game updates in order to allow us to set future events, such as the "automatic" waypoint increment on a future game update. By preserving a unit's movement state, we lessen the chance that a unit will make a decision on the next game update that counters a decision made during the current update. This is the first of several planning steps that we'll introduce.

We assume that we'll be given a path to follow and that the path is accurate and viable (meaning, no collisions) at the time it was given to us. Because most strategy games have relatively large maps, a unit may take several minutes to get all the way across the map. During this time, the map can change in ways that can invalidate the path. So, we do a simple collision check during the state loop. At this point, if we find a collision, we'll just repath. Later on, we'll cover several ways to avoid repathing.

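Listing 1 itself is not included in this reprint, so here is a hedged C++ reconstruction of the loop just described. The three
state names come from the text; the Unit interface and the extra Moving state are invented for illustration.

    enum MoveState { WaitingForPath, ReachedGoal, IncrementWaypoint, Moving };

    void UpdateUnitMovement( Unit &unit, float dt )
    {
        // Iterate the state machine until we have a waypoint to move towards.
        bool haveTarget = false;
        while( !haveTarget )
        {
            switch( unit.state )
            {
            case WaitingForPath:
                if( !unit.HasPath() ) return;            // no path yet; try again next update
                unit.state = IncrementWaypoint;
                break;
            case IncrementWaypoint:
                if( !unit.NextWaypoint() ) { unit.state = ReachedGoal; break; }
                unit.state = Moving;
                break;
            case ReachedGoal:
                return;                                   // state persists across updates
            case Moving:
                haveTarget = true;                        // found a waypoint; do the movement
                break;
            }
        }

        // Simple collision check during the state loop; for now we just repath.
        if( unit.CollisionOnPath() )
        {
            unit.RequestPath();
            unit.state = WaitingForPath;
            return;
        }

        unit.MoveToward( unit.CurrentWaypoint(), dt );
    }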


Gamasutra - Features - Coordinated Unit Movement - "Collision Determination" [01.22.99]

| | | |

Features
By Dave C. Pottinger
Gamasutra
January 22, 1999 Collision Determination
The basic goal of any collision determination system is to find out if two units have Contents
Originally collided. For the time being, we'll represent all collisions as two-entity collisions. Introduction
Published in We'll cover compound collisions (collisions involving three or more entities) next
Game Developer month. Once a collision is found, each entity needs to know about the collision in Movement Issues
Magazine, order to make appropriate movement decisions. Facing Game
January, 1999. Developers
Basic collision determination for most strategy games consists of treating all units
as spheres (circles in 2D) and doing a simple spherical collision check. Whether or Simple Movement
not such a system is sufficient depends on the specific requirements of a game. Algorithm
Even if a game implements more complex collision - such as oriented bounding
boxes or even low-level polygon to polygon intersection tests - maintaining a total Collision
Determination
bounding sphere for quick potential collision elimination will usually improve
performance. Discrete vs.
Continuous Simulation
There are three distinct entity types to take into account when designing a collision
system: the single unit, a group of units, and a formation (see Figure 2 below). Predicted Positions
Each of these types can work well using a single sphere for quick collision culling
(elimination of further collision checks). In fact, the single unit simply uses a Unit to Unit
Cooperation
sphere for all of its collision checking. The group and the formation require a bit
more work, though. Basic Planning

Basic Definitions

Letters to the Editor:


Write a letter
View all letters

Figure 2. Collision entities.

For a group of units, the acceptable minimum is to check each unit in the group for a collision. By
itself, this method will allow a non-grouped unit to sit happily in the middle of your group. For our
purposes, we can overlook this discrepancy, because formations will provide the additional, more rigid
collision checking. Groups also have the ability to be reshaped at any time to accommodate tight
quarters, so it's actually a good idea to keep group collision checking as simple as possible.

A formation requires the same checks as a group, but these checks must further ensure that there are
no internal collisions within the formation. If a formation has space between some of its units, it is
unacceptable for a non-formed unit to occupy that space. Additionally, formations generally don't have
the option to reshape or break. However, it's probably a good idea to implement some game rules that
allow formations to break and reform on the other side of an obstacle if no path around the obstacle
can be found.

For our system, we'll also keep track of the timing of the collision. Immediate collisions represent
collisions currently existing between two objects. Future collisions will happen at a specified point in
the future (assuming neither of the objects changes its predicted movement behavior). In all cases,
immediate collisions have a higher resolution priority than future collisions. We'll also track the state of
each collision as unresolved, resolving, or resolved.
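To make that bookkeeping concrete, here is a small sketch of a two-entity collision record with the quick sphere-versus-sphere test. The struct layout and names are assumptions for illustration; the article doesn't show this code.

//*****************************************************************************
// Sketch of a collision record and the basic spherical check (hypothetical).
//*****************************************************************************
//Timing and state of a collision, per the text above.
enum BCollisionState { cUnresolved, cResolving, cResolved };

struct BCollision
{
    int             unitID1;   //The two entities in the collision.
    int             unitID2;
    float           time;      //0.0f means immediate; >0.0f means future.
    BCollisionState state;
};

//Two spheres collide if the distance between their centers is less than
//the sum of their radii. Comparing squared distances avoids the sqrt.
bool unitsCollide( float x1, float y1, float z1, float r1,
                   float x2, float y2, float z2, float r2 )
{
    float dx=x2-x1, dy=y2-y1, dz=z2-z1;
    float radiusSum=r1+r2;
    return( dx*dx + dy*dy + dz*dz < radiusSum*radiusSum );
}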

Discrete vs. Continuous Simulation

Most movement algorithms are discrete in nature. That is, they move the unit from point A to point B without considering what might be between those two points, whereas a continuous simulation would consider the volume between the two points as well. In a lag-ridden Internet game, fast moving units can move quite a distance in a single game update. When discrete simulations are coupled with these long updates, units can actually hop over other objects with which they should have collided. In the case of a resource gathering unit, no one really minds too much. But players rarely want enemy units to be able to walk through a wall. While most games work around this problem by limiting the length of a unit's move, this discrete simulation problem is relatively easy to solve (see Figure 3 below).

Figure 3. Solving the problem with discrete movement simulation.

One way to solve the problem is to sub-sample each move into a series of several smaller moves.
Taking the size of the moving unit into account, we make the sampling interval small enough to
guarantee that no other unit can fit between two of the sample points. We then run each of those
points through the collision determination system. Calculating all of those points and collisions may
seem overly expensive, but later on we'll see a potential way to offset most of that cost.

Another method is to create what we'll call a move line. A move line represents the unit's move as a
line segment starting at point A and ending at point B. This system creates no extra data, but the
collision check does have an increase in complexity; we must convert from a simple spherical collision
check to a more expensive calculation that involves finding the distance from a point to a line
segment. Most 3D games have already implemented a fast hierarchical system for visible object
culling, so we can reuse that for collision culling. By quickly narrowing down the number of potential
collisions, we can afford to spend more time checking collisions against a small set of objects.
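Here is a sketch of the move line check, using a minimal stand-in for the article's BVector type (the operators below are assumptions for illustration, not the article's code):

//*****************************************************************************
// Sketch of the move line collision check (BVector is a stand-in).
//*****************************************************************************
#include <algorithm>
#include <cmath>

struct BVector
{
    float x, y, z;
    BVector operator-( const BVector& v ) const { return BVector{x-v.x, y-v.y, z-v.z}; }
    BVector operator+( const BVector& v ) const { return BVector{x+v.x, y+v.y, z+v.z}; }
    BVector operator*( float s ) const          { return BVector{x*s, y*s, z*s}; }
    float   dot( const BVector& v ) const       { return x*v.x + y*v.y + z*v.z; }
    float   length( void ) const                { return sqrtf( dot(*this) ); }
};

//Distance from point p to the segment ab - the expensive part of the check.
float distancePointToSegment( const BVector& p, const BVector& a, const BVector& b )
{
    BVector ab=b-a;
    float lenSqr=ab.dot(ab);
    if( lenSqr == 0.0f )
        return (p-a).length();         //Degenerate move: a == b.
    //Project p onto the line, clamped to the segment's endpoints.
    float t=std::max( 0.0f, std::min( 1.0f, (p-a).dot(ab)/lenSqr ) );
    BVector closest=a+ab*t;
    return (p-closest).length();
}

//A unit of radius rMove moving from a to b collides with a stationary unit
//at center c (radius rOther) if the segment passes within the radius sum.
bool moveLineCollides( const BVector& a, const BVector& b, float rMove,
                       const BVector& c, float rOther )
{
    return( distancePointToSegment( c, a, b ) < rMove+rOther );
}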

Predicted Positions
Now that we have a simple movement algorithm and a list of unit collisions, what else do we need to get decent unit cooperation? Position prediction.

Predicted positions are simply a set of positions (with associated orientations and time stamps) that indicate where an object will be in the future (see Figure 4 below). A movement system can calculate these positions using the same movement algorithm that's used to move the object. The more accurate these positions are, the more useful they are. Position prediction isn't immediately free, though, so let's look at how to offset the additional CPU usage.

Figure 4. A closer look at the predicted positions.

The most obvious optimization is to avoid recalculating all of your predicted positions at every frame. A
simple rolling list works well (see Figure 5 below); you can roll off the positions that are now in the
past and add a few new positions each frame to keep the prediction envelope at the same scale. While
this optimization doesn't get rid of the start-up cost of creating a complete set of prediction positions
the first time you move, it does have constant time for the remainder of the movement.


Figure 5. Rolling list of predicted positions.
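As a sketch of that rolling list (all names here are hypothetical, and BVector is the stand-in sketched earlier):

//*****************************************************************************
// Sketch of a rolling list of predicted positions (hypothetical names).
//*****************************************************************************
#include <vector>

struct BPredictedPosition
{
    BVector position;
    BVector orientation;
    float   time;          //Time stamp for this prediction.
};

class BPredictionList
{
public:
    //Roll off the positions that are now in the past...
    void rollOffOldPositions( float gameTime )
    {
        while( !mPositions.empty() && mPositions.front().time < gameTime )
            mPositions.erase( mPositions.begin() );
    }
    //...and add a few new positions each frame to keep the prediction
    //envelope at the same scale. predictPositionAt is assumed to use the
    //same movement algorithm that actually moves the object.
    void extendEnvelope( const BUnit& unit, float gameTime,
                         float envelopeLength, float step )
    {
        float end = mPositions.empty() ? gameTime : mPositions.back().time;
        while( end < gameTime+envelopeLength )
        {
            end+=step;
            mPositions.push_back( unit.predictPositionAt( end ) );
        }
    }
private:
    //A deque would avoid the front-erase cost; a vector keeps this simple.
    std::vector<BPredictedPosition> mPositions;
};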

The next optimization is to create a prediction system that handles both points and lines. Because our
collision determination system already supports points and lines, it should be easy to add this support
to our prediction system. If a unit is traveling in a straight line, we can designate an enclosed volume
by using the current position, a future position, and the unit's soft movement radius. However, if the
object has a turn radius, things get a little more complicated. You can try to store the curve as a
function, but that's too costly. Instead, you're better off doing point sampling to create the right
predicted points (see Figure 6 below). In the end, you really want a system that seamlessly supports
both point and line predictions, using the lines wherever possible to cut down on the CPU cost.

Figure 6. Using predicted positions with a turn radius.

The last optimization we'll cover is important and perhaps a little nonintuitive. If we're going to get this
predicted system with as little overhead as possible, we don't want to duplicate our calculations for
every unit by predicting its position and then doing another calculation to move it. Thus, the solution is
to predict positions accurately, and then use those positions to move the object. This way, we're only
calculating each move once, so there's no extra cost aside from the aforementioned extra start-up
time.

In the actual implementation, you'll probably just pick a single update length to do the prediction. Of course, it's fairly unlikely that all of the future updates will be consistent. If you blindly move the unit from one predicted position to the next without any regard to what the actual update length currently is, you're bound to run into some problems. Some games (or some subset of objects in a game) can accept this inaccuracy. Those of us developing all the other games will end up adding some interpolation so that we can quickly adjust a series of predicted points that isn't completely accurate. You also need to recognize when you're continually adjusting a series of predicted positions so that you can cut your losses and just recalculate the entire series.

Most of the rest of the implementation difficulties arise from the fact that we use these predicted
positions in collision detection just as we do for the object's actual current position. You should easily
see the combinatorial explosion that's created by comparing predicted positions for all units in a given
area. However, in order to have good coordinated unit movement, we have to know where units are
going to be in the near future and what other units they're likely to hit. This takes a good, fast collision
determination system. As with most aspects of a 3D engine, the big optimizations come from quickly
eliminating potential interactions, thus allowing you to spend more CPU cycles on the most probable
interactions.

Unit to Unit Cooperation

We've created a complex system for determining where an object is going to be in the future. It supports 3D movement, it doesn't take up much more CPU time than a simple system, and it provides an accurate list of everything we expect a unit to run into in the near future. Now we get to the fun part.

If we do our job well, most of the collisions that we must deal with are future collisions (because we avoid most of the immediate collisions before they even happen). While the baseline approach for any future collision is to stop and repath, it's important to avoid firing up the pathfinder as much as possible.

This set of collision resolution rules is a complete breakdown of how to approach the problem of unit-to-unit collision resolution (from a unit's frame of reference).

Unresolved collisions

Case 1. If both units are not moving:

1. If we're the lower-priority unit, don't do anything of our own volition.

2. If we're the higher-priority unit, figure out which unit (if any) is going to move and tell that unit to make the shortest move possible to resolve the hard collision. Change the collision state to resolving.

Case 2. If we're not moving, and the other unit is moving, we don't do anything.

Case 3. If we're moving and the other unit is stopped:

1. If we're the higher-priority unit, and the lower-priority unit can get out of the way, calculate our "get-to point" (the point we need to get to in order to be past the collision) and tell the lower-priority unit to move out of our way (see Figure 7 below). Change the collision state to resolving.
Figure 7. Resolving a collision between a moving unit and a stopped unit.
2. Else, if we can avoid the other unit, avoid the other unit and resolve the collision.

3. Else, if we're the higher-priority unit and we can push the lower-priority unit along our path, push the lower-priority unit. Change the collision state to resolving.

4. Else, stop, repath, and resolve the collision.


Case 4. If we're moving and the other unit is moving:
1. If we're the lower-priority unit, don't do anything.

2. If collision with hard radius overlap is inevitable and we're the higher-priority unit, tell the
lower-priority unit to pause, and go to Case 3.

3. Else, if we're the higher-priority unit, calculate our get-to point and tell the lower-priority unit to
slow down enough to avoid the collision.
Resolving Collisions
● If we're the unit that's moving in order to resolve a Case 1 collision and we've reached our
desired point, resolve the collision.

● If we're the Case 3.1 lower-priority unit and the higher-priority unit has passed its get-to point,
start returning to the previous position and resolve the collision.

● If we're the Case 3.1 higher-priority unit, wait (slow down or stop) until the lower-priority unit
has gotten out of the way, then continue.

● If we're the Case 3.3 higher-priority unit and the lower-priority unit can now get out of the way,
go to Case 3.1.

● If we're the Case 4.3 lower-priority unit and the higher-priority unit has passed its get-to point,
resume normal speed and resolve the collision.

One of the key components of coordinated unit movement is to prioritize and resolve disputes. Without
a solid, well-defined priority system, you're likely to see units doing a merry-go-round dance as each
demands that the other move out of its way; no one unit has the ability to say no to a demand. The
priority system also has to take the collision severity into account. A simple heuristic is to take the
highest-priority hard collision and resolve down through all of the other hard collisions before
considering any soft collisions. If the hard collisions are far enough in the future, though, you might
want to spend some time resolving more immediate soft collisions. Depending on the game, the
resolution mechanism might also need to scale based on unit density. If a huge melee battle is
creating several compound hard collisions between some swordsmen, you're better served spending
your CPU time resolving all of those combat collisions than resolving a soft collision between two of
your resource gatherers on a distant area of the map. An added bonus to tracking these areas of high
collision density is that you can influence the pathfinding of other units away from those areas.
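The dispatch for these rules ends up being fairly mechanical. Here is a sketch from the frame of reference of 'unit'; every helper call is hypothetical, and BCollision is the record sketched earlier, so treat this as an illustration of the rules rather than the article's code.

//*****************************************************************************
// Sketch of the top-level rule dispatch (hypothetical helpers throughout).
//*****************************************************************************
void resolveCollision( BUnit& unit, BUnit& other, BCollision& collision )
{
    bool weMove    = unit.isMoving();
    bool theyMove  = other.isMoving();
    bool higherPri = unit.getMovementPriority() > other.getMovementPriority();

    if( !weMove && !theyMove )                     //Case 1.
    {
        if( higherPri )
        {
            tellUnitToMakeShortestResolvingMove( unit, other );
            collision.state=cResolving;
        }
        //The lower-priority unit does nothing of its own volition.
    }
    else if( !weMove && theyMove )                 //Case 2: do nothing.
    {
    }
    else if( weMove && !theyMove )                 //Case 3.
    {
        if( higherPri && other.canGetOutOfTheWay( unit ) )
        {
            unit.calculateGetToPoint( collision );
            other.moveOutOfWayOf( unit );
            collision.state=cResolving;
        }
        else if( unit.canAvoid( other ) )
            unit.avoid( other );                   //Resolves the collision.
        else if( higherPri && unit.canPush( other ) )
        {
            unit.push( other );
            collision.state=cResolving;
        }
        else
            unit.stopAndRepath();
    }
    else if( higherPri )                           //Case 4: both moving.
    {
        if( hardCollisionInevitable( unit, other ) )
            other.pause();                         //Then handle as Case 3.
        else
        {
            unit.calculateGetToPoint( collision );
            other.slowDownToAvoid( unit );
        }
    }
}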

Basic Planning

Planning is a key element of unit cooperation. All of these predictions and calculations should be as accurate as possible. Inevitably, though, things will go wrong. One of the biggest mistakes we made with Age of Empires' movement was to make every decision within a single frame of reference. Every decision was always made correctly, but we didn't track that information into future updates. As a result, we ended up with units that would make a decision, encounter a problem during the execution of that decision, and then make a decision that sent them right back on their original path, only to start the whole cycle over again the next update. Planning fixes this cycle. We keep around the old, resolved collisions long enough (defined by some game-specific heuristic) so that we can reference them should we get into a predicament in the future. When we execute an avoidance, for example, we remember which object it is that we're avoiding. Because we'll have created a viable resolution plan, there's no reason to do collision checking with the other unit in the collision unless one of the units gets a new order or some other drastic change takes place. Once we're done with the avoidance maneuver, we can resume normal collision checking with the other unit. As you'll see next month, we'll reuse this planning concept over and over again to accomplish our goals.
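A small sketch of that planning memory, with hypothetical names throughout (the keep-time heuristic in particular is game specific):

//*****************************************************************************
// Sketch of the planning memory described above (hypothetical names).
//*****************************************************************************
const float cResolvedCollisionKeepTime=5.0f;   //Game-specific heuristic.

//Resolved collisions stick around for a while before being discarded, so
//they can be referenced if we get into a predicament in the near future.
bool shouldDiscardCollision( float timeResolved, float gameTime )
{
    return( gameTime-timeResolved > cResolvedCollisionKeepTime );
}

//While a planned avoidance is executing, skip normal collision checking
//with the avoided unit unless a new order or other drastic change occurs.
bool shouldCollisionCheck( const BUnit& a, const BUnit& b )
{
    if( a.isAvoidingUnit( b.getID() ) || b.isAvoidingUnit( a.getID() ) )
        return false;
    return true;
}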
Simple games are a thing of the past; so is simple movement. We've covered the basic components necessary for creating a solid, extensible movement system: a state-based movement algorithm, a scalable collision determination system, and a fast position prediction system. All of these components work together to create a deterministic plan for collision resolution.

Next month, we'll extend these concepts to cover higher-order movement topics, such as group
movement, full-blown formation movement, and compound collision resolution. I'll also go into more
detail about some implementation specifics that help solve some of the classic movement problems.

For Further Info


● Take a look at Craig W. Reynolds' Boids work at http://hmt.com/cwr/boids.html.
● Steven Woodcock's Game AI web site is located at http://www.cris.com/~swoodcoc/ai.html.
● Also see Patrick Winston. Artificial Intelligence, 3rd ed. (Addison-Wesley, 1993.)
After several close calls, Dave managed to avoid getting a "real job" and joined Ensemble Studios straight out of college a few years ago (just in time to do the computer-player AI for a little game called AGE OF EMPIRES). These days, Dave spends his time either leading the development of Ensemble Studios' engines or with his lovely wife Kristen. Dave can be reached at dpottinger@ensemblestudios.com.

Basic Definitions

Movement. The execution of a path. Simple movement algorithms move a unit along a path, while more complex systems check collisions and coordinate unit movement to avoid collisions and allow otherwise stuck units to move.

Pathfinding. The act of finding a path (a planned route for a unit to get from point A to point B). The algorithm used can be anything from a simple exhaustive search to an optimized A* implementation.

Waypoint. A point on a path that a unit must go through to execute the path. Each path, by definition, has one waypoint at the start and one waypoint at the end.

Unit. A game entity that has the ability to move around the game map.

Group. A general collection of units that have been grouped together by the user for convenience (usually to issue the same order to all of the units in the group). Most games try to keep all of the units in a group together during movement.

Formation. A more complex group. A formation has facing (a front, a back, and two flanks). Each unit in the formation tries to maintain a unique relative position inside the formation. More complex models provide an individualized unit facing inside of the overall formation and support for wheeling during movement.

Hard Movement Radius. A measure of the volume of a unit with which we absolutely do not allow other units to collide.

Soft Movement Radius. A measure of the volume of a unit with which we would prefer not to collide.

Movement Prediction. Using the movement algorithms to predict where a unit will be at some point in the future. A good prediction system will take acceleration and deceleration into account.

Turn Radius. The radius of the tightest circle a unit can turn on at a given speed.



Gamasutra - Features - "Implementing Coordinated Movement" [01.29.99]

| | | |

Features
By Dave C. Pottinger
Gamasutra
January 29, 1999
Implementing Coordinated Movement
Originally
Published in the Part of the fun of working in the game industry is the constant demand for technical innovations that
February 1999 issue will allow designers to create better games. In the real-time strategy (RTS) genre, most developers
of: are focusing on improving group movement for their next round of games. I'm not talking about the
relatively low-tech methods cur- rently in use. Instead, I'm referring to coordinated group movement,
where units cooperate with each other to move around the map with intelligence and cohesion. Any
RTS game developer that wants to be competitive needs to look beyond simple unit movement; only
the games that weigh in with solid coordinated movement systems will go the distance.

In this article, the second and final part of my coordinated unit movement series, we'll take a look at how to use the systems that we considered in the first article to satisfy our coordinated group movement goal. We'll also examine how we can use our coordinated movement fundamentals to solve some classic, complex movement problems. While we will spend most of our time talking about these features through the RTS microscope, they can easily be applied to other types of games.

Catching Up
Last week (Coordinated Unit Movement), we discussed a lot of the low-level issues of coordinated unit movement. While pathfinding (the act of finding a route from point A to point B) gets all of the press, the movement code (the execution of a unit along a path) is just as important in creating a positive game experience. A game can have terrific pathfinding that never fails to find the optimum path. But, if the movement system isn't up to par, the overall appearance to the players is going to be that the units are stupid and can't figure out where to go.

One of the key components to any good movement system is the collision determination system. The
collision system really just needs to provide accurate information about when and where units will
collide. A more advanced collision system will be continuous rather than discrete. Most games scale
the length of the unit movement based on the length of the game update loop. As the length of that
update loop increases, the gap between point A and point B can get pretty large. Discrete collision
systems ignore that gap, whereas continuous systems check the gap to make sure there isn't anything
in between the two points that would have created a collision with the unit being moved. Continuous
collision determination systems are more accurate and more realistic. They're more difficult to write,
though.

Another important element for coordinated unit movement is position prediction. We need to know where our units are trying to go so that we can make intelligent decisions about how to avoid collisions. Although building a fast position-prediction system presents us with a number of issues, for this article, we can assume that our collision determination system has been augmented to tell us about future collisions in addition to current collisions. Thus, each unit in the game will know which units it's currently in collision with and which units it will collide with in the near future. We presented several rules for getting two units out of collision in last month's article.

All of these elements work together to create the basis for a solid, first-order (single unit to single
unit) coordinated movement system. The core thing to keep in mind for this article is that we have an
accurate, continuous collision determination system that tells us when and where units will collide.
We'll use that collision system in conjunction with the collision resolution rules to create second order
(three or more units/groups in collision) coordination.

Group Movement
Looking at the definition of a group (see the sidebar below), we can immediately see that we need to
store several pieces of data. We need a list of the units that make up our group, and we need the
maximum speed at which the group can move while still keeping together. Additionally, we probably
want to store the centroid of the group, which will give us a handy reference point for the group. We
also want to store a commander for the group. For most games, it doesn't matter how the commander
is selected; it's just important to have one.

One basic question needs to be answered before we proceed, though. Do we need to keep the units together as they move across the board? If not, then the group is just a user interface convenience. Each unit will path and move as if the player had issued individual commands to each group member. As we look at how to improve on the organization of our groups, we can see that there are varying degrees of group movement cohesion.

Units, Groups, and Formations

Unit. A game entity that has the ability to move around the map. Players expect their units to act intelligently.

Group. A general collection of units that have been grouped together by the user for convenience (usually to issue the same order to all of the units in the group). Other than a desire to keep its units together, groups don't place any other restrictions on unit movement.

Formation. A more complex group. A formation has an orientation (a front, a back, a right flank, and a left flank). Each unit in the formation tries to maintain a unique relative position within the formation. More complex models provide an individualized unit facing within the overall formation and support for wheeling during movement.

Units in a group just move at the same speed. Usually, this sort of organization moves the group at the maximum speed of its slowest unit, but sometimes it's better to let a slow unit move a little faster when it's in a group (see Figure 1 below). Designers generally give units a slow movement speed for a reason, though; altering that speed can often create unbalanced game play by allowing a powerful unit to move around the map too quickly.

Figure 1

Units in a group move at the same speed and take the same path. This sort of organization
prevents half of the group's units from walking one way around the forest while the other half takes a
completely different route (see Figure 2 below). Later, we'll look at an easy way to implement this sort
of organization.

Figure 2

Units in a group move at the same speed, take the same path, and arrive at the same time.
This organization exhibits the most complex behavior that we'll apply to our group definition. In
addition to combining the previous two options, it also requires that units farther ahead wait for other
units to catch up and possibly allows slower units to get a temporary speed boost in order to catch up.

So, how can we achieve the last option? By implementing a hierarchical movement system, we can manage individual unit movement in a way that allows us to consider a group of units together. If we group units together to create a group object, we can store all of the necessary data, calculate the maximum speed for the group as a whole, and provide the basic decision making regarding when units will wait for other units (Listing 1).

Listing 1. BUnitGroup.
//*****************************************************************************
// BUnitGroup
//*****************************************************************************
class BUnitGroup
{
public:
BUnitGroup( void );
~BUnitGroup( void );
//Returns the ID for this group instance.
int getID( void ) const { return(mID); }
//Various get and set functions. Type designates the type of the group
//(and is thus game specific). Centroid, maxSpeed, and commander are
//obvious. FormationID is the id lookup for any formation attached to
//the group (will be some sentinel value if not set).
int getType( void ) const { return(mType); }
void setType( int v ) { mType=v; }
const BVector& getCentroid( void ) const { return(mCentroid); }
float getMaxSpeed( void ) const { return(mMaxSpeed); }
int getCommanderID( void ) const { return(mCommanderID); }
int getFormationID( void ) const { return(mFormationID); }
BOOL setFormationID( int fID );
//Standard update and render functions. Update generates all of the
//decision making within the group. Render is here for graphical
//debugging.
BOOL update( void );
BOOL render( BMatrix& viewMatrix );
//Basic unit addition and removal functions.
BOOL addUnit( int unitID );
BOOL removeUnit( int unitID );
int getNumberUnits( void ) const { return(mNumberUnits); }
int getUnit( int index );

protected:
int mID;
int mType;
BVector mCentroid;
float mMaxSpeed;
int mCommanderID;
int mFormationID;
int mNumberUnits;
BVector* mUnitPositions; //Current position of each unit in the group.
BVector* mDesiredPositions; //Where the group wants each unit to be.
};

The BGroup class manages the unit interactions within itself. At any point in time, it should be able to
develop a schedule for resolving collisions between its units. It needs to be able to control or modify
the unit movement through the use of parameter settings and priority manipulation. If your unit
system only has support for one movement priority, you'll want to track a secondary movement
priority within the group for each unit in the group. Thus, to the outside world, the group can behave
as a single entity with a single movement priority, but still have an internal prioritization. Essentially,
the BGroup class is another complete, closed movement system.
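The article doesn't show the update implementation, so here is one plausible sketch of what BUnitGroup::update might do; lookupUnit, resolveInternalCollisions, and the BVector arithmetic are all assumptions, not the original code.

//*****************************************************************************
// Sketch of BUnitGroup::update (hypothetical; not the article's code).
//*****************************************************************************
#include <cfloat>

BOOL BUnitGroup::update( void )
{
    //The group paces itself off its slowest unit, and the centroid gives
    //us a handy reference point for the group.
    mMaxSpeed=FLT_MAX;
    BVector sum={0.0f, 0.0f, 0.0f};
    for( int i=0; i < mNumberUnits; i++ )
    {
        BUnit* unit=lookupUnit( getUnit(i) );      //Hypothetical ID lookup.
        if( unit->getMaxSpeed() < mMaxSpeed )
            mMaxSpeed=unit->getMaxSpeed();
        sum=sum+unit->getPosition();
    }
    if( mNumberUnits > 0 )
        mCentroid=sum*(1.0f/mNumberUnits);

    //To the outside world the group is one entity with one movement
    //priority; internally, schedule unit-to-unit resolutions using the
    //secondary per-unit priorities.
    resolveInternalCollisions();                   //Hypothetical.
    return(TRUE);
}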

The commander of the group is the unit that will be doing the pathfinding for the group. The
commander will decide which route the group as a whole will take. For basic group movement, this
may not mean much more than the commander being the object that generates the pathfinding call.
As we'll see in the next section, though, there's a lot more that the commander can do.

Basic Unit Formation


Formations build on the group system. Formations are a more restrictive version of groups because we
have to define a very specific position for each unit within the group. All of the units must stay
together during group movement in terms of speed, path, and relative distance; it doesn't do any good
to have a column of units if there are huge gaps in that column while it's moving around the map.

The BFormation class (below) manages the definition of the desired positions (the positions and orientations that we want for each unit in the formation), the orientation, and the state of the formation. Most formations that a game uses are predefined; it's useful to make these easy to edit during development (via a text file or something else that a nonprogrammer can manipulate). We do want the ability to create a formation definition on the fly, though, so we'll take the memory hit and have each formation instance in the game maintain a copy of its own definition.

Listing 2. The BFormation Class


//*****************************************************************************
// BFormation Class
//*****************************************************************************
class BFormation
{
public:
//The three formation states.
enum
{
cStateBroken=0,
cStateForming,
cStateFormed
};
BFormation( void );
~BFormation( void );
//Accessors for the formation's orientation and state. The expectation
//is that BFormation is really a data storage class; BGroup drives the
//state by calling the set method as needed.
BVector& getOrientation( void ) { return(mOrientation); }
void setOrientation( BVector& v ) { mOrientation=v; }
int getState( void ) const { return(mState); }
void setState( int v ) { mState=v; }

//The unit management functions. These all return information for the
//canonical definition of the formation. It would probably be a good
//idea to package the unit information into a class itself.

BOOL setUnits( int num, BVector* pos, BVector* ori, int* types );
int getNumberUnits( void ) const { return(mNumberUnits); }
BVector& getUnitPosition( int index );
BVector& getUnitOrientation( int index );
int getUnitType( int index );
protected:
BVector mOrientation;
int mState;
int mNumberUnits;
BVector* mPositions;
BVector* mOrientations;
int* mTypes;
};

Under this model, we must also track the state of a formation. The state cStateBroken means that
the formation isn't formed and isn't trying to form. cStateForming signifies that our formation is
trying to form up, but hasn't yet reached cStateFormed. Once all of our units are in their desired
positions, we change the formation state to cStateFormed. To make the movement considerably
easier, we can say that a formation can't move until it's formed.

When we're ready to use a formation, our first task is to form the formation. When given a formation,
BGroup enforces the formation's desired positions. These positions are calculated relative to the
current orientation of the formation. When the formation's orientation is rotated, then the formation's
desired positions will automatically wheel in the proper direction.

To form the units into a formation, we use scheduled positioning. Each position in the formation has a
scheduling value (either by simple definition or algorithmic calculation) that will prioritize the order in
which units need to form. For starters, it works well to form from the inside and work outward in order
to minimize collisions and formation time (see Figure 3 below). The group code manages the forming
with the algorithm shown in Listing 3.

Listing 3.
Set all units' internal group movement priorities to the same low priority value.
Set state to cStateForming.
While state is cStateForming:
{
    Find the unfilled position that's closest to the center of the formation.
    If no unfilled position was found
        Set the state to cStateFormed and break out of the forming loop.
    Select a unit to fill that slot using a game-specific heuristic that:
        Minimizes the distance the unit has to travel.
        Will collide with the fewest number of other formation members.
        Has the lowest overall travel time.
    Set the unit's movement priority to a medium priority value.
    Wait (across multiple game updates) until the unit is in position.
    Set the unit's movement priority to the highest possible value. This
        ensures that subsequently formed units will not dislodge this unit.
}

Figure 3

So, now that we have all of our swordsmen in place, what do we do with them? We can start moving
them around the board. We can assume that our pathfinding has found a viable path (a path that can
be followed) for our formation's current size and shape (see Figure 4 below). If we don't have a viable
path, we'll have to manipulate our formation (we'll talk about how to do this shortly). As we move
around the map, we designate one unit as the commander of our formation. As the commander
changes direction to follow the path, the rest of our units will also change direction to match the
commander's; this is commonly called flocking.

Figure 4

We have a couple of ways to deal with direction changes for a formation: we can ignore the direction
change or we can wheel the formation to face in the new direction. Ignoring the direction change is
simple and is actually appropriate for something such as a box formation (Figure 5).


Figure 5

Wheeling isn't much more complicated and is very appropriate for something such as a line. When we want to wheel, our first step is to stop the formation from moving. After rotating the orientation of the formation, we recalculate the desired positions (see Figure 6 below). When that's done, we just go back to the cStateForming state, which causes the group code to move our units to their new positions and sets us back to cStateFormed when it's done (at which point, we can continue to move).

Figure 6
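Wheeling itself is just a rotation of the canonical offsets. Here is a sketch, assuming a 2D rotation in the XZ plane and the accessors from Listing 2; the helper is an illustration, not the article's code.

//*****************************************************************************
// Sketch of wheeling a formation (hypothetical helper).
//*****************************************************************************
#include <math.h>

void wheelFormation( BFormation& formation, float angleRadians )
{
    //Rotate each desired position (an offset from the formation's center).
    for( int i=0; i < formation.getNumberUnits(); i++ )
    {
        BVector& p=formation.getUnitPosition( i );
        float x=p.x*cosf(angleRadians) - p.z*sinf(angleRadians);
        float z=p.x*sinf(angleRadians) + p.z*cosf(angleRadians);
        p.x=x;
        p.z=z;
    }
    //Re-forming moves every unit to its new desired position; the group
    //code sets the state back to cStateFormed when it's done.
    formation.setState( BFormation::cStateForming );
}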

Advanced Formation Management


So, now we've got formations moving around the map. Because our game map is dynamic and
complex, it's possible that our planned path will be invalidated. If that happens, we'll need to
manipulate the formation in one of three ways.

Scaling unit positions. Because the desired positions are really just vector offsets within a
formation, we can apply a scaling factor to the entire formation to make it smaller. And a smaller
formation can fit through small gaps in walls or treelines (see Figure 7 below). This method works well
for formations in which the units are spread out, but it's pretty useless for formations where the units
are already shoulder to shoulder (as in a line). Scaling the offsets down in that case would just make
our swordsmen stand on top of each other, which isn't at all what we want.

Figure 7

Simple ordered obstacle avoidance. If we're moving a formation and we encounter a collision with
another game entity (either a current collision or a future collision), we can assume that our path is
still viable, with the exception of this entity being in the way. The simple solution is to find the first
place along our formation's path where it will not be in collision with the entity and reform our
formation at that spot (see Figure 8 below). Thus, the line of infantry will break, walk around the
obstacle, and reform on the other side. This solution can fall apart fairly easily, though, so it's
important to realize when the reformation position is too far along the path to be of use. If the
distance around the obstacle is far enough that it interferes with the reformation of your group, then
you should just repath your formation.

Figure 8

Halving and rejoining. While simple avoidance is a good solution, it does dilute the visual impact of
seeing a formation move efficiently around the map. Halving can preserve the visual impact of
well-formed troops. When we encounter an obstacle that's within the front projection of the formation
(see Figure 9 below), we can pick a split point and create two formations out of our single formation.
These formations then move to the rejoin position and then merge back into one formation. Halving is
a simple calculation that dramatically increases the visual appeal of formations.

Figure 9

Path Stacks
A path stack is simply a stack-based (last in, first out) method for storing the movement route information for a unit (see Figure 10 below). The path stack tracks information such as the path the unit is following, the current waypoint the unit is moving toward, and whether or not the unit is on a patrol. A path stack suits our needs in two significant ways.


Figure 10

First, it facilitates a hierarchical pathfinding setup (see Figure 11 below). Game developers are
beginning to realize that there are two distinctly different types of pathfinding: high-level and
low-level. A high-level path routes a unit around major terrain obstacles and chokepoints on a map,
similarly to how a human player might manually set waypoints for a unit. A low-level path deals with
avoidance of smaller obstacles and is much more accurate on details. A path stack is the ideal method
for storing this high- and low-level information. We can find a high-level path and stuff that into the
stack. When we need to find a low-level path (to avoid a future collision with that single tree in the big
field), we can stuff more paths onto the stack and execute those. When we're done executing a path
on the stack, we pop it off the stack and continue moving along the path that's now at the top of the
stack.

Figure 11

Second, a path stack enables high-level path reuse. If you recall, one of the key components to good
group and formation movement is that all of the units take the same route around the map. If we
write our path stack system so that multiple units can reference the same path, then we can easily
allow units to reuse the same high-level path. A formation commander would find a high-level path
and pass that path on to the rest of the units in his formation without any of them having to do any
extra work.

Structuring the path storage in this manner offers us several other benefits. By breaking up a
high-level path into several low-level paths, we can refine future low-level segments before we
execute them. We can also delay finding a future low-level path segment if we can reasonably trust
that the high-level path is viable. If we're doing highly coordinated unit movement, a path stack allows
us to push temporary avoidance paths onto the stack and have them easily and immediately
integrated into the unit's movement (see Figure 12).
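Here is a minimal sketch of such a path stack. BPath is assumed to expose waypoint accessors, and shared_ptr stands in for whatever reference scheme lets several units share one high-level path; none of this is the article's code.

//*****************************************************************************
// Sketch of a path stack (hypothetical; BPath is assumed).
//*****************************************************************************
#include <stack>
#include <memory>

class BPathStack
{
public:
    //Push a new path to execute immediately (e.g. a temporary low-level
    //avoidance path layered on top of the high-level route).
    void pushPath( std::shared_ptr<BPath> path ) { mPaths.push( PathEntry{path, 0} ); }

    //Called when the path on top is fully executed; we resume the path
    //that's now at the top of the stack.
    void popPath( void ) { if( !mPaths.empty() ) mPaths.pop(); }

    bool hasPath( void ) const { return( !mPaths.empty() ); }

    //The waypoint the unit should be moving toward right now.
    const BVector& getCurrentWaypoint( void ) const
    {
        const PathEntry& e=mPaths.top();
        return( e.path->getWaypoint( e.currentWaypoint ) );
    }

    //Advance along the current path; pop it when it's finished.
    void incrementWaypoint( void )
    {
        PathEntry& e=mPaths.top();
        if( ++e.currentWaypoint >= e.path->getNumberWaypoints() )
            popPath();
    }

private:
    struct PathEntry
    {
        std::shared_ptr<BPath> path;            //Possibly shared between units.
        int                    currentWaypoint; //Per-unit progress along it.
    };
    std::stack<PathEntry> mPaths;
};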


Figure 12

Solving A Compound Collision


For our purposes, compound collisions are defined as simultaneous collisions between more than two units. Most games will have a practical upper limit to how many units can be involved in a compound collision. Still, as soon as a collision involves more than two units, programmers generally end up writing a lot of spaghetti logic that breaks way too easily. But we can avoid that situation by reusing the movement priorities and doing some simple scheduling.

If we have a compound collision between three units (see Figure 13 below), our first task is to find the
highest-priority unit involved in the collision. Once we've identified that unit, we need to look at the
other units in the collision and find the most important collision for the highest priority unit to resolve
(this may or may not be a collision with the next highest-priority unit in the collision). Once we have
two units, we pass those two units into the collision resolution system.

Figure 13

As soon as the collision between the first two units is resolved (see Figure 14 below), we need to
reevaluate the collision and update the unit involvement. A more complex system could handle the
addition of new units to the collision at this point, but you can get good results by simply removing
units as they resolve their collisions with the original units.


Figure 14

Once we've updated the units in the collision, we go back to find two more units to resolve; we repeat
this process until no more units are involved in the collision.
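In code, that loop looks something like this sketch (the find/remove helpers are hypothetical):

//*****************************************************************************
// Sketch of the compound collision loop (hypothetical helpers).
//*****************************************************************************
#include <vector>

void resolveCompoundCollision( std::vector<BUnit*>& unitsInCollision )
{
    while( unitsInCollision.size() >= 2 )
    {
        //Find the highest-priority unit still involved.
        BUnit* highest=findHighestPriorityUnit( unitsInCollision );

        //Find the most important collision for that unit to resolve (not
        //necessarily with the next-highest-priority unit).
        BUnit* other=findMostImportantCollisionPartner( *highest, unitsInCollision );

        //Hand the pair to the normal two-unit resolution rules.
        resolveTwoUnitCollision( *highest, *other );

        //Reevaluate the collision: remove units whose collisions with the
        //original units are now resolved.
        removeResolvedUnits( unitsInCollision );
    }
}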

You can implement this system in two different areas: the collision resolution rules or the collision determination system. The collision resolution rules would need to be changed in the way in which they rank units as higher and lower priority; these rules aren't particularly difficult to change, but this modification does increase the complexity of that code. Alternatively, you can change the collision determination system so that it only generates collisions that involve two units at a time; you still have to find all of the units in a collision, though, before you can make this decision.

Solving The Stacked Canyon Problem


One of the ultimate goals of any movement system is to create intelligent movement. Nothing looks
more intelligent than a system that solves the stacked canyon problem. The stacked canyon isn't a
simple problem to solve upon first inspection, but we can reuse some simple scheduling to solve it
once we have our coordinated unit movement in place.


The first step is to identify that you have a stacked canyon problem. This step is important because it's needed to propagate the movement priority of the driving unit (the unit trying to move through the stacked canyon) through to the rest of the units. We could just let each unit ask other units to move out of the way based on its own priority, but a better solution is to use the priority of the driving unit - after all, that's the unit that we really want to get through the canyon. Identifying a stacked canyon problem can be done in a couple of ways: noticing that the driving unit will push the first unit into a second unit or looking at the driving unit's future collision list to find multiple collisions. Whichever method is used, the pushed unit should move with the same priority as the driving unit.

Once we've identified the problem, we have a fairly simple recursive execution using the coordinated
movement. We treat the first pushed unit as the driving unit for the second pushed unit, and so on.
Each unit is pushed away from its driving unit until it can move to the side. When the last unit moves
to the side, the original driving unit has a clear path by which to move through the canyon.

A nice touch is to restore the canyon units to their original states. To do this, we simply need to track
the pushing and execute the moves in the reverse order from which the units moved to the side. It's
also useful to have the movement code recognize when the driving unit is part of a group so that the
rest of the group's units can move through the canyon before the canyon units resume their original
positions.
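A sketch of that recursive push (every helper here is hypothetical, and the side-step bookkeeping for the restore pass is left to the game):

//*****************************************************************************
// Sketch of the stacked canyon push (hypothetical helpers).
//*****************************************************************************
bool pushThroughCanyon( BUnit& driver, BUnit& blocked, int drivingPriority )
{
    //Everyone in the chain inherits the driving unit's priority.
    blocked.setMovementPriority( drivingPriority );

    if( blocked.canMoveToSide() )
    {
        blocked.moveToSide();      //Record the old position for the restore pass.
        return( true );
    }

    BUnit* next=blocked.findUnitInPushDirection( driver );
    if( next == NULL )
        return( false );           //Dead end; the driving unit must repath.

    //The blocked unit becomes the driving unit for the next unit in line,
    //then steps aside itself once room opens up.
    if( !pushThroughCanyon( blocked, *next, drivingPriority ) )
        return( false );
    blocked.moveToSide();
    return( true );
}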

Tips
Optimize this general system to your game. A lot of extra computation can be eliminated or
simplified if you're only doing a 2D game. Regardless of whether you're doing a 2D or 3D game, your
collision detection system will need a good, highly optimized object culling system; they're not just for
graphics anymore.

Use different methods for high and low level pathing. To date, most games have used the same
system for both solutions. Using a low-level solution for high-level pathfinding generally results in
high-level pathfinding that's slow and not able to find long paths. Conversely, a high-level pathfinder
used for low-level pathfinding creates paths that don't take all of the obstacles into account or are
forced to allow units to move completely through each other. Bite the bullet and do two separate
systems.

No matter what you do, units will overlap. Unit overlap is unavoidable or, at best, incredibly
difficult to prevent in all cases. You're better off simply writing code that can deal with the problem
early. Your game will be a lot more playable throughout its development.

Game maps are getting more and more complex. Random maps are going to be one of the key
discriminating features in RTS games for some time to come. The better movement systems will
handle random maps and also take changing map circumstances into account.

Understand how the update affects unit movement. Variable update lengths are a necessary evil that your movement code will have to be able to handle. Use a simple update smoothing algorithm to make most of the problems go away.

Single update frames of reference are a thing of the past. It's impossible to do coordinated unit
movement without planning. It's impossible to do planning if you don't track past decisions and look at
what's likely to happen in the future. Any generalized coordinated unit movement system needs to be
able to recall past collision information and have future collision information available at all times.
Remember that minor variations during the execution of a collision resolution plan can be ignored.

No Stupid Swordsmen
Simple unit movement is, well, simple. Good, coordinated unit movement is something that we should
be working on in order to raise our games to the next level and make our players a lot happier in the
process. With these articles, we've laid the foundation for a coordinated unit movement system by
talking about topics such as planning across multiple updates and using a set of collision resolution
rules that can handle any two-unit collision. Don't settle for stupid swordsman movement again!

For Further Info


● Archer Jones. The Art of War in the Western World. Oxford University Press, 1987. ISBN
0-252-01380-8. This is a great book if you're looking for information on historical formation
usage.
● Bjorn Reese's page of pathfinding/navigation links is at http://www.imada.ou.dk/~breese/navigation.html

After several close calls, Dave managed to avoid getting a "real job" and joined Ensemble Studios straight out of college a few years ago (just in time to do the computer-player AI for a little game called AGE OF EMPIRES). These days, Dave spends his time either leading the development of Ensemble Studios' engines or with his lovely wife Kristen. Dave can be reached at dpottinger@ensemblestudios.com.



Knowing the Path

by Richard Fine

The player is presented with a new environment, having just battled his way past a horde of enemies. The new environment is a
network of corridors and rooms - and throughout the environment are enemies, bonuses, and other interactive items. The player
explores the environment, taking out the enemies, collecting the bonuses, and discovering the interactive items.
Now, take two.
The NPC is presented with a new environment, having just been generated by the engine. The new environment is a network of
corridors and rooms - and throughout the environment are enemies, bonuses, and other interactive items. The NPC heads straight for
the nearest room, because it can see from the nodegraph that there is a bonus there, and then takes a route through the environment that
avoids and evades all enemies. The framerate drops a little as the route is recalculated each step, to account for enemy movements. The
player watches in disbelief.
Sound familiar? This article is going to discuss a method for avoiding this: something I call an 'expandable path table.' It allows NPCs
to both navigate through an environment, and to explore and learn it, rather than being presented with the entire thing. It's also faster
than constantly recalculating a route.

Before we start
You'll need to know a pathfinding algorithm. I used Dijkstra's algorithm when researching - but the A* algorithm, or any other algorithm for finding a path through a network of nodes, will do fine.

The Problem

The above network represents my level. The nodes are significant points - such as junctions, rooms, or corners - and the lines represent
the routes between them. There are no weights on this network, but they could easily be added in a real situation (i.e. the length of the
routes between the nodes).
The NPC is my enemy and needs to track me through the level. I'm moving around, and so is it, so it needs to recalculate the route to
my position each step. (For this article, we will assume that the NPC and I can only be at a node at any given time - in a real situation
though, it could be assumed that a node will have a line-of-sight to all nodes that it connects to, so the NPC could move to the node
nearest to me and be able to shoot at me).
Now, while it will depend on your pathfinding algorithm, recalculating a path each step isn't fast. Especially with an algorithm such as Dijkstra's, which sometimes involves labeling the entire network. Ideally, we need a method to store all the routes between nodes beforehand.

http://www.gamedev.net/reference/articles/article1841.asp (1 of 5) [25/06/2002 3:42:47 PM]


GameDev.net - Knowing the Path

A concept
If the shortest route from node A to node D is ABCD, then the shortest route from A to C must be ABC.
I haven't yet found a case where this isn't true, and there's a simple reason why: if there were a shorter route from A to C, tacking the
link from C to D onto it would give a shorter route from A to D, contradicting the assumption that ABCD was shortest. (Ties can
produce more than one equally short route, but ABC is always among the shortest.) For Dijkstra's, the shortest distance (and therefore,
route) to each node in the network is found, and the algorithm depends on this being true. I'm not going to try and disprove it. :-)

How does this help?


As I said, we want some kind of way of pre-calculating all possible routes. Some nice data structure that we can look stuff up in - such
as a tree - or, rather, a matrix. Here's the matrix we're going to use for the problem I described earlier:
To:   A B C D E F G H
From:
 A
 B
 C
 D
 E
 F
 G
 H

That'll be nice, won't it? Look up where we are, where we're going, and get... what? The answer: the next node we need to move to. I'll
fill out the matrix for the network above, quickly.
To:   A B C D E F G H
From:
 A    - B B E E F F F
 B    A - C C A A C A
 C    B B - D D G G G
 D    E C C - E G G G
 E    A A D D - F F F
 F    A A G G E - G H
 G    F C C D F F - H
 H    F F G G F F G -

Does that make sense? Trace the route from any one node to another, quickly. For example, B to H.
B->H: Move to A
A->H: Move to F
F->H: Move to H
GOAL
That demonstrates that we've got a very simple algorithm on our hands. The matrix is easy to build, too - you just calculate the
route from the 'from' node to the 'to' node, and store the first node in the route (excluding the 'from' node itself).
This network is also a little special because there are no dead ends - that is, nodes with only one connection. If there were - and in real
situations, there would be - that row in the matrix is easy to do, as there's only one option for each cell.
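To make the lookup concrete, here's a minimal sketch in C (the article itself ships no code; the function and array names are mine). It encodes the filled matrix above as rows of characters and follows a route one lookup at a time:

/* Next-hop matrix for the eight-node example above:
   nexthop[from - 'A'][to - 'A'] is the first node on the shortest
   route from 'from' to 'to'; '-' marks from == to. */
#include <stdio.h>

static const char nexthop[8][9] = {
    "-BBEEFFF",   /* from A */
    "A-CCAACA",   /* from B */
    "BB-DDGGG",   /* from C */
    "ECC-EGGG",   /* from D */
    "AADD-FFF",   /* from E */
    "AAGGE-GH",   /* from F */
    "FCCDFF-H",   /* from G */
    "FFGGFFG-"    /* from H */
};

static void follow_path(char from, char to)
{
    while (from != to) {
        char step = nexthop[from - 'A'][to - 'A'];  /* one lookup per move */
        printf("%c->%c: Move to %c\n", from, to, step);
        from = step;
    }
}

int main(void)
{
    follow_path('B', 'H');   /* prints the same B-to-H trace as above */
    return 0;
}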
Now, what about the original problem of trying to track me? I'll show a sample 'game,' with the NPC's moves on the left, and mine on
the right. Let's say that the NPC starts at node A, and I at node C. I don't know the level very well, so I may make some bad moves
from time to time, but it demonstrates the AI.
A->C: Move to B C->D
B->D: Move to C D->E

C->E: Move to D E->F


D->F: Move to G F->H
G->H: Move to H Aargh!

In each case, all the NPC needs to do is to know my position, which it can use to look up where to move in the table. It's fast, no?
Sure, if I'd played a little better, I could have had it running around in circles for as long as I felt like. That's partly because of the level,
and partly because the bot doesn't have the option to 'rest.' If the bot stopped to rest (or, for that matter, if I did) then the outcome would
have been different. Also, if there were other goals, the bot may have chosen to take a different route to pick up a bonus on the way.
Ultimately, I guess, it's just a version of the Terminator algorithm (or 'tracker,' but I like Terminator better :-) ) applied to a network.
Still, it's useful.

Cheating
Let's return to the scenario I originally presented you with, of the NPC entering a new environment. Rather than exploring, it uses a
pre-calculated network to find the optimal route and reach its goal immediately.
The matrix method that I've just demonstrated can be extended to support learning. The method is a little more complicated, but it's still
understandable.

Learning the path


Another level:

This one's a little more complex. The blue lines represent walls.
We can build two extra tables from this that we'll need later. The first is what I'll call our VIS table - it contains information
about which nodes can 'see' which other nodes. Our bot will only 'learn' about the existence of a node when it 'sees' it.
Sight Table (T = TRUE, blank = FALSE)

To:   A B C D E F G H I J K L M N O P
From:
 A    - T T TT
 B    T - T TT
 C    - TT
 D    T - T T
 E    TTT -
 F    - T TT
 G    T T - TT
 H    TT - T T
 I    TT T -
 J    - TT T T
 K    T T - T T
 L    T TT - T
 M    TTT - T
 N    T - T
 O    TT T - T
 P    TT T T -

(Yes, it is symmetrical along the To=From line - because we're working on the premise that if A can see B, B can see A. I can't think of
a practical application where this isn't true, but if you can, kudos to you. :-) )
The second table we can build - a much smaller one - is a magnitude table. It lists the number of connections each node has. I won't
fill it in yet: we don't pre-build it, we build it as we go along.
OK. Let's say that our bot starts at node A. He knows only node A, and nothing more. His route table looks like this:

To:   A
From: A

Also, we can build the magnitude table I mentioned above:


Node Magnitude
A 0

Nice! B-) But, somehow, kinda useless. So, as our bot cannot move and has no higher-priority goal that is achievable (if he is hungry,
for example, his limited network will not have any food in it), he decides to look around.
Now, we are going to cheat a little here, but if we didn't, it'd look strange. Node A can see nodes B, L, O, and P. So, we pick the first of
those, and turn the bot to face it.
In actual fact, B, O, and P are all in a line. So, when the bot's field of view picks up node B, it will pick up nodes O and P as well.
The first move is to add the newly discovered nodes to the route table, and to update the magnitude table:
To:   A B O P
From:
 A    -
 B      -
 O        -
 P          -

Node Magnitude
A 1
B 2
O 1
P 2

Now, recalculate all new rows and columns in the route table:

To:   A B O P
From:
 A    - B B B
 B    A - P P
 O    P P - P
 P    B B O -

And that's it!


There are a couple of extra things to mention. First, it's a good idea to keep a separate, pre-calculated magnitude table for the entire
network, so you can see when a node has been 'fully explored' (by comparing its value in the bot's magnitude table to that in the map's
magnitude table). Exploration of the network is achieved by finding the nearest node that hasn't been 'fully explored' and looking at
each of its links in turn to add them to the tables.
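As a sketch of how that 'find the nearest unexplored node' decision might look (my code, not the article's; known_magnitude, map_magnitude and route_steps() are illustrative names for the tables and route-length helper described above):

/* Pick the nearest known node that hasn't been fully explored yet.
   route_steps() is assumed to count hops using the bot's route table. */
int pick_exploration_target(int current, int nodes_known,
                            const int known_magnitude[],
                            const int map_magnitude[],
                            int route_steps(int from, int to))
{
    int best = -1;
    int best_dist = 1 << 30;
    for (int n = 0; n < nodes_known; n++) {
        if (known_magnitude[n] >= map_magnitude[n])
            continue;                         /* fully explored, skip */
        int d = route_steps(current, n);
        if (d < best_dist) { best_dist = d; best = n; }
    }
    return best;   /* -1: everything the bot knows is fully explored */
}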
Incidentally, see why I said we had to cheat? If we didn't use the VIS table to look at visible nodes in turn, rather than just rotating the
bot's head and seeing what it could see, what would have happened at, say, corridor corners? The bot would have stopped, done a
complete 360-degree scan, and then continued. A human player would just see that the corridor bends to the right, and continue. Of
course, there could be access from above - which the bot would see, but the player would probably miss. Still, it's better than stopping
at each node to perform a full scan.
Secondly, what do you do when the addition of a node affects other routes? If, for example, node A was fully explored, then B, then P,
the route from node L to node J would be LABPJ. However, if J was then fully explored, the route should become LKJ. How do you
detect this?
I haven't been able to find a nice, neat pattern that can be expressed with a single rule yet. Perhaps there isn't one. The best I can come
up with is:
After calculating values for the new cells, recalculate all rows for nodes directly connected to the new node, and then recalculate all
rows for nodes with a magnitude of 3 or higher.
So, with my previous example, after adding node K to the tables and calculating the values for column K and row K, you need to
recalculate rows L and J, and then nodes B and P (which would each have magnitudes of 3, but would not have already been
recalculated).

Conclusion
Right, that's all. This method has plenty of room for expansion - off the top of my head, assigning 'goodness' values to each node - for
strategic importance, safety, proximity to health/weapons, etc - might be useful to break out of the loop situations I mentioned earlier.
Oof. I'm going to go and eat some toast, I think. Questions, comments, etc, you can email me at rfine@lycos.com, post about this in the
forums (I usually check) or catch me in #gamedev. Happy brainstorming! =]
Date this article was posted to GameDev.net: 6/18/2002


See Also: Artificial Intelligence: Pathfinding and Searching

Motion Planning Using Potential Fields


by Stefan Baert

Introduction
First of all, I'm no authority on pathfinding or motion planning. There just seems to be a lot of interest in pathfinding algorithms on the
artificial intelligence message board. Last year I did a big project on motion planning for a university course, but since almost nobody
on the net understands Dutch, I thought it might be best to translate it and let a much broader public get acquainted with the general
concepts of motion planning. In this article I'll shed some light on several techniques using potential fields. We'll focus on the
theoretical part of the algorithms and now and then I'll throw in some screenshots from a demo-application I made for the project.
Should you have more questions or ideas on the subject, my name is Stefan Baert, I live in Belgium and on the net you'll mostly find
me as 'StrategicAlliance'.

What is a potential field anyway?


Before we start we have to agree on some definitions. We'll be talking about a pixel-sized 'unit' that can freely maneuver in all 8
directions at a speed of one pixel per turn in an environment. This environment is a 2D map that has been divided into squares, sized one
pixel each. In this example we'll take a 100*100 environment and call it a 'grid'. The borders are walls that cannot be penetrated and
there are several 'objects' on the map which have to be avoided. In theory, these objects can have any form, but we'll focus on
differently sized rectangles.
Now, what we still have to do is define a potential field. In essence, you could think of it as a big matrix. Every pixel in the 'grid' we
mentioned above is represented in the matrix by a number, telling us something about the state of that pixel under the current
circumstances. Since our ultimate goal is to find a path from our starting location to our destination, it's logical to assume that we're
going to manipulate and use these numbers as the decisive factor to move in the 'grid'.
In the examples below we'll always try to make the 'potential' of our destination as low as possible, while making the starting location
as high as possible. We can then, symbolically, glide down the numbers, moving from pixel to pixel in the grid, only accessing pixels
in the grid that have a lower potential than the one our unit stands in at that moment. To avoid bumping into obstacles we'll give them a
very high potential value (higher than the highest accessible pixel in the grid) so our unit will never be tempted to enter such an object.
In the included screenshot you see a potential field indicated with colors. In this case a lower potential means a darker green. The unit
is the blue pixel in the top left corner. Our goal is the red pixel in the lower right corner. Obstacles are shown in white.

http://www.gamedev.net/reference/articles/article1125.asp (1 of 4) [25/06/2002 3:43:39 PM]


GameDev.net - Motion Planning Using Potential Fields

Technique 1 : Forward... march!


We'll now look at a few techniques. Each one will be slightly more complicated than the previous one, solving the problems of the
previous technique, but unfortunately creating different ones at the same time.
We start easy. We calculate the distance from each pixel to the destination and give this distance as the 'potential' value of that pixel.
Please note that in this case you don't have to care about the obstacles (just give them a very high value). The distance from the
destination to a pixel can be measured in a straight line. Once this is done we apply this algorithm: "Check all the neighboring pixels
of the current pixel and select the one with the lowest potential value. Move to that pixel and continue to do this until you reach your
goal or get stuck."
It can hardly be easier and this algorithm works quite well if you have few objects in the grid, but most of the time the unit will get
stuck behind an object and have no way to get back on the right track.
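As a concrete sketch of this first technique (my code, not from the article; the 100*100 grid matches the example above, and obstacles are assumed to hold a very high potential):

/* One step of the greedy descent: move to the 8-neighbour with the
   lowest potential, or report that we are stuck in a local minimum. */
#define W 100
#define H 100

int step_downhill(const int potential[H][W], int *x, int *y)
{
    static const int dx[8] = { -1, 0, 1, -1, 1, -1, 0, 1 };
    static const int dy[8] = { -1, -1, -1, 0, 0, 1, 1, 1 };
    int bx = *x, by = *y;
    for (int i = 0; i < 8; i++) {
        int nx = *x + dx[i], ny = *y + dy[i];
        if (nx < 0 || ny < 0 || nx >= W || ny >= H)
            continue;                       /* walls at the borders */
        if (potential[ny][nx] < potential[by][bx]) { bx = nx; by = ny; }
    }
    if (bx == *x && by == *y)
        return 0;                           /* stuck: local minimum */
    *x = bx; *y = by;
    return 1;
}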

Technique 2 : Filling local minima


As we saw in the first technique it is possible to get stuck. This happens when the unit enters a local minimum. In the potential field
this is represented by a pixel whose neighbors are either objects or are accessible but have a higher value, yet it is not our
destination. If you have some experience with theoretical algorithms you may have noticed that we performed a depth-first search in
the grid.
We'll improve our algorithm by using a best-first search. We'll do this by building a tree with the pixels in the grid. Our root in the tree
is the starting location. This root has the 8 directions in which our unit can travel as children. We'll take the child with the lowest value
and evaluate all the directions we can travel from here, but not the ones that are already in the tree. We then evaluate all the leaves in
the tree and repeat the process for the leaf with the lowest value. We continue to do this until we either find our destination or until we
have evaluated all leaves and cannot find another pixel to move to (which will happen if the destination is simply unreachable from our
starting location). If we are successful the path is defined by traversing the tree from the root to the leaf containing our destination.

http://www.gamedev.net/reference/articles/article1125.asp (2 of 4) [25/06/2002 3:43:39 PM]


GameDev.net - Motion Planning Using Potential Fields

If you do a graphical representation of building the tree, you can see that local minima are 'filled' until the 'bucket runs over' and we can
continue in a straight line to our goal. In the picture, the members of the tree are blue, except for the leaves, which are yellow. Obstacles
are represented in white.

Technique 3 : It is better to prevent...


We may have found a way to get out of local minima in our previous solution, but it would be better if we could just find a way to have
no local minima at all. To accomplish this we need to change the way in which we build our potential field.
Instead of directly calculating the distance between a pixel and the destination, we'll give our destination a value of zero and then apply
this algorithm: every direct (horizontal and vertical but not diagonal) neighbor of the destination gets value 1, then their direct
neighbors get 2, and so on...
We can then start off where our unit is and glide down the numbers to our goal. Since only our destination has a zero value and every
other pixel always has a neighbor with a lower value, we'll be successful whenever a path exists. This method is also known as
wavefront expansion.
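A minimal sketch of this wavefront expansion (my code; OBSTACLE and the grid size are illustrative, matching the 100*100 example):

/* Breadth-first flood from the destination: every reachable pixel
   receives its step distance to the goal; obstacles are left alone. */
#include <limits.h>

#define W 100
#define H 100
#define OBSTACLE  INT_MAX
#define UNREACHED (INT_MAX - 1)

void wavefront(int field[H][W], int goal_x, int goal_y)
{
    static int qx[W * H], qy[W * H];        /* FIFO queue of cells */
    int head = 0, tail = 0;
    static const int dx[4] = { 1, -1, 0, 0 };
    static const int dy[4] = { 0, 0, 1, -1 };

    for (int y = 0; y < H; y++)
        for (int x = 0; x < W; x++)
            if (field[y][x] != OBSTACLE)
                field[y][x] = UNREACHED;

    field[goal_y][goal_x] = 0;
    qx[tail] = goal_x; qy[tail] = goal_y; tail++;

    while (head < tail) {
        int x = qx[head], y = qy[head];
        head++;
        for (int i = 0; i < 4; i++) {       /* direct neighbors only */
            int nx = x + dx[i], ny = y + dy[i];
            if (nx < 0 || ny < 0 || nx >= W || ny >= H)
                continue;
            if (field[ny][nx] != UNREACHED)
                continue;                   /* obstacle or already set */
            field[ny][nx] = field[y][x] + 1;
            qx[tail] = nx; qy[tail] = ny; tail++;
        }
    }
}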

If you look at the picture you'll notice that local minima now get a higher value (indicated by a brighter yellow) because it's a longer
way around an object to get there than in a straight line. The path we follow is indicated by the blue line moving east and then south.

http://www.gamedev.net/reference/articles/article1125.asp (3 of 4) [25/06/2002 3:43:39 PM]


GameDev.net - Motion Planning Using Potential Fields

Technique 4 : Get away from the edges


Since we began our exploration of the potential field, we've learnt to get out of minima and even to ban them completely. But you'll
have noticed in the last screenshot that we were moving dangerously close to the edges of objects and walls. In an environment where
we might only know an approximate location of objects it would be best to stay as far away from them as possible to avoid collisions.
For this we need to enhance our algorithm from the previous technique. Instead of initiating one wavefront at the destination, we will
now start a wavefront in every pixel that is on the edge of an object. Each of these 'edge pixels' gets a unique ID and gives this ID to all
its direct neighbors. When two IDs that are sufficiently different collide, we've found the middle between two close objects. The result
will be a kind of Voronoi diagram between the objects, which we can use as a roadmap to navigate between them. All we then have
to do is give the pixels in the grid that do not belong to this roadmap a value such that our unit will get onto the roadmap as soon as
possible. For this we can use one of the techniques above.

The picture shows wavefronts coming from the edges of the objects and a purple roadmap exactly in the middle between several
objects.

Conclusion
This concludes our voyage using potential fields. The techniques described above are an interesting alternative to the A* algorithm
that seems to be very popular these days. Though potential fields require some serious calculation, their characteristic of defining a
whole grid in numbers can be an asset if your destination goals change quickly, because all you need to do is increase or decrease the
numbers in a certain region to make that path more (or less) attractive for the unit to follow.
Date this article was posted to GameDev.net: 7/15/2000


Pawn Captures Wyvern: How Computer Chess Can Improve Your Pathfinding
by Mark Brockington
Gamasutra, June 26, 2000

Editor's note: This paper was originally published in the 2000 Game Developer's Conference proceedings.
1. Introduction
Most of you with Computer Science training have probably been through the typical Artificial Intelligence lecture on search and
planning. You are shown A*, with some trivial example (so your professor doesn't get lost while doing it) which shows all of the
various parts of A*. You've also sat through the proof of why A* generates an optimal solution when it has an admissible heuristic. If
you're really lucky, you get to implement A* in Lisp or Prolog in an assignment, and solve a puzzle involving sliding tiles.

Jump ahead a few years, and you've been given the task of implementing a pathfinding algorithm for your game. You sift through your
notes, trying to remember why you need a CLOSED list, and how to translate all the car() and cdr() instructions from Lisp into
something that your lead programmer won't bring up during your next performance review. You study web sites on AI and
pathfinding, try a few enhancements, and eventually come up with a solution that behaves in a very similar manner to the A* algorithm
from your notes.

In an alternate universe, there are academics and hobbyists that concentrate on computer games of thought, such as chess, checkers
and Othello. There are regular tournaments between programs, and the two main ways to outplay your opponent and win the game
involve outsearching your opponent and having a smarter (but still computationally fast) evaluation of positions. I have heard each of
these statements while chatting with other Othello programmers during tournaments. Do they sound like anything you've heard a
programmer in your company mention?
● "I don't trust C++ to generate fast code, so I'm still using ANSI C."
● "I coded the inner loop in assembly. It took me two months of work, but it speeds up the program by 10%, so it was worth it."
● "I've had about eight hours of sleep in 72 hours, but I've improved the performance."
Computer chess programmers have been dealing with a search algorithm (cryptically called αβ, or alpha-beta) for the last 25 years.
They have a library of standard enhancements that they can use to enhance αβ and improve the performance of their program without
having to resort to learning MIPS processor machine language, or trying to acquire knowledge about what sort of positions their
program handles poorly.

Academics involved in the field often quoted the desire to beat the World Chess Champion in a game of chess to get their research
funding. However, IBM and Deep Blue brought the funding train to a screeching halt. Most have moved on to games that are
significantly harder for the computer to do well at, such as Poker, Bridge and Go. However, others realized that A* search really is not
all that different from αβ.

When we cast aside the superficial differences between the two algorithms, we quickly discover A* and αβ are actually remarkably
similar, and we can use the standard search enhancements from the typical computer chess program in your pathfinding algorithm. We
will be describing the subset of the computer chess based search enhancements that we use in our pathfinding code at BioWare.

Section 2 will quickly review the standard A* algorithm (so you do not have to dig out your AI lecture notes again). Section 3 will
discuss the anatomy of a computer chess search algorithm, and Section 4 shows you how to put the search enhancements into A*.
2. A Really Brief Review of A*

A* [Hart 1968, Nilsson 1971] is one of the preferred methods of dealing with the pathfinding problem.
A* search starts with the initial state in a main data structure known as the OPEN list. The CLOSED list
represents the positions that the algorithm has already examined, and is initially empty. For each node
within the OPEN and CLOSED lists, A* maintains two heuristic values: g(n), the best-known minimum
cost, and h(n), the estimate of the cost to a goal state. Thus, the best node to examine at any point in
the algorithm has the lowest estimated cost: f(n) = g(n) + h(n).
The A* algorithm is an iterative process. In each step, A* takes the best state s from the OPEN list and
moves it to the CLOSED list. The successors of the best state, si, are generated and examined in turn.
If a successor si does not appear in either the OPEN or CLOSED list, then si is added to the OPEN list.
However, if si already appears in either list, we must check to see if the minimum cost g(n) has
decreased. If g(n) decreases, the node si must be deleted from its current location and reinserted into
the OPEN list.
The heuristic h(n) is critical for the performance of the A* algorithm. h(n) is said to be admissible if
the heuristic never overestimates the cost of travelling to the goal state. This is important because if
h(n) is admissible, A* is guaranteed to generate the least cost or optimal solution the first time the
goal node is generated. In the case of the typical pathfinding algorithm, h(n) is the straight line
distance between the current point n and the target point.

Figure 1. A Start and Goal Position For The Sliding-Tile Puzzle.

Some of the performance information referenced in this paper refers to the sliding-tile puzzle instead
of pathfinding, since this has been the most popular test in academic circles for studying A*. An
example of the sliding-tile puzzle can be found in Figure 1. In the sliding-tile puzzle, the Manhattan
distance (the sum of the vertical and horizontal displacements of each tile from its current square to
its goal square) is an admissible and effective heuristic for use in A* search.
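As a quick illustration (mine, not the paper's), the Manhattan distance for the 15-puzzle can be computed like this, assuming board[sq] holds the tile on square sq (0 for the blank) and goal_pos[tile] holds each tile's goal square:

#include <stdlib.h>

int manhattan(const int board[16], const int goal_pos[16])
{
    int h = 0;
    for (int sq = 0; sq < 16; sq++) {
        int tile = board[sq];
        if (tile == 0)
            continue;                     /* the blank doesn't count */
        int g = goal_pos[tile];
        h += abs(sq % 4 - g % 4)          /* horizontal displacement */
           + abs(sq / 4 - g / 4);         /* vertical displacement */
    }
    return h;
}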
3. The Anatomy Of A Chess Program

Now that we have quickly reviewed A*, let us deal with a computer chess search algorithm.
Games such as chess, checkers and Othello belong to a broad group of games called two-player
zero-sum games with perfect information. Zero-sum implies that if one player wins, the other player
loses. Perfect information implies the entire state of the game is known at any time. Scrabble has
hidden tiles, and is defined as a game of imperfect information.
Two-player zero-sum games with perfect information are well known to game theoreticians [von Neumann 1944]. In any position for
a game in this category, an optimal move can be determined via the minimax algorithm which, for a game like chess, involves a matrix
that has been estimated to contain more entries than there are molecules in our entire planet! However, all hope is not lost, since there
are alternative formulations of the minimax algorithm that involve searching a game tree.
3.1 Game Trees And Minimax Search
The root of a game tree represents the current state of the game. Each node in the tree can have any number of child nodes. Each child
of a node represents a new state after a legal move from the node's state. This continues until we reach a leaf, a node with no child
nodes, in the game tree. We assign a payoff vector to each leaf in the game tree. In a generalized game tree, this payoff vector
represents the utility of the final position to both players. In general, winning a game represents a positive utility for a player, while
losing a game represents a negative utility. Since the game is a two-player zero-sum game, the utility for the first player must equal the
negative of the utility for the second player. The utility for the side to move at the root of the tree is usually the only one given, to save
space.

In Figure 2, an example of a game tree for a game of Naughts and Crosses (or Tic-Tac-Toe) is given. Note that the two players take
alternating turns at different levels of the tree. X moves at the root, while the opponent, O, moves at the first level below the root. A
position is normally categorized by how many levels down in the game tree it is located. The common term for this is ply. The root is
said to be at ply 0, while the immediate successors of the root are said to be at ply 1, et cetera.

Figure 2. Example Of Naughts and Crosses Game Tree.
Naughts and Crosses, like chess and checkers, has only three possible outcomes for a player: win, loss
or draw. Normally, we assign the payoff of +1, 0 and -1 to a win, draw or loss for the player to move,
respectively. These payoffs are given in Figure 2 at the bottom of each leaf position, with respect to
the player with the crosses.

We will give names to each player to simplify our discussion. Let us call the player to move in the
initial position Max and the opponent Min. At each node in the tree where Max has to move, Max would
like to play the move that maximizes the payoff. Thus, Max will assign the maximum score amongst
the children to the node where Max makes a move. Similarly, Min will minimize the payoff to Max,
since that will maximize Min's payoff. The maximum and minimum scores are taken at alternating
levels of the tree, since Max and Min alternate turns.
In this way, all nodes in the tree can be assigned a payoff or minimax value, starting from the leaves
of the tree and moving up the tree towards the root. In Figure 3, we give minimax values for all nodes
in our Naughts and Crosses game tree (Figure 2). These minimax values tell us what the best possible
outcome for Max is in any position within the game tree, given that Min will do its best to foil Max's
plans.

Once the root of the game tree has been assigned a minimax value, a best move for Max is defined as a move which leads to the same
minimax value as the root of the tree. We can trace down the tree, always choosing moves that lead to the same minimax value. This
path of moves gives us an optimal line of play for either player, and is known as a principal variation. Note that in our game of
Naughts and Crosses, the side playing the Crosses will draw the game, but only if an X is played in the lower central square. Playing to
either square in the top row can lead to a loss for the Crosses, if the opponent plays the best move.

Figure 3. Naughts and Crosses Game Tree With Minimax Values.

To compute the minimax value of a position, we can use any algorithm that searches the whole game tree. A depth-first search uses
less memory than a best-first or breadth-first tree search algorithm, so it is preferred in current game-tree search programs. In Listing 1,
we show two C functions which are the basis of a recursive depth-first search of a game tree. By calling Maximize with a position p,
we will get the minimax value of position p as the output of the function after the entire game tree has been searched.

In Listing 1, we have left out some of the details. For example, we have not defined what a position is,
since this is game-dependent. There are three additional functions that would be required to
implement the minimax search: (1) EndOfGame, which determines whether the game is over at the
input position, returning TRUE if the game is over; (2) GameValue, which accepts a position as a
parameter, determines who has won the game, and returns the payoff with respect to the player Max;
and (3) GenerateSuccessors which generates an array of successor positions (p.succ[]) from the input
position, and returns the number of successors to the calling procedure.
Note that Maximize() and Minimize() recursively call one another until a position is reached where the
EndOfGame() function returns TRUE. As each successor of a node is explored, gamma maintains the
current assessment of the position, based on all of the moves that have been searched so far. Once all
successors have been examined, the minimax value for that position has been computed and stored in
gamma, which can be returned to a higher level within the tree (please refer to Listing 1).
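Listing 1 itself is not reproduced in this archive; the following is a minimal sketch of what such a pair of C functions might look like, given the three game-dependent helpers described above (the pointer-array successor representation is my own illustrative choice):

#define MAX_MOVES 64
#define INF_SCORE 1000000

typedef struct Position Position;   /* game-dependent, assumed */

int EndOfGame(const Position *p);                      /* game over? */
int GameValue(const Position *p);                      /* payoff w.r.t. Max */
int GenerateSuccessors(const Position *p, const Position *succ[]);

int Minimize(const Position *p);

int Maximize(const Position *p)
{
    if (EndOfGame(p))
        return GameValue(p);
    const Position *succ[MAX_MOVES];
    int n = GenerateSuccessors(p, succ);
    int gamma = -INF_SCORE;             /* best score seen so far */
    for (int i = 0; i < n; i++) {
        int value = Minimize(succ[i]);
        if (value > gamma)
            gamma = value;              /* Max takes the maximum */
    }
    return gamma;
}

int Minimize(const Position *p)
{
    if (EndOfGame(p))
        return GameValue(p);
    const Position *succ[MAX_MOVES];
    int n = GenerateSuccessors(p, succ);
    int gamma = INF_SCORE;
    for (int i = 0; i < n; i++) {
        int value = Maximize(succ[i]);
        if (value < gamma)
            gamma = value;              /* Min takes the minimum */
    }
    return gamma;
}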

The minimax algorithm can also determine which move yields the score gamma, and return that up
the tree as well. However, there is only one place we are interested in the move choice: the root of the
game tree. We could write a special version of Maximize that returns a best move and the minimax
value.

This formulation requires exactly the same amount of work as the matrix formulation did, but further pruning can be done on this tree.
The αβ algorithm [Knuth 1975] improves on the typical minimax algorithm by passing down bounds throughout the tree, and can
prune off branches that can be shown to have no relevance on the minimax value of the game tree. With αβ, one can search the optimal
number of terminal positions required to determine the minimax value (if one always searches the best move first at every node)! In
practice, that doesn't happen (why would you need to search if you already knew the best move?), so there are variants on αβ such as
NegaScout [Reinefeld 1983] and MTD(f) [Plaat 1996] that have been shown to be significantly better than αβ in practice.

However, it would still take a modern computer millions of years to evaluate the full game tree for the
game of chess if one had to go all the way to the terminal nodes. How can we control the size of the
search?

3.2 Iterative Deepening
One method that most practitioners employ is to search the tree to a fixed depth, k ply from the root node, and use the approximate
minimax value at that level. However, the nature of the pruning algorithms (such as NegaScout and MTD(f)) yields game trees that
can vary widely in size at the same nominal depth. Computer chess has real time limits, and if one exceeds those time limits, the game
is lost, so having an algorithm that can generate a rational decision at any time is very important. Thus, a technique called iterative
deepening [Scott 1969] is used.

The idea is that the αβ algorithm should be limited to exploring a small search depth k by forcing evaluations of nodes once they reach
that depth. Once that search is done, the limit k can be moved forward by a step s, and the search can be repeated to a depth of k+s. In
chess programs, k and s usually equal 1. Thus, the program does a 1-ply search before doing a 2-ply search, which occurs before the
3-ply search, et cetera.
Scott noted that there is no way of predicting how long an αβ search will take, since it depends heavily
on the move ordering. However, by using iterative deepening, one can estimate how long a (k+1)-ply
search will take, based on the length of the preceding k-ply search. Unfortunately, the prediction may
be far off the accurate value. In some cases, a real time constraint (such as a time control in a chess
game) may necessitate aborting the current search. Without iterative deepening, if a program has not
finished a search when the time constraint interrupts the search, the program may play a catastrophic
move. With iterative deepening, we can use the best move from the deepest search that was
completed.
Other benefits were explored by Slate and Atkin in their Chess 4.5 program [Slate 1977]. They discovered that there were many
statistics that could be gathered from a search iteration, including the principal variation. The principal variation of a k-ply search is a
good starting place to look for a principal variation of a (k+1)-ply search, so the principal variation from the k-ply search is searched
first at depth (k+1). This improves the ordering of the moves in the (k+1)-ply search. Usually, the number of bottom positions explored
for all of the searches up to depth d with iterative deepening is significantly smaller than attempting a d-ply search without iterative
deepening.
3.3 Transposition Tables

Specific information about a search can be saved in a transposition table [Greenblatt 1967]. In the minimax algorithm given in Listing
1, all of the information about a node can be accumulated, including the best score, the best move from that position, and the depth to
which it was searched. All of this information is commonly stored in one transposition table entry. Transposition tables are normally
constructed as closed hash tables, with hashing functions that are easy to update (such as a number of XOR operations) as one
traverses the tree. The transposition table information can be used in two main ways: duplicate detection and move ordering.
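The paper does not spell the hash function out; one common scheme consistent with "a number of XOR operations" is Zobrist hashing, sketched below with illustrative sizes:

/* Zobrist-style incremental hashing: a random bitstring per
   (square, piece) pair; moving a piece is two XORs, not a rehash. */
#include <stdint.h>
#include <stdlib.h>

#define SQUARES 64
#define PIECES  12   /* 6 piece types x 2 colours, chess-style */

static uint64_t zobrist[SQUARES][PIECES];

static uint64_t rand64(void)
{
    /* stitch a 64-bit value from rand(); fine for a demo */
    return ((uint64_t)rand() << 32) ^ (uint64_t)rand();
}

void init_zobrist(void)
{
    for (int sq = 0; sq < SQUARES; sq++)
        for (int pc = 0; pc < PIECES; pc++)
            zobrist[sq][pc] = rand64();
}

uint64_t hash_after_move(uint64_t h, int piece, int from, int to)
{
    h ^= zobrist[from][piece];   /* remove piece from its old square */
    h ^= zobrist[to][piece];     /* place it on the new square */
    return h;
}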

Why would we need to detect duplicates in a game tree? In reality, the game tree is a graph; some of
the positions appear in multiple places within the tree. Thus, it makes sense that each position should
only be explored once if the information obtained is sufficient to terminate the search. The
transposition table assists in finding and eliminating these duplicated positions.
The same position in the game will always hash to the same location in the transposition table. What if
the information stored in the table is the same position as the current node, and the stored result of a
search of that position is at least as deep as the search we are attempting to execute? If we have an
exact minimax value in the hash table for a search that is at least as deep as the one to be executed,
we can use the result from the hash table and prune the entire search.
Most of the time, the duplicate detection will fail to completely eliminate the search, and we can
exploit the transposition table to improve our move ordering. In the games we are studying, the best
move from a previous search depth is likely to be the best move at the current search depth. Thus, we
can obtain the previous best move from the transposition table, and search the previous best move
before all others. In general, the move ordering benefits of combining iterative deepening and the
transposition table are at least as important to the node count as the duplicate detection property,
depending on the application chosen.
3.4 The History Heuristic

The transposition table only offers move ordering information about a single move in the move list.
The history heuristic [Schaeffer 1989] is a useful technique for sorting all other moves. In the game of
chess, a 64 by 64 matrix is used to store statistics. Each time a move from a square startsq to a
square endsq is chosen as a best move during the search, a bonus is stored in the matrix at the
location [startsq,endsq]. The size of this bonus depends on the depth at which the move was
successful. A bonus that varies exponentially based on the depth of the subtree under that position
has been found to work well in practice. Moves with higher history values are more likely to be best
moves at other points in the tree; thus, moves are sorted based on their current history values. This
makes a dynamic ordering for all possible legal moves in cases where no ordering information exists.
In the programs that the author is aware of, both move ordering techniques are used. The
transposition table move is always used first, since it yields specific information about that node from a
previous search. Once the transposition table move has been searched, the remaining moves are
sorted by the history heuristic.
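A sketch of that bookkeeping (my code; the exponential bonus is one choice consistent with the text):

/* 64x64 history matrix as described above. */
static long history[64][64];

/* Credit a move that proved best; the bonus grows exponentially
   with the depth of the subtree under that position. */
void history_bonus(int startsq, int endsq, int depth)
{
    history[startsq][endsq] += 1L << depth;
}

/* Move ordering: try the transposition-table move first, then sort
   the remaining moves by history[startsq][endsq], highest first. */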
3.5 Another Method Of Searching A Game Tree - SSS*

Stockman [1979] introduced the SSS* algorithm, a variant to the depth-first search algorithms for determining the minimax value.
Initially, it was believed that the algorithm dominated αβ in the sense that SSS* will not search a node if αβ did not search it. A
problem with SSS* is that a list structure (the OPEN list) must be maintained, which could grow to b^(d/2) elements, where b is the
branching factor and d is the depth of the tree to be searched. At the time, this space requirement was considered to be too large for a
practical chess-playing program. Even if the space requirement was not a problem, maintaining the OPEN list slowed down the
algorithm, making it slower than αβ in practice.

Although versions of SSS* eventually managed to become faster than αβ for game trees [Reinefeld 1994a], it has been recently
discovered that SSS* can be implemented as a series of null-window αβ calls, using a transposition table instead of an OPEN list
[Plaat 1996]. The research showed that the drawbacks of SSS* are not true. However, it is also important to note that the benefits also
disappear: SSS* is not necessarily better than αβ when dynamic move reordering is considered. When all of the typical enhancements
are used, SSS* can be outperformed by NegaScout and MTD(f).
In game-tree search, a depth-first search algorithm generates results faster than a best-first search
algorithm. A* is also a best-first search algorithm. Is there a better single-agent search algorithm than
A*, that uses a depth-first iterative deepening formulation?

4. Reimplementing A*
The first stage of the plan is to reimplement A* as a depth-first search algorithm. The second stage is to implement the two move
ordering techniques that we described in Section 3: transposition tables and the history heuristic.

4.1 IDA*

Korf [1985] was the first researcher to emphasize why one would want to use depth-first iterative deepening in any search framework,
including heuristic searches such as A*.

The implementation will be described first. In Listing 2, we see some sample code that could be used to implement IDA*. There are
similarities between
this formulation and Maximize (from Listing 1). This implementation, like Minimax, is simple for a
search domain that has discrete steps. For each position p, we check to see if we are at a goal node,
or if the search should be cut off immediately. We use the typical condition from A* search, provided
that we have an admissible heuristic for estimating the distance to the goal. If the depth (number of
steps taken, g in the code and referred to as g(p) in the literature) plus the heuristic estimate to the
goal (h or h(p)) is greater than the cut off for this search (fmax), then we stop searching at this node.
If we are not at a goal node, nor should the search be pruned, we search all successors recursively. It
is important to note that we increase the distance traveled by a step at each ply of the tree.
How does one determine fmax? A second routine usually sets this value, calling IDA* over and over
again until a goal node is found. In our sample, ComputeNumberOfMoves() is a sample driver routine
for IDA*, and can be seen in Listing 3. When we pass a position into the
ComputeNumberOfMoves() function, we expect to get the minimum number of steps required to reach
a goal node. The algorithm starts with fmax = h(startposition), and then calls the IDAStar() function,
incrementing fmax until the IDAStar() function returns TRUE.
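Listings 2 and 3 are likewise not reproduced here; a minimal sketch of the scheme just described, with Position, Goal(), H() and GenerateSuccessors() as assumed game-dependent pieces, might look like this:

#include <stdbool.h>

#define MAX_SUCC 8

typedef struct Position Position;
bool Goal(const Position *p);
int  H(const Position *p);                 /* admissible estimate h(p) */
int  GenerateSuccessors(const Position *p, Position succ[]);

static bool IDAStar(const Position *p, int g, int fmax)
{
    if (Goal(p))
        return true;
    if (g + H(p) > fmax)                   /* the usual A* cutoff */
        return false;
    Position succ[MAX_SUCC];
    int n = GenerateSuccessors(p, succ);
    for (int i = 0; i < n; i++)
        if (IDAStar(&succ[i], g + 1, fmax))  /* one step per ply */
            return true;
    return false;
}

int ComputeNumberOfMoves(const Position *start)
{
    int fmax = H(start);                   /* start at h(startposition) */
    while (!IDAStar(start, 0, fmax))
        fmax++;                            /* deepen the bound and retry */
    return fmax;
}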
The idea of IDA* is not new, and it should not be new to you either. A previous article in Game Developer [Stout 1996] mentions this
in passing as something you may want to consider to improve the speed of your pathfinding.
Now, you are probably wondering whether IDA* is any worse than A*. After all, A* only expands each node once during a search,
while IDA* expands the nodes near the top of the tree many times. A* and
IDA* have been shown mathematically to search c * b^d nodes, where b is the typical number of
alternatives at each node, d is the depth to the closest solution and c is a constant. The only difference
is that the constant c is a little larger for IDA*.

IDA* is a little slower, but what do you gain? Well, have you seen any mention of sorting OPEN
positions on a list, or inserting entries into the CLOSED list? When you use a depth-first iterative
deepening approach, you don't have to store either list. IDA* uses O(d) memory instead of A*, which
uses O(b^d) memory. This makes IDA* a good choice where memory is at a premium. Also note that
because you have very little state information during a search, IDA* is very easy to save and restore if
the AI time slice is up.
4.2 Transposition Tables

If you are using IDA*, you have lost the CLOSED list. Unlike the OPEN list, the CLOSED list has other
functions. The primary function of the CLOSED list in A* is the ability to detect duplicate positions
within the tree. If the same node is reached by two separate paths, IDA* will blindly search through
the node both times. When the first path to the node is shorter than the second path, we have wasted
search effort. We would like a technique that allows us to detect duplicates, and store information
about the previously attempted depth at a given node. Thus, we want to apply transposition tables to
IDA*.
A transposition table in a computer chess program is implemented as a closed hash table where older
and less relevant data can be overwritten with newer and more relevant data. By taking the position p
and computing a hash function hash(p), one can store the fact that the node was reached in g steps
and searched to a total path of size f unsuccessfully. Whenever a node is first examined in IDA*, we
can quickly look in the table and determine whether or not the node has been previously searched. If
it has, we can compare the stored number of steps to reach p and the current number of steps taken
to reach p. If the current path to reach p is longer than the stored path, we do not search the
successors of this node. This information about g for various positions is also stored in between
iterations of fmax. Whenever we reach the position by a non-optimal path, one can immediately
eliminate the search for all future iterations.
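A sketch of such a table entry and the pruning test (my names; the replacement rule is the 'store the lowest g(p)' scheme discussed below):

#include <stdint.h>

#define TT_SIZE (1u << 16)   /* illustrative size */

typedef struct {
    uint64_t key;   /* full position hash; 0 marks an empty slot */
    int g;          /* fewest steps found so far to reach the node */
    int f;          /* deepest cost bound searched unsuccessfully */
} TTEntry;

static TTEntry table[TT_SIZE];

/* Replacement: prefer nodes near the root, i.e. keep the lowest g. */
void tt_store(uint64_t key, int g, int f)
{
    TTEntry *e = &table[key % TT_SIZE];
    if (e->key == 0 || g <= e->g) {
        e->key = key; e->g = g; e->f = f;
    }
}

/* Prune if this node was already reached by a strictly shorter path. */
int tt_prune(uint64_t key, int g)
{
    const TTEntry *e = &table[key % TT_SIZE];
    return e->key == key && e->g < g;
}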
We can use the transposition table not only to detect duplicate positions, but also to tell the
algorithm which successor had the most promise during a previous iteration for each position. One
measure of promise is the successor leading to the smallest h value during the current level of search.
The stored move is searched first on subsequent iterations, in the same manner that we search the
stored move from the transposition table first in a computer chess program.
For the typical pathfinding algorithm, depending on the structure of your nodes, you may not need a
large hash table. One can derive a large portion of the benefit by having a CLOSED list that is only
2-5% of the size of the typical CLOSED list generated by an A* search on the same position.
A key component of the transposition table is the entry replacement scheme. To replace entries, one
would overwrite an entry in the hash table if the node in the hash table has a longer path from the
start position than the node we want to write in (store the lowest g(p) instances). We want to do this
because cutoffs higher up in the tree save more nodes than cutoffs lower down in the tree. Other
recent research in the computer chess community has dealt with two-stage replacement schemes for
transposition tables. In one experiment [Breuker 1996], half of the transposition table was reserved
for the nodes closest to the root, and the other half of the transposition table was reserved for the
most recently visited nodes regardless of depth. This yielded an improvement to search size at a
minimal performance cost.
How much does adding a transposition table to IDA* save us? Experiments have shown that a
transposition table can reduce the size of standard 15-puzzle searches by nearly 50% [Reinefeld
1994b], with the cost of storage and access being O(1), in comparison to O(log n) searches on a data
structure for a CLOSED list that doesn't lose information.

4.3 The History Heuristic
We usually sort moves in IDA* based on a static move ordering: the moves that have the lowest h values are searched first. Can we do
better with a dynamic move ordering? For IDA*, move ordering doesn't really do us a lot of good, except in the last (and largest)
iteration, where we find the solution. If we can find the solution early in the last iteration, we can avoid searching a large number of
nodes. Thus, we want to acquire information over the course of a search that will help us bias the last search towards the solution.

A sliding-tile experiment [Reinefeld 1994b] gave a description for a history heuristic for the 15-puzzle. The history heuristic was
stored in a 3-dimensional array. The three dimensions were the tiles (16) in each position (16) moving in each direction (4). To install
information, the experiment counted the number of times that a move led to the deepest subtree (i.e. attained the smallest h value for
an examined node within its subtree). The experiment met with some success, as the IDA* algorithm searched approximately 6%
fewer nodes when the history heuristic was used versus the version that used a static move ordering.

We could use both the static move ordering and the dynamic information gathered by the history heuristic to generate a hybrid
heuristic for ordering successors. This type of hybrid heuristic could improve the ordering of moves more than either technique in
isolation.
5. Conclusions

We have implemented the techniques described above, and we are currently using them to plot paths
for our creatures in our soon-to-be-released role-playing game, Neverwinter Nights.
There are many caveats to using these techniques, and it is important to be able to understand the drawbacks. The speed improvements
that these techniques yield will vary depending on your application (they vary dramatically when implementing them in chess, Othello
and checkers programs!) ... but you now have some new enhancements that can help you search more efficiently.

To summarize the utility of adding standard enhancements to search algorithms, let us examine another problem: finding push-optimal
solutions for Sokoban problems. If you have never seen the game Sokoban, a picture of one of the 90 positions is given in Figure 4.
The goal is for the little worker to push all of the round stones into the goal squares (the goal squares are shaded with diagonal lines).
On the surface, this may seem as easy as pathfinding, and an easy application for A*. However, all pathfinding "mistakes" are undoable
by retracing the path. One wrong push of a stone could leave you in a state where you are unable to complete the task. Thus, the need
to plan the path of all stones to the goal squares is paramount.


Figure 4. Puzzle 1 from Sokoban.

IDA* is incapable of solving any of the puzzles with 20 million nodes searched. If we enhance IDA*
with the transposition table and the move ordering techniques, 4 of the puzzles can be solved
[Junghanns 1997]. If we search one billion nodes, only 6 of the 90 puzzles can be solved using IDA*,
transposition tables and move ordering. If we use all of the domain-dependent techniques the
researchers developed (including deadlock tables, tunnel macros, goal macros, goal cuts, pattern
search, relevance cuts and overestimation), the program Rolling Stone can solve 52 of the 90
problems within the billion node limit for each puzzle [Junghanns 1999]. Pathfinding is a relatively
trivial problem in comparison to finding push-optimal solutions for Sokoban puzzles, and I am happy to
say my bosses at BioWare haven't asked me to solve Sokoban in real time.
There's a lot of very good academic information on single-agent search (including a special issue of the
journal Artificial Intelligence later this year which will be devoted to the topic), and I would encourage
everyone to look up some of these references. If you have any further questions on any of the
reference material, please feel free to e-mail me.

Mark Brockington is the lead research scientist at BioWare Corp. His email address is markb@bioware.com.

References

[Breuker 1996] D. M. Breuker, J. W. H. M. Uiterwijk, and H. J. van den Herik. Replacement Schemes
and Two-Level Tables. ICCA Journal, 19(3):175-179, 1996.
[Greenblatt 1967] R. D. Greenblatt, D. E. Eastlake, and S.D. Crocker. The Greenblatt Chess Program.
In Proceedings of the Fall Joint Computer Conference, volume 31, pages 801-810, 1967.
[Hart 1968] P. E. Hart, N. J. Nilsson, and B. Raphael. A Formal Basis for the Heuristic Determination of
Minimum Cost Paths. IEEE Transactions on Systems Science and Cybernetics, SSC-4(2):100-107,
1968.
[Junghanns 1997] A. Junghanns and J. Schaeffer. Sokoban: A Challenging Single-Agent Search
Problem, Workshop Using Games as an Experimental Testbed for AI Research, IJCAI-97, Nagoya,
Japan, August 1997.
[Junghanns 1999] A. Junghanns. Pushing the Limits: New Developments in Single-Agent Search, PhD
Thesis, Department of Computing Science, University of Alberta, 1999. URL:
http://www.cs.ualberta.ca/~games/Sokoban/papers.html

[Knuth 1975] D. E. Knuth and R. W. Moore. An Analysis of Alpha-Beta Pruning. Artificial Intelligence,
6(3):293-326, 1975.
[Korf 1985] R. E. Korf. Depth-First Iterative Deepening: An Optimal Admissible Tree Search. Artificial
Intelligence, 27:97-109, 1985.
[Nilsson 1971] N. J. Nilsson. Problem-Solving Methods in Artificial Intelligence. McGraw-Hill Book
Company, New York, NY, 1971.
[Plaat 1996] A. Plaat, J. Schaeffer, W. Pijls, and A. de Bruin. Exploiting Graph Properties of Game
Trees. In AAAI-1996, volume 1, pages 234-239, Portland, Oregon, August 1996.
[Reinefeld 1983] A. Reinefeld. An Improvement to the Scout Tree-Search Algorithm. ICCA Journal,
6(4):4-14, 1983.

[Reinefeld 1994a] A. Reinefeld. A Minimax Algorithm Faster than Alpha-Beta. In H.J. van den Herik,
I.S. Herschberg and J.W.H.M. Uiterwijk, editors, Advances In Computer Chess 7, pages 237-250.
University of Limburg, 1994.

http://gamasutra.com/features/20000626/brockington_05.htm (2 of 3) [25/06/2002 3:48:24 PM]


Gamasutra - Features - "Pawn Captures Wyvern: How Computers Chess Can Improve Your Pathfinding" [06.26.00]

[Reinefeld 1994b] A. Reinefeld and T. A. Marsland. Enhanced Iterative-Deepening Search. IEEE


Transactions on Pattern Analysis and Machine Intelligence, PAMI-16(7):701-710,1994.
[Schaeffer 1989] J. Schaeffer. The History Heuristic and Alpha-Beta Search Enhancements In Practice.
IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-11(11): 1203-1212, 1989.
[Scott 1969] J. J. Scott. A Chess-Playing Program. In B. Meltzer and D. Michie, editors, Machine
Intelligence 4, pages 255-265. Edinburgh University Press, 1969.
[Slate 1977] D. J. Slate and L. R. Atkin. Chess 4.5 - The Northwestern University Chess Program. In
P.W. Frey, editor, Chess Skill in Man and Machine, pages 82-118. Springer-Verlag, New York, 1977.

[Stockman 1979] G. C. Stockman. A Minimax Algorithm Better than Alpha-Beta? Artificial Intelligence,
12:179-196, 1979.
[Stout 1996] W. B. Stout. Smart Moves: Intelligent Path-Finding. Game Developer, pp. 28-35,
Oct./Nov. 1996.
[von Neumann 1944] J. von Neumann and O. Morgenstern. Theory of Games and Economic Behavior.
Princeton University Press, Princeton, NJ, 1944.

Smart Moves: Intelligent Path-Finding
By W. Bryan Stout
Gamasutra, February 12, 1999
Published in Game Developer Magazine, October 1996

Of all the decisions involved in computer-game AI, the most common is probably path-finding: looking for a good route for moving an entity from here to there. The entity can be a single person, a vehicle, or a combat unit; the genre can be an action game, a simulator, a role-playing game, or a strategy game. But any game in which the computer is responsible for moving things around has to solve the path-finding problem.

And this is not a trivial problem. Questions about path-finding are regularly seen in online game programming forums, and the entities in several games move in less than intelligent paths. However, although path-finding is not trivial, there are some well-established, solid algorithms that deserve to be known better in the game community.

Several path-finding algorithms are not very efficient, but studying them serves us by introducing concepts incrementally. We can then understand how different shortcomings are overcome.

To demonstrate the workings of the algorithms visually, I have developed a program in Delphi 2.0 called "PathDemo." It is available for readers to download.

The article and demo assume that the playing space is represented with square tiles. You can adapt the concepts in the algorithms to other tilings, such as hexagons; ideas for adapting them to continuous spaces are discussed at the end of the article.

Path-Finding on the Move
The typical problem in path-finding is obstacle avoidance. The simplest approach to the problem is to ignore the obstacles until one bumps into them. The algorithm would look something like this:

while not at the goal
    pick a direction to move toward the goal
    if that direction is clear for movement
        move there
    else
        pick another direction according to an avoidance strategy

This approach is simple because it makes few demands: all that needs to be known are the relative
positions of the entity and its goal, and whether the immediate vicinity is blocked. For many game
situations, this is good enough.

Different obstacle-avoidance strategies include:


● Movement in a random direction. If the obstacles are all small and convex, the entity (shown as
a green dot) can probably get around them by moving a little bit away and trying again, until it
reaches the goal (shown as a red dot). Figure 1A shows this strategy at work. A problem arises
with this method if the obstacles are large or if they are concave, as is seen in Figure 1B: the
entity can get completely stuck, or at least waste a lot of time before it stumbles onto a way
around. One way to avoid this: if a problem is too hard to deal with, alter the game so it never
comes up. That is, make sure there are never any concave obstacles.

● Tracing around the obstacle. Fortunately, there are other ways to get around. If the obstacle is
large, one can do the equivalent of placing a hand against the wall and following the outline of
the obstacle until it is skirted. Figure 2A shows how well this can deal with large obstacles. The
problem with this technique comes in deciding when to stop tracing. A typical heuristic may be:
"Stop tracing when you are heading in the direction you wanted to go when you started tracing."
This would work in many situations, but Figure 2B shows how one may end up constantly
circling around without finding the way out.

● Robust tracing. A more robust heuristic comes from work on mobile robots: "When blocked,
calculate the equation of the line from your current position to the goal. Trace until that line is
again crossed. Abort if you end up at the starting position again." This method is guaranteed to
find a way around the obstacle if there is one, as is seen in Figure 3A. (If the original point of
blockage is between you and the goal when you cross the line, be sure not to stop tracing, or
more circling will result.) Figure 3B shows the downside of this approach: it will often take more
time tracing the obstacle than is needed, making it look pretty simple-minded-though not as
simple as endless circling. A happy compromise would be to combine both approaches: always
use the simpler heuristic for stopping the tracing first, but if circling is detected, switch to the
robust heuristic. (A sketch of the line-recrossing test in code follows this list.)
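
To make the robust tracing heuristic concrete, here is a minimal sketch in Python of its stopping test: a step "crosses the line" from the point of blockage to the goal exactly when the side-of-line sign changes between two successive positions. The function names and the use of a cross product are my own illustration, not something given in the article.

def side_of_line(p, a, b):
    # Sign of the 2D cross product: which side of the line a->b point p is on.
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def recrossed_goal_line(block_point, goal, prev_pos, cur_pos):
    # True when a tracing step from prev_pos to cur_pos crosses the
    # block_point->goal line, i.e. the side-of-line sign changes.
    s0 = side_of_line(prev_pos, block_point, goal)
    s1 = side_of_line(cur_pos, block_point, goal)
    return (s0 > 0 > s1) or (s0 < 0 < s1)

# Example: stepping from (2, 1) to (2, -1) while tracing crosses the line
# from the blockage at (0, 0) to the goal at (10, 0).
print(recrossed_goal_line((0, 0), (10, 0), (2, 1), (2, -1)))   # True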

Looking Before You Leap
Although the obstacle-skirting techniques discussed above can often do a passable or even adequate job, there are situations where the only intelligent approach is to plan the entire route before the first step is taken. In addition, these methods do little to handle the problem of weighted regions, where the difficulty is not so much avoiding obstacles as finding the cheapest path among several choices where the terrain can vary in its cost.

Fortunately, the fields of Graph Theory and conventional AI have several algorithms that can be used to handle both difficult obstacles and weighted regions. In the literature, many of these algorithms are presented in terms of changing between states, or traversing the nodes of a graph. They are often used in solving a variety of problems, including puzzles like the 15-puzzle or Rubik's Cube, where a state is an arrangement of the tiles or cubes, and neighboring states (or adjacent nodes) are visited by sliding one tile or rotating one cube face. Applying these algorithms to path-finding in geometric space requires a simple adaptation: a state or a graph node stands for the entity being in a particular tile, and moving to adjacent tiles corresponds to moving to the neighboring states, or adjacent nodes.

Working from the simplest algorithms to the more robust, we have:
● Breadth-first search. Beginning at the start node, this algorithm first examines all immediate
neighboring nodes, then all nodes two steps away, then three, and so on, until a goal node is
found. Typically, each node's unexamined neighboring nodes are pushed onto an Open list, which
is usually a FIFO (first-in-first-out) queue. The algorithm would go something like what is shown
in Listing 1. Figure 4 shows how the search proceeds. We can see that it does find its way around
obstacles, and in fact it is guaranteed to find a shortest path (that is, one of several paths that
tie for the shortest in length) if all steps have the same cost. There are a couple of obvious
problems. One is that it fans out in all directions equally, instead of directing its search towards
the goal; the other is that all steps are not equal: at least the diagonal steps should be longer
than the orthogonal ones.
● Bidirectional breadth-first search. This enhances the simple breadth-first search by starting two
simultaneous breadth-first searches from the start and the goal nodes and stopping when a node
from one end's search finds a neighboring node marked from the other end's search. As seen in
Figure 5, this can save substantial work from simple breadth-first search (typically by a factor of
2), but it is still quite inefficient. Tricks like this are good to remember, though, since they may
come in handy elsewhere.

● Dijkstra's algorithm. E. Dijkstra developed a classic algorithm for traversing graphs with edges
of differing weights. At each step, it looks at the unprocessed node closest to the start node,
looks at that node's neighbors, and sets or updates their respective distances from the start. This
has two advantages over the breadth-first search: it takes a path's length or cost into account
and updates the goodness of nodes if better paths to them are found. To implement this, the
Open list is changed from a FIFO queue to a priority queue, where the node popped is the one
with the best score; here, the one with the lowest cost path from the start. (See Listing 2.) We
see in Figure 6 that Dijkstra's algorithm adapts well to terrain cost. However, it still has the
weakness of breadth-first search in ignoring the direction to the goal.

● Depth-first search. This search is the complement to breadth-first search;
instead of visiting all a node's siblings before any children, it visits all of a
node's descendants before any of its siblings. To make sure the search
terminates, we must add a cutoff at some depth. We can use the same code
for this search as for breadth-first search, if we add a depth parameter to
keep track of each node's depth and change Open from a FIFO queue to a
LIFO (last-in-first-out) stack. In fact, we can eliminate the Open list entirely
and instead make the search a recursive routine, which would save the
memory used for Open. We need to make sure each tile is marked as
"visited" on the way out, and is unmarked on the way back, to avoid
generating paths that visit the same tile twice. In fact, Figure 7 shows that
we need to do more than that: the algorithm still can tangle around itself
and waste time in a maddening way. For geometric path-finding, we can add
two enhancements. One would be to label each tile with the length of the
cheapest path found to it yet; the algorithm would then never visit it again
unless it had a cheaper path, or one just as cheap but searching to a greater
depth. The second would be to have the search always look first at the
children in the direction of the goal. With these two enhancements checked,
one sees that the depth-first search finds a path quickly. Even weighted
paths can be handled by making the depth cut-off equal the total
accumulated cost rather than the total distance.

● Iterative-deepening depth-first search. Actually, there is still one fly in the depth-first
ointment: picking the right depth cutoff. If it is too low, it will not reach the goal; if too high, it
will potentially waste time exploring blind avenues too far, or find a weighted path which is too
costly. These problems are solved by doing iterative deepening, a technique that carries out a
depth-first search with increasing depth: first one, then two, and so on until the goal is found. In
the path-finding domain, we can enhance this by starting with a depth equal to the straight-line
distance from the start to the goal. This search is asymptotically optimal among brute force
searches in both space and time.

● Best-first search. This is the first heuristic search considered, meaning that it
takes into account domain knowledge to guide its efforts. It is similar to
Dijkstra's algorithm, except that instead of the nodes in Open being scored
by their distance from the start, they are scored by an estimate of the
distance remaining to the goal. This cost also does not require possible
updating as Dijkstra's does. Figure 8 shows its performance. It is easily the
fastest of the forward-planning searches we have examined so far, heading
in the most direct manner to the goal. We also see its weaknesses. In 8A, we
see that it does not take into account the accumulated cost of the terrain,
plowing straight through a costly area rather than going around it. And in
8B, we see that the path it finds around the obstacle is not direct, but
weaves around it in a manner reminiscent of the hand-tracing techniques
seen above. (A sketch contrasting these frontier-driven searches in code follows.)
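
Since Listings 1 and 2 do not appear in this excerpt, the following Python sketch (my own; the names and the grid encoding are illustrative assumptions) condenses the searches above into one skeleton whose only difference is how the Open list ranks nodes.

import heapq

def search(grid, start, goal, rank):
    # rank(cost_so_far, node) gives the Open-list priority for a strategy.
    open_list = [(rank(0, start), 0, start)]
    parent = {start: None}
    best = {start: 0}
    while open_list:
        _, g, node = heapq.heappop(open_list)
        if node == goal:
            path = []                      # walk parent links back to start
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        x, y = node
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= ny < len(grid) and 0 <= nx < len(grid[0]) and grid[ny][nx]:
                ng = g + grid[ny][nx]      # step cost of entering the tile
                if (nx, ny) not in best or ng < best[(nx, ny)]:
                    best[(nx, ny)] = ng    # found a better path to this node
                    parent[(nx, ny)] = node
                    heapq.heappush(open_list, (rank(ng, (nx, ny)), ng, (nx, ny)))
    return None

GOAL = (3, 0)
manhattan = lambda n: abs(n[0] - GOAL[0]) + abs(n[1] - GOAL[1])
breadth   = lambda g, n: 0             # ties broken by g: expands in even rings
dijkstra  = lambda g, n: g             # cheapest accumulated cost first
best_1st  = lambda g, n: manhattan(n)  # estimated remaining distance only

grid = [[1, 1, 1, 1],    # step costs; 0 marks a blocked tile
        [1, 0, 0, 1],
        [1, 1, 1, 1]]
print(search(grid, (0, 0), GOAL, dijkstra))   # [(0, 0), (1, 0), (2, 0), (3, 0)]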

The Star of the Search Algorithms (A* Search)
The best-established algorithm for the general searching of optimal paths is A* (pronounced "A-star"). This heuristic search ranks each node by an estimate of the best route that goes through that node. The typical formula is expressed as:

f(n) = g(n) + h(n)

where:

f(n) is the score assigned to node n
g(n) is the actual cheapest cost of arriving at n from the start
h(n) is the heuristic estimate of the cost to the goal from n

So it combines the tracking of the previous path length of Dijkstra's algorithm with the heuristic estimate of the remaining path from best-first search. The algorithm proper is seen in Listing 3. Since some nodes may be processed more than once (from finding better paths to them later) we use a new list called Closed to keep track of them.

A* has a couple of interesting properties. It is guaranteed to find the shortest path, as long as the heuristic estimate, h(n), is admissible; that is, it is never greater than the true remaining distance to the goal. It makes the most efficient use of the heuristic function: no search that uses the same heuristic function h(n) and finds optimal paths will expand fewer nodes than A*, not counting tie-breaking among nodes of equal cost. In Figures 9A through 9C, we see how A* deals with situations that gave problems to other search algorithms.
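
Listing 3 is not reproduced in this excerpt, so as a stand-in here is a minimal A* sketch in Python, under the assumption of a four-connected grid of unit-cost tiles with a Manhattan-distance h(n), which is admissible on such a grid. The names and the blocked-set encoding are mine, and stale Open entries are skipped on pop rather than moved to an explicit Closed list.

import heapq

def a_star(blocked, width, height, start, goal):
    h = lambda n: abs(n[0] - goal[0]) + abs(n[1] - goal[1])
    open_list = [(h(start), 0, start)]      # entries are (f = g + h, g, node)
    parent = {start: None}
    g_cost = {start: 0}
    while open_list:
        f, g, node = heapq.heappop(open_list)
        if node == goal:                    # walk parent links back to start
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        if g > g_cost[node]:
            continue                        # stale entry; a better path exists
        x, y = node
        for n in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= n[0] < width and 0 <= n[1] < height and n not in blocked:
                ng = g + 1                  # unit step cost
                if n not in g_cost or ng < g_cost[n]:   # reopen if cheaper
                    g_cost[n] = ng
                    parent[n] = node
                    heapq.heappush(open_list, (ng + h(n), ng, n))
    return None                             # no path exists

print(a_star({(1, 0), (1, 1)}, 3, 3, (0, 0), (2, 0)))
# [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0)]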

How Do I Use A*?
A* turns out to be very flexible in practice. Consider the different parts of the algorithm.

The state would often be the tile or position the entity occupies. But if needed, it can represent orientation and velocity as well (for example, for finding a path for a tank or most any vehicle; their turn radius gets worse the faster they go).

Neighboring states would vary depending on the game and the local situation. Adjacent positions may be excluded because they are impassable or are between the neighbors. Some terrain can be passable for certain units but not for others; units that cannot turn quickly cannot go to all neighboring tiles.

The cost of going from one position to another can represent many things: the simple distance between the positions; the cost in time or movement points or fuel between them; penalties for traveling through undesirable places (such as points within range of enemy artillery); bonuses for traveling through desirable places (such as exploring new terrain or imposing control over uncontrolled locations); and aesthetic considerations. For example, if diagonal moves are just as cheap as orthogonal moves, you may still want to make them cost more, so that the routes chosen look more direct and natural.

The estimate is usually the minimum distance between the current node and the goal multiplied by the minimum cost between nodes. This guarantees that h(n) is admissible. (In a map of square tiles where units may only occupy points in the grid, the minimum distance would not be the Euclidean distance, but the minimum number of orthogonal and diagonal moves between the two points.)
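
As a quick sketch (my own, in Python), this estimate on a square grid with diagonal movement is the Chebyshev distance (the minimum number of orthogonal and diagonal moves) times the minimum step cost:

def estimate(node, goal, min_step_cost=1):
    dx = abs(node[0] - goal[0])
    dy = abs(node[1] - goal[1])
    # A diagonal move covers one step of dx and dy at once, so the minimum
    # move count is max(dx, dy); scaling by the cheapest step keeps h(n)
    # admissible, since it can never exceed the true remaining cost.
    return max(dx, dy) * min_step_cost

print(estimate((0, 0), (5, 2)))   # 5: two diagonal moves plus three orthogonal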

The goal does not have to be a single location but can consist of multiple locations. The estimate for a
node would then be the minimum of the estimate for all possible goals.

Search cutoffs can be included easily, to cover limits in path cost, path distance, or both.

From my own direct experience, I have seen the A* search work very well for finding a variety of types of paths in wargames and strategy games.
The Limitations of A*
There are situations where A* may not perform very well, for a variety of reasons. The more or less
real-time requirements of games, plus the limitations of the available memory and processor time in
some of them, may make it hard even for A* to work well. A large map may require thousands of
entries in the Open and Closed list, and there may not be room enough for that. Even if there is
enough memory for them, the algorithms used for manipulating them may be inefficient.

The quality of A*'s search depends on the quality of the heuristic estimate h(n). If h is very close to
the true cost of the remaining path, its efficiency will be high; on the other hand, if it is too low, its
efficiency gets very bad. In fact, breadth-first search is an A* search, with h being trivially zero for all
nodes-this certainly underestimates the remaining path cost, and while it will find the optimum path, it
will do so slowly. In Figure 10A, we see that while searching in expensive terrain (shaded area), the
frontier of nodes searched looks similar to Dijkstra's algorithm; in 10B, with the heuristic increased,
the search is more focused.

Let's look at ways to make the A* search more efficient in problem areas.

Transforming the Search Space
Perhaps the most important improvement one can make is to restructure the problem to be solved, making it an easier problem. Macro-operators are sequences of steps that belong together and can be combined into a single step, making the search take bigger steps at a time. For example, airplanes take a series of steps in order to change their orientation and altitude. A common sequence may be used as a single change-of-state operator, rather than using the smaller steps individually. In addition, search and general problem-solving methods can be greatly simplified if they are reduced to sub-problems, whose individual solutions are fairly simple. In the case of path-finding, a map can be broken down into large contiguous areas whose connectivity is known. One or two border tiles between each pair of adjacent areas are chosen; then the route is first laid out by a search among adjacent areas, in each of which a route is found from one border point to another.

For example, in a strategic map of Europe, a path-finder searching for a land route from Madrid to Athens would probably waste a fair amount of time looking down the boot of Italy. Using countries as areas, a hierarchical search would first determine that the route would go from Spain to France to Italy to Yugoslavia (looking at an old map) to Greece; and then the route through Italy would only need to connect Italy's border with France to Italy's border with Yugoslavia. As another example, routes from one part of a building to another can be broken down into a path of rooms and hallways to take, and then the paths between doors in each room.

It is much easier to choose areas in predefined maps than to have the computer figure them out for randomly generated maps. Note also that the examples discussed deal mainly with obstacle avoidance; for weighted regions, it is trickier to assign useful regions, especially for the computer (it may not be very useful, either).

Storing It Better
Even if the A* search is relatively efficient by itself, it can be slowed down by inefficient algorithms handling the data structures. Regarding the search, two major data structures are involved.

The first is the representation of the playing area. Many questions have to be addressed. How will the playing field be represented? Will the areas accessible from each spot (and the costs of moving there) be represented directly in the map or in a separate structure, or calculated when needed? How will features in the area be represented? Are they directly in the map, or separate structures? How can the search algorithm access necessary information quickly? There are too many variables concerning the type of game and the hardware and software environment to give much detail about these questions here.

The second major structure involved is the node or state of the search, and this can be dealt with more explicitly. At the lower level is the search state structure. Fields a developer might wish to include in it are (a sketch of such a record in code follows the list):
❍ The location (coordinates) of the map position being considered at this state of the search.

❍ Other relevant attributes of the entity, such as orientation and velocity.

❍ The cost of the best path from the source to this location.

❍ The length of the path up to this position.

❍ The estimate of the cost to the goal (or closest goal) from this location.

❍ The score of this state, used to pick the next state to pop off Open.

❍ A limit for the length of the search path, or its cost, or both, if applicable.

❍ A reference (pointer or index) to the parent of this node; that is, the node that led to this one.

❍ Additional references to other nodes, as needed by the data structure used for storing the Open and Closed lists; for example, "next" and maybe "previous" pointers for linked lists, "right," "left," and "parent" pointers for binary trees.
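
A minimal sketch of such a record in Python follows; the field names are my own choices, and in C or C++ this would simply be a plain struct.

from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class SearchNode:
    location: Tuple[int, int]                # map coordinates of this state
    orientation: float = 0.0                 # other entity attributes, if needed
    cost_from_start: float = 0.0             # best path cost from the source
    path_length: int = 0                     # number of steps taken so far
    estimate_to_goal: float = 0.0            # heuristic estimate to the goal
    score: float = 0.0                       # value used to pop from Open
    cost_limit: Optional[float] = None       # search cutoff, if applicable
    parent: Optional["SearchNode"] = None    # node that led to this one
    next: Optional["SearchNode"] = None      # links for the Open/Closed
    prev: Optional["SearchNode"] = None      # list implementation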
Another issue to consider is when to allocate the memory for these structures; the answer depends on
the demands and constraints of the game, hardware, and operating system.

On the higher level are the aggregate data structures-the Open and Closed lists. Although keeping
them as separate structures is typical, it is possible to keep them in the same structure, with a flag in
the node to show if it is open or not. The sorts of operations that need to be done in the Closed list
are:
❍ Insert a new node.
❍ Remove an arbitrary node.
❍ Search for a node having certain attributes (location, speed, direction).
❍ Clear the list at the end of the search.
The Open list does all these, and in addition will:
● Pop the node with the best score.

● Change the score of a node.
The Open list can be thought of as a priority queue, where the next item popped off is the one with the
highest priority-in our case, the best score. Given the operations listed, there are several possible
representations to consider: a linear, unordered array; an unordered linked list; a sorted array; a
sorted linked list; a heap (the structure used in a heap sort); a balanced binary search tree.

There are several types of binary search trees: 2-3-4 trees, red-black trees, height-balanced trees
(AVL trees), and weight-balanced trees.

Heaps and balanced search trees have the advantage of logarithmic times for insertion, deletion, and
search; however, if the number of nodes is rarely large, they may not be worth the overhead they
require.
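
For illustration, here is one way (a sketch of my own in Python, using the standard heapq module) to get those logarithmic times for the Open list while still supporting "change the score of a node": push a duplicate entry with the new score and discard stale entries as they surface.

import heapq

class OpenList:
    def __init__(self):
        self.heap = []            # entries are (score, node)
        self.best_score = {}      # current best score for each node

    def push_or_update(self, node, score):
        # Insertion and "change the score" are both O(log n) heap pushes.
        if score < self.best_score.get(node, float("inf")):
            self.best_score[node] = score
            heapq.heappush(self.heap, (score, node))

    def pop_best(self):
        # Pop until an entry matches the node's current best score;
        # anything else is a stale duplicate left behind by an update.
        while self.heap:
            score, node = heapq.heappop(self.heap)
            if self.best_score.get(node) == score:
                del self.best_score[node]   # node now leaves the Open list
                return node
        return None

open_list = OpenList()
open_list.push_or_update("a", 5)
open_list.push_or_update("b", 4)
open_list.push_or_update("a", 3)   # changes the score of "a"
print(open_list.pop_best())        # a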

What if I'm in a Smooth World?
All these search methods have assumed a playing area composed of square or hexagonal tiles. What if the game play area is continuous? What if the positions of both entities and obstacles are stored as floats, and can be as finely determined as the resolution of the screen? Figure 11A shows a sample layout. For answers to these search conditions, we can look at the field of robotics and see what sort of approaches are used for the path-planning of mobile robots. Not surprisingly, many approaches find some way to reduce the continuous space into a few important discrete choices for consideration. After this, they typically use A* to search among them for a desirable path. Ways of quantizing the space include:
❍ Tiles. A simple approach is to slap a tile grid on top of the space. Tiles that contain all or part
of an obstacle are labeled as blocked; a fringe of tiles touching the blocked tiles is also labeled as
blocked to allow a buffer of movement without collision. This representation is also useful for
weighted regions problems. See Figure 11B.
❍ Points of visibility. For obstacle avoidance problems, you can focus on the critical points,
namely those near the vertices of the obstacles (with enough space away from them to avoid
collisions), with points being considered connected if they are visible from each other (that is,
with no obstacle between them). For any path, the search considers only the critical points as
intermediate steps between start and goal. See Figure 11C.
❍ Convex polygons. For obstacle avoidance, the space not occupied by polygonal obstacles can
be broken up into convex polygons; the intermediate spots in the search can be the centers of
the polygons, or spots on the borders of the polygons. Schemes for decomposing the space
include: C-Cells (each vertex is connected to the nearest visible vertex; these lines partition the
space) and Maximum-Area decomposition (each convex vertex of an obstacle projects the edges
forming the vertex to the nearest obstacles or walls; between these two segments and the
segment joining to the nearest visible vertex, the shortest is chosen). See Figure 11D. For
weighted regions problems, the space is divided into polygons of homogeneous traversal cost.
The points to aim for when crossing boundaries are computed using Snell's Law of Refraction.
This approach avoids the irregular paths found by other means.

❍ Quadtrees. Similar to the convex polygons, the space is divided into squares. Each square that
isn't close to being homogeneous is divided into four smaller squares, recursively. The centers of
these squares are used for searching a path. See Figure 11E.

❍ Generalized cylinders. The space between adjacent obstacles is considered a cylinder whose
shape changes along its axis. The axis traversing the space between each adjacent pair of
obstacles (including walls) is computed, and the axes are the paths used in the search. See
Figure 11F.

❍ Potential fields. An approach that does not quantize the space, nor require complete
calculation beforehand, is to consider that each obstacle has a repulsive potential field around it,
whose strength is inversely proportional to the distance from it; there is also a uniform attractive
force to the goal. At close regular time intervals, the sum of the attractive and repulsive vectors
is computed, and the entity moves in that direction. A problem with this approach is that it may
fall into a local minimum; various ways of moving out of such spots have been devised. (A
single-step sketch in code follows this list.)
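
One update step of this scheme might look like the following Python sketch; the step size, the repulsion constant, and the exact falloff are illustrative assumptions rather than anything prescribed above, and the entity is assumed not to be exactly at the goal.

import math

def field_step(pos, goal, obstacles, step=0.1, repulse=1.0):
    # Uniform attractive force toward the goal.
    gx, gy = goal[0] - pos[0], goal[1] - pos[1]
    glen = math.hypot(gx, gy)
    fx, fy = gx / glen, gy / glen
    # Repulsive force from each obstacle, with strength ~ 1 / distance.
    for ox, oy in obstacles:
        dx, dy = pos[0] - ox, pos[1] - oy
        d = math.hypot(dx, dy)
        fx += repulse * dx / (d * d)   # unit vector (dx/d) scaled by 1/d
        fy += repulse * dy / (d * d)
    # Move a fixed step along the summed force direction.
    flen = math.hypot(fx, fy)
    return (pos[0] + step * fx / flen, pos[1] + step * fy / flen)

# Example: the entity veers away from the obstacle at (1, 0.2) while
# still heading toward the goal at (2, 0).
print(field_step((0.0, 0.0), (2.0, 0.0), [(1.0, 0.2)]))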


Bryan Stout has done work in "real" AI for Martin Marietta and in computer games for
MicroProse. He is preparing a book on computer game AI to be published by Morgan
Kaufmann Publishers in 2001. He can be contacted at bstout@mindspring.com.



Toward More Realistic Pathfinding
by Marco Pinter
Gamasutra, March 14, 2001
Pathfinding is a core component of most games today. Characters, animals, and vehicles all move in some goal-directed manner, and the program must be able to identify a good path from an origin to a goal, which both avoids obstacles and is the most efficient way of getting to the destination. The best-known algorithm for achieving this is the A* search (pronounced "A star"), and it is typical for a lead programmer on a project simply to say, "We'll use A* for pathfinding." However, AI programmers have found again and again that the basic A* algorithm can be woefully inadequate for achieving the kind of realistic movement they require in their games.

This article focuses on several techniques for achieving more realistic looking results from pathfinding. Many of the techniques discussed here were used in the development of Activision's upcoming Big Game Hunter 5, which made for startlingly more realistic and visually interesting movement for the various animals in the game. The focal topics presented here include:

● Achieving smooth straight-line movement. Figure 1a shows the result of a standard A* search, which produces an unfortunate "zigzag" effect. This article presents postprocessing solutions for smoothing the path, as shown in Figure 1b.
● Adding smooth turns. Turning in a curved manner, rather than making abrupt changes of direction, is critical to creating realistic movement. Using some basic trigonometry, we can make turns occur smoothly over a turning radius, as shown in Figure 1c. Programmers typically use the standard A* algorithm and then use one of several hacks or cheats to create a smooth turn. Several of these techniques will be described.
● Achieving legal turns. Finally, I will discuss a new formal technique which modifies the A* algorithm so that the turning radius is part of the actual search. This results in guaranteed "legal" turns for the whole path, as shown in Figure 1d.


FIGURE 1. Some of the techniques discussed in this article. (a) is the result of a standard A* search, while (b) shows the results of a postprocess smoothing operation. (c) shows the application of a turning radius for curved turns. (d) illustrates an A* modification that will enable searches to include curved turns that avoid collisions.

Dealing with realistic turns is an important and timely AI topic. In the August 2000 issue of Game
Developer ("The Future of Game AI"), author Dave Pottinger states, "So far, no one has proffered a
simple solution for pathing in true 3D while taking into account such things as turn radius and other
movement restrictions," and goes on to describe some of the "fakes" that are commonly done. Also, in
a recent interview on Feedmag.com with Will Wright, creator of The Sims, Wright describes movement
of The Sims' characters: "They might have to turn around and they kind of get cornered -- they
actually have to calculate how quickly they can turn that angle. Then they actually calculate the angle
of displacement from step to step. Most people don't realize how complex this stuff is..."
In addition to the above points, I will also cover some important optimization techniques, as well as
some other path-related topics such as speed restrictions, realistic people movement, and movement
along roads. After presenting the various techniques below, we'll see by the end that there is no true
"best approach," and that the method you choose will depend on the specific nature of your game, its
characters, available CPU cycles and other factors.
Note that in the world of pathfinding, the term "unit" is used to represent any on-screen mobile
element, whether it's a player character, animal, monster, ship, vehicle, infantry unit, and so on. Note
also that while the body of this article presents examples based on tile-based searching, most of the
techniques presented here are equally applicable to other types of world division, such as convex
polygons and 3D navigation meshes.

A Brief Introduction to A*
The A* algorithm is a venerable technique which was originally applied to various mathematical
problems and was adapted to pathfinding during the early years of artificial intelligence research. The
basic algorithm, when applied to a grid-based pathfinding problem, is as follows: Start at the initial
position (node) and place it on the Open list, along with its estimated cost to the destination, which is
determined by a heuristic. The heuristic is often just the geometric distance between two nodes. Then
perform the following loop while the Open list is nonempty:
● Pop the node off the Open list that has the lowest estimated cost to the destination.
● If the node is the destination, we've successfully finished (quit).
● Examine the node's eight neighboring nodes.

● For each of the nodes which are not blocked, calculate the estimated cost to the goal of the path
that goes through that node. (This is the actual cost to reach that node from the origin, plus the
heuristic cost to the destination.)
● Push all those nonblocked surrounding nodes onto the Open list, and repeat loop.
In the end, the nodes along the chosen path, including the starting and ending position, are called the
waypoints. The A* algorithm is guaranteed to find the best path from the origin to the destination, if
one exists. A more detailed introduction to A* is presented in Bryan Stout's Game Developer article
"Smart Moves: Intelligent Pathfinding" (October/November 1996), which is also available on
Gamasutra.com.

Hierarchical Pathfinding
Critical to any discussion of efficient pathfinding within a game is the notion of hierarchical maps. To
perform an efficient A* search, it is important that the origin and destination nodes of any particular
search are not too far apart, or the search time will become enormous. I recommend that the distance
between origin and destination be constrained to 40 tiles, and that the total search space be no more
than 60x60 tiles (creating a 10-tile-wide buffer behind both origin and destination, allowing the path to
wrap around large obstacles.) If units need to search for more distant destinations, some method of
hierarchical pathfinding should be used.
In the real world, people do not formulate precise path plans which stretch on for miles. Rather, if a
person has some domain knowledge of the intermediate terrain, they will subdivide the path, i.e. "first
get to the highway on-ramp, then travel to the exit for the mall, then drive to the parking lot."
Alternatively, if a person has no domain knowledge, they will create intermediate points as they see
them. For example, if you wanted to eventually reach some point you knew was far to the North, you
would first look North and pick a point you could see, plan a path toward it, and only when you got
there, you would pick your next point.
Within a game program, the techniques for creating a map hierarchy include:
1. Subdivide the line to the destination into midpoints, each of which is then used as a
subdestination. Unfortunately, this always leaves the possibility that a chosen midpoint will be at
an impossible location, which can eliminate the ability to find a valid path (see the "Path Failure"
section later in this article).
2. Preprocess the map into a large number of regions, for example castles, clearings, hills, and so
on. This can be done by an artist/designer, or even automated if maps are random. Then start
by finding a path on the "region map" to get from the current position to the destination region,
and then find a tile-based path on the detailed map to get to the next region. Alternatively, if a
unit has no region knowledge and you want to be completely realistic with its behavior, it can
just choose the next region which lies in the compass direction of its ultimate destination.
(Though again, this can result in path failure.)

A Faster Implementation of the Standard A*


Before proceeding with turning and smoothing modifications to the A* algorithm, let's start with some
basic optimizations that can speed up the standard A* algorithm by a factor of 40 or more. To start
with, the standard A* algorithm uses a sorted linked list (or two linked lists) to track nodes that are
checked. Instead, we'll use a 60x60 fixed matrix. When starting a search from point a to point b, we
find the midpoint between those two and place it at point [30,30] on our matrix. Each point on the
matrix stores:
● The cost to get to the point
● The total cost through that point to the goal
● The [x,y] location of its "parent" tile (the tile before it on the path)
● A Boolean stating whether or not it is on the "Open" list of actively pursued nodes, and
● The [x,y] locations of the Previous and Next nodes in the Open list.
We also keep a separate array of 1-bit Booleans, which store whether or not each node in our matrix
has been touched yet during this search. That way, we can very rapidly initialize at the beginning of
the search without needing to clear the entire matrix.
Whereas the original algorithm maintains a separate sorted Open list (actually a Priority Queue), we
instead maintain basic list functionality simply by using Previous and Next pointers within the fixed
array. Note that we do have the memory requirement for our 60x60 matrix, but our compacted data
structure requires only 16 bytes per node, for a total of 57K. (Even expanding the matrix to 120x120
will only require 230K of memory.)
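
A rough sketch of this layout follows (my own rendering in Python; the article's implementation is not shown). Every per-node field lives in preallocated parallel arrays, and the Open list is threaded through index fields inside the pool, so insertion is O(1) with no allocation. In C, packing these fields gives roughly the 16 bytes per node described above.

SIZE = 60                               # 60x60 search window

cost      = [0] * (SIZE * SIZE)         # cost to reach this node
total     = [0] * (SIZE * SIZE)         # cost through this node to the goal
parent    = [0] * (SIZE * SIZE)         # packed [x,y] of the previous tile
on_open   = [False] * (SIZE * SIZE)     # is the node on the Open list?
next_open = [-1] * (SIZE * SIZE)        # intrusive Open-list links, stored
prev_open = [-1] * (SIZE * SIZE)        # inside the node pool itself
touched   = bytearray(SIZE * SIZE)      # per-search "seen" flags, so the
                                        # matrix never needs a full clear
open_head = -1

def push_open(x, y, c, t, par):
    # O(1) insertion at the head of the intrusive Open list.
    global open_head
    i = y * SIZE + x
    cost[i], total[i], parent[i] = c, t, par
    on_open[i], touched[i] = True, 1
    next_open[i], prev_open[i] = open_head, -1
    if open_head != -1:
        prev_open[open_head] = i
    open_head = i

push_open(30, 30, 0, 0, -1)   # seed the search at the window's midpoint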

Note additionally that the "list" can be implemented as a binary tree (by having two Next node
pointers at each element), but we've actually found it to be substantially faster to have a simple
(non-priority) list. While this does result in time O(n) for the search for the lowest cost node at the top
of the A* loop (rather than O(log n) for a priority queue), it excels in that all insertions and deletions,
of which there are many, are only O(1). Best of all, it eliminates the inner loop search that checks if
neighboring nodes yet exist on the Open or Closed lists, which otherwise would take O(n) (or maybe a
bit better if a hash table is used).
Overall, by avoiding all memory allocations and list insertions, this method turns out to be dramatically
faster. I have profiled it to be as much as 40 times faster than standard A* implementations.
Note that for the Directional search described later in this article, eight times the number of nodes are
necessary, so the memory requirements all increase by a factor of eight.

Smoothing the A* Path


The first and most basic step in making an A* path more realistic is getting rid of the zigzag effect it
produces, which you can see in Figure 2a. This effect is caused by the fact that the standard A*
algorithm searches the eight tiles surrounding a tile, and then proceeds to the next tile. This is fine in
primitive games where units simply hop from tile to tile, but is unacceptable for the smooth movement
required in most games today.

FIGURE 2. The common zigzag effect of the standard A* algorithm (a); a modification with fewer, but still fairly dramatic, turns (b); and the most direct -- and hence desired -- route (c). To achieve the path shown in Figure 2c, the four waypoints shown in red in Figure 2a were eliminated.

One simple method of reducing the number of turns is to make the following modification to the A*
algorithm: Add a cost penalty each time a turn is taken. This will favor paths which are the same
distance, but take fewer turns, as shown in Figure 2b. Unfortunately, this simplistic solution is not very
effective, because all turns are still at 45-degree angles, which causes the movement to continue to
look rather unrealistic. In addition, the 45-degree-angle turns often cause paths to be much longer
than they have to be. Finally, this solution may add significantly to the time required to perform the A*
algorithm.
The actual desired path is that shown in Figure 2c, which takes the most direct route, regardless of the
angle. In order to achieve this effect, we introduce a simple smoothing algorithm which takes place
after the standard A* algorithm has completed its path. The algorithm makes use of a function
Walkable(pointA, pointB), which samples points along a line from point A to point B at a certain
granularity (typically we use one-fifth of a tile width), checking at each point whether the unit overlaps
any neighboring blocked tile. (Using the width of the unit, it checks the four points in a diamond
pattern around the unit's center.) The function returns true if it encounters no blocked tiles and false
otherwise. See Figure 3 for an illustration, and Listing 1 for pseudocode.
LISTING 1. Pseudocode for the simple smoothing algorithm. The smoothing algorithm simply checks
from waypoint to waypoint along the path, trying to eliminate intermediate waypoints when possible.
checkPoint = starting point of path
currentPoint = next point in path
while (currentPoint->next != NULL)
    if Walkable(checkPoint, currentPoint->next)
        // Make a straight path between those points:
        temp = currentPoint
        currentPoint = currentPoint->next
        delete temp from the path
    else
        checkPoint = currentPoint
        currentPoint = currentPoint->next

FIGURE 3. Illustration of the Walkable() function, which checks for path collisions.
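
A Python sketch of the check described above follows; the blocked-tile set, the sampling granularity, and the diamond radius are illustrative assumptions, not values taken from the article's code.

import math

def walkable(a, b, blocked, unit_radius=0.4, granularity=0.2):
    # Sample points along the segment a->b at roughly one-fifth-tile spacing.
    dist = math.hypot(b[0] - a[0], b[1] - a[1])
    steps = max(1, int(dist / granularity))
    for i in range(steps + 1):
        t = i / steps
        x = a[0] + t * (b[0] - a[0])
        y = a[1] + t * (b[1] - a[1])
        # Four points in a diamond pattern around the unit's center.
        for px, py in ((x + unit_radius, y), (x - unit_radius, y),
                       (x, y + unit_radius), (x, y - unit_radius)):
            if (int(px), int(py)) in blocked:   # tile containing this point
                return False
    return True

print(walkable((0.5, 0.5), (3.5, 0.5), {(2, 1)}))  # True: obstacle is aside
print(walkable((0.5, 0.5), (3.5, 2.5), {(2, 1)}))  # False: path clips tile (2,1)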

The smoothing algorithm simply checks from waypoint to waypoint along the path, trying to eliminate
intermediate waypoints when possible. To achieve the path shown in Figure 2c, the four waypoints
shown in red in Figure 2a were eliminated.
Since the standard A* algorithm searches the surrounding eight tiles at every node, there are times
when it returns a path which is impossible, as shown with the green path in Figure 4. In these cases,
the smoothing algorithm presented above will smooth the portions it can (shown in purple), and leave
the "impossible" sections as is.
This simple smoothing algorithm is similar to "line of sight" smoothing, in which all waypoints are
progressively skipped until the last one that can be "seen" from the current position. However, the
algorithm presented here is more accurate, because it adds collision detection based on the width of
the character and also can be used easily in conjunction with the realistic turning methods described in
the next section.


FIGURE 4. This smoothing algorithm will leave impossible paths alone.

Note that the simple smoothing algorithm presented above, like other simple smoothing methods, is
less effective with large units and with certain configurations of blocking objects. A more sophisticated
smoothing pass will be presented later.

Adding Realistic Turns
The next step is to add realistic curved turns for our units, so that they don't appear to change direction abruptly every time they need to turn. A simple solution involves using a spline to smooth the abrupt corners into turns. While this solves some of the aesthetic concerns, it still results in physically very unrealistic movement for most units. For example, it might change an abrupt cornering of a tank into a tight curve, but the curved turn would still be much tighter than the tank could actually perform.
FIGURE 5. Determining the shortest path from
the origin to the destination.

For a better solution, the first thing we need to know is the turning radius for our unit. Turning radius
is a fairly simple concept: if you're in a big parking lot in your car, and turn the wheel to the left as far
as it will go and proceed to drive in a circle, the radius of that circle is your turning radius. The turning
radius of a Volkswagen Beetle will be substantially smaller than that of a big SUV, and the turning
radius of a person will be substantially less than that of a large, lumbering bear.

Let's say you're at some point (origin) and pointed in a certain direction, and you need to get to some
other point (destination), as illustrated in Figure 5. The shortest path is found either by turning left as
far as you can, going in a circle until you are directly pointed at the destination, and then proceeding
forward, or by turning right and doing the same thing.
In Figure 5 the shortest route is clearly the green line at the bottom. This path turns out to be fairly
straightforward to calculate due to some geometric relationships, illustrated in Figure 6.


FIGURE 6. Calculating the length of the path.

First we calculate the location of point P, which is the center of our turning circle, and is always radius
r away from the starting point. If we are turning right from our initial direction, that means P is at an
angle of (initial_direction - 90) from the origin, so:
angleToP = initial_direction - 90
P.x = Origin.x + r * cos(angleToP)
P.y = Origin.y + r * sin(angleToP)

Now that we know the location of the center point P, we can calculate the distance from P to the
destination, shown as h on the diagram:
dx = Destination.x - P.x
dy = Destination.y - P.y
h = sqrt(dx*dx + dy*dy)

At this point we also want to check that the destination is not within the circle, because if it were, we
could never reach it:
if (h < r)
return false

Now we can calculate the length of segment d, since we already know the lengths of the other two
sides of the right triangle, namely h and r. We can also determine the angle theta from the
right-triangle relationship:
d = sqrt(h*h - r*r)
theta = arccos(r / h)

Finally, to figure out the point Q at which to leave the circle and start on the straight line, we need to
know the total angle phi + theta, and phi is easily determined as the angle from P to the destination:
phi = arctan(dy / dx) [offset to the correct quadrant]
Q.x = P.x + r * cos(phi + theta)
Q.y = P.y + r * sin(phi + theta)

The above calculations represent the right-turning path. The left-hand path can be calculated in
exactly the same way, except that we add 90 to initial_direction for calculating angleToP, and
later we use phi - theta instead of phi + theta. After calculating both, we simply see which path is
shorter and use that one.
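
Collected into one runnable Python function, the calculation looks like the sketch below. This is my own consolidation of the fragments above, with angles in radians and turn = +1 for a left turn, -1 for a right turn; the arc-length bookkeeping is an added detail that the text only implies.

import math

def shortest_curved_path(origin, initial_direction, dest, r):
    # Returns (total_length, turn, Q) for the shorter turn-then-straight
    # path, or None if the destination is unreachable at this radius.
    best = None
    for turn in (+1.0, -1.0):
        angle_to_p = initial_direction + turn * math.pi / 2
        px = origin[0] + r * math.cos(angle_to_p)    # center of turning circle
        py = origin[1] + r * math.sin(angle_to_p)
        dx, dy = dest[0] - px, dest[1] - py
        h = math.hypot(dx, dy)
        if h < r:
            continue                      # destination inside the turning circle
        d = math.sqrt(h * h - r * r)      # straight segment after the arc
        theta = math.acos(r / h)
        phi = math.atan2(dy, dx)          # atan2 supplies the quadrant offset
        q_angle = phi - turn * theta      # phi + theta right, phi - theta left
        q = (px + r * math.cos(q_angle), py + r * math.sin(q_angle))
        start_angle = angle_to_p + math.pi            # origin's angle on circle
        swept = (turn * (q_angle - start_angle)) % (2 * math.pi)
        length = r * swept + d            # arc length plus straight leg
        if best is None or length < best[0]:
            best = (length, turn, q)
    return best

# Facing east from the origin, curving up toward (5, 5) with radius 1:
length, turn, q = shortest_curved_path((0.0, 0.0), 0.0, (5.0, 5.0), 1.0)
print(round(length, 2), "left" if turn > 0 else "right")   # 7.16 left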
In our implementation of this algorithm and the ones that follow, we utilize a data structure which
stores up to four distinct "line segments," each one being either straight or curved. For the curved
paths described here, there are only two segments used: an arc followed by a straight line. The data
structure contains members which specify whether the segment is an arc or a straight line, the length
of the segment, and its starting position. If the segment is a straight line, the data structure also
specifies the angle; for arcs, it specifies the center of the circle, the starting angle on the circle, and
the total radians covered by the arc.
Once we have calculated the curved path necessary to get between two points, we can easily calculate
our position and direction at any given instant in time, as shown in Listing 2.
LISTING 2. Calculating the position and orientation at a particular time.
distance = unit_speed * elapsed_time
loop i = 0 to 3:
    if (distance < LineSegment[i].length)
        // Unit is somewhere on this line segment
        if LineSegment[i] is an arc
            determine current angle on arc (theta) by adding or
            subtracting (distance / r) to the starting angle,
            depending on whether turning to the left or right
            position.x = LineSegment[i].center.x + r*cos(theta)
            position.y = LineSegment[i].center.y + r*sin(theta)
            determine current direction (direction) by adding or
            subtracting 90 to theta, depending on left/right
        else
            position.x = LineSegment[i].start.x
                + distance * cos(LineSegment[i].line_angle)
            position.y = LineSegment[i].start.y
                + distance * sin(LineSegment[i].line_angle)
            direction = LineSegment[i].line_angle
        break out of loop
    else
        distance = distance - LineSegment[i].length

Legal Turns: The Basic Methods


So now that we know how to find and follow an efficient curved line between two points, how do we
use this in our pathing? The methods discussed in this section are all postprocessing techniques. In
other words, they involve using the standard A* algorithm during initial pathfinding, and then adding
curved turns later in some fashion, either in an extended pathfinding or during actual unit movement.

FIGURE 7. Decreasing the turning radius (a), and making a three-point turn (b).
1. Simple solution: ignoring blocked tiles. We start with the simplest solution. First use the A*
algorithm to calculate the path. Then progress from point to point in the path as follows: At any
waypoint, a unit has a position, an orientation, and a destination waypoint. Using the algorithm
described in the preceding section, we can calculate the fastest curved path to get from the
current waypoint to the next waypoint. We don't care what direction we are facing when we
reach the destination waypoint, though that will turn out to be the starting orientation for the
following waypoint. If we skim some obstacles along the way, so be it -- this is a fast
approximation, and we are willing to overlook such things. Figure 1c shows the result of this
method. The curves are nice, but on both turns, the side of the ship will overlap a blocking tile.

This solution is actually quite acceptable for many games. However, we often don't want to allow any
obviously illegal turns where the unit overlaps obstacles. The next three methods address this
problem.
2. Path recalculations. With this method, after the A* has completed, we step through the path, making sure every move from one waypoint to the next is valid. (This can be done as part of a smoothing pass.) If we find a collision, we mark the move as invalid and try the A* path search again. In order to do this, we need to store one byte for every tile (or add an additional byte to the matrix elements described in the optimization section above). Each bit will correspond to one of the eight tiles accessible from that tile; a sketch of such a mask appears after this list. Then we modify the A* algorithm slightly so that it checks whether a particular move is valid before allowing it. The main problem with this method is that by invalidating certain moves, a valid path approaching the tile from a different direction can be left unfound. Also, in a worst-case scenario, this method could need to recalculate the path many times over.
3. Making tighter turns. Another solution is that whenever we need to make a turn that would normally cause a collision, we allow our turning radius to decrease until the turn becomes legal. This is illustrated with the first turn in Figure 7a. One proviso is that when we conduct the A* search, we need to search only the surrounding four tiles at every node (as opposed to eight), so we don't end up with impossible situations like the one illustrated in Figure 4. In the case of vehicles, this method may look odd, whereby some lumbering tank suddenly makes an unbelievably tight turn. However, in other cases this may be exactly what you want. Unlike vehicles, which tend to have a constant turning radius, if your units are people, they are able to turn much more tightly if they are creeping along than if they are running. So in order to follow the simple path, you simply need to decelerate the unit as it approaches the turn. This can yield very realistic movement. (See the later sections on speed and people movement for further discussion.)
4. Backing up. Our final solution comes from real-world experience. How do we make a very tight
turn into a driveway? We back up and make a three-point turn, of course, as illustrated in Figure
7b. If your units are able to perform such maneuvers, and if this is consistent with their
behavior, this is a very viable solution.
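As a rough sketch of the per-tile move mask described in method 2, one byte per tile suffices, with bit d set when the move toward neighbor direction d has been found to collide (the class below is my illustration, not code from the sample program):

#include <cstdint>
#include <vector>

// One byte per tile; bit d set means the move from this tile toward
// neighbor direction d (0..7) collided and must be skipped by A*.
class InvalidMoveMask {
public:
    InvalidMoveMask(int width, int height)
        : width_(width), mask_(width * height, 0) {}

    void markInvalid(int x, int y, int dir) {
        mask_[y * width_ + x] |= uint8_t(1u << dir);
    }
    bool isValid(int x, int y, int dir) const {
        return (mask_[y * width_ + x] & (1u << dir)) == 0;
    }

private:
    int width_;
    std::vector<uint8_t> mask_;
};

The validity pass calls markInvalid for each colliding move, the modified A* calls isValid before expanding a neighbor, and the search is rerun until it succeeds or fails outright.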

Legal Turns: The Directional A* Algorithm


None of the methods presented in the above section is formally correct. Method two can often fail to
find valid paths, and methods one, three, and four are all basically cheats. Comparing Figures 1c and
1d, we see that the only valid solution which takes turning radius into account may require a
completely different route from what the basic A* algorithm provides. To solve this problem, I'll
introduce a significant modification to the algorithm, which I'll term the Directional A*.
The main change to the algorithm is the addition of a third dimension. Instead of having a flat grid of
nodes, where each node represents an XY grid position, we now have a three-dimensional space of
nodes, where a node <X,Y,orientation> represents the position at that node, as well as the compass
orientation of the unit (N, S, E, W, NE, NW, SE, SW.) For example, a node might be [X = 92, Y = 142,
orientation = NW]. Thus there are eight times as many nodes as before. There are also 64 times as
many ways of getting from one <X,Y> location to another, because you can start at the first node
pointing any one of eight directions, and end at the next node pointing any one of eight directions.
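A minimal sketch of this expanded node space (the packed index and the particular direction ordering are my assumptions):

// A Directional A* node: a grid position plus one of eight compass facings.
struct DirNode {
    int x, y;
    int orientation;  // 0..7, e.g. N, NE, E, SE, S, SW, W, NW
};

// Pack a node into a flat array index: eight times as many entries
// as there are tiles in the map.
inline int nodeIndex(const DirNode& n, int mapWidth) {
    return (n.y * mapWidth + n.x) * 8 + n.orientation;
}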

During the algorithm, when we're at a parent node p and checking a child node q, we don't just check
if the child itself is a blocked tile. We check if a curved path from p to q is possible (taking into account
the orientation at p, the orientation at q, and the turning radius); and if so, we check if traveling on
that path would hit any blocked tiles. Only then do we consider a child node to be valid. In this
fashion, every path we look at will be legal, and we will end up with a valid path given the size and
turning radius of the unit. Figure 8 illustrates this.

FIGURE 8. A legal turn which will only be found with the Directional A* technique.

The shortest path, and the one that would be chosen by the standard A* algorithm, goes from a to c.
However, the turning radius of the unit prevents the unit from performing the right turn at c given the
surrounding blockers, and thus the standard A* would return an invalid path in this case. The
Directional A*, on the other hand, sees this and instead looks at the alternate path through b. Yet
even at b, a 90-degree turn to the left is not possible due to nearby blockers, so the algorithm finds
that it can make a right-hand loop and then continue.

Directional Curved Paths
In order to implement the Directional A* algorithm, it is necessary to figure out how to compute the shortest path from a point p to a point q, taking into account not only starting direction, orientation, and turning radius, but also the ending direction. This algorithm will allow us to compute the shortest legal method of getting from a current position and orientation on the map to the next waypoint, and also to be facing a certain direction upon arriving there.

Earlier we saw how to compute the shortest path given just a starting orientation and turning radius. Adding a fixed final orientation makes the process a bit more challenging.

FIGURE 9. Arriving at the destination facing a certain direction.

There are four possible shortest paths for getting from origin to destination with fixed starting and
ending directions. This is illustrated in Figure 9. The main difference between this and Figure 5 is that
we approach the destination point by going around an arc of a circle, so that we will end up pointing in
the correct direction. Similar to before, we will use trigonometric relationships to figure out the angles
and lengths for each segment, except that there are now three segments in total: the first arc, the line
in the middle, and the second arc.


We can easily position the turning circles for both origin and destination in the same way that we did
earlier for Figure 6. The challenge is finding the point (and angle) where the path leaves the first
circle, and later where it hits the second circle. There are two main cases that we need to consider.
First, there is the case where we are traveling around both circles in the same direction, for example
clockwise and clockwise (see Figure 10).

FIGURE 10. Case 1: Traveling around both circles in the same direction.

For this case, note the following:


1. The line from P1 to P2 has the same length and slope as the (green) path line below it.
2. The arc angle at which the line touches the first circle is simply 90 degrees different from the
slope of the line.
3. The arc angle at which the line touches the second circle is exactly the same as the arc angle at
which it touches the first circle.

The second case, where the path travels around the circles in opposite directions (for example, clockwise around the first and counterclockwise around the second), is somewhat more complicated (see Figure 11). To solve this problem, we imagine a third circle centered at P3 which is tangent to the destination circle, positioned so that the line from P2 to P3 is at right angles to the (green) path line. Now we follow these steps:
1. Observe that we can draw a right triangle between P1, P2, and P3.
2. We know that the length from P2 to P3 is (2 * radius), and we already know the length from P1 to P2, so we can calculate the angle θ = arccos(2 * radius / Length(P1, P2)).
3. Since we also already know the angle of the line from P1 to P2, we just add or subtract θ (depending on clockwise or counterclockwise turning) to get the exact angle of the (green) path line. From that we can calculate the arc angle where it leaves the first circle and the arc angle where it touches the second circle.

We now know how to determine all four paths from origin to destination, so given two nodes (and their
associated positions and directions), we can calculate the four possible paths and use the one which is
the shortest.

FIGURE 11. Case 2: Traveling around the circles in opposite directions.
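In C++, the tangent angles for the two cases might be sketched as follows. Both circles are assumed to share the turning radius r, and the arcsin form used for case 2 is simply the complement of the arccos relation above, taken at the other corner of the same right triangle; all names are mine:

#include <cmath>

struct Vec2 { double x, y; };

const double kPi = 3.14159265358979323846;

static double dist(Vec2 a, Vec2 b)  { return std::hypot(b.x - a.x, b.y - a.y); }
static double slope(Vec2 a, Vec2 b) { return std::atan2(b.y - a.y, b.x - a.x); }

// Case 1 (same turning direction): the path line is parallel to the line
// between the circle centers, so the arc angle where it touches each
// circle is simply 90 degrees off that line's slope.
// turnSign: +1 for two left turns, -1 for two right turns.
double case1TouchAngle(Vec2 p1, Vec2 p2, double turnSign) {
    return slope(p1, p2) + turnSign * (kPi / 2.0);
}

// Case 2 (opposite turning directions): the path line is offset from the
// center line by the angle whose sine is 2r/d. Returns NAN when the
// circles are too close for a crossing tangent to exist.
double case2PathAngle(Vec2 p1, Vec2 p2, double r, double turnSign) {
    double d = dist(p1, p2);
    if (d < 2.0 * r) return NAN;
    return slope(p1, p2) + turnSign * std::asin(2.0 * r / d);
}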

Note that we can now use the simple smoothing algorithm presented earlier with curved paths, with
just a slight modification to the Walkable(pointA, pointB) function. Instead of point-sampling in a
straight line between pointA and pointB, the new Walkable(pointA, directionA, pointB,
directionB) function samples intermediate points along a valid curve between A and B given the initial and final directions.
Discrete and nondiscrete positions and directions. Some readers may be concerned at this point,
since it seems that our algorithm is dependent on movement always starting at exactly the center
position of a tile, and at exactly one of eight compass directions. In real games, a character may be in
the middle of walking between two tiles at the exact moment we need it to change direction. In fact,
we can easily modify the algorithm so that whenever the origin node is the starting point of the
search, we do the curve computations based on the true precise position and angle of the character's
starting point. This eliminates the restriction.

Nonetheless, the algorithm still requires that the waypoints are at the center of tiles and at exact
compass directions. These restrictions can seemingly cause problems where a valid path may not be
found. The case of tile-centering is discussed in more detail below. The problem of rounded compass
directions, however, is in fact very minimal and will almost never restrict a valid path. It may cause
visible turns to be a bit more exaggerated, but this effect is very slight.
Expanded searching to surrounding tiles. So far in this discussion, we have assumed that at every
node, you check the surrounding eight locations as neighbors. We call this a Directional-8 search. As
mentioned in the preceding paragraph, there are times when this is restrictive. For example, the
search shown in Figure 12 will fail for a Directional-8 search, because given a wide turning radius for the ship, it would be impossible to traverse a -> b -> c -> d without hitting blocking tiles. Instead, it is
necessary to find a curve directly from a -> d.

FIGURE 12. A legal curved path which cannot be found by the Directional-8 algorithm.

Accomplishing this requires searching not just the surrounding eight tiles, which are one tile away, but
the surrounding 24 tiles, which are two tiles away. We call this a Directional-24 search, and it was
such a search that produced the valid path shown in Figure 12. We can even search three tiles away
for a Directional-48 search. The main problem with these extended searches is computation time. A
node in a Directional-8 search has 8 x 8 = 64 child nodes, but a node in a Directional-24 search has 24
x 8 = 192 child nodes.
A small optimization we can do is to set up a directional table to tell us the relative position of a child
given a simple index. For example, in a Directional-48 search, we loop through directions 0 -> 47, and
a sample table entry would be:
DirTable[47] = <-3,+3>.
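Building the table is straightforward. A sketch, assuming the Directional-48 neighborhood is simply every offset within three tiles of the origin (7 x 7 - 1 = 48 entries) and leaving the exact ordering arbitrary:

#include <utility>
#include <vector>

// Relative offsets of all children for a Directional search:
// range 1 yields the 8 neighbors, range 2 the 24, range 3 the 48.
std::vector< std::pair<int,int> > buildDirTable(int range) {
    std::vector< std::pair<int,int> > table;
    for (int dy = -range; dy <= range; ++dy)
        for (int dx = -range; dx <= range; ++dx)
            if (dx != 0 || dy != 0)
                table.push_back(std::make_pair(dx, dy));  // e.g. <-3,+3>
    return table;
}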

The modified heuristic. Our modified A* algorithm will also need a modified heuristic (to estimate
the cost from an intermediate node to the goal). The original A* heuristic typically just measures a
straight-line distance from the current position to the goal position. If we used this, we would end up
equally weighing every compass direction at a given location, which would make the search take
substantially longer in most cases. Instead, we want to favor angles that point toward the goal, while
also taking turning radius into account. To do this, we change the heuristic to be the distance of the shortest curve from the current location and angle to the destination location, as calculated in the
"Adding Realistic Turns" section earlier.
To avoid making this calculation each time, we set up a heuristic table in advance. The table contains
heuristic values for any destination tile within a 10-tile distance (with a granularity of 1/64th tile), and
at any angle relative to the current direction (with a granularity of eight angles.) Any destination tile
beyond 10 tiles is computed with the 10-tile value, plus the difference in actual distance, which turns
out to be a very close approximation. The total data size of the table is thus 640 (distance) x 8
(directions) x 4 (bytes) = 20K. Since the table is dependent on the turn radius of the unit, if that turn
radius changes, we need to recalculate the table.
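A sketch of the lookup, assuming the layout just described (640 distance steps of 1/64 tile by eight relative angles; the names are illustrative):

// 640 distance steps (10 tiles at 1/64-tile granularity) by 8 relative
// angles, 4 bytes each = 20K. Rebuilt whenever the turning radius changes.
static float gCurveHeuristic[640][8];

float curvedDistanceHeuristic(float tilesToGoal, int relAngle /* 0..7 */) {
    if (tilesToGoal >= 10.0f) {
        // Beyond 10 tiles, the 10-tile value plus the leftover straight-line
        // distance is a very close approximation.
        return gCurveHeuristic[639][relAngle] + (tilesToGoal - 10.0f);
    }
    int step = (int)(tilesToGoal * 64.0f);  // index at 1/64-tile granularity
    return gCurveHeuristic[step][relAngle];
}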
Using a hit-check table. The trigonometric calculations described above to determine the path from
one node to another are not trivial and take some computational time. Pair this with the requirement
of "walking" through the resultant path to see if any tiles are hit, and the fact that this whole process
needs to be performed at every possible node combination in the search. The result is a total
computation time that would be absurdly long. Instead, we use a special table which substantially
reduces the computation time. For any given starting direction (8 total), ending direction (8 total), and
ending position (up to 48, for a Directional-48 search), the table stores a 121-bit value. This value
represents an 11x11 grid surrounding the origin, as seen in Figure 13.

FIGURE 13. Illustration of the hit-check table.

Any tiles that would be touched by a unit traveling between those nodes (other than the origin and
destination tiles themselves) will be marked by a "1" in the appropriate bit-field, while all others will be
"0." Then during the search algorithm itself, the table will simply be accessed, and any marked nodes
will result in a check to see if the associated node in the real map is blocked or not. (A blocked node
would then result in a report of failure to travel between those nodes.) Note that the table is dependent
on both the size and turn radius of the unit, so if those values change, the table will need to be
recomputed.
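The lookup side of such a table might be sketched like this, with std::bitset holding the 121-bit masks (the indexing scheme is an assumption):

#include <bitset>

// One 121-bit mask per (start direction, end direction, child offset):
// bit k set means tile k of the 11x11 grid around the origin is touched
// by a unit traveling that curve.
static std::bitset<121> gHitCheck[8][8][48];

// Returns true if the curve from the origin tile to the given child is
// clear of blocked tiles; isBlocked is assumed to query the real map.
bool moveIsClear(int ox, int oy, int startDir, int endDir, int child,
                 bool (*isBlocked)(int x, int y))
{
    const std::bitset<121>& mask = gHitCheck[startDir][endDir][child];
    for (int k = 0; k < 121; ++k) {
        if (!mask[k]) continue;
        int dx = (k % 11) - 5;  // cell index -> offset in [-5, +5]
        int dy = (k / 11) - 5;
        if (isBlocked(ox + dx, oy + dy)) return false;
    }
    return true;
}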
Earlier, I mentioned how the very first position in a path may not be at the precise center location of a
tile or at a precise compass direction. As a result, if we happen to be specifically checking neighbors of
that first tile, the algorithm needs to do a full computation to determine the path, since the table
would not be accurate.
Other options: backing up. Finally, if your units are able to move in reverse, this can easily be
incorporated into the Directional A* algorithm to allow further flexibility in finding a suitable path. In
addition to the eight forward directions, simply add an additional eight reverse directions. The
algorithm will automatically utilize reverse in its path search. Typically units shouldn't be traveling in
reverse half the time though, so you can also add a penalty to the distance and heuristic computations
for traveling in reverse, which will "encourage" units to go in reverse only when necessary.

Correctness of Directional A*
The standard A* algorithm, if used in a strict tile-based world with no turning restrictions, and in a
world where a unit must always be at the center of a tile, is guaranteed to find a solution if one exists.
The Directional A* algorithm on the other hand, when used in a more realistic world with turning
restrictions and nondiscrete positions, is not absolutely guaranteed to find a solution. There are a
couple reasons for this.
Earlier we saw how the Directional-8 algorithm could occasionally miss a valid path, and this was
illustrated in Figure 12. The conclusion was to use a Directional-24 search or even a Directional-48
search. However, in very rare circumstances, the same problem could occur with a Directional-48
search. We could extend even further to a Directional-80 search, but at that point the computation time required would probably be too high.
The other problem is that the shortest legal curved path between two points, which we compute in our
algorithm, is not the only legal curved path. For one thing, there are the four possible paths shown in
Figure 9. Our algorithm simply picks the shortest and assumes that is the correct one. Yet possibly
that one path may fail, while one of the other three may have succeeded. (Though when I tried to
fabricate such a condition, it proved almost impossible. Another route was always found by the
algorithm.) Furthermore, the four paths shown in that figure are not the only legal paths, either. There
are theoretically an infinite number of paths that twist and turn in many different ways.
In practice, though, it is very rare for the Directional-24 search to fail to find a valid path. And it is
almost impossible for the Directional-48 search to fail.

Fixed-Angle Character Art


Until now, the discussion has assumed that units can face any direction: 27 degrees, 29 degrees, 133
degrees, and so on. However, certain games which do not use real-time 3D art do not have this
flexibility. A unit may be prerendered in only eight or 16 different angles. Fortunately, we can deal
with this without too much trouble. In fact, the accompanying test program includes an option of
16-angle fixed art, to illustrate the process.
The trivial way of dealing with the problem is to do all of the calculations assuming continuous angles,
and then when rendering to the screen, simply round to the nearest legal direction (for example, a
multiple of 22.5 degrees for 16-angle art) and draw the appropriate frame. Unfortunately, this usually
results in a visible "sliding" effect for the unit, which typically is unacceptable.

FIGURE 14. Turning with fixed-angle character art.

What we really want is a solution which can modify a continuous path, like the one shown at the
bottom of Figure 14, and create a very similar path using just discrete lines, as shown in the top of the
figure.
The solution involves two steps. First, for all arcs in the path, follow the original circle as closely as
possible, but staying just outside of it. Second, for straight lines in the path, create the closest
approximation using two line segments of legal angles. These are both illustrated in Figure 15. For the
purposes of studying the figure, we have allowed only eight legal directions, though this can easily be
extended to 16.
On the left, we see that we can divide the circle into 16 equivalent right triangles. Going one direction
on the outside of the circle (for example, northeast) involves traversing the bases of two of these
triangles. Each triangle has one leg which has the length of the radius, and the inside angle is simply π/8 radians, or 22.5 degrees. Thus the base of each triangle is simply:
base = r * tan(22.5)

We can then extrapolate any point on the original arc (for example, 51.2 degrees) onto the point
where it hits one of the triangles we have just identified. Knowing these relationships, we can then
calculate (with some additional work) the starting and ending point on the modified arc, and the total
distance in between.


FIGURE 15. Geometry of fixed-angle turning.

For a straight line, we simply find the two lines of the closest legal angles, for example 0 degrees and
45 degrees, and determine where they touch. As shown in the figure, there are actually two such
routes (one above the original line and another below), but in our sample program we always just pick
one. Using basic slope and intercept relationships, we simply calculate the intersection of the two lines to determine where to change direction.
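A small C++ sketch of that intersection, assuming eight legal angles at multiples of 45 degrees and always picking the route below the original line (names are mine):

#include <cmath>

struct Vec2 { double x, y; };

const double kPi = 3.14159265358979323846;

// Replace the straight segment a->b with two legs at the legal angles just
// below and just above its true slope; returns the corner point where the
// direction change occurs.
Vec2 twoLegCorner(Vec2 a, Vec2 b) {
    const double step = kPi / 4.0;               // 45 degrees
    double phi = std::atan2(b.y - a.y, b.x - a.x);
    double lo  = std::floor(phi / step) * step;  // legal angle <= phi
    double hi  = lo + step;                      // legal angle >= phi
    // Solve a + t1*(cos lo, sin lo) + t2*(cos hi, sin hi) = b for t1.
    double dx = b.x - a.x, dy = b.y - a.y;
    double det = std::cos(lo) * std::sin(hi) - std::sin(lo) * std::cos(hi);
    double t1  = (dx * std::sin(hi) - dy * std::cos(hi)) / det;
    Vec2 corner = { a.x + t1 * std::cos(lo), a.y + t1 * std::sin(lo) };
    return corner;
}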

Note that we still use the "line segment" storage method introduced earlier to store the modified path
for fixed character art. In fact, this is why I said earlier we would need up to four line segments. The
starting and ending arc remain one line segment each (and we determine the precise position of the
unit on the modified "arc" while it is actually moving), but the initial straight line segment between the
two arcs now becomes two distinct straight line segments.

The Road Problem


The approach we have taken thus far is to find the shortest curve possible between any two points in a
path. So if a unit is headed due east to a point p (and thus is pointing east when it hits p), and then
needs to go due north for five tiles to hit a point q, the unit will first need to turn left for approximately
105 degrees of its turning circle, and then head at an approximate direction of north-northwest until it
arrives at point q. Note that we could alternatively have defined the path to turn substantially further
around the circle and then travel due north, but that would have been a longer path. See the vertical
path portions of Figure 16 for an illustration.

FIGURE 16. (a) Standard cornering, and (b) Modified tight cornering for roads.

At certain times, even though it is longer, the path in Figure 16b may be what is desired. This most
often occurs when units are supposed to be traveling on roads. It simply is not realistic for a vehicle to
drive diagonally across a road just to save a few feet in total distance.
There are a few ways of achieving this.
1. When on roads, make sure to do only a regular A* search or a Directional-8 search, and do not apply any smoothing algorithm afterwards. This will force the unit to go to the adjacent tile. However, this will only work if the turning radius is small enough to allow such a tight turn. Otherwise, the algorithm will find an adjacent tile which is off the road.
2. Temporarily disallow movement to any off-road tile. This has the same constraints as the above
method.
3. Same as (1), but for units that have too wide a turning radius to turn into an adjacent road tile,
do a Directional-24 or Directional-48 search as appropriate. For example, the unit shown in
Figure 16b apparently requires two tiles to make a 90-degree turn, so a Directional-24 search
would be appropriate.
4. Determine the number of tiles needed for a turn (for example two tiles, as in the figure), and
temporarily place blocking tiles adjacent to the road after that number of tiles has gone by,
beyond every turn. These temporary blockers are in fact displayed in Figure 16b. This method is
analogous to placing "cones" by the road.

A Better Smoothing Pass
The smoothing algorithm given earlier is less than ideal when used by itself. There are two reasons for this. Figure 17 demonstrates the first problem. The algorithm stops at point q and looks ahead to see how many nodes it can skip while still conducting a legal move. It makes it to point r, but fails to allow a move from q to s because of the blocker near q. Therefore it simply starts again at r and skips to the destination. What we'd really like to see is a change of direction at p, which cuts diagonally to the final destination, as shown with the dashed line.

FIGURE 17. One shortcoming of the simple smoothing algorithm.

The second problem exhibits itself only when we have created a path using the simple
(non-directional) method, and is demonstrated by the green line in Figure 18. The algorithm moves
forward linearly, keeping the direction of the ship pointing straight up, and stops at point p. Looking
ahead to the next point (q), it sees that the turning radius makes the turn impossible. The smoothing
algorithm then proceeds to "cheat" and simply allow the turn. However, had it approached p from a
diagonal, it could have made the turn legally as evidenced by the blue line.
To fix these problems, we introduce a new pre-smoothing pass that will be executed after the A*
search process, but prior to the simple smoothing algorithm described earlier. This pass is actually a
very fast version of our Directional-48 algorithm, with the difference that we only allow nodes to move
along the path we previously found in the A* search, but we consider the neighbors of any node to be
those waypoints which were one, two, or three tiles ahead in the original path. We also modify the
cost heuristic to favor the direction of the original path (as opposed to the direction toward the goal).
The algorithm will automatically search through various orientations at each waypoint, and various
combinations of hopping in two- or three-tile steps, to find the best way to reach the goal.


FIGURE 18. Another shortcoming: the simple smoothing algorithm is unable to find and execute a turn within the legal turning radius.

Because this algorithm sticks to tiles along the previous path, it runs fairly quickly, while also allowing
us to gain many of the benefits of a Directional-48 search. For example, it will find the legal blue line
path shown in Figure 18. Of course it is not perfect, as it still will not find paths that are only visible to
a full Directional-48 search, as seen in Figure 19.
The original, nondirectional search finds the green path, which executes illegal turns. There are no
legal ways to perform those turns while still staying on the path. The only way to arrive at the
destination legally is via a completely different path, as shown with the blue line. This pre-smoothing
algorithm cannot find that path: it can only be found using a true Directional search, or by one of the
hybrid methods described later. So the pre-smoothing algorithm fails under this condition. Under such
a failure condition, and especially when the illegal move occurs near the destination, the
pre-smoothing algorithm may require far more computation time than we desire, because it will search
back through every combination of directional nodes along the entire path. To help alleviate this and
improve performance, we add an additional feature such that once the pre-smoothing algorithm has
reached any point p along the path, if it ever searches back to a point that is six or more points prior
to p in the path, it will fail automatically.

FIGURE 19. The blue line shows the only truly legal path, which the pre-smoothing algorithm can't find, but the Directional search can.

Path Failure and Timeslicing


Depending on the pathfinding method chosen, it is possible for failure to be reported either when there
is truly no possible path, or when the chosen solution simply has not found the path (which is more
likely to occur when utilizing fast, informal methods.) What to do in the case of failure is entirely
dependent on the specifics of the game. Typically, it simply means that the current goal -- a food
source, ammunition depot, or enemy base -- is not attainable from the current position, and the unit
must choose a different goal. However, it is possible that in certain circumstances we know the goal is
achievable, and it is important to find it for gameplay. In these cases we might have started with a
faster search method, but if that fails, we can proceed from scratch with a slower, methodical search,
such as the standard Directional-24.
The key problem is that A* and its derivative methods of pathfinding perform very poorly if pathfinding
fails. To alleviate this problem, it is important to minimize failures. This can be done by dividing up the
map into small regions in advance (let's say 1,000 total), and precomputing whether it's possible,
given two regions ra and rb, to get from some tile in ra to some tile in rb. We need only one bit to
store this, so in our example this uses a 128K table. Then before executing any pathfind, we first
check the table. If travel between the regions is impossible, we immediately report failure. Otherwise,
it is probably possible to get from our specific source tile to our specific destination tile, so we proceed
with the pathfinding algorithm.
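A sketch of such a region table (one bit per ordered region pair; the class is my illustration):

#include <cstdint>
#include <vector>

// One bit per (region, region) pair: can any tile of rb be reached from ra?
// With about 1,000 regions this is roughly a 128K table, built once from
// the map (and rebuilt if the terrain changes).
class RegionReachability {
public:
    explicit RegionReachability(int numRegions)
        : n_(numRegions), bits_((numRegions * numRegions + 7) / 8, 0) {}

    void setReachable(int ra, int rb) {
        int i = ra * n_ + rb;
        bits_[i >> 3] |= uint8_t(1u << (i & 7));
    }
    bool reachable(int ra, int rb) const {
        int i = ra * n_ + rb;
        return ((bits_[i >> 3] >> (i & 7)) & 1u) != 0;
    }

private:
    int n_;
    std::vector<uint8_t> bits_;
};

Before any pathfind, a single reachable(regionOf(src), regionOf(dst)) call (regionOf being whatever maps a tile to its precomputed region) either reports failure immediately or lets the search proceed.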
In the next section, I'll discuss the time performance for the different algorithms presented here, and
how to mix and match techniques to achieve faster times. Note that it is possible to timeslice the
search algorithm so that it can be interrupted and restarted several times, and thereby take place over
a number of frames, in order to minimize overall performance degradation due to a slow search. The
optimized algorithm presented earlier uses a fixed matrix to keep track of the intermediate results of a
search, so unfortunately this means that any other units requiring a pathfinding search during that
time will be "starved" until the previous pathing is complete. To help alleviate this, we can instead
allocate two matrices, one for the occasional slow search that takes several timesliced frames and the
other for interspersed fast searches. Only the slow searches, then, will need to pause (very briefly) to
complete. In fact, it is actually quite reasonable that a unit "stop and think" for a moment to figure out
a particularly difficult path.

Performance and Hybrid Solutions


In this section, I'll discuss performance of many of the techniques presented earlier and how to utilize
that knowledge to choose the best algorithms for a particular application.
The major observation from testing the various techniques presented here was that the performance of the Directional algorithm is slow, probably too slow for many applications. Perhaps most units in a
particular game can utilize the simple A* algorithm with smoothing passes, while a particular few large
units could utilize a Directional algorithm. (It is nice to note, however, that only a few years ago,
system performance would have prohibited implementation of anything but the simplest A* algorithm,
and perhaps a few years from now, the performance issues discussed here will not be significant.)
Knowing the performance limitations, there are also hybrid
solutions that can be devised which are often as fast as a
simple A* algorithm (with smoothing), but also utilize some
of the power of a formal Directional search. One excellent
such solution, which is incorporated into the sample
program provided, is presented below. (Note that this
particular solution will not be useful if a unit is greater than
one tile wide.)
1. Start by performing a simple A* search, only
searching the surrounding four tiles at every node. If
this search fails, report failure. (The result of such a
search is illustrated in Figure 20a.)
2. Perform the Smoothing-48 pass. If it is successful,
skip to (6). Otherwise, check the search matrix to see
what was the farthest node hit, and continue to (3).
(This step is illustrated by the brown path in Figure
20b.)
3. Perform another Smoothing-48 pass, this time
starting from the destination and working backward.
This one will also fail, so check the search matrix to
see the farthest node hit. (This step is illustrated by
the green path in Figure 20b.)
4. Look at the failure points from both smoothing passes
(points a and b in the figure.) This is the section that
needs to be searched for an alternate route. If the
points are more than 12 tiles distant in the X or Y directions, report failure immediately. Otherwise, step
away one tile at a time in each smoothing list (the
one starting from the origin and the one starting from
the destination) until the points are approximately
(but not more than) 12 tiles distant from one another.
(Note that this step is not performed in Figure 20.)
5. Perform the Directional-8 algorithm between the two
points determined above. If the search fails, report
failure. (This step is illustrated by the blue line in
Figure 20b.) Otherwise, attach the three path
segments found and proceed to (6).
6. Perform the final (simple) smoothing algorithm on the
resultant path. (The resultant path is shown in Figure
20c.)

For most searches (99 percent in some applications, less in others, depending on the terrain layout), this hybrid
solution will run exactly as fast as a simple A* algorithm
with Smoothing-48 and simple smoothing. In the other rare
cases, it will require the additional time necessary to do a
(difficult) Directional-8 search on points which are 12 tiles
apart. As discussed earlier, those cases can be timesliced
over multiple frames.
Note that the circuitous route required for the path in Figure
20 is somewhat complex, and the search required around
200ms when tested. Figure 21a shows a simpler version
that only required 85ms. Finally, by simply adding one
blocking tile, as shown in Figure 21b, the time is reduced to
under 6ms, because the initial A* search was able to find a
valid route and was not "misled" into an impossible section
of terrain.

Speed and Other Movement Restrictions


In this article I've focused primarily on turning radius as the
main movement restriction and how to deal with it. There
are in fact other movement restrictions that are handled by
the A* algorithm, either automatically or in conjunction with
the methods described here.

FIGURE 20 (a, b and c from top to bottom). The (fast) hybrid Directional A* pathfinding technique.
FIGURE 21. Faster hybrid paths.

The standard A* algorithm allows for tiles to have a variety of costs. For example, movement on sand is much more "expensive" than movement over pavement, so the algorithm will favor paths on pavement even if the
total distance may be longer. This type of terrain costing is fully supported by the Directional A*
algorithm and other techniques listed here.
Speed restrictions are a trickier issue. For the map in Figure 22, the algorithms presented here will
choose the green path, because it has the shortest distance. However, for a vehicle with slow
acceleration/deceleration times, the blue path would be faster, because it only requires two turns
instead of eight, and has long stretches for going at high speeds.

The formal way to attack this problem would be to add yet another dimension to the search space. For
the Directional algorithm, we added current direction as a third dimension. We could theoretically add current speed as a fourth dimension (rounding to a total of eight or 10 approximate speeds.) When
moving from one node to another, we would have to check whether the increase or decrease in speed
would be possible given the vehicle's acceleration or braking capability, and any turning which is in
progress. Of course, this increases the search space dramatically and will hurt performance quite a bit.
The simplest way of incorporating speed as a factor, though not the most precise, is simply to modify
the costs in the Directional search so that any turns are "charged" extra. This will penalize turning, due
to the reductions in speed necessary to make turns. Unfortunately, this is only effective in
Directional-24 or Directional-48 searches. A Directional-8 search yields lots of extraneous turns which
are later dealt with by the smoothing pass, but since the penalties proposed here would occur during
the main search phase, the accuracy could suffer quite a bit.
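The cost change itself is tiny; a sketch, with the penalty constant being a per-game tuning assumption:

// Directional-search move cost with a turning surcharge: the curve length
// plus a penalty proportional to how many radians the unit must turn.
double turnPenalizedCost(double curveLength, double radiansTurned) {
    const double kTurnPenalty = 2.0;  // cost units per radian; tune per game
    return curveLength + kTurnPenalty * radiansTurned;
}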

FIGURE 22. Illustration of the speed restriction problem.

People Movement and Friendly-Collision Avoidance


Fluid turning radius. The tables used to optimize the Directional algorithms, discussed earlier, are
based on a fixed turning radius and unit size. In the sample program provided, if the turning radius or
unit size is changed, the tables are recalculated. However, as mentioned earlier in the "Basic Methods"
section, some units may have a more fluid turning radius that depends on their speed or other factors.
This is especially true of "people" units, which can easily slow down to make a tighter turn.
Resolving a path under these circumstances can become increasingly complex. In addition to the
requirement for much more memory for tables (covering a range of turning radii), a formal search
algorithm would in fact need to track an additional speed dimension, and take acceleration into account when determining on-the-fly turning ability and resultant speed.
Instead, it is much simpler and more efficient to either:
a. Do a standard A* search and subsequently apply decelerations as appropriate before turns, as described in the "Basic Methods" section; or
b. Set a very tight turning radius and do a Directional search while penalizing significantly for turns.
This will have the result of favoring solutions that don't require overly tight turns, but still
allowing such solutions.

Friendly-collision avoidance. Units which are friendly to one another typically need some method of
avoiding collisions and continuing toward a goal destination. One effective method is as follows: Every
half-second or so, make a quick map of which tiles each unit would hit over the next two seconds if
they continued on their current course. Each unit then "looks" to see whether it will collide with any
other unit. If so, it immediately begins decelerating, and plans a new route that avoids the problem
tile. (It can start accelerating again once the paths no longer cross.) Ideally, all units will favor
movement to the right side, so that units facing each other won't keep hopping back to the left and
right (as we often do in life). Still, units may come close to colliding and need to be smart enough to
stop, yield to the right, back up a step if there's not enough room to pass, and so on.
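One way to sketch the shared reservation map (the half-second refresh and two-second lookahead follow the text; the rest is illustrative):

#include <algorithm>
#include <vector>

// Rebuilt every half-second: each unit stamps the tiles it will cross over
// the next two seconds. A unit finding another unit's stamp on one of its
// own upcoming tiles decelerates and plans a route around that tile.
struct ReservationMap {
    int width, height;
    std::vector<int> owner;  // unit id holding each tile, or -1 if free

    ReservationMap(int w, int h) : width(w), height(h), owner(w * h, -1) {}

    void clear() { std::fill(owner.begin(), owner.end(), -1); }

    // Stamp a tile; returns the id of a conflicting unit, or -1 if free.
    int reserve(int x, int y, int unitId) {
        int& slot = owner[y * width + x];
        if (slot != -1 && slot != unitId) return slot;
        slot = unitId;
        return -1;
    }
};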

Final Notes
This article has made some simplifying assumptions to help describe the search methods presented.
First, all searches shown have been in 2D space. Most games still use 2D searches, since the third
dimension is often inaccessible to characters, or may be a slight variation (such as jumping) that would not affect the search. All examples used here have also utilized simple grid partitioning, though
many games use more sophisticated 2D world partitioning such as quadtrees or convex polygons.
Some games definitely do require a true search of 3D space. This can be accomplished in a fairly
straightforward manner by adding height as another dimension to the search, though that typically
makes the search space grow impossibly large. More efficient 3D world partitioning techniques exist,
such as navigation meshes. Regardless of the partitioning method used, though, the pathfinding and
smoothing techniques presented here can be applied with some minor modifications.

The algorithms presented in this article are only partially optimized. They can potentially be sped up
further through various techniques. There is the possibility of more and better use of tables, perhaps
even eliminating trigonometric functions and replacing them with lookups. Also, the majority of time
spent in the Directional algorithm is in the inner loop which checks for blocking tiles which may have
been hit. An optimization of that section of code could potentially double the performance. Finally, the
heuristics used in the Directional algorithm and the Smoothing-48 pass could potentially be revised to
find solutions substantially faster, or at least tweaked for specific games.
Pathfinding is a complex problem which requires further study and refinement. Clearly not all
questions are adequately resolved. One critical issue at the moment is performance. I am confident
that some readers will find faster implementations of the techniques presented here, and probably
faster techniques as well. I look forward to this growth in the field.

For More Information


Sample application and source
Download (104K)

Game Developer magazine

Pottinger, Dave C. "Coordinated Unit Movement" (January 1999).


http://www.gamasutra.com/features/19990122/movement_01.htm

Pottinger, Dave C. "Implementing Coordinated Movement" (February 1999).


http://www.gamasutra.com/features/19990129/implementing_01.htm

Pottinger, Dave C. "The Future of Game AI" (August 2000).


http://www.gamasutra.com/features/20001108/laird_01.htm

Stout, W. Bryan. "Smart Moves: Intelligent Pathfinding" (October/November 1996).


http://www.gamasutra.com/features/19970801/pathfinding.htm

Web sites

Steven Woodcock's Game AI Page


http://www.gameai.com

Books

Game Programming Gems (Charles River Media, 2000)


Refer to chapters 3.3, 3.4, 3.5, and 3.6.