The Chinese University of Hong Kong

Fall 2009

Automata theory
and formal languages

Andrej Bogdanov
http://www.cse.cuhk.edu.hk/~andrejb/csc3130

What are computers good at?


In 1997, IBM's Deep Blue
defeated world chess
champion Garry
Kasparov in a six-game
match.

What are computers good at?

The search engine Google indexes
2,000,000,000 web pages.
It lets you find pretty much anything
you want.

What else?

Recommend
books

Fly
airplanes

Is there anything a computer cannot do?

Impossibilities

Why do we care about the impossible?

Perpetual motion
In the Middle Ages, people
wanted a machine that runs without
consuming any energy.
Later, discoveries in physics
showed that energy cannot be
created out of thin air.
Perpetual motion is a futile
endeavor.

Understanding the impossible helps us
channel our energies towards the
more useful.

The laws of computation


Just like the laws of physics tell us
what is (im)possible for nature to do...

...the laws of computation tell us
what is (im)possible for computers.

Automata theory
Automata theory studies the laws of
computation.

In reality, the laws of computation are not quite
understood, but automata theory is a good start.

A simple computer
[Figure: circuit with a BATTERY and a SWITCH]

input: switch
output: light bulb
actions: flip switch
states: on, off

A simple computer
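[Figure: circuit with a BATTERY and a SWITCH]
[Diagram: two states, off (start) and on; each f moves from one state to the other]

input: switch
output: light bulb
actions: f for "flip switch"
states: on, off

The bulb is on if and only if there was an odd
number of flips.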
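The behaviour of this two-state device can be sketched in a few lines of Python. This is only an illustration: the function name `bulb_is_on` and the encoding of the flips as a string of `f` characters are assumptions made for this sketch, not part of the slides.

```python
def bulb_is_on(flips: str) -> bool:
    """Simulate the switch: start in 'off' and toggle on every 'f'."""
    state = "off"
    for action in flips:
        if action != "f":
            raise ValueError(f"unknown action: {action!r}")
        # The only action is a flip, which moves to the other state.
        state = "on" if state == "off" else "off"
    return state == "on"


for flips in ["", "f", "ff", "fff"]:
    print(repr(flips), bulb_is_on(flips))
```

The device needs to remember only one of two states, no matter how long the sequence of flips is.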

Another computer
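[Figure: circuit with a BATTERY and two switches, 1 and 2]
[Diagram: four states, three labeled off (including the start state) and one labeled on, with transitions labeled 1 and 2]

inputs: switches 1 and 2
actions: 1 for "flip switch 1"
actions: 2 for "flip switch 2"

The bulb is on if and only if both switches were
flipped an odd number of times.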
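A minimal sketch of the two-switch device, assuming the actions arrive as a string over the characters `1` and `2`. The function name `both_flipped_odd` is hypothetical. The pair of booleans plays the role of the four states.

```python
def both_flipped_odd(actions: str) -> bool:
    """Return True if the bulb is on after the given sequence of flips."""
    # The state is a pair: has switch 1 (resp. 2) been flipped an odd
    # number of times so far? Two booleans give four possible states.
    odd1, odd2 = False, False
    for action in actions:
        if action == "1":
            odd1 = not odd1
        elif action == "2":
            odd2 = not odd2
        else:
            raise ValueError(f"unknown action: {action!r}")
    return odd1 and odd2


print(both_flipped_odd("12"))    # each switch flipped once
print(both_flipped_odd("1122"))  # each switch flipped twice
```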

A design problem
[Figure: circuit with a BATTERY and five switches, numbered 1 to 5]

Design a circuit where the light is on only
when all switches are flipped the same
number of times

A design problem
Such devices are difficult to reason about,
because they can be designed in an infinite
number of ways

[Diagram: the two-state on/off device, with transitions labeled f]

By representing them as abstract
computational devices, or automata, we will
learn how to answer such questions

These devices can model many things

They can describe the operation of any
"small" computer, like the control
component of an alarm clock or a microwave
They are also used in lexical analyzers to
recognize well-formed expressions in
programming languages:
ab1 is a legal name of a variable in C
5u= is not
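As a rough illustration of the lexical-analyzer use, the check for variable names can be written as a three-state device. This sketch assumes the usual rule that a C identifier starts with a letter or underscore and continues with letters, digits, or underscores, and it ignores reserved keywords. The state names and the function `is_c_identifier` are made up for the example.

```python
import string

LETTERS = set(string.ascii_letters + "_")
DIGITS = set(string.digits)


def is_c_identifier(text: str) -> bool:
    """Run a small automaton with states 'start', 'name' and 'reject'."""
    state = "start"
    for ch in text:
        if state == "start":
            # The first character must be a letter or an underscore.
            state = "name" if ch in LETTERS else "reject"
        elif state == "name":
            # Later characters may also be digits.
            state = "name" if (ch in LETTERS or ch in DIGITS) else "reject"
        # 'reject' is a dead state: once entered, it is never left.
    return state == "name"


print(is_c_identifier("ab1"))  # True
print(is_c_identifier("5u="))  # False
```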

Different kinds of automata

This was only one example of a
computational device, and there are others
We will look at different devices, and look
at these kinds of questions:
What kinds of problems can a given type of
device solve?
What things are impossible for this kind of
device?
Is one type of device more powerful than
another?

Some devices we will see


finite automata:
Devices with a finite amount of memory.
Used to model "small" computers.

push-down automata:
Devices with infinite memory that can be accessed in a restricted way.
Used to model parsers, etc.

Turing Machines:
Devices with infinite memory.
Used to model any computer.

time-bounded Turing Machines:
Infinite memory, but bounded running time.
Used to model any computer program that runs in a reasonable
amount of time.

Some highlights of the course


Finite automata
We will understand what kinds of things a device
with finite memory can do, and what it cannot do
Introduce simulation: the ability of one device to
imitate another device
Introduce nondeterminism: the ability of a device
to make arbitrary choices

Push-down automata
These devices are related to grammars, which
describe the structure of programming (and
natural) languages

Some highlights of the course


Turing Machines
This is a general model of a computer, capturing
anything we could ever hope to compute
But there are many things that computers cannot do:

Given the code of a computer program, can you
tell if it prints "banana"?
#include <stdio.h>
main(t,_,a)char *a;{return!0<t?t<3?main(-79,-13,a+main(-87,1-_,
main(-86,0,a+1)+a)):1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
main(2,_+1,"%s %d %d\n"):9:16:t<0?t<-72?main(_,t,
"@n'+,#'/*{}w+/w#cdnr/+,{}r/*de}+,/*{*+,/w{%+,/w#q#n+,/#{l,+,/n{n+,/+#n+,/#\
;#q#n+,/+k#;*+,/'r :'d*'3,}{w+K w'K:'+}e#';dq#'l \
q#'+d'K#!/+k#;q#'r}eKK#}w'r}eKK{nl]'/#;#q#n'){)#}w'){){nl]'/+#n';d}rw' i;# \
){nl]!/n{n#'; r{#w'r nc{nl]'/#{l,+'K {rw' iK{;[{nl]'/w#q#n'wk nw' \
iwk{KK{nl]!/w{%'l##w#' i; :{nl]'/*{q#'ld;r'}{nlwb!/*de}'c \
;;{nl'-{}rw]'/+,}##'*}#nc,',#nw]'/+kd'+e}+;#'rdq#w! nr'/ ') }+}{rl#'{n' ')# \
}'+}##(!!/")
:t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a=='/')+t,_,a+1)
:0<t?main(2,2,"%s"):*a=='/'||main(0,main(-61,*a,
"!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:\nuwloca-O;m .vpbks,fxntdCeghiry"),a+1);}

banana

Some highlights of the course


Time-bounded Turing Machines
Many problems are possible to solve on a computer
in principle, but take too much time in practice
Traveling salesman: Given a list of cities, find the
shortest way to visit them and come back home
[Figure: map showing Beijing, Chengdu, Xian, Shanghai, Guangzhou and Hong Kong]

Easy in principle: Try the cities in every possible
order
Hard in practice: For 100 cities there are 100! (about 9 × 10^157)
possible orders, so this would take far more than
100 years even on the fastest computer!
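The "try every possible order" approach can be sketched as below. The four points and their coordinates are invented for this example (they are not real city locations), and the names `POINTS`, `tour_length` and `shortest_tour` are hypothetical.

```python
from itertools import permutations
from math import dist, factorial

# Hypothetical coordinates, made up for this sketch (a 4 x 3 rectangle).
POINTS = {"A": (0, 0), "B": (0, 3), "C": (4, 3), "D": (4, 0)}


def tour_length(order):
    """Total length of the round trip that visits the points in this order."""
    legs = zip(order, order[1:] + order[:1])  # the last leg returns home
    return sum(dist(POINTS[p], POINTS[q]) for p, q in legs)


def shortest_tour(home="A"):
    """Brute force: try every ordering of the points other than home."""
    others = [p for p in POINTS if p != home]
    tours = ((home,) + rest for rest in permutations(others))
    return min(tours, key=tour_length)


best = shortest_tour()
print(best, tour_length(best))
# Number of orderings of 100 cities, to show how fast the search space grows.
print(factorial(100))
```

With 4 points only 6 orderings are examined. The count grows factorially, which is why the same code is hopeless for 100 cities.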

Preliminaries of automata theory


How do we formalize the question
"Can device A solve problem B?"
First, we need a way of describing the
problems that we are interested in solving

Problems
Examples of problems we will consider

Given a word s, does it contain the subword "fool"?
Given a number n, is it divisible by 7?
Given a pair of words s and t, are they the same?
Given an expression with brackets, e.g. (()()), does
every left bracket match with a subsequent right
bracket?

All of these have "yes/no" answers.
There are other types of problems, like "Find this"
or "How many of that", but we won't look at
them.
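Each of these problems is a function from an input to yes or no. As one illustration, here is a sketch of the bracket problem. It assumes the input contains only `(` and `)`, and the name `brackets_match` is hypothetical.

```python
def brackets_match(expr: str) -> bool:
    """Answer yes/no: does every left bracket have a later matching right bracket?"""
    open_count = 0  # left brackets still waiting for a partner
    for ch in expr:
        if ch == "(":
            open_count += 1
        elif ch == ")":
            open_count -= 1
            if open_count < 0:
                # A right bracket appeared with no earlier left bracket.
                return False
        else:
            raise ValueError(f"not a bracket: {ch!r}")
    return open_count == 0


print(brackets_match("(()())"))   # True
print(brackets_match("))()(()"))  # False
```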

Alphabets and strings


A common way to talk about words, numbers,
pairs of words, etc. is by representing them as
strings
To define strings, we start with an alphabet

An alphabet is a finite set of symbols.

Examples

Σ1 = {a, b, c, d, …, z}: the set of letters in English
Σ2 = {0, 1, …, 9}: the set of (base 10) digits
Σ3 = {a, b, …, z, #}: the set of letters plus the
special symbol #
Σ4 = {(, )}: the set of open and closed brackets

Strings
A string over alphabet Σ is a finite sequence
of symbols in Σ.
The empty string will be denoted by ε
Examples
abfbz is a string over Σ1 = {a, b, c, d, …, z}
9021 is a string over Σ2 = {0, 1, …, 9}
ab#bc is a string over Σ3 = {a, b, …, z, #}
))()(() is a string over Σ4 = {(, )}

Languages

A language is a set of strings over an alphabet.

Languages can be used to describe problems
with "yes/no" answers, for example:
L1 = The set of all strings over Σ1 that contain
the substring "fool"
L2 = The set of all strings over Σ2 that are
divisible by 7
= {7, 14, 21, …}
L3 = The set of all strings of the form s#s
where s is any
string over {a, b, …, z}
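Since a language is a set of strings, answering the yes/no problem means testing membership. The three languages can be sketched as membership tests as follows. The function names are hypothetical, and the handling of the string 0 and of leading zeros in L2 is an assumption of this sketch, since the slides do not settle it.

```python
LOWERCASE = set("abcdefghijklmnopqrstuvwxyz")  # the alphabet of letters
DIGITS = set("0123456789")                     # the alphabet of digits


def in_L1(s: str) -> bool:
    """Strings of letters that contain the substring 'fool'."""
    return set(s) <= LOWERCASE and "fool" in s


def in_L2(s: str) -> bool:
    """Nonempty strings of digits whose value is divisible by 7.

    Assumption: leading zeros and the string '0' are accepted.
    """
    return s != "" and set(s) <= DIGITS and int(s) % 7 == 0


def in_L3(s: str) -> bool:
    """Strings of the form s#s, where s is a string of letters."""
    left, sep, right = s.partition("#")
    return sep == "#" and left == right and set(left) <= LOWERCASE


print(in_L1("afoolb"), in_L2("21"), in_L3("ab#ab"))
```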

Finite Automata

Example of a finite automaton


[Diagram: two states, off and on; each f moves from one state to the other]

There are states off and on; the automaton
starts in off and tries to reach the "accept
state" on
What sequences of fs lead to the accept state?
Answer: {f, fff, fffff, …} = {f^n : n is odd}
This is a finite automaton over alphabet {f}

Deterministic finite automata


A deterministic finite automaton (DFA) is a
5-tuple (Q, Σ, δ, q0, F) where

Q is a finite set of states
Σ is an alphabet
δ: Q × Σ → Q is a transition function
q0 ∈ Q is the initial state
F ⊆ Q is a set of accepting states (or final
states).

In diagrams, the accepting states will be
denoted by double circles
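The 5-tuple translates almost directly into a data structure. The sketch below is one possible encoding, not a canonical one: the class name `DFA`, its field names, and the choice of a dictionary keyed by (state, symbol) pairs for δ are all assumptions of this example.

```python
from dataclasses import dataclass


@dataclass
class DFA:
    states: frozenset      # Q
    alphabet: frozenset    # Σ
    delta: dict            # δ, stored as {(state, symbol): next_state}
    start: str             # q0
    accepting: frozenset   # F

    def accepts(self, word) -> bool:
        """Follow one transition per symbol, then check membership in F."""
        state = self.start
        for symbol in word:
            state = self.delta[(state, symbol)]
        return state in self.accepting


# The on/off switch from the earlier slides, written as a 5-tuple.
switch = DFA(
    states=frozenset({"off", "on"}),
    alphabet=frozenset({"f"}),
    delta={("off", "f"): "on", ("on", "f"): "off"},
    start="off",
    accepting=frozenset({"on"}),
)

print(switch.accepts("fff"))  # True
print(switch.accepts("ff"))   # False
```

Because δ is a function on all of Q × Σ, the dictionary has exactly one entry for every (state, symbol) pair.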

Example
[Diagram: DFA with states q0, q1, q2 and transitions on the inputs 0 and 1, as listed in the table]

alphabet Σ = {0, 1}
states Q = {q0, q1, q2}
initial state q0
accepting states F = {q0, q1}

table of transition function δ:

          inputs
states    0     1
q0        q0    q1
q1        q2    q1
q2        q2    q2
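To see how the transition table is used, the sketch below encodes it as a nested dictionary and records the sequence of states visited on an input. The names `DELTA`, `ACCEPTING`, `trace` and `accepts` are chosen for this illustration only.

```python
# Transition table from the slide: DELTA[state][input] = next state.
DELTA = {
    "q0": {"0": "q0", "1": "q1"},
    "q1": {"0": "q2", "1": "q1"},
    "q2": {"0": "q2", "1": "q2"},
}
ACCEPTING = {"q0", "q1"}


def trace(word: str, start: str = "q0") -> list:
    """Return the list of states visited while reading word left to right."""
    visited = [start]
    for symbol in word:
        visited.append(DELTA[visited[-1]][symbol])
    return visited


def accepts(word: str) -> bool:
    return trace(word)[-1] in ACCEPTING


print(trace("0011"), accepts("0011"))
print(trace("0110"), accepts("0110"))
```

Tracing a few inputs this way suggests that the automaton rejects exactly when a 0 appears somewhere after a 1, since that is the only way to enter q2.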

Language of a DFA

The language of a DFA (Q, Σ, δ, q0, F) is the set of
all strings over Σ that, starting from q0 and
following the transitions as the string is read left
to right, will reach some accepting state.

[Diagram: automaton M with states off and on; each f moves from one state to the other]

Language of M is {f, fff, fffff, …} = {f^n : n is odd}

Examples
[Diagrams: example automata over Σ = {0, 1} and over Σ = {a, b}]

What are the languages of these automata?

Examples
Construct a DFA over alphabet {0, 1} that
accepts all strings with at most three 1s

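Answer

State qi means that i 1s have been read so far; q4+ means four or more.

          0      1
q0        q0     q1
q1        q1     q2
q2        q2     q3
q3        q3     q4+
q4+       q4+    q4+

The initial state is q0, and the accepting states are q0, q1, q2, q3.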
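The answer can be checked with a short simulation. In this sketch the integer `ones` stands for the current state (0 to 3 for q0 to q3, and 4 for q4+). The function name is hypothetical.

```python
def accepts_at_most_three_ones(word: str) -> bool:
    """Simulate the five-state DFA: count 1s, but never beyond 4."""
    ones = 0  # state index; 4 stands for "four or more 1s"
    for symbol in word:
        if symbol == "1":
            ones = min(ones + 1, 4)  # q4+ loops to itself
        elif symbol != "0":
            raise ValueError(f"not in the alphabet: {symbol!r}")
        # Reading 0 leaves the state unchanged.
    return ones <= 3


print(accepts_at_most_three_ones("0101010"))  # True
print(accepts_at_most_three_ones("1111"))     # False
```

Capping the counter at 4 is what keeps the memory finite: the device never needs to know exactly how many 1s it has seen beyond the fourth.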

Examples
Construct a DFA that accepts the language
L = {010, 1}          (Σ = {0, 1})

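Answer

[Diagram: a DFA in which the states q0, q01 and q010 are reached after reading 0, 01 and 010, and the state q1 is reached after reading 1; every other transition leads to the dead state qdie, which loops on 0, 1. The accepting states are q010 and q1.]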
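One way to write such an automaton down is to list only the transitions that make progress towards 010 or 1 and send everything else to the dead state. In the sketch below the name `start` for the initial state is an assumption (the extracted diagram does not show its label), as are `DELTA`, `ACCEPTING` and the function name.

```python
# Only the "useful" transitions are listed; all others go to the dead state.
DELTA = {
    ("start", "0"): "q0",
    ("start", "1"): "q1",
    ("q0", "1"): "q01",
    ("q01", "0"): "q010",
}
ACCEPTING = {"q010", "q1"}


def accepts_010_or_1(word: str) -> bool:
    state = "start"
    for symbol in word:
        if symbol not in ("0", "1"):
            raise ValueError(f"not in the alphabet: {symbol!r}")
        # Missing entries (including every move out of qdie) lead to qdie.
        state = DELTA.get((state, symbol), "qdie")
    return state in ACCEPTING


for word in ["010", "1", "01", "0101", ""]:
    print(repr(word), accepts_010_or_1(word))
```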

Examples
Construct a DFA over alphabet {0, 1} that
accepts all strings that end in 101

Hint: The DFA must remember the last 3
bits of the string it is reading

Sketch of answer:

[Diagram: partial sketch of the states and transitions of the DFA]
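Following the hint, a simulation only needs to keep the last 3 bits read. In this sketch the string `last3` plays the role of the state, so the corresponding DFA has one state per bit string of length at most 3 (15 states in total). This is a direct but not necessarily minimal construction, and the function name is hypothetical.

```python
def ends_in_101(word: str) -> bool:
    """Simulate a DFA whose state is the last (up to) 3 bits read."""
    last3 = ""  # initial state: nothing read yet
    for bit in word:
        if bit not in ("0", "1"):
            raise ValueError(f"not in the alphabet: {bit!r}")
        # Transition: append the new bit and forget anything older than 3 bits.
        last3 = (last3 + bit)[-3:]
    return last3 == "101"  # the single accepting state


print(ends_in_101("0101"))  # True
print(ends_in_101("1010"))  # False
```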
