
CPSC 388 Compiler Design and Construction
Scanner: Regular Expressions to DFA
Announcements
ACM Programming contest (Tues 8pm)
PROG 1 Feedback
Linux Install Fest: When? Saturday? Fliers, CD-ROMs, Bring Laptops (do at own risk)
LUG
Understanding Editors (Eclipse, Vi, Emacs)
Scanners
Source Code -> Lexical Analyzer (Scanner) -> Token Stream
Regular Expression -> Nondeterministic Finite State Automata -> Deterministic Finite State Automata
Regular Expressions
An easy way to express a language that is accepted by an FSA
Rules:
ε is a regular expression
Any symbol in Σ is a regular expression
If r and s are any regular expressions then so is:
r|s denotes union, i.e. r or s
rs denotes r followed by s (concatenation)
(r)* denotes concatenation of r with itself zero or more times (Kleene closure)
() used for controlling order of operations

RE to NFA: Step 1
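Create a tree from the Regular Expression
Example: (a(a|b))*
[Figure: syntax tree with * at the root, Cat below it, leaf a and | as the children of Cat, and leaves a and b as the children of |]
Leaf Nodes are either members of Σ or ε
Internal Nodes are operators: cat, |, *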
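Step 1 could be sketched in Python roughly as follows. The function name parse, the nested-tuple node shapes ('sym', 'eps', 'cat', 'or', 'star') and the use of the character ε for the empty string are assumptions of this sketch, not something the slides prescribe.

```python
def parse(regex):
    """Parse a regular expression into a nested-tuple syntax tree.

    Grammar assumed by this sketch (lowest to highest precedence):
        union  -> concat ('|' concat)*
        concat -> star star*
        star   -> atom '*'*
        atom   -> '(' union ')' | symbol
    """
    pos = 0

    def peek():
        return regex[pos] if pos < len(regex) else None

    def union():
        nonlocal pos
        node = concat()
        while peek() == '|':
            pos += 1
            node = ('or', node, concat())
        return node

    def concat():
        node = star()
        # Keep concatenating until '|', ')' or the end of the input.
        while peek() is not None and peek() not in '|)':
            node = ('cat', node, star())
        return node

    def star():
        nonlocal pos
        node = atom()
        while peek() == '*':
            pos += 1
            node = ('star', node)
        return node

    def atom():
        nonlocal pos
        ch = peek()
        if ch is None or ch in '|)*':
            raise ValueError(f"unexpected {ch!r} at position {pos}")
        pos += 1
        if ch == '(':
            node = union()
            if peek() != ')':
                raise ValueError("missing ')'")
            pos += 1
            return node
        return ('eps',) if ch == 'ε' else ('sym', ch)

    tree = union()
    if pos != len(regex):
        raise ValueError(f"unexpected {regex[pos]!r} at position {pos}")
    return tree


tree = parse("(a(a|b))*")
# tree == ('star', ('cat', ('sym', 'a'), ('or', ('sym', 'a'), ('sym', 'b'))))
```

The leaves of the result are 'sym' or 'eps' nodes and every internal node is one of the three operators, matching the tree shape described on the slide.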
RE to NFA: Step 2
Do a Post-Order Traversal of the Tree (children processed before parent)
At each node follow the rules for conversion from an RE to an NFA

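Leaf Nodes
Either ε or a member of Σ
[Figure: the syntax tree for (a(a|b))* again, with the two leaf NFAs: S to F on ε for an ε leaf, and S to F on a for a leaf labeled a]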
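Internal Nodes
Need to keep track of the left (l) and right (r) NFA and merge them into a single NFA
Or
Concatenation
Kleene Closure
Or Node
[Figure: a new start state S with ε-transitions into the NFAs l and r, and ε-transitions from both into a new final state F]
Concatenation Node
[Figure: NFA l followed by NFA r]
Kleene Closure
[Figure: a new start state S and a new final state F connected by ε-transitions to the NFA being repeated]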
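The leaf and internal-node rules can be sketched as one post-order walk over the tree. This is an illustration under stated assumptions: trees are nested tuples with hypothetical node kinds 'sym', 'eps', 'cat', 'or' and 'star'; ε-edges are labeled None; and concatenation links l's final state to r's start state with an ε-edge, which is one common variant and may differ from the exact diagram on the slide. The accepts helper is also hypothetical and only there to check the result.

```python
from itertools import count

EPS = None  # label used for ε-transitions (an assumption of this sketch)


def build_nfa(tree):
    """Return (start, final, edges); edges is a list of (src, label, dst)."""
    fresh = count()   # source of new state numbers
    edges = []

    def visit(node):  # post-order: children first, then the node itself
        kind = node[0]
        if kind in ('sym', 'eps'):
            # Leaf: S --label--> F
            s, f = next(fresh), next(fresh)
            edges.append((s, node[1] if kind == 'sym' else EPS, f))
            return s, f
        if kind == 'cat':
            ls, lf = visit(node[1])
            rs, rf = visit(node[2])
            edges.append((lf, EPS, rs))          # l followed by r
            return ls, rf
        if kind == 'or':
            ls, lf = visit(node[1])
            rs, rf = visit(node[2])
            s, f = next(fresh), next(fresh)
            edges.extend([(s, EPS, ls), (s, EPS, rs),
                          (lf, EPS, f), (rf, EPS, f)])
            return s, f
        if kind == 'star':
            cs, cf = visit(node[1])
            s, f = next(fresh), next(fresh)
            edges.extend([(s, EPS, cs), (cf, EPS, cs),   # enter / repeat
                          (cf, EPS, f), (s, EPS, f)])    # leave / skip
            return s, f
        raise ValueError(f"unknown node kind {kind!r}")

    start, final = visit(tree)
    return start, final, edges


def accepts(nfa, text):
    """Simulate the NFA on text by tracking the set of reachable states."""
    start, final, edges = nfa

    def closure(states):
        seen, stack = set(states), list(states)
        while stack:
            t = stack.pop()
            for src, label, dst in edges:
                if src == t and label is EPS and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen

    current = closure({start})
    for ch in text:
        current = closure({d for s, l, d in edges if s in current and l == ch})
    return final in current


# Tree for the slide's example (a(a|b))*
TREE = ('star', ('cat', ('sym', 'a'), ('or', ('sym', 'a'), ('sym', 'b'))))
NFA = build_nfa(TREE)
```

With this NFA, accepts(NFA, "aaab") holds while accepts(NFA, "aba") does not, since every repetition must be an a followed by a or b.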



Try It
Convert the regular expression to an NFA:
(a|b)*abb
First convert RE to a tree
Then convert tree to NFA
NFA to DFA
Recall that a DFA can be represented as a transition table

        Characters
State   +   -   Digit
S       A   A   B
A               B
B               B
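A table like this can drive a recognizer directly. The sketch below encodes the table as a Python dict. The slide does not say which state is accepting, so ACCEPTING = {'B'} is an assumption (it makes the DFA recognize optionally signed integers), and the helper names char_class and dfa_accepts are hypothetical.

```python
# Transition table from the slide: state -> {character class -> next state}
TABLE = {
    'S': {'+': 'A', '-': 'A', 'digit': 'B'},
    'A': {'digit': 'B'},
    'B': {'digit': 'B'},
}
ACCEPTING = {'B'}  # assumption: not stated on the slide


def char_class(ch):
    """Map an input character to a column of the table."""
    return 'digit' if ch in '0123456789' else ch


def dfa_accepts(text, start='S'):
    """Run the DFA over text: one table lookup per input character."""
    state = start
    for ch in text:
        state = TABLE[state].get(char_class(ch))
        if state is None:      # empty table entry: no transition, reject
            return False
    return state in ACCEPTING
```

Under that assumption, dfa_accepts("+42") and dfa_accepts("123") hold, while dfa_accepts("+") and dfa_accepts("4-2") do not.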
Operations on NFA
ε-closure(t): Set of NFA states reachable from NFA state t on ε-transitions alone.
ε-closure(T): Set of NFA states reachable from some NFA state t in set T on ε-transitions alone.
move(T,a): Set of NFA states to which there is a transition on input symbol a from some state t in T
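These operations might be written as follows. The NFA representation (a dict from (state, label) to a set of next states, with 'ε' as the ε label) and the small example NFA for a|b are assumptions made for illustration.

```python
EPS = 'ε'  # label used for ε-transitions in this sketch


def eps_closure(nfa, states):
    """ε-closure(T): states reachable from any state in T on ε-edges alone.

    ε-closure(t) for a single state t is eps_closure(nfa, {t}).
    """
    closure = set(states)
    stack = list(states)
    while stack:
        t = stack.pop()
        for u in nfa.get((t, EPS), ()):
            if u not in closure:
                closure.add(u)
                stack.append(u)
    return frozenset(closure)


def move(nfa, states, symbol):
    """move(T, a): states reachable on symbol a from some state t in T."""
    return frozenset(u for t in states for u in nfa.get((t, symbol), ()))


# Hypothetical NFA for a|b: 0 is the start state, 5 the final state.
NFA_A_OR_B = {
    (0, EPS): {1, 3},
    (1, 'a'): {2},
    (3, 'b'): {4},
    (2, EPS): {5},
    (4, EPS): {5},
}

start_set = eps_closure(NFA_A_OR_B, {0})          # {0, 1, 3}
after_a = move(NFA_A_OR_B, start_set, 'a')        # {2}
after_a_closed = eps_closure(NFA_A_OR_B, after_a) # {2, 5}
```

Returning frozensets is a design choice of the sketch: sets of NFA states can then be used as dictionary keys when they become DFA states.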
NFA to DFA Algorithm
Initially ε-closure(s) is the only state in the DFA and it is unmarked
while (there is an unmarked state T in the DFA) {
    mark T;
    for (each input symbol a) {
        U = ε-closure(move(T,a));
        if (U is not in the DFA)
            add U as an unmarked state to the DFA;
        transition[T,a] = U;
    }
}
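A runnable sketch of the algorithm, under a few assumptions: the NFA is a dict from (state, label) to a set of next states, 'ε' labels ε-transitions, and the example NFA for a*b is hypothetical. One deliberate deviation from the pseudocode: when U is empty, the sketch records no transition instead of adding an explicit dead state.

```python
EPS = 'ε'


def eps_closure(nfa, states):
    closure, stack = set(states), list(states)
    while stack:
        t = stack.pop()
        for u in nfa.get((t, EPS), ()):
            if u not in closure:
                closure.add(u)
                stack.append(u)
    return frozenset(closure)


def move(nfa, states, symbol):
    return frozenset(u for t in states for u in nfa.get((t, symbol), ()))


def subset_construction(nfa, start, alphabet):
    """Return (start_set, dfa_states, transition) for the equivalent DFA."""
    start_set = eps_closure(nfa, {start})
    dfa_states = [start_set]   # every DFA state discovered so far
    unmarked = [start_set]     # DFA states not yet processed
    transition = {}
    while unmarked:
        T = unmarked.pop()     # taking T off the list marks it
        for a in alphabet:
            U = eps_closure(nfa, move(nfa, T, a))
            if not U:
                continue       # no transition (dead state left implicit)
            if U not in dfa_states:
                dfa_states.append(U)
                unmarked.append(U)
            transition[T, a] = U
    return start_set, dfa_states, transition


# Hypothetical NFA for a*b: 0 is the start state, 4 the final state.
NFA_A_STAR_B = {
    (0, EPS): {1, 3},
    (1, 'a'): {2},
    (2, EPS): {1, 3},
    (3, 'b'): {4},
}

start_set, dfa_states, transition = subset_construction(NFA_A_STAR_B, 0, 'ab')
```

For this input the DFA has three states: {0,1,3}, {1,2,3} and {4}.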
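Try it
Take the NFA from the previous example and construct the DFA
Regular Expression: (a|b)*abb
[Figure: NFA for (a|b)*abb with states S, 1, 2, 3, 4, 5, 6, 7, 8, 9, F; edges labeled a and b for the (a|b)* part, then edges labeled a, b, b leading to F; the remaining edges are ε-transitions]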
Corresponding DFA
DFA state   NFA states
NewS        S,1,2,4,7
B           1,2,3,4,6,7,8
C           1,2,4,5,6,7
D           1,2,4,5,6,7,9
NewF        1,2,4,5,6,7,F
[Figure: the DFA diagram connects these states with five edges labeled a and five edges labeled b]
Start State and Accepting States
The Start State for the DFA is ε-closure(s)
The accepting states in the DFA are those states that contain an accepting state from the NFA
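Both rules can be checked on the (a|b)*abb example. The edge list below is an assumption: it is the textbook construction for this expression, written with the slide's state names, and it reproduces the five state sets listed for the corresponding DFA. The function names are hypothetical.

```python
EPS = 'ε'

# Assumed NFA for (a|b)*abb: (state, label) -> set of next states.
NFA = {
    ('S', EPS): {1, 7},
    (1, EPS): {2, 4},
    (2, 'a'): {3},
    (4, 'b'): {5},
    (3, EPS): {6},
    (5, EPS): {6},
    (6, EPS): {1, 7},
    (7, 'a'): {8},
    (8, 'b'): {9},
    (9, 'b'): {'F'},
}
NFA_ACCEPTING = {'F'}


def eps_closure(states):
    closure, stack = set(states), list(states)
    while stack:
        for u in NFA.get((stack.pop(), EPS), ()):
            if u not in closure:
                closure.add(u)
                stack.append(u)
    return frozenset(closure)


def build_dfa(start='S', alphabet='ab'):
    start_set = eps_closure({start})          # DFA start state
    states, work, table = [start_set], [start_set], {}
    while work:
        T = work.pop()
        for a in alphabet:
            U = eps_closure({u for t in T for u in NFA.get((t, a), ())})
            if not U:
                continue
            if U not in states:
                states.append(U)
                work.append(U)
            table[T, a] = U
    # A DFA state accepts if it contains an accepting NFA state.
    accepting = [T for T in states if T & NFA_ACCEPTING]
    return start_set, states, table, accepting


start_set, states, table, accepting = build_dfa()
```

Here start_set is {S,1,2,4,7} (NewS) and the only accepting DFA state is {1,2,4,5,6,7,F} (NewF).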
Efficiency of Algorithms
RE -> NFA: O(|r|) where |r| is the size of the RE
NFA -> DFA: O(|r|^2 * 2^|r|) worst case (not seen in typical programming languages)
Recognition of a string by DFA: O(|x|) where |x| is the length of the string

More Practice
Convert RE to NFA:
((ε|a)b*)*
Convert NFA to DFA:
[Figure: NFA with states S, 1, 2, 3, 4; edges labeled a and b plus ε-transitions]
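Solution to Practice
RE to NFA
[Figure: NFA for ((ε|a)b*)* with states S, 1, 2, 3, 4, 5, 6, 7, 8, 9, F; one edge labeled a, one edge labeled b, the remaining edges are ε-transitions]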




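Solution to Practice
NFA to DFA
DFA state   NFA states
NewS        S,1,3
A           2
B           4
[Figure: the DFA diagram connects these states with two edges labeled a and two edges labeled b]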
Summary of Scanners
Lexemes
Tokens
Regular Expressions, Extended RE
Regular Definitions
Finite Automata (DFA & NFA)
Conversion from RE->NFA->DFA
JLex
