So (a) | ((b)*(c)) = a | b*c
That is, either a single a, or zero or more b's followed by a single c
Examples
- Let Σ = {a, b}
o a | b denotes {a, b}
o (a | b)(a | b) denotes {aa, ab, ba, bb}
o a* denotes {ε, a, aa, aaa, …}
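The sets above can be checked mechanically. The following is a minimal sketch that treats a language as a Python set of strings; the helper names `union`, `concat` and `star` are illustrative, and since a* is infinite, `star` only builds a finite approximation up to a chosen number of repetitions.

```python
def union(l1, l2):
    """L1 | L2: every string that is in either language."""
    return l1 | l2

def concat(l1, l2):
    """L1 L2: every string of L1 followed by every string of L2."""
    return {x + y for x in l1 for y in l2}

def star(lang, max_reps):
    """Finite approximation of L*: zero up to max_reps repetitions.
    The empty Python string "" plays the role of ε."""
    result = {""}
    current = {""}
    for _ in range(max_reps):
        current = concat(current, lang)
        result |= current
    return result

a, b = {"a"}, {"b"}
a_or_b = union(a, b)
print(sorted(a_or_b))                  # ['a', 'b']
print(sorted(concat(a_or_b, a_or_b)))  # ['aa', 'ab', 'ba', 'bb']
print(sorted(star(a, 3), key=len))     # ['', 'a', 'aa', 'aaa']
```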
Regular Definition
- For notational convenience, we give names to regular expressions
- If Σ is an alphabet of basic symbols, then a regular definition is a
sequence of definitions of the form
o d1 → r1
o d2 → r2
o …
o dn → rn
where each di is a distinct name and each ri is a regular expression over
the symbols in Σ ∪ {d1, d2, …, di-1}, i.e. the basic symbols and the
previously defined names
- By restricting each ri to symbols of Σ and previously defined
names, we can construct a regular expression over Σ for any ri by
repeatedly replacing regular expression names by the expressions
they denote. If ri used dj for some j ≥ i, then ri might be recursively
defined, and the substitution process would not terminate
- Example
o letter → A | B | … | Z | a | b | … | z
o digit → 0 | 1 | … | 9
o id → letter (letter | digit)*
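The substitution process described above can be sketched in a few lines. This is only an illustration: writing names in braces (for example `{letter}`) and abbreviating the alternatives as character classes are assumptions of the sketch, not notation from these notes. Each body may refer only to names defined earlier in the list, so the expansion always terminates.

```python
import re

# A regular definition, in order. Names appear in braces so they can be
# told apart from basic symbols (an assumption of this sketch).
definitions = [
    ("letter", "[A-Za-z]"),
    ("digit", "[0-9]"),
    ("id", "{letter}({letter}|{digit})*"),
]

def expand(definitions):
    """Replace every previously defined name by the expression it denotes."""
    expanded = {}
    for name, body in definitions:
        for earlier, pattern in expanded.items():
            body = body.replace("{" + earlier + "}", "(?:" + pattern + ")")
        if "{" in body:
            # A leftover name is undefined or defined later (j >= i).
            raise ValueError(f"{name} uses a name that is not defined earlier")
        expanded[name] = body
    return expanded

patterns = expand(definitions)
id_re = re.compile(patterns["id"])
print(bool(id_re.fullmatch("count1")))  # True
print(bool(id_re.fullmatch("1count")))  # False
```

A forward reference such as `[("x", "{y}"), ("y", "a")]` is rejected with a `ValueError` rather than expanded, which mirrors the restriction to previously defined names.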
- Example
o digit → 0 | 1 | 2 | … | 9
o digits → digit digit*
o optional-fraction → . digits | ε
o optional-exponent → ( E ( + | - | ε ) digits ) | ε
- r? = r | ε
- r+ = r r*
- r* = r+ | ε
- With these shorthands, the previous example becomes
o digit → 0 | 1 | … | 9
o digits → digit+
o optional-fraction → ( . digits )?
o optional-exponent → ( E ( + | - )? digits )?
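As a quick check that the shorthand and longhand definitions describe the same strings, the sketch below writes both in Python's `re` syntax and compares them on a few samples. Combining the parts as digits, then optional-fraction, then optional-exponent is an assumption here (the notes do not show the final rule), and an empty alternative such as `(?:x|)` stands in for ε.

```python
import re

digit = r"[0-9]"

# Longhand: only |, concatenation, * and an empty alternative for ε.
digits_long = rf"{digit}{digit}*"
fraction_long = rf"(?:\.{digits_long}|)"
exponent_long = rf"(?:E(?:\+|-|){digits_long}|)"

# Shorthand: the same definitions written with + and ?.
digits_short = rf"{digit}+"
fraction_short = rf"(?:\.{digits_short})?"
exponent_short = rf"(?:E[+-]?{digits_short})?"

num_long = re.compile(digits_long + fraction_long + exponent_long)
num_short = re.compile(digits_short + fraction_short + exponent_short)

samples = ["42", "3.14", "6.02E+23", "1E-9", "7.", ".5", "2E"]
for text in samples:
    long_ok = num_long.fullmatch(text) is not None
    short_ok = num_short.fullmatch(text) is not None
    assert long_ok == short_ok  # both forms agree on every sample
    print(text, short_ok)
```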
Recognition of Tokens
- Token recognition is implemented from a stylized flowchart, called a
Transition Diagram (TD)
- Transition Diagram
o Depicts actions that take place when a Lexical Analyzer is
called by the parser to get next token
o Positions in a TD are drawn as circles, called States
o States are connected by arrows, called Edges
o Edges leaving state S have Labels indicating the input characters
that can appear next after the TD has reached state S
o Start state is the initial state of the TD where control resides
when we begin to recognize a token
o E.g.
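[Figure: transition diagram for relational operators. Recoverable from the original: from the start state, = leads to state 5, which executes return(relop, EQ); > leads to state 6, from which = leads to state 7, which executes return(relop, GE), while any other character leads to state 8, marked *.]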
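A transition diagram of this kind maps directly onto a small state machine. The sketch below is one possible rendering for the relational operators: states 5 to 8 follow the figure, while states 1 to 4 (the branch for <, <= and <>) are numbered by assumption, and the function name `relop` and its return shape are illustrative. A starred state returns without advancing past the last character read, which models retraction.

```python
def relop(text, pos=0):
    """Simulate the relop transition diagram starting at text[pos].
    Returns (token, attribute, next_position), or None if no relop starts here."""
    def peek(i):
        return text[i] if i < len(text) else ""  # "" stands for end of input

    state = 0
    forward = pos
    while True:
        c = peek(forward)
        if state == 0:
            if c == "<":
                state = 1
            elif c == "=":
                return ("relop", "EQ", forward + 1)  # state 5
            elif c == ">":
                state = 6
            else:
                return None                          # not a relop
            forward += 1
        elif state == 1:
            if c == "=":
                return ("relop", "LE", forward + 1)  # state 2 (assumed number)
            if c == ">":
                return ("relop", "NE", forward + 1)  # state 3 (assumed number)
            return ("relop", "LT", forward)          # state 4*: c is not consumed
        else:  # state == 6
            if c == "=":
                return ("relop", "GE", forward + 1)  # state 7
            return ("relop", "GT", forward)          # state 8*: c is not consumed

print(relop(">= 1"))  # ('relop', 'GE', 2)
print(relop("<a"))    # ('relop', 'LT', 1)
```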
Regular expression   Token   Attribute value
ws                   -       -
if                   if      -
then                 then    -
else                 else    -
id                   id      pointer to symbol-table entry
num                  num     pointer to symbol-table entry
<                    relop   LT
<=                   relop   LE
>                    relop   GT
>=                   relop   GE
<>                   relop   NE
=                    relop   EQ
ws → delim+
RD for operators
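[Figure: transition diagram for keywords. Recoverable from the original: separate paths spell out E-N-D, E-L-S-E, I-F and T-H-E-N one letter per edge; the END path ends on an edge labelled ws/:, the IF path on ws/(, and the ELSE and THEN paths on ws, each leading into a final state marked *.]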
Note: All final states (6, 10, 14, 17, 22) carry a *, like the one shown on
state 6. This symbol is called the retraction symbol; it indicates that the
input pointer moves back one position, because the last character read is
not part of the token
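[Figure: transition diagram for identifiers. From start state 9, a letter leads to state 10; state 10 loops on letter or digit; any other character leads to final state 11, marked *, which executes return(gettoken(), installid()).]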
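The identifier diagram can be sketched as follows. The notes name `gettoken()` and `installid()` without showing them, so the bodies below are assumptions: here `gettoken` reports a keyword as its own token and anything else as `id`, and `install_id` stores non-keyword lexemes in a list that stands in for the symbol table and returns the index as the "pointer". Passing the lexeme as an argument is also a choice made for this sketch.

```python
import string

LETTERS = string.ascii_letters
LETTERS_OR_DIGITS = string.ascii_letters + string.digits
KEYWORDS = ("if", "then", "else")  # keywords listed in the token table
symbol_table = []                  # hypothetical table: index acts as pointer

def gettoken(lexeme):
    """Assumed behaviour: a keyword is its own token, everything else is id."""
    return lexeme if lexeme in KEYWORDS else "id"

def install_id(lexeme):
    """Assumed behaviour: enter a new identifier once and return its index."""
    if lexeme in KEYWORDS:
        return None
    if lexeme not in symbol_table:
        symbol_table.append(lexeme)
    return symbol_table.index(lexeme)

def identifier(text, pos=0):
    """States 9 -> 10 -> 11*: a letter, then letters or digits, then 'other'."""
    if pos >= len(text) or text[pos] not in LETTERS:
        return None                       # no edge out of the start state
    forward = pos + 1
    while forward < len(text) and text[forward] in LETTERS_OR_DIGITS:
        forward += 1                      # loop on state 10
    # State 11*: the character at 'forward' is not consumed (retraction).
    lexeme = text[pos:forward]
    return gettoken(lexeme), install_id(lexeme), forward

print(identifier("count1 = 0"))  # ('id', 0, 6)
print(identifier("then x"))      # ('then', None, 4)
```

Looking keywords up after the lexeme has been scanned avoids a separate diagram per keyword, at the cost of one table lookup per identifier.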
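[Figure: transition diagram for numbers with an exponent, states 12 to 19. From start state 12: digit to 13 (loop on digit), . to 14, digit to 15 (loop on digit), E to 16, + or - to 17, digit to 18 (loop on digit), other to final state 19, marked *, which calls install_num(). Two further edges are labelled E and digit.]
[Figure: transition diagram for numbers with a fraction, states 20 to 24. From start state 20: digit to 21 (loop on digit), . to 22, digit to 23 (loop on digit), other to final state 24, marked *, which calls install_num().]
[Figure: transition diagram for integers, states 25 to 27. From start state 25: digit to 26 (loop on digit), other to final state 27, marked *, which calls install_num().]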
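One way to simulate the three number diagrams is to store each as a transition table and try them in order, falling through to the next diagram when the current one gets stuck. The sketch below assumes that the two extra edges of the first diagram run from 13 to 16 on E and from 16 to 18 on digit, and it returns the lexeme itself instead of calling install_num(), whose behaviour the notes do not show.

```python
# (start state, final state, edges); "other" matches any unlisted character.
DIAGRAMS = [
    (12, 19, {
        12: {"digit": 13},
        13: {"digit": 13, ".": 14, "E": 16},
        14: {"digit": 15},
        15: {"digit": 15, "E": 16},
        16: {"+": 17, "-": 17, "digit": 18},
        17: {"digit": 18},
        18: {"digit": 18, "other": 19},
    }),
    (20, 24, {
        20: {"digit": 21},
        21: {"digit": 21, ".": 22},
        22: {"digit": 23},
        23: {"digit": 23, "other": 24},
    }),
    (25, 27, {
        25: {"digit": 26},
        26: {"digit": 26, "other": 27},
    }),
]

def run_diagram(start, final, edges, text, pos):
    """Follow one diagram; return the end of the lexeme, or None if stuck."""
    state, forward = start, pos
    while state != final:
        c = text[forward] if forward < len(text) else ""  # "" is end of input
        label = "digit" if c and c in "0123456789" else c
        row = edges[state]
        if label in row:
            state = row[label]
        elif "other" in row:
            state = row["other"]
        else:
            return None
        forward += 1
    return forward - 1  # final states are starred: retract one position

def number(text, pos=0):
    """Try the diagrams in order and report the first one that succeeds."""
    for start, final, edges in DIAGRAMS:
        end = run_diagram(start, final, edges, text, pos)
        if end is not None:
            return ("num", text[pos:end], end)
    return None

print(number("12.5E+3 "))  # ('num', '12.5E+3', 7)
print(number("7.x"))       # ('num', '7', 1)
```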
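[Figure: transition diagram for white space, states 28 to 30. From start state 28: delim to 29 (loop on delim), other to final state 30, marked *.]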
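The white-space diagram returns no token; it only moves the input pointer past a run of delimiters. A minimal sketch, assuming that delim means blank, tab or newline (the notes do not define it) and using the illustrative name `skip_ws`:

```python
DELIMS = " \t\n"  # assumed definition of delim: blank, tab, newline

def skip_ws(text, pos=0):
    """States 28 -> 29 -> 30*: one delim, more delims, then 'other'.
    Returns the position of the first non-delimiter, or None if the
    text at pos does not start with a delimiter."""
    if pos >= len(text) or text[pos] not in DELIMS:
        return None                  # no edge out of the start state
    forward = pos + 1
    while forward < len(text) and text[forward] in DELIMS:
        forward += 1                 # loop on state 29
    return forward                   # state 30*: 'other' is not consumed

print(skip_ws("  \tif"))  # 3
print(skip_ws("if"))      # None
```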