
Context-Free Languages

[Figure: the regular languages, such as $a^*b^*$ and $(a+b)^*$, sit inside the context-free languages, which also include $\{a^n b^n : n \ge 0\}$ and $\{ww^R\}$.]

Context-free languages are described by context-free grammars and recognized by pushdown automata (an automaton equipped with a stack).
Grammars
Grammars express languages

Example: the English language


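$$
\begin{aligned}
\langle\text{sentence}\rangle &\to \langle\text{noun\_phrase}\rangle\,\langle\text{predicate}\rangle \\
\langle\text{noun\_phrase}\rangle &\to \langle\text{article}\rangle\,\langle\text{noun}\rangle \\
\langle\text{predicate}\rangle &\to \langle\text{verb}\rangle \\
\langle\text{article}\rangle &\to \text{a} \\
\langle\text{article}\rangle &\to \text{the} \\
\langle\text{noun}\rangle &\to \text{cat} \\
\langle\text{noun}\rangle &\to \text{dog} \\
\langle\text{verb}\rangle &\to \text{runs} \\
\langle\text{verb}\rangle &\to \text{walks}
\end{aligned}
$$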

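A derivation of "the dog walks":

$$
\begin{aligned}
\langle\text{sentence}\rangle &\Rightarrow \langle\text{noun\_phrase}\rangle\,\langle\text{predicate}\rangle \\
&\Rightarrow \langle\text{noun\_phrase}\rangle\,\langle\text{verb}\rangle \\
&\Rightarrow \langle\text{article}\rangle\,\langle\text{noun}\rangle\,\langle\text{verb}\rangle \\
&\Rightarrow \text{the}\ \langle\text{noun}\rangle\,\langle\text{verb}\rangle \\
&\Rightarrow \text{the dog}\ \langle\text{verb}\rangle \\
&\Rightarrow \text{the dog walks}
\end{aligned}
$$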
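A derivation of "a cat runs":

$$
\begin{aligned}
\langle\text{sentence}\rangle &\Rightarrow \langle\text{noun\_phrase}\rangle\,\langle\text{predicate}\rangle \\
&\Rightarrow \langle\text{noun\_phrase}\rangle\,\langle\text{verb}\rangle \\
&\Rightarrow \langle\text{article}\rangle\,\langle\text{noun}\rangle\,\langle\text{verb}\rangle \\
&\Rightarrow \text{a}\ \langle\text{noun}\rangle\,\langle\text{verb}\rangle \\
&\Rightarrow \text{a cat}\ \langle\text{verb}\rangle \\
&\Rightarrow \text{a cat runs}
\end{aligned}
$$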
Language of the grammar:


L = { a cat runs,
a cat walks,
the cat runs,
the cat walks,
a dog runs,
a dog walks,
the dog runs,
the dog walks }
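
The eight sentences above can be reproduced mechanically. The following is a minimal sketch, assuming the grammar is stored as a Python dictionary; the names `RULES`, `expand` and `LANGUAGE` are illustrative and not part of the slides. It works only because this grammar has no recursion, so the set of sentences is finite.

```python
from itertools import product

# Productions of the toy English grammar; keys are variables, everything else is a terminal.
RULES = {
    "sentence": [["noun_phrase", "predicate"]],
    "noun_phrase": [["article", "noun"]],
    "predicate": [["verb"]],
    "article": [["a"], ["the"]],
    "noun": [["cat"], ["dog"]],
    "verb": [["runs"], ["walks"]],
}

def expand(symbol):
    """Return every terminal word sequence derivable from `symbol`."""
    if symbol not in RULES:          # terminal: derives only itself
        return [[symbol]]
    results = []
    for rhs in RULES[symbol]:
        # Combine every expansion of each right-hand-side symbol.
        for parts in product(*(expand(s) for s in rhs)):
            results.append([w for part in parts for w in part])
    return results

LANGUAGE = sorted(" ".join(words) for words in expand("sentence"))
for s in LANGUAGE:
    print(s)
```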
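Notation

$\langle\text{noun}\rangle \to \text{cat}$
$\langle\text{noun}\rangle \to \text{dog}$

These are production rules. The left-hand side $\langle\text{noun}\rangle$ is a variable; the right-hand sides cat and dog are terminals.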
Another Example
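Grammar:

$S \to aSb$
$S \to \lambda$

Here $\lambda$ denotes the empty string.

Derivation of sentence $ab$:

$S \Rightarrow aSb \Rightarrow ab$

The first step applies $S \to aSb$ and the second applies $S \to \lambda$.

Derivation of sentence $aabb$:

$S \Rightarrow aSb \Rightarrow aaSbb \Rightarrow aabb$

Other derivations:

$S \Rightarrow aSb \Rightarrow aaSbb \Rightarrow aaaSbbb \Rightarrow aaabbb$

$S \Rightarrow aSb \Rightarrow aaSbb \Rightarrow aaaSbbb \Rightarrow aaaaSbbbb \Rightarrow aaaabbbb$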


Language of the grammar

$S \to aSb$
$S \to \lambda$

$L = \{a^n b^n : n \ge 0\}$
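
To see why every derivation ends in a string of the form $a^n b^n$, the rule $S \to aSb$ can be applied $n$ times and then $S \to \lambda$ once. The sketch below does exactly that; the function names `derive` and `in_language` are illustrative, and the empty Python string stands for $\lambda$.

```python
def derive(n):
    """Sentential forms of the derivation of a^n b^n: S -> aSb n times, then S -> lambda."""
    forms = ["S"]
    for _ in range(n):
        forms.append(forms[-1].replace("S", "aSb"))   # apply S -> aSb
    forms.append(forms[-1].replace("S", ""))          # apply S -> lambda
    return forms

def in_language(w):
    """True when w = a^n b^n for some n >= 0."""
    n = len(w) // 2
    return w == "a" * n + "b" * n

print(" => ".join(f or "λ" for f in derive(2)))
```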
More Notation

Grammar $G = (V, T, S, P)$

$V$: Set of variables
$T$: Set of terminal symbols
$S$: Start variable
$P$: Set of production rules
Example

Grammar $G$:

$S \to aSb$
$S \to \lambda$

$G = (V, T, S, P)$

$V = \{S\}$, $T = \{a, b\}$

$P = \{S \to aSb,\ S \to \lambda\}$
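
The four components map naturally onto a small record type. This is one possible representation, not the slides' own: the class name `Grammar`, its field names, and the helper `is_well_formed` are assumptions made for illustration, symbols are single characters, and the empty string stands for $\lambda$.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Grammar:
    variables: frozenset      # V
    terminals: frozenset      # T
    start: str                # S
    productions: tuple        # P, as (left-hand side, right-hand side) pairs

    def is_well_formed(self):
        """Check S in V, V and T disjoint, and every production of the form A -> x."""
        symbols = self.variables | self.terminals
        return (
            self.start in self.variables
            and not (self.variables & self.terminals)
            and all(
                lhs in self.variables and all(c in symbols for c in rhs)
                for lhs, rhs in self.productions
            )
        )

G = Grammar(
    variables=frozenset({"S"}),
    terminals=frozenset({"a", "b"}),
    start="S",
    productions=(("S", "aSb"), ("S", "")),   # "" stands for lambda
)
print(G.is_well_formed())
```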
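More Notation

Sentential Form:
A string that may contain both variables and terminals

Example:

$S \Rightarrow aSb \Rightarrow aaSbb \Rightarrow aaaSbbb \Rightarrow aaabbb$

Here $S$, $aSb$, $aaSbb$ and $aaaSbbb$ are sentential forms; $aaabbb$, which contains only terminals, is a sentence.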

We write:

$S \overset{*}{\Rightarrow} aaabbb$

Instead of:

$S \Rightarrow aSb \Rightarrow aaSbb \Rightarrow aaaSbbb \Rightarrow aaabbb$


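In general we write:

$w_1 \overset{*}{\Rightarrow} w_n$

If:

$w_1 \Rightarrow w_2 \Rightarrow w_3 \Rightarrow \cdots \Rightarrow w_n$

By default:

$w \overset{*}{\Rightarrow} w$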
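Example

Grammar:

$S \to aSb$
$S \to \lambda$

Derivations:

$S \overset{*}{\Rightarrow} \lambda$
$S \overset{*}{\Rightarrow} ab$
$S \overset{*}{\Rightarrow} aabb$
$S \overset{*}{\Rightarrow} aaabbb$

Example

With the same grammar, derivations may also start or end at a sentential form:

$S \overset{*}{\Rightarrow} aaSbb$
$aaSbb \overset{*}{\Rightarrow} aaaaaSbbbbb$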
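Another Grammar Example

Grammar $G$:

$S \to Ab$
$A \to aAb$
$A \to \lambda$

Derivations:

$S \Rightarrow Ab \Rightarrow b$
$S \Rightarrow Ab \Rightarrow aAbb \Rightarrow abb$
$S \Rightarrow Ab \Rightarrow aAbb \Rightarrow aaAbbb \Rightarrow aabbb$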
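More Derivations

$S \Rightarrow Ab \Rightarrow aAbb \Rightarrow aaAbbb \Rightarrow aaaAbbbb \Rightarrow aaaaAbbbbb \Rightarrow aaaabbbbb$

$S \overset{*}{\Rightarrow} aaaabbbbb$
$S \overset{*}{\Rightarrow} aaaaaabbbbbbb$
$S \overset{*}{\Rightarrow} a^n b^n b$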
Language of a Grammar

For a grammar $G$ with start variable $S$:

$L(G) = \{w : S \overset{*}{\Rightarrow} w\}$

where $w$ is a string of terminals.
Example

For grammar $G$:

$S \to Ab$
$A \to aAb$
$A \to \lambda$

$L(G) = \{a^n b^n b : n \ge 0\}$

Since:

$S \overset{*}{\Rightarrow} a^n b^n b$
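
The definition of $L(G)$ can be explored by searching over sentential forms. The sketch below is an illustration under stated assumptions: single-character symbols, the empty string for $\lambda$, and a length bound `max_len` added so the search stops. The names `RULES` and `language` are illustrative. The pruning is enough for this grammar but is not a general termination guarantee.

```python
from collections import deque

RULES = {"S": ["Ab"], "A": ["aAb", ""]}   # "" stands for lambda

def language(start="S", max_len=7):
    """Breadth-first search over sentential forms, expanding the leftmost variable."""
    found, seen, queue = set(), {start}, deque([start])
    while queue:
        form = queue.popleft()
        idx = next((i for i, c in enumerate(form) if c in RULES), None)
        if idx is None:                       # no variables left: a sentence
            found.add(form)
            continue
        for rhs in RULES[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            # Prune forms whose terminals alone already exceed the bound.
            if sum(c not in RULES for c in new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(found, key=len)

print(language())
```

Every string found has the shape $a^n b^n b$, matching the set description above.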
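A Convenient Notation

The productions $A \to aAb$ and $A \to \lambda$ are abbreviated as

$A \to aAb \mid \lambda$

Likewise, $\langle\text{article}\rangle \to \text{a}$ and $\langle\text{article}\rangle \to \text{the}$ are abbreviated as

$\langle\text{article}\rangle \to \text{a} \mid \text{the}$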
Example

A context-free grammar $G$:

$S \to aSb$
$S \to \lambda$

A derivation:

$S \Rightarrow aSb \Rightarrow aaSbb \Rightarrow aabb$

Another derivation:

$S \Rightarrow aSb \Rightarrow aaSbb \Rightarrow aaaSbbb \Rightarrow aaabbb$

$L(G) = \{a^n b^n : n \ge 0\}$

Describes nested parentheses, reading $a$ as ( and $b$ as ): (((( ))))

Example

A context-free grammar $G$:

$S \to aSa$
$S \to bSb$
$S \to \lambda$

A derivation:

$S \Rightarrow aSa \Rightarrow abSba \Rightarrow abba$

Another derivation:

$S \Rightarrow aSa \Rightarrow abSba \Rightarrow abaSaba \Rightarrow abaaba$

$L(G) = \{ww^R : w \in \{a, b\}^*\}$
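
The productions $S \to aSa \mid bSb \mid \lambda$ peel one matching symbol off each end of the string. A hedged sketch of that idea follows, with an exhaustive comparison against the set description for short strings. The names `derives`, `is_ww_reversed` and `ALL_AGREE` are illustrative.

```python
from itertools import product

def derives(w):
    """True when S =>* w for S -> aSa | bSb | lambda."""
    if w == "":
        return True                       # S -> lambda
    if len(w) >= 2 and w[0] == w[-1] and w[0] in "ab":
        return derives(w[1:-1])           # S -> aSa or S -> bSb
    return False

def is_ww_reversed(w):
    """Direct check of the set description {w w^R : w in {a,b}*}."""
    half = len(w) // 2
    return len(w) % 2 == 0 and set(w) <= {"a", "b"} and w[half:] == w[:half][::-1]

# Compare the two definitions on every string over {a, b} up to length 6.
ALL_AGREE = all(
    derives("".join(t)) == is_ww_reversed("".join(t))
    for n in range(7)
    for t in product("ab", repeat=n)
)
print(ALL_AGREE)
```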

Example

A context-free grammar $G$:

$S \to aSb$
$S \to SS$
$S \to \lambda$

A derivation:

$S \Rightarrow SS \Rightarrow aSbS \Rightarrow abS \Rightarrow ab$

Another derivation:

$S \Rightarrow SS \Rightarrow aSbS \Rightarrow abS \Rightarrow abaSb \Rightarrow abab$

$L(G) = \{w : n_a(w) = n_b(w), \text{ and } n_a(v) \ge n_b(v) \text{ in any prefix } v\}$

Describes matched parentheses: () ((( ))) (( ))
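
The two conditions in the set description translate into a single left-to-right scan with a counter. This is a minimal sketch that checks the set description rather than running the grammar; the name `is_balanced` is illustrative, and $a$ plays the role of ( while $b$ plays the role of ).

```python
def is_balanced(w):
    """Check n_a(w) == n_b(w) and n_a(v) >= n_b(v) for every prefix v."""
    depth = 0                      # n_a(v) - n_b(v) for the prefix read so far
    for c in w:
        if c == "a":
            depth += 1
        elif c == "b":
            depth -= 1
            if depth < 0:          # some prefix has more b's than a's
                return False
        else:
            return False           # symbol outside {a, b}
    return depth == 0              # equal numbers of a's and b's overall

print(is_balanced("abab"), is_balanced("ba"))
```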
Definition: Context-Free Grammars

Grammar $G = (V, T, S, P)$

$V$: Variables
$T$: Terminal symbols
$S$: Start variable

Productions of the form:

$A \to x$

where $A$ is a variable and $x$ is a string of variables and terminals.

For $G = (V, T, S, P)$:

$L(G) = \{w : S \overset{*}{\Rightarrow} w,\ w \in T^*\}$
Definition: Context-Free Languages

A language $L$ is context-free if and only if there is a context-free grammar $G$ with $L = L(G)$.
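Derivation Order

1. $S \to AB$
2. $A \to aaA$
3. $A \to \lambda$
4. $B \to Bb$
5. $B \to \lambda$

Leftmost derivation (the number over each arrow is the rule applied):

$S \overset{1}{\Rightarrow} AB \overset{2}{\Rightarrow} aaAB \overset{3}{\Rightarrow} aaB \overset{4}{\Rightarrow} aaBb \overset{5}{\Rightarrow} aab$

Rightmost derivation:

$S \overset{1}{\Rightarrow} AB \overset{4}{\Rightarrow} ABb \overset{5}{\Rightarrow} Ab \overset{2}{\Rightarrow} aaAb \overset{3}{\Rightarrow} aab$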
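$S \to aAB$
$A \to bBb$
$B \to A \mid \lambda$

Leftmost derivation:

$S \Rightarrow aAB \Rightarrow abBbB \Rightarrow abAbB \Rightarrow abbBbbB \Rightarrow abbbbB \Rightarrow abbbb$

Rightmost derivation:

$S \Rightarrow aAB \Rightarrow aA \Rightarrow abBb \Rightarrow abAb \Rightarrow abbBbb \Rightarrow abbbb$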
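
Leftmost and rightmost derivations differ only in which variable is rewritten at each step. The sketch below replays the numbered grammar $S \to AB$, $A \to aaA \mid \lambda$, $B \to Bb \mid \lambda$ for both rule orders 1, 2, 3, 4, 5 and 1, 4, 5, 2, 3. The names `RULES`, `VARIABLES` and `derive` are illustrative, and the empty string stands for $\lambda$.

```python
RULES = {1: ("S", "AB"), 2: ("A", "aaA"), 3: ("A", ""), 4: ("B", "Bb"), 5: ("B", "")}
VARIABLES = {"S", "A", "B"}

def derive(rule_numbers, leftmost=True):
    """Apply the numbered rules in order, rewriting the leftmost (or rightmost) variable."""
    form, steps = "S", ["S"]
    for k in rule_numbers:
        lhs, rhs = RULES[k]
        positions = [i for i, c in enumerate(form) if c in VARIABLES]
        i = positions[0] if leftmost else positions[-1]
        if form[i] != lhs:
            raise ValueError(f"rule {k} does not apply to {form!r}")
        form = form[:i] + rhs + form[i + 1:]
        steps.append(form)
    return steps

print(" => ".join(derive([1, 2, 3, 4, 5], leftmost=True)))
print(" => ".join(derive([1, 4, 5, 2, 3], leftmost=False)))
```

Both runs end in the same sentence even though the intermediate sentential forms differ.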


Derivation Trees

$S \to AB$
$A \to aaA \mid \lambda$
$B \to Bb \mid \lambda$

The derivation tree grows by one production at each step of the derivation:

$S \Rightarrow AB \Rightarrow aaAB \Rightarrow aaABb \Rightarrow aaBb \Rightarrow aab$

Derivation tree:

            S
          /   \
         A     B
       / | \   | \
      a  a  A  B  b
            |  |
            λ  λ

Yield: $aa\lambda\lambda b = aab$
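
The yield of a derivation tree is the concatenation of its leaves read from left to right. A small sketch follows, assuming a hypothetical `Node` class in which a leaf labelled with the empty string stands for $\lambda$; the names `Node`, `tree_yield` and `TREE` are illustrative.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str                              # variable, terminal, or "" for lambda
    children: list = field(default_factory=list)

def tree_yield(node):
    """Concatenate the leaf labels from left to right."""
    if not node.children:
        return node.label
    return "".join(tree_yield(child) for child in node.children)

# Tree for S => AB => aaAB => aaABb => aaBb => aab.
TREE = Node("S", [
    Node("A", [Node("a"), Node("a"), Node("A", [Node("")])]),   # A -> aaA, A -> lambda
    Node("B", [Node("B", [Node("")]), Node("b")]),              # B -> Bb,  B -> lambda
])
print(tree_yield(TREE))
```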
Ambiguity

$E \to E + E \mid E * E \mid (E) \mid a$

The string $a + a * a$ has two leftmost derivations.

First leftmost derivation:

$E \Rightarrow E + E \Rightarrow a + E \Rightarrow a + E * E \Rightarrow a + a * E \Rightarrow a + a * a$

Its derivation tree has $+$ at the root:

        E
      / | \
     E  +  E
     |   / | \
     a  E  *  E
        |     |
        a     a

Second leftmost derivation:

$E \Rightarrow E * E \Rightarrow E + E * E \Rightarrow a + E * E \Rightarrow a + a * E \Rightarrow a + a * a$

Its derivation tree has $*$ at the root:

           E
         / | \
        E  *  E
      / | \   |
     E  +  E  a
     |     |
     a     a

The grammar $E \to E + E \mid E * E \mid (E) \mid a$ is ambiguous: the string $a + a * a$ has two derivation trees, and equivalently two leftmost derivations.
Definition:
A context-free grammar $G$ is ambiguous if some string $w \in L(G)$ has two or more derivation trees.

In other words:
A context-free grammar $G$ is ambiguous if some string $w \in L(G)$ has two or more leftmost (or rightmost) derivations.
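
For the grammar $E \to E + E \mid E * E \mid (E) \mid a$, the number of derivation trees of a string can be counted by trying every operator as the root. The following is a sketch for this one grammar, not a general ambiguity test (ambiguity of context-free grammars is undecidable in general). The name `count_trees` is illustrative.

```python
from functools import lru_cache

def count_trees(tokens):
    """Number of derivation trees for `tokens` under E -> E+E | E*E | (E) | a."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):                 # trees deriving tokens[i:j] from E
        if j - i == 1:
            return 1 if tokens[i] == "a" else 0
        total = 0
        if j - i >= 3 and tokens[i] == "(" and tokens[j - 1] == ")":
            total += count(i + 1, j - 1)                  # E -> (E)
        for k in range(i + 1, j - 1):
            if tokens[k] in "+*":                         # E -> E+E or E -> E*E
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))

print(count_trees("a+a*a"))
```

A count greater than one for any string is enough to show that the grammar is ambiguous.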
Why do we care about ambiguity?

Take $a = 2$, so that $a + a * a$ becomes $2 + 2 * 2$.

The derivation tree with $+$ at the root groups the string as $2 + (2 * 2)$:

$2 + 2 * 2 = 2 + 4 = 6$

The derivation tree with $*$ at the root groups the string as $(2 + 2) * 2$:

$2 + 2 * 2 = 4 * 2 = 8$

Correct result: $2 + 2 * 2 = 6$, given by the tree with $+$ at the root.

We want to remove ambiguity.
Ambiguity is bad for programming languages.
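
The two results can be checked by evaluating each derivation tree bottom-up. In this sketch a tree is written as a nested tuple `(left, operator, right)`, which is an encoding chosen for illustration; the names `OPS`, `evaluate`, `TREE_PLUS_AT_ROOT` and `TREE_TIMES_AT_ROOT` are likewise assumptions.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}

def evaluate(tree, a=2):
    """Evaluate a tree given as 'a' or as a tuple (left, operator, right)."""
    if tree == "a":
        return a
    left, op, right = tree
    return OPS[op](evaluate(left, a), evaluate(right, a))

TREE_PLUS_AT_ROOT = ("a", "+", ("a", "*", "a"))    # reads a + (a * a)
TREE_TIMES_AT_ROOT = (("a", "+", "a"), "*", "a")   # reads (a + a) * a

print(evaluate(TREE_PLUS_AT_ROOT), evaluate(TREE_TIMES_AT_ROOT))
```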
Left Recursion & Right Recursion

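It is possible for a recursive-descent parser to loop forever. This happens with a left-recursive production such as $A \to A\alpha \mid \beta$, because the procedure for $A$ calls itself again before consuming any input. Repeated application of this production builds up a sequence of $\alpha$'s to the right of $\beta$.

The same effect can be achieved by rewriting the productions for $A$ in the following manner, using a new nonterminal $R$:

$A \to \beta R$
$R \to \alpha R \mid \lambda$

Here $\lambda$ denotes the empty string.

The left-recursion-elimination technique sketched above can also be applied to productions containing semantic actions.
The technique also extends to multiple productions for $A$.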
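
The rewriting can be sketched as a small function that handles immediate left recursion only, including several left-recursive and several non-left-recursive alternatives. The function name `eliminate_left_recursion`, the list-of-symbols encoding, and the example symbols `expr` and `term` are assumptions made for illustration; an empty list stands for $\lambda$.

```python
def eliminate_left_recursion(name, alternatives, new_name="R"):
    """Rewrite A -> A alpha | beta as A -> beta R, R -> alpha R | lambda.

    `alternatives` is a list of right-hand sides, each a list of symbols.
    """
    alphas = [alt[1:] for alt in alternatives if alt and alt[0] == name]
    betas = [alt for alt in alternatives if not alt or alt[0] != name]
    if not alphas:
        return {name: alternatives}              # no left recursion: nothing to do
    return {
        name: [beta + [new_name] for beta in betas],
        new_name: [alpha + [new_name] for alpha in alphas] + [[]],   # [] is lambda
    }

print(eliminate_left_recursion("expr", [["expr", "+", "term"], ["term"]]))
```

The rewritten productions are right-recursive, so a recursive-descent procedure for them consumes a token before calling itself again.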
Position of Parser
There are three general types of parsers for grammars:
UNIVERSAL
TOP-DOWN
BOTTOM-UP

Universal parsing methods, such as the Cocke-Younger-Kasami algorithm and Earley's algorithm, can parse any grammar.

These general methods are, however, too inefficient to use in
production compilers.

The methods commonly used in compilers can be classified as
being either top-down or bottom-up.

Top-down methods build parse trees from the top (root) to
the bottom (leaves), while Bottom-up methods start from
the leaves and work their way up to the root.

In either case, the input to the parser is scanned from left to
right, one symbol at a time.
The most efficient top-down and bottom-up methods work only for subclasses of grammars, but several of these classes, particularly LL and LR grammars, are expressive enough to describe most of the syntactic constructs in modern programming languages.

Parsers implemented by hand often use LL grammars; for example, the predictive-parsing approach works for LL grammars.

Parsers for the larger class of LR grammars are usually constructed using automated tools.
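
As an illustration of a hand-written predictive parser, the grammar $S \to aSb \mid \lambda$ from earlier can be parsed top-down with one symbol of lookahead. This is a minimal sketch; the function names `parse_S` and `accepts` are illustrative, and the input is scanned left to right one symbol at a time.

```python
def parse_S(s, pos=0):
    """Predictive procedure for S -> aSb | lambda; returns the position after S."""
    if pos < len(s) and s[pos] == "a":       # lookahead 'a' selects S -> aSb
        pos = parse_S(s, pos + 1)
        if pos < len(s) and s[pos] == "b":
            return pos + 1
        raise SyntaxError(f"expected 'b' at position {pos}")
    return pos                               # any other lookahead selects S -> lambda

def accepts(s):
    """True when the whole input is derived from S."""
    try:
        return parse_S(s) == len(s)
    except SyntaxError:
        return False

print(accepts("aabb"), accepts("aab"))
```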
Associativity of operators
Our grammar gives left associativity.
That is, if you traverse the parse tree in postorder and
perform the indicated arithmetic you will evaluate the
string left to right.
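
The difference between the two associativities shows up with a non-commutative operator. Since the grammar referred to here is not reproduced, the sketch below simply assumes a chain of subtractions such as 9 - 5 - 2; the function names `eval_left` and `eval_right` are illustrative.

```python
from functools import reduce

def eval_left(numbers):
    """Left-associative reading: ((9 - 5) - 2)."""
    return reduce(lambda acc, x: acc - x, numbers)

def eval_right(numbers):
    """Right-associative reading: (9 - (5 - 2))."""
    return reduce(lambda acc, x: x - acc, reversed(numbers))

print(eval_left([9, 5, 2]), eval_right([9, 5, 2]))
```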

If you wished to generate right associativity, you would
change the productions
