LR Parsing PDF

11-711 Algorithms for NLP
LR Parsing
Reading: Hopcroft and Ullman, Intro. to Automata Theory, Lang. and Comp., Sections 10.6-10.7, pp. 248-256
Shift-Reduce Parsing
A class of parsers with the following principles:
  Parsing is done bottom-up, reducing the input to the grammar start symbol
  The parser builds a rightmost derivation of the input in reverse
  The parsing algorithm simulates the operation of a PDA
  A prefix of the sentential form is kept on the stack
  Two types of operation:
    Shift the next input symbol onto the stack
    Reduce the stack by popping the RHS of a grammar rule and pushing the corresponding LHS non-terminal symbol
  The parser is usually deterministic, with no backtracking
  Extremely efficient, operating in linear time
  But it can be constructed for only a limited class of CFGs

LR Parsing
General Principles:
  Use sets of dotted grammar rules to reflect the state of the parser:
    What constituents have we constructed so far
    What constituents are we predicting next
  Pre-compile the grammar into a collection of finite sets of dotted rules
  Use these sets to capture the state of the parser during parsing
  The parser is a deterministic shift-reduce parser
  Developed by Knuth in the mid-1960s as a framework for compiling programming languages

LR Parsing Algorithm
Performs shift and reduce parsing actions on the stack, and changes state with each operation
Is driven by a pre-compiled parsing table that has two parts:
  The action table specifies the next shift or reduce parsing operation
  The goto table specifies which state to transfer to after a reduction
The stack stores a string of the form s_0 X_1 s_1 X_2 s_2 ... X_m s_m, where the s_i are parser states and the X_i are grammar symbols
At each step the parser does one of the following types of operations:
  Shift(s): Push the current input symbol on the stack, followed by the new state s
  Reduce(i): Reduce the stack according to rule i of the grammar
  Reject: Reject the input as ungrammatical and signal an error
  Accept: Accept the input as grammatical and halt

LR Parsing - Example
The Grammar:
  1. S  → NP VP
  2. NP → art adj n
  3. NP → art n
  4. NP → adj n
  5. VP → aux VP
  6. VP → v NP

The original input: The large can can hold the water
POS assigned input: art adj n aux v art n
Parser input: art adj n aux v art n $
Constructed Parsing Table for the Grammar:

               |  Shift                               |  Goto
State  Reduce  |  art   adj   n     aux   v     $     |  NP   VP   S
  0            |  sh3   sh4                           |  2         1
  1            |                                acc   |
  2            |                    sh6   sh7         |       5
  3            |        sh8   sh9                     |
  4            |              sh10                    |
  5    r1      |                                      |
  6            |                    sh6   sh7         |       11
  7            |  sh3   sh4                           |  12
  8            |              sh13                    |
  9    r3      |                                      |
 10    r4      |                                      |
 11    r5      |                                      |
 12    r6      |                                      |
 13    r2      |                                      |
The input: art adj n aux v art n $

Step  Action  Stack after action
  0           0
  1   sh3     0 art 3
  2   sh8     0 art 3 adj 8
  3   sh13    0 art 3 adj 8 n 13
  4   r2      0 NP 2
  5   sh6     0 NP 2 aux 6
  6   sh7     0 NP 2 aux 6 v 7
  7   sh3     0 NP 2 aux 6 v 7 art 3
  8   sh9     0 NP 2 aux 6 v 7 art 3 n 9
  9   r3      0 NP 2 aux 6 v 7 NP 12
 10   r6      0 NP 2 aux 6 VP 11
 11   r5      0 NP 2 VP 5
 12   r1      0 S 1
 13   accept  0 S 1
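The parsing loop and the example table can be sketched in Python as follows. This is a minimal illustration, not the course's implementation: the names lr_parse, RULES, SHIFT, REDUCE, and GOTO are made up here, the entries are transcribed from the example's parsing table, and rules 1-6 are assumed to be S → NP VP, NP → art adj n, NP → art n, NP → adj n, VP → aux VP, VP → v NP. As in the table's single Reduce column, a reduce state reduces whatever the next input symbol is.

```python
# Rule number -> (LHS, RHS), assumed numbering of the example grammar.
RULES = {
    1: ("S", ("NP", "VP")),
    2: ("NP", ("art", "adj", "n")),
    3: ("NP", ("art", "n")),
    4: ("NP", ("adj", "n")),
    5: ("VP", ("aux", "VP")),
    6: ("VP", ("v", "NP")),
}
# Shift part of the action table: SHIFT[state][terminal] = new state.
SHIFT = {
    0: {"art": 3, "adj": 4},
    2: {"aux": 6, "v": 7},
    3: {"adj": 8, "n": 9},
    4: {"n": 10},
    6: {"aux": 6, "v": 7},
    7: {"art": 3, "adj": 4},
    8: {"n": 13},
}
# Reduce part of the action table: REDUCE[state] = rule number.
REDUCE = {5: 1, 9: 3, 10: 4, 11: 5, 12: 6, 13: 2}
# Goto table: GOTO[state][non-terminal] = new state.
GOTO = {0: {"NP": 2, "S": 1}, 2: {"VP": 5}, 6: {"VP": 11}, 7: {"NP": 12}}
ACCEPT = (1, "$")


def lr_parse(tokens):
    """Parse a POS sequence ending in "$"; return the list of actions taken.

    Raises SyntaxError when the table has no entry (the Reject case).
    """
    stack = [0]  # s0 X1 s1 ... Xm sm: states and grammar symbols alternate
    actions = []
    pos = 0
    while True:
        state, symbol = stack[-1], tokens[pos]
        if (state, symbol) == ACCEPT:
            actions.append("accept")
            return actions
        if symbol in SHIFT.get(state, {}):
            new_state = SHIFT[state][symbol]
            stack += [symbol, new_state]      # push the symbol, then the state
            pos += 1
            actions.append(f"sh{new_state}")
        elif state in REDUCE:
            rule = REDUCE[state]
            lhs, rhs = RULES[rule]
            del stack[-2 * len(rhs):]         # pop the RHS symbols and states
            stack += [lhs, GOTO[stack[-1]][lhs]]
            actions.append(f"r{rule}")
        else:
            raise SyntaxError(f"unexpected {symbol!r} in state {state}")
```

Calling lr_parse("art adj n aux v art n $".split()) returns the action sequence sh3 sh8 sh13 r2 sh6 sh7 sh3 sh9 r3 r6 r5 r1 accept, matching the trace.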
Constructing an SLR Parsing Table

An LR(0) item is a dotted grammar rule [A → α · β]
We construct a deterministic FSA that recognizes prefixes of rightmost sentential forms of the grammar G
The states of the FSA are sets of LR(0) items
We augment the grammar with a new start rule S' → S
We define the closure operation on a set I of LR(0) items:
  1. Every item in I is also in closure(I)
  2. If [A → α · B β] is in closure(I) and B → γ is a rule in G, then add [B → · γ] to closure(I)
The closure operation adds predicted new items to the set (similar to Earley's Predictor operation)
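The closure operation can be sketched in Python as below. This is an illustrative sketch under stated assumptions: an item is represented as a (lhs, rhs, dot) triple, the names GRAMMAR and closure are made up here, and the grammar is assumed to be the example NL grammar (S → NP VP, NP → art adj n, NP → art n, NP → adj n, VP → aux VP, VP → v NP) with the added start rule S' → S.

```python
# Assumed example grammar, augmented with the start rule S' -> S.
GRAMMAR = [
    ("S'", ("S",)),
    ("S", ("NP", "VP")),
    ("NP", ("art", "adj", "n")),
    ("NP", ("art", "n")),
    ("NP", ("adj", "n")),
    ("VP", ("aux", "VP")),
    ("VP", ("v", "NP")),
]


def closure(items):
    """Return the closure of a set of LR(0) items.

    An item (lhs, rhs, dot) stands for lhs -> rhs with the dot placed
    before rhs[dot]; dot == len(rhs) means the dot is at the end.
    """
    result = set(items)          # step 1: every item of I is in closure(I)
    work = list(result)
    while work:
        _, rhs, dot = work.pop()
        if dot == len(rhs):
            continue             # nothing is predicted after a final dot
        predicted = rhs[dot]
        for lhs, new_rhs in GRAMMAR:
            item = (lhs, new_rhs, 0)
            if lhs == predicted and item not in result:
                result.add(item)   # step 2: add [B -> . gamma]
                work.append(item)
    return frozenset(result)
```

For the start item [S' → · S], the closure also contains the item for S → NP VP and the three NP items with the dot at the start, five items in total; the VP rules are not added because no dot stands before VP.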
We define the Goto operation for an item set I and a grammar symbol X:
  goto(I, X) is the closure of the set of all items [A → α X · β] such that [A → α · X β] is in I
Similar to Earley's Scanner and Completer operations
We construct the collection of sets of LR(0) items for an augmented grammar G'
We start with the item set S_0 = closure({[S' → · S]})
The algorithm:
  procedure items(G');
  begin
    C := {closure({[S' → · S]})};
    repeat
      for each set of items I in C and each grammar symbol X
          such that goto(I, X) is not empty and not in C do
        add goto(I, X) to C
    until no more sets of items can be added to C
  end
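The procedure above, together with the Goto operation, can be sketched in Python as follows. This is a sketch under assumptions rather than a reference implementation: items are (lhs, rhs, dot) triples, the names GRAMMAR, closure, goto, and items are chosen for illustration, and the grammar is assumed to be the example NL grammar with the added rule S' → S. The sets come out in discovery order, which need not match the state numbers used in the example's parsing table.

```python
# Assumed example grammar, augmented with the start rule S' -> S.
GRAMMAR = [
    ("S'", ("S",)),
    ("S", ("NP", "VP")),
    ("NP", ("art", "adj", "n")),
    ("NP", ("art", "n")),
    ("NP", ("adj", "n")),
    ("VP", ("aux", "VP")),
    ("VP", ("v", "NP")),
]


def closure(item_set):
    """Add [B -> . gamma] for every non-terminal B that follows a dot."""
    result, work = set(item_set), list(item_set)
    while work:
        _, rhs, dot = work.pop()
        for lhs, r in GRAMMAR:
            if dot < len(rhs) and lhs == rhs[dot] and (lhs, r, 0) not in result:
                result.add((lhs, r, 0))
                work.append((lhs, r, 0))
    return frozenset(result)


def goto(item_set, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(l, r, d + 1) for l, r, d in item_set
             if d < len(r) and r[d] == symbol}
    return closure(moved)  # empty when no item has the dot before `symbol`


def items():
    """Collection of sets of LR(0) items for the augmented grammar."""
    symbols = sorted({s for _, rhs in GRAMMAR for s in rhs})
    C = [closure({("S'", ("S",), 0)})]
    added = True
    while added:  # repeat ... until no more sets of items can be added to C
        added = False
        for I in list(C):
            for X in symbols:
                J = goto(I, X)
                if J and J not in C:
                    C.append(J)
                    added = True
    return C
```

For the assumed grammar, items() yields 14 item sets, one per state 0-13 of the example table.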
Constructing an SLR Parsing Table - Example

Building the collection of sets of LR(0) items for our simple NL Grammar:

S0:   S' → · S
      S  → · NP VP
      NP → · art adj n
      NP → · art n
      NP → · adj n
S1:   S' → S ·
S2:   S  → NP · VP
      VP → · aux VP
      VP → · v NP
S3:   NP → art · adj n
      NP → art · n
S4:   NP → adj · n
S5:   S  → NP VP ·
S6:   VP → aux · VP
      VP → · aux VP
      VP → · v NP
S7:   VP → v · NP
      NP → · art adj n
      NP → · art n
      NP → · adj n
S8:   NP → art adj · n
S9:   NP → art n ·
S10:  NP → adj n ·
S11:  VP → aux VP ·
S12:  VP → v NP ·
S13:  NP → art adj n ·

Building the FSA and Parsing Table:
1. Construct the collection of sets of LR(0) items from the grammar G'
2. State i corresponds to item set S_i
  (a) For any terminal symbol a, if [A → α · a β] is in S_i and goto(S_i, a) = S_j, then set action[i, a] = shift j
  (b) If [A → α ·] is in S_i and A → α is rule n, then set action[i, a] = reduce n for all terminal symbols a in FOLLOW(A)
  (c) If [S' → S ·] is in S_i, then set action[i, $] = accept
  (d) For any non-terminal symbol A, if goto(S_i, A) = S_j, then set goto[i, A] = j
  (e) All table entries not defined in (a)-(d) are set as error
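Steps (a)-(e) can be sketched in Python as below. This is a sketch under several assumptions: the grammar is the example NL grammar with S' → S added at index 0, no rule has an empty right-hand side (which keeps the FOLLOW computation short), reduce entries are placed only under terminals in FOLLOW(A) as in the standard SLR construction, and all function names are illustrative. State numbers follow discovery order, so they may differ from the example table.

```python
# Assumed example grammar; rule n is GRAMMAR[n], rule 0 is the added start rule.
GRAMMAR = [
    ("S'", ("S",)),
    ("S", ("NP", "VP")),
    ("NP", ("art", "adj", "n")),
    ("NP", ("art", "n")),
    ("NP", ("adj", "n")),
    ("VP", ("aux", "VP")),
    ("VP", ("v", "NP")),
]
NONTERMS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    result, work = set(items), list(items)
    while work:
        _, rhs, dot = work.pop()
        for lhs, r in GRAMMAR:
            if dot < len(rhs) and lhs == rhs[dot] and (lhs, r, 0) not in result:
                result.add((lhs, r, 0))
                work.append((lhs, r, 0))
    return frozenset(result)


def goto(items, X):
    return closure({(l, r, d + 1) for l, r, d in items if d < len(r) and r[d] == X})


def collection():
    states = [closure({("S'", ("S",), 0)})]
    for I in states:  # the list grows while it is scanned
        for X in sorted({r[d] for _, r, d in I if d < len(r)}):
            if goto(I, X) not in states:
                states.append(goto(I, X))
    return states


def follow_sets():
    """FOLLOW sets by fixpoint iteration (assumes no empty right-hand sides)."""
    first = {A: set() for A in NONTERMS}
    follow = {A: set() for A in NONTERMS}
    follow["S'"].add("$")

    def first_of(X):
        return first[X] if X in NONTERMS else {X}

    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            updates = [(first[lhs], first_of(rhs[0]))]
            for i, X in enumerate(rhs):
                if X in NONTERMS:
                    last = i + 1 == len(rhs)
                    updates.append((follow[X], follow[lhs] if last else first_of(rhs[i + 1])))
            for target, new in updates:
                if not new <= target:
                    target |= new  # in-place union on the shared set object
                    changed = True
    return follow


def build_table():
    states, follow = collection(), follow_sets()
    action, goto_table, conflicts = {}, {}, []

    def set_action(i, a, act):
        if action.setdefault((i, a), act) != act:
            conflicts.append((i, a))  # two different actions for one entry

    for i, I in enumerate(states):
        for lhs, rhs, dot in I:
            if dot < len(rhs):
                j = states.index(goto(I, rhs[dot]))
                if rhs[dot] in NONTERMS:
                    goto_table[i, rhs[dot]] = j                 # step (d)
                else:
                    set_action(i, rhs[dot], ("shift", j))       # step (a)
            elif lhs == "S'":
                set_action(i, "$", ("accept",))                 # step (c)
            else:
                for a in follow[lhs]:                           # step (b)
                    set_action(i, a, ("reduce", GRAMMAR.index((lhs, rhs))))
    return states, action, goto_table, conflicts  # missing entries: step (e)
```

Entries absent from the action dictionary play the role of the error entries in step (e). For the assumed grammar the conflicts list stays empty.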
Constructing an SLR Parsing Table - Example

The constructed FSA for our example grammar:
[Figure: FSA over states S0-S13, with transitions labeled art, adj, n, aux, v, NP, VP, and S]
LR(k) Parsing
How to handle conflicts in the SLR table:
  A table conflict: more than one action is specified in a single table entry action[i, a]
  Conflicts can be either shift-reduce or reduce-reduce
  The parser will not be able to parse deterministically
  A grammar for which this happens is not SLR
More powerful techniques for building item sets can sometimes resolve the problem by making use of lookaheads into the input
  Known techniques: Canonical LR(k), LALR(k)
  A lookahead of one is sufficient (and optimal) in many cases
Another option is to extend the LR Parsing algorithm: GLR Parsing
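Detecting table conflicts amounts to checking whether any (state, terminal) entry received more than one action. The sketch below assumes actions are written as strings such as "sh5" or "r2", as in the example table; the function name find_conflicts and the sample entries in the usage note are hypothetical and do not come from any grammar in these slides.

```python
from collections import defaultdict


def find_conflicts(entries):
    """Classify conflicting entries of an action table.

    entries: iterable of (state, terminal, action) triples, where action
    is a string such as "sh5" (shift) or "r2" (reduce by rule 2).
    Returns {(state, terminal): "shift-reduce" or "reduce-reduce"}.
    """
    cell = defaultdict(set)
    for state, terminal, act in entries:
        cell[state, terminal].add(act)

    conflicts = {}
    for key, acts in cell.items():
        if len(acts) > 1:  # more than one action in a single table entry
            has_shift = any(a.startswith("sh") for a in acts)
            conflicts[key] = "shift-reduce" if has_shift else "reduce-reduce"
    return conflicts
```

With the hypothetical entries (4, "+", "sh3") and (4, "+", "r1"), the function reports a shift-reduce conflict for entry (4, "+"); a table in which every entry holds one action yields an empty result.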
Parsing with an LR Parser

The pointers that form the parse tree can be created while performing reduce actions
A parse node is created for each constituent that is pushed onto the stack
When we do a reduce, we create a new parse node for the LHS non-terminal and link it to the parse nodes of the popped RHS constituents
At the end, the constituent on the stack points to the root of the parse tree
The input: art adj n aux v art n $

Step  Action  Stack after action              Parse node
  0           0
  1   sh3     0 (1) 3                         1: art
  2   sh8     0 (1) 3 (2) 8                   2: adj
  3   sh13    0 (1) 3 (2) 8 (3) 13            3: n
  4   r2      0 (4) 2                         4: NP (1 2 3)
  5   sh6     0 (4) 2 (5) 6                   5: aux
  6   sh7     0 (4) 2 (5) 6 (6) 7             6: v
  7   sh3     0 (4) 2 (5) 6 (6) 7 (7) 3       7: art
  8   sh9     0 (4) 2 (5) 6 (6) 7 (7) 3 (8) 9 8: n
  9   r3      0 (4) 2 (5) 6 (6) 7 (9) 12      9: NP (7 8)
 10   r6      0 (4) 2 (5) 6 (10) 11           10: VP (6 9)
 11   r5      0 (4) 2 (11) 5                  11: VP (5 10)
 12   r1      0 (12) 1                        12: S (4 11)
 13   accept  0 (12) 1
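Building parse nodes during shift and reduce actions can be sketched in Python as follows. This is an illustrative sketch, not the course's code: the stack holds parse node numbers in place of grammar symbols, nodes are numbered from 1 in creation order, the names parse_with_nodes and bracket are made up here, the table entries are transcribed from the example's parsing table, and the rule numbering is assumed to be 1: S → NP VP, 2: NP → art adj n, 3: NP → art n, 4: NP → adj n, 5: VP → aux VP, 6: VP → v NP.

```python
# Rule number -> (LHS, length of RHS), assumed numbering of the example grammar.
RULES = {1: ("S", 2), 2: ("NP", 3), 3: ("NP", 2), 4: ("NP", 2), 5: ("VP", 2), 6: ("VP", 2)}
SHIFT = {
    0: {"art": 3, "adj": 4},
    2: {"aux": 6, "v": 7},
    3: {"adj": 8, "n": 9},
    4: {"n": 10},
    6: {"aux": 6, "v": 7},
    7: {"art": 3, "adj": 4},
    8: {"n": 13},
}
REDUCE = {5: 1, 9: 3, 10: 4, 11: 5, 12: 6, 13: 2}
GOTO = {0: {"NP": 2, "S": 1}, 2: {"VP": 5}, 6: {"VP": 11}, 7: {"NP": 12}}


def parse_with_nodes(tokens):
    """Parse a POS sequence ending in "$"; return (nodes, root node number).

    nodes maps a node number to a terminal label (str) or to a pair
    (non-terminal label, tuple of child node numbers).
    """
    nodes = {}
    stack = [0]  # states alternate with parse node numbers
    pos = 0
    while True:
        state, symbol = stack[-1], tokens[pos]
        if state == 1 and symbol == "$":
            return nodes, stack[-2]           # the node left on the stack
        if symbol in SHIFT.get(state, {}):
            nodes[len(nodes) + 1] = symbol    # one node per shifted symbol
            stack += [len(nodes), SHIFT[state][symbol]]
            pos += 1
        elif state in REDUCE:
            lhs, n = RULES[REDUCE[state]]
            children = tuple(stack[-2 * n::2])  # node numbers of the popped RHS
            del stack[-2 * n:]
            nodes[len(nodes) + 1] = (lhs, children)
            stack += [len(nodes), GOTO[stack[-1]][lhs]]
        else:
            raise SyntaxError(f"unexpected {symbol!r} in state {state}")


def bracket(nodes, k):
    """Render the subtree rooted at node k in bracketed form."""
    node = nodes[k]
    if isinstance(node, str):
        return node
    label, children = node
    return "(" + label + " " + " ".join(bracket(nodes, c) for c in children) + ")"
```

For the input art adj n aux v art n $, the root is node 12, S (4 11), and bracket renders the tree as (S (NP art adj n) (VP aux (VP v (NP art n)))).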
