
Deterministic Context-Free Languages

Definition. A pushdown automaton (PDA) $A = (Q, \Sigma, \Gamma, \delta, q_0, Z_0, F)$ is deterministic if:
1) whenever $\delta(q, a, X)$ is nonempty for some $a \in \Sigma$, then $\delta(q, \varepsilon, X)$ is empty, and
2) for each $q \in Q$, $a \in \Sigma \cup \{\varepsilon\}$ and $X \in \Gamma$, $\delta(q, a, X)$ contains at most one element.
A language $L$ is a deterministic context-free language (DCFL) if it is accepted by a deterministic pushdown automaton (DPDA).
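The two conditions of the definition can be checked mechanically on a finite transition table. The following is a minimal sketch, assuming a hypothetical encoding in which `delta` maps `(state, input, stack_top)` to a set of `(state, pushed_string)` pairs and the empty string stands for $\varepsilon$. The names `is_deterministic`, `anbn` and `guessing` are illustrative and do not come from the slides.

```python
EPS = ""  # the empty string stands for an epsilon-move (encoding assumption)

def is_deterministic(delta, states, alphabet, stack_symbols):
    """Check conditions 1) and 2) of the DPDA definition on a finite table."""
    for q in states:
        for X in stack_symbols:
            eps_moves = delta.get((q, EPS, X), set())
            if len(eps_moves) > 1:
                return False      # condition 2 fails for a = epsilon
            for a in alphabet:
                moves = delta.get((q, a, X), set())
                if len(moves) > 1:
                    return False  # condition 2 fails for a real input symbol
                if moves and eps_moves:
                    return False  # condition 1 fails: epsilon-move competes with a
    return True

STATES = {"q0", "q1", "f"}
ALPHABET = {"a", "b"}
STACK = {"Z", "A"}

# Illustrative PDA for { a^n b^n | n >= 1 }, accepting in state "f".
anbn = {
    ("q0", "a", "Z"): {("q0", "AZ")},
    ("q0", "a", "A"): {("q0", "AA")},
    ("q0", "b", "A"): {("q1", "")},
    ("q1", "b", "A"): {("q1", "")},
    ("q1", EPS, "Z"): {("f", "Z")},
}

# Variant with an extra epsilon-move that competes with reading "a".
guessing = dict(anbn)
guessing[("q0", EPS, "A")] = {("q1", "A")}

print(is_deterministic(anbn, STATES, ALPHABET, STACK))
print(is_deterministic(guessing, STATES, ALPHABET, STACK))
```

The first table satisfies both conditions. The second violates condition 1, because in state `q0` with `A` on top both an `a`-move and an $\varepsilon$-move are available.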
Advanced Automata Theory CS 9620

Normal forms for DPDAs
Lemma. Every DCFL is $L(A)$ for a DPDA $A = (Q, \Sigma, \Gamma, \delta, q_0, Z_0, F)$ such that if $\delta(q, a, X) = (p, \gamma)$, then $|\gamma| \le 2$.
Proof. If $\delta(q, a, X) = (r, \gamma)$ and $|\gamma| > 2$, let $\gamma = Y_1 Y_2 \cdots Y_n$, where $n \ge 3$. Create new nonaccepting states $p_1, p_2, \ldots, p_{n-2}$, and redefine $\delta(q, a, X) = (p_1, Y_{n-1} Y_n)$.


Then define $\delta(p_i, \varepsilon, Y_{n-i}) = (p_{i+1}, Y_{n-i-1} Y_{n-i})$ for $1 \le i \le n-3$ and $\delta(p_{n-2}, \varepsilon, Y_2) = (r, Y_1 Y_2)$. Thus, in state $q$, on input $a$, with $X$ on top of the stack, the revised DPDA still replaces $X$ with $Y_1 Y_2 \cdots Y_n = \gamma$ and enters state $r$, but it now takes $n-1$ steps to do so.
Lemma. Every DCFL is $L(A)$ for a DPDA $A = (Q, \Sigma, \Gamma, \delta, q_0, Z_0, F)$ such that if $\delta(q, a, X) = (p, \gamma)$, then $\gamma$ is either $\varepsilon$ (a pop), $X$ (no stack change), or of the form $YX$ (a push) for some stack symbol $Y$.

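Proof. Assume $L = L(A')$, where
$A' = (Q', \Sigma, \Gamma', \delta', q_0', X_0, F')$
satisfies the previous lemma. We construct $A$ to simulate $A'$ while keeping the top stack symbol of $A'$ in $A$'s control. Formally, let
$Q = Q' \times \Gamma'$, $q_0 = [q_0', X_0]$,
$F = F' \times \Gamma'$ and $\Gamma = \Gamma' \cup \{Z_0\}$, where $Z_0$ is a new symbol not in $\Gamma'$. Define $\delta$ by:
i) If $\delta'(q, a, X) = (p, \varepsilon)$, then for all $Y$, $\delta([q, X], a, Y) = ([p, Y], \varepsilon)$.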

ii) If $\delta'(q, a, X) = (p, Y)$, then for all $Z$, $\delta([q, X], a, Z) = ([p, Y], Z)$.
iii) If $\delta'(q, a, X) = (p, YZ)$, then for all $W$, $\delta([q, X], a, W) = ([p, Y], ZW)$.
It is easy to show by induction on the number of moves that
$(q_0', w, X_0) \vdash^{*}_{A'} (p, \varepsilon, X_1 X_2 \cdots X_n)$
if and only if
$([q_0', X_0], w, Z_0) \vdash^{*}_{A} ([p, X_1], \varepsilon, X_2 \cdots X_n Z_0)$.
Thus $L(A) = L(A')$.
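Returning to the first normal-form lemma (the bound $|\gamma| \le 2$), the chain of $\varepsilon$-moves used in its proof can be sketched in code. This is only an illustration under assumed conventions: `delta` maps `(q, a, X)` to a single pair `(r, gamma)`, `gamma` is a tuple whose first entry is the new stack top, the empty string denotes $\varepsilon$, and the fresh states are tuples `("p", k)`. The helper `run` is hypothetical and only replays one move plus the $\varepsilon$-moves that follow it.

```python
def split_long_pushes(delta):
    """Return an equivalent table in which every pushed string has length <= 2."""
    new_delta = {}
    fresh = 0
    for (q, a, X), (r, gamma) in delta.items():
        n = len(gamma)
        if n <= 2:
            new_delta[(q, a, X)] = (r, gamma)
            continue
        # Fresh nonaccepting states p_1 .. p_{n-2}; p[i-1] plays the role of p_i.
        p = [("p", fresh + i) for i in range(n - 2)]
        fresh += n - 2
        Y = (None,) + tuple(gamma)  # 1-based indexing: Y[1] .. Y[n]
        new_delta[(q, a, X)] = (p[0], (Y[n - 1], Y[n]))
        for i in range(1, n - 2):   # i = 1 .. n-3
            new_delta[(p[i - 1], "", Y[n - i])] = (p[i], (Y[n - i - 1], Y[n - i]))
        new_delta[(p[n - 3], "", Y[2])] = (r, (Y[1], Y[2]))
    return new_delta

def run(delta, state, symbol, stack):
    """Apply the move on `symbol`, then follow epsilon-moves while any exist."""
    moves, a = 0, symbol
    while stack and (state, a, stack[0]) in delta:
        state, gamma = delta[(state, a, stack[0])]
        stack = gamma + stack[1:]
        moves += 1
        a = ""  # after the first move, only epsilon-moves are followed
    return state, stack, moves

original = {("q", "a", "X"): ("r", ("A", "B", "C", "D"))}
revised = split_long_pushes(original)
print(run(revised, "q", "a", ("X",)))
```

For the pushed string `A B C D` (so $n = 4$) the revised table reaches state `r` with the same stack contents, using $n - 1 = 3$ moves instead of one.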



Forcing DPDAs to scan their input
Lemma. Let $M$ be a DPDA. There exists an equivalent DPDA $M'$ such that on every input, $M'$ scans the entire input.
Proof. There are three situations in which a DPDA would not finish reading the entire input: (1) the stack is emptied; (2) a transition is not defined; (3) there is an infinite sequence of $\varepsilon$-moves. Let $M = (Q, \Sigma, \Gamma, \delta, q_0, Z_0, F)$. In order to solve the above-mentioned problems, we define

$M' = (Q', \Sigma, \Gamma', \delta', q_0', X_0, F')$ where $Q' = Q \cup \{q_0', d, f\}$, $\Gamma' = \Gamma \cup \{X_0\}$, $F' = F \cup \{f\}$, and $\delta'$ is defined as follows.



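(1) $\delta'(q_0', \varepsilon, X_0) = (q_0, Z_0 X_0)$. $X_0$ marks the bottom of the stack.
(2) If for some $q \in Q$, $a \in \Sigma$, and $Z \in \Gamma$, $\delta(q, a, Z)$ and $\delta(q, \varepsilon, Z)$ are both empty, then $\delta'(q, a, Z) = (d, Z)$. Also for all $q \in Q$ and $a \in \Sigma$, $\delta'(q, a, X_0) = (d, X_0)$.
(3) $\delta'(d, a, Z) = (d, Z)$ for all $a \in \Sigma$ and $Z \in \Gamma \cup \{X_0\}$.
(4) If for $q$ and $Z$ and for all $i$ there exist $q_i$ and $\gamma_i$ for which $(q, \varepsilon, Z) \vdash^{i} (q_i, \varepsilon, \gamma_i)$, then $\delta'(q, \varepsilon, Z) = (d, Z)$ if no $q_i$ is final and $\delta'(q, \varepsilon, Z) = (f, Z)$ if one or more of the $q_i$'s is final.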

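(5) $\delta'(f, \varepsilon, Z) = (d, Z)$ for all $Z \in \Gamma \cup \{X_0\}$.
(6) For any $q \in Q$, $a \in \Sigma \cup \{\varepsilon\}$, and $Z \in \Gamma$, if $\delta'(q, a, Z)$ has not been defined by (2) or (4), then $\delta'(q, a, Z) = \delta(q, a, Z)$.
It is clear that $L(M') = L(M)$. To prove that $M'$ reads all its input, suppose that for some proper prefix $x$ of $xy$
$(q_0', xy, X_0) \vdash^{*}_{M'} (q, y, Z_1 Z_2 \cdots Z_k X_0)$,
and from $(q, y, Z_1 Z_2 \cdots Z_k X_0)$ no symbol of $y$ is ever consumed.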


By rule (2) it is not possible that $M'$ halts. By rule (4), it is not possible that $M'$ makes an infinite sequence of $\varepsilon$-moves without erasing $Z_1$. Therefore $M'$ must eventually erase $Z_1$. Similarly $M'$ must erase $Z_2, \ldots, Z_k$ and eventually enter $(q', y, X_0)$. By rule (2), $(q', y, X_0) \vdash_{M'} (d, y', X_0)$, where $y = ay'$ for some $a \in \Sigma$. Thus, $M'$ did not fail to read past $x$ as supposed, and $M'$ satisfies the conditions of the lemma.


Closure under complementation
Theorem. The complement of a DCFL is a DCFL.
Proof. Let $M = (Q, \Sigma, \Gamma, \delta, q_0, Z_0, F)$ be a DPDA that reads the entire input. Let $M' = (Q', \Sigma, \Gamma, \delta', q_0', Z_0, F')$ be a DPDA simulating $M$, where $Q' = \{[q, k] \mid q \in Q \text{ and } k = 1, 2, \text{ or } 3\}$. Let $F' = \{[q, 3] \mid q \in Q\}$, and let
$q_0' = [q_0, 1]$ if $q_0$ is in $F$; $q_0' = [q_0, 2]$ if $q_0$ is not in $F$.


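The purpose of $k$ in $[q, k]$ is to record, between true inputs, whether or not $M$ has entered an accepting state. If $M$ has entered an accepting state since the last true input, then $k = 1$. Otherwise, $k = 2$. If $k = 1$ when $M$ reads a true input symbol, $M'$ simulates the move of $M$ directly, setting $k$ to 1 or 2 depending on whether the new state of $M$ is in $F$. If $k = 2$, $M'$ first changes $k$ to 3 and then simulates the move of $M$, changing $k$ to 1 or 2, depending on whether the new state of $M$ is in $F$. Thus, $\delta'$ is defined as follows, for $q, p \in Q$ and $a \in \Sigma$.
(1) If $\delta(q, \varepsilon, Z) = (p, \gamma)$, then for $k = 1$ or $2$, $\delta'([q, k], \varepsilon, Z) = ([p, k'], \gamma)$, where $k' = 1$ if $k = 1$ or $p \in F$; otherwise $k' = 2$.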

(2) If $\delta(q, a, Z) = (p, \gamma)$, for $a \in \Sigma$, then $\delta'([q, 2], \varepsilon, Z) = ([q, 3], Z)$ and $\delta'([q, 1], a, Z) = \delta'([q, 3], a, Z) = ([p, k], \gamma)$, where $k = 1$ or $2$ for $p \in F$ or $p \notin F$, respectively.
We claim that $L(M') = \overline{L(M)}$. Suppose that $a_1 a_2 \cdots a_n$ is in $L(M)$. Then $M$ enters an accepting state after reading $a_n$. In that case, the second component of the state of $M'$ will be 1 before it is possible for $M'$ to use a true input after $a_n$. Therefore, $M'$ does not accept while $a_n$ was the last true input.

If $a_1 a_2 \cdots a_n$ is not in $L(M)$, then $M'$, like $M$, reads the entire input and will, sometime after reading $a_n$, have no $\varepsilon$-moves to make. At this time, the second component of $M'$'s state is 2, since $a_1 a_2 \cdots a_n$ is not in $L(M)$. By rule (2) above, $M'$ will accept before attempting to use a true input symbol.
Example. Let $L = \{a^i b^j c^k \mid i \ne j \text{ or } j \ne k\}$. It is clear that $L$ is CF. We can show that $L$ is not DCF. Assume that $L$ is DCF; then $\overline{L}$ is DCF by the previous theorem. Then $\overline{L} \cap a^* b^* c^*$ is CF, because CFLs are closed under intersection with a regular set. However, we know that $\overline{L} \cap a^* b^* c^* = \{a^i b^j c^k \mid i = j = k\}$, which is not CF. Therefore, $L$ cannot be DCF.
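The set identity used in the example can be checked by brute force on short strings. The sketch below assumes the reading $L = \{a^i b^j c^k \mid i \ne j \text{ or } j \ne k\}$ and enumerates every word over $\{a, b, c\}$ of length at most 6. The function names are illustrative.

```python
import re
from itertools import product

def in_L(w):
    """Membership in L = { a^i b^j c^k | i != j or j != k }."""
    m = re.fullmatch(r"(a*)(b*)(c*)", w)
    if m is None:
        return False
    i, j, k = (len(g) for g in m.groups())
    return i != j or j != k

def in_complement_cap_abc(w):
    """Membership in complement(L) intersected with a*b*c*."""
    return re.fullmatch(r"a*b*c*", w) is not None and not in_L(w)

# All words over {a, b, c} of length 0..6.
words = ["".join(t) for n in range(7) for t in product("abc", repeat=n)]
survivors = sorted(w for w in words if in_complement_cap_abc(w))
print(survivors)
```

Only words of the form $a^n b^n c^n$ survive, which matches the claim that intersecting the complement with $a^* b^* c^*$ leaves exactly the strings with $i = j = k$.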


Some decision properties and undecidable properties of DCFLs

Theorem. Let $L$ be a DCFL and $R$ a regular set. The following problems are decidable.
(1) Is $R \subseteq L$?
(2) Is $L = R$?
(3) Is $\overline{L} = \emptyset$?
(4) Is $\overline{L}$ a CFL?


Proof.
(1) $R \subseteq L$ if and only if $\overline{L} \cap R = \emptyset$. Since $\overline{L}$ is a DCFL and hence context-free, $\overline{L} \cap R$ is context-free, so whether $\overline{L} \cap R = \emptyset$ is decidable.
(2) $L = R$ if and only if $L_1 = (L \cap \overline{R}) \cup (\overline{L} \cap R)$ is empty. Since the DCFLs are effectively closed under complementation and CFLs are effectively closed under union and intersection with a regular set, $L_1$ is a CFL, and emptiness for CFLs is decidable.
(3) $\overline{L}$ is a DCFL, and emptiness for CFLs is decidable.
(4) The property that $\overline{L}$ is a CFL holds for every DCFL $L$; it is therefore trivial and hence decidable.
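Each part of the proof reduces to the decidability of emptiness for context-free languages. One standard way to decide emptiness of a CFG is to compute the set of nonterminals that derive some terminal string. The sketch below assumes a hypothetical encoding in which a grammar is a dict from nonterminals to lists of right-hand sides, and every symbol that is not a key of the dict is a terminal.

```python
def is_empty(productions, start):
    """Return True iff the grammar generates no terminal string from `start`."""
    generating = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head in generating:
                continue
            for body in bodies:
                # A body is usable once every nonterminal in it is generating.
                if all(s in generating or s not in productions for s in body):
                    generating.add(head)
                    changed = True
                    break
    return start not in generating

# Illustrative grammars (not taken from the slides).
G_ok = {"S": [("S", "A"), ("A",)], "A": [("a", "S", "b"), ("a", "b")]}
G_loop = {"S": [("S", "a")]}                # S can never terminate
G_dead = {"S": [("A", "b")], "A": []}       # A has no productions at all

print(is_empty(G_ok, "S"), is_empty(G_loop, "S"), is_empty(G_dead, "S"))
```

The loop adds at least one nonterminal per pass until nothing changes, so it terminates after at most as many passes as there are nonterminals.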

Theorem. Let $L_1$ and $L_2$ be arbitrary DCFLs. Then the following problems are undecidable.
(1) Is $L_1 \cap L_2 = \emptyset$?
(2) Is $L_1 \subseteq L_2$?
Proof. It is known that the set of all valid computations of a Turing machine $M$ can be represented as the intersection of two context-free languages, i.e., $L_1 \cap L_2$. It is easy to show that both $L_1$ and $L_2$ are DCFLs by exhibiting DPDAs that accept them. Thus, whether $L_1 \cap L_2 = \emptyset$ is undecidable, since whether $L(M) = \emptyset$ is undecidable. Moreover, $L_1 \subseteq \overline{L_2}$ if and only if $L_1 \cap L_2 = \emptyset$, and $\overline{L_2}$ is again a DCFL. So (2) follows from (1).

LR(0) Grammars
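Definition. An item for a given CFG is a production with a dot anywhere in the right side, including the beginning or end.
Example. $G_1 = (N, \Sigma, P, S')$ is a CFG, where $P$ consists of
$S' \to Sc$
$S \to SA \mid A$
$A \to aSb \mid ab$
The items for $G_1$ are
$S' \to \cdot Sc$, $S' \to S \cdot c$, $S' \to Sc \cdot$
$S \to \cdot SA$, $S \to S \cdot A$, $S \to SA \cdot$
$S \to \cdot A$, $S \to A \cdot$
$A \to \cdot aSb$, $A \to a \cdot Sb$, $A \to aS \cdot b$, $A \to aSb \cdot$
$A \to \cdot ab$, $A \to a \cdot b$, $A \to ab \cdot$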
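The items of a grammar can be enumerated by placing the dot at every position of every production. The following sketch assumes a hypothetical encoding of $G_1$ as a list of `(head, body)` pairs and of an item as a triple `(head, body, dot)`.

```python
# Grammar G1 from the example; S' is the start symbol.
G1 = [
    ("S'", ("S", "c")),
    ("S", ("S", "A")),
    ("S", ("A",)),
    ("A", ("a", "S", "b")),
    ("A", ("a", "b")),
]

def items(grammar):
    """One item per production and dot position 0..len(body)."""
    return [(head, body, dot)
            for head, body in grammar
            for dot in range(len(body) + 1)]

def show(item):
    """Render an item in the 'A -> a . S b' style used in the figures."""
    head, body, dot = item
    symbols = list(body[:dot]) + ["."] + list(body[dot:])
    return head + " -> " + " ".join(symbols)

for it in items(G1):
    print(show(it))
```

A production with a right side of length $n$ contributes $n + 1$ items, so $G_1$ has $3 + 3 + 2 + 4 + 3 = 15$ items, in agreement with the list above.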

We use the symbols $\Rightarrow_{rm}$ and $\Rightarrow^{*}_{rm}$ to denote single-step and multiple-step rightmost derivations, respectively. A right-sentential form is a sentential form that can be derived by a rightmost derivation.
Definition. A handle of a right-sentential form $\gamma$ for CFG $G$ is a substring $\beta$ such that $S \Rightarrow^{*}_{rm} \delta A w \Rightarrow_{rm} \delta \beta w$, and $\delta \beta w = \gamma$.
Definition. A viable prefix of a right-sentential form $\gamma$ is any prefix of $\gamma$ ending no further right than the right end of a handle of $\gamma$.


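Example. In grammar $G_1$, there is a rightmost derivation $S' \Rightarrow_{rm} Sc \Rightarrow_{rm} SAc \Rightarrow_{rm} SaSbc$. Thus, $SaSbc$ is a right-sentential form, and its handle is $aSb$. The viable prefixes of $SaSbc$ are $\varepsilon$, $S$, $Sa$, $SaS$, and $SaSb$.
Definition. An item $A \to \alpha \cdot \beta$ is said to be valid for a viable prefix $\gamma$ if there is a rightmost derivation $S \Rightarrow^{*}_{rm} \delta A w \Rightarrow_{rm} \delta \alpha \beta w$ and $\delta \alpha = \gamma$.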

An item is said to be complete if the dot is the rightmost symbol in the item.
Example. Consider $G_1$ and the right-sentential form $abc$. Since $S' \Rightarrow^{*}_{rm} Ac \Rightarrow_{rm} abc$, we see that $A \to ab \cdot$ is valid for viable prefix $ab$. We also see that $A \to a \cdot b$ is valid for viable prefix $a$, and $A \to \cdot ab$ is valid for viable prefix $\varepsilon$. As $A \to ab \cdot$ is a complete item, we might be able to deduce that $Ac$ was the previous right-sentential form for $abc$.

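For every CFG $G$, the set of viable prefixes is a regular set, and this set is accepted by an NFA whose states are the items for $G$. Let $G = (N, \Sigma, P, S)$ be a CFG. The NFA $M$ recognizing the viable prefixes for $G$ is defined as follows. Let $M = (Q, N \cup \Sigma, \delta, q_0, Q)$, where $Q$ is the set of items for $G$ plus the state $q_0$, which is not an item. Define
(1) $\delta(q_0, \varepsilon) = \{S \to \cdot \alpha \mid S \to \alpha \in P\}$,
(2) $\delta(A \to \alpha \cdot B \beta, \varepsilon) = \{B \to \cdot \gamma \mid B \to \gamma \in P\}$,
(3) $\delta(A \to \alpha \cdot X \beta, X) = \{A \to \alpha X \cdot \beta\}$.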


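Example. The NFA for grammar G1 is shown below.

[Figure: NFA for G1. The start state q0 has an ε-transition to S' -> . Sc. The labelled transitions advance the dot along each production:
  S' -> . Sc   --S-->  S' -> S . c   --c-->  S' -> Sc .
  S -> . SA    --S-->  S -> S . A    --A-->  S -> SA .
  S -> . A     --A-->  S -> A .
  A -> . aSb   --a-->  A -> a . Sb   --S-->  A -> aS . b   --b-->  A -> aSb .
  A -> . ab    --a-->  A -> a . b    --b-->  A -> ab .
In addition, every item with the dot immediately before S has ε-transitions to S -> . SA and S -> . A, and every item with the dot immediately before A has ε-transitions to A -> . aSb and A -> . ab.]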


Definition. We say that a grammar $G$ is an LR(0) grammar if
(1) its start symbol does not appear on the right side of any production, and
(2) for every viable prefix $\gamma$ of $G$, whenever $A \to \alpha \cdot$ is a complete item valid for $\gamma$, then no other complete item nor any item with a terminal to the right of the dot is valid for $\gamma$.
Note that there is no prohibition against several incomplete items being valid for $\gamma$, as long as no complete item is valid.


Example. The DFA constructed from the NFA for grammar $G_1$ is shown in the next diagram. Every state that contains a complete item contains exactly one complete item and no incomplete item with a terminal to the right of the dot. Also, the start symbol $S'$ does not appear on the right side of any production. So, $G_1$ is LR(0).


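[Figure: DFA accepting the viable prefixes of G1. The start state is state 0.]

State   Items
0       S' -> . Sc,   S -> . SA,   S -> . A,   A -> . aSb,   A -> . ab
1       S' -> S . c,  S -> S . A,  A -> . aSb,  A -> . ab
2       S -> A .
3       A -> a . Sb,  A -> a . b,  S -> . SA,   S -> . A,   A -> . aSb,   A -> . ab
4       S' -> Sc .
5       S -> SA .
6       A -> aS . b,  S -> S . A,  A -> . aSb,  A -> . ab
7       A -> ab .
8       A -> aSb .

Transitions:
from 0:  S to 1,  A to 2,  a to 3
from 1:  c to 4,  A to 5,  a to 3
from 3:  S to 6,  A to 2,  a to 3,  b to 7
from 6:  A to 5,  a to 3,  b to 8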
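The DFA above can be reproduced by the usual closure and goto computation on sets of items, which amounts to the subset construction applied to the item NFA. The sketch below is one way to do this, assuming items are encoded as triples `(head, body, dot)`. The function names are illustrative, and the states come out as frozensets rather than with the numbers 0 to 8 used in the figure.

```python
G1 = [("S'", ("S", "c")), ("S", ("S", "A")), ("S", ("A",)),
      ("A", ("a", "S", "b")), ("A", ("a", "b"))]
NONTERMINALS = {head for head, _ in G1}

def closure(item_set):
    """Add B -> . gamma for every item with the dot before nonterminal B."""
    result, work = set(item_set), list(item_set)
    while work:
        head, body, dot = work.pop()
        if dot < len(body) and body[dot] in NONTERMINALS:
            for h, b in G1:
                if h == body[dot] and (h, b, 0) not in result:
                    result.add((h, b, 0))
                    work.append((h, b, 0))
    return frozenset(result)

def goto(state, X):
    """Move the dot over X in every item that allows it, then close."""
    return closure({(h, b, d + 1) for h, b, d in state
                    if d < len(b) and b[d] == X})

def build_dfa():
    start = closure({(h, b, 0) for h, b in G1 if h == "S'"})
    states, trans, work = {start}, {}, [start]
    while work:
        s = work.pop()
        for X in {b[d] for _, b, d in s if d < len(b)}:
            t = goto(s, X)
            trans[(s, X)] = t
            if t not in states:
                states.add(t)
                work.append(t)
    return start, states, trans

def is_lr0(states):
    """Check both conditions of the LR(0) definition on the DFA states."""
    if any("S'" in body for _, body in G1):
        return False
    for s in states:
        complete = [it for it in s if it[2] == len(it[1])]
        shifts = [it for it in s
                  if it[2] < len(it[1]) and it[1][it[2]] not in NONTERMINALS]
        if complete and (len(complete) > 1 or shifts):
            return False
    return True

start, states, trans = build_dfa()
print(len(states), len(trans), is_lr0(states))
```

For $G_1$ this yields 9 states and 13 transitions, the same counts as in the figure, and the LR(0) check succeeds.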


Example. Let x = aababbc be an input. The parsing of the input x using the DFA is shown in the following table.

      Stack        Remaining input   Comments
 1)   0            aababbc           Initial configuration
 2)   0a3          ababbc            Shift
 3)   0a3a3        babbc             Shift
 4)   0a3a3b7      abbc              Shift
 5)   0a3A2        abbc              Reduce by A -> ab
 6)   0a3S6        abbc              Reduce by S -> A
 7)   0a3S6a3      bbc               Shift
 8)   0a3S6a3b7    bc                Shift
 9)   0a3S6A5      bc                Reduce by A -> ab
10)   0a3S6        bc                Reduce by S -> SA
11)   0a3S6b8      c                 Shift
12)   0A2          c                 Reduce by A -> aSb
13)   0S1          c                 Reduce by S -> A
14)   0S1c4        ε                 Shift
15)   -            ε                 Accept
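The table can be replayed with a small driver that keeps only the state numbers on the stack. The sketch below hard-codes the transitions of the DFA for $G_1$ and the complete item of each reducing state. The names `GOTO`, `REDUCE` and `parse` are illustrative, and each grammar symbol is assumed to be a single character so that the length of a right side equals the number of states to pop.

```python
# Transitions of the DFA for G1 (state numbers as in the figure).
GOTO = {
    (0, "S"): 1, (0, "A"): 2, (0, "a"): 3,
    (1, "c"): 4, (1, "A"): 5, (1, "a"): 3,
    (3, "S"): 6, (3, "A"): 2, (3, "a"): 3, (3, "b"): 7,
    (6, "A"): 5, (6, "a"): 3, (6, "b"): 8,
}
# States holding a complete item: state -> (head, body).
REDUCE = {2: ("S", "A"), 4: ("S'", "Sc"), 5: ("S", "SA"),
          7: ("A", "ab"), 8: ("A", "aSb")}

def parse(w):
    """Return (accepted, list of actions) for input string w."""
    stack, pos, log = [0], 0, []
    while True:
        top = stack[-1]
        if top in REDUCE:
            head, body = REDUCE[top]
            if head == "S'":
                # Reducing by S' -> Sc means accept, if the input is used up.
                return pos == len(w), log
            del stack[-len(body):]          # pop one state per body symbol
            nxt = GOTO.get((stack[-1], head))
            action = f"reduce {head} -> {body}"
        else:
            if pos == len(w):
                return False, log           # input ended too early
            nxt = GOTO.get((top, w[pos]))
            action = "shift"
            pos += 1
        if nxt is None:
            return False, log               # no transition: reject
        log.append(action)
        stack.append(nxt)

accepted, actions = parse("aababbc")
print(accepted)
for step in actions:
    print(step)
```

No lookahead is needed to choose between shifting and reducing: the state on top of the stack alone determines the action, which is exactly what the LR(0) condition guarantees.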

Definition. A language $L$ is said to have the prefix property if, whenever $w$ is in $L$, no proper prefix of $w$ is in $L$.
Theorem. A language $L$ has an LR(0) grammar if and only if $L$ is a DCFL with the prefix property.
Corollary. $L\$$ has an LR(0) grammar if and only if $L$ is a DCFL, where $\$$ is not a symbol of $L$'s alphabet.
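The role of the endmarker in the corollary can be illustrated on a finite sample of words. The sketch below only tests the prefix property on a finite set, so it is an illustration rather than a decision procedure for a whole language. The sample is a hand-picked subset of $(a^* b)^*$ and is an assumption of this example.

```python
def has_prefix_property(words):
    """True iff no word in the set has a proper prefix that is also in the set."""
    words = set(words)
    return not any(w[:i] in words for w in words for i in range(len(w)))

# A few words of (a*b)*; "b" is a proper prefix of "bb", and "" of everything.
sample = {"", "b", "ab", "bb"}
with_endmarker = {w + "$" for w in sample}

print(has_prefix_property(sample))
print(has_prefix_property(with_endmarker))
```

Appending `$` to every word restores the prefix property, because a proper prefix of `w$` never ends in `$`.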


LR(k) Grammars
Definition. An LR(1) item consists of an LR(0) item followed by a lookahead set consisting of terminals and/or the special symbol $\$$, which serves to denote the right end of a string. The generic form of an LR(1) item is $A \to \alpha \cdot \beta, \{a_1, a_2, \ldots, a_n\}$. We say that LR(1) item $A \to \alpha \cdot \beta, \{a\}$ is valid for viable prefix $\gamma$ if there is a rightmost derivation
$S \Rightarrow^{*}_{rm} \delta A y \Rightarrow_{rm} \delta \alpha \beta y$
where $\delta \alpha = \gamma$, and either (i) $a$ is the first symbol of $y$, or (ii) $y = \varepsilon$ and $a$ is $\$$. Also, $A \to \alpha \cdot \beta, \{a_1, a_2, \ldots, a_n\}$ is valid for $\gamma$ if for each $i$, $A \to \alpha \cdot \beta, \{a_i\}$ is valid for $\gamma$.
The set of all viable prefixes of a grammar is a regular set. We can construct an NFA to recognize this set using the set of LR(1) items as the set of states. The transitions of the NFA are defined as follows.

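(1) There is a transition on $X$ from $A \to \alpha \cdot X \beta, \{a_1, a_2, \ldots, a_n\}$ to $A \to \alpha X \cdot \beta, \{a_1, a_2, \ldots, a_n\}$.
(2) There is a transition on $\varepsilon$ from $A \to \alpha \cdot B \beta, \{a_1, a_2, \ldots, a_n\}$ to $B \to \cdot \gamma, T$, if $B \to \gamma$ is a production and $T$ is the set of terminals and/or $\$$ such that $b$ is in $T$ if and only if either (i) $\beta$ derives a terminal string beginning with $b$, or (ii) $\beta \Rightarrow^{*} \varepsilon$, and $b$ is $a_i$ for some $1 \le i \le n$.
(3) There is an initial state $q_0$ with transitions on $\varepsilon$ to $S \to \cdot \alpha, \{\$\}$ for each production $S \to \alpha$.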
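The lookahead set $T$ in rule (2) can be computed from the usual nullable and FIRST sets. The sketch below does this for a small illustrative grammar with productions $S \to A$, $A \to BA \mid \varepsilon$, $B \to aB \mid b$. It assumes every nonterminal derives some terminal string, so that condition (i) coincides with membership in FIRST of $\beta$. All function names are illustrative.

```python
G = [("S", ("A",)), ("A", ("B", "A")), ("A", ()),
     ("B", ("a", "B")), ("B", ("b",))]
NT = {head for head, _ in G}

def nullable_set():
    """Nonterminals that derive the empty string."""
    nullable, changed = set(), True
    while changed:
        changed = False
        for head, body in G:
            if head not in nullable and all(s in nullable for s in body):
                nullable.add(head)
                changed = True
    return nullable

def first_sets(nullable):
    """FIRST(X) for every nonterminal X, as sets of terminals."""
    first = {n: set() for n in NT}
    changed = True
    while changed:
        changed = False
        for head, body in G:
            for s in body:
                new = first[s] if s in NT else {s}
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
                if s not in nullable:
                    break        # later symbols cannot contribute
    return first

def lookahead(beta, parent_lookahead, first, nullable):
    """Set T of rule (2) for an item A -> alpha . B beta, parent_lookahead."""
    T = set()
    for s in beta:
        T |= first[s] if s in NT else {s}   # condition (i)
        if s not in nullable:
            return T
    return T | set(parent_lookahead)        # condition (ii): beta derives epsilon

NULLABLE = nullable_set()
FIRST = first_sets(NULLABLE)
# From A -> . B A, {$}: beta = A, which is nullable and starts with a or b.
print(sorted(lookahead(("A",), {"$"}, FIRST, NULLABLE)))
# From S -> . A, {$}: beta is empty, so the lookahead is inherited.
print(sorted(lookahead((), {"$"}, FIRST, NULLABLE)))
```

For the item $A \to \cdot BA, \{\$\}$ the computed set is $\{a, b, \$\}$, and for $S \to \cdot A, \{\$\}$ it is $\{\$\}$.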

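Example. Consider the grammar G2
  S -> A
  A -> BA | ε
  B -> aB | b
which happens to generate the language (a*b)*. The NFA accepting the set of viable prefixes of G2 is shown below.

[Figure: NFA for G2. The start state q0 has an ε-transition to S -> . A, {$}. The labelled transitions are:
  S -> . A, {$}        --A-->  S -> A . , {$}
  A -> . BA, {$}       --B-->  A -> B . A, {$}       --A-->  A -> BA . , {$}
  B -> . aB, {a,b,$}   --a-->  B -> a . B, {a,b,$}   --B-->  B -> aB . , {a,b,$}
  B -> . b, {a,b,$}    --b-->  B -> b . , {a,b,$}
The item A -> . , {$} is complete and has no outgoing transitions. In addition, S -> . A, {$} and A -> B . A, {$} have ε-transitions to A -> . BA, {$} and A -> . , {$}; A -> . BA, {$} and B -> a . B, {a,b,$} have ε-transitions to B -> . aB, {a,b,$} and B -> . b, {a,b,$}.]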


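The corresponding DFA is shown in the following.

[Figure: DFA accepting the viable prefixes of G2. The start state is state 0.]

State   Items
0       S -> . A, {$};   A -> . BA, {$};   A -> . , {$};   B -> . aB, {a,b,$};   B -> . b, {a,b,$}
1       S -> A . , {$}
2       A -> B . A, {$};   A -> . BA, {$};   A -> . , {$};   B -> . aB, {a,b,$};   B -> . b, {a,b,$}
3       B -> a . B, {a,b,$};   B -> . aB, {a,b,$};   B -> . b, {a,b,$}
4       B -> b . , {a,b,$}
5       A -> BA . , {$}
6       B -> aB . , {a,b,$}

Transitions:
from 0:  A to 1,  B to 2,  a to 3,  b to 4
from 2:  A to 5,  B to 2,  a to 3,  b to 4
from 3:  B to 6,  a to 3,  b to 4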


Definition. A grammar is said to be LR(1) if
(1) the start symbol appears on no right side, and
(2) whenever the set of items $I$ valid for some viable prefix includes some complete item $A \to \alpha \cdot, \{a_1, a_2, \ldots, a_n\}$, then
(i) no $a_i$ appears immediately to the right of the dot in any item of $I$, and
(ii) if $B \to \beta \cdot, \{b_1, b_2, \ldots, b_k\}$ is another complete item in $I$, then $a_i \ne b_j$ for all $1 \le i \le n$ and $1 \le j \le k$.
It is easy to check that grammar $G_2$ is an LR(1) grammar.


The automaton that accepts an LR(1) language is like a DPDA, except that it is allowed to use the next input symbol in making its decisions even when it makes a move that does not consume its input. The DPDA can keep the next symbol or $\$$ in its state to indicate the symbol scanned. The rules whereby it decides to reduce or shift an input symbol onto the stack are:
1) If the top set of items has a complete item $A \to \alpha \cdot, \{a_1, a_2, \ldots, a_n\}$, where $A \ne S$, reduce by $A \to \alpha$ if the current input symbol is in $\{a_1, a_2, \ldots, a_n\}$.
2) If the top set of items has an item $S \to \alpha \cdot, \{\$\}$, then reduce by $S \to \alpha$ and accept if the current symbol is $\$$, that is, the end of the input is reached.
3) If the top set of items has an item $A \to \alpha \cdot a \beta, T$ and $a$ is the current input symbol, then shift.
Note that the definition of an LR(1) grammar guarantees that at most one of the above will apply for any particular input symbol or $\$$.


Example. The decision table for grammar G2 is shown in the following. A dash means that no action applies.

State   a                    b                    $
0       Shift                Shift                Reduce by A -> ε
1       -                    -                    Accept
2       Shift                Shift                Reduce by A -> ε
3       Shift                Shift                -
4       Reduce by B -> b     Reduce by B -> b     Reduce by B -> b
5       -                    -                    Reduce by A -> BA
6       Reduce by B -> aB    Reduce by B -> aB    Reduce by B -> aB


The sequence of actions taken by the parser on input aabb is shown in the following.

      Stack        Input     Comments
 1)   0            aabb$     Initial
 2)   0a3          abb$      Shift
 3)   0a3a3        bb$       Shift
 4)   0a3a3b4      b$        Shift
 5)   0a3a3B6      b$        Reduce by B -> b
 6)   0a3B6        b$        Reduce by B -> aB
 7)   0B2          b$        Reduce by B -> aB
 8)   0B2b4        $         Shift
 9)   0B2B2        $         Reduce by B -> b
10)   0B2B2A5      $         Reduce by A -> ε
11)   0B2A5        $         Reduce by A -> BA
12)   0A1          $         Reduce by A -> BA
13)   -            $         Reduce by S -> A and accept
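The trace above can be reproduced by a table-driven parser that consults one lookahead symbol before each move. The sketch below hard-codes the decision table and the DFA transitions for $G_2$. The names `ACTION`, `GOTO` and `parse` are illustrative, each grammar symbol is assumed to be a single character, and the input is assumed not to contain `$`.

```python
# DFA transitions for G2 (state numbers as in the figure).
GOTO = {
    (0, "A"): 1, (0, "B"): 2, (0, "a"): 3, (0, "b"): 4,
    (2, "A"): 5, (2, "B"): 2, (2, "a"): 3, (2, "b"): 4,
    (3, "B"): 6, (3, "a"): 3, (3, "b"): 4,
}
SHIFT = "shift"
# Decision table: state -> lookahead -> SHIFT or (head, body) to reduce by.
ACTION = {
    0: {"a": SHIFT, "b": SHIFT, "$": ("A", "")},
    1: {"$": ("S", "A")},
    2: {"a": SHIFT, "b": SHIFT, "$": ("A", "")},
    3: {"a": SHIFT, "b": SHIFT},
    4: {c: ("B", "b") for c in "ab$"},
    5: {"$": ("A", "BA")},
    6: {c: ("B", "aB") for c in "ab$"},
}

def parse(w):
    """Return (accepted, list of actions) for input string w."""
    tokens = list(w) + ["$"]
    stack, pos, log = [0], 0, []
    while True:
        act = ACTION[stack[-1]].get(tokens[pos])
        if act is None:
            return False, log               # blank table entry: reject
        if act == SHIFT:
            stack.append(GOTO[(stack[-1], tokens[pos])])
            pos += 1
            log.append("shift")
            continue
        head, body = act
        log.append(f"reduce {head} -> {body or 'ε'}")
        if head == "S":
            return True, log                # reduce by S -> A and accept
        if body:
            del stack[-len(body):]          # pop one state per body symbol
        stack.append(GOTO[(stack[-1], head)])

accepted, actions = parse("aabb")
print(accepted)
for step in actions:
    print(step)
```

Unlike the LR(0) driver, the same state can shift on one lookahead symbol and reduce on another (states 0 and 2), which is where the lookahead sets of the LR(1) items are used.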

