Automata and formal languages slides

Feb 09, 2013

© Attribution Non-Commercial (BY-NC)


Wim van Dam Room 5109, Engr. I vandam@cs.ucsb.edu http://www.cs.ucsb.edu/~vandam/

CS138, Wim van Dam, UCSB

Formalities

The new Homework 3 is due on Friday afternoon. Questions?

This Week

This week: Simplification of Context-Free Grammars, from An Introduction to Formal Languages and Automata by Peter Linz [Reader, pp. 71–85]. We will look at the important task of rewriting context-free grammars into equivalent ones in order to ease the computational problem of parsing words of the CFG. Ultimately, we will describe a parsing algorithm that works in polynomial time O(|w|³).

Section 6.1: It does not really matter whether or not the empty string λ is part of the language of a CFG G = (V,T,S,P). To add λ, define V′ = V ∪ {S_0} and add S_0 → S | λ to P to obtain P′, such that the CFG (V′,T,S_0,P′) produces L(G) ∪ {λ}. To remove λ: Exercise 13, p. 79. From now on, we assume that our languages are λ-free.

Let G = (V,T,S,P) have the rules A → x_1 B x_2 (with A ≠ B) and B → y_1 | y_2 | … | y_n. Then the following CFG G′ = (V,T,S,P′) is equivalent to G (that is, L(G′) = L(G)): P′ does not have the rule A → x_1 B x_2; instead it has A → x_1 y_1 x_2 | x_1 y_2 x_2 | … | x_1 y_n x_2. Proof: That G and G′ are equivalent should be obvious. Example: If P has A → a | aaA | abBc and B → abbA | b, then P′ has A → a | aaA | ababbAc | abbc and B → abbA | b. B in P′ has become useless.
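The substitution step can be tried out on the example above with a minimal Python sketch. The representation is an assumption made for illustration only: a grammar is a dict from single-character variables to lists of right-hand-side strings, and the function name `substitute` is hypothetical.

```python
def substitute(grammar, a, rule, b):
    """Replace the rule a -> x1 b x2 by a -> x1 y x2 for every rule b -> y.

    Assumed representation (not from the slides): `grammar` maps each
    variable to a list of right-hand sides written as plain strings.
    Only the first occurrence of `b` in `rule` is substituted.
    """
    if a == b:
        raise ValueError("the substitution rule requires A != B")
    if rule not in grammar[a]:
        raise ValueError("rule does not belong to the given variable")
    x1, found, x2 = rule.partition(b)
    if not found:
        raise ValueError("rule does not contain the variable to substitute")

    # Keep every other rule of `a`, then add one rule per alternative of `b`.
    new_rules = [r for r in grammar[a] if r != rule]
    for y in grammar[b]:
        candidate = x1 + y + x2
        if candidate not in new_rules:
            new_rules.append(candidate)

    result = dict(grammar)
    result[a] = new_rules
    return result


# The example from the slide: A -> a | aaA | abBc and B -> abbA | b.
grammar = {"A": ["a", "aaA", "abBc"], "B": ["abbA", "b"]}
new_grammar = substitute(grammar, "A", "abBc", "B")
print(new_grammar["A"])  # ['a', 'aaA', 'ababbAc', 'abbc']
```

The rules of B are left untouched, which is why B can end up useless after the substitution.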


Definition 6.1: A variable A ∈ V of a CFG G is useful if and only if there is a w ∈ L(G) such that S ⇒* xAy ⇒* w. A variable that is not useful is useless; a production rule that uses a useless variable is a useless production rule. Example 1: With S → A, A → aA | λ, and B → bA, the variable B is useless: it is impossible to get S ⇒* xBy. Example 2: With S → aSb | λ | A and A → aA, the variable A is useless: it is impossible to get A ⇒* x with x ∈ T*.

Theorem 6.2: There is an efficient algorithm that, given G = (V,T,S,P), produces an equivalent grammar G′ = (V′,T′,S,P′) that has no useless variables or production rules. Proof: 1) By backtracking, collect all variables that can terminate, and remove all those that cannot. 2) Now, starting from S, draw the dependency graph of the reduced CFG to detect all variables A such that S ⇒* xAy. Remove all other variables and their production rules. The resulting CFG G′ is equivalent to G. See the Reader for the details.
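The two steps of the proof can be sketched in Python as follows. This is only an illustration under stated assumptions: a grammar is a dict from single-character variables to lists of right-hand-side strings, the empty string "" stands for λ, every variable appears as a key, and the name `remove_useless` is hypothetical.

```python
def remove_useless(grammar, start):
    """Two-step removal of useless variables and rules (sketch).

    Assumptions: every variable is a key of `grammar`; any symbol that is
    not a key is treated as a terminal; "" encodes the empty string.
    """
    def usable(rhs, good):
        # A right-hand side is fine if each symbol is a terminal or in `good`.
        return all(s in good or s not in grammar for s in rhs)

    # Step 1: collect the variables that can derive a terminal string.
    terminating = set()
    changed = True
    while changed:
        changed = False
        for var, rules in grammar.items():
            if var not in terminating and any(usable(r, terminating) for r in rules):
                terminating.add(var)
                changed = True

    reduced = {
        var: [r for r in rules if usable(r, terminating)]
        for var, rules in grammar.items()
        if var in terminating
    }

    # Step 2: keep only the variables reachable from the start symbol.
    reachable = set()
    stack = [start]
    while stack:
        var = stack.pop()
        if var in reachable or var not in reduced:
            continue
        reachable.add(var)
        for rhs in reduced[var]:
            stack.extend(s for s in rhs if s in reduced)

    return {var: rules for var, rules in reduced.items() if var in reachable}


# Example 1 from Definition 6.1: B cannot be reached from S.
example1 = remove_useless({"S": ["A"], "A": ["aA", ""], "B": ["bA"]}, "S")
# Example 2 from Definition 6.1: A can never terminate.
example2 = remove_useless({"S": ["aSb", "", "A"], "A": ["aA"]}, "S")
print(example1)  # {'S': ['A'], 'A': ['aA', '']}
print(example2)  # {'S': ['aSb', '']}
```

The order of the two steps matters: removing non-terminating variables first can make further variables unreachable.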


Removing λ-Productions

A λ-production rule is of the form A → λ, where λ denotes the empty string. Any variable A for which we have A ⇒* λ is called nullable. Theorem 6.3: If the language of a CFG G is λ-free, then we can efficiently rewrite G to an equivalent CFG G′ without λ-production rules. Proof: By backtracking, collect all nullable variables in V_N. For every production rule A → x_1 x_2 … x_m in P, add to P′ that rule as well as all the rules obtained from it by replacing variables from V_N by λ, in all possible combinations. One exception: if all x_j are nullable, then the rule A → λ is not added. Again, see the Reader for more details.


Example 6.5

Take the CFG defined by
S → ABaC
A → BC
B → b | λ
C → D | λ
D → d
The nullable variables are: V_N = {A,B,C}.

Thus we get the equivalent, λ-production-free CFG:
S → ABaC | BaC | AaC | ABa | aC | Ba | Aa | a
A → BC | B | C
B → b
C → D
D → d
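Example 6.5 can be reproduced with a short Python sketch of the construction behind Theorem 6.3. The representation is an assumption for illustration: variables are single characters, right-hand sides are strings, "" stands for λ, and the function names are hypothetical.

```python
from itertools import product


def nullable_variables(grammar):
    """Collect all variables A with A =>* λ by iterating to a fixed point."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for var, rules in grammar.items():
            # all() over an empty right-hand side is True, so A -> λ counts.
            if var not in nullable and any(
                all(s in nullable for s in rhs) for rhs in rules
            ):
                nullable.add(var)
                changed = True
    return nullable


def remove_lambda_productions(grammar):
    """Return an equivalent grammar without λ-productions (λ-free language assumed)."""
    nullable = nullable_variables(grammar)
    result = {}
    for var, rules in grammar.items():
        new_rules = []
        for rhs in rules:
            # Every nullable symbol is either kept or replaced by λ.
            options = [(s, "") if s in nullable else (s,) for s in rhs]
            for choice in product(*options):
                candidate = "".join(choice)
                # Skip the empty right-hand side: A -> λ is never added.
                if candidate and candidate not in new_rules:
                    new_rules.append(candidate)
        result[var] = new_rules
    return result


# Example 6.5 from the slides.
grammar = {"S": ["ABaC"], "A": ["BC"], "B": ["b", ""], "C": ["D", ""], "D": ["d"]}
nullable = nullable_variables(grammar)
cleaned = remove_lambda_productions(grammar)
print(sorted(nullable))   # ['A', 'B', 'C']
print(cleaned["A"])       # ['BC', 'B', 'C']
```

Note that the result contains the unit-productions A → B | C and C → D, which illustrates why unit-productions are removed only afterwards.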


After removing useless rules and λ-productions, we also want to get rid of unit-productions of the kind A → B. Theorem 6.4: For a λ-production-free CFG, we can make an equivalent CFG without unit-productions. We do this using the earlier described substitution rule (but be careful to avoid the case A → B and B → A). Proof: By backtracking, collect all pairs of variables (A,B) with A ⇒* B. First, add to P′ all non-unit productions of P. Then, for all (A,B) with A ⇒* B and B → y_1 | y_2 | … | y_n in P′, add to P′ the productions A → y_1 | y_2 | … | y_n.

Example 6.6

Take the CFG
S → Aa | B
B → A | bb
A → a | bc | B
The unit derivations for the CFG are: S ⇒* B and S ⇒* A, B ⇒* A, A ⇒* B.

The non-unit productions S → Aa, A → a | bc, B → bb are kept and extended with:
S → bb | a | bc
B → a | bc
A → bb
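The construction in the proof of Theorem 6.4 can be checked against Example 6.6 with the following Python sketch. It assumes single-character variables, right-hand sides given as strings, and a grammar without λ-productions; the function names are hypothetical.

```python
def unit_closure(grammar, var):
    """All variables B != var with var =>* B using unit-productions only."""
    seen = set()
    stack = [var]
    while stack:
        current = stack.pop()
        for rhs in grammar[current]:
            # A right-hand side that is itself a variable is a unit-production.
            if rhs in grammar and rhs != var and rhs not in seen:
                seen.add(rhs)
                stack.append(rhs)
    return seen


def remove_unit_productions(grammar):
    """Return an equivalent grammar without unit-productions (sketch)."""
    result = {}
    for var in grammar:
        # First keep the non-unit productions of the variable itself.
        rules = [rhs for rhs in grammar[var] if rhs not in grammar]
        # Then copy the non-unit productions of every B with var =>* B.
        for other in sorted(unit_closure(grammar, var)):
            for rhs in grammar[other]:
                if rhs not in grammar and rhs not in rules:
                    rules.append(rhs)
        result[var] = rules
    return result


# Example 6.6 from the slides.
grammar = {"S": ["Aa", "B"], "B": ["A", "bb"], "A": ["a", "bc", "B"]}
cleaned = remove_unit_productions(grammar)
print(cleaned["S"])  # ['Aa', 'a', 'bc', 'bb']
```

Computing the closure with a visited set is what keeps the cycle A → B, B → A from looping forever.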


Theorem 6.5: For every context-free language L without λ, there exists a context-free grammar G that generates L and has no useless productions, λ-productions, or unit-productions. Proof: Perform the manipulations in the right order:
1) Remove λ-productions (might produce unit-productions)
2) Remove unit-productions (does not create λ-productions)
3) Remove useless productions (does not create unit- or λ-productions)
This theorem is useful for parsing algorithms.


Today

Last Monday we saw how to transform a (λ-free) CFG into an equivalent CFG that has:
1. no λ-productions (A ⇒* λ)
2. no unit-productions (A ⇒* B)
3. no useless variables or useless productions
Today we will discuss two important normal forms, the Chomsky Normal Form and the Greibach Normal Form, and the fast parsing of CFGs in CNF [Reader, pp. 80–84].


Definition 6.4: A CFG is in Chomsky normal form if and only if all production rules are of the form A → BC or A → x with variables A,B,C ∈ V and x ∈ T. (Sometimes the rule S → λ is also allowed.) CFGs in CNF can be parsed in time O(|w|³). Named after Noam Chomsky, who in the 60s made seminal contributions to the field of theoretical linguistics (cf. the Chomsky hierarchy of languages).


Theorem 6.6

Theorem 6.6: Every λ-free CFG G can be described by an equivalent CFG G′ in Chomsky normal form. The transformation from G to G′ can be done efficiently. Outline of Proof:
1. Rewrite G to eliminate unit- and λ-productions.
2. Rewrite such that all terminal-producing rules are of the form B_a → a.
3. Rewrite such that all variable-producing rules are of the form A → CD with C,D ∈ V.

Details of Proof

Step 2: How do you transform general production rules of the kind A → y_1 … y_n with y_j ∈ V ∪ T to rules that are of the kind A → y_1 … y_n with y_j ∈ V, or A → y with y ∈ T? Answer: Introduce terminal-producing variables B_y → y for each y ∈ T and replace y by B_y in all relevant rules.
Step 3: How do you transform production rules of the kind A → C_1 … C_n with C_j ∈ V to rules of the kind A → C_1 C_2? Answer: Make a chain of rules to produce C_1 … C_n: A → C_1 D_1 and D_1 → C_2 D_2 and … and D_{n-2} → C_{n-1} C_n.

Initial grammar: S → aSb | AAA and A → a | SA.
Create a,b terminal-producing variables X and Y to get:
S → XSY | AAA
A → a | SA
X → a
Y → b
Note that we do not create A → X.
Make variable chains to get:
S → XS_1 | AS_2
S_1 → SY
S_2 → AA
A → a | SA
X → a
Y → b
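Steps 2 and 3 of the proof can be sketched in Python as below. Because new variables need fresh names, this sketch assumes right-hand sides are tuples of symbol names rather than strings, and that the input has no λ-productions and no unit-productions. The naming scheme (T1, T2, … for terminal-producing variables and D1, D2, … for chain variables) and the function name `to_cnf` are hypothetical; the slide uses X, Y, S_1, S_2 instead.

```python
def to_cnf(grammar):
    """Steps 2 and 3 of the CNF construction (sketch).

    Assumptions: `grammar` maps variables to lists of tuples of symbols,
    has no λ- or unit-productions, and no symbol is already named like
    the generated T1, T2, ... or D1, D2, ... variables.
    """
    variables = set(grammar)
    result = {var: [] for var in grammar}
    terminal_var = {}  # terminal -> name of its terminal-producing variable

    def fresh(base):
        # Pick the first unused name base1, base2, ...
        i = 1
        while f"{base}{i}" in variables:
            i += 1
        name = f"{base}{i}"
        variables.add(name)
        return name

    for var, rules in grammar.items():
        for rhs in rules:
            if len(rhs) == 1:
                # A -> a is already in CNF (no unit-productions assumed).
                result[var].append(rhs)
                continue
            # Step 2: replace each terminal y by its variable B_y.
            symbols = []
            for s in rhs:
                if s not in grammar:
                    if s not in terminal_var:
                        terminal_var[s] = fresh("T")
                        result[terminal_var[s]] = [(s,)]
                    s = terminal_var[s]
                symbols.append(s)
            # Step 3: break long right-hand sides into a chain of pairs.
            head = var
            while len(symbols) > 2:
                link = fresh("D")
                result[head].append((symbols[0], link))
                result[link] = []
                head = link
                symbols = symbols[1:]
            result[head].append(tuple(symbols))
    return result


# The example grammar: S -> aSb | AAA and A -> a | SA.
grammar = {"S": [("a", "S", "b"), ("A", "A", "A")], "A": [("a",), ("S", "A")]}
cnf = to_cnf(grammar)
for var, rules in cnf.items():
    print(var, "->", " | ".join(" ".join(rhs) for rhs in rules))
```

Up to renaming (T1 for X, T2 for Y, D1 for S_1, D2 for S_2), the result is the grammar obtained on the slide.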

Definition 6.5: A CFG is in Greibach Normal Form if and only if all production rules are of the form A → ax with a ∈ T and x ∈ V*. Note: a pair (A,a) may occur in several rules (unlike in s-grammars). Theorem 6.7: For every CFG G with λ ∉ L(G) there is an equivalent CFG that is in Greibach Normal Form. Proof: Just trust me on this one. Example: We can rewrite S → ab | aS | aaS to GNF: S → aB | aS | aAS, and A → a, and B → b.


The CYK algorithm (Cocke-Younger-Kasami) decides in time O(|w|³) whether or not w ∈ L(G) with G in Chomsky NF. How it works: Let the string be w = a_1 … a_n and define V_{i,k} = { A ∈ V : A ⇒* a_i … a_k } for all 1 ≤ i ≤ k ≤ n, so that we want to know: S ∈ V_{1,n}? We solve this by first determining V_{1,1}, V_{2,2}, …, V_{n,n}, then V_{1,2}, V_{2,3}, …, V_{n-1,n}, then V_{1,3}, …, up to the final V_{1,n}. Observation 1: Because of CNF, finding the V_{i,i} is trivial. Observation 2: Also, V_{i,k} is determined by the combinations V_{i,k} = { A ∈ V : A → BC with B ∈ V_{i,j} and C ∈ V_{j+1,k} for some i ≤ j < k }. Using this dynamic programming technique we find V_{1,n}.


CYK in Action

Take the grammar
S → AB | CC
A → CC
B → BC | 0
C → 0 | 1
with w = 1101. Is w ∈ L?

V_{1,1} = {C}, V_{2,2} = {C}, V_{3,3} = {B,C}, V_{4,4} = {C}
V_{1,2} = {S,A}, V_{2,3} = {S,A}, V_{3,4} = {S,A,B}
V_{1,3} = {S}, V_{2,4} = {}
V_{1,4} = {S}

S ⇒ AB ⇒ CCB ⇒ 1CB ⇒ 11B ⇒ 11BC ⇒ 110C ⇒ 1101

Complexity of CYK

There are O(n²) variable sets V_{i,k} that we have to construct. For each set V_{i,k} there are no more than n pairs (V_{i,j}, V_{j+1,k}) that we have to consider to determine V_{i,k}. In total, the running time is upper bounded by O(n³). Note that this does not include the time required to bring the CFG into Chomsky Normal Form (which can be done efficiently though).
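The table-filling procedure and its three nested loops over substring length, start position and split point can be sketched in Python as follows. The representation is an assumption for illustration: the grammar is in CNF, variables and terminals are single characters, right-hand sides are strings, and indices are 0-based, so `table[i][k]` corresponds to V_{i+1,k+1}. The function names are hypothetical.

```python
def cyk_table(grammar, word):
    """Fill the CYK table for a CNF grammar (sketch, 0-based indices)."""
    n = len(word)
    table = [[set() for _ in range(n)] for _ in range(n)]

    # Diagonal: A -> a rules give the sets V_{i,i} directly.
    for i, symbol in enumerate(word):
        for var, rules in grammar.items():
            if symbol in rules:
                table[i][i].add(var)

    # Longer substrings, from length 2 up to n.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            k = i + length - 1
            for j in range(i, k):  # split into word[i..j] and word[j+1..k]
                for var, rules in grammar.items():
                    for rhs in rules:
                        if (
                            len(rhs) == 2
                            and rhs[0] in table[i][j]
                            and rhs[1] in table[j + 1][k]
                        ):
                            table[i][k].add(var)
    return table


def accepts(grammar, start, word):
    """Decide whether word is in L(G); the empty word is rejected (λ-free)."""
    if not word:
        return False
    return start in cyk_table(grammar, word)[0][len(word) - 1]


# The grammar of the "CYK in Action" example with w = 1101.
grammar = {"S": ["AB", "CC"], "A": ["CC"], "B": ["BC", "0"], "C": ["0", "1"]}
table = cyk_table(grammar, "1101")
print(table[0][3])                      # {'S'}
print(accepts(grammar, "S", "1101"))    # True
```

For a fixed grammar the two innermost loops take constant time, so the running time is governed by the O(n³) choices of (length, i, j).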

Formalities

The new Homework 3 is due today, 5pm. New homework will be announced this weekend. Midterm on context free grammars will probably be later than originally planned (so, after Friday March 3). Coming Monday there will be no class. Questions?


CYK in Action

Take the grammar S → AB | CC, A → CC, B → BC | 0, C → 0 | 1 with w = 1101. Is w ∈ L?

V_{1,1} = {C}, V_{2,2} = {C}, V_{3,3} = {B,C}, V_{4,4} = {C}
V_{1,2} = {S,A}, V_{2,3} = {S,A}, V_{3,4} = {S,A,B}
V_{1,3} = {S}, V_{2,4} = {}
V_{1,4} = {S}

Retracing the V_{1,4} = {S} result gives the derivation:
S ⇒ AB ⇒ CCB ⇒ 1CB ⇒ 11B ⇒ 11BC ⇒ 110C ⇒ 1101

An Exercise (1)

Write into Chomsky Normal Form the CFG:
S → aA | aBB
A → aaA | λ
B → bC | bbC
C → B
Answer (1): First you remove the λ-productions (A → λ):
S → aA | aBB | a
A → aaA | aa
B → bC | bbC
C → B

An Exercise (2)

Answer (2): Next you remove the unit-productions from:
S → aA | aBB | a
A → aaA | aa
B → bC | bbC
C → B
Removing C → B, we have to include the C ⇒* B possibility, which can be done by substitution (Thm 6.4) and gives:
S → aA | aBB | a
A → aaA | aa
B → bC | bbC
C → bC | bbC
