A right regular grammar (also called right linear grammar) is a formal grammar (N, Σ, P, S), where
N is the set of non-terminals
Σ is the input set (the terminals)
P is the set of production rules
S is the start symbol
such that all the production rules in P are of one of the following forms:
1. B → a, where B is a non-terminal in N and a is a terminal in Σ
2. B → aC, where B and C are in N and a is in Σ
3. B → ε, where B is in N and ε denotes the empty string, i.e. the string of length 0.
In a left regular grammar (also called left linear grammar), all rules obey the forms
1. A → a, where A is a non-terminal in N and a is a terminal in Σ
2. A → Ba, where A and B are in N and a is in Σ
3. A → ε, where A is in N and ε is the empty string.
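An example of a right regular grammar is G with N = {S, A}, Σ = {a, b, c}, start symbol S, and P consisting of the following rules:
1. S → aS
2. S → bA
3. A → ε
4. A → cA
This grammar describes the same language as the regular expression a*bc*.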
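The example grammar can be checked mechanically. The sketch below is only an illustration, not part of the original notes: it treats each non-terminal as a state and each rule as a transition, which is one common way to read a right regular grammar. The names RULES, DONE and derives are assumptions made for this example.

```python
EPS = ""          # stands for the empty string ε
DONE = "<done>"   # hypothetical marker: nothing is left to derive

# Rules of the example grammar G as (terminal, next non-terminal) pairs.
# A rule B → a would be written (a, None); B → ε is written (EPS, None).
RULES = {
    "S": [("a", "S"), ("b", "A")],
    "A": [(EPS, None), ("c", "A")],
}

def derives(word, start="S", rules=RULES):
    """Return True if `word` can be derived from `start`."""
    # Non-terminals that could still generate the rest of the word.
    current = {start}
    for ch in word:
        following = set()
        for nt in current:
            for terminal, target in rules.get(nt, []):
                if terminal == ch:
                    following.add(target if target is not None else DONE)
        current = following
    # The word is derived if the derivation has finished, or can finish
    # by applying a rule of the form B → ε.
    return any(nt == DONE or (EPS, None) in rules.get(nt, [])
               for nt in current)

print(derives("aabcc"))  # a string of the form a*bc*
print(derives("abca"))   # not of that form
```

Tracking a set of non-terminals rather than a single one keeps the sketch correct for grammars in which two rules for the same non-terminal start with the same terminal.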
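A pushdown automaton (PDA) is a finite state machine which has additional stack
storage. The transitions a machine makes are based not only on the input and current state,
but also on the stack. The formal definition (in our textbook) is that a PDA is this:
M = (Q, Σ, Γ, Δ, s, F)
where
Q: a finite set of states
Σ: the input alphabet
Γ: the stack alphabet
Δ: the transition relation, a finite subset of (Q × (Σ ∪ {ε}) × Γ*) × (Q × Γ*)
s ∈ Q: start state
F ⊆ Q: final states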
Here we discuss the relationship of L(DPDA) with regular language, CFL and ambiguous grammars
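We have to have the finite qualifier because the full set is infinite by virtue of
the Γ* component. The meaning of the transition relation is that, for σ ∈ Σ ∪ {ε}, if
((p, σ, β), (q, γ)) ∈ Δ
then, when the machine is in state p with β on the top of the stack, it may:
read σ from the input (reading nothing if σ = ε)
change state to q
replace β on the top of the stack by γ (pop the β and push the γ)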
Palindrome example
These are examples 3.3.1 and 3.3.2 in the textbook. The first recognizes strings of the
form w c w^R, where w is a string of a's and b's and w^R is its reverse.
The machine pushes a's and b's in state s, makes a transition to f when it sees the
middle marker, c, and then matches input symbols with those on the stack and pops
the stack symbol. Non-accepting string examples are these, each with the configuration
in which the machine ends or gets stuck:
ε         in state s
ab        in state s with non-empty stack
abcab     in state f with unconsumed input and non-empty stack
abcb      in state f with non-empty stack
abcbab    in state f with unconsumed input and empty stack
Observe that this PDA is deterministic in the sense that there are no choices in
transitions.
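The behavior described above can be traced with a small simulation. This is a sketch under assumptions, not the textbook's construction: the function name run_wcwr and the returned tuple are made up for illustration, and acceptance is taken to mean ending in state f with all input consumed and an empty stack, which matches the non-accepting cases listed.

```python
def run_wcwr(word):
    """Simulate the PDA; return (accepted, state, unread input, stack)."""
    state, stack = "s", []
    for i, ch in enumerate(word):
        if state == "s" and ch in ("a", "b"):
            stack.append(ch)            # push a's and b's in state s
        elif state == "s" and ch == "c":
            state = "f"                 # middle marker: move to state f
        elif state == "f" and stack and stack[-1] == ch:
            stack.pop()                 # input matches the top of the stack
        else:
            # No transition applies: the machine is stuck.
            return False, state, word[i:], "".join(stack)
    accepted = state == "f" and not stack
    return accepted, state, "", "".join(stack)

for w in ["", "ab", "abcab", "abcb", "abcbab", "abcba"]:
    print(repr(w), run_wcwr(w))
```

Because at most one branch applies at every step, the loop never has to backtrack, which is the determinism noted above.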
A transition that takes nothing (ε) from the top of the stack is made without
consulting the stack; it says nothing about whether the stack is empty or not.
Nevertheless, one can maintain knowledge of an empty stack by using a dedicated stack
symbol, c, representing the "stack bottom" with these properties:
it is pushed onto an empty stack by a transition from the start state with no other
outgoing or incoming transitions
Behavior of PDA
The three groups of loop transitions in state q represent these respective functions:
input a with b's on the stack: pop b; or, input b with a's on the stack: pop a
To accept with empty stack: Not every regular language is N(P), the language accepted by empty
stack, for some DPDA P. A language L is N(P) for some DPDA P if and only if L has the prefix
property and L is accepted by final state by some DPDA. The prefix property of L states that
if x, y ∈ L and x ≠ y, then x is not a prefix of y. The non-regular language L = { w c w^R } can be
accepted by a DPDA with empty stack because it has the prefix property. But the language L = {0}*
can be accepted by a DPDA with final state, but not with empty stack, because its strings do not
satisfy the prefix property (for example, 0 is a prefix of 00). So not every regular language is N(P).
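The prefix property is easy to test on a finite sample of strings. The sketch below is only an illustration, and the function name has_prefix_property is an assumption. A sample cannot prove the property for an infinite language, but a single violation in a sample is enough to show the language lacks it.

```python
def has_prefix_property(words):
    """True if no word in the finite set is a proper prefix of another."""
    words = set(words)
    return not any(x != y and y.startswith(x)
                   for x in words for y in words)

# A few strings of {0}*: the empty string is a prefix of every other string.
print(has_prefix_property(["", "0", "00", "000"]))

# A few strings of the form w c w^R: none is a proper prefix of another.
print(has_prefix_property(["c", "aca", "bcb", "abcba"]))
```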
You might wonder whether or not there is a kind of automaton that can serve as a
recognizer for context-free languages. Pushdown automata (PDAs) are a class of
automata powerful enough to serve as recognizers for context-free
languages.
Pushdown automata are similar to finite automata: they consist of states and
transitions. However, an infinitely long "tape" is added to the automaton. The tape
can serve as storage for data that the automaton needs to remember. In any state, the
automaton chooses to do one of three things:
1. Read a symbol from the input string
2. Push a symbol on the tape
3. Pop a symbol from the tape
Symbols are written to and read from the tape in last-in, first-out order. In other
words, the symbol popped from the tape by the automaton will be the one most
recently pushed. This is exactly equivalent to a stack data structure.
Let's describe a PDA that recognizes the language
a^n b^n
(All strings of the form n a's followed by n b's, for n >= 0.)
Note that a special marker symbol, which is not part of the language's alphabet, is
used by the PDA in two ways:
1. It is pushed onto the tape as the automaton's first action
2. It is appended to the input string, meaning that it will be the last symbol read
from the input string
Here is the state diagram for the PDA:
- If the first symbol read is the marker symbol, then the string is accepted (the empty
  string is a member of the language)
- For each a symbol read, an a symbol is pushed onto the tape
- When the first b symbol is encountered, the PDA attempts to pop a
  matching a symbol from the tape
- Subsequent b symbols must be matched with a symbols popped from the tape
- When the terminating marker symbol is read, a matching marker symbol must be popped, in
  which case the input string is accepted
Any time an unexpected symbol is encountered (from the input string or the tape), the
input string is rejected.
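The steps above can be sketched in code. This is a minimal illustration, not the original state diagram: the marker is written "$" here purely as an assumption, and the state names reading_a and reading_b are hypothetical.

```python
END = "$"  # assumed spelling of the special marker symbol

def accepts_anbn(word):
    """Return True if `word` is n a's followed by n b's, for n >= 0."""
    if any(ch not in ("a", "b") for ch in word):
        return False                    # symbol outside the alphabet
    stack = [END]                       # the marker is pushed first
    state = "reading_a"
    for ch in word + END:               # the marker is appended to the input
        if state == "reading_a" and ch == "a":
            stack.append("a")           # push one a per a read
        elif ch == "b":
            if stack.pop() != "a":      # each b must match a pushed a
                return False
            state = "reading_b"
        elif ch == END:
            return stack.pop() == END   # the marker must match the marker
        else:
            return False                # unexpected symbol (an a after a b)
    return False

for w in ["", "ab", "aabb", "aab", "abb", "ba", "aba"]:
    print(repr(w), accepts_anbn(w))
```

The marker sits at the bottom of the stack until the end of the input, so every pop is safe and an unmatched b is detected by popping the marker instead of an a.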
The PDA shown above is a deterministic pushdown automaton (D-PDA) because each
state has at most one transition for each combination of input symbol and symbol on
top of the tape. Note that nondeterministic pushdown automata (N-PDAs) are possible;
they may have states with multiple transitions on the
same input symbol. Interestingly, the power of D-PDAs and N-PDAs is not
equivalent; N-PDAs can recognize some languages that D-PDAs cannot. (Contrast
this with DFAs and NFAs, which have exactly the same expressive power.)
Deterministic context-free languages are the subset of context-free languages that can
be recognized by a D-PDA. N-PDAs can recognize any context-free language, and
thus are more general.
Limits to the expressive power of context-free languages
We saw that some interesting languages were not regular languages. For example,
arbitrary palindromes and other languages with balanced symbols are not regular.
Chomsky hierarchy
Grammar                     Languages                 Automaton                   Example
Type-0 (no restrictions)    Recursively enumerable    Turing machine              n!
Type-1                      Context-sensitive         Linear bounded automaton    a^n b^n c^n
Type-2                      Context-free              Pushdown automaton          a^n b^n
Type-3                      Regular                   NFA or DFA                  a* b*
Type 0:
Unrestricted rewriting systems. The languages defined by Type 0 grammars are accepted
by Turing machines; Chomskyan transformations are defined as Type 0 grammars. Type
0 grammars have rules of the form α → β, where α and β are arbitrary strings over a
vocabulary V and α ≠ ε.
Type 1:
Context-sensitive grammars. The languages defined by Type 1 grammars are accepted by
linear bounded automata; the syntax of some natural languages (including Dutch, Swiss
German and Bambara), but not all, is generally held in computational linguistics to have
structures of this type. Type 1 grammars have rules of the form αAβ → αγβ,
where A ∈ N, α, β ∈ V*, γ ∈ V+, or of the form S → ε, where S is the
initial symbol and ε is the empty string.
Type 2:
Context-free grammars. The languages defined by Type 2 grammars are accepted by
push-down automata; the syntax of natural languages is definable almost entirely in terms
of context-free languages and the tree structures generated by them. Type 2 grammars
have rules of the form A → α, where A ∈ N, α ∈ V*. There are special 'normal
forms', e.g. Chomsky Normal Form or Greibach Normal Form, into which any CFG can
be equivalently converted; they represent optimisations for particular types of processing.
Type 3:
Regular grammars. The languages defined by Type 3 grammars are accepted by finite
state automata; morphological structure and perhaps all the syntax of informal spoken
dialogue is describable by regular grammars. There are two kinds of regular grammar:
1. Right-linear (right-regular), with rules of the form A → aB or A → a; the
structural descriptions generated with these grammars are right-branching.
2. Left-linear (left-regular), with rules of the form A → Ba or A → a; the
structural descriptions generated with these grammars are left-branching.
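The rule forms of the four types can be turned into a small classifier. The sketch below is only an illustration under stated assumptions: non-terminals are single upper-case letters, terminals are single lower-case letters, each rule is a (left side, right side) pair of strings, "" stands for ε, and the rule A → ε is allowed in regular grammars as in the definition at the start of these notes. All function names are hypothetical. It classifies a grammar by the form of its rules only; it says nothing about whether the generated language could also be described by a more restricted grammar.

```python
def is_right_linear(lhs, rhs):
    """A → a, A → aB, or A → ε."""
    if not (len(lhs) == 1 and lhs.isupper()):
        return False
    if len(rhs) <= 1:
        return rhs == "" or rhs.islower()
    return len(rhs) == 2 and rhs[0].islower() and rhs[1].isupper()

def is_left_linear(lhs, rhs):
    """A → a, A → Ba, or A → ε."""
    if not (len(lhs) == 1 and lhs.isupper()):
        return False
    if len(rhs) <= 1:
        return rhs == "" or rhs.islower()
    return len(rhs) == 2 and rhs[0].isupper() and rhs[1].islower()

def is_context_free(lhs, rhs):
    """A → α with a single non-terminal on the left."""
    return len(lhs) == 1 and lhs.isupper()

def is_context_sensitive(lhs, rhs, start="S"):
    """αAβ → αγβ with γ non-empty, or S → ε."""
    if lhs == start and rhs == "":
        return True
    for i, sym in enumerate(lhs):
        if sym.isupper():
            alpha, beta = lhs[:i], lhs[i + 1:]
            if (len(rhs) > len(alpha) + len(beta)
                    and rhs.startswith(alpha) and rhs.endswith(beta)):
                return True
    return False

def chomsky_type(rules, start="S"):
    """Return the most restrictive type (3, 2, 1 or 0) the rules fit."""
    if (all(is_right_linear(l, r) for l, r in rules)
            or all(is_left_linear(l, r) for l, r in rules)):
        return 3
    if all(is_context_free(l, r) for l, r in rules):
        return 2
    if all(is_context_sensitive(l, r, start) for l, r in rules):
        return 1
    return 0

print(chomsky_type([("S", "aS"), ("S", "bA"), ("A", ""), ("A", "cA")]))
print(chomsky_type([("S", "aSb"), ("S", "")]))
```

A grammar counts as Type 3 only if all of its rules are right-linear or all are left-linear; mixing the two kinds is checked against the Type 2 form instead.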