
SAINT MARY UNIVERSITY
Formal Language Theory
Haftu Hagos

Chapter One
The Theory of Computation
Introduction
Computer science is a practical discipline. Those who have worked in it often have a marked
preference for useful and tangible problems over theoretical speculation. This is certainly true of
computer science students who are interested mainly in working on difficult applications from
the real world.

Theoretical questions are interesting to them only if they help in finding good solutions. This
attitude is appropriate, since without applications there would be little interest in computers. But
given this practical orientation, one might well ask why study theory.
The first answer is that theory provides concepts and principles that help us to understand the
general nature of the discipline. The field of computer science includes a wide range of special
topics from machine design to programming. The use of computers in the real world involves
a wealth of specific detail that must be learned for a successful application. This makes computer
science a very diverse and broad discipline. But in spite of this diversity, there are some common
underlying principles. To study these basic principles, we construct abstract models of computers
and computation. These models embody the important features that are common to both
hardware and software and that are essential to many of the special and complex constructs we
encounter while working with computers.
A second, and perhaps not so obvious, answer is that the ideas we will discuss have some
immediate and important applications. The fields of digital design, programming languages
and compiler design are the most obvious examples, but there are many others. The concepts we
study here run like a thread through computer science, from operating systems to pattern
recognition.
The third answer is one of which we hope to convince the reader: the subject matter is
intellectually stimulating and fun. It provides many challenging, puzzle-like problems that can lead to some
sleepless nights.
Therefore, in this course we will look at models that represent features at the core of all computers
and their applications. To model the hardware of a computer, we introduce the notion of an
automaton (plural, automata). An automaton is a construct that possesses all the indispensable
features of a digital computer. It accepts input, produces output, may have some temporary
storage, and can make decisions in transforming the input into the output. A formal language is an
abstraction of the general characteristics of programming languages.
A formal language consists of a set of symbols and some rules of formation by which these
symbols can be combined into entities called sentences. A formal language is the set of all
strings permitted by the rules of formation. Though some of the formal languages we study here are
simpler than programming languages, they have many of the same essential features. We can
learn a great deal about programming languages from formal languages. Finally, we will
formalize the concept of mechanical computation by giving a precise definition of the term
algorithm, and we will study the kinds of problems that are (and are not) suitable for solution by such
mechanical means.

1.1 Mathematical Preliminaries and Notation: Sets


A set is a collection of elements without any structure other than membership. To indicate that x
is an element of the set S, we write $x \in S$. The statement that x is not in S is written $x \notin S$. A
set is specified by enclosing some description of its elements in curly braces; for example, the set
of integers 0, 1, 2 is shown as
$S = \{0, 1, 2\}$.

Ellipses are used whenever the meaning is clear. Thus, $\{a, b, \ldots, z\}$ stands for all the lowercase
letters of the English alphabet, while $\{2, 4, 6, \ldots\}$ denotes the set of all positive even integers.
When the need arises, we use more explicit notation, in which we write
$S = \{i : i > 0, i \text{ is even}\}$
We read this as "S is the set of all i such that i is greater than zero and i is even", implying of course
that i is an integer.
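The usual set operations are union ($\cup$), intersection ($\cap$), difference ($-$) and
complementation, defined as
$S_1 \cup S_2 = \{x : x \in S_1 \text{ or } x \in S_2\}$

$S_1 \cap S_2 = \{x : x \in S_1 \text{ and } x \in S_2\}$

$S_1 - S_2 = \{x : x \in S_1 \text{ and } x \notin S_2\}$

$\overline{S} = \{x : x \in U, x \notin S\}$
where U is the universal set of all possible elements.
The set with no elements, called the empty set or the null set, is denoted by $\emptyset$. From the
definition of a set, it is obvious that
$S \cup \emptyset = S - \emptyset = S$
$S \cap \emptyset = \emptyset$
$\overline{\emptyset} = U$
$\overline{\overline{S}} = S$
The following identities are useful:
$\overline{S_1 \cup S_2} = \overline{S_1} \cap \overline{S_2}$ (DeMorgan's law)

$\overline{S_1 \cap S_2} = \overline{S_1} \cup \overline{S_2}$ (DeMorgan's law)

$A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$ (Distribution property)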
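
The set operations and DeMorgan's laws can be tried out directly with Python's built-in `set` type. The following is only a small sketch: the universal set `U`, the sets `S1` and `S2`, and the helper `complement` are illustrative assumptions, not part of the notes.

```python
# Illustrative universal set and two subsets (assumed values).
U = set(range(10))
S1 = {0, 1, 2, 3}
S2 = {2, 3, 4, 5}

def complement(s, universe=U):
    """Complement of s relative to the universal set."""
    return universe - s

print("union:       ", S1 | S2)
print("intersection:", S1 & S2)
print("difference:  ", S1 - S2)
print("complement:  ", complement(S1))

# DeMorgan's laws, checked for these particular sets.
assert complement(S1 | S2) == complement(S1) & complement(S2)
assert complement(S1 & S2) == complement(S1) | complement(S2)
```

A check on one pair of sets illustrates the identities but does not prove them; the proof follows from the definitions above.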

A set $S_1$ is said to be a subset of S if every element of $S_1$ is also an element of S. We write this as
$S_1 \subseteq S$

Let A and B be sets. When does A = B?
When they contain the same elements, that is, when $A \subseteq B$ and $B \subseteq A$.

If $S_1 \subseteq S$, but S contains an element not in $S_1$, we say that $S_1$ is a proper subset of S; we write this
as
$S_1 \subset S$

If $S_1$ and $S_2$ have no common elements, that is, $S_1 \cap S_2 = \emptyset$, then the sets are said to be
disjoint.
Theorem: If A and B are both finite sets, then
$n(A \cup B) = n(A) + n(B) - n(A \cap B)$
where n(A) denotes the number of elements of A.
A set is said to be finite if it contains a finite number of elements; otherwise it is said to be infinite.
The size of a finite set is the number of elements in it and is denoted by $|S|$.

A given set normally has many subsets. The set of all the subsets of a set S is called the power set
of S and is denoted by $2^S$. Observe that $2^S$ is a set of sets.
Example: if S is the set {a, b, c}, then its power set is
$2^S = \{\emptyset, \{a\}, \{b\}, \{c\}, \{a, b\}, \{a, c\}, \{b, c\}, \{a, b, c\}\}$
Here $|S| = 3$ and $|2^S| = 8$.
A set whose elements are ordered sequences of elements from other sets is said to be the Cartesian
product of those sets. For the Cartesian product of two sets, which itself is a set of ordered pairs, we write
$S_1 \times S_2 = \{(x, y) : x \in S_1, y \in S_2\}$
Example: let $S_1 = \{2, 4\}$ and $S_2 = \{2, 3, 5, 6\}$. Then
$S_1 \times S_2 = \{(2,2), (2,3), (2,5), (2,6), (4,2), (4,3), (4,5), (4,6)\}$
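
The power set and Cartesian product examples above can be reproduced with the standard `itertools` module. This is a minimal sketch; the function name `powerset` is a hypothetical helper introduced here.

```python
from itertools import chain, combinations, product

def powerset(s):
    """Return all subsets of s as a list of sets."""
    items = sorted(s)
    all_combos = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [set(c) for c in all_combos]

S = {"a", "b", "c"}
subsets = powerset(S)
print(len(subsets))          # number of subsets, 2 ** len(S)

# Cartesian product of the two sets from the example.
S1 = {2, 4}
S2 = {2, 3, 5, 6}
pairs = set(product(S1, S2))
print(sorted(pairs))
```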

1.2 Relations and Functions


A function is a rule that assigns to elements of one set a unique element of another
set. If f denotes a function, then the first set is called the domain of f and the
second set is its range. We write

$f : S_1 \to S_2$

to indicate that the domain of f is a subset of $S_1$ and that the range of f is a subset of $S_2$. If the
domain of f is all of $S_1$, we say that f is a total function on $S_1$; otherwise, f is said to be a
partial function.
Relations are more general than functions: in a function each element of the domain has exactly
one associated element in the range; in a relation there may be several such elements in the range.

1.3 Graphs and Trees


A graph is a construct consisting of two finite sets, the set $V = \{v_1, v_2, \ldots, v_n\}$ of vertices and the set
$E = \{e_1, e_2, \ldots, e_m\}$ of edges. Each edge is a pair of vertices from V. For instance,
$e_i = (v_j, v_k)$
is an edge from $v_j$ to $v_k$.

Figure 1.1
Graphs are conveniently visualized by diagrams in which the vertices are represented as circles
and the edges as lines with arrows connecting the vertices as shown above.
The graph with vertices {v1, v2, v3} and edges {(v1, v3), (v2, v3), (v3, v1), (v3, v3)} is depicted
in figure 1.1.
Trees are particular types of graphs. A tree is a directed graph (digraph) that has no cycles and that has one distinct
vertex, called the root, such that there is exactly one path from the root to every other vertex.
This definition implies that the root has no incoming edges and that there are some vertices
without outgoing edges. These are called the leaves of the tree.

The level of a vertex is the number of edges on the path from the root to that vertex.
The height of the tree is the largest level number of any vertex.

Figure 1.2

1.4 Proof Techniques


Testing programs is certainly essential. However, testing goes only so far, since we cannot try
our program on every input. More importantly, if the program is complex, say a tricky recursion
or iteration, we are unlikely to write the code correctly unless we understand what happens on each
pass through the loop or each recursive call. When testing tells us that the code is incorrect, we
still need to get it right. To make our iteration or recursion correct, we need to set up an inductive hypothesis.

1.4.1 Deductive proof


As you know from your previous studies, a deductive proof consists of a sequence of
statements whose truth leads us from some initial statement, called the hypothesis or the given
statement(s), to a conclusion statement. Each step in the proof must follow, by some accepted
logical principle, from the given facts, from some of the previous statements in the deductive
proof, or from a combination of these.
The theorem that is proved when we go from a hypothesis H to a conclusion C is the statement
"if H then C". We say that C is deduced from H. An example theorem of the form "if H then C" will
illustrate these points.

Theorem 1.3: If $x \geq 4$, then $2^x \geq x^2$.

First notice that the hypothesis H is "$x \geq 4$". This hypothesis has a parameter x, and thus is neither
true nor false. Rather, its truth depends on the value of the parameter x; H is true for x = 6 and
false for x = 2.
Likewise, the conclusion C is "$2^x \geq x^2$". This statement also uses the parameter x and is true for
certain values of x and not others. For example, C is false for x = 3, since $2^3 = 8$, which is not as
large as $3^2 = 9$. On the other hand, C is true for x = 4, since $2^4 = 4^2 = 16$. For x = 5, the
statement is also true.

Perhaps you can see the intuitive argument that tells us that the conclusion $2^x \geq x^2$ will be true
whenever $x \geq 4$.
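
The values discussed for Theorem 1.3 (read here as "if x ≥ 4 then 2^x ≥ x^2") can be tabulated with a few lines of Python. This is only a numerical sanity check over a finite range, with an assumed helper name `conclusion_holds`; it is not a substitute for the proof.

```python
def conclusion_holds(x):
    """Conclusion C of Theorem 1.3: 2^x >= x^2."""
    return 2 ** x >= x ** 2

# Tabulate hypothesis H (x >= 4) and conclusion C for the values in the text.
for x in (2, 3, 4, 5, 6):
    print(f"x={x}  H={x >= 4}  C={conclusion_holds(x)}")

# Whenever H holds in this finite range, C holds as well.
assert all(conclusion_holds(x) for x in range(4, 200))
```

Note that C can be true even when H is false (for example at x = 2); the theorem only claims that C holds whenever H does.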
Theorem 1.4: If x is the sum of the squares of four positive integers, then $2^x \geq x^2$.

Proof:
Step 1: We restate one of the given statements of the theorem: x is the sum of the
squares of four positive integers. It often helps in proofs to name quantities that are referred to but not
named, and we do so here, giving the four integers the names a, b, c and d, so that $x = a^2 + b^2 + c^2 + d^2$.
Since each of a, b, c and d is at least 1, each square is at least 1, so $x \geq 4$, and the conclusion
follows from Theorem 1.3.

1.4.2 Contradiction proof


Theorem 1.5: Let S be a finite subset of some infinite set U. Let T be the complement of S with
respect to U. Then T is infinite.

Proof: Intuitively, this theorem says that if you have an infinite supply of something (U), and you
take a finite amount away (S), then you still have an infinite amount left. Let us begin by
restating the facts of the theorem: "S is finite" means $|S| = n$ for some integer n; "U is infinite"
means that there is no integer p with $|U| = p$; and "T is the complement of S" means
$S \cup T = U$ and $S \cap T = \emptyset$.

However, we are still stuck, so let us try to prove the theorem by contradiction.
The negation of the conclusion is "T is finite". Let us assume T is finite, along with the statement
of the hypothesis that says S is finite, i.e., $|S| = n$ for some integer n. Similarly, we can restate
the assumption that T is finite as $|T| = m$ for some integer m.
Now one of the given statements tells us that $S \cup T = U$ and $S \cap T = \emptyset$; that is, the elements
of U are exactly the elements of S and T. Thus, there must be n + m elements of U. Since n + m is
an integer and we have shown $|U| = n + m$, it follows that U is finite. More precisely, we
showed that the number of elements in U is some integer, which is the definition of "finite". But
the statement that U is finite contradicts the given statement that U is infinite. We have thus
used the negation of our conclusion to prove the negation of one of the given statements of
the hypothesis, and by the principle of proof by contradiction we may conclude that the theorem is true.

1.5 Formal Languages


A language can be seen as a system suitable for expression of certain ideas, facts and concepts.
For formalizing the notion of a language one must cover all the varieties of languages such as
natural (human) languages and programming languages. Let us look at some common features
across the languages.
One may broadly see that a language is a collection of sentences; a sentence is a sequence of
words; and a word is a combination of syllables. If one considers a language that has a script,
then it can be observed that a word is a sequence of symbols of its underlying alphabet. It is
observed that a formal learning of a language has the following three steps.

1) Learning its alphabet - the symbols that are used in the language.
2) Its words - as various sequences of symbols of its alphabet.
3) Formation of sentences - sequence of various words that follow
certain rules of the language.
In this learning, step 3 is the most difficult part. Let us postpone discussing the construction of
sentences and concentrate on steps 1 and 2. For the time being, instead of ignoring sentences
completely, one may look at the common features of a word and a sentence and agree that
both are just sequences of symbols of the underlying alphabet. For example, the English
sentence
"The English articles - a, an and the are categorized into two types:
indefinite and definite."
may be treated as a sequence of symbols from the Roman alphabet along with enough
punctuation marks such as comma, full-stop, colon and further one more special symbol, namely
blank-space which is used to separate two words. Thus, abstractly, a sentence or a word may be
interchangeably used for a sequence of symbols from an alphabet. With this discussion we start
with the basic definitions of alphabets and strings and then we introduce the notion of language
formally.
Further, in this chapter, we introduce some of the operations on languages and discuss algebraic
properties of languages with respect to those operations. We end the chapter with an introduction
to finite representation of languages via regular expressions.

1.5.1 Alphabets
Definition: An alphabet is a finite set of objects called symbols.
Notation: $\Sigma = \{a, b, \ldots, z\}$

1.5.2 String
Definition: A string over an alphabet $\Sigma$ is a finite sequence of symbols from $\Sigma$.
Notation: w, x, y for strings. Instead of $w = (w_1, w_2, \ldots, w_k)$ we will simply write
$w = w_1 w_2 \ldots w_k$.
Note: strings over the binary alphabet {0, 1} are often called binary strings.
Definition: The length of a string is the number of symbols contained in the
string.
Notation: $|w|$.

1.5.3 Language
Definition: A language over an alphabet $\Sigma$ is a set of strings over $\Sigma$.
Notation: L, M, N, ... for languages. $|L|$ for the size (number of strings) of L.
Notation: $\Sigma^*$ will denote the set of all strings over $\Sigma$. Then, a language L
over $\Sigma$ is just a subset of $\Sigma^*$.
Notation: $\Sigma^k$ will denote the set of all strings of length k over $\Sigma$.
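
To make the notation concrete, the set of all strings of a given length k over an alphabet can be enumerated mechanically, and a language is then any subset of such strings. The sketch below represents strings as Python `str` values; the helper name `strings_of_length` and the sample language are assumptions made for illustration.

```python
from itertools import product

def strings_of_length(sigma, k):
    """All strings of length k over the alphabet sigma."""
    return {"".join(symbols) for symbols in product(sorted(sigma), repeat=k)}

sigma = {"0", "1"}                      # the binary alphabet
print(sorted(strings_of_length(sigma, 2)))

# A language is a set of strings; here, the binary strings of length 3 ending in 1.
language = {w for w in strings_of_length(sigma, 3) if w.endswith("1")}
print(sorted(language), len(language))
```

For k = 0 the function returns the set containing only the empty string, which is the single string of length zero.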

Review Questions
1) Suppose Walter's online music store conducts a customer survey to determine the preferences
of its customers. Customers are asked what type of music they like. They may choose from
the following categories: Pop (P), Jazz (J), Classical (C), and none of the above (N). Of 100
customers some of the results are as follows:
44 like Classical
27 like all three
15 like only Pop
10 like Jazz and Classical, but not Pop
How many like Classical but not Jazz? We can fill in the Venn diagram below to keep track of
the numbers.
There are n(C) = 44 in total that like Classical, and $n(C \cap J) = 27 + 10 = 37$ that like both Jazz and
Classical, so $44 - 37 = 7$ like Classical but not Jazz.


2) Let U = {1, 2, 3, 4, 5, ..., 10}, A = {2, 4, 6, 8, 10}, B = {3, 6, 9} and C = {1, 2, 3, 8, 9, 10}.
Perform the indicated operations.
a) A B
b) A B
c) A C
d) (A C)
e) (A B) C
f) (A B) A

3) Determine if the following statements are true or false. Here A represents any set.
a) A
b) A A
c) (A) = A
4) Let U = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10} and A = {1, 3, 5, 7, 9} and B = {1, 4, 5, 9}.
a) Find A B
b) Find A B
c) Use a Venn diagram to represent these sets.
5) One hundred students were surveyed and asked if they are currently taking math (M),
English (E) and/or History (H) The survey findings are summarized here:
Survey Results
n(M) = 45, $n(M \cap E) = 15$
n(E) = 41, $n(M \cap H) = 18$
n(H) = 40, $n(M \cap E \cap H) = 7$
$n[(M \cap E) \cup (M \cap H) \cup (E \cap H)] = 36$

a) Use a Venn diagram to represent this data.


b) How many students are only taking math?
6) Ninety people at a Superbowl party were surveyed to see what they ate while watching the
game. The following data was collected:
48 had nachos.
39 had wings.
35 had potato skins.
20 had both wings and potato skins.
19 had both potato skins and nachos.
22 had both wings and nachos.
10 had nachos, wings and potato skins.
a) Use a Venn diagram to represent this data.
b) How many had nothing?
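7) If L1 = {0, 1, 01} and L2 = {1, 00}, then what is the concatenation L1L2?
8) If L1 = {b, ba, bab} and L2 = {$\lambda$, b, bb, abb}, where $\lambda$ denotes the empty string, then what is L1L2?
9) In the context of formal language theory, another important notation is the Kleene star or Kleene
closure. The Kleene closure of a language L, denoted by L*, is defined as
$L^* = \bigcup_{i \geq 0} L^i = L^0 \cup L^1 \cup L^2 \cup \cdots$, where $L^0 = \{\lambda\}$ and $L^{i+1} = L^i L$.
Example: the Kleene closure of the language {01} is
$\{\lambda, 01, 0101, 010101, \ldots\}$
a) If L = {0, 10}, then what is L*?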
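
Concatenation and the Kleene closure can be explored with a short Python sketch. Since L* is infinite whenever L contains a nonempty string, the code only builds the finite part L^0 ∪ L^1 ∪ ... ∪ L^n for a chosen bound n. The function names `concat` and `kleene_star_up_to` are hypothetical helpers, and the empty string is represented by `""`.

```python
def concat(l1, l2):
    """Concatenation of two languages: every x from l1 followed by every y from l2."""
    return {x + y for x in l1 for y in l2}

def kleene_star_up_to(language, max_power):
    """Union of L^0, L^1, ..., L^max_power (a finite slice of L*)."""
    result = {""}          # L^0 contains only the empty string
    power = {""}
    for _ in range(max_power):
        power = concat(power, language)   # L^(i+1) = L^i L
        result |= power
    return result

# The example from the text: the first few strings of {01}*.
print(sorted(kleene_star_up_to({"01"}, 3), key=len))
```

The same two functions can be used to check answers to the review questions above by hand-picking a small bound.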

********************THE END**********************

Chapter-2
Finite Automata
Introduction
We will be making use of mathematical models of physical systems called finite automata, or
finite state machines to recognize whether or not a string is in a particular language. This section
introduces this idea and gives the precise definition of what constitutes a finite automaton. We
look at several variations on the definition (to do with the concept of determinism) and see that
they are equivalent for the purpose of recognizing whether or not a string is in a given language.
Finite automata are a useful model for many important kinds of hardware and software. Let us
just list some of the most important kinds.
1) Software for designing and checking the behavior of digital circuits.
2) The lexical analyzer of a typical compiler, that is, the compiler component that breaks
the input into logical units such as identifiers, keywords and punctuation.
3) Software for scanning large bodies of text, such as collections of web pages, to find
occurrences of words, phrases or other patterns.
4) Software for verifying systems of all types that have a finite number of distinct states, such as
communication protocols or protocols for secure exchange of information.
Finite automata can be classified into two types:

Deterministic Finite Automaton (DFA)
Non-deterministic Finite Automaton (NDFA / NFA)

In a DFA, for each input symbol, one can determine the state to which the machine
will move. Hence, it is called a Deterministic Automaton. As it has a finite number
of states, the machine is called a Deterministic Finite Machine or Deterministic
Finite Automaton.
In an NDFA, for a particular input symbol, the machine can move to any combination of
the states in the machine. In other words, the exact state to which the machine
moves cannot be determined. Hence, it is called a Non-deterministic Automaton.
As it has a finite number of states, the machine is called a Non-deterministic Finite
Machine or Non-deterministic Finite Automaton.

2.1 Deterministic Finite Accepters

The first type of automaton we study in detail is the finite accepter that is deterministic
in its operation. We start with a precise formal definition of deterministic accepters.
2.1.1 Deterministic Accepters and Their Transition Graphs
A deterministic finite accepter or dfa is defined by the quintuple
$M = (Q, \Sigma, \delta, q_0, F)$,
where
Q is a finite set of internal states,
$\Sigma$ is a finite set of symbols called the input alphabet,
$\delta : Q \times \Sigma \to Q$ is a total function called the transition function,
$q_0 \in Q$ is the initial state,
$F \subseteq Q$ is a set of final states.

A deterministic finite accepter operates in the following manner. At the initial time, it is assumed
to be in the initial state $q_0$, with its input mechanism on the leftmost symbol of the input string.
During each move of the automaton, the input mechanism advances one position to the right, so
each move consumes one input symbol. When the end of the string is reached, the string is
accepted if the automaton is in one of its final states. Otherwise the string is rejected.
The input mechanism can move only from left to right and reads exactly one symbol on each step.
The transitions from one internal state to another are governed by the transition function $\delta$.
For example, if
$\delta(q_0, a) = q_1$
then if the dfa is in state $q_0$ and the current input symbol is a, the dfa will go into state $q_1$.
To visualize and represent finite automata, we use transition graphs, in which the vertices
represent states and the edges represent transitions. The labels on the vertices are the names of the
states, while the labels on the edges are the current values of the input symbol.
For example, if $q_0$ and $q_1$ are internal states of some dfa M, then the graph associated with M
will have one vertex labeled $q_0$ and another labeled $q_1$. An edge $(q_0, q_1)$ labeled a represents the
transition $\delta(q_0, a) = q_1$. The initial state is identified by an incoming unlabeled arrow
not originating at any vertex. Final states are drawn with a double circle.
For every transition rule $\delta(q_i, a) = q_j$, the graph has an edge $(q_i, q_j)$ labeled a.

The transition graph above represents the dfa

$M = (\{q_0, q_1, q_2\}, \{0, 1\}, \delta, q_0, \{q_1\})$

where $\delta$ is given by
$\delta(q_0, 0) = q_0$, $\delta(q_0, 1) = q_1$,
$\delta(q_1, 0) = q_0$, $\delta(q_1, 1) = q_2$,
$\delta(q_2, 0) = q_2$, $\delta(q_2, 1) = q_1$.

The dfa accepts the string 01. Starting in state $q_0$, the symbol 0 is read first. Looking at the edges of
the graph, we see that the automaton remains in state $q_0$. Next 1 is read and the automaton
goes into state $q_1$. We are now at the end of the string and, at the same time, in the final state $q_1$.
Therefore, the string 01 is accepted. The dfa does not accept the string 00, since after reading two
consecutive 0's it will be in state $q_0$. By similar reasoning, we see that the automaton will accept the
strings 101, 0111 and 11001, but not 1100 or 100.
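
The operation of a dfa described above translates almost line for line into code. The following Python sketch stores the transition function as a dictionary and replays the strings discussed in the text. The transition table used here is an assumption: it is one table that is consistent with every accepted and rejected string mentioned above, and the names `DELTA`, `INITIAL`, `FINAL` and `accepts` are illustrative.

```python
# Transition function delta as a dictionary: (state, symbol) -> next state.
# Assumed table, chosen to agree with the strings discussed in the text.
DELTA = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q0", ("q1", "1"): "q2",
    ("q2", "0"): "q2", ("q2", "1"): "q1",
}
INITIAL = "q0"
FINAL = {"q1"}

def accepts(word):
    """Run the dfa on word; accept if it stops in a final state."""
    state = INITIAL
    for symbol in word:              # each move consumes one input symbol
        state = DELTA[(state, symbol)]
    return state in FINAL

for w in ["01", "101", "0111", "11001", "00", "100", "1100"]:
    print(w, "accepted" if accepts(w) else "rejected")
```

Because the transition function is total, the dictionary lookup never fails for strings over {0, 1}.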

2.1.2 Extended Transition Function for a DFA

It is convenient to introduce the extended transition function $\delta^* : Q \times \Sigma^* \to Q$. The second
argument of $\delta^*$ is a string, rather than a single symbol, and its value gives the state the
automaton will be in after reading that string. For example, if
$\delta(q_0, a) = q_1$
and
$\delta(q_1, b) = q_2$
then
$\delta^*(q_0, ab) = q_2$
Formally, we can define $\delta^*$ recursively by

$\delta^*(q, \lambda) = q$    (2.1)

$\delta^*(q, wa) = \delta(\delta^*(q, w), a)$    (2.2)

for all $q \in Q$, $w \in \Sigma^*$, $a \in \Sigma$. To see why this is appropriate, let us apply
these definitions to the simple case above. First, we use (2.2) to get

$\delta^*(q_0, ab) = \delta(\delta^*(q_0, a), b)$    (2.3)

But

$\delta^*(q_0, a) = \delta(\delta^*(q_0, \lambda), a) = \delta(q_0, a) = q_1$

Substituting in (2.3), we get

$\delta^*(q_0, ab) = \delta(q_1, b) = q_2$
as expected.
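
The recursive definition of the extended transition function can be written down directly as a recursive function. This is a sketch under stated assumptions: the empty string stands for λ, the name `delta_star` is hypothetical, and the dictionary contains only the two transitions named in the example (a complete dfa would define a transition for every state and symbol).

```python
def delta_star(delta, q, w):
    """Extended transition function: the state reached from q after reading w."""
    if w == "":                        # base case: reading the empty string changes nothing
        return q
    prefix, last = w[:-1], w[-1]       # split w into a prefix and its last symbol
    return delta[(delta_star(delta, q, prefix), last)]

# Only the two transitions used in the worked example above.
delta = {("q0", "a"): "q1", ("q1", "b"): "q2"}

print(delta_star(delta, "q0", "a"))
print(delta_star(delta, "q0", "ab"))
```

The recursion peels off the last symbol first, exactly as the second defining equation does, so the call for `"ab"` first evaluates the call for `"a"`.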

2.1.3 Languages Accepted by DFAs


Having made a precise definition of an accepter, we are now ready to define formally what we
mean by an associated language. The association is obvious: the language is the set of all strings
accepted by the automaton.
Definition: The language accepted by a dfa $M = (Q, \Sigma, \delta, q_0, F)$ is the set of all strings on $\Sigma$
accepted by M. In formal notation,

$L(M) = \{w \in \Sigma^* : \delta^*(q_0, w) \in F\}$

Note that we require that $\delta$, and consequently $\delta^*$, be total functions. At each step a unique
move is defined, so that we are justified in calling such an automaton deterministic. A dfa will
process every string in $\Sigma^*$ and either accept it or reject it. Nonacceptance means that the
dfa stops in a nonfinal state, so that

$\overline{L(M)} = \{w \in \Sigma^* : \delta^*(q_0, w) \notin F\}$

Linz Example 2.2

The automaton in Linz Figure 2.2 accepts all strings consisting of an arbitrary number of a's
followed by a single b.
In set notation, the language accepted by the automaton is $L = \{a^n b : n \geq 0\}$.
Note that q2 has two self-loop edges, each with a different label. We write this compactly with
multiple labels.
A trap state is a state from which the automaton can never escape.

Note that q2 is a trap state in the dfa transition graph shown in Linz Figure 2.2.
Transition graphs are quite convenient for understanding finite automata.

Linz Fig. 2.2: DFA Transition Graph with Trap State

For other purposes, such as representing finite automata in programs, a tabular representation of the
transition function may also be convenient (as shown in Linz Fig. 2.3).

Linz Fig. 2.3: DFA Transition Table


Linz Example 2.3

Find a deterministic finite accepter that recognizes the set of all strings on $\Sigma = \{a, b\}$ starting with
the prefix ab. Linz Figure 2.4 shows a transition graph for a dfa for this example.
The dfa must accept ab and then continue until the string ends.
This dfa has a final trap state $q_2$ (accepts) and a non-final trap state $q_3$ (rejects).

Linz Fig. 2.4


Linz Example 2.4

Find a dfa that accepts all strings on {0, 1}, except those containing the substring 001.
The dfa needs to remember whether the last two inputs were 00.
It uses state changes as memory.

Linz Fig. 2.5

Linz Figure 2.5 shows a dfa for this example.

Accepts: $\lambda$, 0, 00, 01, 010000
Rejects: 001, 000001, 0010101010101
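
The idea of using states as memory can be sketched in Python. The state names below are hypothetical and are chosen to record how much of the forbidden substring 001 has just been read; they are not the labels used in Linz Figure 2.5, and the table is one possible construction rather than a transcription of that figure.

```python
def run_dfa(delta, start, finals, word):
    """Return True if the dfa ends in a final state after reading word."""
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state in finals

# Each state remembers the relevant suffix of the input read so far.
NO_001 = {
    ("none", "0"): "0",    ("none", "1"): "none",
    ("0", "0"): "00",      ("0", "1"): "none",
    ("00", "0"): "00",     ("00", "1"): "trap",    # substring 001 completed
    ("trap", "0"): "trap", ("trap", "1"): "trap",  # non-final trap state
}
FINALS = {"none", "0", "00"}

def accepts(word):
    return run_dfa(NO_001, "none", FINALS, word)

for w in ["", "0", "00", "01", "010000", "001", "000001", "0010101010101"]:
    print(repr(w), "accepted" if accepts(w) else "rejected")
```

Every state except the trap state is final, which reflects that a string is rejected only once 001 has actually appeared.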

2.1.4 Regular Languages


Linz Definition 2.3 (Regular Language): A language L is called regular if and only if there
exists a dfa M such that L = L(M).
Thus dfas define the family of languages called regular.
Linz Example 2.5
Show that the language $L = \{awa : w \in \{a, b\}^*\}$ is regular.

Construct a dfa.
Check whether the string begins and ends with a.
The dfa enters its final state when an a following the initial a is read.

Linz Figure 2.6 shows a dfa for this example


**Question: How would we prove that a language is not regular?
We will come back to this question in chapter 4.

Linz Fig. 2.6

Linz Example 2.6


Let L be the language in the previous example (Linz Example 2.5).
Show that $L^2$ is regular.
$L^2 = \{a w_1 a a w_2 a : w_1, w_2 \in \{a, b\}^*\}$.

Construct a dfa.
Use the Example 2.5 dfa as a starting point.
Accept two consecutive strings of the form awa.
Note that any two consecutive a's could start a second string.

Linz Figure 2.7 shows a dfa for this example.

The last example suggests the conjecture that if a language L is regular, then so are $L^2$, $L^3$, etc. We will come
back to this issue in chapter 4.

2.2 Nondeterministic Finite Accepters


2.2.1 Nondeterministic Accepters
Linz Definition 2.4 (NFA): A nondeterministic finite accepter or nfa is defined by the tuple
$M = (Q, \Sigma, \delta, q_0, F)$ where Q, $\Sigma$, $q_0$, and F are defined as for deterministic finite
accepters, but

$\delta : Q \times (\Sigma \cup \{\lambda\}) \to 2^Q$

Remember for dfas:

Q is a finite set of internal states.
$\Sigma$ is a finite set of symbols called the input alphabet.
$q_0 \in Q$ is the initial state.
$F \subseteq Q$ is a set of final states.

The key differences between dfas and nfas are
1. dfa: yields a single state
nfa: yields a set of states
2. dfa: consumes input on each move
nfa: can move without input ($\lambda$)
3. dfa: moves for all inputs in all states
nfa: some situations have no defined moves
An nfa accepts a string if some possible sequence of moves ends in a final state.
An nfa rejects a string if no possible sequence of moves ends in a final state.
Linz Example 2.7

Consider the transition graph shown in Linz Figure 2.8. Note the nondeterminism in state q0
with two possible transitions for input a. Also state q3 has no transition for any input.


Linz Fig. 2.8


Linz Example 2.8

Consider the transition graph for an nfa shown in Linz Figure 2.9.
Note the nondeterminism and the $\lambda$-transition.
Note: Here $\lambda$ means the move takes place without consuming any input symbol.
This is different from accepting an empty string.

Linz Fig. 2.9

Transitions: what is $\delta$ for $(q_0, 0)$? For $(q_1, 0)$? For $(q_2, 0)$? For $(q_2, 1)$?

Accepts: $\lambda$, 10, 1010, 101010
Rejects: 0, 11
What about 110, 10100?

2.2.2 Extended Transition Function for an NFA


As with dfas, the transition function can be extended so that its second argument is a string.

Requirement: $\delta^*(q_i, w) = Q_j$, where $Q_j$ is the set of all possible states the automaton may be
in, having started in state $q_i$ and read string w.

Linz Definition 2.5 (Extended Transition Function): For an nfa, the extended transition
function is defined so that $\delta^*(q_i, w)$ contains $q_j$ if and only if there is a walk in the transition graph from
$q_i$ to $q_j$ labeled w. This holds for all $q_i, q_j \in Q$ and $w \in \Sigma^*$.

Linz Example 2.9


Linz Figure 2.10 represents an nfa. It has several $\lambda$-transitions and some undefined
transitions such as $\delta(q_2, a)$.

Linz Fig. 2.10

Suppose we want to find $\delta^*(q_1, a)$ and $\delta^*(q_2, \lambda)$. There is a walk labeled a involving two $\lambda$-transitions
from $q_1$ to itself. By using some of the $\lambda$-edges twice, we see that there are also walks involving $\lambda$-transitions to $q_0$
and $q_2$.
Thus

$\delta^*(q_1, a) = \{q_0, q_1, q_2\}$

Since there is a $\lambda$-edge between $q_2$ and $q_0$, we have immediately that $\delta^*(q_2, \lambda)$ contains $q_0$. Also, since any state can be
reached from itself by making no move, and consequently using no input symbol, $\delta^*(q_2, \lambda)$ also contains $q_2$.

Therefore,

$\delta^*(q_2, \lambda) = \{q_0, q_2\}$
Using as many $\lambda$-transitions as needed, you can also check that

$\delta^*(q_2, aa) = \{q_0, q_1, q_2\}$
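
The sets computed in this example can be reproduced mechanically by following λ-edges before and after each input symbol. The Python sketch below assumes the edges q0 -a-> q1, q1 -λ-> q2 and q2 -λ-> q0, which is a reading of the figure that is consistent with the walks described in the text; the names `DELTA`, `lambda_closure` and `delta_star` are illustrative, and the empty string is used as the label of a λ-transition.

```python
LAMBDA = ""   # label used here for lambda-transitions

# Assumed edges: q0 -a-> q1, q1 -lambda-> q2, q2 -lambda-> q0.
# Missing keys model undefined transitions such as delta(q2, a).
DELTA = {
    ("q0", "a"): {"q1"},
    ("q1", LAMBDA): {"q2"},
    ("q2", LAMBDA): {"q0"},
}

def lambda_closure(states):
    """All states reachable from the given states using only lambda-edges."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in DELTA.get((q, LAMBDA), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

def delta_star(q, word):
    """Set of states reachable from q by some walk labeled word."""
    current = lambda_closure({q})
    for symbol in word:
        moved = set()
        for p in current:
            moved |= DELTA.get((p, symbol), set())
        current = lambda_closure(moved)
    return current

print(sorted(delta_star("q1", "a")))
print(sorted(delta_star("q2", "")))
print(sorted(delta_star("q2", "aa")))
```

An nfa accepts a string exactly when the set returned for the initial state contains at least one final state.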

2.2.3 Language Accepted by an NFA


Linz Definition 2.6 (Language Accepted by NFA): The language L accepted by the nfa
$M = (Q, \Sigma, \delta, q_0, F)$ is defined as

$L(M) = \{w \in \Sigma^* : \delta^*(q_0, w) \cap F \neq \emptyset\}$.

That is, L(M) is the set of all strings w for which there is a walk labeled w from the initial vertex
of the transition graph to some final vertex.


Linz Example 2.10 (Example 2.8 Revisited)

Let us again examine the automaton given in Linz Figure 2.9 (Example 2.8).
For this nfa, call it M:
an accepted string must end in $q_0$
$L(M) = \{(10)^n : n \geq 0\}$

Note that reading 110 leaves the nfa in a dead configuration at $q_2$, so that $\delta^*(q_0, 110) = \emptyset$.

Linz Fig. 2.9 (Repeated)


2.2.4 Why Nondeterminism?

Why study nondeterminism when computers are deterministic?

an nfa can model a search or backtracking algorithm
nfa solutions may be simpler than dfa solutions (and an nfa can be converted to a dfa)
nondeterminism may model externally influenced interactions (or abstract more detailed
computations)

Chapter-3
Grammars
Introduction
In this chapter we introduce the notion of a grammar called the context-free grammar (CFG) as a
language generator. The notion of derivation is instrumental in understanding how the strings
are generated in a grammar.
In the context of natural languages, the grammar of a language is the set of rules which are
used to construct or validate sentences of the language. We look into the general features of
grammars (of natural languages) to formalize the notion in the present context, which facilitates
a better understanding of formal languages.
Consider the English sentence

The students study automata theory.

In order to observe that the sentence is grammatically correct, one may attribute certain rules of
English grammar to the sentence and validate it. For instance, the article "the" followed by
the noun "students" forms a noun phrase, and similarly the noun "automata theory" forms a noun phrase.
Further, "study" is a verb. Now choose the sentential form "subject verb object" of English
grammar. Based upon this grammatical rule, the above sentence may be concluded
to be grammatically correct. This verification/derivation is depicted in Figure 3.1. The derivation
can also be represented by a tree structure, as in Figure 3.2.

Figure 3.1: derivation of an English sentence.

Figure 3.2: Derivation tree of an English sentence.

3.1 Context-Free Grammar


A context-free grammar is a model considered by the Chomsky school of formal linguistics.
The idea is that sentences are recursively generated from internal mental symbols through a
series of production rules.
We now understand that a grammar should have the following components.
A set of nonterminal symbols
A set of terminal symbols
A set of rules
As a grammar is used to construct or validate sentences of a language, we distinguish one
symbol in the set of nonterminals to represent a sentence, from which the various sentences
of the language can be generated or validated.
Definition 3.1.1: A grammar is a quadruple
$G = (N, \Sigma, P, S)$
where
1. N is a finite set of nonterminals,
2. $\Sigma$ is a finite set of terminals,
3. $S \in N$ is the start symbol, and
4. P is a finite subset of $N \times V^*$, called the set of production rules. Here $V = N \cup \Sigma$.

It is convenient to write $A \to \alpha$ for the production rule $(A, \alpha) \in P$.

