
Silicon Institute of Technology

Class Test – I (7th Sem. B. Tech- CS & IT), 2010


Sub : Compiler Design
Time – 60 mins. Max. Marks – 10 Date : 11.08.10
(Answer any Four including Q.1)
Q.1. Short type (any Five) [0.5 x 5 = 2.5]
a) What is a compiler and how is it different from an interpreter?
b) Define regular expression.
c) Differentiate DFA & NFA.
d) Differentiate left sentential form and right sentential form of a string.
e) What is a cross compiler and what is its advantage?

Q.2. Consider the following while statement : [2.5]

    While A > B && A <= 2 * B – 5 do
        A=A+B

Identify the Tokens. Generate the Parse tree, intermediate code and optimized
code.
Q.3. Show that the following grammar is ambiguous. [2.5]
E → E+E | E*E | (E) | id
Q.4. Using Thompson's construction, construct the ε-NFA for each of the following
regular expressions. [2.5]
a) (a | b)*    b) ab(a | b)*    c) (a* | b*)*a    d) (a | b)*a(a | b)+    e) (a | b)+abb
Q5. Give an equivalent DFA for the regular expression (a|b)*abb. [2.5]

Good Luck
Solution
to
Sub – Compiler Design
Class Test – I, 2010

1)
a) A compiler and an interpreter are both translators for high-level languages, but
they differ in the ways listed below.

      Compiler                                  Interpreter

i.    Translates the whole source code at       Translates the source code line by line
      a time and produces a list of errors.     and halts when an error is found.
ii.   After all errors are removed, it          After the error is removed, it moves on
      produces the object code for the          to the translation of the next line of
      given source code.                        the source code.
iii.  Is generally faster.                      Is generally slower.
iv.   Generates an object code of the           Generates an executable code of the
      source code.                              source code.
v.    The generated object code is stored       The generated executable code is stored
      in a permanent storage device.            in a temporary storage device, i.e.
                                                primary memory.
vi.   The object code is used to execute        The source code is always needed to
      the program without any need of the       execute the program, as the executable
      source code.                              code is lost after the execution.

b) Regular Expression:
Regular expressions over Σ can be defined recursively as follows:
1. Any terminal symbol (i.e. an element of Σ), Λ (the empty string) and ∅ (the
empty set) are regular expressions.
2. The union of two regular expressions R1 and R2, written as R1 + R2, is also
a regular expression.
3. The concatenation of two regular expressions R1 and R2, written as R1R2,
is also a regular expression.
4. The iteration (or closure) of a regular expression R, written as R*, is also a
regular expression.
5. If R is a regular expression, then (R) is also a regular expression.
6. The regular expressions over Σ are precisely those obtained by applying
rules 1-5 once or several times.
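
The recursive definition above can be mirrored by a short Python sketch that enumerates the strings a regular expression denotes, up to a length bound. The tuple encoding of expressions and the function name `language` are assumptions made for this illustration; rule 5 (parentheses) is represented implicitly by the nesting of tuples.

```python
def language(regex, max_len):
    """Return the set of strings of length <= max_len denoted by `regex`."""
    kind = regex[0]
    if kind == "empty":                      # rule 1: the empty set
        return set()
    if kind == "lambda":                     # rule 1: the empty string
        return {""}
    if kind == "sym":                        # rule 1: a terminal symbol
        return {regex[1]} if max_len >= 1 else set()
    if kind == "union":                      # rule 2: R1 + R2
        return language(regex[1], max_len) | language(regex[2], max_len)
    if kind == "concat":                     # rule 3: R1 R2
        left = language(regex[1], max_len)
        right = language(regex[2], max_len)
        return {x + y for x in left for y in right
                if len(x) + len(y) <= max_len}
    if kind == "star":                       # rule 4: R*
        base = language(regex[1], max_len) - {""}
        result = {""}
        frontier = {""}
        while frontier:                      # append one more factor each round
            frontier = {x + y for x in frontier for y in base
                        if len(x) + len(y) <= max_len} - result
            result |= frontier
        return result
    raise ValueError(f"unknown node kind: {kind!r}")


a, b = ("sym", "a"), ("sym", "b")
a_or_b_star = ("star", ("union", a, b))                    # (a + b)*
ab_then_any = ("concat", ("concat", a, b), a_or_b_star)    # ab(a + b)*

print(sorted(language(ab_then_any, 3)))
```

The length bound is only there to keep the enumerated sets finite; the definition itself places no bound on string length.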
c) Non-deterministic Finite Automaton (NFA):

A non-deterministic finite automaton is a 5-tuple (Q, Σ, δ, q0, F), where

i) Q is a finite nonempty set of states;
ii) Σ is a finite nonempty set of inputs;
iii) δ is the transition function mapping Q × Σ into 2^Q, the power set
of Q (the set of all subsets of Q);
iv) q0 ∈ Q is the initial state; and
v) F ⊆ Q is the set of final states.

Deterministic Finite Automaton (DFA):

A deterministic finite automaton is a 5-tuple (Q, Σ, δ, q0, F), where

i) Q is a finite nonempty set of states;
ii) Σ is a finite nonempty set of inputs;
iii) δ is the transition function which maps Q × Σ into Q;
iv) q0 ∈ Q is the initial state; and
v) F ⊆ Q is the set of final states.

      DFA                                       NFA

a)    When an input is given to a state,        When an input is given to a state,
      it transits to a single state.            it may transit to multiple states.
b)    A DFA can be implemented directly         An NFA cannot be implemented directly;
      in a system.                              it is simulated by tracking sets of
                                                states or converted to a DFA first.
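
The difference in the transition function can be seen in a small Python sketch. Both automata below are hypothetical examples chosen for this illustration (they accept the strings over {a, b} that end in "ab"); the state names and function names are assumptions.

```python
# DFA: delta maps (state, symbol) to exactly one state.
dfa_delta = {
    ("s0", "a"): "s1", ("s0", "b"): "s0",
    ("s1", "a"): "s1", ("s1", "b"): "s2",
    ("s2", "a"): "s1", ("s2", "b"): "s0",
}

# NFA: delta maps (state, symbol) to a set of states (an element of 2^Q).
nfa_delta = {
    ("p0", "a"): {"p0", "p1"}, ("p0", "b"): {"p0"},
    ("p1", "b"): {"p2"},
}


def dfa_accepts(word, start="s0", finals=frozenset({"s2"})):
    state = start
    for ch in word:
        state = dfa_delta[(state, ch)]      # always a single next state
    return state in finals


def nfa_accepts(word, start="p0", finals=frozenset({"p2"})):
    current = {start}                       # all states the NFA could be in
    for ch in word:
        current = set().union(*(nfa_delta.get((q, ch), set()) for q in current))
    return bool(current & finals)


for w in ["ab", "aab", "abb", ""]:
    print(repr(w), dfa_accepts(w), nfa_accepts(w))
```

Simulating the NFA means tracking a set of possible states, whereas the DFA needs only one current state.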

d) If α ⇒ β by a step in which the leftmost non-terminal in α is replaced, we write
α ⇒lm β. Every leftmost step has the form wAγ ⇒lm wδγ, in which w consists of
terminals only. If α derives β by a leftmost derivation, we write α ⇒*lm β. If
S ⇒*lm α, then we say α is a left sentential form of the given grammar.

Similarly, if α ⇒ β by a step in which the rightmost non-terminal in α is
replaced, we write α ⇒rm β. Every rightmost step has the form γAw ⇒rm γδw, in
which w consists of terminals only. If α derives β by a rightmost derivation, we
write α ⇒*rm β. If S ⇒*rm α, then we say α is a right sentential form of the given
grammar.

For example:
A given grammar is :
S → iCtS
S → iCtSeS
S → a
C → b
The string w = i b t i b t a e a can be derived using a leftmost derivation as
follows:
S ⇒lm iCtS
  ⇒lm ibtS
  ⇒lm ibtiCtSeS
  ⇒lm ibtibtSeS
  ⇒lm ibtibtaeS
  ⇒lm ibtibtaea

It is in the form S ⇒lm α1 ⇒lm α2 ⇒lm ··· ⇒lm αn = w. Here every αi, 1 ≤ i ≤ n,
is a left-sentential form of the given grammar.
Similarly, the rightmost derivation of the string is as follows:

S ⇒rm iCtS
  ⇒rm iCtiCtSeS
  ⇒rm iCtiCtSea
  ⇒rm iCtiCtaea
  ⇒rm iCtibtaea
  ⇒rm ibtibtaea

It is in the form S ⇒rm α1 ⇒rm α2 ⇒rm ··· ⇒rm αn = w. Here every αi, 1 ≤ i ≤ n,
is a right-sentential form of the given grammar. The rightmost derivation is also
known as the canonical derivation.
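
The two derivations can be replayed mechanically. The following Python sketch applies a chosen sequence of productions to the leftmost or the rightmost non-terminal of the current sentential form; the function name `derive` and the way the production choices are listed are assumptions made for this illustration.

```python
NONTERMINALS = {"S", "C"}


def derive(start, choices, leftmost=True):
    """Apply each (lhs, rhs) in `choices` to the leftmost (or rightmost)
    non-terminal of the current sentential form; return every form."""
    form = start
    forms = [form]
    for lhs, rhs in choices:
        positions = [i for i, sym in enumerate(form) if sym in NONTERMINALS]
        i = positions[0] if leftmost else positions[-1]
        if form[i] != lhs:
            raise ValueError(f"{lhs} is not the selected non-terminal in {form}")
        form = form[:i] + rhs + form[i + 1:]
        forms.append(form)
    return forms


leftmost_steps = [("S", "iCtS"), ("C", "b"), ("S", "iCtSeS"),
                  ("C", "b"), ("S", "a"), ("S", "a")]
rightmost_steps = [("S", "iCtS"), ("S", "iCtSeS"), ("S", "a"),
                   ("S", "a"), ("C", "b"), ("C", "b")]

left_forms = derive("S", leftmost_steps, leftmost=True)
right_forms = derive("S", rightmost_steps, leftmost=False)
print(" => ".join(left_forms))
print(" => ".join(right_forms))
```

Every element of `left_forms` is a left sentential form and every element of `right_forms` is a right sentential form; both lists end in the same string w.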

e) A compiler is characterized by three languages: its source language, its object
language, and the language in which it is written. These languages may all be
quite different.
A compiler may run on one machine and produce object code for another
machine. Such a compiler is often called a cross-compiler.
Suppose a new language L is to be made available on two different machines A
and B. Here C^{XY}_Z denotes a compiler for language X, written in language Z,
that produces code for machine Y. As a first step we may write for machine A a
small compiler C^{SA}_A that translates a subset S of language L into the machine
or assembly code of A. Then we write a compiler C^{LA}_S in the simple language S.
This program, when run through C^{SA}_A, becomes C^{LA}_A, the compiler for the
complete language L, running on machine A and producing object code for A.

    C^{LA}_S  →  [ C^{SA}_A ]  →  C^{LA}_A

Now suppose we want to have another compiler for L to run on machine B
and to produce code for B. If C^{LA}_S has been designed carefully and machine B
is not that different from machine A, then with little modification we can
convert C^{LA}_S into a compiler C^{LB}_L which produces object code for B.
So, using C^{LB}_L to produce C^{LB}_B, a compiler for L on B, is a two-step
process which is given below. Here we first run C^{LB}_L through C^{LA}_A to
produce C^{LB}_A, a cross-compiler for L which runs on machine A but produces
code for machine B. Then we run C^{LB}_L through this cross-compiler to produce
the desired compiler for L that runs on machine B and produces object code for
B.

    C^{LB}_L  →  [ C^{LA}_A ]  →  C^{LB}_A

    C^{LB}_L  →  [ C^{LB}_A ]  →  C^{LB}_B

Hence, a cross-compiler helps us to create a compiler for another machine
without starting from scratch. It greatly reduces the amount of coding needed
for a new compiler.
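
The bootstrapping steps can be checked with a small Python sketch. Each compiler is modelled as a triple (source language, target machine, language it is written in), and "running one compiler through another" is a function; the names `Compiler` and `run_through` are assumptions made for this illustration.

```python
from collections import namedtuple

# Compiler(source, target, written_in): translates `source` into code for
# `target`, and is itself a program written in `written_in`.
Compiler = namedtuple("Compiler", "source target written_in")


def run_through(program, host):
    """Compile the compiler `program` using the compiler `host`.

    The result translates the same source to the same target as `program`,
    but now exists in the code that `host` produces."""
    if program.written_in != host.source:
        raise ValueError("host cannot translate the language of the program")
    return Compiler(program.source, program.target, host.target)


c_SA_A = Compiler("S", "A", "A")            # small compiler for subset S on A
c_LA_S = Compiler("L", "A", "S")            # full compiler, written in S
c_LA_A = run_through(c_LA_S, c_SA_A)        # compiler for L running on A

c_LB_L = Compiler("L", "B", "L")            # retargeted compiler, written in L
c_LB_A = run_through(c_LB_L, c_LA_A)        # cross-compiler: runs on A, targets B
c_LB_B = run_through(c_LB_L, c_LB_A)        # compiler for L running on B

print(c_LA_A, c_LB_A, c_LB_B, sep="\n")
```

The cross-compiler is the intermediate result whose `written_in` machine (A) differs from its `target` machine (B).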

2) The given while statement is

    while A > B & A ≤ 2*B – 5 do
        A := A + B;

The tokens present in the given statement are:

    while [id, n1] > [id, n2] & [id, n1] ≤ [const, n3] * [id, n2] – [const, n4]
    do [id, n1] := [id, n1] + [id, n2];

Here n1, n2, n3 and n4 stand for pointers to the symbol table entries for A, B, 2, and 5,
respectively.
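
A lexical analyser producing such a token list can be sketched in Python with the standard `re` module. The token class names `keyword` and `op`, the ASCII spellings `<=` and `-` for ≤ and –, and the function name `tokenize` are assumptions made for this illustration; the answer above only names the classes id and const.

```python
import re

TOKEN_SPEC = [
    ("keyword", r"\b(?:while|do)\b"),
    ("id",      r"[A-Za-z_][A-Za-z0-9_]*"),
    ("const",   r"\d+"),
    ("op",      r":=|<=|>|&|\*|\+|-|;"),
    ("skip",    r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Return (tokens, symbol_table); ids and constants carry a 1-based
    index into the symbol table, playing the role of n1, n2, ..."""
    symtab, tokens, pos = [], [], 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        pos = m.end()
        kind, lexeme = m.lastgroup, m.group()
        if kind == "skip":
            continue
        if kind in ("id", "const"):
            if lexeme not in symtab:
                symtab.append(lexeme)
            tokens.append((kind, symtab.index(lexeme) + 1))
        else:
            tokens.append((kind, lexeme))
    return tokens, symtab


tokens, symtab = tokenize("while A > B & A <= 2*B - 5 do A := A + B;")
print(symtab)
print(tokens)
```

With this input the symbol table receives A, B, 2 and 5 in that order, so the indices 1 to 4 correspond to n1 to n4.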

The parse tree for the given statement is given below. The children of each node
are indented beneath it, and "exp: id(A)" stands for an exp node whose only child
is id(A).

statement
  while-statement
    while
    condition
      condition
        relation
          exp: id(A)
          relop: >
          exp: id(B)
      &
      condition
        relation
          exp: id(A)
          relop: ≤
          exp
            exp
              exp: const(2)
              *
              exp: id(B)
            -
            exp: const(5)
    do
    statement
      assignment
        location: id(A)
        :=
        exp
          exp: id(A)
          +
          exp: id(B)

The intermediate code for the given statement is as follows:

L1: if A > B goto L2
    goto L3
L2: T1 := 2 * B
    T2 := T1 - 5
    if A ≤ T2 goto L4
    goto L3
L4: A := A + B
    goto L1
L3:

To improve the code we can apply local transformations. The intermediate code
contains two instances of jumps over jumps. The first one is

    if A > B goto L2
    goto L3
L2:

This sequence can be replaced by the single statement

    if A ≤ B goto L3

In the same way, the sequence "if A ≤ T2 goto L4, goto L3, L4:" can be replaced by
"if A > T2 goto L3". Applying both replacements and renaming the exit label L3 to
L2, the optimized code is as follows:

L1: if A ≤ B goto L2
    T1 := 2 * B
    T2 := T1 - 5
    if A > T2 goto L2
    A := A + B
    goto L1
L2:
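
That the optimized code behaves like the original intermediate code can be checked with a tiny interpreter for three-address code. The tuple encoding of instructions, the ASCII spelling `<=` for ≤, and the names `run` and `same_result` are assumptions made for this sketch.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul,
       ">": operator.gt, "<=": operator.le}


def run(code, labels, env, max_steps=10_000):
    """Interpret three-address code and return the final variable values."""
    env = dict(env)

    def val(x):                                  # operand: constant or variable
        return x if isinstance(x, int) else env[x]

    pc = 0
    for _ in range(max_steps):
        if pc >= len(code):
            return env
        ins = code[pc]
        if ins[0] == "if":                       # ("if", x, relop, y, target)
            _, x, rel, y, target = ins
            pc = labels[target] if OPS[rel](val(x), val(y)) else pc + 1
        elif ins[0] == "goto":                   # ("goto", target)
            pc = labels[ins[1]]
        else:                                    # ("set", dest, x, op, y)
            _, dest, x, op, y = ins
            env[dest] = OPS[op](val(x), val(y))
            pc += 1
    raise RuntimeError("step limit exceeded")


ORIGINAL = [
    ("if", "A", ">", "B", "L2"), ("goto", "L3"),
    ("set", "T1", 2, "*", "B"), ("set", "T2", "T1", "-", 5),
    ("if", "A", "<=", "T2", "L4"), ("goto", "L3"),
    ("set", "A", "A", "+", "B"), ("goto", "L1"),
]
ORIGINAL_LABELS = {"L1": 0, "L2": 2, "L4": 6, "L3": 8}

OPTIMIZED = [
    ("if", "A", "<=", "B", "L2"),
    ("set", "T1", 2, "*", "B"), ("set", "T2", "T1", "-", 5),
    ("if", "A", ">", "T2", "L2"),
    ("set", "A", "A", "+", "B"), ("goto", "L1"),
]
OPTIMIZED_LABELS = {"L1": 0, "L2": 6}


def same_result(a, b):
    env = {"A": a, "B": b}
    return run(ORIGINAL, ORIGINAL_LABELS, env) == run(OPTIMIZED, OPTIMIZED_LABELS, env)


print(run(OPTIMIZED, OPTIMIZED_LABELS, {"A": 11, "B": 10}))
```

The optimized version executes two fewer instructions per loop iteration, since each conditional jump no longer needs a companion `goto`.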
3) The given grammar is
E → E + E | E * E | (E) | id
A grammar is said to be ambiguous if it produces more than one parse tree for
some sentence. Equivalently, a grammar is ambiguous if some sentence has more
than one leftmost derivation (or more than one rightmost derivation).

Consider the sentence id + id * id.

It has two different leftmost derivations, which are given below:
E ⇒ E + E
  ⇒ id + E
  ⇒ id + E * E
  ⇒ id + id * E
  ⇒ id + id * id
and
E ⇒ E * E
  ⇒ E + E * E
  ⇒ id + E * E
  ⇒ id + id * E
  ⇒ id + id * id
The two derivations correspond to two different parse trees, which are shown
below.

        E                        E
      / | \                    / | \
     E  +  E                  E  *  E
     |   / | \              / | \   |
    id  E  *  E            E  +  E  id
        |     |            |     |
        id    id           id    id

Hence the given grammar is ambiguous.
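
The number of parse trees of a sentence under this grammar can also be counted mechanically. The Python sketch below tries every production and every split point for each span of the token list; the function name `count_parse_trees` is an assumption made for this illustration.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count parse trees for `tokens` under E -> E+E | E*E | (E) | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        """Number of parse trees deriving tokens[i:j] from E."""
        n = 0
        if j - i == 1 and tokens[i] == "id":
            n += 1                                    # E -> id
        if j - i >= 3 and tokens[i] == "(" and tokens[j - 1] == ")":
            n += count(i + 1, j - 1)                  # E -> (E)
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "*"):
                n += count(i, k) * count(k + 1, j)    # E -> E+E | E*E
        return n

    return count(0, len(tokens))


print(count_parse_trees(["id", "+", "id", "*", "id"]))
```

A count greater than one for any sentence is enough to show that the grammar is ambiguous.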


4)
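a) The regular expression is (a | b)*
The ε-NFA for the given regular expression is as below:
[Figure: transition diagram of the ε-NFA with states q0 to q7. Labelled moves: q2 on a to q3 and q4 on b to q5; the remaining moves are ε-moves.]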
b) The regular expression is ab(a | b)*.
The ε-NFA for the given regular expression is as below:
[Figure: transition diagram of the ε-NFA with states q0 to q9.]
c) The regular expression is (a* | b*)*a
The ε-NFA for the given regular expression is as below:
[Figure: transition diagram of the ε-NFA with states q0 to q12.]
d) The regular expression is (a | b)*a (a | b)+
The ε-NFA for the given regular expression is as below:
[Figure: transition diagram of the ε-NFA with states q0 to q20.]
e) The regular expression is (a | b)+abb.
The ε-NFA for the given regular expression is as below:
[Figure: transition diagram of the ε-NFA with states q0 to q15.]
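5) The given regular expression is (a | b)*abb. The ε-NFA for the given regular
expression is given below:
[Figure: transition diagram of the ε-NFA with states q0 to q10, start state q0 and final state q10. Labelled moves: q2 on a to q3, q4 on b to q5, q7 on a to q8, q8 on b to q9, q9 on b to q10. ε-moves: q0 to q1, q0 to q7, q1 to q2, q1 to q4, q3 to q6, q5 to q6, q6 to q1, q6 to q7.]
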

Here the ε-closure of {q0} = {q0, q1, q2, q4, q7} = A.
All the states in A can be reached from q0 by ε-moves alone, so they are treated
as a single DFA state. Now applying ‘a’ as the input to the states in A, we get
{q3, q8}.
To get all the states reachable from {q3, q8} by ε-moves, we compute the ε-closure of {q3, q8}.
ε-closure {q3, q8} = {q1, q2, q3, q4, q6, q7, q8} = B.
Applying ‘b’ as the input to the states in A, we get {q5}.
To get all the states reachable from {q5} by ε-moves, we compute the ε-closure of {q5}.
ε-closure {q5} = {q1, q2, q4, q5, q6, q7} = C.
Similarly, applying ‘a’ to B, we get {q3, q8}.
ε-closure {q3, q8} = B.
Applying ‘b’ to B, we get {q5, q9}.
ε-closure {q5, q9} = {q1, q2, q4, q5, q6, q7, q9} = D.
Now applying ‘a’ to C, we get {q3, q8}.
ε-closure {q3, q8} = B.
Applying ‘b’ to C, we get {q5}.
ε-closure {q5} = C.
Now applying ‘a’ to D, we get {q3, q8}.
ε-closure {q3, q8} = B.
Applying ‘b’ to D, we get {q5, q10}.
ε-closure {q5, q10} = {q1, q2, q4, q5, q6, q7, q10} = E.
Now applying ‘a’ to E, we get {q3, q8}.
ε-closure {q3, q8} = B.
Applying ‘b’ to E, we get {q5}.
ε-closure {q5} = C.
Now the transition table for the DFA is shown below (→ marks the initial state
and * marks the final state).

State        Input
             a      b
→ A          B      C
  B          B      D
  C          B      C
  D          B      E
  E*         B      C
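
The subset construction carried out above can be sketched in Python. The ε-NFA is entered as two dictionaries whose moves are consistent with the ε-closures computed in this answer (state qi is written as the integer i); the dictionary names and the function names are assumptions made for this illustration.

```python
from collections import deque

# ε-NFA for (a | b)*abb; state qi is represented by the integer i.
EPS_MOVES = {0: {1, 7}, 1: {2, 4}, 3: {6}, 5: {6}, 6: {1, 7}}
SYMBOL_MOVES = {(2, "a"): {3}, (4, "b"): {5},
                (7, "a"): {8}, (8, "b"): {9}, (9, "b"): {10}}
NFA_START, NFA_FINAL = 0, 10


def eps_closure(states):
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in EPS_MOVES.get(q, ()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return frozenset(seen)


def subset_construction(alphabet="ab"):
    start = eps_closure({NFA_START})
    names = {start: "A"}          # DFA states are named A, B, C, ... as found
    table = {}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for ch in alphabet:
            moved = set()
            for q in current:
                moved |= SYMBOL_MOVES.get((q, ch), set())
            target = eps_closure(moved)
            if target not in names:
                names[target] = chr(ord("A") + len(names))
                queue.append(target)
            table[(names[current], ch)] = names[target]
    finals = {name for subset, name in names.items() if NFA_FINAL in subset}
    return table, finals, names


table, finals, names = subset_construction()
for state in "ABCDE":
    print(state, table[(state, "a")], table[(state, "b")])
print("final states:", finals)
```

Because new subsets are named in the order they are discovered, the names A to E produced here line up with the ones used in the transition table.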

Now let us minimize the above DFA to get an equivalent minimized DFA.
Here the 0-equivalent classes are Q1^0 = {E}, the set of all final states, and
Q2^0 = {A, B, C, D}, the set of all non-final states.
Hence the set of 0-equivalent classes is given by
P0 = {Q1^0, Q2^0}.
For the 1-equivalent classes, note that on input b the state D moves to the final
state E, whereas A, B and C move to non-final states. So we get
Q1^1 = {E}, Q2^1 = {A, B, C}, Q3^1 = {D}.
Hence the set of 1-equivalent classes is given by
P1 = {Q1^1, Q2^1, Q3^1}.
For the 2-equivalent classes, on input b the state B moves to D, which is no
longer in the same class as C (the b-successor of both A and C). So we get
Q1^2 = {E}, Q2^2 = {A, C}, Q3^2 = {B}, Q4^2 = {D}.
Hence the set of 2-equivalent classes is given by
P2 = {Q1^2, Q2^2, Q3^2, Q4^2}.
For the 3-equivalent classes, A and C have the same successors on both inputs,
so no class is split further:
Q1^3 = {E}, Q2^3 = {A, C}, Q3^3 = {B}, Q4^3 = {D}.
Hence the set of 3-equivalent classes is given by
P3 = {Q1^3, Q2^3, Q3^3, Q4^3}.
We can see that P2 = P3. Hence P2 is the set of equivalence classes.
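
The successive sets of k-equivalent classes can be computed by repeated refinement. In the Python sketch below, the names `DFA`, `refine` and `equivalence_classes` are assumptions made for this illustration; the transition table is the one obtained from the subset construction.

```python
DFA = {
    "A": {"a": "B", "b": "C"},
    "B": {"a": "B", "b": "D"},
    "C": {"a": "B", "b": "C"},
    "D": {"a": "B", "b": "E"},
    "E": {"a": "B", "b": "C"},
}
FINALS = {"E"}


def refine(partition):
    """Split every class according to the classes its states move to."""
    class_of = {q: i for i, block in enumerate(partition) for q in block}
    new_partition = []
    for block in partition:
        groups = {}
        for q in sorted(block):
            signature = tuple(class_of[DFA[q][ch]] for ch in "ab")
            groups.setdefault(signature, set()).add(q)
        new_partition.extend(groups.values())
    return new_partition


def equivalence_classes():
    """Return the list [P0, P1, ...] up to the first partition that is stable."""
    partition = [set(FINALS), set(DFA) - FINALS]     # 0-equivalent classes
    history = [partition]
    while True:
        nxt = refine(partition)
        if len(nxt) == len(partition):               # nothing was split
            return history
        partition = nxt
        history.append(partition)


for k, p in enumerate(equivalence_classes()):
    print(f"P{k} =", [sorted(block) for block in p])
```

The loop stops as soon as a refinement step splits no class, which is the point where two consecutive partitions coincide.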

So, the transition table for the minimized DFA, in which A stands for the merged
class {A, C}, is given below.

State        Input
             a      b
→ A          B      A
  B          B      D
  D          B      E
  E*         B      A

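The transition diagram for the minimized DFA is shown below.
[Figure: transition diagram with initial state A and final state E. Edges: A on a to B, A on b to A; B on a to B, B on b to D; D on a to B, D on b to E; E on a to B, E on b to A.]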
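
As a cross-check, the minimized DFA can be compared against the regular expression itself on all short strings. The sketch below uses Python's standard `re.fullmatch` as the reference; the names `MIN_DFA`, `accepts` and `agrees_with_regex`, and the length bound of 8, are assumptions made for this illustration.

```python
import re
from itertools import product

MIN_DFA = {
    "A": {"a": "B", "b": "A"},
    "B": {"a": "B", "b": "D"},
    "D": {"a": "B", "b": "E"},
    "E": {"a": "B", "b": "A"},
}


def accepts(word, start="A", finals=("E",)):
    state = start
    for ch in word:
        state = MIN_DFA[state][ch]
    return state in finals


def agrees_with_regex(max_len=8):
    """Compare the DFA with (a|b)*abb on every string up to max_len."""
    pattern = re.compile(r"(a|b)*abb")
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            word = "".join(letters)
            if accepts(word) != (pattern.fullmatch(word) is not None):
                return False
    return True


print(accepts("ababb"), accepts("abba"), agrees_with_regex())
```

Such a bounded test does not prove equivalence, but it catches any slip in the transition table quickly.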
