Problem No.
01. Write a C program for developing a lexical analyzer (LA) that will eliminate white
spaces from a source program in C and collect numbers.
02. Write a C program for developing a lexical analyzer (LA) that will eliminate white
spaces from a source program in C, collect numbers as tokens, and display the
token value as an attribute.
03. Write a C program for developing a lexical analyzer (LA) that will recognize all
basic data types of C.
04. Write a C program for developing a lexical analyzer (LA) that will recognize all
keywords of C.
05. Write a C program for developing a lexical analyzer (LA) that will eliminate white
spaces and comments from a C program.
06. Write a C program for developing a lexical analyzer (LA) that will generate tokens
for a given statement of a C source program.
07. Write a C program for developing a lexical analyzer (LA) that will recognize the
variables of a C source program.
08. Design a compiler front-end based on the syntax-directed translation technique that
will function as an infix to postfix translator for a language consisting of a sequence of
expressions terminated by semicolons.
Problem No.01
Problem Name:
Write a C program for developing a lexical analyzer(LA) that will eliminate
white spaces from a source program in C and collect numbers.
Problem analysis:
Linear analysis is called lexical analysis or scanning. For example, in lexical analysis the
characters in the assignment statement
position := initial + rate * 60
would be grouped into the following tokens:
1. The identifier position
2. The assignment symbol :=
3. The identifier initial
4. The plus sign
5. The identifier rate
6. The multiplication sign
7. The number 60
The blanks separating the characters of these tokens would normally be eliminated during
lexical analysis.
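The grouping described above can be sketched as a small scanner that works on a string. This is only an illustration under stated assumptions: the names `next_token`, `count_tokens`, `put`, and the `T_*` token kinds are hypothetical and do not come from the report's listings, and the scanner treats every character other than letters, digits, and `:=` as a one-character token.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Token kinds for this sketch; the names are illustrative. */
enum kind { T_ID, T_NUM, T_ASSIGN, T_OP, T_END };

/* Append ch to lexeme if there is still room for it and the final '\0'. */
static void put(char *lexeme, size_t *n, size_t cap, char ch)
{
    if (*n + 1 < cap)
        lexeme[(*n)++] = ch;
}

/*
 * Read one token starting at *src, store its text in lexeme (capacity cap,
 * at least 1), move *src past it, and return its kind.
 */
enum kind next_token(const char **src, char *lexeme, size_t cap)
{
    const char *p = *src;
    size_t n = 0;
    enum kind k;

    while (isspace((unsigned char)*p))      /* blanks only separate tokens */
        p++;

    if (*p == '\0') {
        k = T_END;
    } else if (isalpha((unsigned char)*p)) {
        while (isalnum((unsigned char)*p))
            put(lexeme, &n, cap, *p++);
        k = T_ID;
    } else if (isdigit((unsigned char)*p)) {
        while (isdigit((unsigned char)*p))
            put(lexeme, &n, cap, *p++);
        k = T_NUM;
    } else if (p[0] == ':' && p[1] == '=') {
        put(lexeme, &n, cap, *p++);
        put(lexeme, &n, cap, *p++);
        k = T_ASSIGN;
    } else {
        put(lexeme, &n, cap, *p++);         /* +, *, or any other single character */
        k = T_OP;
    }
    lexeme[n] = '\0';
    *src = p;
    return k;
}

/* Count the tokens in one statement. */
int count_tokens(const char *s)
{
    char lexeme[32];
    int count = 0;

    while (next_token(&s, lexeme, sizeof lexeme) != T_END)
        count++;
    return count;
}
```

Calling `count_tokens("position := initial + rate * 60")` yields 7, one for each token in the list, and the count stays the same when the blanks are removed from the statement.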
The lexical analyzer is the first phase of a compiler. Its main task is to read the input characters
and produce as output a sequence of tokens that the parser uses for syntax analysis. This
interaction, summarized schematically in Fig.(a), is commonly implemented by making the
lexical analyzer a subroutine or a coroutine of the parser. Upon receiving a "get next
token" command from the parser, the lexical analyzer reads input characters until it can
identify the next token.
[Figure: source program, lexical analyzer, parser, symbol table]
Fig.(a): Interaction of lexical analyzer with parser.
Since the lexical analyzer is the part of the compiler that reads the source text, it may also
perform certain secondary tasks at the user interface. One such task is stripping out from the
source program comments and white space in the form of blank, tab, and newline characters. The
lexical analyzer may also keep track of the number of newline characters seen, so that a line
number can be associated with an error message.
The purpose of the lexical analyzer is to allow white space and numbers to appear within
expressions.
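A minimal sketch of the two secondary tasks named above, stripping white space and counting newlines, is shown here on a string instead of a file. The function name `strip_whitespace` is an assumption made for illustration.

```c
#include <ctype.h>
#include <string.h>

/*
 * Copy src to dst without blanks, tabs, or newlines and return the number
 * of newline characters seen. dst needs room for strlen(src) + 1 bytes.
 */
int strip_whitespace(const char *src, char *dst)
{
    int newlines = 0;

    for (; *src != '\0'; src++) {
        if (*src == '\n')
            newlines++;                     /* remembered for error messages */
        if (!isspace((unsigned char)*src))
            *dst++ = *src;
    }
    *dst = '\0';
    return newlines;
}
```

The returned count plus one is the line number of the last character read, which is the value an error message would report.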
[Figure: the lexical analyzer lexan() uses getchar() to read characters and sets tokenval.]
Code
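#include <stdio.h>
#include <ctype.h>

int main(void)
{
    int t;                      /* int, not char, so that EOF can be told apart */
    int n;
    FILE *f1, *f2;

    f1 = fopen("c:\\compile\\input.txt", "r");
    f2 = fopen("c:\\compile\\output.txt", "w");
    if (f1 == NULL || f2 == NULL) {
        printf("Cannot open input or output file\n");
        return 1;
    }
    /* The loop header and the white-space test are missing from the listing;
       they are reconstructed here from the surviving branches. */
    t = getc(f1);
    while (t != EOF) {
        if (isspace(t)) {
            t = getc(f1);                   /* eliminate white space */
        } else if (isdigit(t)) {
            n = 0;
            while (isdigit(t)) {            /* collect one number */
                putc(t, f2);
                n = n * 10 + (t - '0');
                t = getc(f1);
            }
            printf("%d\n", n);
        } else {
            putc(t, f2);
            t = getc(f1);
        }
    }
    fclose(f1);
    fclose(f2);
    return 0;
}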
INPUT:
void main()
{
FILE *f1,*f2;
long int a;
char c[100];
f1=fopen ("testinput.cpp","r");
f2=fopen("testoutput.cpp","w");
while(fscanf(f1,"%s",c)!=EOF) /* reading value from file */
{
int line=1;
if(c[0]=='\n')
{fprintf(f2,"\n",);line++;}
else if(!isdigit(c[0]))
fprintf(f2,"%s",c);
else if(isdigit(c[0])
{
a=c[0]-'10';
int i=1;
j=120;
while(isdigit(c[i]))
{
a=a*10+c[i]-'0';
i++;
}
printf("Number %ld in line no. %d\n",a,line);
}}}
OUTPUT:
voidmain()
{
FILE*f1,*f2;
longinta;
charc[100];
f1=fopen("testinput.cpp","r");
f2=fopen("testoutput.cpp","w");
while(fscanf(f1,"%s",c)!=EOF)/*readingvaluefromfile*/
{
intline=Num(1);
if(c[0]=='\n')
{fprintf(f2,"\n",);line++;}
elseif(!isdigit(c[0]))
fprintf(f2,"%s",c);
elseif(isdigit(c[0])
{
a=c[0]-'Num(10)';
inti=Num(1);
j=Num(120);
while(isdigit(c[i]))
{
a=a*Num(10)+c[i]-'Num(0)';
i++;
}
printf("Number%ldinlineno.%d\n",a,line);
}}}
Problem No.02
Problem Name:
Write a C program for developing a lexical analyzer(LA) that will eliminate white
spaces from a source program in C and collect numbers as token and then also
display the token and token value attribute.
Problem analysis:
Linear analysis is called lexical analysis or scanning. For example, in lexical analysis the
characters in the assignment statement
position := initial + rate * 60
would be grouped into the following tokens:
1. The identifier position
2. The assignment symbol :=
3. The identifier initial
4. The plus sign
5. The identifier rate
6. The multiplication sign
7. The number 60
The blanks separating the characters of these tokens would normally be eliminated during
lexical analysis.
The lexical analyzer is the first phase of a compiler. Its main task is to read the input characters
and produce as output a sequence of tokens that the parser uses for syntax analysis. This
interaction, summarized schematically in Fig.(a), is commonly implemented by making the
lexical analyzer a subroutine or a coroutine of the parser. Upon receiving a "get next
token" command from the parser, the lexical analyzer reads input characters until it can
identify the next token.
[Figure: source program, lexical analyzer, parser, symbol table]
Fig.(a): Interaction of lexical analyzer with parser.
Since the lexical analyzer is the part of the compiler that reads the source text, it may also
perform certain secondary tasks at the user interface. One such task is stripping out from the
source program comments and white space in the form of blank, tab, and newline characters. The
lexical analyzer may also keep track of the number of newline characters seen, so that a line
number can be associated with an error message.
Tokens: The smallest individual units in a source program are known as tokens.
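One way to hold a token together with its attribute is a small structure. The sketch below is an assumption for illustration: the `struct token` layout, the `NUM` code 256, and the function `collect_numbers` are hypothetical names and values, not taken from the listing that follows.

```c
#include <ctype.h>
#include <stddef.h>

enum { NUM = 256 };             /* token code chosen above the char range */

struct token {
    int type;                   /* token, here always NUM */
    int value;                  /* attribute: the numeric value */
};

/*
 * Scan src and store up to max NUM tokens in out.
 * Returns how many tokens were stored.
 */
size_t collect_numbers(const char *src, struct token *out, size_t max)
{
    size_t count = 0;

    while (*src != '\0') {
        if (isdigit((unsigned char)*src)) {
            int n = 0;
            while (isdigit((unsigned char)*src)) {
                n = n * 10 + (*src - '0');  /* build the attribute value */
                src++;
            }
            if (count < max) {
                out[count].type = NUM;
                out[count].value = n;
                count++;
            }
        } else {
            src++;                          /* everything else is skipped */
        }
    }
    return count;
}
```

For the text `int line=1; j=120; a=a*10;` the function stores three tokens, each of type NUM, with the values 1, 120, and 10.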
CODE
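#include <stdio.h>
#include <ctype.h>

int main(void)
{
    int t;                      /* int, not char, so that EOF can be told apart */
    int n;
    FILE *f1, *f2;

    f1 = fopen("c:\\compile\\input.txt", "r");
    f2 = fopen("c:\\compile\\output.txt", "w");
    if (f1 == NULL || f2 == NULL) {
        printf("Cannot open input or output file\n");
        return 1;
    }
    /* The header text is truncated in the listing; this wording is assumed. */
    printf("Token\tValue");
    /* The loop header and the end of the program are also missing from the
       listing and are reconstructed here. */
    t = getc(f1);
    while (t != EOF) {
        if (isspace(t)) {
            t = getc(f1);                   /* eliminate white space */
        } else if (!isdigit(t)) {
            putc(t, f2);
            t = getc(f1);
        } else {
            n = 0;
            while (isdigit(t)) {            /* collect one number */
                putc(t, f2);
                n = n * 10 + (t - '0');
                t = getc(f1);
            }
            printf("\nnum\t%d", n);         /* token and its attribute value */
        }
    }
    printf("\n");
    fclose(f1);
    fclose(f2);
    return 0;
}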
INPUT:
void main()
{
FILE *f1,*f2;
long int a;
char c[100];
f1=fopen ("testinput.cpp","r");
f2=fopen("testoutput.cpp","w");
while(fscanf(f1,"%s",c)!=EOF)
{
int line=1;
if(c[0]=='\n')
{fprintf(f2,"\n",);line++;}
else if(!isdigit(c[0]))
fprintf(f2,"%s",c);
else if(isdigit(c[0])
{
a=c[0]-'10';
int i=1;
j=120;
}
}
}
OUTPUT:
voidmain()
{
FILE*f1,*f2;
longinta;
charc[100];
f1=fopen("testinput.cpp","r");
f2=fopen("testoutput.cpp","w");
while(fscanf(f1,"%s",c)!=EOF)
{
intline=1;
if(c[0]=='\n')
{fprintf(f2,"\n",);line++;}
elseif(!isdigit(c[0]))
fprintf(f2,"%s",c);
elseif(isdigit(c[0])
{
a=c[0]-'10';
inti=1;
j=120;
}
}
}
NUM     1
NUM     10
NUM     1
NUM     120
NUM     10
NUM     0
Problem No.03
Problem Name:
Write a C program for developing a lexical analyzer(LA) that will recognize all
basic data types of C.
Problem analysis:
The lexical analyzer is the first phase of a compiler. Its main task is to read the input characters
and produce as output a sequence of tokens that the parser uses for syntax analysis. This
interaction, summarized schematically in Fig.(a), is commonly implemented by making the
lexical analyzer a subroutine or a coroutine of the parser. Upon receiving a "get next
token" command from the parser, the lexical analyzer reads input characters until it can
identify the next token.
[Figure: source program, lexical analyzer, parser, symbol table]
Fig.(a): Interaction of lexical analyzer with parser.
Since the lexical analyzer is the part of the compiler that reads the source text, it may also
perform certain secondary tasks at the user interface. One such task is stripping out from the
source program comments and white space in the form of blank, tab, and newline characters. The
lexical analyzer may also keep track of the number of newline characters seen, so that a line
number can be associated with an error message. The basic data types in a C program are int,
float, char, double, and long int.
CODE
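#include <stdio.h>
#include <string.h>

int main(void)
{
    char ch[100];               /* buffer for one white-space-separated word */
    FILE *f1;

    f1 = fopen("c:\\compile\\input.txt", "r");
    if (f1 == NULL) {
        printf("Cannot open input file\n");
        return 1;
    }
    /* %99s keeps fscanf from writing past the end of ch. */
    while (fscanf(f1, "%99s", ch) == 1) {
        if (strcmp("int", ch) == 0 ||
            strcmp("float", ch) == 0 ||
            strcmp("char", ch) == 0 ||
            strcmp("double", ch) == 0)
            printf("%s\n", ch);
    }
    fclose(f1);
    return 0;
}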
INPUT:
int main()
{
int a,b,c;
float s;
char t;
}
OUTPUT:
int
float
char
Result and Discussion:
This program was written in C and successfully recognizes the basic data types of C.
Problem No.04
Problem Name:
Write a C program for developing a lexical analyzer(LA) that will recognize all
Keywords of C.
Problem analysis:
The lexical analyzer is the first phase of a compiler. Its main task is to read the input characters
and produce as output a sequence of tokens that the parser uses for syntax analysis. This
interaction, summarized schematically in Fig.(a), is commonly implemented by making the
lexical analyzer a subroutine or a coroutine of the parser. Upon receiving a "get next
token" command from the parser, the lexical analyzer reads input characters until it can
identify the next token.
[Figure: source program, lexical analyzer, parser, symbol table]
Fig.(a): Interaction of lexical analyzer with parser.
Since the lexical analyzer is the part of the compiler that reads the source text, it may also
perform certain secondary tasks at the user interface. One such task is stripping out from the
source program comments and white space in the form of blank, tab, and newline characters. The
lexical analyzer may also keep track of the number of newline characters seen, so that a line
number can be associated with an error message. The keywords of the C language are
for, auto, if, else, break, case, char, const, continue, default, do, double, enum, extern, float,
goto, int, long, register, return, short, signed, sizeof, static, struct, switch, typedef, union,
unsigned, void, volatile, while.
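Because the keyword set is fixed, it can be kept in a sorted table and searched with the standard library function `bsearch`. The following is a sketch under that assumption; the names `keywords`, `cmp_keyword`, and `is_keyword` are hypothetical, and the listing in this report uses a linear search instead.

```c
#include <stdlib.h>
#include <string.h>

/* The 32 keywords listed above, in ascending strcmp order as bsearch requires. */
static const char *const keywords[] = {
    "auto", "break", "case", "char", "const", "continue", "default", "do",
    "double", "else", "enum", "extern", "float", "for", "goto", "if",
    "int", "long", "register", "return", "short", "signed", "sizeof", "static",
    "struct", "switch", "typedef", "union", "unsigned", "void", "volatile", "while"
};

/* bsearch passes the search key first and a pointer to a table element second. */
static int cmp_keyword(const void *key, const void *elem)
{
    return strcmp((const char *)key, *(const char *const *)elem);
}

/* Return 1 if word is a keyword, 0 otherwise. The comparison is case-sensitive. */
int is_keyword(const char *word)
{
    return bsearch(word, keywords, sizeof keywords / sizeof keywords[0],
                   sizeof keywords[0], cmp_keyword) != NULL;
}
```

With this table `is_keyword("while")` is 1, while `is_keyword("main")` and `is_keyword("For")` are 0. The table must stay sorted when a keyword is added, otherwise the binary search may miss entries.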
CODE
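#include <stdio.h>
#include <string.h>
#include <ctype.h>

int main(void)
{
    const char *k[] = {"auto", "break", "case", "char", "const", "continue", "default",
        "do", "double", "else", "enum", "extern", "float", "for", "goto", "if", "int",
        "long", "register", "return", "short", "signed", "sizeof", "static", "struct",
        "switch", "typedef", "union", "unsigned", "void", "volatile", "while"};
    const int nk = (int)(sizeof k / sizeof k[0]);
    char t[100];                /* buffer for one word */
    int c, i, len;
    FILE *f1;

    f1 = fopen("c:\\compile\\input.txt", "r");
    if (f1 == NULL) {
        printf("Cannot open input file\n");
        return 1;
    }
    /* Words are scanned character by character so that keywords next to
       punctuation, as in main(void) or for(j=0;...), are also found. */
    c = getc(f1);
    while (c != EOF) {
        if (isalpha(c) || c == '_') {
            len = 0;
            while (isalnum(c) || c == '_') {
                if (len < 99)
                    t[len++] = (char)c;
                c = getc(f1);
            }
            t[len] = '\0';
            for (i = 0; i < nk; i++)
                if (strcmp(t, k[i]) == 0) {
                    printf("%s\n", t);
                    break;
                }
        } else {
            c = getc(f1);
        }
    }
    fclose(f1);
    return 0;
}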
INPUT:
#include <stdlib.h>
#include <stdio.h>
#include <values.h>
#include <time.h>
int main(void)
{
int i,j;
for(j=0;j<150;j++)
{
for(i=0;i<200;i++)
printf("%d\n", rand() % MAXINT);
}
return 0;
}
OUTPUT:
int
void
int
for
for
return
Problem No.05
Problem Name:
Write a C program for developing a lexical analyzer(LA) that will eliminate white
spaces and comments from a source program in C .
Problem analysis:
Linear analysis is called lexical analysis or scanning. For example, in lexical analysis the
characters in the assignment statement
position := initial + rate * 60
would be grouped into the following tokens:
The identifier position
The assignment symbol :=
The identifier initial
The plus sign
The identifier rate
The multiplication sign
The number 60
The blanks separating the characters of these tokens would normally be eliminated during lexical
analysis.
The lexical analyzer is the first phase of a compiler. Its main task is to read the input characters
and produce as output a sequence of tokens that the parser uses for syntax analysis. This
interaction, summarized schematically in Fig.(a), is commonly implemented by making the lexical
analyzer a subroutine or a coroutine of the parser. Upon receiving a "get next token"
command from the parser, the lexical analyzer reads input characters until it can identify the next
token.
[Figure: source program, lexical analyzer, parser, symbol table]
Fig.(a): Interaction of lexical analyzer with parser.
Since the lexical analyzer is the part of the compiler that reads the source text, it may also
perform certain secondary tasks at the user interface. One such task is stripping out from the
source program comments and white space in the form of blank, tab, and newline characters. The
lexical analyzer may also keep track of the number of newline characters seen, so that a line
number can be associated with an error message.
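The stripping task can be sketched on a string so that it is easy to try out. The function name `strip_space_and_comments` is an assumption for illustration. Like the sample output in this report, the sketch does not treat string literals specially, so blanks inside quoted strings are removed as well.

```c
#include <ctype.h>
#include <string.h>

/*
 * Copy src to dst, dropping white space and C block comments.
 * dst needs room for strlen(src) + 1 bytes.
 */
void strip_space_and_comments(const char *src, char *dst)
{
    while (*src != '\0') {
        if (src[0] == '/' && src[1] == '*') {
            src += 2;                       /* skip the opening marker */
            while (*src != '\0' && !(src[0] == '*' && src[1] == '/'))
                src++;                      /* skip the comment body */
            if (*src != '\0')
                src += 2;                   /* skip the closing marker */
        } else if (isspace((unsigned char)*src)) {
            src++;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
}
```

An unterminated comment simply swallows the rest of the input. A real compiler would report an error at that point.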
CODE
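#include <stdio.h>
#include <ctype.h>

int main(void)
{
    int t, t1;                  /* int, not char, so that EOF can be told apart */
    FILE *f1, *f2;

    f1 = fopen("c:\\compile\\input.txt", "r");
    f2 = fopen("c:\\compile\\output.txt", "w");
    if (f1 == NULL || f2 == NULL) {
        printf("Cannot open input or output file\n");
        return 1;
    }
    /* The outer loop is missing from the listing; it is reconstructed here
       around the surviving comment-skipping code. */
    while ((t = getc(f1)) != EOF) {
        if (isspace(t))
            continue;                       /* eliminate white space */
        if (t == '/') {
            t1 = getc(f1);
            if (t1 == '*') {                /* start of a comment */
                t = 0;
                while ((t1 = getc(f1)) != EOF) {
                    if (t == '*' && t1 == '/')
                        break;              /* end of the comment */
                    t = t1;
                }
            } else {
                putc(t, f2);                /* a lone '/' is ordinary text */
                if (t1 != EOF)
                    ungetc(t1, f1);
            }
        } else {
            putc(t, f2);
        }
    }
    fclose(f1);
    fclose(f2);
    return 0;
}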
INPUT:
#include<stdio.h>
#include<conio.h>
void main()
{
clrscr();
int p,q,m,n;
printf("How many line ");
scanf("%d",&n);/* n is the number of input*/
printf("\n\n");
for(p=1;p<=n;p++)
{
for(q=1;q<=(n-p);q++)
printf(" ");
m=p;
for(q=1;q<=p;q++)
printf("%2d",(m++%10));
m-=2;
for(q=1;q<p;q++)
printf("%2d",(m--%10));
printf("\n");
}
getch();
}
OUTPUT:
#include<stdio.h>
#include<conio.h>
voidmain()
{
clrscr();
intp,q,m,n;
printf("Howmanyline");
scanf("%d",&n);
printf("\n\n");
for(p=1;p<=n;p++)
{
for(q=1;q<=(n-p);q++)
printf("");
m=p;
for(q=1;q<=p;q++)
printf("%2d",(m++%10));
m-=2;
for(q=1;q<p;q++)
printf("%2d",(m--%10));
printf("\n");
}
getch();
}
Problem No.06
Problem Name:
Write a C program for developing a lexical analyzer(LA) that will generate token
for a given statement of C source program.
Problem analysis:
The lexical analyzer is the first phase of a compiler. Its main task is to read the input characters
and produce as output a sequence of tokens that the parser uses for syntax analysis. This
interaction, summarized schematically in Fig.(a), is commonly implemented by making the
lexical analyzer a subroutine or a coroutine of the parser. Upon receiving a "get next
token" command from the parser, the lexical analyzer reads input characters until it can
identify the next token.
[Figure: source program, lexical analyzer, parser, symbol table]
Fig.(a): Interaction of lexical analyzer with parser.
Since the lexical analyzer is the part of the compiler that reads the source text, it may also
perform certain secondary tasks at the user interface. One such task is stripping out from the
source program comments and white space in the form of blank, tab, and newline characters. The
lexical analyzer may also keep track of the number of newline characters seen, so that a line
number can be associated with an error message. The smallest individual units in a source
program are known as tokens.
CODING:
#include <stdio.h>
#include <string.h>
#include <ctype.h>

int keyword(const char buf[]);

/* Keyword table, terminated by a NULL pointer. */
const char *key[] = {"auto", "break", "case", "char", "const", "continue", "default", "do",
    "double", "else", "enum", "extern", "float", "for", "goto", "if", "int", "long",
    "register", "return", "short", "signed", "sizeof", "static", "struct", "switch",
    "typedef", "union", "unsigned", "void", "volatile", "while", NULL};

int main(void)
{
    int c;                      /* int, not char, so that EOF can be told apart */
    char buf[100];
    FILE *f;

    f = fopen("c6input.cpp", "r");
    if (f == NULL) {
        printf("Cannot open c6input.cpp\n");
        return 1;
    }
    c = getc(f);
    printf("Token\t\tAttribute value:\n");
    while (c != EOF) {
        int i = 0;
        if (isalpha(c) || c == '_') {               /* identifier or keyword */
            while (isalnum(c) || c == '_') {
                if (i < 99)
                    buf[i++] = (char)c;
                c = getc(f);
            }
            buf[i] = '\0';
            if (keyword(buf) == 0)
                printf("ID\t\t%s\n", buf);
            else
                printf("%s\t\t%s\n", buf, buf);
        } else if (isdigit(c)) {                    /* integer or decimal number */
            int a = 0;
            while (isdigit(c)) {
                a = a * 10 + (c - '0');
                c = getc(f);
            }
            if (c == '.') {
                char b[10];                         /* digits after the point */
                int j = 0;
                c = getc(f);
                while (isdigit(c)) {
                    if (j < 9)
                        b[j++] = (char)c;
                    c = getc(f);
                }
                b[j] = '\0';
                printf("Num\t\t%d.%s\n", a, b);
            } else {
                printf("Num\t\t%d\n", a);
            }
        } else if (c == '<' || c == '>' || c == '=') {
            int k = c;                              /* <, >, =, <=, >=, == */
            c = getc(f);
            if (c == '=') {
                printf("RE\t\t%c%c\n", k, c);
                c = getc(f);
            } else {
                printf("RE\t\t%c\n", k);
            }
        } else {
            if (!isspace(c))
                printf("Punctuation\t%c\n", c);
            c = getc(f);
        }
    }
    fclose(f);
    return 0;
}

/* Return 1 if buf is a C keyword, 0 otherwise. */
int keyword(const char buf[])
{
    int i;

    for (i = 0; key[i] != NULL; i++)
        if (strcmp(key[i], buf) == 0)
            return 1;
    return 0;
}
INPUT:
(i<j)
do{
s=s+20.04;
}
int a_1,a4;
for(i=0;i<n;i++)
a_first+s;
OUTPUT:
Token           Attribute value:
Punctuation     (
ID              i
RE              <
ID              j
Punctuation     )
do              do
Punctuation     {
ID              s
RE              =
ID              s
Punctuation     +
Num             20.04
Punctuation     ;
Punctuation     }
int             int
ID              a_1
Punctuation     ,
ID              a4
Punctuation     ;
for             for
Punctuation     (
ID              i
RE              =
Num             0
Punctuation     ;
ID              i
RE              <
ID              n
Punctuation     ;
ID              i
Punctuation     +
Punctuation     +
Punctuation     )
ID              a_first
Punctuation     +
ID              s
Punctuation     ;
Problem No.07
Problem Name:
Write a C program for developing a lexical analyzer (LA) that will recognize the
variables of a C source program.
Problem analysis:
The lexical analyzer is the first phase of a compiler. Its main task is to read the input characters
and produce as output a sequence of tokens that the parser uses for syntax analysis. This
interaction, summarized schematically in Fig.(a), is commonly implemented by making the
lexical analyzer a subroutine or a coroutine of the parser. Upon receiving a "get next
token" command from the parser, the lexical analyzer reads input characters until it can
identify the next token.
[Figure: source program, lexical analyzer, parser, symbol table]
Fig.(a): Interaction of lexical analyzer with parser.
Since the lexical analyzer is the part of the compiler that reads the source text, it may also
perform certain secondary tasks at the user interface. One such task is stripping out from the
source program comments and white space in the form of blank, tab, and newline characters. The
lexical analyzer may also keep track of the number of newline characters seen, so that a line
number can be associated with an error message. The smallest individual units in a source
program are known as tokens.
CODING:
#include <stdio.h>
#include <string.h>
#include <ctype.h>

int keyword(const char buf[]);

/* Keyword table, terminated by a NULL pointer. */
const char *key[] = {"auto", "break", "case", "char", "const", "continue", "default", "do",
    "double", "else", "enum", "extern", "float", "for", "goto", "if", "int", "long",
    "register", "return", "short", "signed", "sizeof", "static", "struct", "switch",
    "typedef", "union", "unsigned", "void", "volatile", "while", NULL};

int main(void)
{
    int c;                      /* int, not char, so that EOF can be told apart */
    char buf[100];
    FILE *f;

    f = fopen("c7input.cpp", "r");
    if (f == NULL) {
        printf("Cannot open c7input.cpp\n");
        return 1;
    }
    c = getc(f);
    while (c != EOF) {
        if (isalpha(c) || c == '_') {
            int i = 0;
            while (isalnum(c) || c == '_') {        /* collect one word */
                if (i < 99)
                    buf[i++] = (char)c;
                c = getc(f);
            }
            buf[i] = '\0';
            if (keyword(buf) == 0)                  /* not a keyword: a variable */
                printf("%s\n", buf);
        } else {
            c = getc(f);
        }
    }
    fclose(f);
    return 0;
}

/* Return 1 if buf is a C keyword, 0 otherwise. */
int keyword(const char buf[])
{
    int i;

    for (i = 0; key[i] != NULL; i++)
        if (strcmp(key[i], buf) == 0)
            return 1;
    return 0;
}
INPUT:
int a_1,a4;
for(i=0;i<n;i++)
a_first+s;
OUTPUT:
a_1
a4
i
i
n
i
a_first
s
Problem No.08
Problem Name:
Design a compiler front-end based on the syntax-directed translation technique that
will function as an infix to postfix translator for a language consisting of a sequence of
expressions terminated by semicolons.
Problem analysis: In a compiler, linear analysis is called lexical analysis or scanning. The
characters in the assignment
pos := init + rate * 60
would be grouped into the following tokens:
The identifier pos
The assignment symbol :=
The identifier init
The plus sign
The identifier rate
The multiplication sign
The number 60
The blanks separating the characters of these tokens would normally be eliminated during
lexical analysis.
Description of the Translator:
The translator is designed using the syntax-directed translation scheme in Fig.6. The token
id represents a nonempty sequence of letters and digits beginning with a letter, num a
sequence of digits, and eof an end-of-file character. Tokens are separated by sequences of
blanks, tabs, and newlines (white space). The attribute lexeme of the token id gives the
character string forming the token; the attribute value of the token num gives the integer
represented by the num.
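start  -> list eof
list   -> expr ; list
        | ε
expr   -> expr + term      {print('+')}
        | expr - term      {print('-')}
        | term
term   -> term * factor    {print('*')}
        | term / factor    {print('/')}
        | term div factor  {print('DIV')}
        | term mod factor  {print('MOD')}
        | factor
factor -> ( expr )
        | id               {print(id.lexeme)}
        | num              {print(num.value)}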
The code for the translator is arranged into seven modules, each stored in a separate file.
Execution begins in the module main.c, which consists of a call to init() for initialization followed
by a call to parse() for the translation. The remaining six modules are shown in Fig.7. There is
also a global header file global.h.
[Figure: infix expression as input; modules init.c, symbol.c, lexer.c, parser.c, error.c, and
the emitter; postfix expression as output]
Fig.7: Modules of infix to postfix translator.
The Lexical Analysis Module lexer.c
The lexical analyzer is a routine called lexan() that is called by the parser to find tokens. The
value of the attribute associated with the token is assigned to a global variable tokenval.
The following tokens are expected by the parser:
+ - * / DIV MOD ( ) ID NUM DONE
Here ID represents an identifier, NUM a number, and DONE the end-of-file character. White
space is silently stripped out by the lexical analyzer. The following table shows the token and
attribute value for each lexeme.
LEXEME                                TOKEN             ATTRIBUTE VALUE
white space
sequence of digits                    NUM               numeric value of sequence
div                                   DIV
mod                                   MOD
other sequence of a letter then
letters and digits                    ID
end-of-file character                 DONE
any other character                   that character
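The table can be read as a specification for lexan(). The sketch below follows it under some assumptions that are not in the report: input comes from a string set by the hypothetical `lexer_init` instead of from standard input, the numeric token codes start at 256, the lexeme of an ID is left in a buffer `lexbuf`, and `tokenval` is set to `NONE` for tokens without a numeric value.

```c
#include <ctype.h>
#include <string.h>

enum { NONE = -1, NUM = 256, DIV, MOD, ID, DONE };

static const char *input = "";  /* assumed string source for the sketch */
int tokenval = NONE;            /* attribute of the most recent token */
char lexbuf[64];                /* lexeme of the most recent word */

void lexer_init(const char *s)
{
    input = s;
}

int lexan(void)
{
    for (;;) {
        int c = (unsigned char)*input;

        if (c == '\0')
            return DONE;                    /* end of input */
        input++;
        if (isspace(c))
            continue;                       /* white space is stripped */
        if (isdigit(c)) {
            tokenval = c - '0';
            while (isdigit((unsigned char)*input))
                tokenval = tokenval * 10 + (*input++ - '0');
            return NUM;
        }
        if (isalpha(c)) {
            size_t n = 0;
            lexbuf[n++] = (char)c;
            while (isalnum((unsigned char)*input)) {
                if (n + 1 < sizeof lexbuf)
                    lexbuf[n++] = *input;
                input++;
            }
            lexbuf[n] = '\0';
            tokenval = NONE;
            if (strcmp(lexbuf, "div") == 0)
                return DIV;
            if (strcmp(lexbuf, "mod") == 0)
                return MOD;
            return ID;
        }
        tokenval = NONE;
        return c;                           /* any other character is its own token */
    }
}
```

After `lexer_init("rate div 60;")`, successive calls to `lexan()` return ID, DIV, NUM (with `tokenval` equal to 60), the character `';'`, and then DONE.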
After left recursion is eliminated, the translation scheme becomes:
start       -> list eof
list        -> expr ; list
             | ε
expr        -> term moreterms
moreterms   -> + term {print('+')} moreterms
             | - term {print('-')} moreterms
             | ε
term        -> factor morefactors
morefactors -> * factor {print('*')} morefactors
             | / factor {print('/')} morefactors
             | div factor {print('DIV')} morefactors
             | mod factor {print('MOD')} morefactors
             | ε
factor      -> ( expr )
             | id {print(id.lexeme)}
             | num {print(num.value)}
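A scheme of this shape maps directly onto a recursive-descent translator with one function per nonterminal. The sketch below is an illustration under several assumptions: it reads from a string, handles a single expression without the trailing semicolon, leaves out div and mod, writes the postfix form into a buffer with blanks between lexemes, and replaces the tail-recursive moreterms and morefactors with loops. The names `to_postfix`, `emit`, `in`, `out`, and `ok` are hypothetical.

```c
#include <ctype.h>
#include <string.h>

static const char *in;          /* input cursor */
static char out[256];           /* postfix output, lexemes separated by blanks */
static int ok;                  /* cleared on a syntax error or overflow */

static void expr(void);

static void skip_blanks(void)
{
    while (isspace((unsigned char)*in))
        in++;
}

/* Append a lexeme of length n to out, followed by one blank. */
static void emit(const char *s, size_t n)
{
    size_t used = strlen(out);

    if (used + n + 2 > sizeof out) {
        ok = 0;
        return;
    }
    memcpy(out + used, s, n);
    out[used + n] = ' ';
    out[used + n + 1] = '\0';
}

/* factor -> ( expr ) | id {print(id.lexeme)} | num {print(num.value)} */
static void factor(void)
{
    const char *start;

    skip_blanks();
    start = in;
    if (*in == '(') {
        in++;
        expr();
        skip_blanks();
        if (*in == ')')
            in++;
        else
            ok = 0;
    } else if (isalpha((unsigned char)*in)) {
        while (isalnum((unsigned char)*in))
            in++;
        emit(start, (size_t)(in - start));
    } else if (isdigit((unsigned char)*in)) {
        while (isdigit((unsigned char)*in))
            in++;
        emit(start, (size_t)(in - start));
    } else {
        ok = 0;
    }
}

/* term -> factor morefactors, with morefactors written as a loop */
static void term(void)
{
    factor();
    for (;;) {
        skip_blanks();
        if (*in == '*' || *in == '/') {
            char op = *in++;
            factor();
            emit(&op, 1);       /* the print action comes after the operand */
        } else {
            return;
        }
    }
}

/* expr -> term moreterms, with moreterms written as a loop */
static void expr(void)
{
    term();
    for (;;) {
        skip_blanks();
        if (*in == '+' || *in == '-') {
            char op = *in++;
            term();
            emit(&op, 1);
        } else {
            return;
        }
    }
}

/* Translate one infix expression into dst. Returns 1 on success, 0 on error. */
int to_postfix(const char *infix, char *dst, size_t cap)
{
    size_t len;

    in = infix;
    out[0] = '\0';
    ok = 1;
    expr();
    skip_blanks();
    if (*in != '\0')
        ok = 0;                 /* leftover input is a syntax error */
    len = strlen(out);
    if (len > 0)
        out[len - 1] = '\0';    /* drop the trailing blank */
    if (!ok || strlen(out) + 1 > cap)
        return 0;
    strcpy(dst, out);
    return 1;
}
```

For the input `a+(b*c)` the function produces `a b c * +`, and for `9-5+2` it produces `9 5 - 2 +`, which shows that the operators associate to the left as the grammar requires.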
The emitter module consists of a single function emit(t,tval) that generates the output for token t
with attribute value tval.
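[Figure: the array symtable has the fields lexptr, token, and attributes, with entries for div,
mod, and two id tokens. Each lexptr points into the array lexemes, where every lexeme ends
with EOS.]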
The entries in the array symtable are pairs consisting of a pointer to the lexemes array and
an integer denoting the token stored there.
The operation insert(s,t) returns the symtable index for the lexeme s forming the
token t. The function lookup(s) returns the index of the entry in symtable for the lexeme s, or 0
if s is not there.
The module init.c is used to preload symtable with keywords. The lexeme and token
representations for all the keywords are stored in the array keywords, which has the same type as
the symtable array. The function init() goes sequentially through the keywords array, using the
function insert to put the keywords in the symbol table. This arrangement allows us to change the
representation of the tokens for keywords in a convenient way.
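The symbol-table operations described above can be sketched as follows. The array sizes, the token codes, the helper `token_of`, and the choice to return 0 when a table is full are assumptions made for this illustration; only the names symtable, lexemes, lexptr, insert, lookup, init, and keywords come from the description.

```c
#include <string.h>

#define STRMAX 999              /* assumed size of the lexemes array */
#define SYMMAX 100              /* assumed size of symtable */

enum { DIV = 257, MOD, ID };    /* assumed token codes */

struct entry {
    char *lexptr;               /* points into lexemes */
    int token;
};

static char lexemes[STRMAX];
static int lastchar = -1;       /* last used position in lexemes */
static struct entry symtable[SYMMAX];
static int lastentry = 0;       /* last used position; index 0 stays unused */

/* Return the symtable index of lexeme s, or 0 if s is not there. */
int lookup(const char *s)
{
    int p;

    for (p = lastentry; p > 0; p--)
        if (strcmp(symtable[p].lexptr, s) == 0)
            return p;
    return 0;
}

/* Store lexeme s with token tok. Returns the new index, or 0 if a table is full. */
int insert(const char *s, int tok)
{
    int len = (int)strlen(s);

    if (lastentry + 1 >= SYMMAX || lastchar + len + 1 >= STRMAX)
        return 0;
    lastentry++;
    symtable[lastentry].token = tok;
    symtable[lastentry].lexptr = &lexemes[lastchar + 1];
    lastchar += len + 1;        /* lexeme plus its terminating '\0' */
    strcpy(symtable[lastentry].lexptr, s);
    return lastentry;
}

/* Token stored at a symtable index (hypothetical helper for the example). */
int token_of(int index)
{
    return symtable[index].token;
}

/* Preload the keywords, as init.c is described as doing. */
void init(void)
{
    static const struct { const char *lexeme; int token; } keywords[] = {
        { "div", DIV },
        { "mod", MOD }
    };
    size_t i;

    for (i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        insert(keywords[i].lexeme, keywords[i].token);
}
```

After `init()`, `lookup("div")` returns 1 and `lookup("mod")` returns 2, so a lexer can tell keywords from identifiers by looking at the stored token. A name such as `rate` is absent until `insert("rate", ID)` adds it at index 3.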
Postfix Notation:
The postfix notation for an expression E can be defined inductively as follows:
1. If E is a variable or constant, the postfix notation for E is E itself.
2. If E is an expression of the form E1 op E2, where op is any binary operator, then the
postfix notation for E is E1' E2' op, where E1' and E2' are the postfix notations for
E1 and E2, respectively.
3. If E is an expression of the form (E1), then the postfix notation for E1 is also the
postfix notation for E.
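The inductive definition translates almost line for line into a recursive function over an expression tree. The sketch below assumes a hypothetical `struct expr` node type and a function named `postfix`; neither appears in the report's listing, which works directly on the input string.

```c
#include <stdio.h>
#include <string.h>

/* A leaf has op == 0 and a name; an inner node has an operator and two operands. */
struct expr {
    char op;
    const char *name;
    const struct expr *left, *right;
};

/*
 * Append the postfix notation of e to buf, a NUL-terminated string of
 * capacity cap. Every lexeme is followed by one blank.
 */
void postfix(const struct expr *e, char *buf, size_t cap)
{
    size_t used;

    if (e->op == 0) {
        /* Rule 1: a variable or constant is its own postfix notation. */
        used = strlen(buf);
        snprintf(buf + used, cap - used, "%s ", e->name);
    } else {
        /* Rule 2: postfix of E1, then postfix of E2, then the operator. */
        postfix(e->left, buf, cap);
        postfix(e->right, buf, cap);
        used = strlen(buf);
        snprintf(buf + used, cap - used, "%c ", e->op);
    }
    /* Rule 3: parentheses only shape the tree, so they need no case here. */
}
```

For the tree of `a+(b*c)`, with `+` at the root and `*` as its right operand, the function appends `a b c * +` followed by a blank.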
CODING:
#include <stdio.h>
#include <string.h>
#include <ctype.h>

/* Precedence of a binary operator; 0 for anything else, including '('. */
static int prec(char op)
{
    if (op == '*' || op == '/')
        return 2;
    if (op == '+' || op == '-')
        return 1;
    return 0;
}

int main(void)
{
    char infix[100], stack[100];
    int len, top = 0, i;        /* stack[1..top] holds operators; top == 0 means empty */

    printf("Enter infix = ");
    if (scanf("%99s", infix) != 1)
        return 1;
    printf("Postfix is = ");
    len = (int)strlen(infix);
    for (i = 0; i < len; i++) {
        char ch = infix[i];
        if (isalpha((unsigned char)ch)) {
            printf("%c", ch);                   /* operands go straight to the output */
        } else if (ch == '(') {
            stack[++top] = ch;
        } else if (ch == ')') {
            while (top > 0 && stack[top] != '(')
                printf("%c", stack[top--]);
            if (top > 0)
                top--;                          /* discard the '(' */
        } else if (prec(ch) > 0) {
            /* Pop operators of equal or higher precedence first. */
            while (top > 0 && prec(stack[top]) >= prec(ch))
                printf("%c", stack[top--]);
            stack[++top] = ch;
        }
    }
    while (top > 0)
        printf("%c", stack[top--]);
    printf("\n");
    return 0;
}
INPUT:
a+(b*c)
OUTPUT:
abc*+
Result and Discussion:
This program was written in C and successfully functions as an infix to postfix translator.