

(Estd. u/s 3 of UGC Act, 1956)


11150L66-Compiler Design Lab Manual III Year / VI Semester




1. Implement a lexical analyzer in C.
2. Implement a lexical analyzer in C.
3. Use LEX tool to implement a lexical analyzer.
4. Implement a recursive descent parser for an expression grammar that generates arithmetic expressions with digits, + and *.
5. Use YACC and LEX to implement a parser for the same grammar as given in the previous problem.
6. Write semantic rules to the YACC program and implement a calculator.
7. Implement the front end of a compiler that generates the three address code for a simple language.
8. Implement the front end of a compiler that generates the three address code for a simple language.
9. Implement the back end of the compiler that produces the assembly language instructions.












10. Implement the back end of the compiler which takes the three address code generated and produces the assembly language instructions.
11. Implementing a Shift reduce parser.



12. Sample LEX program


11150L66/Compiler Design Lab

III year/VI Sem


The lexical analyzer reads the source program character by character to produce tokens. Normally a lexical analyzer doesn't return a list of tokens in one attempt; it returns a token each time the parser asks for one.

The Role of the Lexical Analyzer

- Read input characters.
- Group them into lexemes.
- Produce as output a sequence of tokens, which in turn is used as input by the syntax analyzer.
- Interact with the symbol table: insert identifiers.
- Strip out comments and white space (blank, newline, tab and other separators).
- Correlate error messages generated by the compiler with the source program, by keeping track of the number of newlines seen and associating a line number with each error message.

Tokens, Patterns, Lexemes

Token: a pair consisting of
- a token name, an abstract symbol representing a kind of lexical unit (e.g. keyword, identifier), and
- an optional attribute value.

Pattern: a description of the form that the lexemes of a token may take. For a keyword, the pattern is the character sequence forming that keyword. For identifiers, the pattern is a more complex structure that is matched by many strings.

Lexeme: a sequence of characters in the source program matching the pattern for a token.


Prepared by R.Bhavani, AP/CSE, PRIST University.


Shift-Reduce Parser:
There are four possible actions of a shift-reduce parser:
- Shift: The next input symbol is shifted onto the top of the stack.
- Reduce: Replace the handle on the top of the stack by the non-terminal.
- Accept: Successful completion of parsing.
- Error: Parser discovers a syntax error, and calls an error recovery routine.

The initial stack contains only the end-marker $. The end of the input string is also marked by the end-marker $.

A Stack Implementation of A Shift-Reduce Parser

Stack       Input        Action
$           id+id*id$    shift
$id         +id*id$      reduce by F -> id
$F          +id*id$      reduce by T -> F
$T          +id*id$      reduce by E -> T
$E          +id*id$      shift
$E+         id*id$       shift
$E+id       *id$         reduce by F -> id
$E+F        *id$         reduce by T -> F
$E+T        *id$         shift
$E+T*       id$          shift
$E+T*id     $            reduce by F -> id
$E+T*F      $            reduce by T -> T*F
$E+T        $            reduce by E -> E+T
$E          $            accept

Recursive-Descent Parsing (uses Backtracking)

Recursive-descent parsing tries to find the left-most derivation of the input. Backtracking is needed: if a choice of a production rule does not work, the parser backtracks to try other alternatives. It is a general parsing technique, but it is not widely used because it is not efficient.


Operator Precedence Parsing: Form of Shift/Reduce parsing

Two important properties of these shift-reduce parsers are that ε (epsilon) does not appear on the right side of any production, and that no production has two adjacent non-terminals (NT).

E -> E + E
T -> + T T    // wrong production, because it has 2 adjacent NT

These restrictions allow us to find handles.

Precedence: We need to define three different precedence relations between pairs of terminals. They are written <., =. and >., and they are positioned by using precedence rules.

Relation    Meaning
a <. b      a yields precedence to b
a =. b      a has the same precedence as b
a >. b      a takes precedence over b


In this section, we introduce a tool called Lex, or in a more recent implementation Flex, that allows one to specify a lexical analyzer by specifying regular expressions to describe patterns for tokens. The input notation for the Lex tool is referred to as the Lex language and the tool itself is the Lex compiler. Behind the scenes, the Lex compiler transforms the input patterns into a transition diagram and generates code, in a file called lex.yy.c, that simulates this transition diagram. The mechanics of how this translation from regular expressions to transition diagrams occurs is the subject of the next sections; here we only learn the Lex language.

Use of Lex: An input file lex1 is written in the Lex language and describes the lexical analyzer to be generated. The Lex compiler transforms lex1 to a C program in a file that is always named lex.yy.c.


STRUCTURE OF A LEX PROGRAM:

A Lex program has the following form:

declarations
%%
translation rules
%%
auxiliary functions

The declarations section includes declarations of variables, manifest constants (identifiers declared to stand for a constant, e.g., the name of a token), and regular definitions.

The translation rules each have the form

Pattern { Action }

Each pattern is a regular expression, which may use the regular definitions of the declarations section. The actions are fragments of code, typically written in C.

The third section holds whatever additional functions are used in the actions. Alternatively, these functions can be compiled separately and loaded with the lexical analyzer.

The lexical analyzer created by Lex behaves as follows:
1. When called by the parser, the lexical analyzer begins reading its remaining input, one character at a time, until it finds the longest prefix of the input that matches one of the patterns Pi.
2. It then executes the associated action Ai. Typically, Ai will return to the parser, but if it does not (e.g., because Pi describes whitespace or comments), then the lexical analyzer proceeds to find additional lexemes, until one of the corresponding actions causes a return to the parser.
3. The lexical analyzer returns a single value, the token name, to the parser, but uses the shared, integer variable yylval to pass additional information about the lexeme found, if needed.

LEX PROGRAM FOR TOKEN:
%{
    /* definitions of manifest constants
       LT, LE, EQ, NE, GT, GE,
       IF, THEN, ELSE, ID, NUMBER, RELOP */
%}

/* regular definitions */
delim     [ \t\n]
ws        {delim}+
letter    [A-Za-z]
digit     [0-9]
id        {letter}({letter}|{digit})*
number    {digit}+(\.{digit}+)?(E[+-]?{digit}+)?

%%

{ws}      {/* no action and no return */}
if        {return(IF);}
then      {return(THEN);}
else      {return(ELSE);}
{id}      {yylval = (int) installID(); return(ID);}
{number}  {yylval = (int) installNum(); return(NUMBER);}
"<"       {yylval = LT; return(RELOP);}
"<="      {yylval = LE; return(RELOP);}
"="       {yylval = EQ; return(RELOP);}
"<>"      {yylval = NE; return(RELOP);}
">"       {yylval = GT; return(RELOP);}
">="      {yylval = GE; return(RELOP);}

%%

int installID() {/* function to install the lexeme, whose first character
                    is pointed to by yytext, and whose length is yyleng,
                    into the symbol table and return a pointer thereto */
}

int installNum() {/* similar to installID, but puts numerical constants
                     into a separate table */
}

In the declarations section we see a pair of special brackets, %{ and %}. Anything within these brackets is copied directly to the file lex.yy.c, and is not treated as a regular definition. The manifest constants are placed inside them. Also in the declarations section is a sequence of regular definitions. Regular definitions that are used in later definitions or in the patterns of the translation rules are surrounded by curly braces. Thus, for instance, delim is defined to be a shorthand for the character class consisting of the blank, the tab, and the newline; the latter two are represented, as in all UNIX commands, by backslash followed by t or n, respectively.

In the auxiliary-function section, we see two such functions, installID() and installNum(). Like the portion of the declarations section that appears between %{ and %}, everything in the auxiliary section is copied directly to file lex.yy.c, but may be used in the actions.

First, ws, an identifier declared in the first section, has an associated empty action. If we find whitespace, we do not return to the parser, but look for another lexeme. The second token has the simple regular expression pattern if. Should we see the two letters if on the input, and they are not followed by another letter or digit (which would cause the lexical analyzer to find a longer prefix of the input matching the pattern for id), then the lexical analyzer consumes these two letters from the input and returns the token name IF, that is, the integer for which the manifest constant IF stands. Keywords then and else are treated similarly. The fifth token has the pattern defined by id. Note that, although keywords like if match this pattern as well as an earlier pattern, Lex chooses whichever pattern is listed first in situations where the longest matching prefix matches two or more patterns.

The action taken when id is matched is as follows:
1. Function installID() is called to place the lexeme found in the symbol table.
2. This function returns a pointer to the symbol table, which is placed in global variable yylval, where it can be used by the parser or a later component of the compiler. Note that installID() has available to it two variables that are set automatically by the lexical analyzer that Lex generates:
   (a) yytext is a pointer to the beginning of the lexeme.
   (b) yyleng is the length of the lexeme found.
3. The token name ID is returned to the parser.

Lex and yacc:

This section contains example programs for the lex and yacc commands. Together, these example programs create a simple, desk-calculator program that performs addition, subtraction, multiplication, and division operations. This calculator program also allows you to assign values to variables (each designated by a single, lowercase letter) and then use the variables in calculations. The files that contain the example lex and yacc programs are as follows:

calc.lex     Specifies the lex command specification file, which defines the lexical analysis rules.
calc.yacc    Specifies the yacc command grammar file, which defines the parsing rules, and calls the yylex subroutine created by the lex command to provide input.

The following descriptions assume that the calc.lex and calc.yacc example programs are located in your current directory.

Compiling the Example Program

To create the desk calculator example program, do the following:

1. Process the yacc grammar file using the -d optional flag (which tells the yacc command to create a file that defines the tokens used, in addition to the C language source code):
   yacc -d calc.yacc
2. Use the ls command to verify that the following files were created:
   y.tab.c     The C language source file that the yacc command created for the parser
   y.tab.h     A header file containing define statements for the tokens used by the parser
3. Process the lex specification file:
   lex calc.lex
4. Use the ls command to verify that the following file was created:
   lex.yy.c    The C language source file that the lex command created for the lexical analyzer
5. Compile and link the two C language source files:
   cc y.tab.c lex.yy.c
6. Use the ls command to verify that the following files were created:
   y.tab.o     The object file for the y.tab.c source file
   lex.yy.o    The object file for the lex.yy.c source file
   a.out       The executable program file

To run the program directly from the a.out file, type:
$ ./a.out

To move the program to a file with a more descriptive name, as in the following example, and run it, type:
$ mv a.out calculate
$ ./calculate

The yacc grammar file contains the following sections:

Declarations Section. This section contains entries that:
- Include standard I/O header file
- Define global variables
- Define the list rule as the place to start processing
- Define the tokens used by the parser
- Define the operators and their precedence

Rules Section. The rules section defines the rules that parse the input stream.

%start - Specifies the start symbol, which the whole input should match (list in calc.yacc).
%union - By default, the values returned by actions and the lexical analyzer are integers. yacc can also support values of other types, including structures. In addition, yacc keeps track of the types, and inserts appropriate union member names so that the resulting parser will be strictly type checked. The yacc value stack is declared to be a union of the various types of values desired. The user declares the union, and associates union member names to each token and nonterminal symbol having a value. When the value is referenced through a $$ or $n construction, yacc will automatically insert the appropriate union name, so that no unwanted conversions will take place.
%type - Makes use of the members of the %union declaration and gives an individual type for the values associated with each part of the grammar.
%token - Lists the tokens which come from the lex tool, with their type.

Programs Section. The programs section contains the following subroutines. Because these subroutines are included in this file, you do not need to use the yacc library when processing this file.

main          The required main program that calls the yyparse subroutine to start the program.
yyerror(s)    This error-handling subroutine only prints a syntax error message.
yywrap        The wrap-up subroutine that returns a value of 1 when the end of input occurs.

Front end of compiler:

The front end analyzes the source code to build an internal representation of the program, called the intermediate representation or IR. It also manages the symbol table, a data structure mapping each symbol in the source code to associated information such as location, type and scope. This is done over several phases, which include some of the following:

1. Lexical analysis breaks the source code text into small pieces called tokens. Each token is a single atomic unit of the language, for instance a keyword, identifier or symbol name. The token syntax is typically a regular language, so a finite state automaton constructed from a regular expression can be used to recognize it. This phase is also called lexing or scanning, and the software doing lexical analysis is called a lexical analyzer or scanner.
2. Preprocessing: some languages, like C, require a preprocessing phase which supports macro substitution and conditional compilation. Typically the preprocessing phase occurs before syntactic or semantic analysis; e.g. in the case of C, the preprocessor manipulates lexical tokens rather than syntactic forms. However, some languages such as Scheme support macro substitutions based on syntactic forms.
3. Syntax analysis involves parsing the token sequence to identify the syntactic structure of the program. This phase typically builds a parse tree, which replaces the linear sequence of tokens with a tree structure built according to the rules of a formal grammar which define the language's syntax. The parse tree is often analyzed, augmented, and transformed by later phases in the compiler.
4. Semantic analysis is the phase in which the compiler adds semantic information to the parse tree and builds the symbol table. This phase performs semantic checks such as type checking (checking for type errors), object binding (associating variable and function references with their definitions), or definite assignment.

Back end of compiler:

The term back end is sometimes confused with code generator because of the overlapped functionality of generating assembly code. Some literature uses middle end to distinguish the generic analysis and optimization phases in the back end from the machine-dependent code generators. The main phases of the back end include the following:

1. Analysis: This is the gathering of program information from the intermediate representation derived from the input. Typical analyses are data flow analysis to build use-define chains, dependence analysis, alias analysis, pointer analysis, escape analysis etc. Accurate analysis is the basis for any compiler optimization. The call graph and control flow graph are usually also built during the analysis phase.
2. Optimization: the intermediate language representation is transformed into functionally equivalent but faster (or smaller) forms. Popular optimizations are inline expansion, dead code elimination, constant propagation, loop transformation, register allocation and even automatic parallelization.
3. Code generation: the transformed intermediate language is translated into the output language, usually the native machine language of the system. This involves resource and storage decisions, such as deciding which variables to fit into registers and memory, and the selection and scheduling of appropriate machine instructions along with their associated addressing modes.



EX: NO: 1



To write a C program to implement lexical analysis for separating tokens.


Start the program.
Read the input statement from the keyboard.
Use the tokenseperation() function to analyze the input program and store the identifiers, keywords, operators and punctuation in the respective arrays.
Store each token in its corresponding array, keeping track of its line number and its occurrences.
Use the printtokens() function to print the stored tokens from the arrays.
Stop the program.



PROGRAM: LEXICAL ANALYSER USING C

/* File name: lex.c */
#include <stdio.h>
#include <ctype.h>
#include <string.h>

char str[100];
char symboltable[25][25];
char attributetable[25][25];
int firstindex = 0;

void tokenseperation(void);
void printtokens(void);

int main(void)
{
    printf("\n enter the source program \n");
    if (fgets(str, sizeof str, stdin) == NULL)
        return 1;
    str[strcspn(str, "\n")] = '\0';     /* drop the trailing newline */
    tokenseperation();
    printtokens();
    return 0;
}

/* Store one token and its attribute in the two tables. */
void store(const char *token, const char *attribute)
{
    if (firstindex < 25) {
        strcpy(symboltable[firstindex], token);
        strcpy(attributetable[firstindex++], attribute);
    }
}

void tokenseperation(void)
{
    int i = 0, j, k, len, keyword;
    char *keywords[] = {"if","else","while","void","switch","int","main","case"};
    char *operators[] = {"<",">","=","+","-","*","/"};
    char punctuation[] = "{}[];:()";
    char token[25];

    len = strlen(str);
    while (i < len) {
        j = 0;
        if (isalpha((unsigned char)str[i])) {       /* keyword or identifier */
            while (isalnum((unsigned char)str[i]) && j < 24)
                token[j++] = str[i++];
            token[j] = '\0';
            keyword = 0;
            for (k = 0; k < 8; k++)
                if (strcmp(keywords[k], token) == 0) {
                    keyword = 1;
                    break;
                }
            store(token, keyword ? "key" : "var");
        }
        j = 0;
        if (str[i] == '"') {                        /* string literal */
            while (str[++i] != '"' && str[i] != '\0' && j < 24)
                token[j++] = str[i];
            token[j] = '\0';
            store(token, "l");
        }
        j = 0;
        if (isdigit((unsigned char)str[i])) {       /* numeric constant */
            while ((isdigit((unsigned char)str[i]) || str[i] == '.') && j < 24)
                token[j++] = str[i++];
            token[j] = '\0';
            store(token, "c");
        }
        token[0] = str[i];                          /* single-character token */
        token[1] = '\0';
        for (k = 0; k < 7; k++)
            if (strcmp(operators[k], token) == 0) {
                store(token, "operator");
                break;
            }
        if (str[i] != '\0' && strchr(punctuation, str[i]) != NULL)
            store(token, "p");
        i++;
    }
}

void printtokens(void)
{
    int i;
    for (i = 0; i < firstindex; i++)
        printf("\n%s\t%s\n", symboltable[i], attributetable[i]);
}

OUTPUT:

Enter the source program
a+b=c;

a       var
+       operator
b       var
=       operator
c       var
;       p

Thus the program has been executed and the separated tokens are printed.





To write a program for dividing the given input program into lexemes.

Start the program.
Declare the file pointer and necessary variables.
Open the input file in read mode.
Use the string comparison functions to check whether the current input string contains a punctuation mark, keyword, operator or identifier.
Print the tokens which are found.
Stop the program.




/* File name: lexical.c */
#include <stdio.h>
#include <string.h>

int main(void)
{
    int i;
    char s[120], r[100];
    char par[6] = {'(',')','{','}','[',']'};
    char sym[9] = {'.',';',':',',','<','>','?','$','#'};
    char key[11][10] = {"main","if","else","switch","void","do","while",
                        "for","return","include","stdio"};
    char dat[4][10] = {"int","float","char","double"};
    char opr[5] = {'*','+','-','/','^'};
    FILE *fp;

    printf("\n\n\t enter the file name");
    if (scanf("%119s", s) != 1)
        return 1;
    fp = fopen(s, "r");
    if (fp == NULL) {
        printf("\n could not open %s\n", s);
        return 1;
    }
    /* read the file one white-space separated word at a time */
    while (fscanf(fp, "%99s", r) == 1) {
        for (i = 0; i < 6; i++)
            if (strchr(r, par[i]) != NULL)
                printf("\n parenthesis :%c", par[i]);
        for (i = 0; i < 9; i++)
            if (strchr(r, sym[i]) != NULL)
                printf("\n symbol :%c", sym[i]);
        for (i = 0; i < 11; i++)
            if (strstr(r, key[i]) != NULL)
                printf("\n keyword :%s", key[i]);
        for (i = 0; i < 4; i++)
            if (strstr(r, dat[i]) != NULL && strstr(r, "printf") == NULL) {
                printf("\n data type :%s", dat[i]);
                /* the word after a data type is taken as the identifier */
                if (fscanf(fp, "%99s", r) == 1)
                    printf("\n identifiers :%s", r);
            }
        for (i = 0; i < 5; i++)
            if (strchr(r, opr[i]) != NULL)
                printf("\n operator :%c", opr[i]);
    }
    fclose(fp);
    return 0;
}

INPUT FILE: sample.c

#include <stdio.h>
main()
{
}

OUTPUT:

enter the file name: sample.c
keyword : include
punctuation: <
keyword: stdio
punctuation: .
punctuation: >
keyword: main
punctuation: (
punctuation: )
punctuation: {
punctuation: }

Thus the program has been executed and the tokens are separated.



EX: NO: 3



To implement the lexical analyzer using the LEX tool, for a subset of the C language.


Start the program.
Declare the necessary variables and define the token patterns using regular expressions.
Print the preprocessor directives and keywords found by analysing the input program.
Check whether a file name was passed on the command line (argument counter and argument vector).
Open the input file in read mode.
Read the file and match each token in the source program against the regular expressions.
Print the tokens identified by the yylex() function.
Stop the program.




/* File name is lexp.l */
%{
#include <stdio.h>
#include <stdlib.h>
int COMMENT=0;
%}
identifier [a-zA-Z][a-zA-Z0-9]*
%%
#.* { printf("\n%s is a PREPROCESSOR DIRECTIVE",yytext);}
int |
float |
char |
double |
while |
for |
do |
if |
break |
continue |
void |
switch |
case |
long |
struct |
const |
typedef |
return |
else |
goto {printf("\n\t%s is a KEYWORD",yytext);}
"/*" {COMMENT = 1;}
"*/" {COMMENT = 0;}
{identifier}\( {if(!COMMENT)printf("\n\nFUNCTION\n\t%s",yytext);}
\{ {if(!COMMENT) printf("\n BLOCK BEGINS");}
\} {if(!COMMENT) printf("\n BLOCK ENDS");}
{identifier}(\[[0-9]*\])? {if(!COMMENT) printf("\n %s is an IDENTIFIER",yytext);}
\".*\" {if(!COMMENT) printf("\n\t%s is a STRING",yytext);}
[0-9]+ {if(!COMMENT) printf("\n\t%s is a NUMBER",yytext);}
\)(\;)? {if(!COMMENT) printf("\n\t");ECHO;printf("\n");}
\( ECHO;
= {if(!COMMENT)printf("\n\t%s is an ASSIGNMENT OPERATOR",yytext);}
\<= |
\>= |
\< |
== |
\> {if(!COMMENT) printf("\n\t%s is a RELATIONAL OPERATOR",yytext);}
%%
int main(int argc,char **argv)
{
    if (argc > 1)
    {
        FILE *file;
        file = fopen(argv[1],"r");
        if(!file)
        {
            printf("could not open %s \n",argv[1]);
            exit(0);
        }
        yyin = file;
    }
    yylex();
    printf("\n\n");
    return 0;
}
int yywrap()
{
    return 1;   /* 1 tells the scanner that there is no more input */
}

INPUT:

$ vi var.c

#include<stdio.h>
main()
{
int a,b;
}




$ lex lexp.l
$ cc lex.yy.c
$ ./a.out var.c

#include<stdio.h> is a PREPROCESSOR DIRECTIVE
FUNCTION
        main (
        )
BLOCK BEGINS
        int is a KEYWORD
a IDENTIFIER
b IDENTIFIER
BLOCK ENDS

Thus the Lexical Analyzer was implemented using LEX TOOL for a subset of C language.




To implement recursive descent parsing.


Start the program.
Read the expression from the user into ipsym and call the e() function.
In e() and eprime(), check whether the look-ahead symbol is +; if it is, parse another term, otherwise apply E' --> e.
In t() and tprime(), check whether the look-ahead symbol is *; if it is, parse another factor, otherwise apply T' --> e.
In f(), check whether the look-ahead symbol is an identifier (i) or starts a parenthesized expression; otherwise report a syntax error.
In advance(), advance the input pointer to the next position of the input string.
In main(), check whether the input matches the given CFG; if it doesn't match, report a syntax error.



PROGRAM: RECURSIVE DESCENT PARSING

/* File name: recur.c */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char ipsym[15];
int ipptr = 0;

void e(void);
void eprime(void);
void t(void);
void tprime(void);
void f(void);
void advance(void);

void e(void)
{
    printf("\n \t \t E-->TE'");
    t();
    eprime();
}

void eprime(void)
{
    if (ipsym[ipptr] == '+') {
        printf("\n \t \t E'-->+TE'");
        advance();
        t();
        eprime();
    } else
        printf("\n \t \t E'-->e");
}

void t(void)
{
    printf("\n \t \t T-->FT'");
    f();
    tprime();
}

void tprime(void)
{
    if (ipsym[ipptr] == '*') {
        printf("\n \t \t T'-->*FT'");
        advance();
        f();
        tprime();
    } else
        printf("\n \t \t T'-->e");
}

void f(void)
{
    if ((ipsym[ipptr] == 'i') || (ipsym[ipptr] == 'I')) {
        printf("\n\t\tF-->i");
        advance();
    } else if (ipsym[ipptr] == '(') {
        advance();
        e();
        if (ipsym[ipptr] == ')') {
            advance();
            printf("\n\t\tF-->(E)");
        } else {                    /* missing closing parenthesis */
            printf("\n\t syntax error");
            exit(1);
        }
    } else {
        printf("\n\t syntax error");
        exit(1);
    }
}

void advance(void)
{
    ipptr++;
}

int main(void)
{
    printf("\n\t\tINPUT");
    printf("\n\t\tGrammar without left recursion");
    printf("\n\t\tE-->TE'\n\t\tE'-->+TE'|e\n\t\tT-->FT'");
    printf("\n\t\tT'-->*FT'|e\n\t\tF-->(E)|i");
    printf("\nENTER THE IP EXPRESSION");
    if (fgets(ipsym, sizeof ipsym, stdin) == NULL)
        return 1;
    ipsym[strcspn(ipsym, "\n")] = '\0';     /* drop the trailing newline */
    printf("\n\t\toutput");
    printf("\n sequence of production rules");
    e();
    if (ipsym[ipptr] != '\0')               /* input left over after parsing E */
        printf("\n syntax error");
    return 0;
}

OUTPUT:

INPUT
Grammar without left recursion
E-->TE'
E'-->+TE'|e
T-->FT'
T'-->*FT'|e
F-->(E)|i
ENTER THE IP EXPRESSION i+i
output
sequence of production rules
E-->TE'
T-->FT'
F-->i
T'-->e
E'-->+TE'
T-->FT'
F-->i
T'-->e
E'-->e

Thus the recursive descent parser was implemented and the input strings were parsed.




Using LEX and YACC to implement lexical analyzer

To write a program to implement the lexical analyzer using the LEX and YACC tools.

Start the program.
Open the file seven.c in read mode and call yylex() to scan the input.
Define the patterns for alphabets, numbers and identifiers.
Print the preprocessor directives, functions and keywords using yytext.
Print the relational, assignment and all other operators using yytext.
Also scan and print where a block begins and ends.
Stop the program.



PROGRAM: IMPLEMENTATION OF LEXICAL ANALYSER USING LEX & YACC

/* File name is lexical.l */
%{
#include <stdio.h>
#include <stdlib.h>
%}
%%
#.* {printf("\n %s is a PREPROCESSOR DIRECTIVE",yytext);}
int |
float |
char |
double |
while |
for |
do |
if |
break |
continue |
void |
switch |
case |
long |
struct |
scanf |
printf |
const |
typedef |
return |
else |
goto {printf("\n\t %s is a KEYWORD",yytext);}
\< |
\> |
\<= |
\>= |
\== |
\!= {printf("\n\t %s is RELATIONAL OPERATOR",yytext); }
\= {printf("\n\t %s is ASSIGNMENT OPERATOR",yytext);}
\+ |
\- |
\* |
\/ |
\% {printf("\n\t %s is ARITHMETIC OPERATOR",yytext);}
\".*\" {printf("\n\t %s is a string",yytext);}
[0-9]+ {printf("\n\t %s is a NUMBER",yytext);}
[a-zA-Z][a-zA-Z0-9]* {printf("\n\t %s is a IDENTIFIER",yytext);}
\{ {printf("\n BLOCK BEGINS");}
\} {printf("\n BLOCK ENDS");}
\/\*.*\*\/ {printf("\n %s is COMMENT",yytext);}
[ \t]+ ;
%%
int main(int argc,char **argv)
{
    if(argc>1)
    {
        FILE *f;
        f=fopen(argv[1],"r");
        if(!f)
        {
            printf("could not open %s\n",argv[1]);
            exit(0);
        }
        yyin=f;
    }
    yylex();
    return 0;
}

INPUT:

#include<stdio.h>
void main()
{
int a,b,c;
printf("enter the a ");
scanf("%d%d",&a,&b);
c=a+b;
getch();
}



OUTPUT:

$ lex lexical.l
$ cc lex.yy.c -ll
$ ./a.out seven.c

#include<stdio.h> is a PREPROCESSOR DIRECTIVE
void is a KEYWORD
main is a IDENTIFIER
BLOCK BEGINS
int is a KEYWORD
a,b,c are IDENTIFIER
printf is a KEYWORD
scanf is a KEYWORD
BLOCK ENDS

Thus the program for implementation of a lexical analyzer using LEX and YACC tools was successfully done.





To write a LEX and YACC program for implementing a calculator.

Start the program.
In the LEX program, declare the identifiers for log, cos, sin, tan and memory.
Identify each identifier and return its token to the parser.
In the YACC program, declare the possible symbol types, which are the tokens returned by LEX.
Define precedence and associativity.
Define the rules of the CFG for each non-terminal.
In main(), get the expression from the user and print the output.
Stop the program.



PROGRAM:

/* File name: calc.yacc */
%{
#include <stdio.h>

int regs[26];
int base;

int yylex(void);
int yyerror(const char *s);
%}

%start list

%token DIGIT LETTER

%left '|'
%left '&'
%left '+' '-'
%left '*' '/' '%'
%left UMINUS  /* supplies precedence for unary minus */

%%  /* beginning of rules section */

list:   /* empty */
      | list stat '\n'
      | list error '\n'          { yyerrok; }
      ;

stat:   expr                     { printf("%d\n",$1); }
      | LETTER '=' expr          { regs[$1] = $3; }
      ;

expr:   '(' expr ')'             { $$ = $2; }
      | expr '*' expr            { $$ = $1 * $3; }
      | expr '/' expr            { $$ = $1 / $3; }
      | expr '%' expr            { $$ = $1 % $3; }
      | expr '+' expr            { $$ = $1 + $3; }
      | expr '-' expr            { $$ = $1 - $3; }
      | expr '&' expr            { $$ = $1 & $3; }
      | expr '|' expr            { $$ = $1 | $3; }
      | '-' expr %prec UMINUS    { $$ = -$2; }
      | LETTER                   { $$ = regs[$1]; }
      | number
      ;

number: DIGIT                    { $$ = $1; base = ($1==0) ? 8 : 10; }
      | number DIGIT             { $$ = base * $1 + $2; }
      ;

%%

int main(void)
{
    return(yyparse());
}

int yyerror(const char *s)
{
    fprintf(stderr, "%s\n",s);
    return 0;
}

int yywrap(void)
{
    return(1);
}

PROGRAM:
// File name: calc.lex

%{
#include <stdio.h>
#include ""
int c;
extern int yylval;
%}
%%
" "          ;
[a-z]        { c = yytext[0]; yylval = c - 'a'; return(LETTER); }
[0-9]        { c = yytext[0]; yylval = c - '0'; return(DIGIT); }
[^a-z0-9\b]  { c = yytext[0]; return(c); }

OUTPUT:
$ lex calc.lex
$ yacc -d calc.yacc
$ cc y.tab.c lex.yy.c
$ mv a.out calculate
$ ./calculate
m=45 <press enter>
m+10 <press enter>
55

RESULT: Thus the program for implementing calculator using LEX and YACC tools was successfully done.
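The `number` rule in calc.yacc builds a value one digit at a time and treats a leading 0 as an octal prefix. The same arithmetic can be sketched in plain C as follows. The helper name `parse_number` is hypothetical and is not part of the lab program; it only mirrors the two semantic actions of the rule.

```c
#include <ctype.h>

/* Hypothetical helper (not in the manual): mirrors the semantic actions of
   the `number` rule. The first digit selects the base (a leading 0 means
   octal, anything else decimal); every later digit is folded in as
   value = base * value + digit. Returns -1 if the text has no leading digit. */
int parse_number(const char *digits)
{
    int base, value, i;

    if (!isdigit((unsigned char)digits[0]))
        return -1;

    value = digits[0] - '0';              /* number: DIGIT */
    base = (value == 0) ? 8 : 10;

    for (i = 1; isdigit((unsigned char)digits[i]); i++)
        value = base * value + (digits[i] - '0');   /* number: number DIGIT */

    return value;
}
```

Like the grammar, this sketch does not reject the digits 8 and 9 inside an octal number; it simply applies the same formula to them.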



EX NO:7 & 8:



To write a C program that implements the front end of a compiler.

STEP 1: The input is a normal C program.
STEP 2: It is given as a text file.
STEP 3: The front end of the compiler has the task of converting the source program into intermediate code.
STEP 4: The intermediate code may be a syntax tree, three-address code or postfix notation.
STEP 5: The backtracking process is involved here to produce the intermediate code.
STEP 6: Thus the intermediate code is generated for the given input.
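As an illustration of STEP 3 and STEP 4, the sketch below turns one assignment of the exact shape `a=b+c;` into two three-address quadruples: one that computes the right-hand side into a temporary and one that copies the temporary into the target. The function name `emit_assignment` and its parameters are assumptions made for this example; the statement shape (single-letter operands, one binary operator) is also an assumption.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not in the manual): translates one statement of the
   shape "a=b+c;" into two quadruples written to out.
   *pos is the index of the last quadruple emitted so far and *tem the last
   temporary used; both are advanced. Returns the number of characters
   written, or -1 if the statement does not have the expected shape. */
int emit_assignment(const char *stmt, int *pos, int *tem, char *out, size_t size)
{
    int first, second, t;

    if (strlen(stmt) < 5 || stmt[1] != '=')
        return -1;

    first = ++*pos;      /* quadruple that computes the right-hand side */
    second = ++*pos;     /* quadruple that stores the temporary */
    t = ++*tem;          /* fresh temporary t0, t1, ... */

    return snprintf(out, size,
                    "%d\t%c\t%c\t%c\tt%d\n"
                    "%d\t:=\tt%d\t\t%c\n",
                    first, stmt[3], stmt[2], stmt[4], t,
                    second, t, stmt[0]);
}
```

Starting with `pos = -1` and `tem = -1`, the statement `a=b+c;` yields quadruple 0 as `+ b c t0` and quadruple 1 as `:= t0 a`, in the column order pos, oper, arg1, arg2, result.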



PROGRAM: IMPLEMENTATION OF FRONT END OF COMPILER
//file name is front.c
#include<stdio.h>
#include<conio.h>
#include<string.h>
void main()
{
  char pg[100][100],str1[24];
  int tem=-1,ct=0,i=-1,j=0,j1,pos=-1,t=-1,flag,flag1,tt=0,fg=0;
  clrscr();
  printf("Enter the codings \n");
  /* read whitespace-separated tokens up to and including "getch();" */
  while(i<99 && scanf("%99s",pg[i+1])==1)
  {
    i++;
    if((strcmp(pg[i],"getch();"))==0)
      break;
  }
  t=i+1; /* number of tokens read */
  printf("\n pos \t oper \t arg1 \t arg2 \tresult \n");
  while(j<t)
  {
    lop: ct=0;
    if(pg[j][1]=='=')
    {
      /* assignment such as a=b+c; */
      pos++;
      tem++;
      printf("%d\t%c\t%c\t%c\tt%d\n",pos,pg[j][3],pg[j][2],pg[j][4],tem);
      pos++;
      printf("%d\t:=\tt%d\t\t%c\n",pos,tem,pg[j][0]);
    }
    else if(((strcmp(pg[j],"if"))==0)||((strcmp(pg[j],"while"))==0))
    {
      if((strcmp(pg[j],"if"))==0)
        strcpy(str1,"if");
      if((strcmp(pg[j],"while"))==0)
        strcpy(str1,"while");
      j++;
      j1=j;
      tem++;
      pos++;
      /* condition such as (a<b) or (a<=b) */
      if(pg[j][3]=='=')
        printf("%d\t%c\t%c\t%c\tt%d\n",pos,pg[j][2],pg[j][1],pg[j][4],tem);
      else
        printf("%d\t%c\t%c\t%c\tt%d\n",pos,pg[j][2],pg[j][1],pg[j][3],tem);
      j1+=2;
      pos++;
      while((strcmp(pg[j],"}"))!=0)
      {
        j++;
        if(pg[j][1]=='=')
        {
          tt=j;
          fg=1;
        }
        ct++;
      }
      ct=ct+pos+1;
      printf("%d\t==\tt%d\tFALSE\t%d\n",pos,tem,ct);
      if(fg==1)
      {
        j=tt;
        goto lop;
      }
      while((strcmp(pg[j],"}"))!=0)
      {
        pos++;
        tem++;
        printf("%d\t%c\t%c\t%c\tt%d\n",pos,pg[j][3],pg[j][2],pg[j][4],tem);
        pos++;
        printf("%d\t:=\tt%d\t\t%c\n",pos,tem,pg[j][0]);
        j++;
      }
      if((strcmp(pg[j+1],"else"))==0)
      {
        ct=0;
        j++;
        j1=j;
        j1+=2;
        pos++;
        /* count the statements of the else block */
        while((strcmp(pg[j1],"}"))!=0)
        {
          j1++;
          ct++;
        }
        ct=ct*2;
        ct++;
        ct+=(pos+1);
        j+=2;
        printf("%d\tGOTO\t\t\t%d\n",pos,ct);
        while((strcmp(pg[j],"}"))!=0)
        {
          pos++;
          tem++;
          printf("%d\t%c\t%c\t%c\tt%d\n",pos,pg[j][3],pg[j][2],pg[j][4],tem);
          pos++;
          printf("%d\t:=\tt%d\t\t%c\n",pos,tem,pg[j][0]);
          j++;
        }
        pos++;
        printf("%d\tGOTO\t\t\t%d\n",pos,ct);
      }
    }
    if((strcmp(pg[j],"}"))==0)
    {
      pos++;
      printf("%d\tGOTO\t\t\t%d\n",pos,pos+1);
    }
    j++;
  }
  getch();
}


Thus the program has been executed and the front end of the compiler has been implemented.



EX NO:9&10 DATE:


To write a C program that implements the back end of a compiler.


The input for the back end of the compiler is the intermediate code generated by the front end of the compiler.
The input file (IN.TXT) is opened in read mode.
The output file (TARGET.TXT) is created by the program in write mode.
Each intermediate-code instruction in the input file is converted to its equivalent target code by the back end of the compiler.
The output is stored in the TARGET.TXT file in the form of assembly language.
Stop the program.
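The conversion step described above can be sketched for a single binary instruction of the form `op arg1 arg2 result`. The function `emit_binary` below is a hypothetical helper, not the manual's program; it assumes the same two-register target code (R0, R1) and the MOV/ADD/SUB/MUL/DIV mnemonics that appear in this exercise.

```c
#include <stdio.h>

/* Hypothetical helper (not in the manual): writes the target code for one
   binary three-address instruction into out. Returns the number of
   characters written, or -1 for an operator this sketch does not handle. */
int emit_binary(char op, const char *arg1, const char *arg2,
                const char *result, char *out, size_t size)
{
    const char *mnemonic;

    switch (op) {
    case '+': mnemonic = "ADD"; break;
    case '-': mnemonic = "SUB"; break;
    case '*': mnemonic = "MUL"; break;
    case '/': mnemonic = "DIV"; break;
    default:  return -1;
    }

    /* load both operands, operate on the registers, store the result */
    return snprintf(out, size,
                    "MOV R0,%s\nMOV R1,%s\n%s R0 R1\nMOV %s,R0\n",
                    arg1, arg2, mnemonic, result);
}
```

For the instruction `* x y t1`, a call such as `emit_binary('*', "x", "y", "t1", buf, sizeof buf)` fills `buf` with the four lines MOV R0,x / MOV R1,y / MUL R0 R1 / MOV t1,R0.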



PROGRAM: IMPLEMENTATION OF BACK END OF COMPILER
//file name is back.c
#include<stdio.h>
#include<conio.h>
#include<stdlib.h>
#include<string.h>
int label[20];
int no=0;
int main()
{
  FILE *fp1,*fp2;
  int check_label(int n);
  char fname[10],op[10];
  char operand1[8],operand2[8],result[8];
  int i=0,ch;
  clrscr();
  printf("\n\nEnter filename of the intermediate code:");
  scanf("%9s",fname);
  fp1=fopen(fname,"r");
  fp2=fopen("target.txt","w");
  if(fp1==NULL||fp2==NULL)
  {
    printf("\nError Opening the File.");
    getch();
    exit(0);
  }
  /* read one operator at a time until the input is exhausted */
  while(fscanf(fp1,"%9s",op)==1)
  {
    fprintf(fp2,"\n");
    i++;
    if(check_label(i))
    {
      fprintf(fp2,"\nlabel#%d:",i);
    }
    if(strcmp(op,"print")==0)
    {
      fscanf(fp1,"%s",result);
      fprintf(fp2,"\n\tOUT %s",result);
    }
    if(strcmp(op,"goto")==0)
    {
      fscanf(fp1,"%s",operand2);
      fprintf(fp2,"\n\t JMP label#%s",operand2);
      label[no++]=atoi(operand2);
    }
    if(strcmp(op,"[]=")==0)
    {
      fscanf(fp1,"%s%s%s",operand1,operand2,result);
      fprintf(fp2,"\n\tSTORE %s[%s],%s",operand1,operand2,result);
    }
    if(strcmp(op,"uminus")==0)
    {
      fscanf(fp1,"%s%s",operand1,result);
      fprintf(fp2,"\n\tMOV R1,-%s",operand1);
      fprintf(fp2,"\n\tMOV %s,R1",result);
    }
    switch(op[0])
    {
      case'*':
        fscanf(fp1,"%s%s%s",operand1,operand2,result);
        fprintf(fp2,"\n\t MOV R0,%s",operand1);
        fprintf(fp2,"\n\t MOV R1,%s",operand2);
        fprintf(fp2,"\n\t MUL R0 R1");
        fprintf(fp2,"\n\t MOV %s,R0",result);
        break;
      case'+':
        fscanf(fp1,"%s%s%s",operand1,operand2,result);
        fprintf(fp2,"\n\t MOV R0,%s",operand1);
        fprintf(fp2,"\n\t MOV R1,%s",operand2);
        fprintf(fp2,"\n\t ADD R0 R1");
        fprintf(fp2,"\n\t MOV %s,R0",result);
        break;
      case'-':
        fscanf(fp1,"%s%s%s",operand1,operand2,result);
        fprintf(fp2,"\n\t MOV R0,%s",operand1);
        fprintf(fp2,"\n\t MOV R1,%s",operand2);
        fprintf(fp2,"\n\t SUB R0 R1");
        fprintf(fp2,"\n\t MOV %s,R0",result);
        break;
      case'/':
        fscanf(fp1,"%s%s%s",operand1,operand2,result);
        fprintf(fp2,"\n\t MOV R0,%s",operand1);
        fprintf(fp2,"\n\t MOV R1,%s",operand2);
        fprintf(fp2,"\n\t DIV R0 R1");
        fprintf(fp2,"\n\t MOV %s,R0",result);
        break;
      case'%':
        fscanf(fp1,"%s%s%s",operand1,operand2,result);
        fprintf(fp2,"\n\t MOV R0,%s",operand1);
        fprintf(fp2,"\n\t MOV R1,%s",operand2);
        fprintf(fp2,"\n\t DIV R0 R1");
        fprintf(fp2,"\n\t MOV %s,R0",result);
        break;
      case'=':
        fscanf(fp1,"%s%s",operand1,result);
        fprintf(fp2,"\n\t MOV %s,%s",result,operand1);
        break;
      case'>':
        fscanf(fp1,"%s%s%s",operand1,operand2,result);
        fprintf(fp2,"\n\t JGT %s,%s label#%s",operand1,operand2,result);
        label[no++]=atoi(result);
        break;
      case'<':
        fscanf(fp1,"%s%s%s",operand1,operand2,result);
        fprintf(fp2,"\n\t JLT %s,%s label#%s",operand1,operand2,result);
        label[no++]=atoi(result);
        break;
    }
  }
  fclose(fp2);
  fclose(fp1);
  /* display the generated target code */
  fp2=fopen("target.txt","r");
  if(fp2==NULL)
  {
    printf("\nError Opening the File");
    getch();
    exit(0);
  }
  while((ch=fgetc(fp2))!=EOF)
    printf("%c",ch);
  fclose(fp2);
  getch();
  return 0;
}
int check_label(int k)
{
  int i;
  for(i=0;i<no;i++)
  {
    if(k==label[i])
      return 1;
  }
  return 0;
}

INPUT:
[]= a i 1
* x y t1
+ t1 z t2
> t2 num 6
goto 8
+ x x x
+ y y y
print x
= y z
print z

OUTPUT:
MOV R0,x
MOV R1,y
MUL R0 R1
MOV t1,R0
MOV R0,t1
MOV R1,z
ADD R0 R1
MOV t2,R0
JGT t2,num label#6
JMP label#8
label#8:
MOV R0,x
MOV R1,x
ADD R0 R1
MOV x,R0
MOV R0,y
MOV R1,y
ADD R0 R1
MOV y,R0
OUT x
MOV z,y
OUT z

Thus the program has been executed and the back end of the compiler has been implemented.





To write a C program to perform the shift reduce parsing.

1. Start the program.
2. Define the main function.
3. Declare array for string and stack and other necessary variables.
4. Get the expression from the user and store it as string.
5. Append $ to the end of the string.
6. Store $ into the stack.
7. Print three columns as Stack, String and Action for the respective actions.
8. Use for loop from i as 0 till string length and check the string.
9. If string has some operator or id, push it to the stack.
10. Mark this action as Shift.
11. Print the stack, string and action values.
12. If stack contains some production on shifting, reduce it.
13. Mark this action as Reduce.
14. Print the stack, string and action values.
15. Repeat steps 9 to 14 again and again till the for loop is valid.
16. Now check the string and the stack.
17. If the string contains only $ and the stack has only $E within it, then print that the given string is valid.
18. Else print that the given string is invalid.
19. End the program.
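The steps above can be condensed into a small validity checker. The sketch below is an illustration only, assuming the grammar E -> E+E | E*E | id with `id` written as the two characters `i` and `d`; the function name `sr_parse` is hypothetical, and the sketch omits the Stack/String/Action printout.

```c
#include <string.h>

/* Hypothetical sketch (not the manual's program) for the grammar
   E -> E+E | E*E | id. Returns 1 if the input is accepted, 0 otherwise.
   Reductions are applied as soon as a handle is on top of the stack, so
   operator precedence is ignored; only validity is checked. */
int sr_parse(const char *input)
{
    char stack[64] = "$";
    int top = 0;                         /* index of the top stack symbol */
    size_t i = 0, n = strlen(input);

    if (n > 60)
        return 0;                        /* keep within the stack buffer */

    for (;;) {
        if (top >= 2 && stack[top] == 'd' && stack[top - 1] == 'i') {
            top -= 1;                    /* reduce by E -> id */
            stack[top] = 'E';
        } else if (top >= 3 && stack[top] == 'E' &&
                   (stack[top - 1] == '+' || stack[top - 1] == '*') &&
                   stack[top - 2] == 'E') {
            top -= 2;                    /* reduce by E -> E+E or E -> E*E */
        } else if (i < n) {
            stack[++top] = input[i++];   /* shift the next input symbol */
        } else {
            break;                       /* nothing left to shift or reduce */
        }
    }
    return top == 1 && stack[1] == 'E';  /* accept when the stack is $E */
}
```

With this sketch, `sr_parse("id+id*id")` returns 1, while inputs such as `"id+"` or `"idid"` leave extra symbols on the stack and return 0.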



PROGRAM: SHIFT REDUCE PARSER
// file name is shift.c
#include<stdio.h>
#include<string.h>
#include<conio.h>
void main()
{
  char str[25],stk[25];
  int i,j,t=0,l,r;
  clrscr();
  printf("Enter the String : ");
  scanf("%s",str);
  l=strlen(str);
  str[l]='$';
  stk[t]='$';
  printf("Stack\t\tString\t\tAction\n-----------------------------------\n ");
  for(i=0;i<l;i++)
  {
    if(str[i]=='i')
    {
      /* shift the two-character token "id", then reduce it to E */
      t++;
      stk[t]=str[i];
      stk[t+1]=str[i+1];
      for(j=0;j<=t+1;j++)
        printf("%c",stk[j]);
      printf("\t\t ");
      for(j=i+2;j<=l;j++)
        printf("%c",str[j]);
      printf("\t\tShift");
      printf("\n ");
      stk[t]='E';
      i++;
    }
    else
    {
      t++;
      stk[t]=str[i];
    }
    for(j=0;j<=t;j++)
      printf("%c",stk[j]);
    printf("\t\t ");
    for(j=i+1;j<=l;j++)
      printf("%c",str[j]);
    if(stk[t]=='+' || stk[t]=='*')
      printf("\t\tShift");
    else
      printf("\t\tReduce");
    printf("\n ");
  }
  /* reduce E+E and E*E until only $E remains */
  while(t>1)
  {
    if(stk[t]=='E' && (stk[t-1]=='+' || stk[t-1]=='*') && stk[t-2]=='E')
    {
      t-=2;
      for(j=0;j<=t;j++)
        printf("%c",stk[j]);
      printf("\t\t");
      printf(" %c",str[l]);
      printf("\t\tReduce\n ");
    }
    else
      t-=2;
  }
  if(t==1 && stk[t]!='+' && stk[t]!='*')
    printf("\nThe Given String is Valid\n\n");
  else
    printf("\nThe Given String is Invalid\n\n");
  getch();
}

OUTPUT:
Enter the String : id+id*id
Stack           String          Action
-----------------------------------
$id             +id*id$         Shift
$E              +id*id$         Reduce
$E+             id*id$          Shift
$E+id           *id$            Shift
$E+E            *id$            Reduce
$E+E*           id$             Shift
$E+E*id         $               Shift
$E+E*E          $               Reduce
$E+E            $               Reduce
$E              $               Reduce

The Given String is Valid

ALTERNATE PROGRAM:
// file name is shift1.c
#include<stdio.h>
#include<conio.h>
#include<string.h>
#include<ctype.h>
#include<process.h>
typedef struct
{
  char num[10];
  int top;
}stack;
stack s;
void st_push(char a)
{
  s.num[]=a;
}
char st_pop()
{
  int a;
  if( return-1;;
  a=s.num[];
  s.num[]='\0';
  return(a);
}
char *substring(char *str,int start)
{
  char *sub="";
  int i=0;
  while (str[start]!='\0')
    sub[i++]=str[start++];
  sub[i]='\0';
  return sub;
}
void main()
{
  char lhs[10],rhs[10][10];
  char sub[10],ipstring[10];
  char doller[3],*substr;
  int i,j,k,l,m,n,flag;
  int length=0,index=0;
  clrscr();
  printf("\n Enter the no. of productions:");
  scanf("%d",&n);
  printf("\n Enter the production in following form");
  printf("\nlhs\trhs\n");
  for(i=0;i<n;i++)
  {
    fflush(stdin);
    scanf("%c",&lhs[i]);
    scanf("%s",&rhs[i]);
  }
  doller[0]='$';
  doller[1]=lhs[0];
  doller[2]='\0';
  printf("\nEnter the input string to be checked");
  scanf("%s",&ipstring);
  strcat(ipstring,"$");
  st_push('$');
  length=strlen(ipstring);
  i=0;
  printf("\n\t stack\tinput\taction");
  printf("\n\t%s\t%s\t",s.num,ipstring);
  st_push(ipstring[i++]);
  substr=substring(ipstring,1);
  printf("\n\t%s\t%s\tSHIFT",s.num,substr);
  while(i<length)
  {
    for(j=1;j<;j++)
    {
      flag=0;
      index=0;
      if( sub[0]='\0';
      else
        for(k=j;k<;k++)
          sub[index++]=s.num[k];
      sub[index]='\0';
      for(k=0;k<n;k++)
      {
        if(strcmp(sub,rhs[k])==0)
        {
          m=0;
          while(m<strlen(rhs[k]))
          {
            st_pop();
            m++;
          }
          st_push(lhs[k]);
          flag=1;
          substr=substring(ipstring,i);
          printf("\n\t%s\t%s\tREDUCE",s.num,substr);
        }
      }
      if(flag==1)
        break;
    }
    if(flag==0)
    {
      if(ipstring[i]!='$')
      {
        st_push(ipstring[i++]);
        substr=substring(ipstring,i);
        printf("\n\t%s\t%s\tSHIFT",s.num,substr);
      }
      else
      {
        if(strcmp(s.num,doller)==0)
          printf("\n\t%s\t%s\tACCEPT",s.num,substr);
        else
          printf("\n\t%s\t%s\tERROR",s.num,substr);
        getch();
        exit(0);
      }
    }
  }
}

OUTPUT:
Enter the no. of productions:3
Enter the production in following form
lhs     rhs
S       CC
C       cC
C       d
Enter the input string to be checkedcdcd
stack   input   action
$       cdcd$
$c      dcd$    SHIFT
$cd     cd$     SHIFT
$cC     cd$     REDUCE
$C      cd$     REDUCE
$Cc     d$      SHIFT
$Ccd    $       SHIFT
$CcC    $       REDUCE
$CC     $       REDUCE
$S      $       REDUCE
$S      $       ACCEPT

Thus the Shift reduce parser has been successfully implemented.
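The production-driven approach of the alternate program can be sketched as a self-contained C function. This is an illustrative sketch, not the manual's program: the type `struct production` and the function `sr_accepts` are hypothetical names, and the sketch assumes that no production has an empty right-hand side and that the productions contain no cycles such as A -> A. After every step it reduces the longest stack suffix that equals some right-hand side, which reproduces the sample trace for S -> CC, C -> cC, C -> d but is not a general LR parser.

```c
#include <string.h>

/* Hypothetical type: one production with a single-character left-hand side. */
struct production {
    char lhs;
    const char *rhs;
};

/* Hypothetical sketch: returns 1 if input reduces to the start symbol
   (the lhs of the first production), 0 otherwise. */
int sr_accepts(const struct production *p, int n, const char *input)
{
    char stack[64] = "$";
    size_t top = 1;                      /* number of symbols on the stack */
    size_t i = 0, len = strlen(input);

    if (len > 60)
        return 0;                        /* keep within the stack buffer */

    for (;;) {
        int reduced = 0;
        size_t j;
        int k;

        /* try the longest suffix first: stack[j..top-1] for j = 1, 2, ... */
        for (j = 1; j < top && !reduced; j++) {
            for (k = 0; k < n; k++) {
                if (strcmp(stack + j, p[k].rhs) == 0) {
                    stack[j] = p[k].lhs; /* replace the handle by its lhs */
                    stack[j + 1] = '\0';
                    top = j + 1;
                    reduced = 1;
                    break;
                }
            }
        }
        if (reduced)
            continue;                    /* look for another handle */
        if (i == len)
            break;                       /* input exhausted, nothing to reduce */
        stack[top++] = input[i++];       /* shift the next input symbol */
        stack[top] = '\0';
    }
    return top == 2 && stack[1] == p[0].lhs;   /* stack is "$" + start symbol */
}
```

For the three productions of the sample run and the input `cdcd`, the stack passes through $c, $cd, $cC, $C, $Cc, $Ccd, $CcC, $CC and $S, and the function returns 1.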





To write a LEX program to count the number of characters, words and lines in an input file.

Start the program.
Open the file seven.c in read mode and call yylex() to scan the input.
Define and initialize the counters cc, lc and wc.
Count the number of characters, words and lines until the end of the input file.
Print the number of characters, words and lines (cc, wc and lc).
Stop the program.
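The counting step can also be written directly in C, which makes the three counters easy to check by hand. The sketch below is an illustration, not the LEX program: `struct counts` and `count_text` are hypothetical names, a word is assumed to be a maximal run of non-whitespace characters, and a line is counted for every newline character.

```c
#include <ctype.h>

/* Hypothetical type: the three counters of the exercise. */
struct counts {
    int chars;
    int words;
    int lines;
};

/* Hypothetical helper: counts characters, words and lines in a
   NUL-terminated buffer. */
struct counts count_text(const char *text)
{
    struct counts c = {0, 0, 0};
    int in_word = 0;

    for (; *text != '\0'; text++) {
        c.chars++;                               /* every character counts */
        if (*text == '\n')
            c.lines++;
        if (isspace((unsigned char)*text)) {
            in_word = 0;                         /* whitespace ends a word */
        } else if (!in_word) {
            in_word = 1;                         /* first character of a word */
            c.words++;
        }
    }
    return c;
}
```

For the two-line text `int a,b,c;` followed by `c=a+b;` (each ending in a newline), the counters come out as 18 characters, 3 words and 2 lines.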



PROGRAM: SAMPLE LEX PROGRAM
// file name is word.l
%{
#include <stdio.h>
#include <stdlib.h>
int cc=0,wc=0,lc=0;
%}
%%
[^ \t\n]+ {wc++;cc+=yyleng;}
\n {cc++;lc++;}
[ \t] cc++;
%%
int main(int argc,char **argv)
{
  if(argc>1)
  {
    FILE *f;
    f=fopen(argv[1],"r");
    if(!f)
    {
      printf("could not open %s",argv[1]);
      exit(0);
    }
    yyin=f;
  }
  yylex();
  printf("%d\n%d\n%d\n",cc,wc,lc);
  return 0;
}
int yywrap()
{
  return 1;
}

INPUT:
#include<stdio.h>
void main()
{
int a,b,c;
printf("enter the a ");
scanf("%d%d",&a,&b);
c=a+b;
getch();
}

OUTPUT:
$ lex word.l
$ cc lex.yy.c -ll
$ ./a.out seven.c
101
9

Thus the LEX program has successfully calculated the number of characters, words and lines in an input file.

