Code Generation
MACHINE - DEPENDENT CODE OPTIMIZATION
N.K. Srinath
[Figure: phases of a compiler, from the source program through lexical analysis and syntax analysis, with table management and error handling]
N.K. Srinath
srinath_nk@yahoo.com
RVCE
The phases of a compiler are usually grouped into a front end and a back end. The front end comprises those phases, or parts of phases, that depend on the source language and are independent of the target machine. These include lexical analysis, syntactic analysis, creation of the symbol table, semantic analysis, and generation of intermediate code.
The front end also includes the error handling and code optimization that go along with these phases. The back end includes those phases of the compiler that depend on the target machine. They do not depend on the source language, only on the intermediate language. The back end includes code optimization and code generation, along with the associated error handling and symbol-table operations.
The code generation routines discussed here are designed for use with the simplified Pascal grammar shown in fig. 1.

<prog> ::= PROGRAM <program> VAR <dec-list>
<type> ::= INTEGER
<id-list> ::= id | <id-list> , id
<stmt-list> ::= <stmt> | <stmt-list> ; <stmt>
<stmt> ::= <assign> | <read> | <write> | <for>
<assign> ::= id := <exp>
<exp> ::= <term> | <exp> + <term> | <exp> - <term>
<term> ::= <factor> | <term> * <factor> | <term> DIV <factor>
<factor> ::= id | int | ( <exp> )
<read> ::= READ ( <id-list> )
<write> ::= WRITE ( <id-list> )
<for> ::= FOR <index-exp> DO <body>
Note: This grammar is used for code generation to emphasize the point that code generation techniques need not be associated with any particular parsing method. This is because: (1) The operator precedence method ignores certain nonterminals. (2) The recursive-descent method must use a slightly modified grammar.
The code is generated for the SIC/XE machine. The code generation routines make use of two data structures for working storage: (1) a list (2) a stack.
Listcount: a variable used to keep a count of the number of items currently in the list.
The code generation routines make use of token specifiers, denoted by S(token). Example, for the tokens id and int:
    S(id): name of the identifier
    S(int): value of the integer, such as #100
The code generation routines create segments of object code for the compiled program. A symbolic representation is given to these codes using SIC assembler language.
LOCCTR: a location counter that is updated to reflect the next available address in the compiled program (exactly as it is in an assembler).
Application to the READ statement:
[Figure: parse tree for the READ statement]
The parse tree for the READ statement can be generated with many different parsing methods.
In an operator precedence parse, the recognition occurs when a sub-string of the input is reduced to some non-terminal <Ni>. In a recursive-descent parse, the recognition occurs when a procedure returns to its caller, indicating success. Thus the parser first recognizes the id VALUE as an <id-list>, and then recognizes the complete statement as a < read >.
The symbolic representation of the object code to be generated for the READ statement is as shown.

    +JSUB  XREAD
     WORD  1
     WORD  VALUE

This code consists of a call to a subroutine XREAD, which would be part of a standard library associated with the compiler. Any program that wants to perform a READ operation can call XREAD.
The parameter list for XREAD is defined immediately after the JSUB that calls it. The first word is the number of variables that will be assigned values by the READ. The following words give the addresses of these variables. Routines that might be used to accomplish the above code generation are as follows.
<id-list> ::= <id-list> , id
    add S(id) to list
    add 1 to Listcount

These two statements also serve the alternative structure for <id-list>, that is, <id-list> ::= id.

<read> ::= READ ( <id-list> )
    generate [+JSUB XREAD]
    record external reference to XREAD
    generate [WORD Listcount]
    for each item on list do
        begin
            remove S(ITEM) from list
            generate [WORD S(ITEM)]
        end
    Listcount := 0
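The READ routines above can be sketched in Python. This is only an illustration: the class name `ReadCodeGen` and its method names are hypothetical, and the generated code is kept as a list of symbolic strings rather than real object code.

```python
# Hypothetical Python rendering of the <id-list> and <read> routines.
class ReadCodeGen:
    def __init__(self):
        self.items = []             # the working list of token specifiers
        self.listcount = 0          # Listcount: items currently in the list
        self.code = []              # generated symbolic object code
        self.external_refs = set()  # external references recorded so far

    def id_list(self, name):
        """Called each time an id is recognized as part of an <id-list>."""
        self.items.append(name)     # add S(id) to list
        self.listcount += 1         # add 1 to Listcount

    def read(self):
        """Called when READ ( <id-list> ) is recognized."""
        self.code.append("+JSUB XREAD")
        self.external_refs.add("XREAD")
        self.code.append(f"WORD {self.listcount}")
        while self.items:
            # remove S(ITEM) from the list and emit its address word
            self.code.append(f"WORD {self.items.pop(0)}")
        self.listcount = 0


gen = ReadCodeGen()
gen.id_list("VALUE")   # the parser recognizes VALUE as an <id-list>
gen.read()             # then recognizes the complete <read>
print("\n".join(gen.code))
```

For READ (VALUE) this prints the three lines of the symbolic code given earlier: the call to XREAD, the count 1, and the word for VALUE.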
Example:
VARIANCE:=SUMSQ DIV 100 - MEAN * MEAN
Solution
The parse tree for this statement is shown in fig. Most of the work of parsing involves the analysis of the <exp> on the right-hand side of the ":=" statement.
[Figure: parse tree for VARIANCE := SUMSQ DIV 100 - MEAN * MEAN]
A code-generation routine is called as each portion of the statement is recognized. Example: for the rule <term>1 ::= <term>2 * <factor>, code is to be generated. The subscripts are used to distinguish between the two occurrences of <term>. The code-generation routines perform all arithmetic operations using register A.
The result of the multiplication <term>2 * <factor> will be left in register A. So we need to keep track of the result left in register A by each segment of code that is generated. This is accomplished by extending the token-specifier idea to non-terminal nodes of the parse tree.
The node specifier S(<term>1) would be set to rA. This indicates that the result of this computation is in register A.
The variable REGA is used to indicate the highest-level node of the parse tree whose value is left in register A by the code generated so far.
1. <assign> ::= id := <exp>
The code generation routine for <assign> consists of bringing the value to be assigned into register A (using GETA). An STA instruction is then generated to store the value from register A into the variable.
Note that REGA is then set to null because the code for the statement has been completely generated, and any intermediate results are no longer needed.
The following rules do not require the generation of any machine instructions since no computation or data movement is involved.
The code generation routines for these rules simply set the node specifier of the higher-level node to reflect the location of the corresponding value.
2. <exp> ::= <term>

3. <exp>1 ::= <exp>2 + <term>
       begin
           GETA (<exp>2)
           generate [ADD S(<term>)]
       end
   S(<exp>1) := rA
   REGA := <exp>1

4. <exp>1 ::= <exp>2 - <term>
       begin
           GETA (<exp>2)
           generate [SUB S(<term>)]
       end
   S(<exp>1) := rA
   REGA := <exp>1
5. <term> ::= <factor>

6. <term>1 ::= <term>2 * <factor>
   if S(<term>2) = rA then
       generate [MUL S(<factor>)]
   else if S(<factor>) = rA then
       generate [MUL S(<term>2)]
   else
       begin
           GETA (<term>2)
           generate [MUL S(<factor>)]
       end
   S(<term>1) := rA
   REGA := <term>1
7. <term>1 ::= <term>2 DIV <factor>
   if S(<term>2) = rA then
       generate [DIV S(<factor>)]
   else
       begin
           GETA (<term>2)
           generate [DIV S(<factor>)]
       end
   S(<term>1) := rA
   REGA := <term>1
8. <factor> ::= id
9. <factor> ::= int
10. <factor> ::= ( <exp> )
The GETA procedure is shown below.

procedure GETA (NODE)
begin
    if REGA = null then
        generate [LDA S(NODE)]
    else if S(NODE) ≠ rA then
        begin
            create a new working variable Tempi
            generate [STA Tempi]
            record forward reference to Tempi
            S(REGA) := Tempi
            generate [LDA S(NODE)]
        end
    S(NODE) := rA
    REGA := NODE
end
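To see how GETA and the arithmetic rules cooperate, here is a minimal Python sketch applied to VARIANCE := SUMSQ DIV 100 - MEAN * MEAN. It rests on several assumptions: the names `Node`, `CodeGen` and `Temp1`, `Temp2` are hypothetical, GETA is assumed to finish by marking NODE as the value held in register A, and the parser is simulated by calling the routines by hand in the order the rules would be recognized.

```python
RA = "rA"  # node specifier meaning "this value is in register A"


class Node:
    """A parse-tree node; spec plays the role of S(node)."""
    def __init__(self, spec):
        self.spec = spec


class CodeGen:
    def __init__(self):
        self.code = []
        self.rega = None          # REGA: node whose value is in register A
        self.temp_count = 0
        self.forward_refs = []    # temporaries still to be allocated

    def geta(self, node):
        if self.rega is None:
            self.code.append(f"LDA {node.spec}")
        elif node.spec != RA:
            self.temp_count += 1
            temp = f"Temp{self.temp_count}"      # new working variable
            self.code.append(f"STA {temp}")
            self.forward_refs.append(temp)
            self.rega.spec = temp                # S(REGA) := Tempi
            self.code.append(f"LDA {node.spec}")
        node.spec = RA                           # assumed final step of GETA
        self.rega = node

    def _result(self):
        node = Node(RA)
        self.rega = node
        return node

    def mul(self, term2, factor):                # rule 6
        if term2.spec == RA:
            self.code.append(f"MUL {factor.spec}")
        elif factor.spec == RA:
            self.code.append(f"MUL {term2.spec}")
        else:
            self.geta(term2)
            self.code.append(f"MUL {factor.spec}")
        return self._result()

    def div(self, term2, factor):                # rule 7
        if term2.spec != RA:
            self.geta(term2)
        self.code.append(f"DIV {factor.spec}")
        return self._result()

    def sub(self, exp2, term):                   # rule 4
        if exp2.spec != RA:
            self.geta(exp2)
        self.code.append(f"SUB {term.spec}")
        return self._result()

    def assign(self, name, exp):                 # rule 1
        self.geta(exp)
        self.code.append(f"STA {name}")
        self.rega = None


cg = CodeGen()
t1 = cg.div(Node("SUMSQ"), Node("#100"))   # SUMSQ DIV 100
t2 = cg.mul(Node("MEAN"), Node("MEAN"))    # MEAN * MEAN
cg.assign("VARIANCE", cg.sub(t1, t2))
print("\n".join(cg.code))
```

The rules that only pass a specifier upward are modelled here by reusing the same node object, which is a simplification of setting the specifier of the higher-level node.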
Suppose we have to write compilers for m languages targeted at n machines. The obvious approach would be to write m*n compilers.
[Figure: one high-level language (HLL) translated by two compilers into object code for two machines]
This diagram shows two compilers converting a high-level language into two different object codes for two machines.
It means that each language needs as many compilers as there are target machines.
An intermediate language allows a logical separation between the machine-independent and machine-dependent phases and facilitates optimization. All we have to do is choose a rich intermediate language that bridges the source programs and the target programs.
The first three phases are called the front end of the compiler because they are machine independent.
The code generation and related phases are called the back end. Intermediate code generation is considered to be neither the back end nor the front end. The next slide shows three languages producing a common intermediate code. From the intermediate code, the object code for the two machines is obtained.
Hence, if we have M languages and object code is to be obtained for N machines, the number of front ends and back ends that need to be written is M+N.
The intermediate form discussed here represents the executable instructions of the program with a sequence of quadruples. Each quadruple is of the form
    Operation, OP1, OP2, Result
where
    Operation is some function to be performed by the object code,
    OP1 and OP2 are the operands for the operation, and
    Result designates where the resulting value is to be placed.
Example 1: SUM := SUM + VALUE could be represented as
    +,   SUM,  VALUE,  i1
    :=,  i1,   ,       SUM
The entry i1 designates an intermediate result (SUM + VALUE); the second quadruple assigns the value of this intermediate result to SUM. Assignment is treated as a separate operation (:=).
Example 2: VARIANCE := SUMSQ DIV 100 - MEAN * MEAN
    DIV,  SUMSQ,  #100,  i1
    *,    MEAN,   MEAN,  i2
    -,    i1,     i2,    i3
    :=,   i3,     ,      VARIANCE
Note: Quadruples appear in the order in which the corresponding object code instructions are to be executed. This greatly simplifies the task of analyzing the code for purposes of optimization. It is also easy to translate quadruples into machine instructions.
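The quadruples in Examples 1 and 2 can be produced mechanically from an expression tree. The sketch below is one possible way to do it, not the method of any particular compiler: the function name `quadruples` and the nested-tuple tree representation are assumptions made for illustration.

```python
def quadruples(target, expr):
    """Return quadruples for `target := expr`.

    expr is either a leaf (a string such as "SUM" or "#100") or a tuple
    (operation, left, right). Intermediate results are named i1, i2, ...
    """
    quads = []

    def walk(node):
        if isinstance(node, str):
            return node                      # a leaf names its own value
        op, left, right = node
        op1 = walk(left)                     # operands are evaluated first,
        op2 = walk(right)                    # left before right
        result = f"i{len(quads) + 1}"
        quads.append((op, op1, op2, result))
        return result

    value = walk(expr)
    quads.append((":=", value, "", target))  # assignment is its own operation
    return quads


example2 = ("-", ("DIV", "SUMSQ", "#100"), ("*", "MEAN", "MEAN"))
for quad in quadruples("VARIANCE", example2):
    print(quad)
```

Because the tree is walked bottom-up and left to right, the quadruples come out in execution order, which is the property the note above relies on.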
Example 3: For the program shown below, write the quadruples.

PROGRAM STATS
VAR
    SUM, SUMSQ, I, VALUE, MEAN, VARIANCE : INTEGER
BEGIN
    SUM := 0;
    SUMSQ := 0;
    FOR I := 1 TO 100 DO
        BEGIN
            READ (VALUE);
            SUM := SUM + VALUE;
            SUMSQ := SUMSQ + VALUE * VALUE
        END;
    MEAN := SUM DIV 100;
    VARIANCE := SUMSQ DIV 100 - MEAN * MEAN;
    WRITE (MEAN, VARIANCE)
END.
Solution
Line  Operation  OP1     OP2     Result     Pascal statement
 1    :=         #0              SUM        {SUM := 0}
 2    :=         #0              SUMSQ      {SUMSQ := 0}
 3    :=         #1              I          {FOR I := 1 TO 100}
 4    JGT        I       #100    (15)
 5    CALL       XREAD                      {READ (VALUE)}
 6    PARAM      VALUE
 7    +          SUM     VALUE   i1         {SUM := SUM + VALUE}
 8    :=         i1              SUM
 9    *          VALUE   VALUE   i2         {SUMSQ := SUMSQ + VALUE * VALUE}
10    +          SUMSQ   i2      i3
11    :=         i3              SUMSQ
12    +          I       #1      i4         {End of FOR loop}
13    :=         i4              I
14    J                          (4)
15    DIV        SUM     #100    i5         {MEAN := SUM DIV 100}
16    :=         i5              MEAN
17    DIV        SUMSQ   #100    i6         {VARIANCE := SUMSQ DIV 100 - MEAN * MEAN}
18    *          MEAN    MEAN    i7
19    -          i6      i7      i8
20    :=         i8              VARIANCE
21    CALL       XWRITE                     {WRITE (MEAN, VARIANCE)}
22    PARAM      MEAN
23    PARAM      VARIANCE
MACHINE - DEPENDENT CODE OPTIMIZATION
There are several different possibilities for performing machine-dependent code optimization.
Assignment and use of registers: registers are used as instruction operands.
The compiler must find the least-used register to replace with a new value when needed. The existence of jump instructions usually makes it difficult to keep track of register contents. Such problems are tackled by dividing the program into basic blocks. A basic block is a sequence of quadruples with one entry point, which is at the beginning of the block, one exit point, which is at the end of the block, and no jumps within the block.
When control passes from one block to another, all values currently held in registers are saved in temporary variables. For Example 3, the quadruples can be divided into five blocks. They are:
    Block A: quadruples 1 - 3
    Block B: quadruple 4
    Block C: quadruples 5 - 14
    Block D: quadruples 15 - 20
    Block E: quadruples 21 - 23
The figure shows the basic blocks of the flow graph for the quadruples.
[Figure: flow graph with nodes A: 1-3, B: 4, C: 5-14, D: 15-20, E: 21-23]
An arrow from one block to another indicates that control can pass directly from the last quadruple of one block to the first quadruple of the other. This kind of representation is called a flow graph.
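The division into blocks A to E can be reproduced with a short Python sketch. Several things here are assumptions rather than part of the slides: the function names, the rule that a CALL also starts a new block (adopted because it is needed to obtain the split between D and E), and the quadruples that are missing from the extracted table, which are reconstructed from the program, including the name XWRITE by analogy with XREAD.

```python
JUMPS = {"J", "JGT"}

# (operation, OP1, OP2, result); jump targets are quadruple numbers.
QUADS = [
    (":=", "#0", "", "SUM"), (":=", "#0", "", "SUMSQ"), (":=", "#1", "", "I"),
    ("JGT", "I", "#100", 15),
    ("CALL", "XREAD", "", ""), ("PARAM", "VALUE", "", ""),
    ("+", "SUM", "VALUE", "i1"), (":=", "i1", "", "SUM"),
    ("*", "VALUE", "VALUE", "i2"), ("+", "SUMSQ", "i2", "i3"),
    (":=", "i3", "", "SUMSQ"),
    ("+", "I", "#1", "i4"), (":=", "i4", "", "I"), ("J", "", "", 4),
    ("DIV", "SUM", "#100", "i5"), (":=", "i5", "", "MEAN"),
    ("DIV", "SUMSQ", "#100", "i6"), ("*", "MEAN", "MEAN", "i7"),
    ("-", "i6", "i7", "i8"), (":=", "i8", "", "VARIANCE"),
    ("CALL", "XWRITE", "", ""),            # assumed library routine name
    ("PARAM", "MEAN", "", ""), ("PARAM", "VARIANCE", "", ""),
]


def basic_blocks(quads):
    """Return (first, last) quadruple numbers of each basic block."""
    leaders = {1}                          # the first quadruple starts a block
    for n, (op, _op1, _op2, result) in enumerate(quads, start=1):
        if op in JUMPS:
            leaders.add(result)            # a jump target starts a block
            if n < len(quads):
                leaders.add(n + 1)         # so does the quadruple after a jump
        elif op == "CALL":
            leaders.add(n)                 # assumption: CALL starts a block
    starts = sorted(leaders)
    ends = [s - 1 for s in starts[1:]] + [len(quads)]
    return list(zip(starts, ends))


def flow_graph(quads, blocks):
    """Return edges (from_block, to_block) as indexes into blocks."""
    block_of = {start: i for i, (start, _end) in enumerate(blocks)}
    edges = []
    for i, (_start, end) in enumerate(blocks):
        op, _op1, _op2, result = quads[end - 1]
        if op in JUMPS:
            edges.append((i, block_of[result]))      # the jump itself
        if op != "J" and end < len(quads):
            edges.append((i, block_of[end + 1]))     # fall through
    return edges


blocks = basic_blocks(QUADS)
edges = ["ABCDE"[a] + "->" + "ABCDE"[b] for a, b in flow_graph(QUADS, blocks)]
print(blocks)
print(edges)
```

The edges computed this way correspond to the arrows of the flow graph: the loop test in B either enters the loop body C or leaves to D, and C jumps back to B.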
- Rearranging quadruples before machine code generation:
Example:
    1)  DIV   SUMSQ  #100   i1
    2)  *     MEAN   MEAN   i2
    3)  -     i1     i2     i3
    4)  :=    i3            VARIANCE
    LDA  SUMSQ
    DIV  #100
    STA  i1
    LDA  MEAN
    MUL  MEAN
    STA  i2
    LDA  i1
    SUB  i2
    STA  i3
    STA  VARIANCE
This shows a typical generation of machine code from the quadruples using only a single register, the accumulator.
The optimizing compiler could rearrange the quadruples so that the second operand of the subtraction is computed first. This saves two memory accesses.
    *     MEAN   MEAN   i2
    DIV   SUMSQ  #100   i1
    -     i1     i2     i3
    :=    i3            VARIANCE
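The saving can be checked with a small single-accumulator code generator, sketched below in Python. The generator is an illustration under stated assumptions, not the compiler's actual algorithm: the name `generate` is hypothetical, each intermediate result is assumed to be used only once, and register A is stored to memory only when its value is still needed as an operand. It therefore never emits a store for i3, unlike the single-register listing earlier.

```python
MNEMONIC = {"+": "ADD", "-": "SUB", "*": "MUL", "DIV": "DIV"}
COMMUTATIVE = {"+", "*"}


def generate(quads):
    """Translate (op, OP1, OP2, result) quadruples into accumulator code."""
    code = []
    in_a = None   # name of the value currently held in register A, if any

    for index, (op, op1, op2, result) in enumerate(quads):
        def save_a_if_needed():
            # Store A only if its value is an operand of this or a later
            # quadruple; otherwise the store would be wasted.
            if in_a is not None and any(
                in_a in (q[1], q[2]) for q in quads[index:]
            ):
                code.append(f"STA {in_a}")

        if op == ":=":
            if in_a != op1:
                save_a_if_needed()
                code.append(f"LDA {op1}")
            code.append(f"STA {result}")
            in_a = None
            continue

        if op in COMMUTATIVE and in_a == op2 and in_a != op1:
            op1, op2 = op2, op1            # operate on the value already in A
        if not (in_a == op1 and op2 != op1):
            save_a_if_needed()
            if in_a != op1:
                code.append(f"LDA {op1}")
        code.append(f"{MNEMONIC[op]} {op2}")
        in_a = result                      # the result is left in register A
    return code


ORIGINAL = [
    ("DIV", "SUMSQ", "#100", "i1"),
    ("*", "MEAN", "MEAN", "i2"),
    ("-", "i1", "i2", "i3"),
    (":=", "i3", "", "VARIANCE"),
]
REARRANGED = [ORIGINAL[1], ORIGINAL[0], ORIGINAL[2], ORIGINAL[3]]

print(len(generate(ORIGINAL)), "instructions before rearranging")
print(len(generate(REARRANGED)), "instructions after rearranging")
```

With the rearranged order, i1 is still in register A when the subtraction is reached, so one store and one load disappear.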
Characteristics and Instructions of Target Machine:
- Special loop-control instructions or addressing modes can be used to create more efficient object code.
- High-level machine instructions can perform complicated functions, such as calling procedures and manipulating data structures, in a single operation.
- If multiple functional blocks can be used, the source code can be rearranged to use all or most of the blocks concurrently. This is possible if the result of one block does not depend on the result of another.
Register Allocation
Assign specific CPU registers for specific values.
Code generation must maintain information on which registers:
- are used for which purposes
- are available for reuse
Main objectives:
- maximize the utilization of the CPU registers
- minimize references to memory locations
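The bookkeeping described above can be sketched as a small register table in Python. Everything named here is an assumption for illustration: the class `RegisterTable`, the register names S and T, and the choice of evicting the least recently used register, which is one simple stand-in for finding the least-used register.

```python
class RegisterTable:
    """Tracks which value each register holds and when it was last used."""

    def __init__(self, names):
        self.holds = {name: None for name in names}   # register -> value name
        self.last_use = {name: 0 for name in names}
        self.clock = 0
        self.evicted = []    # values pushed out of registers, in order

    def get(self, value):
        """Return (register, needs_load) for the requested value."""
        self.clock += 1
        for reg, held in self.holds.items():
            if held == value:
                self.last_use[reg] = self.clock
                return reg, False          # already in a register: no load
        free = [reg for reg, held in self.holds.items() if held is None]
        if free:
            reg = free[0]                  # a register is available for reuse
        else:
            # Assumption: replace the least recently used register.
            reg = min(self.holds, key=lambda r: self.last_use[r])
            self.evicted.append(self.holds[reg])
        self.holds[reg] = value
        self.last_use[reg] = self.clock
        return reg, True


table = RegisterTable(["S", "T"])
requests = ["MEAN", "SUM", "MEAN", "I"]
results = [table.get(name) for name in requests]
print(results)
print(table.evicted)
```

In this run the second request for MEAN needs no memory reference because the value is still in a register, and the request for I displaces SUM, the value that was used least recently.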
Possible uses for CPU registers:
- values used many times in a program
- values that are computationally expensive
Importance:
- efficiency
- speed