
Systems Software

Compilers and Interpreters

12/4/2019 Compiled By: Pallavi


Outline

1. Compilers and interpreters
2. Compilation process
3. Interpreters
4. PL/0 Symbols (tokens)



Compilers / Interpreters
• Programming languages are notations for
describing computations to people and to
machines.
• Programming languages can be implemented by
any of three general methods:
1. Compilation
2. Interpretation
3. Hybrid Implementation



Compilers
A compiler is a program that takes a program written in a
high-level language (e.g. Pascal, C, ML) as input and
translates it to a low-level representation which
the computer can understand and execute.

Source program (e.g. C++) -> Compiler -> ELF (binary)

ELF: Executable and Linkable Format


Compilers
The process of compilation and program execution takes
place in several phases:
Front end: Scanner -> Parser -> Semantic Analyzer
Back end: Code generator

Source Code -> Front End -> Intermediate Code -> Back End -> Target Code



Compilers
Source program
  -> Lexical analyzer                                 (output: lexical units, i.e. tokens)
  -> Syntax analyzer                                  (output: parse trees)
  -> Intermediate code generator (semantic analyzer)  (output: intermediate code)
  -> Optimizer (optional)
  -> Code generator                                   (output: machine language)
  -> Computer

Symbol table: used throughout the process.



EXAMPLE: Fahrenheit := 32 + celsious * 1.8

Character stream, read one character at a time with Getchar():

|f|a|h|r|e|n|h|e|i|t|:|=|3|2|+|c|e|l|s|i|o|u|s|*|1|.|8|;|

Lexical analyzer (scanner)
(converts the character stream into a stream of tokens.)

[ id, 1 ] [ := ] [ int, 32 ] [ + ] [ id, 2 ] [ * ] [ real, 1.8 ] [ ; ]

In [ id, 1 ], the 1 is the index in the symbol table.

Symbol Table
index  name        attribute
1      fahrenheit  real
2      celsious    real

Syntax analyzer (parser)
(Constructs the syntactic structure of the program)

        :=
       /  \
    id1    +
          / \
     int32   *
            / \
         id2   real 1.8
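
The scanner step above can be sketched in a few lines of Python. This is only an illustration: the names `TOKEN_SPEC` and `tokenize` are made up for this sketch, and it assumes identifiers are case-insensitive (the character stream on the slide is all lowercase).

```python
import re

# Token patterns, tried in order. "real" must come before "int" so that
# 1.8 is not split into 1, ".", 8. These names are illustrative only.
TOKEN_SPEC = [
    ("real",   r"\d+\.\d+"),
    ("int",    r"\d+"),
    ("id",     r"[A-Za-z_]\w*"),
    ("assign", r":="),
    ("op",     r"[+\-*/]"),
    ("semi",   r";"),
    ("skip",   r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(source):
    """Return (tokens, symbols) for one statement."""
    symbols = {}   # identifier name -> 1-based index in the symbol table
    tokens = []
    pos = 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        kind, text = m.lastgroup, m.group()
        pos = m.end()
        if kind == "skip":
            continue                      # whitespace carries no token
        if kind == "id":
            name = text.lower()           # assumption: case-insensitive names
            index = symbols.setdefault(name, len(symbols) + 1)
            tokens.append(("id", index))
        elif kind == "int":
            tokens.append(("int", int(text)))
        elif kind == "real":
            tokens.append(("real", float(text)))
        else:
            tokens.append((text,))        # operators and punctuation
    return tokens, symbols


tokens, symbols = tokenize("Fahrenheit := 32 + celsious * 1.8;")
print(tokens)
print(symbols)   # {'fahrenheit': 1, 'celsious': 2}
```

Identifiers are entered into the symbol table as they are first seen, so the token stream only carries their index, as in `[ id, 1 ]`.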



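Parse tree (input):

        :=
       /  \
    id1    +
          / \
     int32   *
            / \
         id2   real 1.8

Symbol Table
1      fahrenheit  real
2      celsious    real

Context analyzer
(Determines the type of each identifier)

Typed tree (output); +r and *r are real-valued operations, and inttoreal converts the integer 32 to a real:

        :=
       /  \
    id1    +r
          /  \
  inttoreal   *r
      |      /  \
    int32  id2   real 1.8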



Typed tree (input):

        :=
       /  \
    id1    +r
          /  \
  inttoreal   *r
      |      /  \
    int32  id2   real 1.8

Symbol Table
1      fahrenheit  real
2      celsious    real

Intermediate code generator

Intermediate code (output):

Temp1 := inttoreal(32)
Temp2 := id2
Temp2 := Temp2 * 1.8
Temp1 := Temp1 + Temp2
id1 := Temp1
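
A minimal sketch of how three-address code like this can be produced from the typed tree, assuming the tree is stored as nested Python tuples. The function name `gen_tac` is hypothetical, and the temporary numbering is simpler than on the slide: every interior node gets a fresh temporary instead of reusing Temp1 and Temp2.

```python
import itertools


def gen_tac(tree):
    """Translate a (':=', target, expr) tree into three-address code lines."""
    code = []
    counter = itertools.count(1)

    def walk(node):
        # Leaves (identifiers and constants) are used directly as operands.
        if isinstance(node, (str, int, float)):
            return str(node)
        op, *args = node
        operands = [walk(a) for a in args]   # children first (post-order)
        temp = f"Temp{next(counter)}"
        if op == "inttoreal":
            code.append(f"{temp} := inttoreal({operands[0]})")
        else:
            code.append(f"{temp} := {operands[0]} {op} {operands[1]}")
        return temp

    target, expr = tree[1], tree[2]
    code.append(f"{target} := {walk(expr)}")
    return code


tree = (":=", "id1", ("+", ("inttoreal", 32), ("*", "id2", 1.8)))
tac = gen_tac(tree)
for line in tac:
    print(line)
# Temp1 := inttoreal(32)
# Temp2 := id2 * 1.8
# Temp3 := Temp1 + Temp2
# id1 := Temp3
```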



Intermediate code (input):

Temp1 := inttoreal(32)
Temp2 := id2
Temp2 := Temp2 * 1.8
Temp1 := Temp1 + Temp2
id1 := Temp1

Symbol Table
1      fahrenheit  real
2      celsious    real

Code optimizer

Optimized code (output):

Temp1 := id2
Temp1 := Temp1 * 1.8
Temp1 := Temp1 + 32.0
id1 := Temp1
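
One part of this optimization, replacing inttoreal(32) by the constant 32.0 at compile time, can be sketched as constant folding over the tree. This is an assumption-laden illustration: the tuple tree format and the name `fold` are invented here, and the sketch does not perform the temporary reuse that the optimized code above also shows.

```python
def fold(node):
    """Fold inttoreal applied to an integer literal into a real literal."""
    if not isinstance(node, tuple):
        return node                      # identifier or constant: unchanged
    op, *args = node
    args = [fold(a) for a in args]       # fold the children first
    if op == "inttoreal" and isinstance(args[0], int):
        return float(args[0])            # conversion done at compile time
    return (op, *args)


tree = (":=", "id1", ("+", ("inttoreal", 32), ("*", "id2", 1.8)))
folded = fold(tree)
print(folded)   # (':=', 'id1', ('+', 32.0, ('*', 'id2', 1.8)))
```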



Optimized code (input):

Temp1 := id2
Temp1 := Temp1 * 1.8
Temp1 := Temp1 + 32.0
id1 := Temp1

Symbol Table
1      fahrenheit  real
2      celsious    real

Code generator

Assembly instructions (output):

movf id2, r1
mulf #1.8, r1
addf #32.0, r1
movf r1, id1



Compilers
Lexical analyzer:
Gathers the characters of the source program into lexical units.

Lexical units of a program are:
• identifiers
• special words (reserved words)
• operators
• special symbols
Comments are ignored!

Syntax analyzer:
Takes lexical units from the lexical analyzer and uses them to construct
a hierarchical structure called a parse tree.

Parse trees represent the syntactic structure of the program.
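
As a rough sketch of what the syntax analyzer does, the following recursive-descent parser builds a tree for the running example. The grammar (assignment, `+`, `*`, no parentheses) and the name `parse_assignment` are assumptions made for this illustration, not part of the slides.

```python
import re


def parse_assignment(text):
    """Parse `id := expr`, where expr uses + and * with the usual precedence."""
    tokens = re.findall(r"\d+\.\d+|\d+|[A-Za-z_]\w*|:=|[+*]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():
        tok = take()
        if re.fullmatch(r"\d+\.\d+", tok):
            return float(tok)
        if tok.isdigit():
            return int(tok)
        if not (tok[0].isalpha() or tok[0] == "_"):
            raise SyntaxError(f"unexpected token {tok!r}")
        return tok                        # identifier

    def term():                           # term -> factor { "*" factor }
        node = factor()
        while peek() == "*":
            take()
            node = ("*", node, factor())
        return node

    def expr():                           # expr -> term { "+" term }
        node = term()
        while peek() == "+":
            take()
            node = ("+", node, term())
        return node

    target = take()
    if take() != ":=":
        raise SyntaxError("expected ':='")
    tree = (":=", target, expr())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


tree = parse_assignment("fahrenheit := 32 + celsious * 1.8")
print(tree)   # (':=', 'fahrenheit', ('+', 32, ('*', 'celsious', 1.8)))
```

Because `term` is called from `expr`, multiplication binds tighter than addition, which gives the same tree shape as the example: `*` sits below `+`.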



Compilers
Intermediate code generator:
Produces a program in a different language representation:
• Assembly language
• Similar to assembly language
• Something higher than assembly language

Note: semantic analysis is an integral part of the intermediate
code generator.

Optimization:
Makes programs smaller or faster or both.

Most optimization is done on the intermediate code
(e.g. tree reduction, vectorization).



Compilers
Code generator:
Translates the optimized intermediate code into machine language.

The symbol table:
Serves as a database for the compilation process.
Contains type and attribute information for each user-defined
name in the program.

Symbol Table
Index  name        type attributes
1      fahrenheit  real
2      celsious    real
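
A symbol table like the one above could be modelled as a small dictionary-backed class. This is a sketch only: the class and method names (`SymbolTable`, `declare`, `lookup`) are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Symbol:
    index: int   # position in the table, starting at 1
    name: str
    type: str    # type attribute, e.g. "real"


class SymbolTable:
    def __init__(self):
        self._by_name = {}

    def declare(self, name, type_):
        """Add a user-defined name; reject duplicates."""
        if name in self._by_name:
            raise KeyError(f"{name!r} already declared")
        sym = Symbol(len(self._by_name) + 1, name, type_)
        self._by_name[name] = sym
        return sym

    def lookup(self, name):
        """Return the entry for a name, as later phases would."""
        return self._by_name[name]


table = SymbolTable()
table.declare("fahrenheit", "real")
table.declare("celsious", "real")
print(table.lookup("celsious"))   # Symbol(index=2, name='celsious', type='real')
```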



Compilers
Machine language
To run a program in its machine language form, it generally needs
-- some other code
-- programs from the O.S. (e.g. input/output)

Machine language -> Linker -> Executable file -> Loader -> Computer
Also supplied: Libraries; O.S. routines (I/O routines)



Interpreters
Programs are interpreted (executed) by another program called
the interpreter.

Advantages: Easy implementation of many source-level
debugging operations, because all run-time error messages can
refer to source-level units.

Disadvantages: 10 to 100 times slower, because statements are
interpreted each time they are executed.

Background:
Early sixties -> APL, SNOBOL, Lisp.
By the 80s -> rarely used.
Recent years -> Significant comeback (some Web scripting
languages: JavaScript, PHP)



Interpreters

Source program + Input data -> Interpreter -> Result
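
The picture above (source program and input data go in, a result comes out) can be sketched as a tiny tree-walking interpreter. The program representation (a list of `(target, expression)` pairs) and the names `evaluate` and `run` are assumptions made for this sketch.

```python
import operator

# Supported binary operators; a real interpreter would have many more.
OPS = {"+": operator.add, "*": operator.mul}


def evaluate(node, env):
    """Evaluate an expression tree directly, without translating it first."""
    if isinstance(node, (int, float)):
        return node                    # constant
    if isinstance(node, str):
        return env[node]               # variable lookup
    op, left, right = node
    return OPS[op](evaluate(left, env), evaluate(right, env))


def run(program, input_data):
    """Execute each assignment in order; return the final variable values."""
    env = dict(input_data)
    for target, expr in program:
        env[target] = evaluate(expr, env)
    return env


program = [("fahrenheit", ("+", 32, ("*", "celsious", 1.8)))]
result = run(program, {"celsious": 100})
print(result["fahrenheit"])   # about 212
```

Every time a statement runs, `evaluate` walks its tree again, which is the source of the slowdown mentioned for interpreters.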



Hybrid implementation systems
They translate high-level language programs to an
intermediate language designed to allow easy
interpretation.

Java program -> Translator -> Byte code (intermediate code)
Byte code -> Byte code interpreter -> Machine A
Byte code -> Byte code interpreter -> Machine B

Example: Perl and initial implementations of Java



Interpreters
Just-In-Time (JIT) implementation

Programs are translated to an intermediate language.

During execution, the JIT system compiles intermediate language methods
into machine code when they are called.

The machine code version is kept for subsequent calls.

.NET and Java programs are implemented with a JIT system.
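
The "compile on first call, keep the result" idea can be illustrated with a toy runtime. This is a loose analogy under stated assumptions: method bodies are held as expression text standing in for intermediate code, and Python's built-in `compile` produces Python bytecode, not machine code. The class `JitRuntime` is invented for this sketch.

```python
class JitRuntime:
    """Toy runtime: methods are translated the first time they are called."""

    def __init__(self, method_sources):
        self._sources = method_sources   # method name -> expression in x
        self._compiled = {}              # method name -> callable (the cache)
        self.compile_count = 0           # how many translations happened

    def call(self, name, x):
        func = self._compiled.get(name)
        if func is None:
            # First call: translate now and keep the result for later calls.
            code = compile(f"lambda x: {self._sources[name]}",
                           f"<jit:{name}>", "eval")
            func = eval(code)
            self._compiled[name] = func
            self.compile_count += 1
        return func(x)


rt = JitRuntime({"to_fahrenheit": "32 + x * 1.8"})
first = rt.call("to_fahrenheit", 0)     # triggers the translation
second = rt.call("to_fahrenheit", 10)   # reuses the cached version
print(first, second, rt.compile_count)
```

Methods that are never called are never translated, and methods called many times pay the translation cost only once.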



Machine-independent Optimization

In this optimization, the compiler takes in the intermediate code and
transforms a part of the code that does not involve any CPU registers and/or
absolute memory locations. For example:

do
{ item = 10; value = value + item; }
while(value<100);

This code repeats the assignment to the identifier item on every iteration.
If we rewrite it this way:

item = 10;
do { value = value + item; }
while(value<100);

it not only saves CPU cycles, but the transformation can also be used on any processor.
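
To check that moving the assignment out of the loop does not change the result, both versions can be written out and compared. Python has no do-while, so this sketch spells it as `while True` with a `break` at the bottom; the function names are made up for the comparison.

```python
def loop_original(value):
    while True:                  # do { ... } while (value < 100)
        item = 10                # loop-invariant assignment, repeated each pass
        value = value + item
        if not value < 100:
            break
    return value


def loop_hoisted(value):
    item = 10                    # assigned once, before the loop
    while True:
        value = value + item
        if not value < 100:
            break
    return value


# Both versions agree for a range of starting values.
same = all(loop_original(v) == loop_hoisted(v) for v in range(-50, 150))
print(same, loop_original(0))   # True 100
```

Hoisting is safe here because `item` is assigned the same constant on every pass and the loop body always runs at least once.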



Machine-dependent Optimization

Machine-dependent optimization is done after the target code has been
generated and when the code is transformed according to the target machine
architecture. It involves CPU registers and may have absolute memory
references rather than relative references. Machine-dependent optimizers try
to take maximum advantage of the memory hierarchy.

Basic Blocks

Source code generally contains sequences of instructions that are always
executed in sequence; these are considered the basic blocks of the code.
These basic blocks have no jump statements in their interior, i.e., when
the first instruction is executed, all the instructions in the same basic block will
be executed in their sequence of appearance without losing the flow control of
the program.
Constructs such as IF-THEN-ELSE and SWITCH-CASE conditional statements, and
loops such as DO-WHILE, FOR, and REPEAT-UNTIL, introduce branches and
therefore divide a program into several basic blocks.



Basic block identification

We may use the following algorithm to find the basic blocks in a program:
1. Find the header statements, where a basic block starts:
   • The first statement of the program.
   • Statements that are the target of any branch (conditional/unconditional).
   • Statements that follow any branch statement.
2. A header statement and the statements following it form a basic block.
3. A basic block does not include the header statement of any other basic block.
Basic blocks are important concepts from both the code generation and
optimization points of view.
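
The algorithm above can be sketched as follows, assuming each instruction is a pair of its text and an optional branch target given as an instruction index. That representation, and the names `find_leaders` and `basic_blocks`, are assumptions for this sketch.

```python
def find_leaders(instrs):
    """Return the sorted indices of header statements (block leaders)."""
    leaders = set()
    if instrs:
        leaders.add(0)                       # first statement of the program
    for i, (_, target) in enumerate(instrs):
        if target is not None:               # instruction i is a branch
            leaders.add(target)              # the branch target starts a block
            if i + 1 < len(instrs):
                leaders.add(i + 1)           # so does the statement after it
    return sorted(leaders)


def basic_blocks(instrs):
    """Each block runs from one leader up to, not including, the next."""
    leaders = find_leaders(instrs)
    ends = leaders[1:] + [len(instrs)]
    return [list(range(start, end)) for start, end in zip(leaders, ends)]


program = [
    ("item = 10", None),                 # 0
    ("value = value + item", None),      # 1
    ("if value < 100 goto 1", 1),        # 2: conditional branch back to 1
    ("print value", None),               # 3
]
blocks = basic_blocks(program)
print(blocks)   # [[0], [1, 2], [3]]
```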



Basic blocks play an important role in identifying variables that are
used more than once within a single basic block. If a variable is
used more than once, the register allocated to that
variable need not be freed until the block finishes execution.



PL/0 Symbols
Given the following program written in PL/0:

const m = 7, n = 85;
var i,x,y,z,q,r;
procedure mult;
  var a, b;
  begin
    a := x; b := y; z := 0;
    while b > 0 do
    begin
      if odd b then z := z+a;
      a := 2*a;
      b := b/2;
    end
  end;
begin
  x := m;
  y := n;
  call mult;
end.

As in any language, in PL/0 we need to identify the vocabulary: which names
and special symbols we accept as valid.
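
As a starting point for that vocabulary, the symbols of a PL/0 program can be sorted into reserved words, identifiers, numbers, and special symbols. The sketch below assumes a common course variant of PL/0 (reserved words as listed in `RESERVED`, relational operators written `<=`, `>=`, `<>`); other definitions of PL/0 differ in details, and `pl0_symbols` is a name invented here.

```python
import re

# Assumed reserved words for this PL/0 variant.
RESERVED = {"begin", "call", "const", "do", "end", "if", "odd",
            "procedure", "then", "var", "while"}

# One number, one word, or one special symbol, after optional whitespace.
TOKEN_RE = re.compile(
    r"\s*(?:(\d+)|([A-Za-z][A-Za-z0-9]*)|(:=|<=|>=|<>|[-+*/()=,.;<>]))")


def pl0_symbols(source):
    """Return a list of (kind, value) pairs for a PL/0 source text."""
    source = source.rstrip()
    symbols = []
    pos = 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise SyntaxError(f"invalid symbol at position {pos}")
        number, word, special = m.groups()
        if number is not None:
            symbols.append(("number", int(number)))
        elif word is not None:
            kind = "reserved" if word in RESERVED else "ident"
            symbols.append((kind, word))
        else:
            symbols.append(("special", special))
        pos = m.end()
    return symbols


symbols = pl0_symbols("const m = 7, n = 85; begin if odd b then z := z+a end.")
for kind, value in symbols[:4]:
    print(kind, value)
# reserved const
# ident m
# special =
# number 7
```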



The End

