Construction
Virtual Machines for Compilers
This document discusses how a virtual machine and a compiler work together to
execute a program. We first look at what a compiler is and how it works, then at
what a virtual machine is and how it works. Finally, we see how a program is
converted from source code to machine code and then executed on the machine.
Compiler
A compiler is a piece of software that translates a source program written in a
source language into an equivalent program in a target language. An important
function of the compiler is to report errors in the source program that it detects
during the compilation (translation) process.
Structure
A compiler consists of many phases. This division makes a compiler easier to
understand and to implement. We give an overview of the phases one by one.
Syntax Analyzer The second phase of the compiler is syntax analysis, or parsing.
The parser uses the first components of the tokens produced by the lexical analyzer
to create a tree-like intermediate representation that depicts the grammatical
structure of the token stream. A typical representation is a syntax tree in which
each interior node represents an operation and the children of the node represent
the arguments of the operation. The parser checks the token stream against the
production rules of the grammar to detect syntax errors in the code. The parser
thus accomplishes two tasks: checking the code for errors and generating a parse
tree as the output of the phase.
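The parsing step described above can be sketched as a small recursive-descent parser. This is only an illustration: the token format, the grammar (identifiers joined by +) and the name parse_expr are assumptions made for the example and do not come from any particular compiler.

```python
# Minimal sketch: tokens are (kind, text) pairs, as a lexical analyzer might produce.
# Assumed grammar (for illustration only):
#   expr -> term ('+' term)*
#   term -> IDENT

def parse_expr(tokens):
    """Parse a token list and return a syntax tree built from nested tuples."""
    pos = 0

    def term():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos][0] != "IDENT":
            raise SyntaxError(f"identifier expected at token {pos}")
        node = ("id", tokens[pos][1])
        pos += 1
        return node

    tree = term()
    while pos < len(tokens) and tokens[pos][0] == "PLUS":
        pos += 1                     # consume '+'
        right = term()
        tree = ("+", tree, right)    # interior node: an operation and its two arguments
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token at position {pos}")
    return tree


tokens = [("IDENT", "i"), ("PLUS", "+"), ("IDENT", "j"), ("PLUS", "+"), ("IDENT", "k")]
tree = parse_expr(tokens)
```

For the token stream of i + j + k, the resulting tree has a + node at the root whose left child is the tree for i + j. A token stream that ends in + raises SyntaxError, which corresponds to the error-detection task of the parser.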
Semantic Analyzer The semantic analyzer uses the syntax tree and the
information in the symbol table to check the source program for semantic
consistency. It performs two functions:
Scope checking: verify that all applied occurrences of identifiers are declared.
Type checking: verify that all operations in the program are used according to
their type rules.
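As a rough sketch of these two checks, the function below walks a syntax tree made of nested tuples. The symbol table contents, the tuple-based tree format and the rule that + accepts only int operands are assumptions chosen for the example.

```python
# Hypothetical symbol table: identifier -> declared type.
symbol_table = {"i": "int", "j": "int", "c": "char"}


def check(node, symbols):
    """Return the type of an expression tree, or raise on a semantic error."""
    kind = node[0]
    if kind == "id":
        name = node[1]
        if name not in symbols:                      # scope checking
            raise NameError(f"undeclared identifier: {name}")
        return symbols[name]
    if kind == "+":
        left = check(node[1], symbols)
        right = check(node[2], symbols)
        if left != "int" or right != "int":          # type checking
            raise TypeError(f"'+' needs int operands, got {left} and {right}")
        return "int"
    raise ValueError(f"unknown node kind: {kind}")


result_type = check(("+", ("id", "i"), ("id", "j")), symbol_table)
```

Using an undeclared identifier such as k fails the scope check, and adding the char variable c to an int fails the type check under the assumed rule.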
The output of the semantic analyzer is a parse tree or abstract syntax tree (AST),
which is then fed to the intermediate code generator.
Three-address code can be represented in two forms:
Quadruples
Triples
A quadruple has four fields:
1. Op (operator)
2. Arg1 (argument 1)
3. Arg2 (argument 2)
4. Result
For example, if the statement is i := i + j + k, then the three-address code is
t1 := i + j
i := t1 + k
The quadruple representation is
       Op    Arg1    Arg2    Result
(0)    +     i       j       t1
(1)    +     t1      k       i
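The quadruples for this example can be generated by a short tree walk. The sketch below assumes the expression is given as nested tuples (op, left, right) with variable names as strings, and that the root of the tree is an operation; the function name to_quadruples is made up for the illustration.

```python
from itertools import count


def to_quadruples(target, tree):
    """Translate `target := tree` into a list of (op, arg1, arg2, result) tuples."""
    quads = []
    temps = count(1)                         # numbers for the temporaries t1, t2, ...

    def walk(node, result=None):
        if isinstance(node, str):            # a leaf is a variable name
            return node
        op, left, right = node
        a = walk(left)
        b = walk(right)
        dest = result or f"t{next(temps)}"   # the root writes directly to the target
        quads.append((op, a, b, dest))
        return dest

    walk(tree, result=target)
    return quads


quads = to_quadruples("i", ("+", ("+", "i", "j"), "k"))
```

For i := i + j + k this yields the two rows of the quadruple representation: (+, i, j, t1) followed by (+, t1, k, i).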
A triple has three fields:
1. Op (operator)
2. Arg1 (argument 1)
3. Arg2 (argument 2)
The triples representation of the above expression is
       Op    Arg1    Arg2
(0)    +     i       j
(1)    +     (0)     k
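The relationship between the two forms can be sketched as a conversion from quadruples to triples: the result field is dropped, and every use of a temporary is replaced by a reference to the position of the triple that computes it. The function name to_triples and the explicit set of temporaries are assumptions of this sketch. Like the table above, it does not represent the final store into i; a fuller triple form would probably need a separate assignment entry for that.

```python
def to_triples(quads, temporaries):
    """Convert (op, arg1, arg2, result) quadruples into (op, arg1, arg2) triples."""
    position = {}                  # temporary name -> reference such as "(0)"
    triples = []
    for op, a, b, result in quads:
        a = position.get(a, a)     # replace a temporary by a position reference
        b = position.get(b, b)
        if result in temporaries:
            position[result] = f"({len(triples)})"
        triples.append((op, a, b))
    return triples


quads = [("+", "i", "j", "t1"), ("+", "t1", "k", "i")]
triples = to_triples(quads, temporaries={"t1"})
```

The second triple refers to the first one as (0) instead of naming the temporary t1, which is the main difference from the quadruple form.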
Memory
Algorithm
Execution Time
Programming Language and Others
The output code must not, in any way, change the meaning of the program.
Optimization should increase the speed of the program and, if possible, the
program should demand fewer resources.
Optimization should itself be fast and should not delay the overall compilation
process.
Virtual Machine
Now that we have seen how the compiler works, we turn to the virtual machine.
A virtual machine (VM) is a software implementation of a machine that executes
programs like a physical machine. It shares physical hardware resources with
other users but isolates the OS or application, so the end-user experience does
not change.
The Java Virtual Machine is an abstract computing machine. Like a real computing
machine, it has an instruction set and manipulates various memory areas at run
time. It is reasonably common to implement a programming language using a
virtual machine. The Java Virtual Machine knows nothing of the Java programming
language, only of a particular binary format, the class file format. A class file
contains Java Virtual Machine instructions (or bytecodes) and a symbol table, as well
as other ancillary information.
For the sake of security, the Java Virtual Machine imposes strong syntactic and
structural constraints on the code in a class file. However, any language with
functionality that can be expressed in terms of a valid class file can be hosted by
the Java Virtual Machine. Attracted by a generally available, machine-independent
platform, implementers of other languages can turn to the Java Virtual Machine as a
delivery vehicle for their languages.
Architecture of JVM
Load is the part responsible for loading bytecode into memory. The class loader
subsystem loads class files from different sources using different loaders:
Bootstrap class loader: loads the core Java classes from rt.jar, which is
distributed with the JVM.
Extension class loader: loads additional application jars that reside in
jre/lib/ext.
Application class loader: loads classes from the locations specified in the
CLASSPATH environment variable or with the -cp command-line option.
Link is the phase where much of the work is done. It consists of three parts:
Verify: the bytecode is verified against the JVM class file specification.
Prepare: memory is allocated for the static variables of the class, and the
memory locations are then initialized with default values.
Resolve: all symbolic references in the current class are replaced with actual
references, for example when one class refers to another class.
Initialization This is the phase where the static variables are set to the actual
values defined in the source code, unlike Prepare, where only the default values
are set.
Runtime Data Areas This is where memory is reserved for all the parts of the
program. It consists of five parts: the method area, the heap, the Java stacks,
the PC registers, and the native method stacks.
Execution Engine
Once the instruction to be executed is ready, the Java interpreter interprets the
instruction and executes it.
Interpreter The interpreter takes a bytecode instruction, interprets it,
determines which native operation is to be performed, and executes that operation
with the help of the native method interface, which uses the native method
libraries.
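The fetch, decode and execute cycle of an interpreter can be sketched with a tiny stack machine. The opcodes PUSH, ADD, MUL and HALT below are invented for the illustration; they are not real JVM bytecodes, and the sketch ignores the native method interface entirely.

```python
def run(code):
    """Interpret a list of (opcode, operand) pairs on an operand stack."""
    stack = []
    pc = 0                                   # program counter
    while pc < len(code):
        op, arg = code[pc]                   # fetch and decode
        pc += 1
        if op == "PUSH":                     # execute
            stack.append(arg)
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "HALT":
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack


program = [("PUSH", 2), ("PUSH", 3), ("ADD", None),
           ("PUSH", 4), ("MUL", None), ("HALT", None)]
result = run(program)
```

The program computes (2 + 3) * 4 and leaves the value 20 on the stack. Every instruction is decoded again each time it runs, which is the overhead that a JIT compiler tries to remove.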
JIT Compiler JIT stands for Just-In-Time. While bytecode is being interpreted, the
JIT compiler identifies the most frequently executed code, compiles it to native
code, and optimizes it. The next time the same method runs, the optimized code is
executed directly, which eliminates the overhead of interpreting those
instructions again. A profiler keeps track of which portions of the code are
executed frequently. Some experiments suggest that Java programs using a JIT
compiler can be as fast as a compiled C program.
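The idea of counting calls and switching to compiled code once a method becomes hot can be sketched as follows. Everything here is an assumption made for illustration: the Method class, the threshold of 3, the two opcodes, and the use of a generated Python lambda as a stand-in for native code. A real JVM uses far more elaborate profiling and compilation.

```python
HOT_THRESHOLD = 3      # hypothetical value; real JVMs choose thresholds differently


class Method:
    def __init__(self, body):
        self.body = body          # list of (opcode, operand) pairs
        self.calls = 0            # profiler counter
        self.compiled = None      # set once the method becomes hot

    def interpret(self, x):
        acc = x
        for op, arg in self.body:             # decode every instruction on every call
            acc = acc + arg if op == "ADD" else acc * arg
        return acc

    def jit_compile(self):
        # Stand-in for native code generation: build one Python expression once.
        expr = "x"
        for op, arg in self.body:
            expr = f"({expr} {'+' if op == 'ADD' else '*'} {arg})"
        return eval(f"lambda x: {expr}")

    def invoke(self, x):
        if self.compiled is not None:
            return self.compiled(x)           # fast path: no re-interpretation
        self.calls += 1
        if self.calls >= HOT_THRESHOLD:
            self.compiled = self.jit_compile()
        return self.interpret(x)


m = Method([("ADD", 1), ("MUL", 2)])
results = [m.invoke(5) for _ in range(5)]
```

The first three calls are interpreted; from the fourth call on, the compiled function runs instead, and both paths return the same value.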
Garbage Collection A garbage collector's primary function is to automatically
reclaim the memory used by objects that are no longer referenced by the running
application. It may also move objects as the application runs to reduce heap
fragmentation. A garbage collector is not strictly required by the Java virtual
machine specification. The specification only requires that an implementation
manage its own heap in some manner.
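One common way to reclaim unreferenced objects is mark-and-sweep. The sketch below shows that technique on a toy heap; it is one possible approach and not something the Java virtual machine specification prescribes. The heap layout (a dictionary from object id to the ids it references) and the function name collect are assumptions for the example, and the sketch does not move objects.

```python
def collect(heap, roots):
    """heap maps object id -> list of referenced ids. Returns the reclaimed ids."""
    marked = set()
    worklist = list(roots)
    while worklist:                      # mark phase: follow references from the roots
        obj = worklist.pop()
        if obj in marked:
            continue
        marked.add(obj)
        worklist.extend(heap[obj])
    garbage = [obj for obj in heap if obj not in marked]
    for obj in garbage:                  # sweep phase: free everything left unmarked
        del heap[obj]
    return sorted(garbage)


heap = {"a": ["b"], "b": [], "c": ["d"], "d": ["c"]}
freed = collect(heap, roots=["a"])
```

Objects c and d refer to each other but cannot be reached from the root a, so both are reclaimed, while a and b survive.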
Assembler
An assembler is a program that converts assembly language into machine code. An
assembler performs the translation in a similar way to a compiler, but an
assembler translates a low-level programming language, whereas a compiler
translates a high-level programming language.
An assembler performs the following functions
Output the object program and provide other information (e.g., for linker and
loader)
Pass 1
Assign addresses to all statements in the program
Save the values (addresses) assigned to all labels (including label and
variable names) for use in Pass 2 (deal with forward references)
Perform some processing of assembler directives (e.g., BYTE, RESW, these
can affect address assignment)
Pass 2
Assemble instructions (generate opcode and look up addresses)
Generate data values defined by BYTE, WORD
Perform processing of assembler directives not done in Pass 1
Write the object program and the assembly listing
A location counter (LOCCTR) is used to help with the assignment of addresses.
LOCCTR is initialized to the beginning address specified in the START statement.
The length of each assembled instruction or data area to be generated is added to
LOCCTR.
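The two passes and the role of LOCCTR can be sketched for a tiny made-up assembly language. The assumptions are: every instruction and every WORD occupies 3 bytes, RESW reserves 3 bytes per word, source lines are (label, opcode, operand) tuples, and the opcode values in OPTAB are illustrative. The names assemble, OPTAB and symtab are chosen for the example.

```python
OPTAB = {"LDA": 0x00, "ADD": 0x18, "STA": 0x0C}   # illustrative opcode values


def assemble(lines):
    # Pass 1: assign an address to every statement and save the label addresses.
    symtab, locctr, addressed = {}, 0, []
    for label, op, operand in lines:
        if op == "START":
            locctr = int(operand, 16)     # LOCCTR starts at the START address
            continue
        if op == "END":
            break
        if label:
            symtab[label] = locctr
        addressed.append((locctr, op, operand))
        locctr += 3 * int(operand) if op == "RESW" else 3

    # Pass 2: generate object code, looking up addresses saved in Pass 1.
    obj = []
    for addr, op, operand in addressed:
        if op in OPTAB:
            obj.append((addr, f"{OPTAB[op]:02X}{symtab[operand]:04X}"))
        elif op == "WORD":
            obj.append((addr, f"{int(operand):06X}"))
        elif op != "RESW":                # RESW only reserves space
            raise ValueError(f"unknown operation: {op}")
    return symtab, obj


program = [
    ("PROG", "START", "1000"),
    (None, "LDA", "FIVE"),
    (None, "ADD", "FIVE"),
    (None, "STA", "SUM"),
    ("FIVE", "WORD", "5"),
    ("SUM", "RESW", "1"),
    (None, "END", None),
]
symtab, obj = assemble(program)
```

The labels FIVE and SUM are used before they are defined. Because Pass 1 records their addresses (hexadecimal 1009 and 100C) before any code is generated, Pass 2 can resolve these forward references.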
Linker
A linker is a programming tool that combines one or more partial object files and
libraries into a (more) complete executable object file.
Three tasks
o
o
o
Loader
Part of the OS that brings an executable file residing on disk into memory and
starts it running
Steps
o Reads the executable file's header to determine the size of the text and data
segments
o Creates a new address space for the program
o Copies instructions and data into the address space
o Copies arguments passed to the program onto the stack
o Initializes the machine registers, including the stack pointer
o Jumps to a startup routine that copies the program's arguments from the stack
to registers and calls the program's main routine
Editors
Source code editors have features specifically designed to simplify and speed up
input of source code, such as syntax highlighting, indentation, autocomplete and
bracket matching functionality. These editors also provide a convenient way to run a
compiler, interpreter, debugger, or other program relevant for the software
development process. So, while many text editors can be used to edit source code,
if they don't enhance, automate or ease the editing of code, they are not source
code editors, but simply text editors that can also be used to edit source code.
Some well-known editors are Gedit, Vim, Atom etc.
Type    Address
int     0[SB]
char    1[SB]

Assembly Code
PUSH 2
LOADL 38
STORE 1[SB]
LOAD 0[SB]
LOADL 1
CALL add
STORE 0[SB]
POP 2
HALT
Example Pascal:
Every identifier must be declared before it is used. How to handle mutual
recursion then?