Assemblers
Prepared By
Prof. Mihir N Shah
Assembler
An assembler translates source code written in assembly
language into object code.
[Figure: Source Program (mnemonic codes, symbols) -> Assembler -> Object Code]
Assembly Language Structure
<Label> <Mnemonic> <Operand> Comments
Label
Symbolic labeling of an assembler address (command
address at machine level)
Mnemonic
Symbolic description of an operation
Operands
Contain variables or addresses, if necessary
Comments
Statement format
An assembly language statement has the following format:
[Label] <opcode> <operand spec>[,<operand spec>..]
If a label is specified in a statement, it is associated as a
symbolic name with the memory word generated for the
statement.
<operand spec> has the following syntax:
<symbolic name> [+<displacement>] [(<index register>)]
E.g. AREA, AREA+5, AREA(4), AREA+5(4)
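The operand syntax above can be checked mechanically. The following is a minimal sketch, assuming symbolic names are alphanumeric and start with a letter, and that displacements and index registers are unsigned decimal numbers. The names OperandSpec and parse_operand_spec are illustrative, not part of any assembler.

```python
import re
from typing import NamedTuple, Optional


class OperandSpec(NamedTuple):
    symbol: str
    displacement: int               # 0 when no +<displacement> is given
    index_register: Optional[int]   # None when no (<index register>) is given


# <symbolic name> [+<displacement>] [(<index register>)]
_OPERAND_RE = re.compile(r"([A-Za-z][A-Za-z0-9]*)(?:\+(\d+))?(?:\((\d+)\))?")


def parse_operand_spec(text: str) -> OperandSpec:
    """Split an operand specification into its three parts."""
    match = _OPERAND_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a valid operand spec: {text!r}")
    symbol, displacement, index = match.groups()
    return OperandSpec(
        symbol,
        int(displacement) if displacement else 0,
        int(index) if index else None,
    )


for example in ("AREA", "AREA+5", "AREA(4)", "AREA+5(4)"):
    print(example, "->", parse_operand_spec(example))
```

Each of the four examples from the slide parses; a string such as 5AREA raises ValueError.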
Mnemonic Operation Codes
Each statement has two operands, first operand is always
a register and second operand refers to a memory word
using a symbolic name and optional displacement.
Operation Codes
Machine Instruction Format
Sign is not a part of the instruction
Opcode: 2 digits, Register Operand: 1 digit, Memory
Operand: 3 digits
The condition code specified in a BC statement is encoded into the
first operand using the codes 1-6 for the specifications LT, LE,
EQ, GT, GE and ANY respectively.
In a machine language program, all addresses and constants
are shown in decimal, as shown in the next slide.
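The field widths above suggest a six-digit decimal instruction. The sketch below packs the three fields that way and applies the condition-code mapping for BC. The function names are illustrative, and the opcode value 7 used for BC is an assumption, since the opcode table itself is not reproduced in this text.

```python
# Condition codes for BC, as listed above: LT..ANY map to 1..6.
CONDITION_CODES = {"LT": 1, "LE": 2, "EQ": 3, "GT": 4, "GE": 5, "ANY": 6}


def encode_instruction(opcode: int, register_operand: int, memory_operand: int) -> str:
    """Pack opcode (2 digits), register operand (1 digit), memory operand (3 digits)."""
    if not (0 <= opcode <= 99 and 0 <= register_operand <= 9
            and 0 <= memory_operand <= 999):
        raise ValueError("a field does not fit its digit width")
    return f"{opcode:02d}{register_operand:d}{memory_operand:03d}"


def encode_bc(bc_opcode: int, condition: str, target_address: int) -> str:
    """Encode a BC statement: the condition code takes the register operand's place."""
    return encode_instruction(bc_opcode, CONDITION_CODES[condition], target_address)


# Opcode 4 with register 2 and memory address 115.
print(encode_instruction(4, 2, 115))
# BC LE, <address 101>; the BC opcode value 7 is assumed for illustration.
print(encode_bc(7, "LE", 101))
```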
Assembly Language Statements
An assembly program contains three kinds of statements:
1. Imperative Statements
2. Declaration Statements
3. Assembler Directives
Imperative Statements: They indicate an action to be
performed during the execution of an assembled program.
Each imperative statement is translated into one machine
instruction.
Use of Constants
The DC statement does not really implement constants; it just
initializes memory words to given values.
An assembly program can use constants just like an HLL, in two ways:
as immediate operands and as literals.
1. Immediate operands can be used in an assembly statement only if
the architecture of the target machine includes the necessary
features.
Ex: ADD AREG, 5
Assembler Directive
Assembler directives instruct the assembler to perform certain
actions during the assembly of a program.
Some assembler directives are described in the following:
1) START <constant>
This directive indicates that the first word of the target program
generated by the assembler should be placed in the memory word
having address <constant>.
2) END [<operand spec>]
This directive indicates the end of the source program. The
optional <operand spec> indicates the address of the instruction
where the execution of the program should begin.
Advantages of Assembly Language
The primary advantages of assembly language programming
over machine language programming are due to the use of
symbolic operand specifications.
(in comparison to machine language program)
Assembly language programming holds an edge over HLL
programming in situations where it is desirable to use
architectural features of a computer.
(in comparison to high level language program)
Fundamentals of LP
Language processing = analysis of source program + synthesis of target program
Analysis of the source program is based on the specification of the source
language:
Lexical rules: formation of valid lexical units (tokens) in the
source language
Syntax rules: formation of valid statements in the source
language
Semantic rules: associate meaning with valid statements of
the language
Synthesis Phase
Consider the following statement:
MOVER BREG, ONE
The following information is needed to synthesize the machine instruction for
this statement:
Address of the memory word with which name ONE is associated
[depends on the source program, hence made available by the
Analysis phase].
Machine operation code corresponding to MOVER [does not depend
on the source program but depends on the assembly language, hence
synthesis phase can determine this information for itself]
Note: Based on above discussion, the two data structures required during the
synthesis phase are described next
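As a rough illustration of the synthesis step for MOVER BREG, ONE, the two data structures can be pictured as dictionaries. The opcodes are those of the deck's mnemonic table. The register codes and the address 115 for ONE are assumptions made only for this sketch.

```python
from typing import Dict

# Depends only on the assembly language, so it is fixed inside the assembler.
MNEMONIC_TABLE: Dict[str, int] = {"ADD": 1, "SUB": 2, "MULT": 3, "MOVER": 4}

# Assumed register numbering; the deck does not list register codes.
REGISTER_CODES: Dict[str, int] = {"AREG": 1, "BREG": 2, "CREG": 3, "DREG": 4}


def synthesize(statement: str, symbol_table: Dict[str, int]) -> str:
    """Build the 6-digit machine instruction for '<mnemonic> <register>, <symbol>'."""
    mnemonic, operands = statement.split(None, 1)
    register, symbol = [part.strip() for part in operands.split(",")]
    opcode = MNEMONIC_TABLE[mnemonic]          # known to the synthesis phase itself
    address = symbol_table[symbol]             # supplied by the analysis phase
    return f"{opcode:02d}{REGISTER_CODES[register]}{address:03d}"


# Hypothetical symbol table entry; the real address depends on the source program.
symbol_table = {"ONE": 115}
print(synthesize("MOVER BREG, ONE", symbol_table))
```

The opcode lookup needs nothing from the source program, while the address lookup fails with KeyError unless the analysis phase has already entered ONE in the symbol table.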
Analysis Phase
The primary function of the analysis phase is to build the symbol
table.
It must determine the addresses with which the symbolic
names used in a program are associated.
It is possible to determine some addresses directly, like the
address of the first instruction in the program (i.e., START).
Other addresses must be inferred.
To determine the address of a symbolic name, we need
to fix the addresses of all program elements preceding it
through memory allocation.
To implement memory allocation a data structure called
location counter is introduced.
For Example
Symbol Table (one entry survives: address 103; its symbol name is missing)
Mnemonic Table
Mnemonic   Opcode   Length
MOVER      04
MULT       03
[Figure: Source Program -> Analysis Phase -> Synthesis Phase -> Target Program, with a Mnemonic Table (columns Mnemonic, Opcode, Length; entries ADD 01, SUB 02) and a Symbol Table (columns Symbol, Address; entries N 104, AGAIN 113). Legend: -----> data access, --> control access.]
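The analysis phase's use of the location counter can be sketched as follows. This is a simplified illustration, not the deck's algorithm: it assumes every imperative or DC statement occupies exactly one memory word, that a statement's first field is a label whenever it is not a known mnemonic, and the sample program is made up for the purpose.

```python
from typing import Dict

KNOWN_MNEMONICS = {"ADD", "SUB", "MULT", "MOVER", "DC"}

# Hypothetical program used only to exercise the sketch.
PROGRAM = """
        START 101
        MOVER AREG, N
AGAIN   MULT  AREG, N
        SUB   AREG, ONE
N       DC    5
ONE     DC    1
        END
"""


def build_symbol_table(source: str) -> Dict[str, int]:
    """Assign an address to every label by advancing a location counter."""
    location_counter = 0
    symbol_table: Dict[str, int] = {}
    for line in source.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "START":
            location_counter = int(fields[1])   # address known directly
            continue
        if fields[0] == "END":
            break
        if fields[0] not in KNOWN_MNEMONICS:
            # First field is a label: its address is the current LC value.
            symbol_table[fields[0]] = location_counter
        location_counter += 1                   # assumption: one word per statement
    return symbol_table


print(build_symbol_table(PROGRAM))
```

The address of N can only be fixed after the three preceding statements have been allocated, which is the memory allocation step described above.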
Advanced Assembler Directives
1. ORIGIN
The syntax of this directive is
ORIGIN <address specification>
where <address specification> is an <operand specification> or
<constant>.
This directive instructs the assembler to put the address given by
<address specification> in the location counter.
The ORIGIN statement is useful when the target program does not
consist of a single contiguous area of memory.
The ability to use an <operand specification> in the ORIGIN statement
provides the ability to change the address in the location counter in a
relative rather than absolute manner.
2. EQU
The EQU directive has the syntax
<symbol> EQU <address specification>
where <address specification> is either a <constant> or
<symbolic name> ± <displacement>.
The EQU statement simply associates the name <symbol> with
the address specified by <address specification>. However, the
address in the location counter is not affected.
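The difference between ORIGIN and EQU comes down to what is done with the evaluated address specification. The sketch below assumes an address specification is a decimal constant or a symbolic name with an optional + or - displacement. The function names and the symbol LOOP with address 202 are hypothetical.

```python
import re
from typing import Dict


def evaluate_address_spec(spec: str, symbol_table: Dict[str, int]) -> int:
    """Evaluate <constant> or <symbolic name>[+/-<displacement>] to an address."""
    spec = spec.replace(" ", "")
    if spec.isdigit():
        return int(spec)
    match = re.fullmatch(r"([A-Za-z][A-Za-z0-9]*)(?:([+-])(\d+))?", spec)
    if match is None:
        raise ValueError(f"bad address specification: {spec!r}")
    name, sign, displacement = match.groups()
    address = symbol_table[name]
    if displacement:
        address += int(displacement) if sign == "+" else -int(displacement)
    return address


def origin(spec: str, symbol_table: Dict[str, int]) -> int:
    """ORIGIN <spec>: return the new value for the location counter."""
    return evaluate_address_spec(spec, symbol_table)


def equ(symbol: str, spec: str, symbol_table: Dict[str, int]) -> None:
    """<symbol> EQU <spec>: define the symbol; the location counter is untouched."""
    symbol_table[symbol] = evaluate_address_spec(spec, symbol_table)


symbol_table = {"LOOP": 202}      # hypothetical, as if defined earlier in the program
location_counter = 217

equ("BACK", "LOOP", symbol_table)                    # BACK now names address 202
print(symbol_table["BACK"], location_counter)        # location counter still 217

location_counter = origin("LOOP+2", symbol_table)    # relative change of the LC
print(location_counter)
```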
3. LTORG
The LTORG directive, which stands for 'origin for literals', allows a
programmer to specify where literals should be placed.
The assembler uses the following scheme for placement of literals:
When the use of a literal is seen in a statement, the assembler enters
it into a literal pool unless a matching literal already exists in the
pool.
At every LTORG statement, as also at the END statement, the
assembler allocates memory to the literals of the literal pool and
clears the literal pool.
This way, a literal pool would contain all literals used in the program
since the start of the program or since the previous LTORG
statement.
Thus, all references to literals are forward references by definition.
If a program does not use an LTORG statement, the assembler would
enter all literals used in the program into a single pool and allocate
memory to them when it encounters the END statement.
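The placement scheme above can be traced with a small sketch. It assumes literals are written as ='value', that every ordinary statement occupies one memory word, and that LTORG itself occupies none. The statements and the start address 200 are made up for illustration.

```python
import re
from typing import List, Tuple


def place_literals(statements: List[str], start_address: int) -> List[Tuple[str, int]]:
    """Return (literal, address) pairs in the order memory is allocated to them."""
    location_counter = start_address
    pool: List[str] = []                    # literals seen since the last LTORG
    placements: List[Tuple[str, int]] = []
    for statement in statements:
        if statement in ("LTORG", "END"):
            for literal in pool:            # allocate memory to the whole pool
                placements.append((literal, location_counter))
                location_counter += 1
            pool.clear()                    # start a fresh pool
            if statement == "END":
                break
            continue
        for literal in re.findall(r"='[^']*'", statement):
            if literal not in pool:         # a matching literal is not entered twice
                pool.append(literal)
        location_counter += 1               # assumption: one word per statement
    return placements


statements = [
    "MOVER AREG, ='5'",
    "ADD AREG, ='1'",
    "ADD BREG, ='5'",
    "LTORG",
    "SUB AREG, ='1'",
    "END",
]
for literal, address in place_literals(statements, 200):
    print(literal, address)
```

Every literal is allocated at an address higher than the statements that use it, which is why references to literals are forward references. The literal ='1' is allocated twice because the second use falls in a new pool.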
SYMTAB
A SYMTAB entry contains the fields symbol name, address and
length.
Some addresses can be determined directly, e.g. the address of the first
instruction in the program; others must be inferred.
To find the address of a symbol, we must fix the addresses of all program
elements preceding it. This function is called memory allocation.
LITTAB
A table of literals used in the program.
A LITTAB entry contains the fields literal and address.
The first pass uses LITTAB to collect all literals used in a program.
POOLTAB
Awareness of different literal pools is maintained using the auxiliary
table POOLTAB.
This table contains the literal number of the starting literal of each
literal pool.
At any stage, the current literal pool is the last pool in the LITTAB.
On encountering an LTORG statement (or the END statement), literals
in the current pool are allocated addresses starting with the current
value in LC and LC is appropriately incremented.
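One possible way to hold LITTAB and POOLTAB in memory is sketched below. The class and method names are illustrative, literal numbers are 0-based list indices here, and each literal is assumed to occupy one memory word.

```python
from typing import List, Optional


class LiteralTables:
    """LITTAB as a list of [literal, address] entries plus POOLTAB as start indices."""

    def __init__(self) -> None:
        self.littab: List[List[Optional[object]]] = []
        self.pooltab: List[int] = [0]       # index of the first literal of each pool

    def use_literal(self, literal: str) -> None:
        """Enter a literal into the current pool unless it is already there."""
        start = self.pooltab[-1]            # current pool = last pool in LITTAB
        for entry in self.littab[start:]:
            if entry[0] == literal:
                return
        self.littab.append([literal, None])  # address is filled in at LTORG/END

    def allocate_pool(self, location_counter: int) -> int:
        """On LTORG or END: give addresses to the current pool, return the new LC."""
        start = self.pooltab[-1]
        for entry in self.littab[start:]:
            entry[1] = location_counter
            location_counter += 1           # assumption: one word per literal
        self.pooltab.append(len(self.littab))  # where the next pool will begin
        return location_counter


tables = LiteralTables()
tables.use_literal("='5'")
tables.use_literal("='1'")
tables.use_literal("='5'")                  # already in the current pool
lc_after_ltorg = tables.allocate_pool(203)  # LTORG seen with LC = 203
tables.use_literal("='1'")                  # new pool, so entered again
lc_after_end = tables.allocate_pool(206)    # END seen with LC = 206
print(tables.littab)
print(tables.pooltab)
```

In this sketch the last POOLTAB entry after END points one past the final literal, so it marks where a further pool would start rather than an existing pool.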
[Figure: fields of an intermediate code unit: Address, Opcode, Operands]
Variant II
This variant differs from Variant I of the intermediate code
in that, in Variant II, symbols, condition codes and CPU
registers are not processed.
So no intermediate code is generated for them during Pass I.
Example 2
SYMTAB
Pass I Algorithm
Pass II Algorithm
Thank You