Documente Academic
Documente Profesional
Documente Cultură
Assembly Languages
Assembler
Pu-Jen Cheng
Source Object
j
Assembler Linker
Program Code
Loader
Overview
SIC Machine Architecture
SIC/XE Machine Architecture
Design and Implementation of Assembler
Source Object
j
Assembler Linker
Program Code
Loader
SIC Machine Architecture
The Simplified Instructional Computer (SIC)
Memory
¾ Memory consists of 8-bit bytes
¾ Any 3 consecutive bytes form a word (24 bits)
¾ Total 768 ((215) bytes
o oof 332768 by es in thee computer
co pu e memory
e oy
Registers
¾ Five 24-bits registers
SIC Machine Architecture
Data Formats
¾ Integers are stored as 24-bit binary number
¾ 2’s complement representation for negative values
¾ Characters are stored using 8-bit ASCII codes
¾ No floating-point hardware on the standard version of SIC
Instruction Formats
8 1 15
¾ Standard version of SIC
opcode x address
Mode Indication Target address calculation
¾ Subroutine linkage
JSUB, RSUB
LDA FIVE
STA ALPHA
LDCH CHARZ
STCH C1
.
.
.
ALPHA RESW 1 one-word variable
FIVE WORD 5 one-word constant
CHARZ BYTE C’Z’ one-byte constant
C1 RESB 1 one-byte variable
SIC Programming Example
LDX ZERO initialize index register to 0
MOVECH LDCH STR1,X load char from STR1 to reg A
STCH STR2,X
TIX ELEVEN add 1 to index, compare to 11
JLT MOVECH loop if “less than”
.
.
.
STR1 BYTE C’TEST STRING’
STR2 RESB 11
ZERO WORD 0
ELEVEN WORD 11
SIC Programming Example
LDA ZERO initialize index value to 0
STA INDEX
ADDLP LDX INDEX load index value to reg X
LDA ALPHA,X load word from ALPHA into reg A
ADD BETA,X
STA GAMMA,X store the result in a word in GAMMA
LDA INDEX
ADD THREE add 3 to index value
STA INDEX
COMP K300 compare new index value to 300
JLT ADDLP loop if less than 300
...
...
INDEX RESW 1
ALPHA RESW 100 array variables—100 words each
BETA RESW 100
GAMMA RESW 100
ZERO WORD 0 one-word constants
THREE WORD 3
K300 WORD 300
SIC Programming Example
INLOOP TD INDEV test input device
JEQ INLOOP loop until device is ready (<)
RD INDEV read one byte into register A
STCH DATA
.
.
OUTLP TD OUTDEV test output device
JEQ OUTLP loop until device is ready (<)
LDCH DATA
WD OUTDEV write one byte to output device
.
.
INDEV BYTE X’F1’ input device number
OUTDEV BYTE X’05’ output device number
DATA RESB 1
Overview
SIC Machine Architecture
SIC/XE Machine Architecture
Design and Implementation of Assembler
Source Object
j
Assembler Linker
Program Code
Loader
SIC/XE Machine Architecture
An XE version (upward compatible)
Memory
¾ Maximum 1 megabyte (220 bytes)
Registers
¾ Additional registers are provided by SIC/XE
Support 48-bit floating-point data type
SIC/XE Machine Architecture
Instruction Formats
8
Format 1 (1 byte) op
8 4 4
Format 2 (2 bytes) op r1 r2
6 1 1 1 1 1 1 12
Format 3 (3 bytes) op n i x b p e disp
6 1 1 1 1 1 1 20
Format 4 (4 bytes) op n i x b p e address
n i x b p e
opcode 1 1 0 1 disp
nn=11, i=1
i 1, bb=0
0, pp=00, TA
TA=disp
disp (0≤disp ≤4095)
n i x b p e
opcode 1 1 1 0 0 disp
n i x b p e
opcode 1 0 0 disp
ii=00, n=0
n 0, TA
TA=bpe+disp
bpe+disp (SIC standard)
n i x b p e
opcode 1 1 disp
Instruction Format
SIC/XE Machine Architecture
Instruction Set
¾ Instructions to load and store the new registers
LDB, STB, etc.
Source Object
j
Assembler Linker
Program Code
Loader
Design & Implementation of Assembler
U variable
Use i bl names instead
i t d off memory addresses
dd
¾ Labels (for jump instructions)
¾ Subroutines COPY START 1000
¾ Constants …
LDA LEN
…
forward references …
LEN RESW 1
Two Pass Assembler
Pass 1
¾ Assign addresses (LOC) to all statements in the program
¾ Save the values assigned to all labels for use in Pass 2
¾ Perform some processing of assembler directives
Pass 2
¾ Assemble instructions
¾ Generate data values defined by BYTE, WORD
¾ Perform processing of assembler directives not done in Pass 1
¾ Write the object program and the assembly listing
Two Pass Assembler
Source
program
Intermediate Object
Pass 1 Pass 2
file codes
Data Structures:
Operation Code Table (OPTAB)
Symbol Table (SYMTAB)
Location Counter(LOCCTR)
Two Pass Assembler – Pass 1
Two Pass Assembler – Pass 2
OPTAB (operation code table)
Content
¾ Mnemonic, machine code (instruction format, length) etc.
Characteristic
¾ Static table
Implementation
¾ Array or hash table, easy for search
SYMTAB (symbol table)
Content COPY 1000
FIRST 1000
¾ Label name, value, flag, (type, length) etc.
CLOOP 1003
Characteristic ENDFIL 1015
EOF 1024
¾ Dynamic table (insert, delete, search)
THREE 102D
Implementation ZERO 1030
RETADR 1033
¾ Hash table, non-random keys, hashing LENGTH 1036
function BUFFER 1039
RDREC 2039
Design & Implementation of Assembler
Header
Col. 1 H
Col. 2~7 Program name
Col. 8~13 Starting address (hex)
Col. 14-19 Length of object program in bytes (hex)
Text
Col.1 T
Col.2~7 Starting address in this record (hex)
Col. 8~9 Length of object code in this record in bytes (hex)
Col. 10~69 Object code (69-10+1)/6=10 instructions
End
Col.1 E
Col.2~7 Address of first executable instruction (hex)
(END program_name)
Line Loc Source statement Object code
17 20 2D
Instruction Format and Addressing Mode
SIC/XE
¾ PC-relative or Base-relative addressing: op m
¾ Indirect addressing: op @m
¾ Immediate addressing: op #c
¾ Extended format: +op m
¾ Index addressing: op m,x
¾ register-to-register instructions
¾ larger memory -> multi-programming (program allocation)
Translation
Register translation
¾ Register name (A, X, L, B, S, T, F, PC, SW) and their values
(0,1, 2, 3, 4, 5, 6, 8, 9)
¾ Preloaded in SYMTAB
Address translation
¾ Most register-memory instructions use program counter
relative or base relative addressing
¾ Format 3: 12-bit address field
Base-relative: 0~4095
PC-relative: -2048~2047
OPCODE n i x b p e Address
0001 01 1 1 0 0 1 0 (02D)16
OPCODE n i x b p e Address
0011 11 1 1 0 0 1 0 (FEC)16
Displacement= CLOOP-PC= 6 - 1A= -14= FEC
Base-Relative Addressing Modes
Base-relative
¾ Base register is under the control of the programmer
¾ 12 LDB #LENGTH
¾ 13 BASE LENGTH
¾ 160 104E STCH BUFFER, X 57C003
OPCODE n i x b p e Address
0101 01 1 1 1 1 0 0 (003)16
Immediate addressing
¾ 55 0020 LDA #3 010003
OPCODE n i x b p e Address
0000 00 0 1 0 0 0 0 (003)16
OPCODE n i x b p e Address
0111 01 0 1 0 0 0 1 (01000)16
Immediate Address Translation
Immediate addressing
¾ 12 0003 LDB #LENGTH 69202D
OPCODE n i x b p e Address
0110 10 0 1 0 0 1 0 (02D)16
¾ 12 0003 LDB #LENGTH 690033
OPCODE n i x b p e Address
0110 10 0 1 0 0 0 0 (033)16
0011 11 1 0 0 0 1 0 (003)16
TA=RETADR=0030
TA=(PC)+disp=002D+0003
Design & Implementation of Assembler
Modification record
¾ Col 1 M
¾ Col 2-7 Starting location of the address field to be
modified, relative to the beginning of the program
¾ Col 8-9
8 9 length
l th off the
th address
dd field
fi ld to
t be
b modified,
difi d in
i half-
h lf
bytes
Object Code with Modification Record
Design & Implementation of Assembler
Solution
¾ Require
equ e all such
suc areas
e s be defined
de ed before
be o e they
ey aree referenced
e e e ced
¾ Labels on instructions: no good solution
Two types of one-pass assemblers
¾ Load-and-go
Produces object code directly in memory for immediate
execution
¾ The other
Produces usual kind of object code for later execution
Load-and-go Assembler
Characteristics
¾ Useful for program development and testing
¾ Avoids the overhead of writing the object program out and
reading it back
¾ Both one-pass
p and two-pass
p assemblers can be designed
g as
load-and-go.
¾ However one-pass also avoids the over head of an additional
pass over the source program
¾ For a load-and-go assembler, the actual address must be
known at assembly time, we can use an absolute program
Load-and-go Assembler
Forward references handling
1. Omit the address translation
2. Insert the symbol into SYMTAB, and mark this symbol
undefined
3. The address that refers to the undefined symbol is added to a
list of forward references associated with the symbol table
entry
4. When the definition for a symbol is encountered, the proper
address for the symbol is then inserted into any instructions
previous generated according to the forward reference list
Load-and-go Assembler
At the end of the program
¾ Any SYMTAB entries that are still marked with * indicate
undefined symbols
¾ Search SYMTAB for the symbol named in the END statement
and jjumpp to this location to begin
g execution
The actual starting address must be specified at
assembly time
After Scanning Line 40
After Scanning Line 160
Object Code
Multi-Pass Assemblers
Restriction on EQU
¾ No forward reference, since symbols’ value can’t be defined
during the first pass
Example
¾ Use link list to keep track of whose
hose value
al e depend on an
undefined symbol
Example of Multi-pass Assembler
Example of Multi-pass Assembler
Example of Multi-pass Assembler
Example of Multi-pass Assembler
Example of Multi-pass Assembler