
Intel 8086 microprocessor architecture

Memory

Program, data and stack memories occupy the same memory space. The total addressable memory size is 1 MB. Because most of
the processor instructions use 16-bit pointers, the processor can directly address only 64 KB of memory at a time. To access memory
outside of that 64 KB the CPU uses special segment registers to specify where the code, stack and data 64 KB segments are positioned
within the 1 MB of memory (see the "Registers" section below).

16-bit pointers and data are stored as:


address: low-order byte
address+1: high-order byte
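
A minimal sketch of this byte order, written in Python purely for illustration. The helper names `store_word` and `load_word` and the `bytearray` standing in for memory are assumptions of this example, not part of any 8086 tool:

```python
def store_word(memory: bytearray, address: int, value: int) -> None:
    """Store a 16-bit value the way the 8086 does: low-order byte first."""
    memory[address] = value & 0xFF              # low-order byte at address
    memory[address + 1] = (value >> 8) & 0xFF   # high-order byte at address+1


def load_word(memory: bytearray, address: int) -> int:
    """Read back a 16-bit value that was stored low-order byte first."""
    return memory[address] | (memory[address + 1] << 8)


memory = bytearray(4)          # a tiny stand-in for system memory
store_word(memory, 0, 0x35DA)
print([f"{b:02X}" for b in memory[:2]])   # ['DA', '35']
print(f"{load_word(memory, 0):04X}")      # 35DA
```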

32-bit addresses in "segment:offset" format (far pointers) are stored with the offset word first, followed by the segment word:


address: low-order byte of offset
address+1: high-order byte of offset
address+2: low-order byte of segment
address+3: high-order byte of segment

The physical memory address pointed to by a segment:offset pair is calculated as:

address = (<segment> * 16) + <offset>
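
The formula can be tried out with a short Python sketch. The function name `physical_address` is an assumption of this example; the mask reflects the 20 address lines of the 8086, so a sum that exceeds 1 MB wraps around to the bottom of memory:

```python
def physical_address(segment: int, offset: int) -> int:
    """Combine a 16-bit segment and a 16-bit offset into a 20-bit address."""
    # Multiplying by 16 is the same as shifting left by four bits.
    # Only 20 address lines exist, so the result wraps at 1 MB.
    return ((segment << 4) + offset) & 0xFFFFF


print(hex(physical_address(0x1234, 0x0010)))  # 0x12350
print(hex(physical_address(0xFFFF, 0x0000)))  # 0xffff0 (the reset address)
print(hex(physical_address(0xFFFF, 0x0010)))  # 0x0 (wrapped past 1 MB)
```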

Program memory - a program can be located anywhere in memory. Jump and call instructions can be used for near transfers within the
currently selected 64 KB code segment, as well as for far transfers anywhere within the 1 MB of memory. All conditional jump
instructions are short jumps: they can reach only targets within -128 to +127 bytes of the instruction that follows the jump.
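
To make the range of a conditional jump concrete, here is a hedged Python sketch. It assumes the usual encoding of a conditional jump as two bytes (opcode plus a signed 8-bit displacement) with the displacement measured from the address of the following instruction; the function name `short_jump_target` is invented for this example:

```python
def short_jump_target(instr_offset: int, disp8: int) -> int:
    """Target offset of a 2-byte conditional jump located at instr_offset."""
    # Sign-extend the 8-bit displacement: 80h..FFh mean -128..-1.
    signed = disp8 - 0x100 if disp8 & 0x80 else disp8
    next_ip = instr_offset + 2            # opcode byte + displacement byte
    return (next_ip + signed) & 0xFFFF    # IP stays within the 64 KB segment


print(hex(short_jump_target(0x0100, 0x7F)))  # 0x181, farthest forward
print(hex(short_jump_target(0x0100, 0x80)))  # 0x82, farthest backward
print(hex(short_jump_target(0x0100, 0xFE)))  # 0x100, a jump to itself
```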

Data memory - the processor can access data in any one of the 4 available segments, which limits the size of accessible memory
to 256 KB (if all four segments point to different 64 KB blocks). Data in the Data, Code, Stack or Extra segments can
usually be accessed by prefixing instructions with the DS:, CS:, SS: or ES: segment override (some registers and instructions
use the ES or SS segment by default instead of the DS segment).

Word data can be located at odd or even byte boundaries. The processor uses two memory accesses to read a 16-bit word located at
an odd byte boundary. Reading word data from an even byte boundary requires only one memory access.

Stack memory can be placed anywhere in memory. The stack can be located at odd memory addresses, but it is not recommended
for performance reasons (see "Data Memory" above).

Reserved locations:

• 0000h - 03FFh are reserved for interrupt vectors. Each interrupt vector is a 32-bit pointer in format segment:offset.
• FFFF0h - FFFFFh - after RESET the processor always starts program execution at the FFFF0h address.

Interrupts

The processor has the following interrupts:

INTR is a maskable hardware interrupt. The interrupt can be enabled/disabled using the STI/CLI instructions or using the more
complicated method of updating the FLAGS register with the help of the POPF instruction. When an interrupt occurs, the
processor fetches from the bus one byte representing the interrupt type, pushes the FLAGS register onto the stack, disables further
interrupts, pushes the return address (CS and IP), and jumps to the interrupt processing routine, the address of which is stored in
location 4 * <interrupt type>. The interrupt processing routine should return with the IRET instruction.

NMI is a non-maskable interrupt. It is processed in the same way as the INTR interrupt. The interrupt type of the NMI is 2, i.e.
the address of the NMI processing routine is stored in location 0008h. This interrupt has higher priority than the maskable
interrupt.
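
The "4 * <interrupt type>" rule can be checked with a few lines of Python. This is only an illustrative sketch; the names `VECTOR_SIZE`, `VECTOR_COUNT` and `vector_address` are assumptions of the example:

```python
VECTOR_SIZE = 4      # each vector is a 32-bit segment:offset pointer
VECTOR_COUNT = 256   # interrupt types 0..255


def vector_address(interrupt_type: int) -> int:
    """Memory location that holds the vector for the given interrupt type."""
    if not 0 <= interrupt_type < VECTOR_COUNT:
        raise ValueError("interrupt type must be in the range 0..255")
    return VECTOR_SIZE * interrupt_type


print(f"{vector_address(2):04X}h")                      # 0008h (NMI)
print(f"{vector_address(255) + VECTOR_SIZE - 1:04X}h")  # 03FFh (end of table)
```

The second line printed agrees with the reserved range 0000h - 03FFh given for the interrupt vectors.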

Software interrupts can be caused by:

• INT 3 instruction - breakpoint interrupt. This is a type 3 interrupt.
• INT <interrupt number> instruction - any one of the 256 available interrupts.
• INTO instruction - interrupt on overflow
• Single-step interrupt - generated if the TF flag is set. This is a type 1 interrupt. When the CPU processes this interrupt it
clears TF flag before calling the interrupt processing routine.
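• Processor exceptions: divide error (type 0). The unused opcode (type 6) and escape opcode (type 7) exceptions are generated
only by later processors such as the 80186 and 80286, not by the 8086 itself.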

Software interrupt processing is the same as for the hardware interrupts.

I/O ports

The processor can address 65536 8-bit I/O ports. These ports can also be addressed as 32768 16-bit I/O ports.

Registers

Most of the registers contain data/instruction offsets within 64 KB memory segment. There are four different 64 KB segments for
instructions, stack, data and extra data. To specify where in 1 MB of processor memory these 4 segments are located the processor
uses four segment registers:

Code segment (CS) is a 16-bit register containing the address of the 64 KB segment with processor instructions. The processor uses
the CS segment for all accesses to instructions referenced by the instruction pointer (IP) register. The CS register cannot be changed
directly. It is updated automatically during far jump, far call and far return instructions.

Stack segment (SS) is a 16-bit register containing the address of the 64 KB segment with the program stack. By default, the processor
assumes that all data referenced by the stack pointer (SP) and base pointer (BP) registers is located in the stack segment. The SS
register can be changed directly using the MOV and POP instructions.

Data segment (DS) is a 16-bit register containing the address of the 64 KB segment with program data. By default, the processor assumes
that all data referenced by the general registers (AX, BX, CX, DX) and index registers (SI, DI) is located in the data segment. The DS
register can be changed directly using the MOV, POP and LDS instructions.

Extra segment (ES) is a 16-bit register containing the address of a 64 KB segment, usually with program data. By default, the processor
assumes that the DI register references the ES segment in string manipulation instructions. The ES register can be changed directly
using the MOV, POP and LES instructions.

It is possible to change default segments used by general and index registers by prefixing instructions with a CS, SS, DS or ES
prefix.

All general registers of the 8086 microprocessor can be used for arithmetic and logic operations. The general registers are:

Accumulator register consists of 2 8-bit registers AL and AH, which can be combined together and used as a 16-bit register AX.
AL in this case contains the low-order byte of the word, and AH contains the high-order byte. Accumulator can be used for I/O
operations and string manipulation.

Base register consists of 2 8-bit registers BL and BH, which can be combined together and used as a 16-bit register BX. BL in this
case contains the low-order byte of the word, and BH contains the high-order byte. BX register usually contains a data pointer
used for based, based indexed or register indirect addressing.

Count register consists of 2 8-bit registers CL and CH, which can be combined together and used as a 16-bit register CX. When
combined, CL register contains the low-order byte of the word, and CH contains the high-order byte. Count register can be used as
a counter in string manipulation and shift/rotate instructions.

Data register consists of two 8-bit registers DL and DH, which can be combined together and used as a 16-bit register DX. When
combined, the DL register contains the low-order byte of the word, and DH contains the high-order byte. The data register can be used
as a port number in I/O operations. In word multiply and divide instructions, which work with a 32-bit product or dividend, the DX
register contains the high-order word of the initial or resulting number.
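
The role of DX as the high-order word can be sketched in Python. This models only the arithmetic of an unsigned 16-bit by 16-bit multiply; the function name `mul16` is an assumption made for this illustration:

```python
def mul16(ax: int, operand: int):
    """Unsigned 16 x 16 multiply; returns the 32-bit product as (DX, AX)."""
    product = (ax & 0xFFFF) * (operand & 0xFFFF)
    dx = (product >> 16) & 0xFFFF   # high-order word goes to DX
    new_ax = product & 0xFFFF       # low-order word stays in AX
    return dx, new_ax


dx, ax = mul16(0xFFFF, 0xFFFF)
print(f"DX={dx:04X} AX={ax:04X}")   # DX=FFFE AX=0001
dx, ax = mul16(0x1234, 0x5678)
print(f"DX={dx:04X} AX={ax:04X}")   # DX=0626 AX=0060
```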

The following registers are both general and index registers:

Stack Pointer (SP) is a 16-bit register pointing to program stack.

Base Pointer (BP) is a 16-bit register pointing to data in stack segment. BP register is usually used for based, based indexed or
register indirect addressing.

Source Index (SI) is a 16-bit register. SI is used for indexed, based indexed and register indirect addressing, as well as a source
data address in string manipulation instructions.

Destination Index (DI) is a 16-bit register. DI is used for indexed, based indexed and register indirect addressing, as well as a
destination data address in string manipulation instructions.

Other registers:

Instruction Pointer (IP) is a 16-bit register.

Flags is a 16-bit register containing nine 1-bit flags:

• Overflow Flag (OF) - set if the result is too large positive number, or is too small negative number to fit into destination
operand.
• Direction Flag (DF) - if set then string manipulation instructions will auto-decrement index registers. If cleared then the
index registers will be auto-incremented.
• Interrupt-enable Flag (IF) - setting this bit enables maskable interrupts.
• Single-step Flag (TF) - if set then single-step interrupt will occur after the next instruction.
• Sign Flag (SF) - set if the most significant bit of the result is set.
• Zero Flag (ZF) - set if the result is zero.
• Auxiliary carry Flag (AF) - set if there was a carry from or borrow to bits 0-3 in the AL register.
• Parity Flag (PF) - set if parity (the number of "1" bits) in the low-order byte of the result is even.
• Carry Flag (CF) - set if there was a carry from or borrow to the most significant bit during last result calculation.
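
As a rough illustration of the six status flags, the following Python sketch models an 8-bit addition. It is a simplified model written for this text, not a description of the processor's internal logic, and the function name `add8_flags` is an assumption:

```python
def add8_flags(a: int, b: int) -> dict:
    """Add two 8-bit values and report the result with the status flags."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    return {
        "result": result,
        "CF": int(total > 0xFF),                  # carry out of bit 7
        "ZF": int(result == 0),                   # result is zero
        "SF": result >> 7,                        # most significant bit
        # Overflow: both operands differ in sign from the result.
        "OF": int(((a ^ result) & (b ^ result) & 0x80) != 0),
        "AF": int((a & 0x0F) + (b & 0x0F) > 0x0F),    # carry out of bit 3
        "PF": int(bin(result).count("1") % 2 == 0),   # even number of 1 bits
    }


print(add8_flags(0x7F, 0x01))
# {'result': 128, 'CF': 0, 'ZF': 0, 'SF': 1, 'OF': 1, 'AF': 1, 'PF': 0}
print(add8_flags(0xFF, 0x01))
# {'result': 0, 'CF': 1, 'ZF': 1, 'SF': 0, 'OF': 0, 'AF': 1, 'PF': 1}
```

The two calls show the difference between signed overflow (7Fh + 1 sets OF but not CF) and unsigned carry (FFh + 1 sets CF but not OF).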

Instruction Set

8086 instruction set consists of the following instructions:

• Data moving instructions.


• Arithmetic - add, subtract, increment, decrement, convert byte/word and compare.
• Logic - AND, OR, exclusive OR, shift/rotate and test.
• String manipulation - load, store, move, compare and scan for byte/word.
• Control transfer - conditional, unconditional, call subroutine and return from subroutine.
• Input/Output instructions.
• Other - setting/clearing flag bits, stack operations, software interrupts, etc.

Addressing modes

Implied - the data value/data address is implicitly associated with the instruction.

Register - references the data in a register or in a register pair.

Immediate - the data is provided in the instruction.

Direct - the instruction operand specifies the memory address where data is located.

Register indirect - instruction specifies a register containing an address, where data is located. This addressing mode works with
SI, DI, BX and BP registers.

Based - 8-bit or 16-bit instruction operand is added to the contents of a base register (BX or BP), the resulting value is a pointer to
location where data resides.

Indexed - 8-bit or 16-bit instruction operand is added to the contents of an index register (SI or DI), the resulting value is a pointer
to location where data resides.

Based Indexed - the contents of a base register (BX or BP) is added to the contents of an index register (SI or DI), the resulting
value is a pointer to location where data resides.

Based Indexed with displacement - 8-bit or 16-bit instruction operand is added to the contents of a base register (BX or BP) and
index register (SI or DI), the resulting value is a pointer to location where data resides.
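
The memory addressing modes above differ only in which terms take part in the sum. A small Python sketch, with the invented helper names `effective_address` and `default_segment`, shows the calculation. It assumes the displacement has already been sign-extended to a Python integer and that the offset wraps within the 64 KB segment:

```python
def effective_address(base: int = 0, index: int = 0, displacement: int = 0) -> int:
    """16-bit offset formed from base register, index register and displacement."""
    return (base + index + displacement) & 0xFFFF   # wraps modulo 64 KB


def default_segment(uses_bp: bool) -> str:
    """BP-based addresses default to the stack segment, all others to DS."""
    return "SS" if uses_bp else "DS"


# Based indexed with displacement: BX = 1000h, SI = 0020h, displacement = 4
print(f"{effective_address(0x1000, 0x0020, 4):04X}")   # 1024
# Based with a negative displacement: BP = 1000h, displacement = -2
print(f"{effective_address(base=0x1000, displacement=-2):04X}")  # 0FFE
# The sum wraps inside the segment: BX = FFF0h, SI = 0020h
print(f"{effective_address(0xFFF0, 0x0020):04X}")      # 0010
print(default_segment(uses_bp=True))                   # SS
```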

8086 CPU ARCHITECTURE

The microprocessor functions as the CPU in the stored program model of the digital computer. Its job is to generate all system
timing signals and synchronize the transfer of data between memory, I/O, and itself. It accomplishes this task via the three-bus
system architecture previously discussed.

The microprocessor also has a S/W function. It must recognize, decode, and execute program instructions fetched from the
memory unit. This requires an Arithmetic-Logic Unit (ALU) within the CPU to perform arithmetic and logical (AND, OR, NOT,
compare, etc) functions.

The 8086 CPU is organized as two separate processors, called the Bus Interface Unit (BIU) and the Execution Unit (EU). The BIU
provides H/W functions, including generation of the memory and I/O addresses for the transfer of data between the outside world
(outside the CPU, that is) and the EU.

The EU receives program instruction codes and data from the BIU, executes these instructions, and stores the results in the general
registers. By passing the data back to the BIU, data can also be stored in a memory location or written to an output device. Note
that the EU has no connection to the system buses. It receives and outputs all its data through the BIU.

The only difference between an 8088 microprocessor and an 8086 microprocessor is the BIU. In the 8088, the BIU data bus path
is 8 bits wide versus the 8086's 16-bit data bus. Another difference is that the 8088 instruction queue is four bytes long instead of
six.

The important point to note, however, is that because the EU is the same for each processor, the programming instructions are
exactly the same for each. Programs written for the 8086 can be run on the 8088 without any changes.

FETCH AND EXECUTE

Although the 8086/88 still functions as a stored program computer, organization of the CPU into a separate BIU and EU allows
the fetch and execute cycles to overlap. To see this, consider what happens when the 8086 or 8088 is first started.

1. The BIU outputs the contents of the instruction pointer register (IP) onto the address bus, causing the selected byte or word to
be read into the BIU.

2. Register IP is incremented by 1 to prepare for the next instruction fetch.


3. Once inside the BIU, the instruction is passed to the queue. This is a first-in, first-out storage register sometimes likened to a
"pipeline".

4. Assuming that the queue is initially empty, the EU immediately draws this instruction from the queue and begins execution.

5. While the EU is executing this instruction, the BIU proceeds to fetch a new instruction. Depending on the execution time of the
first instruction, the BIU may fill the queue with several new instructions before the EU is ready to draw its next instruction.

The BIU is programmed to fetch a new instruction whenever the queue has room for one (with the 8088) or two (with the 8086)
additional bytes. The advantage of this pipelined architecture is that the EU can execute instructions almost continually instead of
having to wait for the BIU to fetch a new instruction.

There are three conditions that will cause the EU to enter a "wait" mode. The first occurs when an instruction requires access to a
memory location not in the queue. The BIU must suspend fetching instructions and output the address of this memory location.
After waiting for the memory access, the EU can resume executing instruction codes from the queue (and the BIU can resume
filling the queue).

The second condition occurs when the instruction to be executed is a "jump" instruction. In this case control is to be transferred to
a new (nonsequential) address. The queue, however, assumes that instructions will always be executed in sequence and thus will
be holding the "wrong" instruction codes. The EU must wait while the instruction at the jump address is fetched. Note that any
bytes presently in the queue must be discarded (they are overwritten).

One other condition can cause the BIU to suspend fetching instructions. This occurs during execution of instructions that are slow
to execute. For example, the instruction AAM (ASCII Adjust for Multiplication) requires 83 clock cycles to complete. At four
cycles per instruction fetch, the queue will be completely filled during the execution of this single instruction. The BIU will thus
have to wait for the EU to pull over one or two bytes from the queue before resuming the fetch cycle.

A subtle advantage to the pipelined architecture should be mentioned. Because the next several instructions are usually in the
queue, the BIU can access memory at a somewhat "leisurely" pace. This means that slower memory parts can be used without affecting
overall system performance.

PROGRAMMING MODEL

As a programmer of the 8086 or 8088 you must become familiar with the various registers in the EU and BIU.

The data group consists of the accumulator and the BX, CX, and DX registers. Note that each can be accessed as a byte or a word.
Thus BX refers to the 16-bit base register but BH refers only to the higher 8 bits of this register. The data registers are normally
used for storing temporary results that will be acted on by subsequent instructions.

The pointer and index group are all 16-bit registers (you cannot access the low or high bytes alone). These registers are used as
memory pointers. Sometimes a pointer register will be interpreted as pointing to a memory byte and at other times a memory word. As
you will see, the 8086/88 always stores words with the high-order byte at the higher address.

Register IP could be considered in the previous group, but this register has only one function -to point to the next instruction to be
fetched to the BIU. Register IP is physically part of the BIU and not under direct control of the programmer as are the other
pointer registers.

Six of the flags are status indicators, reflecting properties of the result of the last arithmetic or logical instructions. The 8086/88
has several instructions that can be used to transfer program control to a new memory location based on the state of the flags.

Three of the flags can be set or reset directly by the programmer and are used to control the operation of the processor. These are
TF, IF, and DF.

The final group of registers is called the segment group. These registers are used by the BIU to determine the memory address
output by the CPU when it is reading or writing from the memory unit. To fully understand these registers, we must first study the
way the 8086/88 divides its memory into segments.

SEGMENTED MEMORY

Even though the 8086 is considered a 16-bit processor, (it has a 16-bit data bus width) its memory is still thought of in bytes. At
first this might seem a disadvantage:

Why saddle a 16-bit microprocessor with an 8-bit memory?

Actually, there are a couple of good reasons. First, it allows the processor to work on bytes as well as words. This is especially
important with I/O devices such as printers, terminals, and modems, all of which are designed to transfer ASCII-encoded (7- or 8-
bit) data.

Second, many of the 8086's (and 8088's) operation codes are single bytes. Other instructions may require anywhere from two to
seven bytes. By being able to access individual bytes, these odd-length instructions can be handled.

We have already seen that the 8086/88 has a 20-bit address bus, allowing it to output 2^20, or 1,048,576, different memory
addresses. The same memory can also be viewed as 524,288 16-bit words.

As mentioned, the 8086 reads 16 bits from memory by simultaneously reading an odd-addressed byte and an even-addressed byte.
For this reason the 8086 organizes its memory into an even-addressed bank and an odd-addressed bank.

With regard to this, you might wonder if all words must begin at an even address. The answer is no: a word may also begin at an
odd address. However, there is a penalty to be paid. The CPU must then perform two memory read cycles: one to fetch the low-order
byte and a second to fetch the high-order byte. This slows down the processor but is transparent to the programmer.

The last few paragraphs apply only to the 8086. The 8088 with its 8-bit data bus interfaces to the 1 MB of memory as a single
bank. When it is necessary to access a word (whether on an even- or an odd-addressed boundary) two memory read (or write)
cycles are performed. In effect, the 8088 pays a performance penalty with every word access. Fortunately for the programmer,
except for the slightly slower performance of the 8088, there is no difference between the two processors.
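
The alignment penalty described above can be summarized in a small Python sketch. It counts memory read (or write) cycles only, ignores everything else about timing, and uses the invented function name `word_access_cycles`:

```python
def word_access_cycles(address: int, cpu: str = "8086") -> int:
    """Number of memory cycles needed to transfer one 16-bit word."""
    if cpu == "8088":
        return 2                                # 8-bit bus: always two cycles
    if cpu == "8086":
        return 1 if address % 2 == 0 else 2     # odd addresses cost two cycles
    raise ValueError("cpu must be '8086' or '8088'")


print(word_access_cycles(0x1000))           # 1 (even address, 8086)
print(word_access_cycles(0x1001))           # 2 (odd address, 8086)
print(word_access_cycles(0x1000, "8088"))   # 2 (8088, any address)
```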

MEMORY MAP

Still another view of the 8086/88 memory space could be as 16 64K-byte blocks beginning at hex address 00000h and ending at
address 0FFFFFh. This division into 64K-byte blocks is an arbitrary but convenient choice. This is because the most significant
hex digit increments by 1 with each additional block. That is, address 20000h is 65,536 bytes higher in memory than address
10000h. Be sure to note that five hex digits are required to represent a memory address.

The diagram is called a memory map. This is because, like a road map, it is a guide showing how the system memory is allocated.
This type of information is vital to the programmer, who must know exactly where his or her programs can be safely loaded.

Note that some memory locations are marked reserved and others dedicated. The dedicated locations are used for processing
specific system interrupts and the reset function. Intel has also reserved several locations for future H/W and S/W products. If you
make use of these memory locations, you risk incompatibility with these future products.

SEGMENT REGISTERS

Within the 1 MB of memory space the 8086/88 defines four 64K-byte memory blocks called the code segment, stack segment,
data segment, and extra segment. Each of these blocks of memory is used differently by the processor.

The code segment holds the program instruction codes. The data segment stores data for the program. The extra segment is an
extra data segment (often used for shared data). The stack segment is used to store interrupt and subroutine return addresses.

You should realize that the concept of the segmented memory is a unique one. Older-generation microprocessors such as the 8-bit
8080 or Z-80 could access only one 64K-byte segment. This meant that the program's instructions, data and subroutine stack all had
to share the same memory. This limited the amount of memory available for the program itself and led to disaster if the stack
should happen to overwrite the data or program areas.

The four segment registers (CS, DS, ES, and SS) are used to "point" at location 0 (the base address) of each segment. This is a
little "tricky" because the segment registers are only 16 bits wide, but the memory address is 20 bits wide. The BIU takes care of
this problem by appending four 0's to the low-order bits of the segment register. In effect, this multiplies the segment register
contents by 16.

The point to note is that the beginning segment address is not arbitrary: it must begin at an address divisible by 16. Another way of
saying this is that the low-order hex digit must be 0.

Also note that the four segments need not be defined separately. Indeed, it is allowable for all four segments to completely overlap
(CS = DS = ES = SS).

Memory locations not defined to be within one of the current segments cannot be accessed by the 8086/88 without first redefining
one of the segment registers to include that location. Thus at any given instant a maximum of 256 K (64K * 4) bytes of memory
can be utilized. As we will see, the contents of the segment registers can only be specified via S/W. As you might imagine,
instructions to load these registers should be among the first given in any 8086/88 program.

LOGICAL AND PHYSICAL ADDRESS

Addresses within a segment can range from address 0000h to address 0FFFFh. This corresponds to the 64K-byte length of the
segment. An address within a segment is called an offset or logical address. A logical address gives the displacement from the
base address of the segment to the desired location within it, as opposed to its "real" address, which maps directly anywhere into
the 1 MB memory space. This "real" address is called the physical address.

What is the difference between the physical and the logical address?

The physical address is 20 bits long and corresponds to the actual binary code output by the BIU on the address bus lines. The
logical address is an offset from location 0 of a given segment.

When two segments overlap it is certainly possible for two different logical addresses to map to the same physical address. This
can have disastrous results when the data begins to overwrite the subroutine stack area, or vice versa. For this reason you must be
very careful when segments are allowed to overlap.

You should also be careful when writing addresses on paper to do so clearly. To specify the logical address XXXX in the stack
segment, use the convention SS:XXXX, which is equal to [SS] * 16 + XXXX.
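
The overlap of logical addresses is easy to demonstrate. The Python sketch below uses the invented helpers `to_physical` and `normalize` to show three different segment:offset pairs that name the same physical byte, and reduces each to a form with the smallest possible offset:

```python
def to_physical(segment: int, offset: int) -> int:
    """[segment] * 16 + offset, limited to the 20-bit address bus."""
    return ((segment << 4) + offset) & 0xFFFFF


def normalize(segment: int, offset: int):
    """Rewrite a logical address so that its offset lies in 0..15."""
    physical = to_physical(segment, offset)
    return physical >> 4, physical & 0xF


pairs = [(0x1234, 0x0010), (0x1235, 0x0000), (0x1230, 0x0050)]
for seg, off in pairs:
    print(f"{seg:04X}:{off:04X} -> {to_physical(seg, off):05X}")
# 1234:0010 -> 12350
# 1235:0000 -> 12350
# 1230:0050 -> 12350

print(normalize(0x1230, 0x0050) == (0x1235, 0x0000))   # True
```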

ADVANTAGES OF SEGMENTED MEMORY

Segmented memory can seem confusing at first. What you must remember is that the program op-codes will be fetched from the
code segment, while program data variables will be stored in the data and extra segments. Stack operations use registers BP or SP
and the stack segment. As we begin writing programs the consequences of these definitions will become clearer.

An immediate advantage of having separate data and code segments is that one program can work on several different sets of data.
This is done by reloading register DS to point to the new data. Perhaps the greatest advantage of segmented memory is that
programs that reference logical addresses only can be loaded and run anywhere in memory. This is because the logical addresses
always range from 0000h to 0FFFFh, independent of the code segment base. Such programs are said to be relocatable, meaning
that they will run at any location in memory. The requirements for writing relocatable programs are that no references be made to
physical addresses, and no changes to the segment registers are allowed.

Introduction to 8086 Assembly Language
CS 272 Sam Houston State University Dr. Tim McGuire

Structure of an assembly language program

• Assembly language programs divide roughly into five sections
o header
o equates
o data
o body
o closing

The Header

• The header contains various directives which do not produce machine code
• Sample header:

    %TITLE "Sample Header"
    .8086
    .model small
    .stack 256

Named Constants

• Symbolic names associated with storage locations represent addresses
• Named constants are symbols created to represent specific values determined by an expression
• Named constants can be numeric or string
• Some named constants can be redefined
• No storage is allocated for these values

Equates

• Constant values are known as equates
• Sample equate section:

    Count EQU 10
    Element EQU 5
    Size = Count * Element
    MyString EQU "Maze of twisty passages"
    Size = 0

• = is used for numeric values only
• Cannot change value of EQU symbol
• EQUated symbols are not variables
• EQU expressions are evaluated where used; = expressions are evaluated where defined

The Data Segment

• Begins with the .data directive
• Two kinds of variables, initialized and uninitialized.
• Initialized variables take up space in the program's code file
• Declare uninitialized variables after initialized ones so they do not take up space in the program's code file

Reserving space for variables

• Sample DATA SEGMENT

    .data
    numRows DB 25
    numColumns DB ?
    videoBase DW 0800h

• DB and DW are common directives (define byte) and (define word)
• The symbols associated with variables are called labels
• Strings may be declared using the DB directive:

    aTOm DB "ABCDEFGHIJKLM"

Program Data and Storage

• Pseudo-ops to define data or reserve storage
o DB - byte(s)
o DW - word(s)
o DD - doubleword(s)
o DQ - quadword(s)
o DT - tenbyte(s)
• These directives require one or more operands
o define memory contents
o specify amount of storage to reserve for run-time data

Defining Data

• Numeric data values
o 100 - decimal
o 100b - binary
o 100h - hexadecimal
o '100' - ASCII
o "100" - ASCII
• Use the appropriate DEFINE directive (byte, word, etc.)
• A list of values may be used - the following creates 4 consecutive words

    DW 40Ch,10b,-13,0

• A ? represents an uninitialized storage location

    DB 255,?,-128,'X'

Naming Storage Locations

• Names can be associated with storage locations

    ANum DB -4
    DW 17
    ONE
    UNO DW 1
    X DD ?

• These names are called variables
• ANum refers to a byte storage location, initialized to FCh
• The next word has no associated name
• ONE and UNO refer to the same word
• X is an uninitialized doubleword

Arrays

• Any consecutive storage locations of the same size can be called an array

    X DW 040Ch,10b,-13,0
    Y DB 'This is an array'
    Z DD -109236, 0FFFFFFFFh, -1, 100b

• Components of X are at X, X+2, X+4, X+6
• Components of Y are at Y, Y+1, …, Y+15
• Components of Z are at Z, Z+4, Z+8, Z+12

DUP

• Allows a sequence of storage locations to be defined or reserved
• Only used as an operand of a define directive

    DB 40 DUP(?)
    DW 10h DUP(0)
    DB 3 DUP("ABC")
    DB 4 DUP(3 DUP (0,1), 2 DUP('$'))

Word Storage

• Word, doubleword, and quadword data are stored in reverse byte order (in memory)

    Directive        Bytes in Storage
    DW 256           00 01
    DD 1234567h      67 45 23 01
    DQ 10            0A 00 00 00 00 00 00 00
    X DW 35DAh       DA 35

Low byte of X is at X, high byte of X is at X+1

The Program Body

• Also known as the code segment
• Divided into four columns: labels, mnemonics, operands, and comments
• Labels refer to the positions of variables and instructions, represented by the mnemonics
• Operands are required by most assembly language instructions
• Comments aid in remembering the purpose of various instructions

An example

    Label   Mnemonic  Operand     Comment
    ---------------------------------------------------------
            .data
    exCode  DB        0           ;A byte variable
    myWord  DW        ?           ;Uninitialized word var.
            .code
    MAIN:
            mov       ax,@data    ;Initialize DS to address
            mov       ds,ax       ; of data segment
            jmp       Exit        ;Jump to Exit label
            mov       cx,10       ;This line skipped!
    Exit:   mov       ah,04Ch     ;DOS function: Exit prog
            mov       al, exCode  ;Return exit code value
            int       21h         ;Call DOS. Terminate prog
            END       MAIN        ;End Program and specify entry point

The Label Field

• Labels mark places in a program which other instructions and directives reference
• Labels in the code segment always end with a colon
• Labels in the data segment never end with a colon
• Labels can be from 1 to 31 characters long and may consist of letters, digits, and the special characters ? . @ _ $ %
• If a period is used, it must be the first character
• Labels must not begin with a digit
• The assembler is case insensitive

Legal and Illegal Labels

• Examples of legal names
o COUNTER1
o @character
o SUM_OF_DIGITS
o $1000
o DONE?
o .TEST
• Examples of illegal names
o TWO WORDS contains a blank
o 2abc begins with a digit
o A45.28 . not first character
o YOU&ME contains an illegal character

The Mnemonic Field

• For an instruction, the operation field contains a symbolic operation code (opcode)
• The assembler translates a symbolic opcode into a machine language opcode
• Examples are: ADD, MOV, SUB
• In an assembler directive, the operation field contains a directive (pseudo-op)
• Pseudo-ops are not translated into machine code; they tell the assembler to do something

The Operand Field

• For an instruction, the operand field specifies the data that are to be acted on by the instruction. May have zero, one, or two operands
    NOP             ;no operands -- does nothing
    INC AX          ;one operand -- adds 1 to the contents of AX
    ADD WORD1,2     ;two operands -- adds 2 to the contents
                    ; of memory word WORD1

• In a two-operand instruction, the first operand is the destination operand. The second operand is the source operand.
• For an assembler directive, the operand field usually contains more information about the directive.

The Comment Field

• A semicolon marks the beginning of a comment field
• The assembler ignores anything typed after the semicolon on that line
• It is almost impossible to understand an assembly language program without good comments
• Good programming practice dictates a comment on almost every line

Good and Bad Comments

• Don't say something obvious, like

    MOV CX,0        ;move 0 to CX

• Instead, put the instruction into the context of the program

    MOV CX,0        ;CX counts terms, initially 0

• An entire line can be a comment, or be used to create visual space in a program

    ;
    ; Initialize registers
    ;
    MOV AX,0
    MOV BX,0

The Closing

• The last lines of an assembly language program are the closing
• Indicates to assembler that it has reached the end of the program and where the entry point is

    MAIN ENDP       ;End of program
    END MAIN        ; entry point for linker use

• END is a pseudo-op; the single "operand" is the label specifying the beginning of execution, usually the first instruction after the .code pseudo-op

Assembling a Program

• The source file of an assembly language program is usually named with an extension of .asm

    edit myprog.asm

• The source file is processed (assembled) by the assembler (TASM) to produce an object file (.obj)

    tasm myprog     produces myprog.obj

• The object file must be linked by the linker (TLINK) to produce an executable file (.exe)

    tlink myprog    produces myprog.exe

Dealing with Errors

• TASM will report the line number and give an error message for each error it finds
• Sometimes it is helpful to have a listing file (.lst), created by using TASM with the -l option
• The .lst file contains a complete listing of the program, along with line numbers, object code bytes, and the symbol table

Using the Debugger

• Useful for logic errors that the assembler misses
• See the text for a complete tutorial
• You do not need to use the TDH386.SYS driver or the TD386.EXE debugger with the latest version of the assembler
• To use the debugger on myprog.asm

    tasm /zi myprog
    tlink /v myprog
    td myprog

.COM and .EXE files

• The .COM code file format is a relic of the first version of MS-DOS
• Not recommended for general purposes
• All code, data, and the stack occupy one 64K segment (Borland's "tiny" model)
• .EXE code files are more efficient in use of RAM
• Data and code occupy separate segments
• The programmer is responsible for setting up the data and code segments properly

Ending a Program

• All programs, upon termination, must return control back to another program -- the operating system
• Under MS-DOS, this is COMMAND.COM
• This is done by doing a DOS system call
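
A minimal sketch of such a terminating system call, assuming DOS function 4Ch (exit with return code) invoked through interrupt 21h and an exit code of 0. The surrounding directives are only there so that the fragment assembles on its own with TASM.

```asm
.MODEL small
.STACK 100h
.CODE
MAIN:
        mov ah,4Ch      ; DOS function 4Ch: terminate program
        mov al,0        ; exit code handed back to the parent (0 is an assumption)
        int 21h         ; call DOS; control returns to COMMAND.COM
END MAIN
```

The program does nothing except hand control back to DOS, which makes it a convenient first test of the edit / tasm / tlink cycle described above.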

Data Transfer Instructions

• MOV destination,source
    o reg, reg
    o mem, reg
    o reg, mem
    o mem, immed
    o reg, immed
• Sizes of both operands must be the same
• reg can be any non-segment register, except that IP cannot be the target register
• MOV's between a segment register and memory or a 16-bit register are possible

Examples

• mov ax, word1
    o "Move word1 to ax"
    o Contents of register ax are replaced by the contents of the memory location word1
• xchg ah, bl
    o Swaps the contents of ah and bl
• Illegal: mov word1, word2
    o can't have both operands be memory locations

Sample MOV Instructions

    b db 4Fh
    w dw 2048
    mov bl,dh
    mov ax,w
    mov ch,b
    mov al,255
    mov w,-100
    mov b,0

• When a variable is created with a define directive, it is assigned a default size attribute (byte, word, etc)
• You can assign a size attribute using LABEL

    LoByte LABEL BYTE
    aWord DW 97F2h

Addresses with Displacements

    b db 4Fh, 20h, 3Ch
    w dw 2048, -100, 0
    mov bx, w+2
    mov b+1, ah
    mov ah, b+5
    mov dx, w-3

• Type checking is still in effect
• The assembler computes an address based on the expression
• NOTE: These are address computations done at assembly time

    MOV ax,b-1

will not subtract 1 from the value stored at b

• XCHG (eXCHanGe) destination,source
    o reg, reg
    o reg, mem
    o mem, reg
• MOV and XCHG cannot perform memory to memory moves
• This provides an efficient means to swap the operands
    o No temporary storage is needed
    o Sorting often requires this type of operation
    o This works only with the general registers

Arithmetic Instructions

    ADD dest, source
    SUB dest, source
    INC dest
    DEC dest
    NEG dest

• Operands must be of the same size
• source can be a general register, memory location, or constant
• dest can be a register or memory location
    o except operands cannot both be memory

ADD and INC

• ADD is used to add the contents of
    o two registers
    o a register and a memory location
    o a register and a constant
• INC is used to add 1 to the contents of a register or memory location

Examples

• add ax, word1
    o "Add word1 to ax"
    o Contents of register ax and memory location word1 are added, and the sum is stored in ax
• inc ah
    o Adds one to the contents of ah
• Illegal: add word1, word2
    o can't have both operands be memory locations

SUB, DEC, and NEG

• SUB is used to subtract the contents of
    o one register from another register
    o a register from a memory location, or vice versa
    o a constant from a register
• DEC is used to subtract 1 from the contents of a register or memory location
• NEG is used to negate the contents of a register or memory location

Examples

• sub ax, word1
    o "Subtract word1 from ax"
    o Contents of memory location word1 are subtracted from the contents of register ax, and the difference is stored in ax
• dec bx
    o Subtracts one from the contents of bx
• Illegal: sub byte1, byte2
    o can't have both operands be memory locations

Type Agreement of Operands

• The operands of two-operand instructions must be of the same type (byte or word)
    o mov ax, bh ;illegal
    o mov ax, byte1 ;illegal
    o mov ah,'A' ;legal -- moves 41h into ah
    o mov ax,'A' ;legal -- moves 0041h into ax

Translation of HLL Instructions

• B=A

    mov ax,A
    mov B,ax

    o memory-memory moves are illegal
• A = B - 2*A

    mov ax,B
    sub ax,A
    sub ax,A
    mov A,ax

Program Segment Structure

• Data Segments
    o Storage for variables
    o Variable addresses are computed as offsets from start of this segment
• Code Segment
    o contains executable instructions
• Stack Segment
    o used to set aside storage for the stack
    o Stack addresses are computed as offsets into this segment
• Segment directives

    .DATA
    .CODE
    .STACK size

Memory Models

• .Model memory_model
    o tiny: code+data <= 64K (.com program)
    o small: code<=64K, data<=64K, one of each
    o medium: data<=64K, one data segment
    o compact: code<=64K, one code segment
    o large: multiple code and data segments
    o huge: allows individual arrays to exceed 64K
    o flat: no segments, 32-bit addresses, protected mode only (80386 and higher)

Program Skeleton

    .MODEL small
    .STACK 100h
    .DATA
    ;declarations
    .CODE
    MAIN:
    ;main proc code
    ;return to DOS

    ;other procs (if any) go here
    end MAIN

▪ Select a memory model
▪ Define the stack size
▪ Declare variables
▪ Write code
    • organize into procedures
▪ Mark the end of the source file
    • define the entry point

Input and Output Using 8086 Assembly Language

• Most input and output is not done directly via the I/O ports, because
    o port addresses vary among computer models
    o it's much easier to program I/O with the service routines provided by the manufacturer
• There are BIOS routines (which we'll look at later) and DOS routines for handling I/O (using interrupt number 21h)

Interrupts

• The interrupt instruction is used to cause a software interrupt (system call)
    o An interrupt interrupts the current program and executes a subroutine, eventually returning control to the original program
    o Interrupts may be caused by hardware or software
• int interrupt_number ;software interrupt

Output to Monitor

• DOS Interrupts : interrupt 21h
    o This interrupt invokes one of many support routines provided by DOS
    o The DOS function is selected via AH
    o Other registers may serve as arguments
• AH = 2, DL = ASCII of character to output
    o Character is displayed at the current cursor position, the cursor is advanced, AL = DL

Output a String

• Interrupt 21h, function 09h
    o DX = offset to the string (in data segment)
    o The string is terminated with the '$' character
• To place the address of a variable in DX, use one of the following
    o lea DX,theString ;load effective address
    o mov DX, offset theString ;immediate data

Print String Example

    %TITLE "First Program -- HELLO.ASM"
    .8086
    .MODEL small
    .STACK 256
    .DATA
    msg DB "Hello, World!$"
    .CODE
    MAIN:
          mov ax,@data    ;Initialize DS to address
          mov ds,ax       ; of data segment
          lea dx,msg      ;get message
          mov ah,09h      ;display string function
          int 21h         ;display message
    Exit: mov ah,4Ch      ;DOS function: Exit program
          mov al,0        ;Return exit code value
          int 21h         ;Call DOS. Terminate program
    END MAIN              ; End program/entry point

Input a Character

• Interrupt 21h, function 01h
• Filtered input with echo
    o This function returns the next character in the keyboard buffer (waiting if necessary)
    o The character is echoed to the screen
    o AL will contain the ASCII code of the non-control character
        ▪ AL=0 if a control character was entered

An Example Program

    %TITLE "Case Conversion"
    .8086
    .MODEL small
    .STACK 256
    .DATA
    MSG1 DB 'Enter a lower case letter: $'
    MSG2 DB 0Dh,0Ah,'In upper case it is: '
    CHAR DB ?,'$'
    exCode DB 0
    .CODE
    MAIN:
    ;initialize ds
          mov ax,@data        ; Initialize DS to address
          mov ds,ax           ; of data segment
    ;print user prompt
          mov ah,9            ; display string fcn
          lea dx,MSG1         ; get first message
          int 21h             ; display it
    ;input a character and convert to upper case
          mov ah,1            ; read char fcn
          int 21h             ; input char into AL
          sub al,20h          ; convert to upper case
          mov CHAR,al         ; and store it
    ;display on the next line
          mov dx,offset MSG2  ; get second message
          mov ah,9            ; display string function
          int 21h             ; display message and upper case
    ;return to DOS
    Exit:
          mov ah,4Ch          ; DOS function: Exit program
          mov al,exCode       ; Return exit code value
          int 21h             ; Call DOS. Terminate program
    END MAIN                  ; End of program / entry point
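
The listings in this section use DOS functions 09h, 01h and 4Ch, but none shows the single-character output function (AH = 2, DL = character) described under "Output to Monitor". The following is a minimal sketch of that call, written as a shorter variant of the case conversion program. It assumes, as the original program does, that the user really types a lower case letter; no check is made.

```asm
.8086
.MODEL small
.STACK 256
.CODE
MAIN:
        mov ah,1        ; DOS function 01h: read character with echo
        int 21h         ; character is returned in AL
        sub al,20h      ; convert to upper case (assumes a lower case letter)
        mov dl,al       ; function 02h expects the character in DL
        mov ah,2        ; DOS function 02h: display character
        int 21h         ; print it at the current cursor position
        mov ah,4Ch      ; DOS function 4Ch: exit program
        mov al,0        ; exit code 0 (an assumption)
        int 21h         ; call DOS, terminate program
END MAIN
```

Because no variables are declared, this variant has no .DATA segment and does not need to initialize DS. The converted character is printed directly after the echoed input.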
