
1 8086 ALP TOOLS (CH 2)

CHAPTER 2
In this chapter, we shall discuss the Assembly Language Program development tools, PC memory structure and Assembler directives.

Books to be Referred:
1. Microprocessors and Interfacing, 2nd Edition, Douglas V. Hall
2. Intel Microprocessors, 6th Edition, Barry Brey
3. Peter Norton's DOS Guide
4. IBM PC Assembly Language Programming, Peter Abel
5. Hardware and Software of Computers, S. K. Bose

How the System Works?

Turn on the computer power (cold boot). On reset, the 8086 clears IP to 0000h and sets CS to FFFFh; the contents of RAM are undefined at this point. Thus, the first instruction to be executed is at CS:IP = FFFF:0000, i.e., at physical address FFFFh x 10h + 0000h = FFFF0h. From here, control goes to the BIOS routines present in Flash or EPROM.

The BIOS (Basic I/O System routines)


- It is firmware held in a non-volatile memory such as ROM.
- Checks and identifies the various devices connected to the system.
- Establishes the interrupt vector table (IVT), which contains the vector addresses of the 256 interrupts. The IVT is located from 00000h to 003FFh.
- Establishes its own data area starting at 00400h.
- Determines whether a disk containing DOS is present and, if so, reads the bootstrap loader from the disk.

The bootstrap process, started by a BIOS routine in ROM, is responsible for booting the computer, i.e., it loads the Operating System (OS), along with the necessary DOS files such as IO.SYS, MSDOS.SYS and COMMAND.COM, from the disk into primary memory.

Programming Models:
Depending on the size of the memory the user program occupies, different types of assembly language models are defined.

TINY: all data and code in one segment
SMALL: one data segment and one code segment
MEDIUM: one data segment and two or more code segments
COMPACT: one code segment and two or more data segments
LARGE: any number of data and code segments

To designate a model, we use the .MODEL directive.

General Memory Structure in a Computer:


The memory is divided into three parts: Lower 640 KB Transient Program Area (TPA), next 384 KB System area and Extended memory (XMS). But, 8086/8088 based PCs have only 1 MB real memory. So, they have only TPA and System area.

Compiled by: L. Krishnananda, Asst. Professor, REVA Institute of Technology, Bangalore


- The TPA holds the OS and other programs that control the system operation.
- The System area holds the video RAM, video ROM, BIOS ROM, etc.
- DOS controls the way the disk memory is organized and controlled.
- The BIOS is a collection of programs stored in ROM or Flash memory that access I/O devices and internal features of the system.
- IO.SYS is a program that is loaded into the TPA from the disk whenever MS-DOS is started.
- Device drivers are programs that control installable I/O devices. Windows uses a file called System.ini to load the drivers.
- The COMMAND.COM program controls the operation of the computer from the keyboard. That is, it processes the DOS commands as they are keyed in.
- The free TPA area holds application programs while they are executed.
- The TPA also holds TSR (terminate and stay resident) programs, which remain in memory in an inactive state until activated through a hot key or a command.

Differences between .COM programs and .EXE programs:


- COM programs are generally smaller than EXE programs.
- An EXE program may be of any size, whereas a COM file is limited to 64 KB (one segment), including the Program Segment Prefix (PSP).
- For an EXE file, the stack should be explicitly specified; for a COM file, DOS creates the stack on its own.
- An entire COM program consists of one segment (max 64 KB) that holds the PSP, code, data and stack area.
- An EXE file contains a 512-byte header preceding the program, whereas a COM file does not require a header.
- For a COM program, there is no need to initialize the DS register.
- A COM program uses the directive ORG 100h to indicate that user program addressing begins at an offset of 100h from the beginning of the PSP.
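The points above can be seen in a minimal COM skeleton. This is only a sketch under assumptions: the file name hello.asm, the labels Start and Msg, and the message text are made up for illustration, and the tool used to produce the final .COM file (for example EXE2BIN, or a linker that can emit COM output) depends on your installation.

```asm
; hello.asm - minimal .COM skeleton (hypothetical file and label names)
.MODEL TINY                 ; code, data and stack share one segment
.CODE
        ORG 100h            ; the first 100h bytes of the segment hold the PSP
Start:  MOV AH, 09h         ; DOS function 09h: display a '$'-terminated string
        MOV DX, OFFSET Msg  ; DS already points to this segment in a COM program
        INT 21h
        MOV AX, 4C00h       ; DOS function 4Ch: terminate with return code 0
        INT 21h
Msg     DB 'Hello from a COM program', 0Dh, 0Ah, '$'
        END Start
```

Note that there is no stack declaration and no instruction that loads DS, in line with the differences listed above.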

A. Assembly Language Development Tools:


1. EDITOR:
It is a system software (program) which allows users to create a file containing assembly instructions and statements. Ex: WordStar, DOS Editor, Norton Editor.

- Using the editor, you can also edit, delete or modify already existing files.
- While saving, you must give the file extension as .asm.
- Follow the assembly language syntax while typing the programs.
- The editor stores the ASCII codes for the letters and numbers keyed in.
- Any text following a semicolon is treated as a comment.

When you have typed your complete program, you have to save the file on the disk. This file is called the source file and has a .asm extension. The next step is to convert this source file into an object (.obj) file containing machine code.


2. ASSEMBLER:
An assembler is a system software (program) used to translate the assembly language mnemonics for instructions into the corresponding binary codes.

An assembler makes two passes through your source code. On the first pass, it determines the displacement of named data items, the offset of labels, etc., and puts this information in a symbol table. On the second pass, the assembler produces the binary code for each instruction and inserts the offsets, etc., that were calculated during the first pass.

The assembler checks for correct syntax in the assembly instructions and provides appropriate warning and error messages. You have to open your file again using the editor to correct the errors and then reassemble it. Unless all the errors are corrected, the program cannot proceed to the next step.

The assembler generates two files from the source file. The first, called the object file, has the extension .obj and contains the binary codes for the instructions and information about the addresses of the instructions. The second, called the list file, has the extension .lst. It contains the assembly language statements, the binary codes for each instruction, and the offset of each instruction. It also indicates any syntax errors or typing errors in the source program.

Note: The assembler generates only offsets (i.e., effective addresses), not absolute physical addresses.
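To see why two passes are needed, consider a forward reference. The sketch below is illustrative only: the segment and label names are assumptions, and the offsets and bytes in the comments were worked out by hand from the standard 8086 encodings, in the style of a list file, rather than copied from an actual .lst output.

```asm
CODE_HERE SEGMENT
        ASSUME CS:CODE_HERE
; offset  bytes       statement
; 0000    EB 03
Start:  JMP SHORT Done      ; forward reference: Done is unknown when first read
; 0002    B8 01 00
        MOV AX, 1           ; 3-byte instruction, skipped at run time
; 0005    B4 4C
Done:   MOV AH, 4Ch         ; pass 1 records Done = 0005h in the symbol table
; 0007    CD 21
        INT 21h             ; return to DOS
CODE_HERE ENDS
        END Start
```

In pass 1 the assembler only sizes each instruction and learns that Done lies at offset 0005h. In pass 2 it can fill in the jump displacement as 0005h - 0002h = 03h, the distance from the instruction following the JMP to the label.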

3. LINKER:
It is a program used to join several object files into one large object file. For large programs, usually several modules are written, and each module is tested and debugged separately. When all the modules work, their object modules can be linked together to form a complete functioning program.

- The LINK program must be run on the .obj file.
- The linker produces a link file which contains the binary codes for all the combined modules. It also produces a link map file which contains the address information about the linked files.
- The linker assigns only relative addresses starting from zero, so that the program can be placed anywhere in physical primary memory later (by another program called the locator or loader). Therefore, this file is called relocatable.
- The linker produces link files with the .exe extension.
- Object modules of useful programs (square root, factorial, etc.) can be kept in a library and linked to other programs when needed.

4. LOADER (LOCATOR):
It is a program used to assign absolute physical addresses to the segments of the .exe file when it is placed in memory. The IBM PC DOS environment comes with the EXE2BIN program, which converts a .exe file into a .bin file. The physical addresses are assigned at run time by the loader, so the assembler does not know the segment starting addresses at the time the program is assembled.


5. DEBUGGER:
If your program requires no external hardware, you can use a program called a debugger to load and run the .exe file. A debugger is a program which allows you to load your object code program into system memory, execute the program, and troubleshoot or debug it.

- The debugger allows you to look at the contents of registers and memory locations after you run your program.
- It allows you to change the contents of registers and memory locations and rerun the program.
- It also lets you set breakpoints in your program, single step through it, and use other convenient features.

If you are using a prototype SDK-86 board, the debugger is usually called the monitor program.

We will be using the development tool MASM 6.0 or a higher version from Microsoft. MASM stands for Microsoft Macro Assembler. Another assembler, TASM (Turbo Assembler) from Borland International, is also available.

How to Write and Execute your ALP using MASM?


1. Type EDIT at the command prompt (C:\MASM>). A window will open with options such as File, Edit, etc. In the workspace, type your program according to the assembly language syntax and save the file with a .asm extension (say, addn.asm).
2. Exit the editor using the File menu or by pressing ALT + F + X.
3. At the prompt, type MASM addn.asm if your file is named addn.asm. Press the Enter key 2 or 3 times. The assembler checks the syntax of your program and creates the .obj file if there are no errors. Otherwise, it reports the errors with line numbers. You have to correct the errors by opening your file with the EDIT command and changing your instructions, and then assemble your program again using the MASM command. Repeat this until MASM displays 0 Severe Errors. There may still be Warning Errors; try to correct them also.
4. Once you get the .obj file from step 3, you have to create the .exe file. At the prompt, type LINK addn.obj and press the Enter key. (Note that you now have to give the extension as .obj and not as .asm.) If there are no linker errors, the linker will create a .exe file of your program. Now, your program is ready to run.
5. There are two ways to run your program.
   a) If your program accepts user input through the keyboard and displays the result on the screen, type the name of the file at the prompt and press the Enter key. Appropriate messages will be displayed.
   b) If your program works with memory data and you want to see the contents of registers, flags, assigned memory locations, opcodes, etc., type CV addn (the file name) at the prompt. Another window will open with your program, machine codes, register contents, etc. You also get a > prompt within the CV window. Here you can use the d command to display memory contents, the e command to enter data into memory and the g command to execute your program. You can also single step through your program using the menu options. In many ways, CV (CodeView) is like the Turbo C environment.


B. Assembler Directives
Assembler directives, also called pseudo operations, are commands issued to the assembler for many system related tasks such as variable labeling, memory assignment, reserving memory storage, and identifying the beginning and end of the program. These directives can be used with the Intel macro assembler for the 8086 (ASM-86), the Borland Turbo Assembler (TASM) and the Microsoft/IBM Macro Assembler (MASM). Assembler directives are classified according to the type of function they perform.

1. Segment Definition and Initialization Directives:


a) SEGMENT and ENDS directives:
These directives are used to define the beginning and end of a logical segment. That is, they are used to identify a group of data items or a group of instructions that you want to put together logically in a segment. When you set up a logical segment, you have to name it. A logical segment is not given a physical starting address when it is declared.

Syntax:
    Segment_name SEGMENT
        ... body of the logical segment (data or code) ...
    Segment_name ENDS

Ex:
    DATA_SEG SEGMENT
        NUM  DB 2 DUP (0)
        NUM1 DW 12A2H
    DATA_SEG ENDS

Here DATA_SEG is the segment name, SEGMENT and ENDS are the directives, and NUM and NUM1 are variable names.

Ex:
    Data SEGMENT
        Data1 DB ?
        Data2 DW ?
    Data ENDS

    Code SEGMENT
    START: MOV AX,BX
    Code ENDS

b) ASSUME directive:
It is used to inform the assembler of the name of the logical segment it should use for a specified physical segment. That is, it tells the assembler how to link the logical segments to the segment registers. An 8086 program may have several logical segments. ASSUME tells the assembler what names have been chosen for the Code, Data, Extra and Stack segments. For example, it informs the assembler that CS will hold the address allotted by the loader to the segment named, say, CODE, and that DS will similarly hold the address of the segment named, say, DATA, so that the assembler can compute offsets relative to the correct segment.

Ex:
    DATA_HERE SEGMENT


        N1 DB 10, 0AH, 20
    DATA_HERE ENDS
    CODE_HERE SEGMENT
        ASSUME CS: CODE_HERE, DS:DATA_HERE
        ...
        MOV AX, 2222        ; program instructions
    CODE_HERE ENDS

Here, for example, the statement ASSUME CS: CODE_HERE tells the assembler that the instructions for the program are in a logical segment named CODE_HERE. Note that the ASSUME directive does not load the segment starting address into the corresponding segment register of the CPU; the program must do that itself.

c) ORG directive:
As the assembler assembles the program statements or data declarations, it uses a location counter to keep track of how many bytes it is from the start of a segment at any time. The location counter is automatically set to 0000h when the assembler starts reading a segment. The ORG directive allows the user to set the location counter to any desired value. For example, if you write ORG 100h within the data segment, then the first data item declared after it will be at an offset of 100h from the start of the data segment. In other words, ORG changes the starting offset address of the data in the data segment.
Ex: ORG 100H

d) EXIT (written .EXIT in MASM 6.0 and later):
Generates the instructions that terminate the program and return control to DOS. It is placed before the END directive.

e) END directive:
It is put after the last statement of a program to tell the assembler that this is the end of the program module. There should be only one END directive in a program module.
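The interplay of SEGMENT, ASSUME and ORG can be sketched as follows. This is a minimal illustration, not the notes' own program: the variable N2 and the label Start are hypothetical names, and since no stack segment is declared the linker may issue a warning about that.

```asm
DATA_HERE SEGMENT
        ORG 100h                ; location counter jumps from 0000h to 0100h
N1      DB 10, 0AH, 20          ; N1 occupies offsets 0100h to 0102h
N2      DW 1234H                ; hypothetical variable, at offset 0103h
DATA_HERE ENDS

CODE_HERE SEGMENT
        ASSUME CS:CODE_HERE, DS:DATA_HERE
Start:  MOV AX, DATA_HERE       ; ASSUME loaded nothing, so set DS explicitly
        MOV DS, AX
        MOV BX, OFFSET N1       ; BX = 0100h because of the ORG above
        MOV AL, [BX]            ; AL = 10 (0Ah), the first byte of N1
        MOV AX, 4C00H           ; return to DOS
        INT 21H
CODE_HERE ENDS
        END Start
```

Removing the two instructions that load DS would still assemble without error, which is exactly why the note on ASSUME matters: the directive affects only how offsets are computed, not the run-time register contents.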

2. Storage Allocation and Reservation directives:

a) DB, DW, DD: (Define Byte, Define Word, Define Double Word)
These directives are used to assign names to variables in your programs.
- The DB directive is used to declare a byte-type variable and to set aside one or more storage locations of type byte in memory.
- The DW directive is used to declare a variable of type word and to reserve one or more storage locations of type word in memory.
- The DD directive is used to declare a variable of type double word (32 bits or 4 bytes) and to reserve one or more storage locations of type double word in memory.

Ex:
    NUM DB 23H          ; name a memory location NUM and initialize it with 23h
    LOC DB ?            ; reserve a memory location LOC; its value is not initialized
    N1  DB 11, 22, 33   ; reserve 3 locations starting at N1 and store these values
    msg DB 'HELLO'      ; store the ASCII codes of the string in memory starting at msg
    SUM DW ?            ; reserve one word (two bytes) at SUM, uninitialized

Note: To reserve or assign a large number of locations, use the DUP operator.


Ex:
    DATA1 db 50d Dup (?)    ; reserve 50 byte locations starting from DATA1
    LOCN  dw 20d Dup (0)    ; reserve 20 words (40 bytes), all initialized to 0

Note: If you want to access a variable as a specific data type, the PTR attribute operator is used. You can use either BYTE PTR (for byte operations) or WORD PTR (for word operations).

Ex:
    NUM1 dw 0A345h          ; a hexadecimal constant must start with a digit (0 to 9), hence the leading 0
    MOV AL, NUM1            ; illegal, since NUM1 is of type word and AL is a byte register
    MOV AL, BYTE PTR NUM1   ; AL is loaded with the low byte, 45h

b) DQ, DT (Define Quad word, Define Ten bytes)
These work like DB, DW and DD. They are used to declare and reserve memory locations (8 bytes and 10 bytes per item, respectively) and to initialize their values.

c) EQU (Equate)
This directive is used to give a name to some value or symbol. Each time the assembler finds the given name in the program, it replaces the name with the value or symbol. It can equate a numeric value, an ASCII value or a label to another label.

Ex:
    Data SEGMENT
        Num1 EQU 50H
        Num2 EQU 66H
    Data ENDS

The numeric values 50H and 66H are assigned to Num1 and Num2. No memory is reserved for them.
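A short sketch that puts DUP, PTR and EQU together is given below. The names COUNT and Start and the stack size are assumptions chosen for illustration; DATA1 and NUM1 follow the declarations used above.

```asm
.MODEL SMALL
.STACK 100h
.DATA
COUNT   EQU 50                  ; symbolic constant, occupies no memory
DATA1   DB COUNT DUP (?)        ; 50 uninitialized bytes
NUM1    DW 0A345h
.CODE
Start:  MOV AX, @DATA
        MOV DS, AX
        MOV CX, COUNT           ; assembled exactly as MOV CX, 50
        MOV BL, BYTE PTR NUM1   ; BL = 45h (the low byte is stored first)
        MOV BH, BYTE PTR NUM1+1 ; BH = 0A3h (the high byte)
        MOV AX, 4C00h           ; return to DOS
        INT 21h
        END Start
```

Changing the EQU line is enough to resize the buffer and adjust every instruction that uses COUNT, which is the main reason to prefer a named constant over a literal.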

3. Attribute Operators:
a) OFFSET:
It is an operator which tells the assembler to determine the offset or displacement of a named data item or variable from the start of the segment which contains it. It is used to load the offset of a variable into a register, so that the variable can be accessed using one of the indirect addressing modes.

Ex: If you declare a variable as Num db 20h in the data segment, then in the code segment you can write
    mov bx, OFFSET Num      ; bx = address (offset) of Num
    mov al, [bx]            ; now al = 20h

Note: If you write mov al, Num, then al = 20h, i.e., the contents of Num are loaded. If you use the OFFSET operator, you get the address of the variable instead.

b) LENGTH:
It is an operator which tells the assembler to determine the number of elements in a named data item such as a string or array. That is, it returns the number of units assigned to a variable.

Ex: Suppose you declare Sum db 50 Dup (?) in the data segment. Then, in the code segment, if you write
    mov cx, LENGTH Sum      ; cx = 50d
Similarly, if you declare Array dw 50 dup (?), then mov cx, LENGTH Array again gives cx = 50d. That is, LENGTH simply gives the number of elements, irrespective of the data type.


c) SIZE:
It is an operator which tells the assembler to determine the number of bytes in the given string or array. Considering the above examples:
    Sum db 50 Dup (?)
    mov cx, SIZE Sum        ; cx = 50d (same as the LENGTH operator)
But if Array dw 50 dup (?) is declared, then
    mov cx, SIZE Array      ; cx = 100d (since Array is of type DW, it occupies 100 bytes)

d) PTR (Pointer):
It is used to assign a specific type to a variable or to a label. For example, if you write instructions like
    MOV bx, 2000h           ; point bx to address 2000h
    INC [bx]
the assembler will not know whether to increment the byte at 2000h or the word at 2000h and 2001h. In such cases, the PTR operator is used, and you rewrite the instruction as INC BYTE PTR [bx] or INC WORD PTR [bx].

The PTR operator can also be used to override the declared type of a variable. For example, if you declare Num dw 1223h and write in your program
    mov al, BYTE PTR Num
then only one 8-bit value (23h) is accessed and loaded into the al register.

The PTR operator is also useful for indirect jump instructions. If you want a near jump, you can write Jmp WORD PTR [bx], provided the memory word pointed to by bx holds the target offset.

e) TYPE:
The TYPE operator tells the assembler to determine the type of a specified variable, i.e., the number of bytes associated with that variable. For a byte-type variable the assembler gives the value 1, for a word-type variable the value 2, and for a double word the value 4. It can be used to step through an array.

Ex: If you write the instruction
    Add si, TYPE Array
then 1 is added to SI if Array is defined with the DB directive, 2 if Array is defined with the DW directive, and so on.
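The attribute operators are typically used together when stepping through an array. The following is a minimal sketch, assuming MASM with the simplified segment directives; the names Total, Start and Next, the initial value 1, and the stack size are illustrative choices, not part of the notes.

```asm
.MODEL SMALL
.STACK 100h
.DATA
Array   DW 50 DUP (1)           ; 50 words, each initialized to 1
Total   DW ?
.CODE
Start:  MOV AX, @DATA
        MOV DS, AX
        MOV CX, LENGTH Array    ; CX = 50, the number of elements
        MOV SI, OFFSET Array    ; SI points to the first element
        XOR AX, AX              ; running sum = 0
Next:   ADD AX, [SI]            ; word-sized add, implied by the AX operand
        ADD SI, TYPE Array      ; advance by 2 bytes, since Array is DW
        LOOP Next               ; repeat CX times
        MOV Total, AX           ; Total = 50
        MOV AX, 4C00h           ; return to DOS
        INT 21h
        END Start
```

Because the count and the stride come from LENGTH and TYPE, changing the declaration to DB (and the add to a byte register) would not require touching the loop control.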

4. Miscellaneous Directives:
a) PROC and ENDP directives:
The PROC directive is used to identify the start of a procedure. PROC is preceded by a procedure name given by the user. After the PROC directive, the attribute NEAR or FAR is used to identify the type of the procedure. If the procedure is written within the same code segment where the main program resides, it is called a NEAR procedure. If the procedure is written in a different code segment, it is called a FAR procedure. The ENDP directive is used to indicate the end of the procedure. The procedure is called from the main program using the CALL instruction.

PROC and ENDP: indicate the start and end of the procedure. They require a label to indicate the name of the procedure.
NEAR: the procedure resides in the same code segment. (Local)


FAR: the procedure may reside at any location in memory.

Ex:
    Fact PROC Near          ; definition of a procedure
        ...                 ; body of the procedure
    Fact ENDP

Ex: (ADD is an instruction mnemonic and cannot be used as a procedure name, so the procedure is named Addn here)
    Addn PROC NEAR
        ADD AX,BX
        MOV CX,AX
        RET
    Addn ENDP

Note: The PROC directive itself saves nothing on the stack. The CALL instruction pushes the return address (IP for a NEAR call, CS and IP for a FAR call) and RET pops it; registers must be saved explicitly with PUSH and POP.

b) EVEN (Align on Even Memory address)
As discussed, the assembler uses a location counter to keep track of the statements in the program. The EVEN directive tells the assembler to increment the location counter to the next even address, if it is not already at an even address. That is, the variable or instruction that follows EVEN is aligned at an even address. This is done to increase the speed of accessing memory, since the 8086 reads a word at an even address in one bus cycle. In a code segment, a NOP instruction is inserted as padding where needed.

c) GROUP (Group the related segments)
It is used to inform the assembler to form a logical group of all the segments mentioned after the word GROUP. That is, all such segments and labels can be addressed using one name and the same group segment base address.
Ex: Main GROUP Code, Data, Stack
This directs the linker to prepare an EXE file such that the Code, Data and Stack segments lie within one 64 KB memory segment named Main. So, you can use statements like:
    ASSUME cs:Main, ds: Main, ss: Main

d) LABEL
During the assembly process, whenever the assembler comes across the LABEL directive, it assigns the current contents of the location counter to the declared label. The LABEL directive must be followed by a term which specifies the type you want to associate with that name. If the label is used as the destination for a jump or call, it must be specified as type NEAR or FAR. If the label references a data item, it must be specified as type byte, word or double word.

e) SEG (Segment of a label)
The SEG operator is used to determine the segment address of a label, variable or procedure.
Ex:
    mov bx, SEG Array       ; copy the segment address of label Array and
    mov ds, bx              ; store it in the DS register

f) EXTRN (External)
The EXTRN directive is used to tell the assembler that the names or labels following this directive are in some other assembly module. For example, if you want to call a procedure assembled at a different time in another program module, you must tell the assembler in your program that the procedure you are using is external. The assembler will then put information in the object code file so that the linker can connect the two modules together. For a reference to a label, you must specify whether the label is Near or Far.

Ex: If you write EXTRN Division:FAR in your program, it tells the assembler that Division is a label of type far in another assembly module.
Ex: If you want to call a Factorial procedure of Module1 from Module2, it must be declared as PUBLIC in Module1.

Note: Names or labels referred to as external in one module must be declared public, with the PUBLIC directive, in the module in which they are actually defined.

g) PUBLIC
Large programs are generally written as several separate modules. Each module is individually assembled, tested and debugged. When all the modules are working correctly, their object code files are linked together to form the complete program. In order for the modules to link together correctly, any variable name or label referred to in other modules must be declared public in the module in which it is defined. The PUBLIC directive is used to tell the assembler that a specified name or label will be accessed from other modules. For example, if you write PUBLIC Divisor, Dividend, then these two variables are available to other assembly modules.

Note: The PUBLIC and EXTRN directives are used within the SEGMENT ... ENDS brackets.

Ex:
    Module 1                    Module 2
    Mod1 SEGMENT                Mod2 SEGMENT
        PUBLIC Fact                 EXTRN Fact:FAR
    Mod1 ENDS                   Mod2 ENDS

PUBLIC takes only the name; the type (here FAR) is given in the EXTRN statement of the module that uses it.

Note: The linker will verify that every identifier appearing in an EXTRN statement is matched by a PUBLIC statement.
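The PROC, PUBLIC and EXTRN directives can be combined as in the two-module sketch below. It is only an illustration under assumptions: the file names mod1.asm and mod2.asm, the labels FactLp and Start, the stack segment Stk, and the register convention (n in CX, result in AX) are made up for this example. Each module would be assembled separately and the two .obj files linked together.

```asm
; ---- mod1.asm (hypothetical file): defines the procedure ----
Mod1    SEGMENT
        ASSUME CS:Mod1
        PUBLIC Fact             ; make Fact visible to other modules
Fact    PROC FAR                ; in: CX = n (1 to 8), out: AX = n!
        MOV AX, 1
FactLp: MUL CX                  ; DX:AX = AX * CX (DX is overwritten)
        LOOP FactLp             ; CX = CX - 1, repeat until CX = 0
        RET                     ; assembled as a far return inside a FAR PROC
Fact    ENDP
Mod1    ENDS
        END

; ---- mod2.asm (hypothetical file): calls the procedure ----
Stk     SEGMENT STACK           ; CALL needs a stack for the return address
        DW 64 DUP (?)
Stk     ENDS
Mod2    SEGMENT
        ASSUME CS:Mod2, SS:Stk
        EXTRN Fact:FAR          ; Fact is defined in another module
Start:  MOV CX, 5
        CALL Fact               ; far call: pushes CS and IP; AX = 120 on return
        MOV AX, 4C00h           ; return to DOS
        INT 21h
Mod2    ENDS
        END Start
```

The limit of n = 8 comes from the 16-bit result register: 8! = 40320 still fits in AX, while 9! does not. If the PUBLIC line is removed from the first module, both files still assemble, but the linker reports Fact as an unresolved external.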

Assembly Language Program Example:


1. Example program using .MODEL directive

    ; Example to illustrate the usage of the .MODEL directive
    .MODEL SMALL                ; module has one data segment and one code segment
    .DATA                       ; default name for the data segment
                                ; define all the variables here
    NUM  DB 20, 20H
    NUM1 DB 8 DUP (?)
                                ; end of data segment
    .CODE
    Start:                      ; start of the program code
        MOV AX, @DATA           ; load DS reg with
        MOV DS, AX              ; data segment address
        MOV AX, WORD PTR NUM    ; copy data from memory into AX
        MOV BX, WORD PTR NUM    ; copy data into reg BX
        MUL BX                  ; multiply (AX) by (BX); 32-bit result in DX and AX
        MOV WORD PTR NUM1, AX   ; store low word of result into memory
        MOV AH, 4CH
        INT 21H
    END Start                   ; end of program

Since NUM and NUM1 are declared with DB, WORD PTR is needed to access them as 16-bit operands.

The program may be written in a different form as below.

    .CODE
    .STARTUP                    ; marks the start of the program code; also
                                ; loads DS reg with base address of data segment
        MOV AX, WORD PTR NUM
        MOV BX, AX
        MUL BX
        MOV WORD PTR NUM1, AX
        MOV AX, 4C00h
        INT 21H
    .EXIT                       ; generates the same return-to-DOS code, so either form suffices
    END

2. Program without using .MODEL directive

    ; Program to illustrate the use of SEGMENT, ENDS and ASSUME directives
    ; Define all the variables here

    DATA_HERE SEGMENT           ; body of the logical segment defines all the variables
        NUM  DB 20, 20H
        NUM1 DB 8 DUP (?)
    DATA_HERE ENDS
    CODE_HERE SEGMENT
        ASSUME CS:CODE_HERE, DS:DATA_HERE
    ; Write the program code here
    Main: MOV AX, DATA_HERE     ; a segment register cannot be loaded with an
          MOV DS, AX            ; immediate operand; use any general purpose register
          MOV AX, WORD PTR NUM
          MOV BX, AX
          MUL BX                ; (DX)(AX) <- (AX) * (BX)
          MOV WORD PTR NUM1, DX ; move upper word of product into memory
          MOV AH, 4CH           ; DOS INT 21H function to return to DOS
          INT 21H
    CODE_HERE ENDS
    END Main                    ; Main is the program entry point

Different ways of initializing the segment base address in the DS register:

    (1) MOV AX, @DATA
        MOV DS, AX
    (2) MOV AX, SEG_DATA
        MOV DS, AX
    (3) MOV BX, _DATA
        MOV DS, BX
    (4) MOV CX, SEG NUM
        MOV DS, CX


Summary:
An assembler is a program that accepts an assembly language program as input, converts it into an object module and prepares it for loading into memory for execution.

The linker further converts the object module prepared by the assembler into executable form by linking it with other object modules and library modules.

The final executable map of the assembly language program is prepared by the loader at the time of loading into primary memory for actual execution.

The assembler prepares the relocation and linkage information (subroutines, ISRs) for the loader.

The operating system, which actually has control of the memory, passes to the loader the memory address at which the program is to be loaded for execution and the map of the available memory.

Based on this information and the information generated by the assembler, the loader generates an executable map of the program, physically loads it into memory and transfers control to it for execution.

Thus the basic task of an assembler is to generate the object module and prepare the loading and linking information.

Operation of an Assembler:
Assembling a program proceeds statement by statement, sequentially.

The first phase of assembling is to analyze the program to be converted. This phase, called Pass 1, defines and records the symbols, pseudo operands and directives. It also analyses the segments used by the program, the types and labels, and their memory requirements.

The second phase (Pass 2) looks up the addresses and data assigned to the labels. It also finds the codes of the instructions from the instruction machine-code database and the program data, and it processes the pseudo operands and directives.

It is the task of the assembler designer to select suitable strings to use as directives, pseudo operands or reserved words, and to decide the syntax.
