
Introduction to x86 Assembly Language

http://www.c-jump.com/CIS77/ASM/Assembly/lecture.html

CIS-77 Home http://www.c-jump.com/CIS77/CIS77syllabus.htm



1. Advantages of High-Level Languages
2. Why program in Assembly ?
3. Here is why...
4. Speed, Efficiency, Debugging, Optimization...
5. Why MASM ?
6. Introduction to 80x86 Assembly Language
7. Materials on the Web
8. Useful books, in no particular order
9. Fundamental Concepts
10. Software Environment
11. Runtime Environment
12. M1.ASM
13. Assembly and C Code Compared
14. More Assembly and C Code
15. Assembly vs. Machine Language
16. Controlling Program Flow
17. Conditional Jumps
18. General-Purpose Registers
19. Typical Uses of General-Purpose Registers
20. x86 Registers
21. x86 Registers, Cont
22. x86 Control Registers
23. MOV, Data Transfer Instructions
24. Ambiguous MOVes: PTR and OFFSET
25. INC and DEC Arithmetic Instructions
26. ADD Arithmetic Instruction
27. ADD vs. INC
28. SUB Arithmetic Instruction
29. SUB vs. DEC
30. CMP instruction
31. Unconditional Jumps
32. Conditional Jumps
33. Conditional Jumps, Cont
34. Conditional Jumps, Cont
35. LOOP Instruction
36. Logical Instructions
37. Logical Instructions, Cont.
38. Shift Instructions
39. SHL and SHR Shift Instructions
40. Shift Instructions Examples
41. Rotate Instructions
42. ROL and ROR, Rotate Without Carry
43. RCL and RCR, Rotate With Carry
44. EQU directive
45. EQU Directive Syntax

1. Advantages of High-Level Languages


High-level language programs are portable. (Some programs may still contain a few machine-dependent details, but they can be used with little or no modification on other types of machines.)
High-level instructions:
- Program development is faster
- Fewer lines of code
- Program maintenance is easier
The compiler translates the program to the target machine language.

2. Why program in Assembly ?


There are some disadvantages...
- Assembly language programs are not portable!
- Learning assembly is more difficult than learning Java!
- Programming in assembly language is a tedious and error-prone process.
- High-level languages should be the natural preference for common applications.

3. Here is why...
"I just don't consider a utility program that's 4 megabytes big, and contains all sorts of files that the author didn't create, to be really great software. Do you?"
Steve Gibson, Gibson Research Corporation.

Assembly language programs contain only the code that is necessary to perform the given task.
Assembly gives direct and complete control over system hardware:
- Writing device drivers.
- Operating system design.
- Embedded systems programming, e.g. in the aviation industry.
- Writing in-line assembly (mixed-mode) in high-level languages such as C/C++, or hybrid programming in assembly and C/C++.

4. Speed, Efficiency, Debugging, Optimization...


There are areas where speed is everything, for example, internet data encryption, aircraft navigational systems, medical hardware control...
There are also areas where space-efficiency is everything: spacecraft control software...
Understanding the disassembly view of an executable program is also useful:
- for investigating the cause of serious bugs or crashes that require understanding of memory dumps and disassembled code.
- for optimizing your code.
- for practical and educational purposes.

5. Why MASM ?
- The "granddaddy" of all assemblers for the Intel platform, a product of Microsoft.
- Available since the beginning of IBM-compatible PCs.
- Works in MS-DOS and Windows environments.
- It's free: Microsoft no longer sells MASM as a standalone product.
- Bundled with the Microsoft Visual Studio product.
- Numerous tutorials, books, and samples are floating around; many are free or low-cost.
- Steve Hutchesson's www.masm32.com: the MASM32 development environment incorporates the MASM assembler and Win32 API tools.

6. Introduction to 80x86 Assembly Language


- Logic gates are used at the hardware level.
- What is machine language?
- How are high-level language concepts, such as if-else statements, realized at the machine level?
- What about interactions with operating system functions?
- How is assembly language translated into machine language?
These fundamental questions apply to most computer architectures. By using assembly, we gain an understanding of how a particular model of computer works.

7. Materials on the Web


"Such secrets have been revealed to me that all I have written now appears of little value."
St. Thomas Aquinas, December 6, 1273.

Useful links:
- Microsoft MASM Programmer's Guide, Assembly-Language Development System v6.1 (also at another location). The MASM Reference Guide can be downloaded there, too.
- More here: Assembly Technical Documentation in PDF and MS Word format
- Intel and Microsoft MASM 6.1 Documentation
- A web page with a variety of assembler source code
- Intel 80x86 Conditional and Unconditional Branching Examples
- Intel 80x86 Boolean and Arithmetic Instruction Examples
- You can get Microsoft's Macro Assembler free: download the Microsoft Windows Driver Development Kit (DDK), which contains both the assembler and the linker. Also, download Microsoft's Debugging Tools for Windows, 32-bit version.
- Take a look at the textbook info for Sivarama P. Dandamudi's Introduction to Assembly Language Programming, From 8086 to Pentium. The homepage includes a free downloadable Microsoft assembler, MASM, and student slides.
- Last, but not least, the Microsoft Macro Assembler Reference, an MSDN resource.

8. Useful books, in no particular order

Intel Architecture Software Developer's Manual
1. Volume 1, Intel Basic Architecture: Order Number 243190, PDF, 2.6 MB.
2. Volume 2, Instruction Set Reference: Order Number 243191, PDF, 6.6 MB.
3. Volume 3, System Programming Guide: Order Number 243192, PDF, 5.1 MB.

It is highly recommended that you download the above manuals and use them as a reference.

Introduction to 80x86 Assembly Language and Computer Architecture
by Richard C. Detmer, Professor of Computer Science at Middle Tennessee State University, Tennessee.
Jones and Bartlett Publishers, 2001 (499 pages)
ISBN-13: 9780763717735, ISBN-10: 0763717738
Hardcover, 512 pages, 2001
Excellent book for beginners.

The Intel Family Of Microprocessors: Hardware and Software Principles and Applications (Hardcover)
by James L. Antonakos
ISBN: 1418038458, Date: 2006, Pages: 640
Solid book, covers Pentium CPUs.

Professional Assembly Language
by Richard Blum
Publisher: Wrox, Date: 2005, Pages: 567, ISBN: 0764579010
Covers Linux programming.

PC Assembly Language
Free book online by Paul A. Carter, November 11, 2003

Win32 Assembler Coding For Crackers (free online tutorial)
Author: Goppit. Size: 11.31 MB
"First go away and learn assembler, then come back and read this."
An introduction to Win32 Assembler programming aimed at filling the gap between the complete beginner and the advanced.

Introduction to Assembly Language Programming: For Pentium and RISC Processors
by Sivarama P. Dandamudi
Publisher: Springer, 2nd edition, Date: 2004, Pages: 696, ISBN: 0387206361
Highly recommended, in-depth coverage of concepts.

Use Google to search for "MASM programmer's guide chm".
By Microsoft, 1992; covers assembler version 6.1.

Assembly Language for Intel-Based Computers
by Kip R. Irvine
Publisher: Prentice Hall, 4th Edition, 2002, Pages: 700, ISBN: 0130910139
Excellent book, lots of sample code, in-depth coverage of BIOS, Win32, MS-DOS.

32/64-bit 80x86 Assembly Language Architecture
by James Leiterman
Publisher: Wordware Publishing, Inc., Date: 2005, Pages: 450, ISBN: 1598220020
Online resources: James Leiterman
Advanced book for game and graphics programmers.

9. Fundamental Concepts
- CPU registers
- Memory addressing
- Representation of data:
  - numeric formats
  - character strings
- Instructions to operate on 2's complement integers
- Instructions to operate on individual bits
- Instructions to handle strings of characters
- Instructions for branching and looping
- Coding of procedures:
  - transfer of control
  - parameter passing
  - local variables

10. Software Environment


The tools we will use include:
- Visual Studio development environment: edit, assemble, link, manage projects, debug and disassemble programs.
- Command-line MASM, Microsoft Macro Assembler: produces code for the 32-bit flat memory model appropriate to modern Windows.
- Test-drive full-screen 32-bit debuggers: OllyDbg, Visual Studio, WinDbg.
- DUMPBIN: command-line utility that examines binary files and disassembles programs.

11. Runtime Environment


- The program runs on the processor.
- The program uses operating system functions and services.
- The program uses one of the memory models:
  - Real mode flat model: 65,536 bytes of addressable memory (ancient MS-DOS .COM files)
  - Real mode segmented model: 1 megabyte (prime-time MS-DOS)
  - Protected mode flat model: modern Windows and Linux
- Addressable memory: 80486 and Pentium, 4 gigabytes.
- As far as 32-bit Vista is concerned, the world ends at 4,096 megabytes.
- A 32-bit program can address up to 4 gigabytes of memory.
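The memory sizes quoted for the three models follow from the number of address bits. As a rough sketch (the helper name addressable_bytes is made up for this illustration, and the 16/20/32-bit widths are the usual figures for these models rather than something stated on the slide):

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct byte addresses reachable with the given number
 * of address bits: 2 raised to the power address_bits.
 * Hypothetical helper, for illustration only. */
uint64_t addressable_bytes(unsigned address_bits)
{
    assert(address_bits < 64);          /* keep the shift well defined */
    return (uint64_t)1 << address_bits;
}
```

Under those assumptions, 16 bits gives 65,536 bytes, 20 bits gives 1 megabyte, and 32 bits gives 4 gigabytes, which is the same as 4,096 megabytes.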

; CIS-77
; your_program_name.asm
; Brief description of what the program does

.386                 ; Tells MASM to use Intel 80386 instruction set.
.MODEL FLAT          ; Flat memory model
option casemap:none  ; Treat labels as case-sensitive

.CONST               ; Constant data segment
.STACK 100h          ; (default is 1-kilobyte stack)
.DATA                ; Begin initialized data segment
.CODE                ; Begin code segment
_main PROC           ; Beginning of code
ret
_main ENDP
END _main            ; Marks the end of the module and sets the program entry point label

12. Assembly and C Code Compared


Some simple high-level language instructions can be expressed by a single assembly instruction:
Assembly Code      C Language Code
----------------   ---------------------------------
inc result         ++result;      // Increment value
mov size, 1024     size = 1024;   // Assign value
and var, 128       var &= 128;    // Apply AND bitmask
add value, 10      value += 10;   // Addition

13. More Assembly and C Code


Most high-level language instructions need more than one assembly instruction:
Assembly Code      C Language Code
----------------   ---------------------------------
mov AX, value      size = value;       // Assign variable
mov size, AX

mov AX, sum        sum += x + y + z;   // Arithmetic computation
add AX, x
add AX, y
add AX, z
mov sum, AX

14. Assembly vs. Machine Language


Assembly language uses mnemonics, numeric constants, comments, etc. Machine language instructions are just sequences of 1s and 0s. Assembly language instructions are much more readable than machine language instructions:
Assembly Language   Machine Language (in Hex)
-----------------   -------------------------
inc result          FF060A00
mov size, 45        C7060C002D00
and var, 128        80260E0080
add value, 10       83060F000A

15. Controlling Program Flow


Just as in a high-level language, you want to control program flow. The JMP instruction transfers control unconditionally to another instruction. JMP corresponds to goto statements in high-level languages:
; Handle one case
label1:
.
.
.
jmp done
; Handle second case
label2:
.
.
.
jmp done
.
.
done:

16. Conditional Jumps


A conditional jump is taken only if the condition is met. Condition testing is separated from branching. The flag register is used to convey the condition test result. For example:
cmp ax, bx
je done
.
.
done:

17. General-Purpose Registers

The EAX, EDX, ECX, EBX, EBP, EDI, and ESI registers are 32-bit general-purpose registers, used for temporary data storage and memory access. The AX, DX, CX, BX, BP, DI, and SI registers are the 16-bit equivalents of the above; they represent the low-order 16 bits of the 32-bit registers. The AH, DH, CH, and BH registers represent the high-order 8 bits of the corresponding 16-bit registers. Similarly, AL, DL, CL, and BL represent the low-order 8 bits of the registers.

Since the processor accesses registers more quickly than it accesses memory, you can make your programs run faster by keeping the most frequently used data in registers.
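The way AX, AH, and AL overlap EAX can be mimicked in C with masks and shifts. This is only a sketch of the bit layout described above; the function names (low16, high8, low8, write_low8) are invented for the illustration and are not part of any assembler or library.

```c
#include <stdint.h>

/* AX: the low-order 16 bits of a 32-bit register value. */
uint16_t low16(uint32_t reg)
{
    return (uint16_t)(reg & 0xFFFFu);
}

/* AH: bits 8..15, the high-order byte of the 16-bit part. */
uint8_t high8(uint32_t reg)
{
    return (uint8_t)((reg >> 8) & 0xFFu);
}

/* AL: bits 0..7, the low-order byte. */
uint8_t low8(uint32_t reg)
{
    return (uint8_t)(reg & 0xFFu);
}

/* Writing AL replaces only bits 0..7; the other 24 bits are kept. */
uint32_t write_low8(uint32_t reg, uint8_t value)
{
    return (reg & 0xFFFFFF00u) | value;
}
```

For a register value of 12345678h, this gives AX = 5678h, AH = 56h, and AL = 78h; storing FFh into AL yields 123456FFh.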

18. Typical Uses of General-Purpose Registers

Register   Size     Typical Uses
--------   ------   ---------------------------------------------------
EAX        32-bit   Accumulator for operands and results
EBX        32-bit   Base pointer to data in the data segment
ECX        32-bit   Counter for loop operations
EDX        32-bit   Data pointer and I/O pointer
EBP        32-bit   Frame Pointer - useful for stack frames
ESP        32-bit   Stack Pointer - hardcoded into PUSH and POP operations
ESI        32-bit   Source Index - required for some array operations
EDI        32-bit   Destination Index - required for some array operations
EIP        32-bit   Instruction Pointer
EFLAGS     32-bit   Result Flags - hardcoded into conditional operations

19. x86 Registers


Four 32-bit registers can be used as:
- Four 32-bit registers: EAX, EBX, ECX, EDX.
- Four 16-bit registers: AX, BX, CX, DX.
- Eight 8-bit registers: AH, AL, BH, BL, CH, CL, DH, DL.
Some registers have special uses, for example ECX holds the count in LOOP and REPeatable instructions.

20. x86 Registers, Cont

Two index registers, ESI (source index) and EDI (destination index):
- Can be used as 16-bit or 32-bit registers
- Also used in string processing instructions
- In addition, ESI and EDI can be used as general-purpose data registers

Two pointer registers, ESP (stack pointer) and EBP (base pointer):
- 16-bit or 32-bit registers
- Used almost exclusively to maintain the stack.

21. x86 Control Registers


EIP: program counter (Instruction Pointer).
EFLAGS is a set of bit flags:
- Status flags record status information about the result of the last arithmetic/logical instruction.
- The direction flag stores the forward/backward direction for data copying.
- System flags store:
  - IF, the interrupt-enable mode
  - TF, the trap flag used in single-step debugging.

22. MOV, Data Transfer Instructions


The MOV instruction copies the source operand to the destination operand without affecting the source. Five types of operand combinations are allowed with MOV:
Instruction type          Example
-----------------------   ------------------
mov register, register    mov DX, CX
mov register, immediate   mov BL, 100
mov register, memory      mov EBX, [count]
mov memory, register      mov [count], ESI
mov memory, immediate     mov [count], 23

Note: the above operand combinations are valid for most instructions that require two operands.

23. Ambiguous MOVes: PTR and OFFSET


For the following data definitions
.DATA
table1 DW 20 DUP (?)
status DB 7 DUP (0)
.CODE
mov EBX, table1   ; "instruction operands must be the same size"
mov ESI, status   ; "instruction operands must be the same size"
mov [EBX], 100    ; "invalid instruction operands"
mov [ESI], 100    ; "invalid instruction operands"

The above MOV instructions are rejected by the assembler. In the first two, the 32-bit register does not match the size of the word or byte variable. The last two are ambiguous: it is not clear whether the assembler should use the byte or the word equivalent of 100. Better:
mov EBX, OFFSET table1
mov ESI, OFFSET status
mov WORD PTR [EBX], 100
mov BYTE PTR [ESI], 100
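The reason the operand size matters is that the same constant 100 overwrites a different number of bytes depending on whether it is stored as a byte or as a word. The following C sketch is only an analogy to the BYTE PTR and WORD PTR stores; the function names and the 0xAA fill pattern are assumptions made for the illustration.

```c
#include <stdint.h>
#include <string.h>

enum { BUF_LEN = 4, FILL = 0xAA };

/* Count how many bytes no longer hold the fill pattern. */
static int count_changed(const unsigned char *buf)
{
    int changed = 0;
    for (int i = 0; i < BUF_LEN; i++) {
        if (buf[i] != FILL) {
            changed++;
        }
    }
    return changed;
}

/* Rough analogue of: mov BYTE PTR [ESI], 100 */
int bytes_changed_by_byte_store(void)
{
    unsigned char buf[BUF_LEN];
    uint8_t value = 100;
    memset(buf, FILL, sizeof buf);
    memcpy(buf, &value, sizeof value);   /* writes exactly 1 byte */
    return count_changed(buf);
}

/* Rough analogue of: mov WORD PTR [EBX], 100 */
int bytes_changed_by_word_store(void)
{
    unsigned char buf[BUF_LEN];
    uint16_t value = 100;
    memset(buf, FILL, sizeof buf);
    memcpy(buf, &value, sizeof value);   /* writes exactly 2 bytes */
    return count_changed(buf);
}
```

The byte store touches one byte and the word store touches two, so without PTR the assembler has no way to know how much memory the instruction should modify.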

24. INC and DEC Arithmetic Instructions


Format:
inc destination
dec destination

Semantics:
destination = destination +/- 1

The destination can be an 8-bit, 16-bit, or 32-bit operand, in memory or in a register. No immediate operand is allowed. Examples:
inc BX        ; BX = BX + 1
dec [value]   ; value = value - 1

25. ADD Arithmetic Instruction


Format:
add destination, source

Semantics:
destination = (destination) + (source)

Examples:
add ebx, eax
add [value], 10h

26. ADD vs. INC


Note that
inc eax

is better than
add eax, 1

INC takes less space. Both INC and ADD execute at about the same speed.

27. SUB Arithmetic Instruction


Format:
sub destination, source

Semantics:
destination = (destination) - (source)

Examples:
sub ebx, eax
sub [value], 10h

28. SUB vs. DEC


Note that
dec eax


is better than
sub eax, 1

DEC takes less space. Both execute at about the same speed.

29. CMP instruction


Format:
cmp destination, source

Semantics:
(destination) - (source)

The destination and source are not altered; CMP only sets the flags according to the result of the subtraction. Useful to test relationships such as <, >, or = between the two operands. Used in conjunction with conditional jump instructions for decision making. Examples:
cmp ebx, eax
je done       ; jump if equal
..
done:
..

30. Unconditional Jumps


Format:
jmp label

Semantics: Execution is transferred to the instruction identified by the label. Infinite loop example:
mov eax, 1
inc_again:
inc eax
jmp inc_again
mov ebx, eax   ; this will never execute...

31. Conditional Jumps


Format:
jcondition label

Semantics: Execution is transferred to the instruction identified by label only if the condition is met. Example of testing for a carriage return:
; Assume that AL contains input character.
cmp al, 0dh   ; 0dh = ASCII carriage return
je CR_received
inc cl
..
CR_received:

32. Conditional Jumps, Cont


Some conditional jump instructions treat operands of the CMP instruction as signed numbers:
je    jump if equal
jg    jump if greater
jl    jump if less
jge   jump if greater or equal
jle   jump if less or equal
jne   jump if not equal

33. Conditional Jumps, Cont


Some conditional jump instructions can also test values of the individual CPU flags:
jz    jump if zero        (ZF = 1)
jnz   jump if not zero    (ZF = 0)
jc    jump if carry       (CF = 1)
jnc   jump if not carry   (CF = 0)

jz    is synonymous for je
jnz   is synonymous for jne
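To make the link between CMP and the conditional jumps concrete, here is a small C model of the flags. It is a sketch, not the processor's definition: the Flags struct and the function names are hypothetical, and the flag rules for the signed jumps (JL taken when SF differs from OF, JG taken when ZF is clear and SF equals OF) come from the standard x86 condition codes rather than from the slides.

```c
#include <stdbool.h>
#include <stdint.h>

/* The four status flags that the jumps above depend on. */
typedef struct {
    bool zf;   /* zero flag                 */
    bool cf;   /* carry flag (borrow)       */
    bool sf;   /* sign flag                 */
    bool of;   /* signed overflow flag      */
} Flags;

/* Model of: cmp a, b  (computes a - b, keeps only the flags). */
Flags cmp32(uint32_t a, uint32_t b)
{
    uint32_t diff = a - b;
    Flags f;
    f.zf = (diff == 0);
    f.cf = (a < b);                                    /* unsigned borrow */
    f.sf = ((diff >> 31) & 1u) != 0;
    f.of = ((((a ^ b) & (a ^ diff)) >> 31) & 1u) != 0; /* signed overflow */
    return f;
}

bool je_taken(Flags f) { return f.zf; }                 /* same as jz */
bool jc_taken(Flags f) { return f.cf; }
bool jl_taken(Flags f) { return f.sf != f.of; }         /* signed less */
bool jg_taken(Flags f) { return !f.zf && f.sf == f.of; } /* signed greater */
```

Comparing FFFFFFFFh with 1 shows why signedness matters: read as signed numbers, -1 is less than 1, so JL is taken, while no borrow occurs in the unsigned subtraction, so JC is not taken.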

34. LOOP Instruction


Format:
loop target

Semantics: Decrements ECX and jumps to target if ECX is not zero after the decrement. ECX should be loaded with the loop count value before the loop begins. Loop 50 times example:
mov ecx, 50
repeat:
; loop body:
..
loop repeat
..

Equivalent to (except that DEC modifies the flags, while LOOP does not):
mov ecx, 50
repeat:
; loop body:
..
dec ecx
jnz repeat
..

Surprisingly,
dec ecx
jnz repeat

executes faster than
loop repeat
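The LOOP semantics correspond to a do-while loop in C: the body runs first, then the counter is decremented and tested. The sketch below counts how many times the body would run for a given starting value of ECX; loop_body_runs is a name invented for this illustration.

```c
#include <stdint.h>

/* Number of times the loop body executes for:
 *     mov ecx, count
 *   label:
 *     ; loop body
 *     loop label
 */
uint64_t loop_body_runs(uint32_t count)
{
    uint32_t ecx = count;
    uint64_t runs = 0;
    do {
        runs++;             /* the loop body                      */
        ecx--;              /* LOOP decrements ECX first...       */
    } while (ecx != 0);     /* ...and jumps back while ECX != 0   */
    return runs;
}
```

Because the test happens after the decrement, the body always runs at least once. Starting with ECX equal to zero would wrap around to FFFFFFFFh and keep going for 2^32 iterations, which is worth guarding against when the count is computed at run time.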

35. Logical Instructions


Format:
and destination, source
or destination, source
xor destination, source
not destination

Semantics: Perform the standard bitwise logical operations. The result goes to the destination. TEST is a non-destructive AND instruction:
test destination, source

TEST performs a logical AND but the result is not stored in destination (similar to the CMP instruction).

36. Logical Instructions, Cont.


Example of testing the value in AL for odd/even number:
test al, 01h      ; test the least significant bit
je even_number
odd_number:
; process odd number
..
jmp next
even_number:
; process even number
..
next:
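The same odd/even check reads as follows in C. This is a minimal sketch: zf_after_test models only the zero flag that TEST would set, and both function names are made up for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Zero flag after: test value, mask
 * ZF is set when (value AND mask) is zero; value is not modified. */
bool zf_after_test(uint8_t value, uint8_t mask)
{
    return (value & mask) == 0;
}

/* A number is even when its least significant bit is clear,
 * which is the case where "je even_number" is taken. */
bool is_even(uint8_t al)
{
    return zf_after_test(al, 0x01);
}
```

JE works here because it is the same instruction as JZ: the jump is taken when the AND result is zero.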

37. Shift Instructions


Shift left format:
shl destination, count
shl destination, cl

Shift right format:
shr destination, count
shr destination, cl

where count is an immediate value. Semantics: Performs a left/right bit-shift of destination by the value in count or in the CL register. The contents of the CL register are not altered.

38. SHL and SHR Shift Instructions

The last bit shifted out goes into the carry flag CF. A zero bit is shifted in at the other end.

39. Shift Instructions Examples


Count is an immediate value:
shl eax, 5

Specifying a count greater than 31 is not meaningful: only the least significant 5 bits of the count are actually used. The CL version of the shift is useful if the shift count is known only at run time, e.g. when the shift count is a parameter in a procedure call.

Only the CL register can be used. The shift count value should be loaded into CL:
mov cl, 5
shl ax, cl
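The count masking and the carry flag behavior can be sketched in C for a 32-bit operand. The ShiftResult struct and the shl32/shr32 names are assumptions made for this illustration; cf_in stands for whatever the carry flag held before the instruction, since a shift by zero leaves it unchanged.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t value;   /* destination after the shift */
    bool     cf;      /* carry flag after the shift  */
} ShiftResult;

/* Model of: shl r32, count */
ShiftResult shl32(uint32_t value, uint8_t count, bool cf_in)
{
    ShiftResult r = { value, cf_in };
    count &= 31;                                  /* only the low 5 bits count */
    if (count != 0) {
        r.cf = ((value >> (32 - count)) & 1u) != 0;  /* last bit shifted out */
        r.value = value << count;
    }
    return r;
}

/* Model of: shr r32, count */
ShiftResult shr32(uint32_t value, uint8_t count, bool cf_in)
{
    ShiftResult r = { value, cf_in };
    count &= 31;
    if (count != 0) {
        r.cf = ((value >> (count - 1)) & 1u) != 0;   /* last bit shifted out */
        r.value = value >> count;
    }
    return r;
}
```

With this model a count of 33 behaves like a count of 1, and shifting 80000000h left by one produces zero with the carry flag set.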

40. Rotate Instructions


Two types of rotate instructions:
1. Rotate without carry:
   - ROL (ROtate Left)
   - ROR (ROtate Right)
2. Rotate with carry:
   - RCL (Rotate through Carry Left)
   - RCR (Rotate through Carry Right)
The rotate instruction operands are similar to those of the shift instructions, and two versions are supported:
- Immediate count value
- Count value is in the CL register

41. ROL and ROR, Rotate Without Carry

42. RCL and RCR, Rotate With Carry
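The difference between the two kinds of rotation can be sketched in C for 32-bit operands. In ROL and ROR the bit that leaves one end re-enters at the other end. In RCL and RCR the carry flag acts as an extra 33rd bit in the ring. The names below (rol32, ror32, rcl32_by1, rcr32_by1, RotateResult) are invented for the illustration, and the through-carry versions are shown for a count of one only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Rotate without carry: bits leaving one end re-enter at the other. */
uint32_t rol32(uint32_t value, unsigned count)
{
    count &= 31;
    return count ? (value << count) | (value >> (32 - count)) : value;
}

uint32_t ror32(uint32_t value, unsigned count)
{
    count &= 31;
    return count ? (value >> count) | (value << (32 - count)) : value;
}

typedef struct {
    uint32_t value;   /* destination after the rotate */
    bool     cf;      /* carry flag after the rotate  */
} RotateResult;

/* Rotate through carry left by one: the old CF enters bit 0,
 * and the old bit 31 becomes the new CF. */
RotateResult rcl32_by1(uint32_t value, bool cf)
{
    RotateResult r;
    r.value = (value << 1) | (cf ? 1u : 0u);
    r.cf = ((value >> 31) & 1u) != 0;
    return r;
}

/* Rotate through carry right by one: the old CF enters bit 31,
 * and the old bit 0 becomes the new CF. */
RotateResult rcr32_by1(uint32_t value, bool cf)
{
    RotateResult r;
    r.value = (value >> 1) | (cf ? 0x80000000u : 0u);
    r.cf = (value & 1u) != 0;
    return r;
}
```

For example, rotating 80000001h left by one gives 00000003h, whereas rotating 80000000h left through a clear carry gives zero and leaves the carry flag set.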


43. EQU directive


The EQU directive eliminates hardcoding:
NUM_OF_STUDENTS EQU 90
..
mov ecx, NUM_OF_STUDENTS

No reassignment is allowed. Only numeric constants are allowed. Defining constants has two main advantages:
1. Improves program readability
2. Helps in software maintenance.
mov ecx, 90   ; HARDCODING is less readable and harder to maintain

Multiple occurrences can be changed from a single place. The convention is to use all UPPER-CASE LETTERS for names of constants.

44. EQU Directive Syntax


name EQU expression

Assigns the result of expression to name. The expression is evaluated at assembly time. More examples:
NUM_OF_ROWS   EQU   50
NUM_OF_COLS   EQU   10
ARRAY_SIZE    EQU   NUM_OF_ROWS * NUM_OF_COLS
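For comparison with the C side of the course, a close analogue of these EQU definitions is a set of enumeration constants, which are also evaluated before the program runs and cannot be reassigned. This is only a sketch of the analogy; the table array and the table_length helper are hypothetical additions for the example.

```c
/* Named constants, evaluated at compile time (no storage is allocated). */
enum {
    NUM_OF_ROWS = 50,
    NUM_OF_COLS = 10,
    ARRAY_SIZE  = NUM_OF_ROWS * NUM_OF_COLS
};

/* Because ARRAY_SIZE is a constant expression, it can size a static array. */
static int table[ARRAY_SIZE];

/* Number of elements in the table, derived from its declared size. */
int table_length(void)
{
    return (int)(sizeof table / sizeof table[0]);
}
```

Changing NUM_OF_ROWS or NUM_OF_COLS in one place updates ARRAY_SIZE and the array dimension together, which is the maintenance benefit described above.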
