Tutorial Module 4

Computer Organization and Architecture SCR 1043
Semester 2, 09/10
Tutorial: Central Processing Unit and Pipelining

1. What is the difference between a microprocessor and a microprogram? Is it possible to design a microprocessor without a microprogram? Are all microprogrammed computers also microprocessors?

2. What general roles are performed by CPU registers?

3. Explain the difference between hardwired control and microprogrammed control.

4. What is the function of the condition codes?

5. The control logic for many CPUs is organized using microcode. What is the alternative to microcode? What are the advantages and disadvantages of using or not using microcode?

6. If the last operation on a computer with an 8-bit word was an addition in which the two operands were 2 and 3, what would be the value of the following flags?
   Carry
   Zero
   Overflow
   Sign
   Even parity
   Half-carry (auxiliary carry)

7. What if the operands of Question 6 were -1 (2s complement) and 1?

8. Trace the execution of the instructions below by showing all the changes in CPU registers (control and general purpose registers) as well as the micro-operations related to the instructions.

   Memory address   Instruction
   39D              MOV CX, NUM
   39E              ADD CX, 1
   39F              MOV NUM2, CX

9. Write the sequence of micro-operations required to add a number to the ACC when the number is
   a. an immediate operand
   b. a direct-address operand
   c. an indirect-address operand

10. The ALU can add its two input registers, and it can logically complement the bits of either input register, but it cannot subtract. Numbers are to be stored in twos complement representation. List the micro-operations the control unit must perform to execute a subtraction instruction.

11. Assume that each control memory word is 24 bits. The control portion of the microinstruction format is divided into two fields. A micro-operation field of 13 bits specifies the micro-operations to be performed. An address selection field specifies a condition, based on the flags, that will cause a microinstruction branch. There are eight flags.
   a. How many bits are in the address selection field?
   b. How many bits are in the address field?
   c. What is the size of the control memory?

12. We wish to provide 8 control words for each machine instruction routine. Machine instruction opcodes have 5 bits and control memory has 1024 words. Suggest a mapping from the instruction register to the control address register.

13. A control memory has 4096 words of 24 bits each.
   a. How many bits are there in the control address register?
   b. How many bits are there for the select field and the address field of the microinstruction?

14. Using the mapping procedure shown in Method 1 of the lecture slide, give the first microinstruction address for the following operation codes:
   a. 0010
   b. 1011
   c. 1111

15. Formulate a mapping procedure that provides 8 consecutive microinstructions for each routine. The operation code has six bits and the control memory has 2048 words.

16. Show how a 9-bit microoperation field in a microinstruction can be divided into subfields to specify 46 microoperations. How many microoperations can be specified in one microinstruction?

17. Why is a two-stage instruction pipeline unlikely to cut the instruction cycle time in half, compared with the use of no pipeline?

18. Assume an 8088 is executing a program in which the probability of a program jump is 0.1. For simplicity, assume that all instructions are two bytes long.
   a. What fraction of the instruction fetch bus cycles is wasted?
   b. Repeat if the instruction queue is 8 bytes long.

19. Assume a pipeline processor with 4 stages: fetch instruction (FI), decode instruction and calculate addresses (DA), fetch operand (FO), and execute (EX). Draw the space time diagram for a sequence of 7 instructions, in which the third instruction is a branch that is taken and in which there are no data dependencies.

20. Assume that RISC architecture ALU instructions have three register operands and that the result is stored into the first (left-most) register. The architecture is implemented in ordinary (not superscalar) pipelined fashion so that, in the best case, one instruction can be completed in each cycle. The pipeline has 5 phases (instruction fetch, instruction decode, register read, ALU, write back). Observe the following set of instructions generated by the compiler (instruction number on left):

   1  Load   R2, VarX      ; Regs(2) <- Mem(VarX)
   2  Add    R5, R5, R2    ; Regs(5) <- Regs(5) + Regs(2)
   3  Move   R2, R6        ; Regs(2) <- Regs(6)
   4  Add    R3, R3, R2
   5  Add    R2, R3, R5
   6  Jnzer  R3, Loop      ; Jnzer is jmp not zero
   7  Move   R1, R2

   (Loop is the label that instruction 6 branches to.)

   Many aspects of the preceding code segment may reduce the execution speed from the maximum possible. Describe precisely the problem types listed below and clearly mark one occurrence of each problem type in the preceding code segment (if it appears there):
   data dependency
   control dependency
   structural dependency
   How can one avoid or reduce the performance problems caused by each problem type?

21. Pipelining can be applied within the ALU to speed up floating-point operations. Consider the case of floating-point addition and subtraction. In simplified terms, the pipeline could have 4 stages:
   1. Compare the exponents;
   2. Choose the exponent and align the mantissa/significand;
   3. Add or subtract the mantissa/significand;
   4. Normalize the result.
   The pipeline can be considered to have two parallel threads, one handling exponents and one handling significands.
Answers

4. Answer: Condition codes are bits set by the CPU hardware as the result of operations. For example, an arithmetic operation may produce a positive, negative, zero, or overflow result. In addition to the result itself being stored in a register or memory, a condition code is also set. The code may subsequently be tested as part of a conditional branch operation.

6 & 7. Answer:
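The flag values asked for in Questions 6 and 7 can be checked with a short sketch. The function name add8_flags and the dictionary keys are hypothetical, chosen for illustration only. It assumes that the even-parity flag is 1 when the 8-bit result contains an even number of 1 bits and that the half-carry flag reports a carry out of bit 3; a particular CPU may define these flags differently.

```python
def add8_flags(a, b):
    """Add two 8-bit values and return (result, flags).

    Assumed conventions (not taken from the tutorial): even_parity is 1
    when the result has an even number of 1 bits, and half_carry is the
    carry out of bit 3 into bit 4.
    """
    a &= 0xFF  # reduce to 8 bits, so -1 becomes 0xFF (2s complement)
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    flags = {
        "carry": int(total > 0xFF),
        "zero": int(result == 0),
        # Signed overflow: both operands differ in sign from the result.
        "overflow": int(((a ^ result) & (b ^ result) & 0x80) != 0),
        "sign": result >> 7,
        "even_parity": int(bin(result).count("1") % 2 == 0),
        "half_carry": int((a & 0x0F) + (b & 0x0F) > 0x0F),
    }
    return result, flags


if __name__ == "__main__":
    for operands in ((2, 3), (-1, 1)):
        print(operands, add8_flags(*operands))
```

Under these assumptions, 2 + 3 gives a result of 5 with only the even-parity flag set, while -1 + 1 gives a result of 0 with carry, zero, even parity and half-carry set and overflow and sign clear.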
8. Answer:

Y = MOV CX, NUM

Clock  PC   MAR             MBR  IR  CX   Micro-operation
t0     39D  -               -    -   ?    [PC] = 39D
t1     39D  39D             -    -   ?    FETCH: MAR <- PC
t2     39E  39D             Y    -   ?    MBR <- MEM[MAR]; PC <- PC + 1
t3     39E  39D             Y    Y   ?    IR <- MBR
t4     39E  Address of NUM  Y    Y   ?    INDIRECT: MAR <- IR[address]
t5     39E  Address of NUM  NUM  Y   ?    MBR <- MEM[MAR]
t6     39E  Address of NUM  NUM  Y   NUM  EXECUTE: CX <- MBR

X = ADD CX, 1

Clock  PC   MAR  MBR  IR  CX     Micro-operation
t7     39E  -    -    -   NUM    [PC] = 39E
t8     39E  39E  -    -   NUM    FETCH: MAR <- PC
t9     39F  39E  X    -   NUM    MBR <- MEM[MAR]; PC <- PC + 1
t10    39F  39E  X    X   NUM    IR <- MBR
t11    39F  39E  X    X   NUM+1  EXECUTE: CX <- CX + 1

Z = MOV NUM2, CX

Clock  PC   MAR              MBR    IR  CX     Micro-operation
t12    39F  -                -      -   NUM+1  [PC] = 39F
t13    39F  39F              -      -   NUM+1  FETCH: MAR <- PC
t14    3A0  39F              Z      -   NUM+1  MBR <- MEM[MAR]; PC <- PC + 1
t15    3A0  39F              Z      Z   NUM+1  IR <- MBR
t16    3A0  39F              NUM+1  Z   NUM+1  EXECUTE: MBR <- CX
t17    3A0  Address of NUM2  NUM+1  Z   NUM+1  MAR <- IR[address]
t18    3A0  Address of NUM2  NUM+1  Z   NUM+1  MEM[MAR] <- MBR

10. Answer: Consider the instruction SUB R1, X, which subtracts the contents of location X from the contents of register R1, and places the result in R1.
   t1: MAR <- (IR(address))
   t2: MBR <- Memory
   t3: MBR <- Complement(MBR)
   t4: MBR <- Increment(MBR)
   t5: R1 <- (R1) + (MBR)
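The complement-and-increment sequence in Answer 10 can be sketched in a few lines. This is an illustration only: the 8-bit word size and the names WORD_BITS, sub_via_micro_ops and to_signed are assumptions, and the memory read at t1 and t2 is reduced to passing the operand value directly.

```python
WORD_BITS = 8                      # assumed word size for the illustration
MASK = (1 << WORD_BITS) - 1


def sub_via_micro_ops(r1, operand):
    """Compute R1 - operand using only complement, increment and add."""
    mbr = operand & MASK           # t1, t2: MBR <- Memory (operand at X)
    mbr = ~mbr & MASK              # t3: MBR <- Complement(MBR)
    mbr = (mbr + 1) & MASK         # t4: MBR <- Increment(MBR)
    return (r1 + mbr) & MASK       # t5: R1 <- (R1) + (MBR)


def to_signed(value):
    """Interpret a WORD_BITS-wide pattern as a twos complement number."""
    sign_bit = 1 << (WORD_BITS - 1)
    return value - (1 << WORD_BITS) if value & sign_bit else value


if __name__ == "__main__":
    print(to_signed(sub_via_micro_ops(7, 5)))
    print(to_signed(sub_via_micro_ops(5, 7)))
```

Steps t3 and t4 together form the twos complement of the operand, which is why the final addition at t5 yields the difference.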
11. Answer:
   a. Three bits are needed to specify one of 8 flags.
   b. 24 - 13 - 3 = 8 bits.
   c. 2^8 = 256 words x 24 bits = 6144 bits.
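The arithmetic in Answer 11 can be reproduced with a small helper. The function name microinstruction_layout and its return keys are hypothetical. As in the answer, it assumes the address field takes whatever bits remain in the word and that the control memory holds one word per address-field value.

```python
def microinstruction_layout(word_bits, micro_op_bits, num_flags):
    """Derive field widths and control memory size for Question 11."""
    # Bits needed to select one of num_flags conditions (3 for 8 flags).
    select_bits = (num_flags - 1).bit_length()
    # Assumption: the address field uses all remaining bits of the word.
    address_bits = word_bits - micro_op_bits - select_bits
    words = 2 ** address_bits
    return {
        "select_bits": select_bits,
        "address_bits": address_bits,
        "words": words,
        "total_bits": words * word_bits,
    }


if __name__ == "__main__":
    print(microinstruction_layout(word_bits=24, micro_op_bits=13, num_flags=8))
```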
12. Answer: An address for control memory requires 10 bits (2^10 = 1024). A very simple mapping would be this:
   opcode           XXXXX
   control address  00XXXXX000
This allows 8 words between successive addresses.

13. Answer:
   a. 12 bits in CAR.
   b. 12 bits for Select, 12 bits for Address.

14. Answer:
   a. 0 0010 00
   b. 0 1011 00
   c. 0 1111 00

15. Answer: The CAR has 11 bits in total: 6 bits for the opcode and 3 bits for the microinstructions within a routine, with the two high-order bits set to 0.
   00 000000 000 (starting address)
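The mappings in Answers 12, 14 and 15 all place the opcode above a block of zero bits and pad the top with zeros. A minimal sketch of that idea follows. The function name map_opcode and its parameters are hypothetical, and the reading of Method 1 as a 7-bit address with 4 words per routine is inferred from the bit patterns in Answer 14 rather than from the lecture slide itself.

```python
def map_opcode(opcode, opcode_bits, words_per_routine, car_bits):
    """Return the first microinstruction address as a binary string.

    The opcode is shifted left by the number of bits needed to index
    words_per_routine words; unused high-order CAR bits stay 0.
    """
    offset_bits = (words_per_routine - 1).bit_length()
    if not 0 <= opcode < (1 << opcode_bits):
        raise ValueError("opcode does not fit in opcode_bits")
    if opcode_bits + offset_bits > car_bits:
        raise ValueError("CAR is too narrow for this mapping")
    return format(opcode << offset_bits, f"0{car_bits}b")


if __name__ == "__main__":
    # Question 14, assuming Method 1 means 4 words per routine, 7-bit CAR.
    for op in (0b0010, 0b1011, 0b1111):
        print(map_opcode(op, opcode_bits=4, words_per_routine=4, car_bits=7))
    # Question 12: 5-bit opcode, 8 words per routine, 1024-word memory.
    print(map_opcode(0b11111, opcode_bits=5, words_per_routine=8, car_bits=10))
```

For Question 15 the same call with opcode_bits=6, words_per_routine=8 and car_bits=11 reproduces the 00 XXXXXX 000 pattern.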
17. Answer: There are two reasons:
   a. The execution time will generally be longer than the fetch time. Execution will involve reading and storing operands and the performance of some operation. Thus, the fetch stage may have to wait for some time before it can empty its buffer.
   b. A conditional branch instruction makes the address of the next instruction to be fetched unknown. Thus, the fetch stage must wait until it receives the next instruction address from the execute stage. The execute stage may then have to wait while the next instruction is fetched.

18. Answer: The occurrence of a program jump wastes up to 4 bus cycles (corresponding to the 4 bytes in the instruction queue when the jump is encountered). For 100 instructions, the number of non-wasted bus cycles is, on average, 90 x 2 = 180.
   a. The number wasted is as high as 10 x 4 = 40. Therefore the fraction of wasted cycles is 40/(180 + 40) = 0.18.
   b. If the capacity of the instruction queue is 8, then the fraction of wasted cycles is 80/(180 + 80) = 0.31.

19. Answer:

21. Answer:
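The worst-case estimate in Answer 18 can be reproduced with a short calculation. The function name wasted_fetch_fraction and its parameters are hypothetical. The sketch follows the same simplified model as the answer: one bus cycle per byte fetched, every jump discards a full instruction queue, and only the bytes of non-jump instructions count as useful.

```python
def wasted_fetch_fraction(jump_probability, instruction_bytes, queue_bytes,
                          instructions=100):
    """Worst-case fraction of instruction fetch bus cycles that are wasted."""
    jumps = jump_probability * instructions
    # Useful cycles: bytes of the instructions that are not jumps.
    useful = (instructions - jumps) * instruction_bytes
    # Wasted cycles: a full queue is thrown away at every jump.
    wasted = jumps * queue_bytes
    return wasted / (useful + wasted)


if __name__ == "__main__":
    for queue in (4, 8):
        fraction = wasted_fetch_fraction(0.1, 2, queue)
        print(f"queue of {queue} bytes: {fraction:.2f}")
```

Because the model assumes the queue is always full when a jump occurs, the results are upper bounds, matching the "as high as" wording in the answer.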
Extra Questions

1. Trace the execution of the instructions below by showing all the changes in CPU registers (control and general purpose registers) as well as the micro-operations related to the instructions.

   Memory address   Memory content   Instruction/Data
   39D              A450             L1: ADD CX, NUM
   39E              B451             SUB VAL1, CX
   39F              8C00             INC CX
   3A0              C39D             JMP L1
   450              100              NUM
   451              500              VAL1

2. There are 6 segments as follows:

   Segment name             Segment execution time (ns)
   Fetch Instruction (FI)   52
   Decode Instruction (DI)  20
   Calculate Operand (CO)   40
   Fetch Operand (FO)       30
   Execute Instruction (EI) 40
   Write Operand (WO)       35

   The interface delay is 4 ns.
   a. What is the execution time for 10 instructions without pipeline?
   b. What is the execution time for 10 instructions with pipeline?
   c. What is the maximum/theoretical speedup?
   d. What is the real speedup?

3. Assume a pipeline processor with 4 stages: instruction fetch (IF), operand fetch (OF), instruction execute (IE) and operand store (OS). Draw the space time diagram and solve the problem of data dependency using NOP for the following sequence of instructions:

   ADD AH, BH
   SUB AH, 2
   CMP AH, 0

4. With the same pipeline processor as Question 3, draw the space time diagram for the following sequence of instructions:

       ADD AH, BH
       SUB CH, 2
       JMP A1
       AND AH, 011F
       SHL BH, 3
   A1: MUL AH, BH

   a. Solve the branching problem using NOP.
   b. Solve the same problem by rearranging the instructions.
