
ECE 366: Computer Architecture
Lecture Notes # 14
Pipeline Basics

Shantanu Dutt

Department of Electrical and Computer Engineering
University of Illinois, Chicago
Phone: (312) 355-1314; e-mail: dutt@eecs.uic.edu
URL: http://www.eecs.uic.edu/dutt


Pipelining Introduction

1. Pipelining is the processing concept in which the entire processing flow is broken up into multiple stages. A stage can start on a new data/instruction as soon as it is done with the current one, which then goes on to the next stage for further processing.

2. In non-pipelined processing, by contrast, the next data/instruction is processed only after the entire processing of the previous data/instruction is complete.

[Figure: a single processing path of delay D versus a path split into k stages, each of delay D/k.]

(a) Non-pipelined processing: a new data/instr. is input every D secs. A data/instr. is done every D secs.

(b) Pipelined processing: a new data/instr. is input every D/k secs. After the initial "fill", a data/instr. is done every D/k secs.

3. The processing of a data/instruction in a pipeline is complete when it exits the last stage. If there are k stages, each of equal delay, then the processing throughput (# of data/instructions processed per second) increases roughly by a factor of k.

4. The real world has pipelining! E.g., in a water/oil pipeline, processing is sending all the fluid from source to destination; in TV/radio signal communication, processing is sending all the signals from source to destination.
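
The "roughly" in item 3 can be made concrete with a small calculation. The sketch below is illustrative only: the function name `speedup` and the use of Python are not from these notes, and it assumes k equal stages of delay D/k with no stage-register overhead.

```python
def speedup(n: int, k: int) -> float:
    """Ratio of non-pipelined to pipelined time for n items and k equal stages.

    Non-pipelined time: n * D.
    Pipelined time:     D (fill) + (n - 1) * D / k.
    D cancels, leaving n * k / (k + n - 1).
    """
    return n * k / (k + n - 1)


if __name__ == "__main__":
    # The speedup starts at 1 for a single item and approaches k as n grows.
    for n in (1, 10, 100, 10_000):
        print(f"n = {n:>6}: speedup with k = 5 stages = {speedup(n, 5):.3f}")
```

For a single item there is no gain at all, since that item still has to pass through every stage; the factor of k is only approached once the fill time is small compared with the total run.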

Instruction Pipeline

1. Typically broken up into 3 to 5 stages:

(a) Five stages: F, D, R, Ex, W (Fetch, Decode, Read, Execute, Write)

(b) Three stages: F, D, E, where E = R+Ex+W

2. Each stage now requires its own C.U., as control signals need to be generated simultaneously for the processing in each stage. E.g., while the read and other control signals are asserted in the Fetch stage for instruction $i$, the Decode stage is figuring out which next stage to forward instruction $i-1$ to, the Read stage is asserting the control signals for reading the operands of instruction $i-2$, the Execute stage is asserting the control signals for the right ALU operation for instruction $i-3$, and the Write stage is asserting the register write signals for instruction $i-4$.

3. Input registers are required for each stage to store intermediate values and instructions that are passed between the stages. E.g., the input register of the Decode stage will store instruction $i-1$, the input register of the Read stage will store instruction $i-2$, the input register of the Execute stage will store the opcode and write register fields of instruction $i-3$ as well as the two input operands generated by the Read stage in its previous processing of instruction $i-3$, while the Write stage's input register will store the ALU output and the write register field of instruction $i-4$.

4. Further, the C.U. of stage $j$ can only store a new data/instruction in the input register of stage $j+1$ after it receives a signal ($done_{j+1}$) from stage $j+1$ informing it that stage $j+1$ is done with its current processing (much like wait is used to communicate between the memory unit and the processor).

5. Also, the C.U. of stage $j$ can only start processing a new data/instruction in its input register when it receives a signal $new_{j-1}$ from stage $j-1$ informing it that a new data/instruction has been written into its input register (which will be done, of course, only after stage $j$ asserts signal $done_j$ to stage $j-1$).

6. Some extra delay has been added to the total delay/latency for processing a single instruction due to the writing and reading of the stage registers that is needed in the pipelined version. We will, however, ignore this extra delay in our discussions.
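
The instruction-to-stage assignment described in items 2 and 3 can be tabulated with a few lines of code. This is a minimal sketch under stated assumptions: the names `STAGES` and `occupancy` are hypothetical, instructions are numbered from 0, all stages take the same time, and the pipeline never stalls.

```python
STAGES = ["Fetch", "Decode", "Read", "Execute", "Write"]


def occupancy(i: int) -> dict:
    """Instruction held by each stage while Fetch works on instruction i.

    Stage number s (0 for Fetch) holds instruction i - s. None means the
    stage is still idle because the pipeline has not filled yet.
    """
    return {name: (i - s if i - s >= 0 else None)
            for s, name in enumerate(STAGES)}


if __name__ == "__main__":
    # Print the stage contents for the first few instructions fetched.
    for i in range(6):
        print(i, occupancy(i))
```

The `None` entries in the first few rows correspond to stages that have not yet received their first instruction.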

Pipeline Throughput: Non-branching model

1. Let $t_i$ be the delay of stage $i$. Then the non-pipelined time $T_{\text{no-pipe}}(n)$ for processing $n$ instructions is

$$T_{\text{no-pipe}}(n) = n \sum_{i=1}^{k} t_i$$

2. The pipelined time $T_{\text{pipe}}(n)$ for processing $n$ instructions is

$$T_{\text{pipe}}(n) = \text{Fill Time} + (n-1) \max_{1 \le i \le k} t_i$$

where the Fill Time is the time taken to fill the pipeline, i.e., the time taken for the first instruction to emerge from the pipeline, by which time all pipeline stages are filled or busy processing some instruction; before the Fill Time, some stages may be idle. The Fill Time is given by

$$\text{Fill Time} = \sum_{i=1}^{k} t_i$$

[Figure: timing of instructions 1 to 4 through the stages F (3 ccs), D (1 cc) and E (2 ccs); the time axis runs from 0 to 15 ccs.]

(a) Timing view with "moving" pipeline

(b) A stationary-pipeline timing view (reversed Gantt chart)

3. Pipelined throughput is given by $n / T_{\text{pipe}}(n)$ for a large $n$ and is in units of instructions/sec. Non-pipelined throughput is given by $n / T_{\text{no-pipe}}(n) = 1 / \sum_{i=1}^{k} t_i$ and is also in units of instructions/sec.
  

4. Example 1: If all the $t_i$ are equal and that value is $t$, then the non-pipelined throughput is $1/(kt)$ instr/sec, while the pipelined throughput is $1/t$ instr/sec, which is $k$ times that of the non-pipelined version.


5. Example 2: Consider a pipeline with 3 stages F, D, E with delays of 3 ccs, 1 cc and 2 ccs, respectively.

$$T_{\text{no-pipe}}(n) = n \cdot 6 \text{ ccs}$$

$$T_{\text{pipe}}(n) = 6 + (n-1) \cdot 3 \text{ ccs}$$

Pipelined throughput = $n / (6 + (n-1) \cdot 3) \approx 1/3$ instr/cc for a large $n$. Non-pipelined throughput is 1/6 instr/cc.
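
The timing formulas of this section can be checked numerically. The following is a sketch rather than part of the original notes: the function names are hypothetical, delays are given in clock cycles, and `simulate` assumes that a stage starts on an instruction as soon as the previous stage has finished it and the stage itself is free (stage-register overhead is ignored, as in the notes).

```python
from fractions import Fraction


def t_no_pipe(n, delays):
    """Non-pipelined time: every instruction takes the sum of all stage delays."""
    return n * sum(delays)


def fill_time(delays):
    """Time for the first instruction to emerge from the pipeline."""
    return sum(delays)


def t_pipe(n, delays):
    """Pipelined time: fill time plus (n - 1) times the slowest stage delay."""
    return fill_time(delays) + (n - 1) * max(delays)


def throughput(n, total_time):
    """Instructions per time unit, kept exact as a fraction."""
    return Fraction(n, total_time)


def simulate(n, delays):
    """Cross-check of t_pipe by stepping each instruction through the stages."""
    done = [0] * len(delays)  # done[j]: time stage j finished its latest instruction
    for _ in range(n):
        ready = 0  # time at which this instruction leaves the previous stage
        for j, t in enumerate(delays):
            ready = max(ready, done[j]) + t
            done[j] = ready
    return done[-1]


if __name__ == "__main__":
    delays = [3, 1, 2]  # F, D, E from Example 2
    n = 4
    print("T_no_pipe:", t_no_pipe(n, delays), "ccs")
    print("T_pipe:   ", t_pipe(n, delays), "ccs")
    print("simulated:", simulate(n, delays), "ccs")
    print("throughput:", throughput(n, t_pipe(n, delays)), "instr/cc")
```

For the delays of Example 2 and four instructions, the formula and the simulation agree on 15 ccs, against 24 ccs without pipelining.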

Pipelined Myth8 Processor


[Figure: The Myth8 Pipelined Processor. The CPU is organized into a Fetch Stage, a Decode & Read Stage, a Decode & ALU Stage and a Decode & Write Stage, each with its own control unit (Fetch CU, Read CU, ALU CU, Write CU). Labeled blocks include the Memory System with its Memory Data Bus and Memory Address Bus, MAR, MDR, IR0, IR1, the Register File with Read Bus A and Read Bus B, the Opcode register, the register-file output registers, the ALU, the ALU o/p register and the Write Bus.]
