3.3 RISC Processors
• Over the past 30 years the explosion in device technology (SSI, LSI,
VLSI, ...) has enabled processors to become increasingly complex.
• Functions that once had to be coded - perhaps as a subroutine - may now
be supported directly in hardware and accessed through a single machine
instruction.
• For example, the VAX-11 series machines had:-
– 304 instructions
– 18 addressing modes
– 20 data types
i     IF ID OF OE OS
i+1                  IF ID OF OE OS
i+2                                 IF ID OF OE OS
Sequential Execution
Advanced Computer Architecture 7
3.3.2 Pipelined Execution
i     IF ID OF OE OS
i+1      IF ID OF OE OS
i+2         IF ID OF OE OS
Pipelined Execution
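The timing difference between the two diagrams can be checked with a little arithmetic. The sketch below assumes an idealised five-stage pipeline in which every stage takes exactly one clock cycle and no hazards occur; the function names are illustrative and do not come from the course material.

```python
STAGES = ["IF", "ID", "OF", "OE", "OS"]  # stage names as used in the diagrams


def sequential_cycles(n_instructions, n_stages=len(STAGES)):
    """Each instruction passes through every stage before the next one starts."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions, n_stages=len(STAGES)):
    """The first instruction fills the pipeline; every later instruction
    completes one cycle after its predecessor (assumes no hazards)."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)


def pipeline_chart(n_instructions):
    """Return one text row per instruction, each staggered by one stage."""
    return ["   " * i + " ".join(STAGES) for i in range(n_instructions)]


if __name__ == "__main__":
    n = 3
    print("sequential:", sequential_cycles(n), "cycles")
    print("pipelined: ", pipelined_cycles(n), "cycles")
    for row in pipeline_chart(n):
        print(row)
```

For the three instructions shown in the diagrams this gives 15 cycles sequentially and 7 cycles pipelined; as the instruction count grows, the pipelined rate approaches one instruction per cycle.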
• A conventional pipeline can break down for a number of reasons:-
– A branch instruction is encountered; if it is taken, the pipeline must be
cleared and new instructions fetched.
– Data used by an instruction depends on the result of the previous
instruction; the instruction can't proceed until the data is valid.
– Instructions in the pipeline require access to the same resource (the
memory bus, a register, or the ALU) simultaneously.
• In such situations a 'pipeline hazard' or 'bubble' is introduced into the
pipeline, reducing the average instruction execution rate.
Pipelined with Data Interlock
Data dependency: INC needs the value of A produced by ADD
ADD   IF     ID     B,C    +      A
INC          IF     ID     Bubble A      +1     A
                    Bubble IF     ID     OF     OE     OS
Bubbles in a Pipeline
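The effect of a data interlock on timing can be sketched in a few lines. This is a simplified model, not a description of any real processor: it assumes the five stages IF ID OF OE OS each take one cycle, that an operand may be fetched in the same cycle in which it is stored, and that a stalled instruction delays the fetch of the instruction behind it. The third instruction (`NEXT`) and the registers `D` and `E` are hypothetical.

```python
# Stage order taken from the diagrams: IF ID OF OE OS
OF_OFFSET = 2  # operand fetch is the third stage
OS_OFFSET = 4  # operand store is the fifth stage


def schedule(program):
    """program: list of (name, destination, sources).
    Returns a list of (name, start_cycle, bubbles, store_cycle)."""
    written_in = {}  # register -> cycle in which its new value is stored
    next_start = 1
    timeline = []
    for name, dest, sources in program:
        start = next_start
        planned_fetch = start + OF_OFFSET
        # Assumption: an operand may be fetched in the cycle it is stored.
        earliest_fetch = max(
            [written_in.get(reg, 0) for reg in sources] + [planned_fetch]
        )
        bubbles = earliest_fetch - planned_fetch
        store = start + OS_OFFSET + bubbles
        written_in[dest] = store
        timeline.append((name, start, bubbles, store))
        # Assumption: a stalled instruction delays its successor's fetch.
        next_start = start + 1 + bubbles
    return timeline


PROGRAM = [
    ("ADD", "A", ["B", "C"]),  # A <- B + C
    ("INC", "A", ["A"]),       # A <- A + 1, needs the result of ADD
    ("NEXT", "D", ["E"]),      # hypothetical independent instruction
]

if __name__ == "__main__":
    timeline = schedule(PROGRAM)
    for name, start, bubbles, store in timeline:
        print(f"{name:5} starts in cycle {start}, "
              f"bubbles: {bubbles}, stores in cycle {store}")
    total = timeline[-1][3]
    print("total cycles:", total)
    print("average cycles per instruction:", total / len(PROGRAM))
```

Under these assumptions the single bubble stretches the three-instruction sequence from 7 cycles to 8, which is the reduction in average execution rate described above.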
• Conventional designs solve these pipeline problems with special hardware;
RISC designs attempt to avoid introducing the bubble in the first
place.
• RISC systems use an optimising compiler to rearrange programs and
schedule instructions so that they don't interfere with one another.
• We consider how an optimising compiler might reduce the size of the
bubble caused by a branch instruction.
Delayed Branch Instruction
• Introduce a delayed branch instruction to the architecture:-
– the instruction following the branch is always executed
– control is then transferred to the branch destination
– this gives the CPU time to fetch the proper instruction from the
destination and start it through the pipeline.
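One way a compiler might exploit a delayed branch is to move the instruction that precedes the jump into the slot that follows it, so that the slot does useful work. The sketch below is a deliberately naive illustration using the `LOAD`/`ADD`/`JUMP`/`STOR` mnemonics of the example program in this section. It assumes every jump is unconditional, that no other instruction jumps to the address of a moved jump, and that the instruction before a jump is safe to move; a real compiler would have to check these conditions. The function names are hypothetical.

```python
def is_jump(instruction):
    """True for an unconditional jump such as 'JUMP 105'."""
    return instruction.split()[0] == "JUMP"


def fill_delay_slot(listing):
    """Return a copy of the listing in which each JUMP has been swapped
    with the instruction before it, so that instruction lands in the
    delay slot. Addresses are unchanged because the length is unchanged."""
    out = list(listing)
    i = 1
    while i < len(out):
        if is_jump(out[i]) and not is_jump(out[i - 1]):
            out[i - 1], out[i] = out[i], out[i - 1]
            i += 2  # skip the instruction now sitting in the delay slot
        else:
            i += 1
    return out


if __name__ == "__main__":
    normal = ["LOAD X,A", "ADD 1,A", "JUMP 105",
              "ADD A,B", "SUB C,B", "STOR A,Z"]
    for address, instruction in enumerate(fill_delay_slot(normal), start=100):
        print(address, instruction)
```

Because `ADD 1,A` does not affect where the jump goes, executing it after the jump instead of before it leaves the result of the program unchanged.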
Address  Normal Branch  Delayed Branch  Optimised Delayed Branch
100      LOAD X,A       LOAD X,A        LOAD X,A
101      ADD 1,A        ADD 1,A         JUMP 105
102      JUMP 105       JUMP 106        ADD 1,A
103      ADD A,B        NO-OP           ADD A,B
104      SUB C,B        ADD A,B         SUB C,B
105      STOR A,Z       SUB C,B         STOR A,Z
106                     STOR A,Z
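The three columns of the table can be checked against one another with a toy interpreter. The sketch below assumes a "source, destination" operand order (so `ADD 1,A` means A ← A + 1 and `STOR A,Z` means Z ← A), keeps registers and memory locations in a single dictionary, and halts when the program counter leaves the listing. The initial values are arbitrary, and a jump inside a delay slot is not handled.

```python
def parse(listing, base=100):
    """Turn lines like 'ADD 1,A' into {address: (opcode, operands)}."""
    program = {}
    for offset, line in enumerate(listing):
        opcode, _, rest = line.partition(" ")
        operands = [part.strip() for part in rest.split(",")] if rest else []
        program[base + offset] = (opcode, operands)
    return program


def value(state, operand):
    """An operand is either an integer literal or a named location."""
    return int(operand) if operand.isdigit() else state[operand]


def execute(state, opcode, operands):
    """Execute one non-branch instruction (assumed 'source, destination')."""
    if opcode == "NO-OP":
        return
    src, dst = operands
    if opcode in ("LOAD", "STOR"):
        state[dst] = value(state, src)
    elif opcode == "ADD":
        state[dst] = state[dst] + value(state, src)
    elif opcode == "SUB":
        state[dst] = state[dst] - value(state, src)
    else:
        raise ValueError(f"unknown opcode {opcode}")


def run(listing, initial, delayed):
    """Run a listing; with delayed=True the instruction after a JUMP
    is executed before control transfers. Returns (state, trace)."""
    program = parse(listing)
    state = dict(initial)
    pc, trace = 100, []
    while pc in program:
        opcode, operands = program[pc]
        trace.append(pc)
        if opcode == "JUMP":
            if delayed and pc + 1 in program:
                trace.append(pc + 1)
                execute(state, *program[pc + 1])  # the delay slot
            pc = int(operands[0])
        else:
            execute(state, opcode, operands)
            pc += 1
    return state, trace


NORMAL = ["LOAD X,A", "ADD 1,A", "JUMP 105", "ADD A,B", "SUB C,B", "STOR A,Z"]
DELAYED = ["LOAD X,A", "ADD 1,A", "JUMP 106", "NO-OP",
           "ADD A,B", "SUB C,B", "STOR A,Z"]
OPTIMISED = ["LOAD X,A", "JUMP 105", "ADD 1,A", "ADD A,B", "SUB C,B", "STOR A,Z"]

if __name__ == "__main__":
    start = {"X": 7, "A": 0, "B": 10, "C": 3, "Z": 0}  # arbitrary values
    for name, listing, delayed in [("normal", NORMAL, False),
                                   ("delayed", DELAYED, True),
                                   ("optimised", OPTIMISED, True)]:
        state, trace = run(listing, start, delayed)
        print(f"{name:10} Z={state['Z']} executed {trace}")
```

All three versions leave the same value in Z. The delayed version executes one extra instruction (the NO-OP), while the optimised version executes the same number of instructions as the normal branch but keeps the slot after the jump doing useful work.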
• The separate instruction and data caches may be internal or external to the
processor; on-chip caches are normally used for speed.