Control Design
* Two approaches for control unit design
5.5
A microprogrammed control unit
: built by organizing control signals into microinstructions. The signals are
implemented by a kind of software (firmware) rather than hardware.
Advantage:
design change : only the contents of the control memory need to change.
emulation : a microprogrammed CPU can execute programs written in
the machine language of other computers.
Disadvantage:
slower, because every microinstruction must be fetched from the control memory.
more costly, due to the presence of the control memory and its
access circuits.
5.1.2. Hardwired Control
design method 1 : The classical method of sequential circuit design. For a P-state
circuit, ⌈log2 P⌉ flip-flops are required.
design method 2 : One-hot method, one flip-flop per state. Expensive in terms of
flip-flops, but it simplifies CU design and debugging.
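The flip-flop counts of the two design methods can be compared with a small Python sketch. The function names are illustrative only and do not come from the text.

```python
def classical_flip_flops(p: int) -> int:
    """Flip-flops for a binary-encoded P-state circuit: ceil(log2 P)."""
    # (p - 1).bit_length() equals ceil(log2 p) for p >= 1, with no floating point.
    return (p - 1).bit_length()


def one_hot_flip_flops(p: int) -> int:
    """Flip-flops for a one-hot P-state circuit: one per state."""
    return p


for p in (4, 10, 64):
    print(p, classical_flip_flops(p), one_hot_flip_flops(p))
```

For the 4-state GCD controller this gives 2 flip-flops with the classical method and 4 with the one-hot method.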
GCD processor
Classical method
S0 = 00, S1 = 01, S2 = 10 and S3 = 11
One-hot state assignment (state bits D3 D2 D1 D0, one flip-flop Di per state Si):
S0 = 0001
S1 = 0010      (5.9)
S2 = 0100
S3 = 1000
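Next-state equation for D1 (transitions into S1 from S0 and S2):
$D_1^+ = D_0\,(XR > 0)\,\overline{(XR \ge YR)} + D_2\,(XR > 0)\,\overline{(XR \ge YR)}$
$D_1^+ = (D_0 + D_2)\,(XR > 0)\,\overline{(XR \ge YR)}$
Output equation for a control signal $z_k$ that is active in $m_k$ states:
$z_k = D_{k,1} + D_{k,2} + \cdots + D_{k,m_k}$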
Classical assignment (state bits D0 D1): $LoadYR = \overline{D_0}\,\overline{D_1} + \overline{D_0}\,D_1$
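$D_0^+ = 0$
$D_1^+ = D_0\,(XR > 0)\,\overline{(XR \ge YR)} + D_2\,(XR > 0)\,\overline{(XR \ge YR)}$
$D_2^+ = D_0\,(XR > 0)\,(XR \ge YR) + D_1 + D_2\,(XR > 0)\,(XR \ge YR)$
$D_3^+ = D_0\,\overline{(XR > 0)} + D_2\,\overline{(XR > 0)} + D_3$
$Subtract = D_2$
$Swap = D_1$
$SelectXY = D_0$
$LoadXR = D_0 + D_1 + D_2$
$LoadYR = D_0 + D_1$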
One-hot method
$D_i^+ = \sum_{j=1}^{P} D_j\,(I_{j,1} + I_{j,2} + \cdots + I_{j,n_j})$
where $I_{j,1}, I_{j,2}, \ldots, I_{j,n_j}$ denote all input combinations that cause a transition from $S_j$ to $S_i$.
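The one-hot GCD controller can be exercised with a short Python simulation. This is a sketch under stated assumptions, not the text's implementation: the status inputs are taken to be (XR > 0) and (XR ≥ YR), they are evaluated on the register values after the current state's loads take effect, and the inputs are assumed to be non-negative with Y > 0 whenever X > 0 (otherwise the subtract loop never terminates). The function name `gcd_one_hot` is hypothetical.

```python
def gcd_one_hot(x: int, y: int) -> int:
    """Simulate the one-hot GCD control unit; returns the final YR."""
    d = [1, 0, 0, 0]  # D0..D3; S0 is the hot state at reset
    xr = yr = 0
    while not d[3]:
        assert sum(d) == 1  # one-hot invariant: exactly one state is active

        # Output equations
        subtract, swap, select_xy = d[2], d[1], d[0]
        load_xr = d[0] | d[1] | d[2]
        load_yr = d[0] | d[1]

        # Datapath (assumed behaviour of each control signal)
        if select_xy:
            new_xr, new_yr = x, y            # load the external operands
        elif swap:
            new_xr, new_yr = yr, xr          # exchange XR and YR
        elif subtract:
            new_xr, new_yr = xr - yr, yr     # XR := XR - YR
        if load_xr:
            xr = new_xr
        if load_yr:
            yr = new_yr

        # Status inputs (assumption: sampled after the loads above)
        gt0 = int(xr > 0)
        ge = int(xr >= yr)

        # Next-state equations
        d = [
            0,
            (d[0] | d[2]) & gt0 & (1 - ge),
            (d[0] & gt0 & ge) | d[1] | (d[2] & gt0 & ge),
            (d[0] & (1 - gt0)) | (d[2] & (1 - gt0)) | d[3],
        ]
    return yr


print(gcd_one_hot(12, 18))
```

Tracing `gcd_one_hot(12, 18)` visits S0, S1, S2, S1, S2, S2, S3 and leaves 6 in YR.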
Instruction
: implemented by a sequence of one or more sets of concurrent micro-operations.
Microprogramming
: control-signal selection and sequencing information is stored in a ROM or RAM
called a control memory (CM), and each microinstruction is fetched from the CM.
K0 K1 K2    Micro-operation
0  0  1     R ← X0
0  1  0     R ← X1
0  1  1     R ← X2
1  0  0     R ← X3
0  0  0     No op
(5 operations encoded in one 3-bit control field)
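The encoded control field in the table can be sketched in Python as follows. The helper names `field_width` and `decode` are hypothetical, and treating the three unlisted codes as "unused" is an assumption, since the table does not say what they do.

```python
def field_width(num_signals: int) -> int:
    """Bits needed for an encoded field: one code per signal plus a no-op code."""
    # n.bit_length() equals ceil(log2(n + 1)) for n >= 1.
    return num_signals.bit_length()


# Code assignments copied from the table (K0 is the most significant bit).
MICRO_OPS = {
    0b000: "No op",
    0b001: "R <- X0",
    0b010: "R <- X1",
    0b011: "R <- X2",
    0b100: "R <- X3",
}


def decode(k0: int, k1: int, k2: int) -> str:
    """Map the field bits K0 K1 K2 to the micro-operation they select."""
    code = (k0 << 2) | (k1 << 1) | k2
    return MICRO_OPS.get(code, "unused")  # assumption: remaining codes are unused


print(field_width(4), decode(0, 1, 1))
```

Four mutually exclusive register transfers plus the no-op fit in 3 bits, whereas an unencoded (one bit per signal) field would need 4.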
μI format : horizontal vs. vertical
horizontal form : long format
able to express a high degree of parallelism
little encoding of the control information.
Algorithm
1. Find the set of maximal compatibility classes (MCCs). An encoded control field can
activate only one control signal at a time, so two control signals can be included in
the same control field iff they are never simultaneously activated by a μI (i.e., they
are compatible).
Two control signals Ci1 and Ci2 are compatible if Ci1 ∈ Ij implies Ci2 ∉ Ij, and vice
versa, for every microinstruction Ij. A compatibility class is a set of control signals
that are pairwise compatible. An MCC is a compatibility class to which no control
signal can be added without introducing a pair of incompatible control signals.
2. Determine all minimal MCC covers. A minimal MCC cover is a minimal set of
MCCs that includes each control signal. (Note that a minimal MCC cover does not
always yield a minimum value of the cost function W.)
3. For each minimal MCC cover, include each control signal in exactly one subset of
some {Ci}, compute the cost W of the resulting solutions, and select the one with the
minimal cost.
Deriving MCC
: Denote Si as the set of compatibility classes {Ci} such that each {Ci}
contains i control signals Cij.
S1 = {the n original control signals, each as a one-member class}
Si contains all possible i-member compatibility classes.
Using Si, construct Si+1 as follows:
For each {Ci} ∈ Si, add a control signal Cik to {Ci} to form {C}.
If {C} is a compatibility class, then add {C} to Si+1 and delete {Ci} and
all other subsets of {C} from Si.
Stop when Sk = ∅ for some k ≤ n+1.
The classes remaining in S1, ..., Sk-1 are the MCCs.
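The level-wise derivation of the MCCs can be sketched in Python. This is a minimal illustration under assumptions: each microinstruction is represented as a set of the control signals it activates, the example signals `a` to `d` and the three microinstructions are made up for the demonstration, and the function names are hypothetical.

```python
def compatible(a, b, microinstructions):
    """Two signals are compatible if no microinstruction activates both."""
    return not any(a in mi and b in mi for mi in microinstructions)


def maximal_compatibility_classes(signals, microinstructions):
    """Grow i-member classes into (i+1)-member classes; keep those that cannot grow."""
    level = {frozenset([s]) for s in signals}  # S1: one class per signal
    maximal = set()
    while level:                               # stop when Sk is empty
        next_level = set()
        absorbed = set()                       # classes contained in a larger class
        for cls in level:
            for s in signals:
                if s in cls:
                    continue
                if all(compatible(s, t, microinstructions) for t in cls):
                    next_level.add(cls | {s})
                    absorbed.add(cls)
        maximal |= level - absorbed            # classes left in Si are maximal
        level = next_level
    return maximal


# Hypothetical example: I1 activates a and b, I2 activates a and c, I3 activates d.
mis = [{"a", "b"}, {"a", "c"}, {"d"}]
mccs = maximal_compatibility_classes("abcd", mis)
print(sorted("".join(sorted(c)) for c in mccs))
```

In this made-up example the only incompatible pairs are (a, b) and (a, c), so the MCCs are {a, d} and {b, c, d}. Selecting a minimal cover and evaluating the cost W are not shown here.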
Formats
Control fields
0 0    Action μ-instruction
       Branch μ-instruction:
0 1    branch if Q(7) = 0
1 0    branch if COUNT6 = 1
1 1    jump (unconditional)
μ-program sequencer
: places all the circuitry required to generate μI addresses in a single IC,
made possible by advances in VLSI.
a general-purpose building block for μ-programmed CUs.
simplifies CPU design.
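The address-generation job of a sequencer, using the two-bit condition-select codes from the format table, might look like the following Python sketch. It assumes that an action μ-instruction (and an untaken branch) simply advances to the next control-memory address and that a branch μ-instruction carries an explicit target; the names `next_address`, `upc`, and `branch_addr` are hypothetical.

```python
def next_address(upc: int, cond_select: int, branch_addr: int,
                 q7: int, count6: int) -> int:
    """Return the next control-memory address for the current microinstruction."""
    if cond_select == 0b00:      # action microinstruction: no branch
        take_branch = False
    elif cond_select == 0b01:    # branch if Q(7) = 0
        take_branch = (q7 == 0)
    elif cond_select == 0b10:    # branch if COUNT6 = 1
        take_branch = (count6 == 1)
    else:                        # 0b11: unconditional jump
        take_branch = True
    # Assumption: sequential execution increments the microprogram counter.
    return branch_addr if take_branch else upc + 1


print(next_address(upc=5, cond_select=0b01, branch_addr=12, q7=0, count6=0))
```

Packaging this selection logic, together with the microprogram counter, in one component is what lets the sequencer serve as a reusable building block.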
Nanoprogrammed Computer
[Figure: μ-programmed computer. Labels: Instruction, μPC, CM, μIR, control signals]
[Figure: nanoprogrammed computer. Labels: Instruction, μPC, nPC]
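CM : Hm words of Wm bits, size Hm·Wm
nCM : Hn words of Wn bits, size Hn·Wn
Total size : $S_2 = H_m W_m + H_n W_n$
Size of comparable single-level CM (Hm words, each log2 Hm + N bits wide) :
$S_1 = H_m W_m = H_m(\log_2 H_m + N)$
In the two-level design each CM word holds a log2 Hm-bit field and a log2 Hn-bit field,
and each nCM word is N bits wide :
$S_2 = H_m(\log_2 H_m + \log_2 H_n) + H_n N$
Let r = Hn/Hm = ratio of unique nano-control states to the total number of μ-control
states for all instructions, so Hn = rHm.
$S_2 = H_m(\log_2 H_m + \log_2 rH_m) + rH_m N$
$S_2 = H_m(2\log_2 H_m + \log_2 r + rN)$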
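Example) For the 68000 processor (N = 70, Hm = 650, r = 0.4), which approach is better?
Each logarithm is rounded up to a whole number of bits.
1-level CM design : 650 words, each log2 650 + 70 bits wide
$S_1 = 650\,(\lceil\log_2 650\rceil + 70) = 650 \times 80 = 52{,}000$ bits
Nanoprogramming : Hn = 0.4 × 650 = 260; the CM has 650 words of log2 650 + log2 260 bits,
and the nCM has 260 words of 70 bits
$S_2 = 650\,(\lceil\log_2 650\rceil + \lceil\log_2 260\rceil) + 260 \times 70 = 12{,}350 + 18{,}200 = 30{,}550$ bits
Nanoprogramming therefore needs less control memory.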
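The two sizes in this example can be recomputed with a few lines of Python. This is only a sketch of the arithmetic above: the function names are hypothetical, and rounding each logarithm up to whole bits is the assumption that reproduces the quoted totals.

```python
import math


def single_level_bits(h_m: int, n: int) -> int:
    """S1 = Hm * (ceil(log2 Hm) + N)."""
    return h_m * (math.ceil(math.log2(h_m)) + n)


def two_level_bits(h_m: int, n: int, r: float) -> int:
    """S2 = Hm * (ceil(log2 Hm) + ceil(log2 Hn)) + Hn * N, with Hn = r * Hm."""
    h_n = round(r * h_m)  # number of unique nano-control states
    w_m = math.ceil(math.log2(h_m)) + math.ceil(math.log2(h_n))
    return h_m * w_m + h_n * n


s1 = single_level_bits(650, 70)
s2 = two_level_bits(650, 70, 0.4)
print(s1, s2)
```

With N = 70, Hm = 650, and r = 0.4 this yields 52,000 bits for the single-level design and 30,550 bits for the nanoprogrammed design.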
Speedup : $S(m) = \dfrac{T(1)}{T(m)}$
$S(m) = m \times E(m)$, where E(m) is the efficiency of the m-stage pipeline
Performance/cost ratio : $PCR = \dfrac{f}{K}$
where f : pipeline's clock frequency
K : hardware cost
Suppose the pipeline P has m stages for SI.
a : the delay of a non-pipelined processor for SI
each stage of P : delay a/m, plus an extra delay b due to the buffer register
$T_c = \dfrac{1}{f} = \dfrac{a}{m} + b$
hardware cost $K = cm + d$
c : buffer-register cost per stage
d : cost of the pipeline's data-processing logic
$PCR = \dfrac{f}{K} = \dfrac{1}{T_c K} = \dfrac{m}{bcm^2 + (ac + bd)m + ad}$
To maximize PCR with respect to m,
$\dfrac{d}{dm}(PCR) = \dfrac{1}{bcm^2 + (ac + bd)m + ad} - \dfrac{m(2bcm + ac + bd)}{(bcm^2 + (ac + bd)m + ad)^2}$
$\dfrac{d}{dm}(PCR) = 0 \;\Rightarrow\; bcm^2 + (ac + bd)m + ad = m(2bcm + ac + bd)$
which reduces to $bcm^2 = ad$, so
$m_{opt} = \sqrt{\dfrac{ad}{bc}}$
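The result can be checked numerically with a small Python sketch. The function names and the parameter values a = 16, b = 1, c = 1, d = 4 are made up for illustration; they are not from the text.

```python
import math


def pcr(m: float, a: float, b: float, c: float, d: float) -> float:
    """Performance/cost ratio f / K for an m-stage pipeline."""
    t_c = a / m + b    # clock period: stage delay plus buffer-register delay
    k = c * m + d      # hardware cost: buffer registers plus processing logic
    return 1.0 / (t_c * k)


def optimal_stages(a: float, b: float, c: float, d: float) -> float:
    """Stage count that maximizes PCR: sqrt(ad / bc)."""
    return math.sqrt((a * d) / (b * c))


a, b, c, d = 16.0, 1.0, 1.0, 4.0  # illustrative values only
m_opt = optimal_stages(a, b, c, d)
print(m_opt, pcr(m_opt, a, b, c, d))
for m in (7, 8, 9):
    print(m, pcr(m, a, b, c, d))
```

For these values m_opt is 8, and the PCR at m = 8 exceeds the PCR at both m = 7 and m = 9, as the derivation predicts.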
5.3.3 Superscalar Processing
Superscalar operation executes more than one instruction per cycle by
fetching, decoding, and executing several instructions concurrently.
A superscalar computer has a single CPU that attempts to exploit the
parallelism that is implicit in computer programs, with multiple execution
units.
In Fig. 5.66, the superscalar design has a potential speedup of 10.
With k independent m-stage pipelined E-units, the potential speedup factor of a
superscalar CPU is k × m.
heavy demand on the instruction-fetch logic
a large, fast instruction and data cache