
Implementation of a Verilog Multicycle CPU

Joey Nirschl, Benjamin Holland

Iowa State University


Department of Computer Engineering
Ames, Iowa 50011
(515) 294-4111

jnirsch@iastate.edu, bholland@iastate.edu

Keywords: Verilog, Simulation, Multicycle Processor, CPU, Datapath, Instruction set

Abstract: This semester term project was intended to solidify our knowledge of CPU
datapath design. The project requirements called for at least 15 different instructions,
including branch and jump instructions, with each module separately testable. The complete
design had to be implemented and able to run a small, easily changed sample program that
demonstrates its functionality.
Group Contributions:
Joey Nirschl: High level design, testing, implementation.
Benjamin Holland: Individual component design, testing,
implementation.

Time Contribution:
Joey Nirschl’s Hours: 25 (50% of work)
Ben Holland’s Hours: 25 (50% of work)

Project Work Breakdown

Design (10%)
Programming (20%)
Testing (50%)
Documentation (20%)
Table of Contents

Purpose of Machine
Instruction Set Definition
Instruction Format
Design Methodology
Design
Testing Methodology
Conclusion
Lessons Learned
Appendix A – Verilog Code & Testbench
Appendix B – Simulation Results
Appendix C – Commonly Made Verilog Mistakes
Appendix D – Figures and Diagrams
Appendix E – Sources
1. Purpose of Machine:

The multicycle CPU design improves on the single cycle design by executing each instruction
in multiple stages, one stage per clock cycle. Each instruction completes in three to five
stages, so an instruction occupies only as many clock cycles as it needs.

These stages include:

Stage 1: Instruction fetch
Stage 2: Instruction decode/register fetch
Stage 3: Memory address computation, execution, branch or jump completion
Stage 4: Memory access load, memory access store, R-type instruction completion
Stage 5: Memory read completion

In our implementation every instruction shares the first two stages: instruction fetch and
instruction decode/register fetch. In the first stage the instruction is fetched from memory
and stored in the instruction register and the memory data register. The second stage decodes
the instruction as an R-type instruction, an I-type instruction, a branch instruction, a jump
instruction, or a memory access (load word or store word).
At stage three the instruction takes one of several logical paths depending on the instruction
type decoded in stage two. A finite state machine of these logical paths is shown in Figure 1
of Appendix D. Stage three is the last stage for branch and jump instructions. After either of
these instructions completes, the cycle restarts with the fetch of the next instruction in
stage one.
Stage four occurs for R-type and I-type instructions and for instructions that require
memory access (load word, store word). Both store word and R/I-type instructions end in
stage four: R/I-type instructions write the ALU result to the register file, and store word
writes a register value to memory. The cycle then begins again at stage one with the
instruction fetch.
Stage five is used only by the load word instruction, which after reading the word from
memory in stage four must still write that word to the register file. Once the register file
write is complete, the cycle continues with the fetch of the next instruction in stage one.
The control module of the datapath is responsible for sequencing the stages of each
instruction. The advantage of breaking instructions into stages is that fast instructions
complete in fewer stages than slow ones. In a single cycle design every instruction executes
in one long clock cycle, so the clock period must be long enough for the slowest instruction
and every other instruction waits out that time. In the multicycle design the clock period is
set by the slowest stage rather than the slowest instruction, and instructions such as branch
and jump finish one to two cycles sooner than load word. As a result, the average time
required to execute instruction code can be reduced, even though each instruction now takes
several clock cycles.
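
The stage counts described in this section can be tabulated per opcode. The following is a minimal sketch and not part of the project code: the module name stage_count_sketch and its ports are hypothetical. The opcodes are taken from the table in Section 3, and the counts follow the state sequences of the control FSM in Appendix A.

```verilog
// Hypothetical helper, not part of the project: number of clock cycles
// (control FSM states) one instruction occupies, keyed by its 6-bit opcode.
module stage_count_sketch(opcode, cycles);
  input  [5:0] opcode;
  output [2:0] cycles;
  reg    [2:0] cycles;

  always @(*) begin
    case (opcode)
      6'h23:   cycles = 3'd5; // lw: fetch, decode, address, memory read, write-back
      6'h2B:   cycles = 3'd4; // sw: fetch, decode, address, memory write
      6'h00:   cycles = 3'd4; // R-type: fetch, decode, execute, write-back
      6'h08, 6'h0A, 6'h0C, 6'h0D, 6'h0F:
               cycles = 3'd4; // addi, slti, andi, ori, xori
      6'h04:   cycles = 3'd3; // beq: fetch, decode, branch completion
      6'h02:   cycles = 3'd3; // j: fetch, decode, jump completion
      default: cycles = 3'd0; // opcode not in this instruction set
    endcase
  end
endmodule
```

Under this model, a sequence of one lw, one add, and one beq would occupy 5 + 4 + 3 = 12 clock cycles.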

2. Instruction Set Definition

This implementation of a multicycle CPU supports R-type instructions, I-type instructions,
and branch and jump instructions. Special logic has been added to the control unit to support
the I-type arithmetic and logical instructions (addi, andi, ori, xori, slti), because these
were not implemented in the design by Patterson and Hennessy. The instruction set is modeled
on the MIPS (Microprocessor without Interlocked Pipeline Stages) instruction set. Aside from
a few minor differences in operation codes, the implemented instruction set follows the MIPS
instruction set convention. The instruction format is discussed in more detail in the next
section.

*Stages were added to the finite state machine to support additional functionality. The
FSM can be viewed in Appendix D. (The additional logic of each figure is indicated in red.)
The instructions included in this set are as listed below:

add – Add, stores the addition of register source (rs) and register target (rt) into register
destination (rd).
R[rd]=R[rs] + R[rt]

sub – Subtract, stores the difference of register source (rs) and register target (rt) into register
destination (rd).
R[rd]=R[rs] - R[rt]

and – And, stores the bitwise and operation of register source (rs) and register target (rt) into
register destination (rd).
R[rd]=R[rs] & R[rt]

or – Or, stores the bitwise or operation of register source (rs) and register target (rt) into
register destination (rd).
R[rd]=R[rs] | R[rt]

xor – Xor, stores the bitwise xor operation of register source (rs) and register target (rt) into
register destination (rd).
R[rd]=R[rs] ^ R[rt]

slt – Set Less Than, stores the value 1 in register destination (rd) if register source (rs)
is less than register target (rt), and the value 0 otherwise.
if(R[rs]<R[rt]){R[rd]=1}
else{R[rd]=0}

beq – Branch Equal, conditionally upon equality of register source (rs) and register target
(rt), branches to the current PC value + 4 + the branch address (the sign extended immediate
value shifted left by 2).
if(R[rs]==R[rt]){PC=PC+4+BranchAddress}

lw – Load Word, loads the 32-bit quantity at memory address register source (rs) + sign
extended immediate into register target (rt).
R[rt]=M[R[rs]+SignExtendedImmediate]

sw – Store Word, stores the 32-bit quantity in register target (rt) to memory address register
source (rs) + sign extended immediate.
M[R[rs]+SignExtendedImmediate]=R[rt]

addi – Add Immediate, stores the addition of register source (rs) and the sign extended
immediate value into register target (rt).
R[rt]=R[rs] + SignExtendedImmediate

andi – And Immediate, stores the bitwise and operation of register source (rs) and the zero
extended immediate value into register target (rt).
R[rt]=R[rs] & ZeroExtendedImmediate

xori – Xor Immediate, stores the bitwise xor operation of register source (rs) and the zero
extended immediate value into register target (rt).
R[rt]=R[rs] ^ ZeroExtendedImmediate

ori – Or Immediate, stores the bitwise or operation of register source (rs) and the zero
extended immediate value into register target (rt).
R[rt]=R[rs] | ZeroExtendedImmediate

slti – Set Less Than Immediate, stores the value 1 in register target (rt) if register source
(rs) is less than the sign extended immediate value, and the value 0 otherwise.
if(R[rs]<SignExtendedImmediate){R[rt]=1}
else{R[rt]=0}

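j – Jump, unconditionally jumps to the instruction at the specified address. The target is
formed by concatenating the upper four bits of the PC with the 26-bit address shifted left by 2.
PC = {PC[31:28], address, 2'b00}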
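
The branch and jump target calculations defined above can be sketched in Verilog as follows. This is an illustrative sketch under stated assumptions, not the project's JumpAddress or sign_extend modules: the module name next_pc_sketch and its ports are hypothetical, and the logic is purely combinational, whereas the project's extenders take a clock input.

```verilog
// Sketch only: combinational next-PC candidates for beq and j.
// Module and port names are hypothetical.
module next_pc_sketch(pc_plus4, immediate, address, branch_target, jump_target);
  input  [31:0] pc_plus4;      // PC + 4, already computed during instruction fetch
  input  [15:0] immediate;     // instruction[15:0]
  input  [25:0] address;       // instruction[25:0]
  output [31:0] branch_target;
  output [31:0] jump_target;

  // Sign extension replicates bit 15 of the immediate into the upper 16 bits.
  wire [31:0] sign_ext = {{16{immediate[15]}}, immediate};

  // beq: the word offset is shifted left by 2 and added to PC + 4.
  assign branch_target = pc_plus4 + {sign_ext[29:0], 2'b00};

  // j: the upper four PC bits are kept and the 26-bit field is shifted left by 2.
  assign jump_target = {pc_plus4[31:28], address, 2'b00};
endmodule
```

Writing the jump target as a concatenation rather than an addition makes it explicit that the 26-bit field replaces bits 27 down to 2 of the PC and never carries into the upper four bits.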
3. Instruction Format

The instruction format is different for each of the three instruction types. An R-type
instruction has six fields: opcode, rs, rt, rd, shamt, and function. The opcode for an R-type
instruction is always zero. The function field defines the operation of the R-type
instruction (e.g., add, sub, and, or). The shamt field is used in shifting operations (not
implemented in this design). RD is the register destination, where the operation result is
stored after execution of the instruction. RS (register source) and RT (register target) are
the fields referencing the register values used as operands of the instruction.
The I-type instruction has four fields: opcode, rs, rt, and immediate. The opcode defines
the operation of the instruction. RS (register source) references a register operand, and RT
(register target) references either the register that receives the result or, for store word
and branch equal, a second register operand. The immediate field can be used either as a
constant value or, after sign extension, as an offset for computing a memory address.
The J-type instruction has an opcode, like the other two instruction types, which defines
the instruction operation. It also has an address field which specifies the memory address
to jump to.
The figures below show the individual fields of each instruction type.
Instruction   Instruction Type   Opcode   Function
add           R                  0x00     0x20
sub           R                  0x00     0x22
and           R                  0x00     0x24
or            R                  0x00     0x25
xor           R                  0x00     0x01
slt           R                  0x00     0x2A
xori          I                  0x0F     N/A
beq           I                  0x04     N/A
lw            I                  0x23     N/A
sw            I                  0x2B     N/A
addi          I                  0x08     N/A
andi          I                  0x0C     N/A
ori           I                  0x0D     N/A
slti          I                  0x0A     N/A
j             J                  0x02     N/A
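
The field layout described in this section can be illustrated by slicing a 32-bit instruction word. This is a sketch only, assuming the standard MIPS bit positions that the report says it follows; the module name field_split_sketch is hypothetical and this is not the project's InstructionDecode module, which additionally takes a clock input.

```verilog
// Sketch only: slice a 32-bit instruction word into MIPS-style fields.
// All fields are produced for every word; the opcode decides which
// of them are meaningful for a given instruction type.
module field_split_sketch(instruction, opcode, rs, rt, rd, shamt, funct,
                          immediate, address);
  input  [31:0] instruction;
  output [5:0]  opcode;
  output [4:0]  rs, rt, rd, shamt;
  output [5:0]  funct;
  output [15:0] immediate;
  output [25:0] address;

  assign opcode    = instruction[31:26]; // all types
  assign rs        = instruction[25:21]; // R-type and I-type
  assign rt        = instruction[20:16]; // R-type and I-type
  assign rd        = instruction[15:11]; // R-type only
  assign shamt     = instruction[10:6];  // R-type only (unused in this design)
  assign funct     = instruction[5:0];   // R-type only
  assign immediate = instruction[15:0];  // I-type only
  assign address   = instruction[25:0];  // J-type only
endmodule
```

For example, an add with rs = 1, rt = 2, and rd = 3 has opcode 0x00 and function 0x20, which under this layout gives the instruction word 0x00221820.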

4. Design Methodology

The general approach to this design was to first map out a high level design of the
system. Our design was based on the ideas presented in the Computer Organization and
Design textbook written by David A. Patterson and John L. Hennessy. Figures 2 and 3 show
a general outline of how our implementation was planned on paper before implementation.
The red markings on the figures in Appendix D are the modifications that were made to the
design.
After the high level design we broke the datapath down into separate modules so that we
could divide the work among team members and test the functionality of each module
individually. Testing each module individually was extremely important because it allowed
us to catch many errors in a controlled environment before they were obscured by the
traffic of the entire system. Modularizing the code also lets team members take
responsibility for and specialize in specific areas of the code, which produces better code
than if the system were not modularized. A modular system also accommodates change more
easily: if new functionality is needed, either a new module is added or the logic within an
existing module is modified to meet the added requirement. Although modularity is
important, the original design must be good enough to allow the functionality to be
modularized in the first place.
Once the system had been designed and implemented in pieces, it was only a matter of
combining the pieces to make the entire CPU. This is easy in theory, but as with every
project there were unforeseen mistakes and logic errors. Because the system was designed
well to begin with, there was room for changes and modifications to correct the mistakes
of the early implementation.
After an intensive debug period, the system was complete. At this point we were able to
fully document the project and consider features to add or remove as well as other design
changes.

5. Design

As mentioned earlier, the original design was roughly based on Figures 2 and 3 of
Appendix D. Also as mentioned above, each of the core datapath functions was implemented
in a separate module. The project code can be seen in Appendix A. Below is a top level view
of the final datapath implementation. (The additional logic of each figure is indicated in
red.)
Top Level Design - Final Implementation
[Figure: RTL schematic of the top level datapath. Labeled blocks: DataMemory:data,
InstructionDecode:IDStage, MulticycleControlFSM:mainControl, regFileRTL:RTL,
ALUMulticycle:ALU, ALUControlMulti:alucontrol, sign_extend:extendImmediate,
zero_extend:extender, twomux32:registerAmux, FiveToOneMux32:registerBmux,
twomux32:writedata, twomux5:writereg, the 32-bit pc register, the memDataReg, ALUOut,
register_A and register_B registers, and the cycle counter.]
*To examine details of design please use the zoom feature of your PDF viewer
Datapath Control - Final Implementation
[Figure: RTL schematic of the control FSM. The opcode[5..0] input is compared against 0x23,
0x2B, 0x08, 0x0A, 0x0D, 0x0C, 0x0F, 0x02, 0x04, and 0x00 to select next_state from
current_state. The registered outputs are RegWrite, PCWrite, MemRead, IorD, forceAdd,
ALUSrcB[2..0], ALUSrcA, RegDst, PCWriteCondition, PCSource[1..0], MemWrite, MemtoReg, and
IRWrite.]
*To examine details of design please use the zoom feature of your PDF viewer
ALU - Final Implementation
[Figure: RTL schematic of the ALU. The inputs aluctrl[3..0], valueA[31..0], and
valueB[31..0] feed adders, a less-than comparator, and a bank of 16-input multiplexers
selected by aluctrl that produce the result bits. An equality comparator drives the zero
output.]
*To examine details of design please use the zoom feature of your PDF viewer
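
The structure of the ALU (a set of operations selected by a 4-bit control value, plus a zero flag) can be sketched as below. This is an illustration under stated assumptions, not the project's ALUMulticycle module: the name alu_sketch and the operation encodings are hypothetical, slt is assumed to be a signed comparison as in MIPS, and zero is assumed to follow the textbook convention of flagging a zero result.

```verilog
// Sketch only: a combinational ALU covering the six operations in the
// instruction set. The 4-bit operation encodings are hypothetical and
// are not taken from the project's ALUMulticycle module.
module alu_sketch(aluctrl, valueA, valueB, result, zero);
  input  [3:0]  aluctrl;
  input  [31:0] valueA, valueB;
  output [31:0] result;
  output        zero;
  reg    [31:0] result;

  localparam OP_AND = 4'd0, OP_OR  = 4'd1, OP_ADD = 4'd2,
             OP_XOR = 4'd3, OP_SUB = 4'd6, OP_SLT = 4'd7;

  always @(*) begin
    case (aluctrl)
      OP_AND:  result = valueA & valueB;
      OP_OR:   result = valueA | valueB;
      OP_ADD:  result = valueA + valueB;
      OP_XOR:  result = valueA ^ valueB;
      OP_SUB:  result = valueA - valueB;
      // Assumed signed comparison, as in the MIPS slt instruction.
      OP_SLT:  result = ($signed(valueA) < $signed(valueB)) ? 32'd1 : 32'd0;
      default: result = 32'd0; // unused encodings produce zero
    endcase
  end

  // Assumed textbook convention: zero is set when the result is zero,
  // so a subtraction of equal operands signals equality for beq.
  assign zero = (result == 32'd0);
endmodule
```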
ALU Control - Final Implementation
[Figure: RTL schematic of the ALU control. The opcode[5..0] input drives a decoder, and the
funct[5..0] input is compared against 0x22, 0x01, 0x20, 0x25, 0x24, and 0x2A. Together with
forceadd these select the latched ALUIn[3..0] output.]
*To examine details of design please use the zoom feature of your PDF viewer
6. Testing Methodology

Our testing methodology stems directly from our design methodology. In the design we
broke the important system functions into separate modules so that we could debug them
individually and assign responsibility. Each module can then be tested on its own,
eliminating possible interference from other modules. Once each module has been
individually tested and is working, the system can be assembled from the smaller modules.
At this point it is a matter of working out any system integration issues and finding any
bugs that were missed in the first stage. Once the system was completely integrated, we
decided that the best way to test it as a whole was to write a program that would exercise
the functionality of the entire system. After writing our test program, we found that our
datapath correctly calculates the nth number of the Fibonacci sequence.
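
The module-level testing described above can be made self-checking, so that a failure is reported by the simulator instead of being found by inspecting waveforms. The following is a minimal sketch, not one of the project's testbenches: the names mux2_sketch and mux2_sketch_tb are hypothetical and the multiplexer is a stand-in for any small module under test.

```verilog
// Sketch only: a 2-to-1 multiplexer and a self-checking testbench.
// Names are hypothetical and do not correspond to the project's twomux32.
module mux2_sketch(sel, in1, in0, out);
  input         sel;
  input  [31:0] in1, in0;
  output [31:0] out;
  assign out = sel ? in1 : in0;
endmodule

module mux2_sketch_tb;
  reg         sel;
  reg  [31:0] in1, in0;
  wire [31:0] out;
  integer     errors;

  mux2_sketch dut(sel, in1, in0, out);

  // Compare the module output against the expected value and count mismatches.
  task check;
    input [31:0] expected;
    begin
      if (out !== expected) begin
        errors = errors + 1;
        $display("FAIL: sel=%b out=%h expected=%h", sel, out, expected);
      end
    end
  endtask

  initial begin
    errors = 0;
    in1 = 32'hAAAA5555;
    in0 = 32'h12345678;
    sel = 1'b0; #1 check(in0); // sel = 0 should pass in0 through
    sel = 1'b1; #1 check(in1); // sel = 1 should pass in1 through
    if (errors == 0) $display("PASS");
    $finish;
  end
endmodule
```

The same pattern (drive inputs, wait, compare with !==, count errors) extends to larger modules such as the ALU or the register file.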

7. Conclusion

Our Computer Engineering 305 project drew on material accumulated from CprE 305 and
previous courses. The knowledge we needed to complete this project included an
understanding of multicycle CPUs, datapaths, control units, finite state machines, digital
logic, and Verilog. With this knowledge, we were able to build individual logic modules and
integrate those modules to create our multicycle processor. The processor supports fifteen
MIPS-style instructions. In the process of building the CPU, we added logic to the design
presented in the textbook by Patterson and Hennessy to fully support our multicycle design.
8. Lessons Learned

• Save often. ModelSim has a bad habit of crashing in the lab. The more often you save, the
less work will be lost when a program or computer crashes.
• Make backups. If all else fails, you have a backup.
• Use comments. When working with others, comments allow them to understand your code.
The fewer comments provided, the harder it may be for someone to understand your code in
the future.
• Create block schematics. Block schematics help in understanding the big picture. If the
block diagram generated from the Verilog code does not look correct, it can clarify the
high level design and help uncover mistakes in the code.
Appendix A – Verilog Code & Testbench

//MultiCycle is our multicycle cpu
module MultiCycle(cycle, pc, clock, alu_out, mem_out, regdst,
memread, memwrite, regwrite, memtoreg, zero, pcwritecond,
pcwrite,iord,irwrite, pcsource,aluscra,alusrcb);
// input/output
input clock;
output[31:0] cycle,alu_out, mem_out, pc;
output regdst, memread, memwrite, regwrite, memtoreg;
output zero;
output pcwritecond, pcwrite,iord,irwrite;
output aluscra;
output [1:0] pcsource;
output [2:0] alusrcb;

// for debug: counts clock cycles since the start of simulation
reg[31:0] cycle=0;

// non-blocking assignment is used for clocked registers
always @ (posedge clock)
begin
cycle <= cycle + 1;
end

// control variables
wire regdst, memread, memwrite, regwrite, memtoreg;
wire pcwritecond, pcwrite,iord,irwrite, aluscra, zero;
wire [1:0] pcsource;
wire [2:0] alusrcb;
wire [31:0] jumpaddress,alu_out, mem_out;
wire [31:0] branchCondition;
wire[3:0] aluCtrl;

// other variables
reg [31:0] pc = 32'b0;
reg [31:0] ALUOut;
reg [31:0] register_A, register_B;

wire [31:0] memAddress;

// Decode control signals
wire [5:0]opCode;
wire [4:0] regToWrite;

//Instruction decode variables
wire[4:0] rs,rt,rd;
wire [15:0] immediatevalue;
wire [4:0] shamt;
wire [5:0] funct;
wire [25:0] address;

reg [31:0] memDataReg;

wire [31:0] regA,regB;
wire [31:0] regWriteData;
wire [31:0] imm_value;
wire [31:0] valueA, valueB;
wire forceadd;

assign memAddress = iord? ALUOut:pc;

//Data Memory module holds both data and instructions
DataMemory data(memwrite,memread,memAddress[5:0], register_B,mem_out);
//Instruction decode decodes instructions and puts values into appropriate wires
InstructionDecode IDStage(clock,
mem_out,opCode,rs,rt,rd,shamt,funct,immediatevalue,address);

//Control FSM sequences the stages of the multicycle cpu
MulticycleControlFSM
mainControl(opCode,clock,aluscra,iord,alusrcb,pcsource,regdst,memtoreg,
memread,pcwritecond, pcwrite, memwrite, irwrite, regwrite,forceadd);

//MemoryDataRegister holds data from memory that may be written into register
always@(posedge clock)
memDataReg <= mem_out;

//Chooses appropriate write register depending on the control
twomux5 writereg(regdst, rd, rt,regToWrite);
//Chooses appropriate data to write depending on the control
twomux32 writedata(memtoreg,memDataReg,ALUOut,regWriteData);
regFileRTL RTL(clock,regwrite,regWriteData,regToWrite,rs,rt,regA,regB);

//Registers hold value until positive edge of clock, when they are updated
always@(posedge clock)
begin
register_A <= regA;
register_B <= regB;
end

//sign extend the immediate value
sign_extend extendImmediate(clock,immediatevalue,imm_value);
//zero extend the immediate value
wire [31:0] zeroextendvalue;
zero_extend extender(clock,immediatevalue, zeroextendvalue);

twomux32 registerAmux(aluscra,register_A,pc,valueA);
FiveToOneMux32 registerBmux(alusrcb,zeroextendvalue,imm_value<<2,
imm_value,4,register_B,valueB);

// ALU control selects the operation of the alu
ALUControlMulti alucontrol(funct,opCode,forceadd,aluCtrl);

//Main ALU
ALUMulticycle ALU(aluCtrl,valueA,valueB,alu_out,zero);

//temp ALU out register holds value from alu until updated on posedge clock
always@(posedge clock)
begin
ALUOut <= alu_out;
end

JumpAddress jumpTo(pc,address,jumpaddress);
//Mux chooses next data to pc depending on control
ThreeToOneMux32
branchesAndJumps(pcsource,jumpaddress,ALUOut,alu_out,branchCondition);
//pc is written on an unconditional write or on a taken branch
wire branchwritecond, gotoNextPc;
assign branchwritecond = pcwritecond & zero;
assign gotoNextPc = pcwrite | branchwritecond;
//PC update
always @ (posedge clock)
begin
if(gotoNextPc)
pc <= branchCondition;
end
endmodule// END: MultiCycle

//The Testbench for our multicycle cpu
module AMultiCycleTest;
reg clock;
wire[31:0] cycle,alu_out, mem_out, pc;
wire regdst, memread, memwrite, regwrite, memtoreg;
wire zero;
wire pcwritecond, pcwrite,iord,aluscra,irwrite;
wire [1:0] pcsource;
wire [2:0] alusrcb;
initial
begin
clock =1'b0;
end
//clock toggles every 15 time units
always
begin
#15 clock = ~clock;
end

MultiCycle testcpu(cycle, pc, clock, alu_out, mem_out, regdst,
memread, memwrite, regwrite, memtoreg, zero, pcwritecond,
pcwrite,iord,irwrite, pcsource,aluscra,alusrcb);
endmodule // END: AMultiCycleTest

//Control for multicycle cpu
module
MulticycleControlFSM(opcode,clk,ALUSrcA,IorD,ALUSrcB,PCSource,RegDst,MemtoReg,
MemRead,PCWriteCondition, PCWrite, MemWrite, IRWrite, RegWrite,forceAdd);
input [5:0]opcode;
input clk;
output ALUSrcA,IorD,RegDst,MemtoReg;
output [2:0]ALUSrcB; //3 bits wide, matching the reg declaration below
output [1:0]PCSource;
output MemRead,PCWriteCondition, PCWrite, MemWrite, IRWrite, RegWrite,forceAdd;
reg MemRead,PCWriteCondition, PCWrite, MemWrite, IRWrite, RegWrite;
reg ALUSrcA,IorD,RegDst,MemtoReg, forceAdd;
reg [2:0]ALUSrcB;
reg [1:0]PCSource;
reg [3:0] current_state, next_state;
reg [3:0] debug;
parameter A=4'b0000, B=4'b0001, C=4'b0010, D=4'b0011, E=4'b0100, F=4'b0101, G=4'b0110,
H=4'b0111, I=4'b1000, J=4'b1001, K=4'b1010, L=4'b1011, M=4'b1100;
//parameter A=0, B=1, C=2, D=3, E=4, F=5, G=6, H=7, I=8, J=9, K=10, L=11, M=12;

//forceAdd 1=add (only in states A, B)
initial begin
current_state=4'b0000;
next_state=4'b0000;
end
always@(posedge clk)
begin
current_state=next_state;
end
//Note: this block mixes edge (posedge clk) and level (opcode) sensitivity
always@(posedge clk or opcode)
begin
case(current_state)

A:begin

debug = 4'b0000; //added recently


MemRead = 1;
ALUSrcA=0;
IorD=1'b0;
IRWrite = 1;
ALUSrcB=3'b001;
PCWrite = 1;
PCSource=2'b00;
next_state=B;

RegDst=0;
MemtoReg=0;
PCWriteCondition=0;
MemWrite=0;
RegWrite=0;

forceAdd=1;
end

B:begin

debug = 4'b0001;

ALUSrcA=0;
ALUSrcB=3'b011;

IorD=0;
PCSource=0;
RegDst=0;
MemtoReg=0;
MemRead=0;
PCWriteCondition=0;
PCWrite=0;
MemWrite=0;
IRWrite=0;
RegWrite=0;
forceAdd=1;

//if lw or sw nextstate = C
//if(opcode==35 || opcode==43)
if(opcode==6'b100011 || opcode==6'b101011)
begin
next_state=C;
end

//if r type nextstate = G
//if(opcode==0)
else if(opcode==6'b000000)
begin
next_state=G;
end

//if beq nextstate = I
//if(opcode==4)
else if(opcode==6'b000100)
begin
next_state=I;
end

//if j nextstate = J
//if(opcode==2)
else if(opcode==6'b000010)
begin
next_state=J;
end
//IType instruction, treated like R-Type
//because ALU control will take care of proper execution
else if(opcode== 6'b001000 ||//addI
opcode==6'b001010//sltI
)
begin
next_state=K;//sign extended immediate state
end
else if(opcode== 6'b001101||//orI
opcode== 6'b001100||//andI
opcode== 6'b001111//xorI
)
begin
next_state=M;//zero extended immediate state
end
else
debug = 4'b1111;//unrecognized opcode
end
C:begin

debug = 4'b0010;

ALUSrcA=1;
ALUSrcB=3'b010;

IorD=0;
PCSource=0;
RegDst=0;
MemtoReg=0;
MemRead=0;
PCWriteCondition=0;
PCWrite=0;
MemWrite=0;
IRWrite=0;
RegWrite=0;

forceAdd=0;

//if lw nextstate = D, if sw nextstate = F
if(opcode==6'b100011)
begin
next_state=D;
end
else if(opcode==6'b101011)
next_state=F;
else
debug = 4'b1111;//unrecognized opcode
end

D:begin

debug = 4'b0011;

MemRead = 1;
IorD=1;

ALUSrcA=0;
ALUSrcB=0;
PCSource=0;
RegDst=0;
MemtoReg=0;
PCWriteCondition=0;
PCWrite=0;
MemWrite=0;
IRWrite=0;
RegWrite=0;

next_state=E;

forceAdd=0;
end

E:begin

debug = 4'b0100;

RegDst=1'b0;
RegWrite = 1;
MemtoReg=1'b1;

next_state=A;

ALUSrcA = 0;
IorD = 0;
ALUSrcB = 0;
PCSource = 0;
MemRead = 0;
PCWriteCondition = 0;
PCWrite = 0;
MemWrite = 0;
IRWrite = 0;

forceAdd=0;

end

F:begin

debug = 4'b0101;

MemWrite = 1'b1;
IorD=1'b1;

next_state=A;

ALUSrcA = 0;
ALUSrcB = 0;
PCSource = 0;
RegDst = 0;
MemtoReg = 0;
MemRead = 0;
PCWriteCondition = 0;
PCWrite = 0;
IRWrite = 0;
RegWrite = 0;

forceAdd=0;
end

G:begin

debug = 4'b0110;

ALUSrcA=1;
ALUSrcB=3'b000;

next_state=H;

IorD = 0;
PCSource = 0;
RegDst = 0;
MemtoReg = 0;
MemRead = 0;
PCWriteCondition = 0;
PCWrite = 0;
MemWrite = 0;
IRWrite = 0;
RegWrite = 0;

forceAdd=0;

end

H:begin
//For RType or IType, if not RType, it is IType
//if IType regDst = 0
debug = 4'b0111;
RegDst=1'b1;

RegWrite = 1;
MemtoReg=1'b0;

next_state=A;

ALUSrcA = 0;
IorD = 0;
ALUSrcB = 0;
PCSource = 0;
MemRead = 0;
PCWriteCondition = 0;
PCWrite = 0;
MemWrite = 0;
IRWrite = 0;

forceAdd=0;

end

I:begin

debug = 4'b1000;

ALUSrcA=1;
ALUSrcB=3'b000;
PCWriteCondition = 1;
PCSource=2'b01;

next_state=A;
IorD = 0;
RegDst = 0;
MemtoReg = 0;
MemRead = 0;
PCWrite = 0;
MemWrite = 0;
IRWrite = 0;
RegWrite = 0;

forceAdd=0;

end

J:begin

debug = 4'b1001;
PCWrite = 1;
PCSource=2'b10;
next_state=A;

ALUSrcA = 0;
IorD = 0;
ALUSrcB = 0;
RegDst = 0;
MemtoReg = 0;
MemRead = 0;
PCWriteCondition = 0;
MemWrite = 0;
IRWrite = 0;
RegWrite = 0;
forceAdd=0;
end
K:begin

debug = 4'b1010;

ALUSrcA = 1;
ALUSrcB = 3'b010;

MemtoReg = 0;
IorD = 0;
RegDst = 0;
MemRead = 0;
PCWriteCondition = 0;
PCWrite=0;
MemWrite = 0;
IRWrite = 0;
RegWrite = 0;
PCSource=2'b00;

next_state=L;

forceAdd=0;
end
L:
begin

debug = 4'b1011;
RegDst = 0;
RegWrite = 1;
MemtoReg = 0;

IorD = 0;
MemRead = 0;
PCWriteCondition = 0;
PCWrite=0;
MemWrite = 0;
IRWrite = 0;
ALUSrcA = 0;
ALUSrcB = 3'b000;
PCSource=2'b00;

next_state=A;

forceAdd=0;
end
M:
begin
debug = 4'b1100;
ALUSrcA = 1;
ALUSrcB = 3'b100;

MemtoReg = 0;
IorD = 0;
RegDst = 0;
MemRead = 0;
PCWriteCondition = 0;
PCWrite=0;
MemWrite = 0;
IRWrite = 0;
RegWrite = 0;
PCSource=2'b00;

next_state=L;

forceAdd=0;
end
endcase
end

endmodule//END: MulticycleControlFSM

//Testbench for control


module testbenchMulticycleControlFSM;
reg [5:0]op;
reg clock=0;
wire ALUSrcA,IorD,RegDst,MemtoReg;
wire [2:0]ALUSrcB;//3 bits wide: the FSM drives values such as 3'b100
wire [1:0]PCSource;
always
begin
#2 clock=~clock;
end
initial begin
op=6'b000000;//add 1
#10 op=6'b001000;//addi 9
#10 op=6'b000000;//Sub 2
#10 op=6'b000100;//branch 10
#10 op=6'b000000;//And 3
#10 op=6'b000010;//j 15
#10 op=6'b000000;//Or 4
#10 op=6'b100011;//LW 16
#10 op=6'b000000;//Xor 5
#30 op=6'b101011;//SW 17
#10 op=6'b000000;//Slt 6
#10 op=6'b001101;//OrI 11
#10 op=6'b000000;//Mult 7
#10 op=6'b001100;//AndI 12
#10 op=6'b000000;//Div 8
#10 op=6'b001111;//XorI 13
#10 op=6'b001010;//SltI 14

end
MulticycleControlFSM test(op,clock,ALUSrcA,IorD,ALUSrcB,PCSource,RegDst,MemtoReg,
MemRead,PCWriteCondition, PCWrite, MemWrite, IRWrite, RegWrite);

endmodule// END: testbenchMulticycleControlFSM

//Total Lines: 186


module ALUMulticycle(aluctrl, valueA, valueB,result,zero);
input [3:0] aluctrl;
input [31:0] valueA;
input [31:0] valueB;
output [31:0] result;
reg [31:0] result;
output zero;
reg zero;

always@(aluctrl or valueA or valueB)


begin
case(aluctrl)

4'b0000://Bitwise And
begin
result = valueA & valueB;
end
4'b0001://Bitwise Or
begin
result = valueA | valueB;
end
4'b0010://Add
begin
result = valueA + valueB;
end

4'b0101://Xor
begin
result = valueA ^ valueB;
end
4'b0110://Sub
begin
result = valueA - valueB;
end
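4'b0111://Slt (operands are compared as unsigned values)
begin
result = valueA < valueB ? 1:0;
end
default://unused control codes
begin
result = 32'b0;
end

endcase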

//zero reports operand equality (used by beq), independent of the selected operation
if(valueA==valueB)
begin
zero=1'b1;
end
else
begin
zero=1'b0;
end

end
endmodule

module testALUMultiCycle;
reg [3:0] aluctrl;
reg [31:0] valueA;
reg [31:0] valueB;
wire [31:0] result;
wire zero;

initial
begin

//AND
aluctrl = 4'b0000;
valueA = 0;
valueB = 4294967295;
$monitor("AND -> aluctrl: %b | valueA: %b (%d) | valueB: %b (%d) | Result= %b (%d) | Zero = %b",
aluctrl,valueA,valueA,valueB,valueB,result,result,zero);
#5
aluctrl = 4'b0000;
valueA = 4294967295;
valueB = 4294967295;
$monitor("AND -> aluctrl: %b | valueA: %b (%d) | valueB: %b (%d) | Result= %b (%d) | Zero = %b",
aluctrl,valueA,valueA,valueB,valueB,result,result,zero);
//OR
#5
aluctrl = 4'b0001;
valueA = 4294967295;
valueB = 4294967295;
$monitor("OR -> aluctrl: %b | valueA: %b (%d) | valueB: %b (%d) | Result= %b (%d) | Zero = %b",
aluctrl,valueA,valueA,valueB,valueB,result,result,zero);
#5
aluctrl = 4'b0001;
valueA = 0;
valueB = 0;
$monitor("OR -> aluctrl: %b | valueA: %b (%d) | valueB: %b (%d) | Result= %b (%d) | Zero = %b",
aluctrl,valueA,valueA,valueB,valueB,result,result,zero);

//Add
#5
aluctrl = 4'b0010;
valueA = 5;
valueB = 5;
$monitor("ADD -> aluctrl: %b | valueA: %b (%d) | valueB: %b (%d) | Result= %b (%d) | Zero = %b",
aluctrl,valueA,valueA,valueB,valueB,result,result,zero);
#5
aluctrl = 4'b0010;
valueA = 0;
valueB = 4294967295;
$monitor("ADD -> aluctrl: %b | valueA: %b (%d) | valueB: %b (%d) | Result= %b (%d) | Zero = %b",
aluctrl,valueA,valueA,valueB,valueB,result,result,zero);
//XOR
#5
aluctrl = 4'b0101;
valueA = 0;
valueB = 1;
$monitor("XOR -> aluctrl: %b | valueA: %b (%d) | valueB: %b (%d) | Result= %b (%d) | Zero = %b",
aluctrl,valueA,valueA,valueB,valueB,result,result,zero);
#5
aluctrl = 4'b0101;
valueA = 4294967295;
valueB = 0;
$monitor("XOR -> aluctrl: %b | valueA: %b (%d) | valueB: %b (%d) | Result= %b (%d) | Zero = %b",
aluctrl,valueA,valueA,valueB,valueB,result,result,zero);

//Subtract
#5
aluctrl = 4'b0110;
valueA = 5;
valueB = 4;
$monitor("SUBTRACT -> aluctrl: %b | valueA: %b (%d) | valueB: %b (%d) | Result= %b (%d) | Zero = %b",
aluctrl,valueA,valueA,valueB,valueB,result,result,zero);
#5
aluctrl = 4'b0110;
valueA = 4294967295;
valueB = 0;
$monitor("SUBTRACT -> aluctrl: %b | valueA: %b (%d) | valueB: %b (%d) | Result= %b (%d) | Zero = %b",
aluctrl,valueA,valueA,valueB,valueB,result,result,zero);

//SLT
#5
aluctrl = 4'b0111;
valueA = 5;
valueB = 4;
$monitor("SLT -> aluctrl: %b | valueA: %b (%d) | valueB: %b (%d) | Result= %b (%d) | Zero = %b",
aluctrl,valueA,valueA,valueB,valueB,result,result,zero);
#5
aluctrl = 4'b0111;
valueA = 0;
valueB = 4294967295;
$monitor("SLT -> aluctrl: %b | valueA: %b (%d) | valueB: %b (%d) | Result= %b (%d) | Zero = %b",
aluctrl,valueA,valueA,valueB,valueB,result,result,zero);

end

ALUMulticycle test(aluctrl, valueA, valueB,result,zero);

endmodule
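
//Sketch of a self-checking variant of the ALU testbench (an illustration,
//not part of the submitted design). The module name testALUSelfCheck, the
//task name check and the chosen operand values are hypothetical. It assumes
//only the ALUMulticycle port order shown above and compares each result
//against an expected value instead of relying on reading $monitor output.
module testALUSelfCheck;
reg [3:0] aluctrl;
reg [31:0] valueA;
reg [31:0] valueB;
wire [31:0] result;
wire zero;
integer errors;

//apply one stimulus, wait for the ALU to settle, then compare
task check;
input [3:0] ctrl;
input [31:0] a;
input [31:0] b;
input [31:0] expected;
begin
aluctrl = ctrl;
valueA = a;
valueB = b;
#5;
if(result !== expected)
begin
errors = errors + 1;
$display("FAIL: aluctrl=%b a=%d b=%d result=%d expected=%d",ctrl,a,b,result,expected);
end
end
endtask

initial
begin
errors = 0;
check(4'b0000, 32'hFFFFFFFF, 32'h0000FFFF, 32'h0000FFFF);//And
check(4'b0001, 32'h00000000, 32'h00000005, 32'h00000005);//Or
check(4'b0010, 5, 5, 10);//Add
check(4'b0101, 32'hFFFFFFFF, 0, 32'hFFFFFFFF);//Xor
check(4'b0110, 5, 4, 1);//Sub
check(4'b0111, 4, 5, 1);//Slt
$display("%0d error(s)", errors);
$stop;
end

ALUMulticycle dut(aluctrl, valueA, valueB,result,zero);

endmodule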

//ALU control
module ALUControlMulti(funct, opcode,forceadd, ALUIn);
input [5:0]funct;
input [5:0] opcode;
input forceadd;
output [3:0]ALUIn;
reg [3:0]ALUIn;

always@(funct or opcode or forceadd)


begin
if(forceadd==1)
begin
ALUIn = 4'b0010;
end
else
begin
//begin case
case(opcode)
//R-Type
6'b000000:
begin
//And
if(funct==6'b100100)
begin
ALUIn = 4'b0000;
end
//Or
else if(funct==6'b100101)
begin
ALUIn = 4'b0001;
end
//Add
else if(funct==6'b100000)
begin
ALUIn = 4'b0010;
end

//Xor
else if(funct == 6'b000001)
begin
ALUIn = 4'b0101;
end
//Sub
else if(funct==6'b100010)
begin
ALUIn = 4'b0110;
end
//Slt
else if(funct==6'b101010)
begin
ALUIn = 4'b0111;
end

end//end R-type
//Begin I-Type
//AndI
6'b001100://C
begin
ALUIn = 4'b0000;
end
//OrI
6'b001101://D
begin
ALUIn = 4'b0001;
end
//XorI
6'b001111://F
begin
ALUIn = 4'b0101;
end
//SltI
6'b001010://A
begin
ALUIn = 4'b0111;
end
//AddI
6'b001000://8
begin
ALUIn = 4'b0010;
end

//Branch
6'b000100://4
begin
ALUIn = 4'b0010;
end
//LW
6'b100011:
begin
ALUIn = 4'b0010;
end
//SW
6'b101011:
begin
ALUIn = 4'b0010;
end
//End I-Type

//jump
6'b000010:
begin
ALUIn=4'b0010;
end
endcase //endcase
end//end else
end//end always
endmodule

//ALU control testbench


module testALUControlMulti;
reg clock;
reg [5:0]funct;
reg [5:0]op;
wire [3:0]ALUIn;
initial
begin
$monitor(" Time=%d,\top=%d,\t funct=%d,\t ALUIn=%d", $time, op,funct, ALUIn);
end
initial
begin
op=6'b000000;funct=6'b100000;//add 1
#20 op=6'b000000;funct=6'b100010;//Sub 2
#20 op=6'b000000;funct=6'b100100;//And 3
#20 op=6'b000000;funct=6'b100101;//Or 4
#20 op=6'b000000;funct=6'b000001;//Xor 5
#20 op=6'b000000;funct=6'b101010;//Slt 6
#20 op=6'b000000;funct=6'b011000;//Mult 7
#20 op=6'b000000;funct=6'b011010;//Div 8
#20 op=6'b001000;funct=6'b010100;//addi 9
#20 op=6'b000100;funct=6'b000110;//branch (I) 10 ///???
#20 op=6'b001101;funct=6'bx;//OrI 11
#20 op=6'b001100;funct=6'bx;//AndI 12
#20 op=6'b001111;funct=6'bx;//XorI 13
#20 op=6'b001010;funct=6'bx;//SltI 14
#20 op=6'b000010;funct=6'bx; //j 15
#20 op=6'b100011; funct=6'bx;//LW 16
#20 op=6'b101011; funct=6'bx;//SW 17
#20 $stop;
end
ALUControlMulti aluctrltest(funct, op, 1'b0, ALUIn);//forceadd tied low so ALUIn reflects op and funct
endmodule

//Data Memory module


module DataMemory( memWrite,memRead,Address, writeData,readData);
input memWrite, memRead;
input [5:0] Address;
input [31:0] writeData;
output [31:0] readData;
reg [31:0] readData;
reg [31:0]dataMemory[1024:0];

initial
begin

dataMemory[0] = 32'b00100000000101010000000000010100;//N=20
dataMemory[4] = 32'b00000000000000001011100000100000;
dataMemory[8] = 32'b00010010101000000000000000000110;
dataMemory[12] = 32'b00100000000101100000000000000001;
dataMemory[16] = 32'b00000010111101101011100000100000;
dataMemory[20] = 32'b00000010111101101011000000100010;
dataMemory[24] = 32'b00100010101101011111111111111111;
dataMemory[28] = 32'b00010010101000000000000000000001;
dataMemory[32] = 32'b00010000000000001111111111111011;
dataMemory[36] = 32'b10101100000101110000000000000001;
end
always@(memWrite or memRead or Address or writeData)
begin

if(memWrite == 1'b1)
begin
dataMemory[Address]=writeData;
end
if(memRead ==1'b1)
begin
readData=dataMemory[Address];
end
end
endmodule//END: DataMemory

//Data Memory testbench


module ADataMemTest;

reg memWrite,memRead;
reg [5:0] Address;
reg [31:0] writeData;
wire [31:0] readData;

initial
begin
$monitor(" Time=%d, memWrite=%d, memRead=%d, Address=%d, writeData=%d,readData=%d ",
$time,memWrite,memRead,Address, writeData,readData);
end
initial
begin
memRead=1;
#20 Address=4;
#20 memRead=0;
#20 Address=1;
#20 memWrite=1;
#20 writeData=32'b1;
#20 $stop;
end

DataMemory testMem(memWrite,memRead,Address, writeData,readData);


endmodule

//Register File
module regFileRTL(clock,regWrite,inData,wrReg,readA, readB,regA,regB);
input clock;
input regWrite;
input [31:0] inData;
input [4:0] wrReg;
input [4:0] readA;
input [4:0] readB;
output [31:0] regA;
output [31:0] regB;
reg [31:0] registerFiles[31:0];

initial
begin
registerFiles[5'b00000] = 32'b0;
end
always@(posedge clock)
begin

if(regWrite && ( wrReg != 5'b00000))


begin
registerFiles[wrReg] <= inData;//non-blocking assignment for clocked storage
end

end

assign regA = registerFiles[readA];


assign regB = registerFiles[readB];

endmodule//END: regFileRTL

//InstructionDecode decode the instruction


module InstructionDecode(clock,instruction, opcode, rs,rt,rd,shamt,funct, immediate, address);
input clock;
input [31:0] instruction;
output [5:0] opcode, funct;
output [4:0] rs, rt,rd,shamt;
output [15:0] immediate;
output [25:0] address;

//Combinational decode: every field is a slice of the instruction word.
//The clock input is not used by the decode logic.
assign opcode = instruction[31:26];
assign rs = instruction[25:21];
assign rt = instruction[20:16];
assign rd = instruction[15:11];
assign shamt = instruction[10:6];
assign funct = instruction[5:0];
assign immediate = instruction[15:0];
assign address = instruction[25:0];
endmodule// END: InstructionDecode

//Instruction decode testbench


module AInstrTest;
reg clock;
reg [31:0] instr;
wire [5:0] opcode, funct;
wire [4:0] rs, rt,rd,shamt;
wire [15:0] immediate;
wire [25:0] address;

initial
begin
$monitor("Time=%d, instOp=%d,%d,instRs=%d,%d,instRt=%d,%d,instRd=%d,%d,instShT=%d,%d,instFt=%d,%d,instImm=%d,%d,instAdd=%d;%d",
$time,instr[31:26], opcode,instr[25:21],
rs,instr[20:16],rt,instr[15:11],rd,instr[10:6],shamt,instr[5:0],funct,instr[15:0],
immediate,instr[25:0], address);
clock=0;
end
always
#2 clock= ~clock;
initial
begin
instr = 32'b00100000000101010000000000010001;
#20 instr = 32'b00000000000000001011100000100000;
#20 instr = 32'b00010010101000000000000000000110;
#20 instr = 32'b00100000000101100000000000000001;
#20 instr = 32'b00000010111101101011100000100000;
#20 instr = 32'b00000010111101101011000000100010;
#20 instr = 32'b00100010101101011111111111111111;
#20 instr = 32'b00010010101000000000000000000001;
#20 instr = 32'b00010000000000001111111111111011;
#20 instr = 32'b10101100000101110000000000000001;
#20 $stop;
end
InstructionDecode testdecode(clock,instr, opcode, rs,rt,rd,shamt,funct, immediate, address);
endmodule//END:AInstrTest

//Sign extension module


module sign_extend(clock,value,signvalue);
input clock;
input [15:0] value;
output [31:0] signvalue;
reg [31:0] signvalue;
always@(posedge clock)
begin
//replicate the sign bit (value[15]) into the upper 16 bits
signvalue <= {{16{value[15]}}, value};
end
endmodule//END: sign_extend

//Sign Extend test bench


module testSignExtend;
reg clock;
reg [15:0] value;
wire [31:0] newvalue;
initial
begin
$monitor(" Time=%d, value=%b, signvalue=%b", $time, value, newvalue);
clock=0;
end
always
#2 clock= ~clock;
initial
begin
value = 0;#20 value = 1;#20 value = 2;#20 value = 3;#20 value = 4;
#20 value = 5;#20 value = 20;#20 value = 40;#20 value = 500;#20 value = 10000;

#20 value = 16'b1000000000000000;#20 value = 16'b1000000000000001;


#20 value = 16'b0111111111111111;#20 value = 16'b1010101010101010;
#20 value = 16'b1111111111111111;#20 value = 16'b1111111111111110;
#20 $stop;
end
sign_extend testsign(clock,value,newvalue);
endmodule

//Zero extension module


module zero_extend(clock,value, zerovalue);
input clock;
input [15:0] value;
output [31:0] zerovalue;
reg [31:0] zerovalue;
always@(posedge clock)
begin
//the upper 16 bits are always zero
zerovalue <= {16'b0, value};
end
endmodule

//Zero extension testbench


module testZeroExtend;
reg [15:0] value;
reg clock;
wire [31:0] newvalue;
initial
begin
$monitor(" Time=%d, value=%b, zerovalue=%b", $time, value, newvalue);
clock=0;
end
always
#2 clock= ~clock;
initial
begin
value = 0;#20 value = 1;#20 value = 2;#20 value = 3;#20 value = 4;
#20 value = 5;#20 value = 20;#20 value = 40;#20 value = 500;#20 value = 10000;

#20 value = 16'b1000000000000000;#20 value = 16'b1000000000000001;


#20 value = 16'b0111111111111111;#20 value = 16'b1010101010101010;
#20 value = 16'b1111111111111111;#20 value = 16'b1111111111111110;
#20 $stop;
end
zero_extend testzero(clock,value,newvalue);
endmodule

//JumpAddress
module JumpAddress(pc,address,newAddress);
input [31:0] pc;
input [25:0] address;
output [31:0] newAddress;
reg [31:0] newAddress;

always@(pc or address)
begin
newAddress[31:28] = pc[31:28];
newAddress[27:0] = (address <<2);
end
endmodule//END: JumpAddress

//JumpAddress testbench
module AJumpAddressTest;
reg [31:0] pc;
reg [25:0] addr;
wire [31:0] newAddr;
integer x,y;

initial
begin
$monitor(" Time=%d, pc=%d, addr=%d, newAddr=%d", $time,pc,addr,newAddr);
end
initial
begin
x=0;
y=0;
addr = 32'b0;
pc = 32'b00010000000000000000000000000000;
for(x = 0; x < 32; x=x+1)
begin
#10 addr=x;
end
pc = 32'b00110000000000000000000000000000;
addr= 32'b11110000000000000000000000000000;
for(y = 0; y < 32; y=y+1)
begin
#10 addr=y;
end
#20 $stop;
end
JumpAddress jumptest(pc,addr,newAddr);
endmodule

//Twomux5 has a datapath of 5 bits wide and a choice of two elements


module twomux5(a,x1,x0,x);
input a;
input [4:0] x1,x0;
output [4:0]x;
reg [4:0]x;

always@(a or x1 or x0)
begin

if(a == 1'b1)
begin
x = x1;
end
else if(a==1'b0)
begin
x = x0;
end

end
endmodule//END: twomux5

//Twomux32 has a datapath of 32 bits wide and a choice of two elements


module twomux32(a,x1,x0,x);
input a;
input [31:0] x1,x0;
output [31:0]x;
reg [31:0]x;
always@(a or x1 or x0)
begin
if(a == 1'b1)
begin
x = x1;
end
else if(a == 1'b0)
begin
x=x0;
end
end
endmodule//END: twomux32

//ThreeToOneMux has a datapath of 32 bits wide and a choice of three elements


module ThreeToOneMux32(select,x2,x1,x0,out);
input [1:0] select;
input [31:0] x2,x1,x0;
output [31:0] out;
reg [31:0] out;
always@(select or x0 or x1 or x2)
begin
if(select == 2'b00)
begin
out = x0;
end
if(select == 2'b01)
begin
out = x1;
end
if(select == 2'b10)
begin
out = x2;
end

end
endmodule// END:ThreeToOneMux32
//FiveToOneMux32 has a datapath of 32 bits wide and a choice of five elements
module FiveToOneMux32(select,x4,x3,x2,x1,x0,out);
input [2:0] select;
input [31:0] x4,x3,x2,x1,x0;
output [31:0] out;
reg [31:0] out;
always@(select or x0 or x1 or x2 or x3 or x4)
begin
if(select == 3'b000)
begin
out = x0;
end
if(select == 3'b001)
begin
out = x1;
end
if(select == 3'b010)
begin
out = x2;
end
if(select == 3'b011)
begin
out = x3;
end
if(select == 3'b100)
begin
out = x4;
end
end
endmodule
Appendix B – Simulation Results

The following simulation results are from a program we wrote that calculates the nth number of the
Fibonacci sequence. In this simulation n was set to 20. The simulation produced 6765 as the 20th
Fibonacci number, which is correct.

The assembly code to our program is listed below:

addi $21,$0,20
add $23,$0,$0
beq $21,$0,end
addi $22, $0,1
loop:
add $23,$23,$22
sub $22,$23,$22
addi $21,$21,-1
beq $21,$0,end
beq $0,$0, loop
end:
sw $23,1($0)
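
The loop above can be mirrored in a small behavioral model to predict the value that the sw instruction should store. This is only a sketch for cross-checking: the module name fib_reference and the variables n, prev and cur are hypothetical stand-ins for registers $21, $22 and $23, and nothing here is part of the CPU design.

```verilog
// Hypothetical reference model of the Fibonacci program (not part of the CPU).
module fib_reference;
integer n;      // mirrors $21, the loop counter
integer prev;   // mirrors $22
integer cur;    // mirrors $23, the running Fibonacci number

initial
begin
    n   = 20;                    // addi $21,$0,20
    cur = 0;                     // add  $23,$0,$0
    if (n != 0)                  // beq  $21,$0,end
    begin
        prev = 1;                // addi $22,$0,1
        while (n != 0)           // the two beq instructions that close the loop
        begin
            cur  = cur + prev;   // add  $23,$23,$22
            prev = cur - prev;   // sub  $22,$23,$22
            n    = n - 1;        // addi $21,$21,-1
        end
    end
    $display("fib = %0d", cur);  // value that sw $23,1($0) stores
end
endmodule
```

With n set to 20 the model displays fib = 6765, the same value the simulation leaves in dataMemory[1].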

To double-check our binary encoding, we assembled our assembly code in the MIPS simulator SPIM.

[0x00400000] 0x20150002 addi $21, $0, 20 ; 1: addi $21,$0,2


[0x00400004] 0x0000b820 add $23, $0, $0 ; 2: add $23,$0,$0
[0x00400008] 0x12a00007 beq $21, $0, 28 [end-0x00400008]; 3: beq $21,$0,end
[0x0040000c] 0x20160001 addi $22, $0, 1 ; 4: addi $22, $0,1
[0x00400010] 0x02f6b820 add $23, $23, $22 ; 6: add $23,$23,$22
[0x00400014] 0x02f6b022 sub $22, $23, $22 ; 7: sub $22,$23,$22
[0x00400018] 0x22b5ffff addi $21, $21, -1 ; 8: addi $21,$21,-1
[0x0040001c] 0x12a00002 beq $21, $0, 8 [end-0x0040001c] ; 9: beq $21,$0,end
[0x00400020] 0x1000fffc beq $0, $0, -16 [loop-0x00400020] ; 10: beq $0,$0, loop
[0x00400024] 0xac170001 sw $23, 1($0) ; 12: sw $23,1($0)

In binary representation our program code is as follows:

dataMemory[0] = 32'b00100000000101010000000000010100;
dataMemory[4] = 32'b00000000000000001011100000100000;
dataMemory[8] = 32'b00010010101000000000000000000110;
dataMemory[12] = 32'b00100000000101100000000000000001;
dataMemory[16] = 32'b00000010111101101011100000100000;
dataMemory[20] = 32'b00000010111101101011000000100010;
dataMemory[24] = 32'b00100010101101011111111111111111;
dataMemory[28] = 32'b00010010101000000000000000000001;
dataMemory[32] = 32'b00010000000000001111111111111011;
dataMemory[36] = 32'b10101100000101110000000000000001;
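
A word from this listing can be cross-checked against the assembly by slicing it into its I-type fields. The sketch below does this for the first word only; the module name decode_check and the variable word are hypothetical, and the field boundaries are the ones used by the InstructionDecode module in Appendix A.

```verilog
// Hypothetical helper: decodes the first program word into I-type fields.
module decode_check;
reg [31:0] word;

initial
begin
    word = 32'b00100000000101010000000000010100;   // dataMemory[0]
    $display("opcode=%0d rs=%0d rt=%0d imm=%0d",
             word[31:26], word[25:21], word[20:16], word[15:0]);
    // displays: opcode=8 rs=0 rt=21 imm=20, that is addi $21,$0,20
end
endmodule
```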

On the following pages are the results of the simulation running the program described above.
[Simulation waveform, page 1 of 2. Entity: AMultiCycleTest. Date: Sun Dec 02 8:30:30 PM Central Standard Time 2007. Time axis: 0 to 12 us. Signals traced under /AMultiCycleTest/testcpu: clock, cycle, alu_out, mem_out, pc, regdst, memread, memwrite, regwrite, memtoreg, zero, pcwritecond, pcwrite, iord, irwrite, aluscra, pcsource, alusrcb, jumpaddress, branchCondition, aluCtrl, ALUOut, register_A, register_B, memAddress, opCode, regToWrite, rs, rt, rd, immediatevalue, shamt, funct, address, memDataReg, regA, regB, regWriteData, imm_value, valueA, valueB, forceadd, zeroextendvalue, brachwritecond.]

[Simulation waveform, page 2 of 2. Same entity, date and time axis. Signals traced under /AMultiCycleTest/testcpu: gotoNextPc, RTL/registerFiles, data/dataMemory[1]. Successive values shown in the trace:]

registerFiles[23]: 0 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765
registerFiles[22]: 1 0 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181
registerFiles[21]: 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
registerFiles[0]: 0
dataMemory[1]: 6765
Appendix C – Commonly Made Verilog Mistakes

• When a register or wire name is misspelled, ModelSim does not report any error or
warning, yet the design will not function correctly.
• ModelSim issues a warning, but does not stop, when a register is assigned more bits
than it is wide. Only the low-order bits of the assigned value are kept, usually to the
inconvenience of the developer.
• A module should be named exactly as the file that holds it. Although this is not a strict
rule of Verilog, it is good practice because some programs, like Quartus II, depend on
this naming scheme for some applications.
• Pay close attention to the sensitivity list of an always block. If a variable that is used
inside the block is missing from the list, the block will not run when that variable
changes. This is a confusing issue to find when debugging code.
• An output must also be declared as a register (reg) if it is assigned inside a procedural
block.
• Begin and end statements must be paired properly. A begin without a matching end will
cause compilation problems.
• To connect the output of one module to another, a wire must be used. Connecting a
register to a module output causes a compilation error.
• Blocking and non-blocking assignments behave differently. Misunderstanding the
difference can cause subtle problems in the inner workings of Verilog code, and this is
also a very hard issue to debug.
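
Several of the points above can be guarded against in the code itself. The sketch below is a hypothetical module (mistakes_sketch and all of its port names are invented for illustration) and assumes a Verilog-2001 capable simulator: `default_nettype none turns an undeclared or misspelled net into a compile error, always @(*) infers the sensitivity list, and non-blocking assignments are used in the clocked block.

```verilog
`default_nettype none   // undeclared (for example misspelled) nets become errors

// Hypothetical module, not part of the CPU in Appendix A.
module mistakes_sketch(
    input  wire       clock,
    input  wire       sel,
    input  wire [3:0] a,
    input  wire [3:0] b,
    output reg  [3:0] muxed,    // assigned in an always block, so declared reg
    output reg  [3:0] first,
    output reg  [3:0] second
);

// always @(*) builds the sensitivity list from the signals that are read,
// so no input can be forgotten.
always @(*)
begin
    if (sel)
        muxed = a;
    else
        muxed = b;              // assigning on every path avoids an inferred latch
end

// Non-blocking assignments: both registers sample their inputs on the same
// clock edge, so second receives the previous value of first.
always @(posedge clock)
begin
    first  <= muxed;
    second <= first;
end

endmodule

`default_nettype wire   // restore the default for files compiled afterwards
```

If the clocked block used blocking assignments instead, second would take the new value of first within the same edge and the two registers would collapse into one stage.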
Appendix D – Figures and Diagrams

Figure 1 – Control FSM Logic Diagram (Patterson, Hennessy, page 338)

Figure 2 – High Level Datapath Design (Patterson, Hennessy, page 320)

Figure 3 – High Level Datapath Design with Control Logic (Patterson, Hennessy, page 323)
Appendix E – Sources

David A. Patterson, John L. Hennessy. Computer Organization and Design, Revised Printing 3rd Ed. New York: Elsevier, 2007.
