Multicycle CPU
jnirsch@iastate.edu, bholland@iastate.edu
Time Contribution:
Joey Nirschl’s Hours: 25 (50% of work)
Ben Holland’s Hours: 25 (50% of work)
Project Work Breakdown
Design (10%)
Programming (20%)
Testing (50%)
Documentation (20%)
Table of Contents
Purpose of Machine
Instruction Set Definition
Instruction Format
Design Methodology
Design
Testing Methodology
Conclusion
Lessons Learned
Appendix A – Verilog Code & Testbench
Appendix B – Simulation Results
Appendix C – Commonly Made Verilog Mistakes
Appendix D – Figures and Diagrams
Appendix E – Sources
1. Purpose of Machine:
The multicycle CPU design is an improvement on the single-cycle design. In this
implementation each instruction is executed in multiple stages, one stage per clock cycle.
This improves on the single-cycle design because an instruction uses only the three to
five stages it actually needs.
In our implementation every instruction shares the first two stages: instruction fetch
and instruction decode/register fetch. In the first stage the instruction is fetched
from memory and stored in the memory data register and the instruction register. The second
stage decodes the instruction as an R-type instruction, an I-type instruction, a branch
instruction, a jump instruction, or a memory access (load word or store word).
At stage three the instruction takes one of several logical paths depending on the instruction
type decoded in stage two. A finite state machine of these logical paths is described
in Figure 1 of Appendix D. Stage three is the last stage for branch and jump
instructions. After either of these completes, the logic cycle restarts with the fetch of
the next instruction in stage one.
Stage four occurs for R-type and I-type instructions and for instructions which require
memory access (load word, store word). Both store word and R/I-type instructions end in
stage four. R/I-type instructions must now store the ALU result in the register file. Store
word must store values to memory in this stage. The logic cycle then begins again at stage
one with the instruction fetch.
Stage five is used only by the load word instruction, which after reading the word
from memory in stage four still needs to write that word to the register file. After the
write to the register file is complete, the cycle continues with the fetch of the next
instruction in stage one.
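The stage counts described above can be summarized in a small Verilog sketch. This is only an illustration, not part of the project code: the module and function names are hypothetical, and the opcode values are those listed in the opcode table of this report.

```verilog
// Sketch: clock cycles per instruction, following the stage description above.
// cycles_sketch and cycles_for are hypothetical names used for illustration only.
module cycles_sketch;
  function [2:0] cycles_for;
    input [5:0] opcode;
    begin
      case (opcode)
        6'h04, 6'h02: cycles_for = 3; // beq and j finish in stage three
        6'h23:        cycles_for = 5; // lw still writes the register file in stage five
        default:      cycles_for = 4; // sw, R-type, and I-type ALU instructions
      endcase
    end
  endfunction

  initial begin
    $display("beq: %0d cycles", cycles_for(6'h04));
    $display("add: %0d cycles", cycles_for(6'h00));
    $display("sw : %0d cycles", cycles_for(6'h2B));
    $display("lw : %0d cycles", cycles_for(6'h23));
    $finish;
  end
endmodule
```

Run in a simulator, the sketch reports 3, 4, 4, and 5 cycles for beq, add, sw, and lw respectively.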
The control module of the datapath is responsible for sequencing the stages of each
instruction. The advantage of breaking instructions into stages is that fast instructions
complete in fewer stages than slow instructions. In a single-cycle design every
instruction executes in one long clock cycle, so the system waits during every
instruction for the time the longest instruction would take to finish. In the multicycle
design each stage needs only a short clock period, and some instructions finish one or two
stages sooner than the longest instruction, so the average time required to execute
a program is reduced.
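As a rough worked example of this trade-off, assume a hypothetical instruction mix (chosen for illustration, not measured from this project) of 25% load word, 10% store word, 52% R/I-type, 11% branch, and 2% jump instructions. With the stage counts described above (five for load word, four for store word and R/I-type, three for branch and jump), the average number of cycles per instruction would be:

```latex
\begin{align*}
\text{CPI}_{\text{multi}} &= \sum_i f_i \, c_i \\
  &= 0.25(5) + 0.10(4) + 0.52(4) + 0.11(3) + 0.02(3) \\
  &= 1.25 + 0.40 + 2.08 + 0.33 + 0.06 = 4.12
\end{align*}
```

If, as an idealization, the single-cycle clock period were five times the multicycle clock period, the multicycle design would be about 5 / 4.12 ≈ 1.21 times faster on this mix. Real stage delays are not perfectly balanced, so the actual gain depends on the implementation.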
This implementation of a multicycle CPU supports R-type instructions, I-type
instructions, and branch and jump instructions. Special logic has been added to the
control unit to support I-type instructions because I-type instructions were not
implemented in the original design by Patterson and Hennessy. The instruction set is modeled on
the MIPS (Microprocessor without Interlocked Pipeline Stages) instruction set. Aside from a few minor
differences in operation codes, the implemented instruction set follows the MIPS
convention. The instruction format is discussed in more detail in the Instruction Format section.
*Stages were added to the finite state machine to support additional functionality. The
FSM can be viewed in Appendix D. (The additional logic of each figure is indicated in red.)
2. Instruction Set Definition
The instructions included in this set are listed below:
add – Add, stores the addition of register source (rs) and register target (rt) into register
destination (rd).
R[rd]=R[rs] + R[rt]
sub – Subtract, stores the difference of register source (rs) and register target (rt) into register
destination (rd).
R[rd]=R[rs] - R[rt]
and – And, stores the bitwise and operation of register source (rs) and register target (rt) into
register destination (rd).
R[rd]=R[rs] & R[rt]
or – Or, stores the bitwise or operation of register source (rs) and register target (rt) into
register destination (rd).
R[rd]=R[rs] | R[rt]
xor - Xor, stores the bitwise xor operation of register source (rs) and register target (rt) into
register destination (rd).
R[rd]=R[rs] ^ R[rt]
slt – Set Less Than, stores 1 in register destination (rd) if register source (rs) is
less than register target (rt), and 0 otherwise.
if(R[rs]<R[rt]){R[rd]=1}
else{R[rd]=0}
beq – Branch Equal, if register source (rs) equals register target (rt), branches to the
current pc value + 4 + branch address, where the branch address is the immediate value
shifted left by two bits.
if(R[rs]==R[rt]){PC=PC+4+BranchAddress}
lw – Load Word, loads the 32-bit quantity at memory address register source (rs) + sign
extended immediate into register target (rt).
R[rt]=M[R[rs]+SignExtendedImmediate]
sw – Store Word, stores the 32-bit quantity in register target (rt) to memory address
register source (rs) + sign extended immediate.
M[R[rs]+SignExtendedImmediate]=R[rt]
addi – Add Immediate, stores the addition of register source (rs) and the sign extended
immediate value into register target (rt).
R[rt]=R[rs] + SignExtendedImmediate
andi - And Immediate, stores the bitwise and operation of register source (rs) and the zero
extended immediate value into register target (rt).
R[rt]=R[rs] & ZeroExtendedImmediate
xori - Xor Immediate, stores the bitwise xor operation of register source (rs) and the zero
extended immediate value into register target (rt).
R[rt]=R[rs] ^ ZeroExtendedImmediate
ori - Or Immediate, stores the bitwise or operation of register source (rs) and the zero
extended immediate value into register target (rt).
R[rt]=R[rs] | ZeroExtendedImmediate
slti – Set Less Than Immediate, stores 1 in register target (rt) if register source (rs)
is less than the sign extended immediate value, and 0 otherwise.
if(R[rs]<SignExtendedImmediate){R[rt]=1}
else{R[rt]=0}
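The difference between the sign extended and zero extended immediates used in the definitions above can be sketched as follows. This is a minimal illustration with hypothetical names (extend_sketch, sign_ext, zero_ext); it is not the project's own sign_extend or zero_extend module.

```verilog
// Sketch: sign extension versus zero extension of a 16-bit immediate.
// All names in this block are hypothetical and used for illustration only.
module extend_sketch;
  // Sign extension replicates bit 15 (used by addi, slti, lw, and sw).
  function [31:0] sign_ext;
    input [15:0] imm;
    begin
      sign_ext = {{16{imm[15]}}, imm};
    end
  endfunction

  // Zero extension pads the upper half with zeros (used by andi, ori, and xori).
  function [31:0] zero_ext;
    input [15:0] imm;
    begin
      zero_ext = {16'b0, imm};
    end
  endfunction

  initial begin
    // 16'hFFFF reads as -1 when sign extended and as 65535 when zero extended.
    $display("sign_ext(16'hFFFF) = %h", sign_ext(16'hFFFF)); // ffffffff
    $display("zero_ext(16'hFFFF) = %h", zero_ext(16'hFFFF)); // 0000ffff
    $finish;
  end
endmodule
```

This is why addi with an immediate of 0xFFFF subtracts one, while ori with the same immediate sets the low 16 bits.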
3. Instruction Format
The instruction format is different for each of the three instruction types. An R-type
instruction has six fields: the opcode, rs, rt, rd, shamt, and function
fields. The opcode for an R-type instruction is always zero. The function field defines
the operation of the R-type instruction (e.g., add, sub, and, or). The shamt field is used in
shifting operations (not implemented in this design). RD is the register destination,
where the operation result is stored after execution of the instruction. RS (register
source) and RT (register target) are the fields referencing the register values to be used in
the computational operation of the instruction.
The I-type instruction has four fields: the opcode, rs, rt, and immediate fields. The
opcode for an I-type instruction defines the operation of the instruction. RS (register
source) references a register operand. RT (register target) is the destination register
for immediate and load instructions and a second source register for branch and store
instructions. The immediate field can be used either as a constant value
or as a way to compute a memory address by sign extending the value.
The J-type instruction has an opcode just like the other two instructions in order to
define the instruction operation. The J-type instruction also has an address field which
can be used to jump to the specified memory address.
The figures below show the individual fields of each instruction type.
Instruction  Instruction Type  Opcode  Function
add          R                 0x00    0x20
sub          R                 0x00    0x22
and          R                 0x00    0x24
or           R                 0x00    0x25
xor          R                 0x00    0x01
slt          R                 0x00    0x2A
xori         I                 0x0F    N/A
beq          I                 0x04    N/A
lw           I                 0x23    N/A
sw           I                 0x2B    N/A
addi         I                 0x08    N/A
andi         I                 0x0C    N/A
ori          I                 0x0D    N/A
slti         I                 0x0A    N/A
j            J                 0x02    N/A
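The field positions within a 32-bit instruction word can be illustrated with a short Verilog sketch. The bit ranges follow the standard MIPS layout, which is also what the InstructionDecode module in Appendix A uses; the module name decode_sketch is hypothetical, and the example word is the add instruction stored at dataMemory[16] of the test program.

```verilog
// Sketch: slicing a 32-bit instruction word into its fields.
// decode_sketch is a hypothetical name used for illustration only.
module decode_sketch;
  reg  [31:0] instruction;

  wire [5:0]  opcode    = instruction[31:26]; // all formats
  wire [4:0]  rs        = instruction[25:21]; // R-type and I-type
  wire [4:0]  rt        = instruction[20:16]; // R-type and I-type
  wire [4:0]  rd        = instruction[15:11]; // R-type only
  wire [4:0]  shamt     = instruction[10:6];  // R-type only
  wire [5:0]  funct     = instruction[5:0];   // R-type only
  wire [15:0] immediate = instruction[15:0];  // I-type only
  wire [25:0] address   = instruction[25:0];  // J-type only

  initial begin
    // opcode=0, rs=23, rt=22, rd=23, shamt=0, funct=0x20 (add)
    instruction = 32'b000000_10111_10110_10111_00000_100000;
    #1; // let the continuous assignments settle
    $display("opcode=%h rs=%0d rt=%0d rd=%0d funct=%h",
             opcode, rs, rt, rd, funct);
    $finish;
  end
endmodule
```

For this word the sketch prints opcode=00 rs=23 rt=22 rd=23 funct=20, which matches the add row of the table.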
4. Design Methodology
The general approach to this design was to first map out a high level design of the
system. Our design was based on the ideas presented in the Computer Organization
and Design textbook written by David A. Patterson and John L. Hennessy. Figures 2 and
3 show a general outline of how our implementation was planned out on paper before
it was built. The red markings on the figures in Appendix D are the modifications
that were made to the design.
After the high level design we broke the datapath down into separate modules so
that we could divide work among team members and test the functionality of each module
individually. Testing each module individually was extremely important because it
allowed us to catch many errors in a controlled environment before they were obscured by
the activity of the entire system. Modularizing code also lets team members take
responsibility for, and specialize in, specific areas of the code, which creates more
efficient code than if the system were not modularized. A modular system also
accommodates change more easily: if new functionality is needed, either a new module
is added or the logic within one module is modified to meet the added requirement.
Although modularity is an important aspect, it is also important to note that the
original design must be good enough to allow the code to be modularized in the first place.
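As an illustration of testing one module in isolation, the following is a minimal self-checking testbench sketch. It assumes a stand-in ALU (alu_sketch, a hypothetical name) that uses the same operation codes as the ALU in Appendix A; the check task and the chosen test vectors are also assumptions made for this example.

```verilog
// Stand-in ALU using the operation codes from this report (hypothetical module).
module alu_sketch(aluctrl, valueA, valueB, result, zero);
  input  [3:0]  aluctrl;
  input  [31:0] valueA, valueB;
  output [31:0] result;
  output        zero;
  reg    [31:0] result;

  always @(*) begin
    case (aluctrl)
      4'b0000: result = valueA & valueB;                   // and
      4'b0001: result = valueA | valueB;                   // or
      4'b0010: result = valueA + valueB;                   // add
      4'b0101: result = valueA ^ valueB;                   // xor
      4'b0110: result = valueA - valueB;                   // sub
      4'b0111: result = (valueA < valueB) ? 32'd1 : 32'd0; // slt (unsigned compare)
      default: result = 32'b0;
    endcase
  end
  assign zero = (valueA == valueB); // equality flag used by beq
endmodule

// Self-checking testbench: compares each result against an expected value.
module alu_sketch_tb;
  reg  [3:0]  aluctrl;
  reg  [31:0] valueA, valueB;
  wire [31:0] result;
  wire        zero;
  integer     errors;

  alu_sketch dut(aluctrl, valueA, valueB, result, zero);

  task check;
    input [3:0]  ctrl;
    input [31:0] a, b, expected;
    begin
      aluctrl = ctrl; valueA = a; valueB = b;
      #1; // let the combinational logic settle
      if (result !== expected) begin
        errors = errors + 1;
        $display("FAIL ctrl=%b a=%h b=%h got=%h want=%h", ctrl, a, b, result, expected);
      end
    end
  endtask

  initial begin
    errors = 0;
    check(4'b0000, 32'hFFFFFFFF, 32'h0000000F, 32'h0000000F); // and
    check(4'b0001, 32'h000000F0, 32'h0000000F, 32'h000000FF); // or
    check(4'b0010, 32'd5,        32'd5,        32'd10);       // add
    check(4'b0101, 32'hFFFFFFFF, 32'h00000000, 32'hFFFFFFFF); // xor
    check(4'b0110, 32'd5,        32'd4,        32'd1);        // sub
    check(4'b0111, 32'd4,        32'd5,        32'd1);        // slt
    if (errors == 0) $display("all checks passed");
    else             $display("%0d checks failed", errors);
    $finish;
  end
endmodule
```

Compared with reading $monitor output by eye, a testbench of this kind reports a failure explicitly, which makes it easier to rerun after every change to a module.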
Once the system has been designed and implemented in pieces, it is only a matter
of combining the pieces to make the entire CPU. This is easy to do in
theory, but every project has unforeseen mistakes and logic errors.
Thankfully, because the system was designed well to begin with, there was room for
changes and modifications to correct the mistakes of the early implementation.
After an intensive debug period, the system was complete. At this point we were
able to fully document the entirety of the project and consider features to add or subtract
as well as other design changes.
5. Design
As mentioned earlier, the original design was roughly based on Figures 2 and 3
of Appendix D. Also as mentioned above, each of the core datapath functions was
implemented in a separate module. The actual project code can be seen in Appendix A.
Below is a top level view of the final datapath implementation. (The additional logic of
each figure is indicated in red.)
Top Level Design - Final Implementation
[Figure: RTL schematic of the top level datapath. Module instances visible in the schematic: InstructionDecode:IDStage, zero_extend:extender, sign_extend:extendImmediate, DataMemory:data, ALUMulticycle:ALU, ALUControlMulti:alucontrol, MulticycleControlFSM:mainControl, regFileRTL:RTL, twomux32:registerAmux, FiveToOneMux32:registerBmux, twomux32:writedata, and twomux5:writereg, together with the pc[31..0], ALUOut, register_A, register_B, memDataReg, and cycle[31..0] registers.]
*To examine details of design please use the zoom feature of your PDF viewer
Datapath Control - Final Implementation
[Figure: RTL schematic of the MulticycleControlFSM module. Inputs: opcode[5..0] and clk. Registered outputs: RegWrite, PCWrite, MemRead, IorD, forceAdd, ALUSrcA, ALUSrcB[2..0], RegDst, PCWriteCondition, PCSource[1..0], MemWrite, MemtoReg, and IRWrite. The next-state logic compares the opcode against 0x23, 0x2B, 0x08, 0x0A, 0x0D, 0x0C, 0x0F, 0x02, 0x04, and 0x00.]
*To examine details of design please use the zoom feature of your PDF viewer
ALU - Final Implementation
[Figure: RTL schematic of the ALUMulticycle module. Inputs: aluctrl[3..0], valueA[31..0], and valueB[31..0]. Outputs: result[31..0] and zero. Visible blocks: an adder, a less-than comparator, an equality comparator driving zero, and result multiplexers selected by aluctrl.]
*To examine details of design please use the zoom feature of your PDF viewer
ALU Control - Final Implementation
[Figure: RTL schematic of the ALUControlMulti module. Inputs: funct[5..0], opcode[5..0], and forceadd. Output: ALUIn[3..0], held in latches. The funct field is compared against 0x22, 0x01, 0x20, 0x25, 0x24, and 0x2A.]
*To examine details of design please use the zoom feature of your PDF viewer
6. Testing Methodology
Our testing methodology stems directly from our design methodology.
In the design methodology we broke important system functionality into separate modules
so that we could debug them individually and assign responsibility. This way each module can
be tested on its own, eliminating possible interference from other modules. Once each
module has been individually tested and is working, the system can be assembled
from the smaller modules. At this point it is just a matter of working out any
system integration issues and finding any bugs that were missed in the first stage. Once
the system was completely integrated, we decided that the best way to test the system as a
whole was to write a program that would exercise the functionality of the
entire system. With this test program, we confirmed that our datapath correctly
calculates the nth number of the Fibonacci sequence.
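A behavioral reference model is one way to obtain the expected value for such a system test. The sketch below mirrors the loop of the test program loaded into DataMemory in Appendix A, as decoded by hand from the binary words (a running sum a, the previous value b, and a counter n); the module and function names are hypothetical.

```verilog
// Sketch: reference model of the Fibonacci loop in the test program.
// fib_reference and fib are hypothetical names used for illustration only.
module fib_reference;
  function [31:0] fib;
    input [31:0] n;
    reg [31:0] a, b, i;
    begin
      a = 0;          // corresponds to add  $23,$0,$0
      b = 1;          // corresponds to addi $22,$0,1
      for (i = 0; i < n; i = i + 1) begin
        a = a + b;    // add $23,$23,$22
        b = a - b;    // sub $22,$23,$22 (recovers the previous value of a)
      end
      fib = a;        // value the program stores with sw
    end
  endfunction

  initial begin
    $display("fib(20) = %0d", fib(20)); // 6765
    $finish;
  end
endmodule
```

With N=20, as in the program image, the value stored by the final sw instruction should therefore be 6765.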
7. Conclusion
Our Computer Engineering 305 project drew on material accumulated from
Cpre305 and previous courses. The knowledge we needed to complete this project
included an understanding of multicycle CPUs, datapaths, control units, finite state
machines, digital logic, and Verilog. With this knowledge, we were able to build
individual logic modules and integrate those modules to create our multicycle processor.
The processor supports fifteen MIPS instructions. In the process of
building the CPU, we added logic to the design presented in the textbook by Patterson
and Hennessy to fully support our multicycle design.
8. Lessons Learned
• Save often. ModelSim has a bad habit of crashing in the lab. The more often you save, the
less work will be lost when a program or computer crashes.
• Make backups. If all else fails, you have a backup.
• Use comments. When working with others, comments allow them to understand your
code. The fewer comments provided, the harder it may be for someone to understand your
code in the future.
• Create block schematics. Block schematics help in understanding the big picture. If the
block diagram generated from the Verilog code does not look correct, it can point to
mistakes in the code and clarify the high level design.
Appendix A – Verilog Code & Testbench
// for debug
reg[31:0] cycle=0;
// control variables
wire regdst, memread, memwrite, regwrite, memtoreg;
wire pcwritecond, pcwrite,iord,irwrite, aluscra, zero;
wire [1:0] pcsource;
wire [2:0] alusrcb;
wire [31:0] jumpaddress,alu_out, mem_out;
wire [31:0] branchCondition;
wire[3:0] aluCtrl;
// other variables
reg [31:0] pc = 32'b0;
reg [31:0] ALUOut;
reg [31:0] register_A, register_B;
//MemoryDataRegister holds data from memory that may be written into register
always@(posedge clock)
memDataReg <= mem_out;
//Registers hold value until positive edge of clock, when they are updated
//(non-blocking assignments avoid races between clocked blocks)
always@(posedge clock)
begin
register_A <= regA;
register_B <= regB;
end
twomux32 registerAmux(aluscra,register_A,pc,valueA);
FiveToOneMux32 registerBmux(alusrcb,zeroextendvalue,imm_value<<2,
imm_value,4,register_B,valueB);
//Main ALU
ALUMulticycle ALU(aluCtrl,valueA,valueB,alu_out,zero);
//temp ALU out register holds value from alu until updated on posedge clock
always@(posedge clock)
begin
ALUOut <= alu_out;
end
JumpAddress jumpTo(pc,address,jumpaddress);
//Mux chooses next data to pc depending on control
ThreeToOneMux32
branchesAndJumps(pcsource,jumpaddress,ALUOut,alu_out,branchCondition);
wire brachwritecond, gotoNextPc;
assign brachwritecond = pcwritecond & zero;
assign gotoNextPc =pcwrite | brachwritecond;
//PC update
always @ (posedge clock)
begin
if(gotoNextPc)
pc <= branchCondition;
end
endmodule// END: MultiCycle
A:begin
RegDst=0;
MemtoReg=0;
PCWriteCondition=0;
MemWrite=0;
RegWrite=0;
forceAdd=1;
end
B:begin
debug = 4'b0001;
ALUSrcA=0;
ALUSrcB=3'b011;
IorD=0;
PCSource=0;
RegDst=0;
MemtoReg=0;
MemRead=0;
PCWriteCondition=0;
PCWrite=0;
MemWrite=0;
IRWrite=0;
RegWrite=0;
forceAdd=1;
//if lw or sw nextstate = C
//if(opcode==35 || opcode==43)
if(opcode==6'b100011 || opcode==6'b101011)
begin
next_state=C;
end
//if j nextstate = J
//if(opcode==2)
else if(opcode==6'b000010)
begin
next_state=J;
end
//IType instruction, treat as R-Type
//because ALU control will take care of proper execution
debug = 4'b0010;
ALUSrcA=1;
ALUSrcB=3'b010;
IorD=0;
PCSource=0;
RegDst=0;
MemtoReg=0;
MemRead=0;
PCWriteCondition=0;
PCWrite=0;
MemWrite=0;
IRWrite=0;
RegWrite=0;
forceAdd=0;
D:begin
debug = 4'b0011;
MemRead = 1;
IorD=1;
ALUSrcA=0;
ALUSrcB=0;
PCSource=0;
RegDst=0;
MemtoReg=0;
PCWriteCondition=0;
PCWrite=0;
MemWrite=0;
IRWrite=0;
RegWrite=0;
next_state=E;
forceAdd=0;
end
E:begin
debug = 4'b0100;
RegDst=1'b0;
RegWrite = 1;
MemtoReg=1'b1;
next_state=A;
ALUSrcA = 0;
IorD = 0;
ALUSrcB = 0;
PCSource = 0;
MemRead = 0;
PCWriteCondition = 0;
PCWrite = 0;
MemWrite = 0;
IRWrite = 0;
forceAdd=0;
end
F:begin
debug = 4'b0101;
MemWrite = 1'b1;
IorD=1'b1;
next_state=A;
ALUSrcA = 0;
ALUSrcB = 0;
PCSource = 0;
RegDst = 0;
MemtoReg = 0;
MemRead = 0;
PCWriteCondition = 0;
PCWrite = 0;
IRWrite = 0;
RegWrite = 0;
forceAdd=0;
end
G:begin
debug = 4'b0110;
ALUSrcA=1;
ALUSrcB=3'b000;
next_state=H;
IorD = 0;
PCSource = 0;
RegDst = 0;
MemtoReg = 0;
MemRead = 0;
PCWriteCondition = 0;
PCWrite = 0;
MemWrite = 0;
IRWrite = 0;
RegWrite = 0;
forceAdd=0;
end
H:begin
//For RType or IType, if not RType, it is IType
//if IType regDst = 0
debug = 4'b0111;
RegDst=1'b1;
RegWrite = 1;
MemtoReg=1'b0;
next_state=A;
ALUSrcA = 0;
IorD = 0;
ALUSrcB = 0;
PCSource = 0;
MemRead = 0;
PCWriteCondition = 0;
PCWrite = 0;
MemWrite = 0;
IRWrite = 0;
forceAdd=0;
end
I:begin
debug = 4'b1000;
ALUSrcA=1;
ALUSrcB=3'b000;
PCWriteCondition = 1;
PCSource=2'b01;
next_state=A;
IorD = 0;
RegDst = 0;
MemtoReg = 0;
MemRead = 0;
PCWrite = 0;
MemWrite = 0;
IRWrite = 0;
RegWrite = 0;
forceAdd=0;
end
J:begin
debug = 4'b1001;
PCWrite = 1;
PCSource=2'b10;
next_state=A;
ALUSrcA = 0;
IorD = 0;
ALUSrcB = 0;
RegDst = 0;
MemtoReg = 0;
MemRead = 0;
PCWriteCondition = 0;
MemWrite = 0;
IRWrite = 0;
RegWrite = 0;
forceAdd=0;
end
K:begin
debug = 4'b1010;
ALUSrcA = 1;
ALUSrcB = 3'b010;
MemtoReg = 0;
IorD = 0;
RegDst = 0;
MemRead = 0;
PCWriteCondition = 0;
PCWrite=0;
MemWrite = 0;
IRWrite = 0;
RegWrite = 0;
PCSource=2'b00;
next_state=L;
forceAdd=0;
end
L:
begin
debug = 4'b1011;
RegDst = 0;
RegWrite = 1;
MemtoReg = 0;
IorD = 0;
MemRead = 0;
PCWriteCondition = 0;
PCWrite=0;
MemWrite = 0;
IRWrite = 0;
ALUSrcA = 0;
ALUSrcB = 3'b000;
PCSource=2'b00;
next_state=A;
forceAdd=0;
end
M:
begin
debug = 4'b1100;
ALUSrcA = 1;
ALUSrcB = 3'b100;
MemtoReg = 0;
IorD = 0;
RegDst = 0;
MemRead = 0;
PCWriteCondition = 0;
PCWrite=0;
MemWrite = 0;
IRWrite = 0;
RegWrite = 0;
PCSource=2'b00;
next_state=L;
forceAdd=0;
end
endcase
end
endmodule//END: MulticycleControlFSM
end
MulticycleControlFSM test(op,clock,ALUSrcA,IorD,ALUSrcB,PCSource,RegDst,MemtoReg,
MemRead,PCWriteCondition, PCWrite, MemWrite, IRWrite, RegWrite);
4'b0000://Bitwise And
begin
result = valueA & valueB;
end
4'b0001://Bitwise Or
begin
result = valueA | valueB;
end
4'b0010://Add
begin
result = valueA + valueB;
end
4'b0101://Xor
begin
result = valueA ^ valueB;
end
4'b0110://Sub
begin
result = valueA - valueB;
end
4'b0111://Slt (the comparison is unsigned unless valueA and valueB are declared signed)
begin
result = valueA < valueB ? 1:0;
end
endcase
if(valueA==valueB)
begin
zero=1'b1;
end
else
begin
zero=1'b0;
end
end
endmodule
module testALUMultiCycle;
reg [3:0] aluctrl;
reg [31:0] valueA;
reg [31:0] valueB;
wire [31:0] result;
wire zero;
initial
begin
//AND
aluctrl = 4'b0000;
valueA = 0;
valueB = 4294967295;
$monitor("AND -> aluctrl: %b | valueA: %b (%d) | valueB: %b (%d) | Result= %b (%d) | Zero = %b",aluctrl,valueA,valueA,valueB,valueB,result,result,zero);
#5
aluctrl = 4'b0000;
valueA = 4294967295;
valueB = 4294967295;
$monitor("AND -> aluctrl: %b | valueA: %b (%d) | valueB: %b (%d) | Result= %b (%d) | Zero = %b",aluctrl,valueA,valueA,valueB,valueB,result,result,zero);
//OR
#5
aluctrl = 4'b0001;
valueA = 4294967295;
valueB = 4294967295;
$monitor("OR -> aluctrl: %b | valueA: %b (%d) | valueB: %b (%d) | Result= %b (%d) | Zero = %b",aluctrl,valueA,valueA,valueB,valueB,result,result,zero);
#5
aluctrl = 4'b0001;
valueA = 0;
valueB = 0;
$monitor("OR -> aluctrl: %b | valueA: %b (%d) | valueB: %b (%d) | Result= %b (%d) | Zero = %b",aluctrl,valueA,valueA,valueB,valueB,result,result,zero);
//Add
#5
aluctrl = 4'b0010;
valueA = 5;
valueB = 5;
$monitor("ADD -> aluctrl: %b | valueA: %b (%d) | valueB: %b (%d) | Result= %b (%d) | Zero = %b",aluctrl,valueA,valueA,valueB,valueB,result,result,zero);
#5
aluctrl = 4'b0010;
valueA = 0;
valueB = 4294967295;
$monitor("ADD -> aluctrl: %b | valueA: %b (%d) | valueB: %b (%d) | Result= %b (%d) | Zero = %b",aluctrl,valueA,valueA,valueB,valueB,result,result,zero);
//XOR
#5
aluctrl = 4'b0101;
valueA = 0;
valueB = 1;
$monitor("XOR -> aluctrl: %b | valueA: %b (%d) | valueB: %b (%d) | Result= %b (%d) | Zero = %b",aluctrl,valueA,valueA,valueB,valueB,result,result,zero);
#5
aluctrl = 4'b0101;
valueA = 4294967295;
valueB = 0;
$monitor("XOR -> aluctrl: %b | valueA: %b (%d) | valueB: %b (%d) | Result= %b (%d) | Zero = %b",aluctrl,valueA,valueA,valueB,valueB,result,result,zero);
//Subtract
#5
aluctrl = 4'b0110;
valueA = 5;
valueB = 4;
$monitor("SUBTRACT -> aluctrl: %b | valueA: %b (%d) | valueB: %b (%d) | Result= %b (%d) | Zero = %b",aluctrl,valueA,valueA,valueB,valueB,result,result,zero);
#5
aluctrl = 4'b0110;
valueA = 4294967295;
valueB = 0;
$monitor("SUBTRACT -> aluctrl: %b | valueA: %b (%d) | valueB: %b (%d) | Result= %b (%d) | Zero = %b",aluctrl,valueA,valueA,valueB,valueB,result,result,zero);
//SLT
#5
aluctrl = 4'b0111;
valueA = 5;
valueB = 4;
$monitor("SLT -> aluctrl: %b | valueA: %b (%d) | valueB: %b (%d) | Result= %b (%d) | Zero = %b",aluctrl,valueA,valueA,valueB,valueB,result,result,zero);
#5
aluctrl = 4'b0111;
valueA = 0;
valueB = 4294967295;
$monitor("SLT -> aluctrl: %b | valueA: %b (%d) | valueB: %b (%d) | Result= %b (%d) | Zero = %b",aluctrl,valueA,valueA,valueB,valueB,result,result,zero);
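end
//Device under test; port order follows the ALUMulticycle instantiation in the datapath
ALUMulticycle ALU(aluctrl,valueA,valueB,result,zero);
endmodule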
//ALU control
module ALUControlMulti(funct, opcode,forceadd, ALUIn);
input [5:0]funct;
input [5:0] opcode;
input forceadd;
output [3:0]ALUIn;
reg [3:0]ALUIn;
//Xor
else if(funct == 6'b000001)
begin
ALUIn = 4'b0101;
end
//Sub
else if(funct==6'b100010)
begin
ALUIn = 4'b0110;
end
//Slt
else if(funct==6'b101010)
begin
ALUIn = 4'b0111;
end
end//end R-type
//Begin I-Type
//AndI
6'b001100://C
begin
ALUIn = 4'b0000;
end
//OrI
6'b001101://D
begin
ALUIn = 4'b0001;
end
//XorI
6'b001111://F
begin
ALUIn = 4'b0101;
end
//SltI
6'b001010://A
begin
ALUIn = 4'b0111;
end
//AddI
6'b001000://8
begin
ALUIn = 4'b0010;
end
//Branch
6'b000100://4
begin
ALUIn = 4'b0010;
end
//LW
6'b100011:
begin
ALUIn = 4'b0010;
end
//SW
6'b101011:
begin
ALUIn = 4'b0010;
end
//End I-Type
//jump
6'b000010:
begin
ALUIn=4'b0010;
end
endcase //endcase
end//end else
end//end always
endmodule
initial
begin
dataMemory[0] = 32'b00100000000101010000000000010100;//addi $21,$0,20 (N=20)
dataMemory[4] = 32'b00000000000000001011100000100000;//add $23,$0,$0
dataMemory[8] = 32'b00010010101000000000000000000110;//beq $21,$0,6 (to address 36)
dataMemory[12] = 32'b00100000000101100000000000000001;//addi $22,$0,1
dataMemory[16] = 32'b00000010111101101011100000100000;//add $23,$23,$22
dataMemory[20] = 32'b00000010111101101011000000100010;//sub $22,$23,$22
dataMemory[24] = 32'b00100010101101011111111111111111;//addi $21,$21,-1
dataMemory[28] = 32'b00010010101000000000000000000001;//beq $21,$0,1 (to address 36)
dataMemory[32] = 32'b00010000000000001111111111111011;//beq $0,$0,-5 (back to address 16)
dataMemory[36] = 32'b10101100000101110000000000000001;//sw $23,1($0)
end
always@(memWrite or memRead or Address)
begin
if(memWrite == 1'b1)
begin
dataMemory[Address]=writeData;
end
if(memRead ==1'b1)
begin
readData=dataMemory[Address];
end
end
endmodule//END: DataMemory
reg memWrite,memRead;
reg [5:0] Address;
reg [31:0] writeData;
wire [31:0] readData;
initial
begin
$monitor("Time=%d, memWrite=%d, memRead=%d, Address=%d, writeData=%d,readData=%d ",
$time,memWrite,memRead,Address, writeData,readData);
end
initial
begin
memRead=1;
#20 Address=4;
#20 memRead=0;
#20 Address=1;
#20 memWrite=1;
#20 writeData=32'b1;
#20 $stop;
end
//Register File
module regFileRTL(clock,regWrite,inData,wrReg,readA, readB,regA,regB);
input clock;
input regWrite;
input [31:0] inData;
input [4:0] wrReg;
input [4:0] readA;
input [4:0] readB;
output [31:0] regA;
output [31:0] regB;
reg [31:0] registerFiles[31:0];
initial
begin
registerFiles[5'b00000] = 32'b0;
end
always@(posedge clock)
begin
end
endmodule//END: regFileRTL
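//Note: assign inside an always block is a procedural continuous assignment; once it has
//executed at the first positive clock edge, each output follows instruction continuously
always@(posedge clock)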
begin
assign opcode = instruction[31:26];
assign rs = instruction[25:21];
assign rt = instruction[20:16];
assign rd = instruction[15:11];
assign shamt = instruction[10:6];
assign funct = instruction[5:0];
assign immediate = instruction[15:0];
assign address = instruction[25:0];
end
endmodule// END: InstructionDecode
initial
begin
$monitor("Time=%d, instOp=%d,%d,instRs=%d,%d,instRt=%d,%d,instRd=%d,%d,instShT=%d,%d,instFt=%d,%d,instImm=%d,%d,instAdd=%d;%d",
$time,instr[31:26], opcode,instr[25:21],
rs,instr[20:16],rt,instr[15:11],rd,instr[10:6],shamt,instr[5:0],funct,instr[15:0],
immediate,instr[25:0], address);
clock=0;
end
always
#2 clock= ~clock;
initial
begin
instr = 32'b00100000000101010000000000010001;
#20 instr = 32'b00000000000000001011100000100000;
#20 instr = 32'b00010010101000000000000000000110;
#20 instr = 32'b00100000000101100000000000000001;
#20 instr = 32'b00000010111101101011100000100000;
#20 instr = 32'b00000010111101101011000000100010;
#20 instr = 32'b00100010101101011111111111111111;
#20 instr = 32'b00010010101000000000000000000001;
#20 instr = 32'b00010000000000001111111111111011;
#20 instr = 32'b10101100000101110000000000000001;
#20 $stop;
end
InstructionDecode testdecode(clock,instr, opcode, rs,rt,rd,shamt,funct, immediate, address);
endmodule//END:AInstrTest
//JumpAddress
module JumpAddress(pc,address,newAddress);
input [31:0] pc;
input [25:0] address;
output [31:0] newAddress;
reg [31:0] newAddress;
always@(pc or address)
begin
newAddress[31:28] = pc[31:28];
newAddress[27:0] = (address <<2);
end
endmodule//END: JumpAddress
//JumpAddress testbench
module AJumpAddressTest;
reg [31:0] pc;
reg [25:0] addr;
wire [31:0] newAddr;
integer x,y;
initial
begin
$monitor(" Time=%d, pc=%d, addr=%d, newAddr=%d", $time,pc,addr,newAddr);
end
initial
begin
x=0;
y=0;
addr = 26'b0;
pc = 32'b00010000000000000000000000000000;
for(x = 0; x < 32; x=x+1)
begin
#10 addr=x;
end
pc = 32'b00110000000000000000000000000000;
addr = 26'b0; // addr is 26 bits wide; the former 32-bit constant was truncated to zero
for(y = 0; y < 32; y=y+1)
begin
#10 addr=y;
end
#20 $stop;
end
JumpAddress jumptest(pc,addr,newAddr);
endmodule
always@(a or x1 or x0)
begin
if(a == 1'b1)
begin
x = x1;
end
else if(a==1'b0)
begin
x = x0;
end
end
endmodule//END: twomux5
end
endmodule// END:ThreeToOneMux32
//FiveToOneMux32 has a datapath of 32 bits and a choice of five elements
module FiveToOneMux32(select,x4,x3,x2,x1,x0,out);
input [2:0] select;
input [31:0] x4,x3,x2,x1,x0;
output [31:0] out;
reg [31:0] out;
always@(select or x0 or x1 or x2 or x3 or x4)
begin
// out keeps its previous value for any select value not listed
case(select)
3'b000: out = x0;
3'b001: out = x1;
3'b010: out = x2;
3'b011: out = x3;
3'b100: out = x4;
endcase
end
endmodule
Appendix B – Simulation Results
The following simulation results are from a program we wrote that calculates the nth number of the
Fibonacci sequence. In this simulation n was set to 20. The simulation computed the 20th
Fibonacci number as 6765, which is indeed correct.
addi $21,$0,20
add $23,$0,$0
beq $21,$0,end
addi $22, $0,1
loop:
add $23,$23,$22
sub $22,$23,$22
addi $21,$21,-1
beq $21,$0,end
beq $0,$0, loop
end:
sw $23,1($0)
To double-check our binary math, we assembled our code in the MIPS simulator SPIM. The machine
code loaded into memory is:
dataMemory[0] = 32'b00100000000101010000000000010100;
dataMemory[4] = 32'b00000000000000001011100000100000;
dataMemory[8] = 32'b00010010101000000000000000000110;
dataMemory[12] = 32'b00100000000101100000000000000001;
dataMemory[16] = 32'b00000010111101101011100000100000;
dataMemory[20] = 32'b00000010111101101011000000100010;
dataMemory[24] = 32'b00100010101101011111111111111111;
dataMemory[28] = 32'b00010010101000000000000000000001;
dataMemory[32] = 32'b00010000000000001111111111111011;
dataMemory[36] = 32'b10101100000101110000000000000001;
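As an additional cross-check, the register arithmetic of the program can be mirrored in a few lines of behavioral Verilog. This is a minimal sketch and not part of the CPU design: the module name FibReference and the integer variables r21, r22 and r23 (standing in for $21, $22 and $23) are assumptions made for illustration.

```verilog
// Behavioral reference for the Fibonacci program; not part of the CPU design.
module FibReference;
integer r21, r22, r23; // stand-ins for $21 (counter), $22 (previous), $23 (current)
initial
begin
r21 = 20; // addi $21,$0,20
r23 = 0; // add $23,$0,$0
r22 = 1; // addi $22,$0,1
// The two beq tests on $21 leave the loop once the counter reaches zero.
while(r21 != 0)
begin
r23 = r23 + r22; // add $23,$23,$22
r22 = r23 - r22; // sub $22,$23,$22
r21 = r21 - 1; // addi $21,$21,-1
end
$display("fib(20) = %0d", r23); // value that sw $23,1($0) stores
end
endmodule//END: FibReference
```

Run in a simulator, this displays fib(20) = 6765, the same value the CPU simulation stores in dataMemory[1].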
On the following pages are the results of the simulation running the program described above.
[Waveform, page 1 of 2. Entity: AMultiCycleTest. Date: Sun Dec 02 8:30:30 PM Central Standard Time 2007.
Time axis: 0 to 12 us. Signals traced under /AMultiCycleTest/testcpu: clock, cycle, alu_out, mem_out, pc,
regdst, memread, memwrite, regwrite, memtoreg, zero, pcwritecond, pcwrite, iord, irwrite, aluscra,
pcsource, alusrcb, jumpaddress, branchCondition, aluCtrl, ALUOut, register_A, register_B, memAddress,
opCode, regToWrite, rs, rt, immediatevalue, shamt, funct, address, memDataReg, regA, regB, regWriteData,
imm_value, valueA, valueB, forceadd, zeroextendvalue, brachwritecond.
Values legible in the trace: pc = memAddress = valueA = 00000000000000000000000000101100.]
[Waveform, page 2 of 2. Time axis: 0 to 12 us. Signals traced under /AMultiCycleTest/testcpu: gotoNextPc,
RTL/registerFiles[31:0], data/dataMemory[1]. Recorded value sequences:]
registerFiles[23]: 0 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765
registerFiles[22]: 1 0 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181
registerFiles[21]: 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
registerFiles[0]: 0
dataMemory[1]: 6765
Appendix C – Commonly Made Verilog Mistakes
• When a register or wire name is misspelled, ModelSim can compile the design without any
error or warning, because Verilog implicitly declares an undeclared name used in a module
port connection as a one-bit wire. The design will then not function correctly.
• ModelSim issues a warning, but not an error, when a register is assigned a value that is
wider than the register. Only the low-order bits of the assigned value are kept, usually
to the inconvenience of the developer.
• Module names should match exactly the name of the file that holds the module.
Although this is not a strict rule of Verilog, it is good practice because some programs
like Quartus II depend on this naming scheme for some applications.
• It is important to pay close attention to the sensitivity list of an always block. If a
variable that is used inside the block is missing from the sensitivity list, the block is not
re-evaluated when that variable changes, and it may appear not to run at all. This is a
confusing issue to find when debugging code.
• In Verilog an output must also be declared as a reg if it is assigned inside an always or
initial block.
• Begin and end statements must be used properly. Not having an end statement to
accompany a begin statement will cause problems in the code.
• To connect an output of one module to another module, a wire must be used. Connecting a
reg to the output port of a module instance will cause a compilation error.
• Blocking vs. non-blocking assignment statements: misunderstanding the difference
between these assignment statements can cause subtle errors in the behavior of
Verilog code. This is also a very hard issue to debug.
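The difference between blocking and non-blocking assignment is easiest to see when two registers exchange values. The following is a minimal sketch written for illustration only: the module name SwapDemo and the registers blkA, blkB, nbA and nbB are hypothetical and do not appear in the CPU design.

```verilog
// Illustration only: all names in this module are hypothetical.
module SwapDemo(clock);
input clock;
reg [7:0] blkA, blkB; // updated with blocking assignments
reg [7:0] nbA, nbB; // updated with non-blocking assignments
initial
begin
blkA = 8'd1; blkB = 8'd2;
nbA = 8'd1; nbB = 8'd2;
end
// Blocking: statements execute in order, so the second statement reads
// the value the first one has just written. Both registers end up equal.
always@(posedge clock)
begin
blkA = blkB;
blkB = blkA;
end
// Non-blocking: both right-hand sides are sampled before either register
// is updated, so the two values are exchanged.
always@(posedge clock)
begin
nbA <= nbB;
nbB <= nbA;
end
endmodule//END: SwapDemo
```

After one rising clock edge blkA and blkB both hold 2, while nbA holds 2 and nbB holds 1. Clocked registers such as those in a multicycle datapath are therefore usually written with non-blocking assignments.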
Appendix D – Figures and Diagrams