Prepared By: Babak Keshavarz (babak@ece.uvic.ca)
CENG450_Project_Report
RISC Based Processor on FPGA
CONTENTS
Contents
List of Figures
List of Tables
Introduction
Objective
Design Requirement
Project Outline
System Design
The DataPath
The Control Unit
Detailed Description of the Design
A-Format
B-Format:
L-Format:
Simulation Results
Test Program 1
Test Program 2
Analysis and Discussion
A-Format
B-Format
L-Format
Processor I/O and Hardware Implementation
Conclusion
Appendix (Tables)
FSM Control States (VHDL)
FSM Control Signals
Instruction Set
Program Test 1
Program Test 2
Reference
LIST OF FIGURES
Fig.1 Block Diagram of the Processor Datapath
Fig.2 Register File Block Diagram
Fig.3 ALU Block Diagram
Fig.4 FSM Control
Fig.5 The Actual Pipeline 5-Stages
Fig.7 A-Format Instructions
Fig.8 A-Format Datapath
Fig.9 B-Format Instructions
Fig.10 B-Format Datapath
Fig.11 L-Format Instructions
Fig.12 L-Format Datapath
Fig.13 Flow-Chart of Test Program1
Fig.14 PC3-10: 30,FF & 34,0C (LOADIMM)
Fig.15 PC15-17: 51 (SUB) & 34,0C (LOADIMM)
Fig.16 PC22-27: D6 (MOV) & B8 (IN)
Fig.17 PC32-34: 28,70 (STORE) & 49 (ADD)
Fig.18 PC39-44: 88 (NAND) & C8 (OUT 3C)
Fig.19 PC45-51: 10,70 (LOAD) & 70 (SHR)
Fig.20 PC56-61: 48 (ADD) & C8 (OUT 9C)
Fig.21 Overall Simulation of Test1
Fig.22 Flow-Chart of Test Program2
Fig.23 1st Loop: 1st ADD
Fig.24 2nd Loop: 2nd ADD
Fig.25 3rd Loop: 3rd ADD
Fig.26 4th Loop: 4th ADD
Fig.27 5th Loop: 1st NAND
Fig.28 6th Loop: 2nd NAND
Fig.29 7th Loop: 3rd NAND
Fig.30 8th Loop: 4th NAND
Fig.31 9th Loop: Start Again
Fig.32 A-Format Simulation Example ( 49: ADD r2, r1)
Fig.32 B-Format Simulation Example ( 96: BR r2)
Fig.33 L-Format Simulation Example ( 34,0C: LOADIMM r1, 0C)
Fig.34 Test Program 2 running on Spartan FPGA
LIST OF TABLES
INTRODUCTION
OBJECTIVE
The objective is to design and implement a pipelined processor on a Xilinx Spartan-3E
FPGA. The project uses a RISC-like instruction set in which instructions are 1 or 2 bytes
long, depending on the instruction type. There are four 1-byte general-purpose registers:
R0, R1, R2, and R3. The RAM and the ROM each have a byte-addressable address space
of 256 bytes. The program counter (PC) points to the next instruction to be executed.
For the call-subroutine instruction (BR.SUB), a special link register (LR) holds the
address of the instruction after BR.SUB.
DESIGN REQUIREMENT
There are three instruction formats:
1) A-Format: Arithmetic instructions, which affect the zero flag (Z) and the negative flag (N).
2) B-Format: Branch instructions (op-code, brx, which determines the type of branch, and rb).
3) L-Format: Load/store instructions (two bytes: the first holds the op-code and ra, the
second holds a memory address or an immediate value).
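The field layout of the first instruction byte can be illustrated with a short Python sketch. It assumes the op-code occupies bits 7..4, ra (or brx for B-format) bits 3..2, and rb bits 1..0; this layout is inferred from the instruction bytes listed in the Appendix, and the function name `decode_fields` is hypothetical.

```python
def decode_fields(byte):
    """Split the first instruction byte into (opcode, mid, low).

    Assumed layout: op-code in bits 7..4, ra (or brx for B-format) in
    bits 3..2, rb in bits 1..0. L-format leaves bits 1..0 unused.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("instruction byte must fit in 8 bits")
    opcode = (byte >> 4) & 0xF
    mid = (byte >> 2) & 0x3
    low = byte & 0x3
    return opcode, mid, low


# Bytes taken from the test-program listings in the Appendix.
print(decode_fields(0x49))  # ADD r2, r1
print(decode_fields(0x96))  # branch with brx = 1 through r2
print(decode_fields(0x34))  # LOADIMM r1 (the second byte holds the immediate)
```

Under this assumed layout, 0x49 splits into op-code 4 with ra = 2 and rb = 1, which matches the listing entry ADD r2, r1.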
PROJECT OUTLINE
The rest of the report is organized as follows:
System Design, where the overall design is presented, showing the block diagram of
the datapath and the finite state machine of the control unit.
Detailed Description of the Design, where every format is discussed in detail with
specific illustrations and examples (data flow, state diagrams, and simulation
examples).
Simulation Results of two test programs, where the test code is stored in ROM.
Analysis and Discussion, where the simulation results are discussed and the test
programs are shown to execute correctly in simulation and in hardware.
Conclusion.
SYSTEM DESIGN
The overall design of the processor is presented in this section, showing the block
diagram of the datapath and the finite state machine of the control unit. The instruction
set architecture (ISA) of our mini processor is RISC-like, and instructions are 1 or 2 bytes
long depending on the instruction type. (Fig.1) illustrates the actual datapath of our
processor, including the control unit. The following subsections give more details about
these two main components, the datapath and the control unit.
THE DATAPATH
[Figure: one-byte instruction layout with fields Op-Code, Ra and Rb (bit labels 4 and 0)]
[Figure: one-byte instruction layout with fields Op-Code, Ra and Rb (bit labels 4 and 0)]
Branch instructions name a register that holds the target ROM address for the jump.
On receiving a branch, the controller enables a read of the address from the register
file and asserts the br_sel line. The address read is then loaded into the PC. (Fig.10)
shows the possible data flow when executing a B-format instruction.
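The branch behaviour can be summarised in a minimal Python sketch that follows the Instruction Set table in the Appendix (op-code 9 with brx selecting BR, BR.Z, BR.N or BR.SUB, and op-code 14 for RETURN). The function name `next_pc` and the 8-bit wrap of PC + 1 are assumptions made for illustration, not part of the VHDL design.

```python
def next_pc(opcode, brx, pc, lr, rb_value, z, n):
    """Return (new_pc, new_lr) for a B-format instruction.

    opcode 9 covers BR / BR.Z / BR.N / BR.SUB (selected by brx);
    opcode 14 is RETURN. rb_value is the content of R[rb].
    """
    fall_through = (pc + 1) & 0xFF  # assumed 8-bit program counter
    if opcode == 14:                # RETURN: PC <- LR
        return lr, lr
    if opcode != 9:
        raise ValueError("not a branch op-code")
    if brx == 0:                    # BR: unconditional
        return rb_value, lr
    if brx == 1:                    # BR.Z: taken when Z = 1
        return (rb_value if z else fall_through), lr
    if brx == 2:                    # BR.N: taken when N = 1
        return (rb_value if n else fall_through), lr
    if brx == 3:                    # BR.SUB: LR <- PC + 1, PC <- R[rb]
        return rb_value, fall_through
    raise ValueError("brx must be in 0..3")


# BR.SUB at address 20 to a routine at 0x40, followed by RETURN.
pc, lr = next_pc(9, 3, pc=20, lr=0, rb_value=0x40, z=0, n=0)
print(pc, lr)
pc, lr = next_pc(14, 0, pc=pc, lr=lr, rb_value=0, z=0, n=0)
print(pc, lr)
```

The addresses 20 and 0x40 in the usage lines are made-up values chosen only to show the link register being saved and restored.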
L-FORMAT:
L-format instructions are two bytes long and are used for load/store operations. The
first byte holds the op-code and ra, and the second byte holds a memory address or an
immediate value. (Fig.11) shows the L-format instructions, while (Table 3) in the
Appendix gives their details. M[ea] represents the content of memory at address ea.
Note that in L-format instructions, the least significant two bits of the first byte are
unused.
[Fig.11 L-Format Instructions: the first byte holds Op-Code (bits 7..4) and Ra; the second byte holds the Effective Address / Immediate Value]
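As a rough illustration of the two-byte format, the Python sketch below executes LOAD, STORE and LOADIMM on a small software model. The op-code values (1, 2, 3) come from the Instruction Set table in the Appendix; the function name `step_l_format`, the list-based register file and RAM, and the back-to-back packing of instructions in the example ROM are assumptions of this sketch and do not reflect the pipeline timing of the hardware.

```python
def step_l_format(rom, pc, regs, ram):
    """Execute the L-format instruction at rom[pc]; return the next PC."""
    first, second = rom[pc], rom[pc + 1]
    opcode = (first >> 4) & 0xF
    ra = (first >> 2) & 0x3          # bits 1..0 of the first byte are unused
    if opcode == 1:                  # LOAD:    R[ra] <- M[ea]
        regs[ra] = ram[second]
    elif opcode == 2:                # STORE:   M[ea] <- R[ra]
        ram[second] = regs[ra]
    elif opcode == 3:                # LOADIMM: R[ra] <- imm
        regs[ra] = second
    else:
        raise ValueError("not an L-format op-code")
    return pc + 2                    # two bytes consumed


# LOADIMM r1, 0x0C ; STORE r1, 0x70 ; LOAD r0, 0x70 (packed without gaps)
rom = [0x34, 0x0C, 0x24, 0x70, 0x10, 0x70]
regs = [0, 0, 0, 0]
ram = [0] * 256
pc = 0
while pc < len(rom):
    pc = step_l_format(rom, pc, regs, ram)
print(regs, hex(ram[0x70]))
```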
SIMULATION RESULTS
TEST PROGRAM 1
Two test programs are examined, and the test code is stored in ROM. (Fig.13) gives a
clear step-by-step view of the intended execution of test program 1.
At time 325 ns the second write back state has been completed and it can be seen that
the values have been successfully loaded into the register file (purple).
Fig.14 PC3-10: 30,FF & 34,0C (LOADIMM)
In (Fig.15) the shift-left operation has already completed and the expected value of
0x18 is stored in R1. Instruction #15 in the ROM subtracts R1 from R0 and stores the
result in R0. When instruction #15 completes at time 615 ns, the expected result of
0xE7 is stored in the register file at R0. Continued analysis of Figures 14-21 and
comparison with the expected results in (Fig.13) demonstrate that the program executes
properly and produces the desired results, which provides sufficient evidence that the
A and L instruction formats are implemented correctly. In addition, the result 0xE7 of
the subtraction in (Fig.15) confirms the correct implementation of 2's complement
arithmetic.
Fig.15 PC15-17: 51 (SUB) & 34,0C (LOADIMM)
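The subtraction result can be checked with a short Python sketch of 8-bit two's complement arithmetic. The function name `sub8` is hypothetical, and taking N from bit 7 of the 8-bit result is an assumption about how the "result < 0" condition is evaluated.

```python
def sub8(a, b):
    """8-bit subtraction a - b done as a + (~b) + 1, truncated to 8 bits.

    Returns (result, z, n): z is 1 for a zero result, n is bit 7 of the result.
    """
    result = (a + ((~b) & 0xFF) + 1) & 0xFF
    return result, int(result == 0), (result >> 7) & 1


result, z, n = sub8(0xFF, 0x18)
print(hex(result), z, n)  # 0xe7 0 1
```

Read as signed values, 0xFF is -1 and 0x18 is 24, so the difference is -25, which is 0xE7 in 8-bit two's complement.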
Fig.16 PC22-27: D6 (MOV) & B8 (IN)
Fig.17 PC32-34: 28,70 (STORE) & 49 (ADD)
Fig.18 PC39-44: 88 (NAND) & C8 (OUT 3C)
Fig.19 PC45-51: 10,70 (LOAD) & 70 (SHR)
Fig.20 PC56-61: 48 (ADD) & C8 (OUT 9C)
TEST PROGRAM 2
Test Program 2 uses all of the implemented instructions. Its proper execution
demonstrates the effective design and implementation of the RISC-style processor.
(Fig.22) provides a flowchart of the expected execution and results.
Figures 23-31 show the simulated results of executing the program on the system. The
program performs four additions and four NAND operations, selected by conditional
branches. Figures 23-26 display each iteration of the loop that performs addition. In
(Fig.23) at time 660 ns the value 0xFF has been loaded into both R0 and R1. By time
900 ns R0 has been shifted right and R1 has been shifted left, giving 0x7F in R0 and
0xFE in R1. The value in R0 is then copied to R3 at time 1,040 ns. The conditional
branch is not taken, so R0 and R1 are added and the result 0x7D (the low 8 bits of
0x7F + 0xFE) appears in R0 in the register file by time 1,800 ns. The second branch is
taken to the OUT state, where the value in R0, 0x7D, is displayed on the 7-segment
display. This pattern is repeated three more times, and the sums 0x3B, 0x17 and 0xFF
are displayed.
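The four displayed sums can be reproduced with a short Python sketch of the addition phase. It models only the register arithmetic of the loop in the Program Test 2 listing (SHR r0, SHL r1, MOV r3,r0, ADD r0,r1, OUT r0, MOV r0,r3); the function name `add_phase_outputs` is hypothetical, and branches, memory accesses and timing are left out.

```python
def add_phase_outputs(r0=0xFF, r1=0xFF, iterations=4):
    """Return the values shown by OUT during the addition iterations."""
    outputs = []
    for _ in range(iterations):
        r0 = r0 >> 1               # SHR r0
        r1 = (r1 << 1) & 0xFF      # SHL r1, kept to 8 bits
        r3 = r0                    # MOV r3, r0 saves the shifted value
        total = (r0 + r1) & 0xFF   # ADD r0, r1, carry out is dropped
        outputs.append(total)      # OUT r0
        r0 = r3                    # MOV r0, r3 restores the shifted value
    return outputs


print([hex(v) for v in add_phase_outputs()])  # ['0x7d', '0x3b', '0x17', '0xff']
```

The sketch makes it visible that each sum is formed from the saved shifted value in R3, not from the previous sum, which is why the second result is 0x3F + 0xFC = 0x3B (mod 256).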
(Fig.31) shows the final stage of the program, the unconditional branch to the start of
the program. At time 16,800 ns the branch flag is enabled and the program counter
returns to instruction 2, as can be seen on the rom_addr_in bus (green).
B-FORMAT
B-format instructions execute both conditional and unconditional branch operations;
Figures 9 and 10 show the instruction layout and the data flow. In (Fig.32) at time
9,955 ns the system has received and accepted a branch instruction and the controller
asserts the branch_flag. The rom_addr_in bus increments sequentially until the branch
flag is activated. When the currently executing instruction completes, the branch target
is loaded from the register file into the program counter, which performs the branch.
As can be seen in (Fig.32), the branch completes successfully.
The immediate value in the IMR register is driven onto the control_out_imm bus (orange)
and the rf_write_index bus (purple) carries the target register index. At time 325 ns it
can be seen that the value 0x0C has been successfully loaded into register R1 in the
register file (RF).
For demonstration purposes the external system clock was set to 200 Hz (>150), which
allows rapid program execution and verification and lets the 7-segment display appear
to show both digits at the same time. We also added the display controller VHDL file
to the project in order to show the output on the 7-segment displays of the Digilent
Nexys2 board. The following settings were used:
clk is the external clock on FPGA pin U9; reset and start are on FPGA pins B18 and H13.
hex3, hex2, hex1 and hex0 are four-bit arrays that store the hex display characters.
We used only hex1 and hex0, as we have only two digits.
an (an0, an1, an2, an3) is a four-bit array that connects to the display transistors.
sseg is a 7-bit array that stores the individual segments of the hex display.
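How one output byte maps onto the two digits in use can be shown with a minimal Python sketch. It assumes that hex1 carries the upper nibble and hex0 the lower nibble of the OUT value; the function name `split_hex_digits` is hypothetical and the segment encoding driven onto sseg is not modelled.

```python
def split_hex_digits(value):
    """Split an 8-bit output value into (hex1, hex0) four-bit digits."""
    if not 0 <= value <= 0xFF:
        raise ValueError("output value must fit in 8 bits")
    hex1 = (value >> 4) & 0xF   # upper nibble, assumed left digit
    hex0 = value & 0xF          # lower nibble, assumed right digit
    return hex1, hex0


# Final OUT value of Test Program 1 (0x9C): digits 9 and C.
print(split_hex_digits(0x9C))  # (9, 12)
```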
CONCLUSION
A pipelined processor employing RISC-like instructions was implemented on a Xilinx
Spartan-3E FPGA installed on a Nexys2 board. The design was verified by executing
two test programs designed to use and stress every instruction in the system. The
design includes a register file of four 1-byte registers, a 256-byte ROM, a 256-byte
RAM, a dynamic controller and support modules. The system successfully executes
multiple arithmetic instructions, two conditional and one unconditional branch
instruction, as well as memory (load/store) instructions. The verification programs were
executed on the Spartan FPGA and checked using two 7-segment displays, two
push-button inputs and a system frequency of 150 Hz.
APPENDIX (TABLES)
FSM CONTROL STATES (VHDL)
-- All outputs are initialized
WHEN S0 =>
    IF (Start = '1') THEN
        PC_en <= 1
        NextState <= Fetch
    ELSE
        NextState <= S0
    END IF
WHEN Fetch =>
    -- All outputs are 0
    Inst_reg <= ROM_out
    NextState <= Decode
WHEN Decode =>
    Check(Inst_reg(7 downto 4))
    A-Format (Arithmetic):
        IN_Sel <= 1
        NextState <= ALU
    B-Format (Branch):
        PC_en <= 1
        NextState <= Branch
        Return: PC_en, Rt_Sel
        NextState <= Fetch
    A-Format (Arithmetic):
        PC_en <= 1
        NextState <= LD_ST
WHEN ALU =>
    Flag_Reg <= {Z, N}
    ALUMode <= Inst_reg(6 downto 4)    -- except for NAND
    IN_Sel <= 0
    IMR2_Sel <= 0
    ST: IMR2_Sel, RAM_WR
    LD, LDIm: IMR2_Sel
    IN: IN_Sel, IMR2_Sel
    NextState <= Write_Back
WHEN Branch =>
    ALUMode <= 000
    PC_en <= 1
    Check(Inst_reg(3 downto 2))
    00/11: Br_Sel
    01/10: Br_Sel (Flag)
    NextState <= Fetch
WHEN Write_Back =>
    Check(Inst_reg(7 downto 4))
    0/2/C: RF_en <= 0
    Others: RF_en
    0001: WB_Sel
    NextState <= Fetch
FSM CONTROL SIGNALS
[Table: FSM control signals per state. Rows: Fetch, Decode, ALU, Branch, LD/ST, WB. Columns: Next State; enables E3 PC_en, E4 LR_en, E5 RF_en, E6 RAM_WR; selects S1 Br_Sel, S2 Rt_Sel, S3 IN_Sel, S4 IMR2_Sel, S5 WB_Sel; ALU_Mode. Next-state entries include Fetch (Return), WB (LD) and Fetch (ST).]
INSTRUCTION SET
Format  Mnemonic  Op-code  Function
A       NOP       0        Nothing
A       ADD       4        R[ra] ← R[ra] + R[rb]; ((R[ra] + R[rb]) = 0) → Z ← 1; else Z ← 0; ((R[ra] + R[rb]) < 0) → N ← 1; else N ← 0;
A       SUB       5        R[ra] ← R[ra] − R[rb]; ((R[ra] − R[rb]) = 0) → Z ← 1; else Z ← 0; ((R[ra] − R[rb]) < 0) → N ← 1; else N ← 0;
A       SHL       6        Z ← R[ra]<7>; R[ra] ← (R[ra]<6:0> & 0);
A       SHR       7        Z ← R[ra]<0>; R[ra] ← (0 & R[ra]<7:1>);
A       NAND      8        R[ra] ← R[ra] NAND R[rb]; ((R[ra] NAND R[rb]) = 0) → Z ← 1; else Z ← 0; ((R[ra] NAND R[rb]) < 0) → N ← 1; else N ← 0;
A       IN        11       R[ra] ← IN.PORT;
A       OUT       12       OUT.PORT ← R[ra];
A       MOV       13       R[ra] ← R[rb];
B       BR        9        (brx=0) → PC ← R[rb];
B       BR.Z      9        (brx=1 ∧ Z=1) → PC ← R[rb]; (brx=1 ∧ Z=0) → PC ← PC + 1;
B       BR.N      9        (brx=2 ∧ N=1) → PC ← R[rb]; (brx=2 ∧ N=0) → PC ← PC + 1;
B       BR.SUB    9        (brx=3) → (LR ← PC + 1; PC ← R[rb])
B       RETURN    14       (brx=0) → PC ← LR;
L       LOAD      1        R[ra] ← M[ea];
L       STORE     2        M[ea] ← R[ra];
L       LOADIMM   3        R[ra] ← imm;
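The arithmetic rows of the instruction set can be exercised with a small Python sketch. It follows the table as far as it goes: ADD, SUB and NAND set Z and N from the result, while SHL and SHR put the shifted-out bit into Z. Taking N from bit 7 of the 8-bit result and leaving N unchanged on shifts are assumptions of this sketch, and the function name `alu` is hypothetical.

```python
def alu(op, a, b=0, n=0):
    """Return (result, z, n) for one 8-bit ALU operation.

    n is the incoming N flag, passed through unchanged by the shifts.
    """
    if op == "SHL":                        # Z <- R[ra]<7>
        return (a << 1) & 0xFF, (a >> 7) & 1, n
    if op == "SHR":                        # Z <- R[ra]<0>
        return a >> 1, a & 1, n
    if op == "ADD":
        result = (a + b) & 0xFF
    elif op == "SUB":
        result = (a - b) & 0xFF
    elif op == "NAND":
        result = ~(a & b) & 0xFF
    else:
        raise ValueError("unknown ALU operation: " + op)
    return result, int(result == 0), (result >> 7) & 1


# Values taken from Test Program 1.
print(alu("SHL", 0x0C))          # 0x0C shifted left gives 0x18
print(alu("NAND", 0xC3, 0xE7))   # expected OUT value 0x3C
print(alu("ADD", 0x3C, 0x60))    # expected OUT value 0x9C with N set
```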
PROGRAM TEST 1
Initial values: r1 = 0, r2 = 0, r3 = 0, RAM70 = 0
Addr  Inst.   MIPS ASM              Comment                                          Result
2     x"30"   LOADIMM r0, 0XFF      #start#                                          r0 = FF
3     x"FF"                         0xFF, the immediate value
4     x"34"   LOADIMM r1, 0X0c                                                       r1 = 0C
5     x"0c"                         0x0C, the immediate value
10    x"64"   SHL r1                                                                 r1 = 18
15    x"51"   SUB r0, r1            r0 = r0 - r1                                     r0 = E7
16    x"38"   LOADIMM r2, 0x03                                                       r2 = 3
17    x"03"                         0x03, the immediate value
22    x"D6"   MOV r1, r2            r1 = r2
27    x"B8"   IN r2                 Set the Input to "0xC0" ("11000000")             r2 = C0
32    x"28"   STORE r2, 0x70                                                         RAM70 = C0
33    x"70"                         Effective Address
34    x"49"   ADD r2, r1            r2 = r2 + r1                                     r2 = C3
39    x"88"   NAND r2, r0           r2 = r2 NAND r0                                  r2 = 3C
44    x"C8"   OUT r2                At this point R[2] must be "00111100"            OUT = 3C
45    x"10"   LOAD r0, 0x70                                                          r0 = C0
46    x"70"                         Effective Address
51    x"70"   SHR r0                                                                 r0 = 60
56    x"48"   ADD r2, r0            (N_Flag)                                         r2 = 9C
61    x"C8"   OUT r2                At this point R[2] has to be "10011100"          OUT = 9C
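The first byte of every instruction in Test Program 1 can be cross-checked with a short Python encoder. The sketch assumes the layout op-code in bits 7..4, ra in bits 3..2 and rb in bits 1..0. The op-codes for ADD, SUB and NAND (4, 5, 8) are inferred from the listing bytes themselves, so for those three the check is only a consistency test; the names `OPCODES` and `encode` are hypothetical.

```python
OPCODES = {
    "LOAD": 1, "STORE": 2, "LOADIMM": 3,
    "ADD": 4, "SUB": 5, "SHL": 6, "SHR": 7, "NAND": 8,
    "IN": 11, "OUT": 12, "MOV": 13,
}


def encode(mnemonic, ra, rb=0):
    """Build the first instruction byte from a mnemonic and register indices."""
    if not (0 <= ra <= 3 and 0 <= rb <= 3):
        raise ValueError("register index must be in 0..3")
    return (OPCODES[mnemonic] << 4) | (ra << 2) | rb


# (mnemonic, ra, rb, byte from the listing)
LISTING = [
    ("LOADIMM", 0, 0, 0x30), ("LOADIMM", 1, 0, 0x34), ("SHL", 1, 0, 0x64),
    ("SUB", 0, 1, 0x51), ("LOADIMM", 2, 0, 0x38), ("MOV", 1, 2, 0xD6),
    ("IN", 2, 0, 0xB8), ("STORE", 2, 0, 0x28), ("ADD", 2, 1, 0x49),
    ("NAND", 2, 0, 0x88), ("OUT", 2, 0, 0xC8), ("LOAD", 0, 0, 0x10),
    ("SHR", 0, 0, 0x70), ("ADD", 2, 0, 0x48),
]

mismatches = [row for row in LISTING if encode(*row[:3]) != row[3]]
print(mismatches)  # []
```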
PROGRAM TEST 2
Addr  Inst.   MIPS ASM                               Effect
2     x"30"   loadimm r0, 0F            #start#      r0 = 0F
3     x"0F"
9     x"20"   store r0, add_nand (AN)                AN = 0F
10    x"0F"
11    x"30"   loadimm r0, 7
12    x"07"
18    x"20"   store r0, counter (CN)                 CN = 07
19    x"0E"
20    x"30"   loadimm r0, FF                         r0 = FF
21    x"FF"
22    x"34"   loadimm r1, FF                         r1 = FF
23    x"FF"
29    x"70"   shr r0                    #loop#       r0 = ShR r0
30    x"64"   shl r1                                 r1 = ShL r1
35    x"DC"   mov r3, r0                             r3 = r0
36    x"10"   load r0, add_nand                      r0 = AN
37    x"0F"
43    x"70"   shr r0                                 r0 = ShR AN
49    x"20"   store r0, add_nand                     AN = ShR AN
50    x"0F"
51    x"D3"   mov r0, r3                             r0 = r3
58    x"38"   loadimm r2, nand                       r2 = 4B
59    x"4B"
65    x"96"   br.z nand
66    x"41"   add r0, r1                             r0 = r0 + r1
67    x"38"   loadimm r2, out_add_nand               r2 = 4C
68    x"4C"
74    x"92"   br out_add_nand
75    x"81"   nand r0, r1 (N)           #nand#       r0 = r0 NAND r1
81    x"C0"   out r0                                 OUT = r0
82    x"10"   load r0, counter                       r0 = CN
83    x"0E"
84    x"D9"   mov r2, r1                             r2 = r1
85    x"34"   loadimm r1, 1
86    x"01"
92    x"51"   sub r0, r1                             r0 = r0 - r1
98    x"20"   store r0, counter                      CN = CN - 1
99    x"0E"
100   x"D6"   mov r1, r2
101   x"38"   loadimm r2, out                        r2 = 76
102   x"76"
106   x"D3"   mov r0, r3                             r0 = r3
109   x"9A"   br.n out
110   x"38"   loadimm r2, loop                       r2 = 1D
111   x"1D"
117   x"92"   br loop
118   x"38"   loadimm r2, start         #out#        r2 = 2
119   x"02"
125   x"92"   br start
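The number of loop iterations before the program restarts follows from the counter handling in the listing (CN starts at 7, is decremented once per pass, and BR.N leaves the loop when the result is negative). The Python sketch below counts the passes; it assumes that the N flag is bit 7 of the 8-bit result and that the MOV between SUB and BR.N leaves the flags untouched. The function name `count_iterations` is hypothetical.

```python
def count_iterations(counter=7):
    """Count loop passes until the decremented counter goes negative."""
    loops = 0
    while True:
        loops += 1
        counter = (counter - 1) & 0xFF   # SUB r0, r1 with r1 = 1
        if counter & 0x80:               # N = 1, so BR.N out is taken
            return loops


print(count_iterations())  # 8
```

Eight passes agree with the four ADD and four NAND iterations shown in Figures 23-30, after which the program branches back to the start.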
REFERENCE
1. CENG 450 Lab Manual, University of Victoria, 06/04/2015.
2. Digilent Nexys2 Board Reference Manual (Doc: 502-107), Digilent, Pullman, WA, 2008.