net/publication/288835032
1 author:
Bibek Bhattarai
George Washington University
All content following this page was uploaded by Bibek Bhattarai on 01 January 2016.
We are particularly grateful to Mr. Bikash Poudel, Mr. Prasanna Kansakar, and Mr.
Sujit Rokka Chhetri from Nova Research and Consultancy Pvt Ltd for their supervision,
their extensive help with the design, and their helpful suggestions and encouragement. We
also thank them for providing the Spartan-3E FPGA board, without which our project would
not have been successful.
We also wish to acknowledge the help and cooperation offered by Mr. Sudarshan Sharma, Mr.
Purushottam Adhikari, and Mr. Shiva Bhusal, who willingly provided us with the resources
needed for our project.
We are also indebted to all our friends for their valuable suggestions, advice and
encouragement throughout the project.
ABSTRACT
CORDIC, or COordinate Rotation DIgital Computer, is a fast, simple and efficient
algorithm used in diverse Digital Signal Processing applications. It is a hardware-efficient
algorithm well suited to solving the trigonometric relationships involved in plane
coordinate rotation and in conversion from rectangular to polar form. It comprises a special serial
arithmetic unit consisting of shift registers, adder/subtractors, a look-up table and special
interconnections.
In this project:
Of late, rapid advancements have been made in the field of VLSI and IC design. As a result
special purpose processors with custom-architectures have come up. Higher speeds can be
achieved by these customized hardware solutions at competitive costs. To add to this, various
simple and hardware-efficient algorithms exist which map well onto these chips and can be used
to enhance speed and flexibility while performing the desired signal processing tasks. All these
tasks can be efficiently implemented using processing elements performing vector rotations.
CORDIC, an acronym for COordinate Rotation DIgital Computer, was proposed by Jack E.
Volder and is used to compute trigonometric functions, multiplications, divisions, data type
conversions, and hyperbolic functions. Two basic CORDIC modes are known, the rotation mode
and the vectoring mode, each leading to the computation of different functions. For both modes
the algorithm can be realized as an iterative sequence of additions/subtractions and shift
operations, which are rotations by a fixed rotation angle but with variable rotation direction. Due
to the simplicity of the operations involved, the CORDIC algorithm is well suited for VLSI
implementation.
In this project, the CORDIC algorithm is used to design a digital sine and cosine waveform generator.
Many applications require digital wave generators. Wireless and mobile systems are
among the fastest growing application areas; in particular, Software Defined Radio (SDR) is
currently a focus of research and development. An SDR system allows performing many
functions based on a single hardware platform, thus highly reconfigurable resources for signal
processing are needed, mainly for modulation and demodulation of digital signals. Fourth
generation (4G) wireless and mobile systems are currently the focus of research and
development. They will allow new types of services to be universally available to consumers and
for industrial applications. Broadband wireless networks will enable packet based high data rate
communication suitable for video transmission and mobile Internet applications.
Chapter 1 presents the motivation behind the project and the project scenarios. Chapter 2
contains the literature review of the CORDIC algorithm and FPGA architecture. Chapter 3
contains the algorithm, arithmetic and architecture details of the CORDIC processor. Chapter 4
describes the theory behind interfacing the PS/2 port, the VGA port and the rotary
encoder. Chapter 5 contains the system block diagram and its different modules. Chapter
6 describes the algorithm and the peripheral devices. Chapters 7, 8, 9 and 10 describe the results,
limitations, problems faced and conclusion, respectively.
1. Literature Review
1.1 CORDIC Overview
Although CORDIC is not a very fast algorithm, it is widely used because it is very simple to
implement and because the same architecture, based on simple shift-and-add operations, can
serve all of its applications.
1.1.2 Advantages
The major advantages of the CORDIC processor are listed below:
The hardware requirement and cost of a CORDIC processor are low, as only shift registers, adders and a look-up table (ROM) are required.
The number of gates required in a hardware implementation, such as on an FPGA, is small, as hardware complexity is greatly reduced compared to alternatives such as DSP multipliers.
It is relatively simple in design.
The absence of multiplication, with only addition, subtraction and bit-shifting operations, ensures a simple VLSI implementation.
The delay involved during processing is comparable to that of a division or square-root operation.
CORDIC is the preferred choice when no hardware multiplier is available (e.g. in a microcontroller or microprocessor) or when the number of logic gates must be minimized (e.g. in an FPGA).
1.1.3 Disadvantages
Some drawbacks of the CORDIC processor are listed below:
A large number of iterations is required for accurate results, so the speed is low and the time delay is high.
Power consumption is high in some architecture types.
Whenever a hardware multiplier is available, e.g. in a DSP microprocessor, table look-up methods and power series methods are generally faster than the CORDIC algorithm.
1.1.4 Applications
Some well-known applications of CORDIC are listed below:
The algorithm was originally developed to offer a digital solution to the problem of real-time navigation in the B-58 bomber.
John Walther extended the basic CORDIC theory to cover a diverse range of functions.
The algorithm is used in the 8087 math coprocessor, the HP-35 calculator, radar signal processors, and robotics.
CORDIC has also been described for the calculation of the DFT (Discrete Fourier Transform), the DHT (Discrete Hartley Transform), Chirp Z-transforms, filtering, singular value decomposition, and the solution of linear systems.
Many calculators, especially those built by Texas Instruments and Hewlett-Packard, use the CORDIC algorithm to calculate transcendental functions.
1.2.1 Introduction
An FPGA, or Field Programmable Gate Array, can be programmed or configured by the user or
designer after manufacturing. Hence FPGAs are also described as
field (on-site) programmable. Unlike a Programmable Array Logic (PAL) device or other programmable
device, their structure is similar to that of a gate array or an ASIC. Thus, they are used to rapidly
prototype ASICs, or as a substitute in places where an ASIC will eventually be used. This is
done when it is important to get the design to market first. Later on, when the ASIC is
produced in bulk, so that its NRE cost is spread over many units, it can replace the FPGA. The
FPGA is programmed from a logic circuit diagram or from source code in a Hardware Description
Language (HDL) that specifies how the chip should work. FPGAs contain programmable logic
components called "logic blocks" and a hierarchy of reconfigurable interconnects that
allow the blocks to be "wired" together. The programmable logic blocks are called
configurable logic blocks (CLBs) and the reconfigurable interconnects are called switch boxes. Logic blocks
can be programmed to perform complex combinational functions or to act as simple logic gates
like AND and XOR. In most FPGAs the logic blocks also include memory elements, which can
be as simple as a flip-flop or as complex as complete blocks of memory.
A configurable I/O block is used to route signals to and from the chip. It comprises
an input buffer and an output buffer with three-state and open-collector output controls. Pull-up and
pull-down resistors may also be present at the output. The output polarity is programmable for
active-high or active-low output.
Figure 6 Arrangement of SRAM Cells Inside FPGA Onto Which Bit Stream is Added
1.2.3.3.1 Translation
The translate process merges all of the input netlists and the design constraints. It
outputs a Xilinx NGD (Native Generic Database) file, which describes the logical design
reduced to Xilinx device primitive cells. User constraints are
defined by assigning the ports in the design to physical elements (e.g. pins, switches, buttons,
etc.) of the target device and by specifying timing requirements. This information is stored in
a UCF file, which can be created using PACE or the Constraints Editor.
1.2.3.3.2 Mapping
After the translation process is complete, the logical design described in the .ngd file is mapped to the
components or primitives (slices/CLBs) present on the target FPGA, and the mapped design is written
to an .ncd file. The whole circuit is divided into smaller blocks so that they fit into
the FPGA blocks. The mapping is done onto the CLBs and IOBs in accordance with the logic.
1.2.3.4 Testing
System testing is necessary to ensure that all parts of the system work together correctly after the
prototype is mapped onto the system. If the system does not work, the problem can be fixed
by making changes in the system or the software. The problems are documented so that they are
fixed in the next revision or production of the chip. When the ICs are produced, it is
necessary to have some sort of built-in self-test mechanism so that the system is tested
regularly over a long period of time.
NRE cost is zero: Non-Recurring Engineering (NRE) cost refers to the one-time cost of researching,
developing, designing and testing a new product. Since FPGAs are reprogrammable and
can be reused without any loss of quality, this NRE cost is avoided. This
significantly reduces the initial cost of manufacturing the ICs, since the design can be
implemented and tested on FPGAs at no additional cost.
Low cost: FPGAs are quite affordable and hence very designer-friendly. Also, the power
requirement is much lower, as the architecture of FPGAs is based upon LUTs.
Due to the above mentioned advantages of FPGAs in IC technology and DCT in mapping of
images, implementation of DCT in FPGA can give us a clearer idea about the advantages and
limitations of using DCT as the mapping function. This can help in forming better image
compression and restoration techniques.
Family: Spartan 3E
Device: XC3S500E
Package: FG320
Speed grade: -4
Synthesis Tool: XST (VHDL/Verilog)
= ………………………………………… (1)
This angular movement of the vector can therefore be achieved by the simple process of shifting and
adding. Consider the iterative rotation

$$x_{i+1} = x_i \cos\alpha_i - y_i \sin\alpha_i$$
$$y_{i+1} = x_i \sin\alpha_i + y_i \cos\alpha_i \tag{2}$$

Factoring $\cos\alpha_i$ out of equation (2), we can write

$$x_{i+1} = \cos\alpha_i \,(x_i - y_i \tan\alpha_i)$$
$$y_{i+1} = \cos\alpha_i \,(x_i \tan\alpha_i + y_i) \tag{3}$$

We now restrict the rotation angles so that $\tan\alpha_i = 2^{-i}$ and define the scale factor $k_i$ as

$$k_i = \cos\alpha_i = \frac{1}{\sqrt{1 + 2^{-2i}}}$$

In polar form, with $R_i$ the magnitude of the unscaled vector $(x_i - 2^{-i} y_i,\; y_i + 2^{-i} x_i)$
and $\theta$ the angle of $(x_i, y_i)$, the two equations can be rewritten as

$$x_{i+1} = \frac{1}{\sqrt{1 + 2^{-2i}}}\, R_i \cos(\alpha_i + \theta)$$
$$y_{i+1} = \frac{1}{\sqrt{1 + 2^{-2i}}}\, R_i \sin(\alpha_i + \theta) \tag{4}$$

or, equivalently,

$$x_{i+1} = k_i \,(x_i - 2^{-i} y_i)$$
$$y_{i+1} = k_i \,(y_i + 2^{-i} x_i)$$

The direction of each micro-rotation may be clockwise or anticlockwise and differs from one
iteration to the next, so we define a binary variable $d_i$ that identifies the direction.
It takes the value +1 or -1. Inserting $d_i$ into the equations above gives

$$x_{i+1} = k_i \,(x_i - d_i\, 2^{-i} y_i)$$
$$y_{i+1} = k_i \,(y_i + d_i\, 2^{-i} x_i) \tag{5}$$

The value of $d_i$ depends on the direction of rotation: with the sign convention of equation (5),
$d_i = +1$ for an anticlockwise rotation and $d_i = -1$ for a clockwise one. These iterations consist
only of elementary operations, namely addition, subtraction, shifting and table look-up; no
multiplication or division is required in the CORDIC operation.
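As a quick numerical check of equations (3) to (5), the following Python sketch (a software illustration, not the project's Verilog) performs one micro-rotation using only a scaling by 2^-i and additions, then compares the scaled result with a true rotation by the angle atan(2^-i). The names `micro_rotate` and `scale_factor` are hypothetical, and floating-point multiplication by 2^-i stands in for the hardware right shift.

```python
import math

def micro_rotate(x, y, i, d):
    """One unscaled CORDIC micro-rotation (equation (5) without k_i).

    d = +1 rotates anticlockwise, d = -1 clockwise.
    """
    shift = 2.0 ** -i          # stands in for a right shift by i bits
    return x - d * shift * y, y + d * shift * x

def scale_factor(i):
    """k_i = cos(alpha_i) = 1 / sqrt(1 + 2^(-2i))."""
    return 1.0 / math.sqrt(1.0 + 2.0 ** (-2 * i))

# Compare with a true rotation by alpha_i = d * atan(2^-i)
x, y, i, d = 1.0, 0.5, 3, +1
alpha = d * math.atan(2.0 ** -i)
xs, ys = micro_rotate(x, y, i, d)
k = scale_factor(i)
x_true = x * math.cos(alpha) - y * math.sin(alpha)
y_true = x * math.sin(alpha) + y * math.cos(alpha)
print(abs(k * xs - x_true) < 1e-12, abs(k * ys - y_true) < 1e-12)
```

Because k_i does not depend on d_i, the scaling can be postponed and applied once after all iterations, which is why the loop itself needs no multiplier.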
In the CORDIC algorithm, a number of micro-rotations are combined in different ways to realize
different functions. This is achieved by properly controlling the direction of the successive
micro-rotations. Depending on how this direction is controlled, CORDIC operates in one of
two modes:
Vectoring mode: the y-component of the input vector is forced to zero. This yields
the magnitude and phase of the input vector.
Rotation mode: the θ-component (the residual angle) is forced to zero. This yields
a plane rotation of the input vector by a given input phase θ0.
1.3.1 Vectoring mode
As stated earlier, the vectoring mode of the CORDIC algorithm calculates the magnitude and the phase of
the input vector. The y-component is forced to zero, which means that the input vector
(x0, y0) is rotated towards the x-axis. The CORDIC iteration in vectoring mode is therefore controlled by
the signs of both the y-component and the x-component: the rotator rotates
the input vector through whatever angle is needed to align the result with the x-axis.
In the vectoring mode the CORDIC equations are:

$$x_{i+1} = k_i \,[x_i + d_i\, p_i\, 2^{-i} y_i]$$
$$y_{i+1} = k_i \,[y_i - d_i\, p_i\, 2^{-i} x_i]$$
$$\theta_{i+1} = \theta_i + d_i\, p_i\, \alpha_i$$

where $d_i$ is the sign of the x-component and $p_i$ is the sign of the y-component.
The product of the $k_i$ can be applied elsewhere in the system or treated as part of the system processing
gain. The product approaches 0.6073 as the number of iterations tends to infinity. If the factors $k_i$
are left out of the iterations, the algorithm therefore has a gain $A_n$ of approximately 1.647. The exact
gain depends on the number of iterations and follows the relation

$$A_n = \prod_{i=0}^{n-1} \frac{1}{k_i} = \prod_{i=0}^{n-1} \sqrt{1 + 2^{-2i}}$$

which gives the following results:

$$x_n = A_n \sqrt{x_0^2 + y_0^2}$$
$$y_n = 0$$
$$\theta_n = \theta_0 + \tan^{-1}(y_0 / x_0)$$
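The vectoring-mode recurrence can be sketched in Python as follows. This is a floating-point illustration under stated assumptions, not the project's hardware: the function name `cordic_vectoring` and the default of 24 iterations are hypothetical, the factors k_i are left out of the loop so the gain A_n is divided out once at the end, and the input is assumed to have x0 > 0 so that the first returned value is the magnitude.

```python
import math

def cordic_vectoring(x0, y0, n_iter=24):
    """Drive y towards zero; return (x_n / A_n, residual y, accumulated angle)."""
    x, y, theta = float(x0), float(y0), 0.0
    for i in range(n_iter):
        d = 1.0 if x >= 0 else -1.0   # d_i: sign of the x-component
        p = 1.0 if y >= 0 else -1.0   # p_i: sign of the y-component
        shift = 2.0 ** -i             # stands in for a right shift by i bits
        x, y = x + d * p * shift * y, y - d * p * shift * x
        theta += d * p * math.atan(shift)
    # k_i was omitted inside the loop, so x carries the gain A_n
    gain = math.prod(math.sqrt(1.0 + 2.0 ** (-2 * i)) for i in range(n_iter))
    return x / gain, y, theta

magnitude, residual_y, angle = cordic_vectoring(3.0, 4.0)
print(round(magnitude, 4), round(math.degrees(angle), 3))
```

For the input (3, 4) the magnitude should approach 5 and the angle should approach atan(4/3), while the residual y shrinks with every additional iteration.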
Dedicated hardware is used to compute sine and cosine using CORDIC. Iterative
rotations of a point around the origin in the x-y plane are considered. In each rotation, the
coordinates of the rotated point and the remaining angle to be rotated are calculated. Since each
micro-rotation also extends (scales) the vector, the number of rotations should be a constant
independent of the operands, so that the gain factor K is a constant. A hardware implementation of
CORDIC arithmetic requires three registers for x, y and z, two shifters to supply the terms $2^{-i}x$
and $2^{-i}y$ to the adder/subtractor units, and a look-up table to store the values $\alpha_i = \tan^{-1} 2^{-i}$. The $d_i$
factor (-1 or +1) selects the shifted operand or its complement. The initial inputs to the
architecture are X0 = 1, Y0 = 0. The structure requires a pre-processing unit to reduce the input
angle to the supported range and a post-processing unit to fix the sign of the outputs depending on the
quadrant of the initial angle. The pre-processing unit accepts angles of any range and maps them to
the interval [-π/2, π/2]. It keeps a record of the quadrant of the input angle, which
the post-processing unit can use to fix the sign of the outputs. These two blocks are necessary in any
application, as the input range cannot always be predicted.
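The datapath described above (three registers, two shifters, adder/subtractors and an arctangent table) can be modelled with integers in Python. The sketch below is an assumption-laden software model rather than the project's design: the 16-bit fraction width, the 16 iterations, the radian-scaled angle register and the name `cordic_sin_cos` are all hypothetical, and the gain is divided out in floating point at the end instead of being pre-scaled into the input.

```python
import math

FRAC_BITS = 16                 # assumed fixed-point fraction width
N_ITER = 16                    # assumed number of micro-rotations
ONE = 1 << FRAC_BITS

# Look-up table of atan(2^-i), in radians scaled by 2^FRAC_BITS
ATAN_LUT = [round(math.atan(2.0 ** -i) * ONE) for i in range(N_ITER)]
GAIN = math.prod(math.sqrt(1.0 + 2.0 ** (-2 * i)) for i in range(N_ITER))

def cordic_sin_cos(angle_rad):
    """Rotation mode for an angle in [-pi/2, pi/2]; returns (cos, sin)."""
    x, y, z = ONE, 0, round(angle_rad * ONE)   # X0 = 1, Y0 = 0, Z0 = angle
    for i in range(N_ITER):
        if z >= 0:   # d_i = +1: rotate anticlockwise, reduce residual angle
            x, y, z = x - (y >> i), y + (x >> i), z - ATAN_LUT[i]
        else:        # d_i = -1: rotate clockwise
            x, y, z = x + (y >> i), y - (x >> i), z + ATAN_LUT[i]
    scale = ONE * GAIN         # remove fixed-point scaling and CORDIC gain
    return x / scale, y / scale

cos_v, sin_v = cordic_sin_cos(math.pi / 6)
print(round(cos_v, 3), round(sin_v, 3))
```

Python's `>>` on negative integers is an arithmetic shift, which matches a sign-extending hardware shifter; the truncation it introduces is one source of the small error between a fixed-point model and a floating-point reference.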
Figure 10 Basic Arithmetic Unit for CORDIC Algorithm
1.7 Keyboard
USB keyboard - Latest keyboard supported by all new computers (Macintosh and
IBM/compatible). These are relatively complicated to interface.
IBM/Compatible keyboards - Also known as "AT keyboards" or "PS/2 keyboards", all
modern PCs support this device. They're the easiest to interface, and are the subject of
this project.
ADB keyboards - Connect to the Apple Desktop Bus of older Macintosh systems.
IBM introduced a new keyboard with each of its major desktop computer models. The original
IBM PC, and later the IBM XT, used what we call the "XT keyboard." These are obsolete and
differ significantly from modern keyboards. Next came the IBM AT system and later the IBM
PS/2. They introduced the keyboards we use today. AT keyboards and PS/2 keyboards were
very similar devices, but the PS/2 device used a smaller connector and supported a few
additional features. Nonetheless, it remained backward compatible with AT systems, and few of
the additional features ever caught on (since software also wanted to remain backward
compatible).
Keyboards consist of a large matrix of keys, all of which are monitored by an on-board processor
(called the "keyboard encoder".) The specific processor varies from keyboard-to-keyboard but
they all basically do the same thing: Monitor which key(s) are being pressed/ released and send
the appropriate data to the host. This processor takes care of all the de-bouncing and buffers any
data in its 16-byte buffer, if needed. Your motherboard contains a "keyboard controller" that is
in charge of decoding all of the data received from the keyboard and informing your software of
what's going on. All communication between the host and the keyboard uses an IBM protocol.
The keyboard uses open-collector drivers so that either the keyboard or the host can drive the
two-wire bus. If the host never sends data to the keyboard, then the host can use simple input
pins. A PS/2-style keyboard uses scan codes to communicate key press data. Nearly all
keyboards in use today are PS/2 style. Each key has a single, unique scan code that is sent
whenever the corresponding key is pressed. The scan codes for most keys appear in figure
below.
If a key is pressed and held, the keyboard repeatedly sends the scan code every 100 ms or so.
When a key is released, the keyboard sends an "F0" key-up code, followed by the scan code of
the released key. The keyboard sends the same scan code regardless of whether a key has different shift
and non-shift characters and regardless of whether the Shift key is pressed. The host
determines which character is intended.
Some keys, called extended keys, send an "E0" ahead of the scan code, and they
might send more than one scan code. When an extended key is released, an "E0 F0" key-up code
is sent, followed by the scan code.
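The make/break behaviour described above can be illustrated with a small host-side decoder. The Python sketch below is only a model under assumptions: the function name `decode_scan_stream` and the tuple layout are hypothetical, and keys that send longer multi-byte sequences are not handled.

```python
def decode_scan_stream(codes):
    """Turn a stream of received bytes into (event, extended, scan_code) tuples."""
    events = []
    extended = False
    released = False
    for byte in codes:
        if byte == 0xE0:        # prefix of an extended key
            extended = True
        elif byte == 0xF0:      # key-up prefix
            released = True
        else:                   # the scan code itself
            events.append(("release" if released else "press", extended, byte))
            extended = False
            released = False
    return events

# Press and release of an ordinary key (0x1C), then of an extended key (E0 75)
stream = [0x1C, 0xF0, 0x1C, 0xE0, 0x75, 0xE0, 0xF0, 0x75]
for event in decode_scan_stream(stream):
    print(event)
```

A key that is held down simply appears as repeated "press" events with the same scan code, so the host must track key state if it needs to distinguish auto-repeat from a new press.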
The keyboard sends commands or data to the host only when both the data and clock lines are
High, the Idle state. Because the host is the bus master, the keyboard checks whether the host is
sending data before driving the bus. The clock line can be used as a clear to send signal. If the
host pulls the clock line Low, the keyboard must not send any data until the clock is released.
The keyboard sends data to the host in 11-bit words that contain a '0' start bit, followed by eight
bits of scan code (LSB first), followed by an odd parity bit and terminated with a '1' stop bit.
When the keyboard sends data, it generates 11 clock transitions at around 20 to 30 kHz, and data
is valid on the falling edge of the clock as shown in Figure below.
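The 11-bit word format can be checked with a short Python model. This is a sketch, not the project's receiver: `encode_ps2_frame` and `decode_ps2_frame` are hypothetical helper names, and the bits are assumed to be already sampled on the falling clock edges and listed in arrival order.

```python
def encode_ps2_frame(scan_code):
    """Build the 11 bits sent for one byte: start, 8 data bits LSB first, parity, stop."""
    data = [(scan_code >> n) & 1 for n in range(8)]
    parity = 1 - (sum(data) % 2)          # odd parity: data + parity has an odd number of 1s
    return [0] + data + [parity, 1]

def decode_ps2_frame(bits):
    """Check framing and parity of 11 received bits and return the data byte."""
    if len(bits) != 11:
        raise ValueError("a PS/2 frame has exactly 11 bits")
    start, data, parity, stop = bits[0], bits[1:9], bits[9], bits[10]
    if start != 0 or stop != 1:
        raise ValueError("framing error")
    if (sum(data) + parity) % 2 != 1:
        raise ValueError("parity error")
    return sum(bit << n for n, bit in enumerate(data))

frame = encode_ps2_frame(0x1C)
print(frame, hex(decode_ps2_frame(frame)))
```

In hardware the same check is typically done with a shift register that collects the 11 bits and an XOR reduction over the data and parity bits.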
Both a PC mouse and keyboard use the two-wire PS/2 serial bus to communicate with a host
device, the Spartan-3E FPGA in this case. The PS/2 bus includes both clock and data. Both a
mouse and keyboard drive the bus with identical signal timings and both use 11-bit words that
include a start, stop and odd parity bit. However, the data packets are organized differently for a
mouse and keyboard. Furthermore, the keyboard interface allows bidirectional data transfers so
the host device can illuminate state LEDs on the keyboard.
1.8 VGA
The monitor screen for a standard VGA format contains 640 columns by 480 rows of picture
elements called pixels. An image is displayed on the screen by turning individual
pixels on and off. Turning on one pixel does not represent much, but combining numerous pixels generates
an image. The monitor continuously scans through the entire screen, rapidly turning individual
pixels on and off. Although pixels are turned on one at a time, we get the impression that all the
pixels are on because the monitor scans so quickly. This is why old monitors with slow scan
rates flicker.
The scanning process starts from row 0, column 0 in the top left corner of the screen and moves
to the right until it reaches the last column. When the scan reaches the end of a row, it retraces to
the beginning of the next row. When it reaches the last pixel in the bottom right corner of the
screen, it retraces back to the top-left corner and repeats the scanning process. In order to reduce
flicker on the screen, the entire screen must be scanned 60 times per second. This rate is called
the refresh rate. The human eye can detect flicker at refresh rates below 30 Hz.
The VGA monitor is controlled by 5 signals: red, green, blue, horizontal synchronization, and
vertical synchronization. The three color signals, collectively referred to as the RGB signal,
control the color of a pixel at a given location on the screen. They are analog signals with
voltages ranging from 0 V (fully off) to 0.7 V (fully on). Different color intensities are obtained by varying the
voltage. For simplicity, these three color signals are treated as digital signals, so we can just turn
each one on or off.
The horizontal and vertical synchronization signals are used to control the timing of the scan
rates. Unlike the three analog RGB signals, these two sync signals are digital signals. In other
words, they take on either logic 0 or logic 1 value. The horizontal synchronization signal
determines the time it takes to scan a row, while the vertical synchronization signal determines
the time it takes to scan the entire screen. By manipulating these two sync signals and the three
RGB signals, images are formed on the monitor screen.
When current waveforms are passed through the deflection coils, they produce magnetic fields that deflect
the electron beam so that it traverses the display surface in a raster pattern. Information is displayed only
while the beam is moving from left to right and from top to bottom, not while it returns to the left
edge or to the top to start again. Synchronization must be done during these return (retrace) time
periods.
Figure 20 CRT Display Timing Example
As shown in the figure above, the VGA controller generates the horizontal sync (HS) and vertical sync
(VS) timing signals and coordinates the delivery of video data on each pixel clock. The pixel
clock defines the time available to display one pixel of information. The VS signal defines the
refresh frequency of the display, or the frequency at which all information on the display is
redrawn. The minimum refresh frequency is a function of the display's phosphor and electron
beam intensity, with practical refresh frequencies in the 60 Hz to 120 Hz range. The number of
horizontal lines displayed at a given refresh frequency defines the horizontal retrace frequency.
1.8.2 VGA Signal Timing:
The timing shown in the figure above is for 640 pixels displayed on each of 480 lines (rows) using a
25 MHz pixel clock. The timings for the sync pulse width (T_PW) and the front and back porch intervals (T_FP
and T_BP) are based on observations from various VGA displays. No information is displayed
during the sync pulse, front porch and back porch intervals. The following table, taken from the user guide, shows the
timing information for synchronization.
A counter clocked by the pixel clock can be used to generate the HS signal. This
counter tracks the current pixel display location on a given row. It provides the horizontal
synchronization by generating the VGA_HSYNC signal, which is high only for the display time.
Another counter is incremented on every completed HS cycle and is used to generate the VS signal.
This counter generates the VGA_VSYNC signal, which is high for the display region of both
the HS and VS signals.
Together, the two counters form an address into the video display buffer; using this address, a given
pixel can be assigned a distinct RGB value.
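The two-counter scheme can be modelled in Python as below. The timing constants are an assumption: they are the commonly quoted values for 640-by-480 at about 60 Hz with a 25 MHz pixel clock, since the timing table is not reproduced in this text, and the board's user guide may list a slightly different line total. The function names are hypothetical, and the flags mean "inside the sync pulse" without modelling the electrical polarity of the pins.

```python
PIXEL_CLOCK_HZ = 25_000_000

# Assumed horizontal timing in pixel clocks: display, front porch, sync, back porch
H_DISPLAY, H_FRONT, H_SYNC, H_BACK = 640, 16, 96, 48
# Assumed vertical timing in lines: display, front porch, sync, back porch
V_DISPLAY, V_FRONT, V_SYNC, V_BACK = 480, 10, 2, 33

H_TOTAL = H_DISPLAY + H_FRONT + H_SYNC + H_BACK   # 800 clocks per line
V_TOTAL = V_DISPLAY + V_FRONT + V_SYNC + V_BACK   # 525 lines per frame

def sync_signals(pixel_count, line_count):
    """Return (in_h_sync, in_v_sync, video_on) for the current counter values."""
    in_h_sync = H_DISPLAY + H_FRONT <= pixel_count < H_DISPLAY + H_FRONT + H_SYNC
    in_v_sync = V_DISPLAY + V_FRONT <= line_count < V_DISPLAY + V_FRONT + V_SYNC
    video_on = pixel_count < H_DISPLAY and line_count < V_DISPLAY
    return in_h_sync, in_v_sync, video_on

def advance(pixel_count, line_count):
    """Advance both counters by one pixel clock."""
    pixel_count += 1
    if pixel_count == H_TOTAL:            # end of line: wrap and step the line counter
        pixel_count = 0
        line_count = (line_count + 1) % V_TOTAL
    return pixel_count, line_count

refresh_hz = PIXEL_CLOCK_HZ / (H_TOTAL * V_TOTAL)
print(f"refresh rate = {refresh_hz:.2f} Hz")
```

With these assumed totals the refresh rate comes out just under 60 Hz, which is why a 25 MHz clock is an acceptable stand-in for the nominal VGA pixel clock.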
The patterns of the tiles constitute the font of the character set. A variety of fonts are available.
In our project implementation we chose an 8-by-8 (i.e., 8-column-by-8-row) font. In this font,
each character is represented as an 8-by-8 pixel pattern. The pattern for the letter "A" is shown
in the figure.
The character patterns are stored in a ROM, and each pattern requires 8 x 8 bits.
We therefore created a pattern memory, known as the font ROM, of size 2048 x 8 (256 characters).
When we use these 8-by-8 characters (i.e., tiles) on a 640-by-480 resolution screen, 80 (i.e.,
640/8) tiles fit into a horizontal line and 60 (i.e., 480/8) tiles fit into a vertical
line. In other words, the screen can be treated as an 80-by-60 tile screen. We can put characters
on the screen using these scaled coordinates.
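The tile arithmetic above can be written out as a short Python sketch. The helper names are hypothetical, and two details are assumptions rather than facts from the project: that the font ROM stores one 8-bit row per address with a character's eight rows stored consecutively, and that the most significant bit of a row is the leftmost pixel.

```python
TILE_W = TILE_H = 8                              # 8-by-8 font tiles
COLS, ROWS = 640 // TILE_W, 480 // TILE_H        # 80-by-60 tile screen

def tile_of_pixel(pixel_x, pixel_y):
    """Map a pixel position to (tile column, tile row, column in tile, row in tile)."""
    return pixel_x // TILE_W, pixel_y // TILE_H, pixel_x % TILE_W, pixel_y % TILE_H

def font_rom_address(char_code, row_in_tile):
    """Address of one 8-bit pattern row in the assumed 2048 x 8 font ROM."""
    return char_code * TILE_H + row_in_tile

def pixel_is_on(font_rom, char_code, col_in_tile, row_in_tile):
    """Read one pixel of a character, assuming bit 7 is the leftmost column."""
    row_bits = font_rom[font_rom_address(char_code, row_in_tile)]
    return (row_bits >> (TILE_W - 1 - col_in_tile)) & 1

# Tiny demonstration ROM holding a single made-up pattern row for code 0x41
font_rom = [0] * 2048
font_rom[font_rom_address(0x41, 0)] = 0b00011000
print(tile_of_pixel(639, 479), pixel_is_on(font_rom, 0x41, 3, 0))
```

Since 8 is a power of two, the divisions and remainders reduce in hardware to taking the upper and lower bits of the pixel counters.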
$\alpha_i = \tan^{-1} 2^{-i}$
The angles are defined in a 32-bit binary representation in which the full circle is divided into
2^32 steps, so that an angle just under 360 degrees is represented by
1111_1111_1111_1111_1111_1111_1111_1111. In this way the look-up table shown below
was created.
After creating the LUT, we check the angle of rotation. If the angle lies within the range ±π/2,
no initial rotation is needed. However, if the angle is outside this range, an initial
rotation is required. This is because the sum of all the angles in our LUT is
99.88296578 degrees, which is the largest angle the iterations alone can reach.
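A table of this kind can be generated with a few lines of Python. The sketch below is an illustration under assumptions: it takes the full circle to be 2^32 steps, as the 32-bit angle format suggests, and uses a table length of 32 entries; the names `ATAN_TABLE` and `degrees_to_fixed` are hypothetical.

```python
import math

ANGLE_BITS = 32
FULL_CIRCLE = 1 << ANGLE_BITS      # 2^32 steps represent 360 degrees
N_ENTRIES = 32                     # assumed table length

def degrees_to_fixed(deg):
    """Convert degrees to the 32-bit angle format (wraps at 360 degrees)."""
    return round(deg / 360.0 * FULL_CIRCLE) % FULL_CIRCLE

# Entry i holds atan(2^-i) expressed in 32-bit angle units
ATAN_TABLE = [
    round(math.atan(2.0 ** -i) / (2.0 * math.pi) * FULL_CIRCLE)
    for i in range(N_ENTRIES)
]

# Largest angle reachable by the iterations alone
total_degrees = sum(math.degrees(math.atan(2.0 ** -i)) for i in range(N_ENTRIES))

for i, entry in enumerate(ATAN_TABLE[:4]):
    print(f"atan(2^-{i}) = 32'h{entry:08X}")
print(f"sum of table angles = {total_degrees:.5f} degrees")
```

The sum exceeds 90 degrees, so every angle in the range ±π/2 is reachable; angles in the other two quadrants need the initial rotation described above.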
case (quadrant)
  2'b00, 2'b11: // no pre-rotation needed for these quadrants
    begin
      // since An = 1.647, divide the input by 2 and then multiply by 2^EXTRA_BITS
      X[0] <= {Xin[WI-1], Xin} << (EXTRA_BITS-1);
      Y[0] <= {Yin[WI-1], Yin} << (EXTRA_BITS-1);
      Z[0] <= phase_acc;
    end
  2'b01: // pre-rotate the input vector by +pi/2
    begin
      X[0] <= {NYin[WI-1], NYin} << (EXTRA_BITS-1);
      Y[0] <= {Xin[WI-1], Xin} << (EXTRA_BITS-1);
      Z[0] <= {2'b00, phase_acc[29:0]}; // subtract pi/2 from phase_acc for this quadrant
    end
  2'b10: // pre-rotate the input vector by -pi/2
    begin
      X[0] <= {Yin[WI-1], Yin} << (EXTRA_BITS-1);
      Y[0] <= {NXin[WI-1], NXin} << (EXTRA_BITS-1);
      Z[0] <= {2'b11, phase_acc[29:0]}; // add pi/2 to phase_acc for this quadrant
    end
endcase
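The quadrant handling in the case statement above can be mirrored in a small Python model to see why rewriting the top two bits of the angle is equivalent to adding or subtracting π/2. This sketch assumes that `quadrant` is taken from the two most significant bits of `phase_acc` and that `NXin` and `NYin` are the negated inputs; the scaling shift by `EXTRA_BITS-1` is left out, and the function name `pre_rotate` is hypothetical.

```python
QUADRANT_SHIFT = 30                  # top two bits of the 32-bit angle
LOW_MASK = (1 << QUADRANT_SHIFT) - 1

def pre_rotate(xin, yin, phase_acc):
    """Return (x0, y0, z0) after the quadrant-dependent pre-rotation."""
    quadrant = (phase_acc >> QUADRANT_SHIFT) & 0b11
    low = phase_acc & LOW_MASK
    if quadrant in (0b00, 0b11):     # already within +/- pi/2
        return xin, yin, phase_acc
    if quadrant == 0b01:             # rotate by +pi/2, so subtract pi/2 from the angle
        return -yin, xin, low
    # quadrant 0b10: rotate by -pi/2, so add pi/2 to the angle
    return yin, -xin, (0b11 << QUADRANT_SHIFT) | low

# 135 degrees is 0x60000000; it becomes a 45-degree rotation of the vector (0, 1)
print(pre_rotate(1, 0, 0x60000000))
```

After this step the remaining angle always lies in quadrant 00 or 11, that is within ±π/2, which the table of arctangents can cover.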
After reducing the angle to within the range that the CORDIC algorithm can handle, we perform
as many iterations as there are bits in the output. After these iterations we obtain the
result (after 22 clock cycles, since the output is 22 bits wide), as shown in the waveform
presented in the results section.
When the rotary encoder is rotated to the right, the waveform obtained is as shown in the figure below. Here
rotary_q1 indicates that the encoder has been rotated, and rotary_q2 indicates the direction of
rotation.
rotary_q1: high if ROT_A is high and ROT_B is high.
rotary_q2: high only when the rotation is to the left and low when the rotation is to the right.
By observing these signals, the counter value is increased or decreased in the 'rotary_encoder' module.
Two always blocks are used for this unit: one block detects the rotary event and its direction, and
the other increases or decreases the counter according to these events and directions.
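A behavioural model of the two signals can make the counting rule concrete. The Python sketch below is an assumption about how such a decoder is commonly built, not a transcription of the project's module: rotary_q1 is set when both inputs are high and cleared when both are low, rotary_q2 follows whichever single input is active, and an event is counted on the rising edge of rotary_q1. The mapping of ROT_A leading to "right" is also assumed, and the function name is hypothetical.

```python
def count_rotary_events(samples):
    """samples: iterable of (rot_a, rot_b) pairs, one per clock; returns the counter."""
    q1 = 0          # rotation indicator
    q2 = 0          # direction: 1 = left, 0 = right
    counter = 0
    for rot_a, rot_b in samples:
        prev_q1 = q1
        if rot_a and rot_b:
            q1 = 1                      # both high: detent passed
        elif not rot_a and not rot_b:
            q1 = 0                      # both low: back at rest
        elif rot_a:
            q2 = 0                      # only A high: assumed to mean turning right
        else:
            q2 = 1                      # only B high: assumed to mean turning left
        if q1 and not prev_q1:          # rising edge of q1 is one rotation event
            counter += -1 if q2 else 1
    return counter

right_click = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
left_click = [(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)]
print(count_rotary_events(right_click), count_rotary_events(left_click))
```

Because q1 changes only when both inputs agree, contact chatter on a single input cannot produce a false event, which is the main attraction of this style of decoder.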
1.17 Keyboard
In the Verilog code the keyboard read procedure is accomplished in two stages. In the first stage
a filter removes glitches and key bounce from the lines. The PS/2 clock is very slow
compared to the system clock (about 25 kHz against 50 MHz), which enables us to check the values of the clock
line and data line over multiple system clock cycles; if a line is low for the specified
number of clock cycles, it is taken to be low. In this way glitches and bounce are filtered
out. To implement this we have written these lines in the code.
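The filtering idea can be modelled in Python as a shift register whose output changes only when all recent samples agree. This is a sketch under assumptions, not the project's code: the depth of 8 samples, the idle-high initial state and the name `filter_line` are hypothetical.

```python
def filter_line(samples, depth=8, initial=1):
    """Return the filtered value after each sample of a PS/2 clock or data line."""
    history = [initial] * depth      # most recent sample first
    output = initial
    filtered = []
    for sample in samples:
        history = [sample] + history[:-1]
        if all(history):             # high for `depth` consecutive samples
            output = 1
        elif not any(history):       # low for `depth` consecutive samples
            output = 0
        # otherwise hold the previous output (glitch or bounce in progress)
        filtered.append(output)
    return filtered

# A one-sample glitch is ignored; a genuine low of eight samples gets through
line = [1, 1, 0, 1, 1] + [0] * 8 + [1] * 3
print(filter_line(line))
```

With a 50 MHz system clock an 8-sample window lasts 160 ns, which is far shorter than half a period of the PS/2 clock, so the filter delays the edges only slightly.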
This code makes h_sync high for pixel_count from 655 to 751, which is the retrace time (96 pixel clocks, as
described in the theory portion).
This portion of the code generates the v_sync signal, which is high from line count 489, pixel count
798, to line count 491, pixel count 798.
In addition to these signals, a blank signal is generated, which is high at the end of every line and after the last
line until the pixel count returns to the active region. The complement of this blank signal is the video_on signal,
which is used in the welcome and process text modules. These modules send RGB values to the VGA
port only when the video_on signal is high.
The table above shows that there is a slight error between the outputs of the CORDIC processor obtained from the
Verilog program and from the MATLAB program. The two main factors behind this deviation are:
1. Synthesizable Verilog does not support floating-point numbers. If more accuracy is
required, it can be obtained by using a fixed-point representation of the output, which
complicates the hardware to some extent.
2. In the kordic.v module, there are several equations in Stage 0 that are of the following
format:
// since An = 1.647, divide the input by 2 and then multiply by 2^EXTRA_BITS
X[0] <= {Xin[WI-1], Xin} << (EXTRA_BITS-1);
Y[0] <= {Yin[WI-1], Yin} << (EXTRA_BITS-1);
The sine and cosine waves generated using this algorithm are shown below.
Figure 37 Wave form showing sine and cosine values for one complete cycle
7. Limitations and Future Enhancement
Though we have tried to make our project perfect and fine, it contains some limitations because
of time constraints. Some of major limitations of our project are listed below:
CORDIC is a powerful algorithm and a popular choice for various
Digital Signal Processing applications. Implementing a CORDIC-based processor on an FPGA
gives us a powerful mechanism for performing complex computations on a platform that
provides a lot of resources and flexibility at a relatively low cost.
In this project a CORDIC module was designed and simulated in Xilinx ISE, with XST as the
synthesis tool. The output of the CORDIC core was analyzed and verified on the test bench and
compared with the values obtained from MATLAB.
Finally, the CORDIC processor was implemented on a Spartan-3E FPGA kit. We interfaced a PS/2 keyboard
and the board's rotary encoder to provide the angle input to the processor, and the result was displayed
on a CRT monitor through the VGA interface.
10. References
1. Jack E. Volder, "The CORDIC trigonometric computing technique," IRE Trans. Electronic
Computers, vol. EC-8, pp. 330–334, Sept. 1959.
2. Jack E. Volder, "The Birth of CORDIC," Journal of VLSI Signal Processing, vol. 25, pp. 101–105,
2000.
3. Bikash Poudel, Prasanna Kansakar, Sujit Rokka Chhetri, "A Comprehensive Approach to
Hardware/Software Co-design for Embedded Systems."
4. Ramesh Bhakthavatchalu, Parvathi Nair, Jismi K., Sinith M. S., "A Comparison of
Pipelined Parallel and Iterative CORDIC Design on FPGA," 2010 5th International
Conference on Industrial and Information Systems (ICIIS 2010), Jul. 29 - Aug. 1, 2010,
India.
5. Oskar Mencer, Luc Séméria and Martin Morf, "Application of
Reconfigurable CORDIC Architectures," Journal of VLSI Signal Processing Systems, vol. 24,
pp. 211–221, 2000.
6. Pramod K. Meher, Javier Valls, Tso-Bing Juang, K. Sridharan and Koushik Maharatna,
"50 Years of CORDIC: Algorithms, Architectures and Applications," IEEE Transactions
on Circuits and Systems I: Regular Papers, vol. 56, no. 9, Sept. 2009.
7. J. Villalba, T. Lang, and E. Zapata, "Parallel compensation of scale factor for the
CORDIC algorithm," J. VLSI Signal Process., vol. 19, no. 3, pp. 227–241, Aug. 1998.
8. PS/2 Mouse/Keyboard Protocol,
http://www.computer-engineering.org/ps2protocol/
9. PS/2 Keyboard Interface,
http://www.computer-engineering.org/ps2keyboard/
10. Xilinx, Spartan-3E Starter Kit Board User Guide, UG230 (v1.0), 2006.
11. Pong P. Chu, "FPGA Prototyping by Verilog Examples: Xilinx Spartan-3 Version,"
John Wiley & Sons, 2008.