I. INTRODUCTION
1063-8210 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 23, NO. 9, SEPTEMBER 2015
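$$
X(k) = \sum_{n=0}^{N-1} x(n)\, W_N^{nk}, \quad 0 \le k \le N-1, \quad W_N^{nk} = e^{-j 2\pi nk/N}. \tag{1}
$$

$$
n = 4n_1 + 2n_2 + n_3, \quad \text{where } n_1, n_2, \text{ and } n_3 = 0, 1 \tag{2}
$$

$$
k = k_1 + 2k_2 + 4k_3, \quad \text{where } k_1, k_2, \text{ and } k_3 = 0, 1. \tag{3}
$$

Substituting the index mapping in (2) and (3) into (1), the DFT
algorithm in (1) for N = 8 can be rewritten as follows:

$$
\begin{aligned}
X[k_1 + 2k_2 + 4k_3]
&= \sum_{n_3=0}^{1} \sum_{n_2=0}^{1} \sum_{n_1=0}^{1} x(4n_1 + 2n_2 + n_3)\, W_8^{(k_1 + 2k_2 + 4k_3)(4n_1 + 2n_2 + n_3)} \\
&= \sum_{n_3=0}^{1} \sum_{n_2=0}^{1} \sum_{n_1=0}^{1} x(4n_1 + 2n_2 + n_3)\, W_2^{k_1 n_1}\, W_8^{(2n_2 + n_3)k_1}\, W_8^{(2n_2 + n_3)(2k_2 + 4k_3)} \\
&= \sum_{n_3=0}^{1} \sum_{n_2=0}^{1} \left[ x(2n_2 + n_3) + (-1)^{k_1} x(2n_2 + n_3 + 4) \right] W_8^{(2n_2 + n_3)k_1}\, W_8^{(2n_2 + n_3)(2k_2 + 4k_3)}.
\end{aligned}
\tag{4}
$$

$$
\begin{aligned}
X(k) &= \sum_{n=0}^{1535} x(n)\, W_{1536}^{kn} \\
&= \sum_{m=0}^{511} x(3m)\, W_{1536}^{k(3m)} + \sum_{m=0}^{511} x(3m+1)\, W_{1536}^{k(3m+1)} + \sum_{m=0}^{511} x(3m+2)\, W_{1536}^{k(3m+2)} \\
&= \underbrace{\sum_{m=0}^{511} x(3m)\, W_{512}^{km}}_{\text{512-point FFT}}
 + W_{1536}^{k} \underbrace{\sum_{m=0}^{511} x(3m+1)\, W_{512}^{km}}_{\text{512-point FFT}}
 + W_{1536}^{2k} \underbrace{\sum_{m=0}^{511} x(3m+2)\, W_{512}^{km}}_{\text{512-point FFT}}.
\end{aligned}
\tag{5}
$$

$$
W_{1536}^{k+512} = W_{1536}^{512}\, W_{1536}^{k} = W_3^{1}\, W_{1536}^{k}, \tag{6}
$$

$$
W_{1536}^{k+1024} = W_{1536}^{1024}\, W_{1536}^{k} = W_3^{2}\, W_{1536}^{k}. \tag{7}
$$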
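As a quick numerical check of the N = 8 rewriting in (4), the sketch below evaluates the last line of (4), in which the sum over n1 has been folded into a two-input butterfly, and compares it with a direct evaluation of (1). The function names `twiddle`, `dft`, and `dft8_first_stage` are illustrative choices, not names from the paper, and the code is only a software check of the index mapping, not a model of the hardware.

```python
import cmath

def twiddle(N, e):
    """Return W_N^e = exp(-j*2*pi*e/N), as defined in (1)."""
    return cmath.exp(-2j * cmath.pi * e / N)

def dft(x):
    """Direct evaluation of (1) for a sequence of any length N."""
    N = len(x)
    return [sum(x[n] * twiddle(N, n * k) for n in range(N)) for k in range(N)]

def dft8_first_stage(x):
    """Evaluate the last line of (4) for N = 8 (illustrative helper)."""
    X = [0j] * 8
    for k1 in (0, 1):
        for k2 in (0, 1):
            for k3 in (0, 1):
                acc = 0j
                for n3 in (0, 1):
                    for n2 in (0, 1):
                        m = 2 * n2 + n3
                        # Butterfly produced by summing over n1 = 0, 1.
                        bf = x[m] + (-1) ** k1 * x[m + 4]
                        acc += bf * twiddle(8, m * k1) * twiddle(8, m * (2 * k2 + 4 * k3))
                # Output index follows the mapping in (3).
                X[k1 + 2 * k2 + 4 * k3] = acc
    return X
```

Calling `dft8_first_stage` and `dft` on the same 8-sample input should give results that agree to within floating-point rounding.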
In (10)–(12), the constant is sin(2π/3) ≈ 0.866 and j is the imaginary unit; Fig. 2 shows the corresponding
SFG, which comprises three stages. Each stage includes only
one butterfly structure (an incomplete topology); however, this
does not affect an implementation based on a classic SDF
pipeline scheme. Compared with the design in [18], the
proposed modification of the radix-3 SFG simplifies hardware
implementation by combining the complex multiplications by
this constant and j. The original design [18] requires two complex
multiplications, which increases the number of multiplexers
and control signals needed in hardware. The mapping of the modified
radix-3 FFT SFG to hardware is
described in Section II-E.
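To illustrate why a radix-3 butterfly needs only the real constant sin(2π/3) and a multiplication by j, the following sketch computes a 3-point DFT in the standard textbook form. It is an assumption-laden illustration: the function name `radix3_butterfly` and the grouping of terms are ours and are not taken from the SFG in Fig. 2.

```python
import cmath
import math

S = math.sin(2 * math.pi / 3)  # the real constant, about 0.866

def radix3_butterfly(x0, x1, x2):
    """3-point DFT using one real-constant scaling and one multiplication by j.

    Uses W_3^1 = -1/2 - j*S and W_3^2 = -1/2 + j*S (textbook identities).
    """
    total = x1 + x2
    diff = x1 - x2
    base = x0 - 0.5 * total      # multiplication by 1/2 is a shift in hardware
    rot = 1j * (S * diff)        # real-constant scaling, then multiplication by j
    return x0 + total, base - rot, base + rot

def dft3(x0, x1, x2):
    """Direct 3-point DFT, used only as a reference."""
    w = cmath.exp(-2j * cmath.pi / 3)
    return (x0 + x1 + x2,
            x0 + w * x1 + w * w * x2,
            x0 + w * w * x1 + w ** 4 * x2)
```

In this form the scaled difference term is computed once and shared by the second and third outputs, which is the kind of sharing that lets the two constant multiplications be combined.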
Fig. 1.
Fig. 2.
Fig. 3.
Fig. 4.
Fig. 5. Practical example of the proposed 1536-point computation scheme used in ST5 shown in Fig. 4.
$$
X(k) = \sum_{n=0}^{1535} x(n)\, W_{1536}^{kn}
= \sum_{r=0}^{2} \left[ \sum_{m=0}^{511} x(3m + r)\, W_{512}^{km} \right] W_{1536}^{rk}. \tag{13}
$$
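The decomposition in (13) can be sketched in software as three 512-point FFTs over the decimated inputs x(3m + r), combined with the twiddle factors W_1536^{rk}. The code below is a minimal floating-point illustration under our own assumptions: the helper names `fft_pow2`, `fft_3x_pow2`, and `dft_bin` are hypothetical, and a recursive radix-2 FFT stands in for whatever 512-point structure the hardware uses.

```python
import cmath

def fft_pow2(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    N = len(x)
    if N == 1:
        return list(x)
    even = fft_pow2(x[0::2])
    odd = fft_pow2(x[1::2])
    out = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(-2j * cmath.pi * k / N) * odd[k]
        out[k] = even[k] + t
        out[k + N // 2] = even[k] - t
    return out

def fft_3x_pow2(x):
    """Evaluate (13): N = 3*M with M a power of two (M = 512 for N = 1536)."""
    N = len(x)
    assert N % 3 == 0, "length must be a multiple of 3"
    M = N // 3
    assert M & (M - 1) == 0, "N/3 must be a power of two"
    # Inner sums of (13): one M-point FFT per residue r = 0, 1, 2.
    subs = [fft_pow2(x[r::3]) for r in range(3)]
    # Outer sum of (13); each M-point FFT is periodic in k with period M.
    return [sum(subs[r][k % M] * cmath.exp(-2j * cmath.pi * r * k / N)
                for r in range(3))
            for k in range(N)]

def dft_bin(x, k):
    """Direct evaluation of a single output bin, used only as a reference."""
    N = len(x)
    return sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
```

For a 1536-sample input, `fft_3x_pow2(x)[k]` should match `dft_bin(x, k)` to within rounding for any k; the reuse of each 512-point result at k, k + 512, and k + 1024 reflects the periodicity of W_512^{km} in k.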
TABLE I
CALCULATION OF EACH NODE IN EACH STAGE SHOWN IN FIG. 4
F. Processing Elements
The FFT SFGs shown in Figs. 1 and 2 require the
proposed architecture to include four types of PE to handle
FFT computations of various sizes. The first three types of PE
hardware modules (PE1–PE3) were described in [12]. Based on
the radix-3 FFT SFG shown in Fig. 2 and Table I, this paper
proposes the novel PE4 hardware module shown in Fig. 6.
In addition to performing butterfly operations, this module
can perform multiplications by j. Multiplication
by the constant sin(2π/3) requires only a complex-constant multiplier, comprising
two constant real-value multipliers. This complex-constant
multiplier design is detailed in the following section.
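The two operations mentioned above can be sketched at the level of real and imaginary parts. This is a hypothetical fixed-point illustration only: the word length `FRAC_BITS`, the quantized constant `C_Q`, and both function names are assumptions made for the example and are not taken from the PE4 design in Fig. 6.

```python
import math

FRAC_BITS = 10  # hypothetical fractional word length, not from the paper
# sin(2*pi/3) quantized to FRAC_BITS fractional bits (887 for 10 bits).
C_Q = round(math.sin(2 * math.pi / 3) * (1 << FRAC_BITS))

def scale_by_constant(re, im):
    """Scale an integer complex sample by the real constant.

    One constant multiplication per component, i.e. two real multipliers.
    The right shift floors the result toward negative infinity.
    """
    return (re * C_Q) >> FRAC_BITS, (im * C_Q) >> FRAC_BITS

def mul_by_j(re, im):
    """j * (re + j*im) = -im + j*re: a swap and one negation, no multiplier."""
    return -im, re
```

For example, `mul_by_j(*scale_by_constant(re, im))` applies both steps in sequence, which costs two constant multiplications rather than a full complex multiplication.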
Fig. 6.
Fig. 7.
Fig. 9.
Fig. 10.
TABLE III
SUMMARY OF PROPOSED CHIP DESIGN