conventional and BEC-based CSLAs to study the data dependence and to identify redundant logic operations. Based on this analysis, we have proposed a logic formulation for the CSLA. The main contributions in this brief are a logic formulation based on data dependence and an optimized carry generator (CG) and CS design.

Most digital signal processing methods use transforms such as the discrete cosine transform (DCT) or the discrete wavelet transform (DWT). Because these are essentially accomplished by repetitive application of multiplication and addition, the speed of the multiplication and addition arithmetic determines the execution speed and performance of the entire calculation. Multiply-and-accumulate (MAC) operations are typical of digital filters. Therefore, the functionality of the MAC unit enables high-speed filtering and other processing typical of DSP applications. Since the MAC unit operates completely independently of the CPU, it can process data separately and thereby reduce CPU load. Applications based on DSP, such as optical communication systems, require extremely fast processing of huge amounts of digital data. The Fast Fourier Transform (FFT) also requires addition and multiplication.

A MAC unit consists of a multiplier and an accumulator containing the sum of the previous successive products. The MAC inputs are obtained from the memory location and given to the multiplier block. The design consists of an 8-bit array multiplier, a 16-bit carry select adder and a register.

This paper is divided into six sections. The first section introduces the MAC unit. The second section discusses the detailed operation of the MAC unit. The third and fourth sections deal with the operation of the modified array multiplier and the carry select adder, respectively. The fifth section discusses the results obtained for the 8-bit MAC unit, and the sixth section gives the conclusion.

ARRAY MULTIPLIER

Consider the multiplication of two unsigned n-bit numbers, where $A = a_{n-1} a_{n-2} \ldots a_0$ is the multiplicand and $B = b_{n-1} b_{n-2} \ldots b_0$ is the multiplier. The product $P = p_{2n-1} p_{2n-2} \ldots p_0$ can be written as follows:

$$P = \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} a_i \, b_j \, 2^{i+j}$$

Fig.1 Multiplication Example

An array implementation is shown in Fig.2. In the 4×4 array multiplier, the multiplier array consists of 3 rows of carry-save adders (CSAs), in which each row contains 3 full adders (FAs). Each FA has three inputs and two outputs: the sum bit and the carry bit. The 3 FAs in the first CSA row, which have only two valid inputs, can be replaced by 3 half adders (HAs), and the 3 FAs in the last row can be constructed as a 3-bit ripple-carry adder.

Fig.2 Array Multiplier

On the other hand, the Baugh-Wooley multiplier uses the same array structure to handle 2's complement multiplication, with some of the partial products replaced by their complements. The multiplier array consists of (n-1) rows of carry-save adders (CSAs), in which each row contains (n-1) full adders (FAs). The last row is a ripple adder for carry propagation.

PROPOSED ADDER DESIGN

The proposed CSLA is based on the logic formulation given in (3a)-(3g), and its structure is shown in Fig.3.
It consists of one HSG unit, one FSG unit, one CG unit, and one CS unit. The CG unit is composed of two CGs (CG0 and CG1) corresponding to input-carry '0' and '1'. The HSG receives two n-bit operands (A and B) and generates the half-sum word S0 and the half-carry word C0, each n bits wide. Both CG0 and CG1 receive S0 and C0 from the HSG unit and generate two n-bit full-carry words $c_1^0$ and $c_1^1$ corresponding to input-carry '0' and '1', respectively. The logic diagram of the HSG unit is shown in Fig.3(b). The logic circuits of CG0 and CG1 are optimized to take advantage of the fixed input-carry bits. The optimized designs of CG0 and CG1 are shown in Fig.3(c) and (d), respectively.

Fig.3 Proposed architecture

The CS unit selects one final carry word from the two carry words available at its input using the control signal Cin. It selects $c_1^0$ when Cin = 0; otherwise, it selects $c_1^1$. The CS unit can be implemented using an n-bit 2-to-1 MUX. However, we find from the truth table of the CS unit that the carry words $c_1^0$ and $c_1^1$ follow a specific bit pattern. If $c_1^0(i) = 1$, then $c_1^1(i) = 1$, irrespective of S0(i) and C0(i), for 0 ≤ i ≤ n-1. This feature is used for logic optimization of the CS unit. The optimized design of the CS unit is shown in Fig.3(e), which is composed of n AND-OR gates. The final carry word c is obtained from the CS unit. The MSB of c is sent to the output as Cout, and the (n-1) LSBs are XORed with the (n-1) MSBs of the half-sum (S0) in the FSG [shown in Fig.3(f)] to obtain the (n-1) MSBs of the final sum (S). The LSB of S0 is XORed with Cin to obtain the LSB of S.

$$s_0(i) = A(i) \oplus B(i), \qquad c_0(i) = A(i) \cdot B(i) \tag{3a}$$
$$c_1^0(i) = c_0(i) + s_0(i) \cdot c_1^0(i-1), \quad \text{for } c_1^0(0) = 0 \tag{3b}$$
$$c_1^1(i) = c_0(i) + s_0(i) \cdot c_1^1(i-1), \quad \text{for } c_1^1(0) = 1 \tag{3c}$$
$$C(i) = c_1^0(i) \quad \text{if } c_{in} = 0 \tag{3d}$$
$$C(i) = c_1^1(i) \quad \text{if } c_{in} = 1 \tag{3e}$$
$$C_{out} = C(n-1) \tag{3f}$$
$$S(0) = s_0(0) \oplus c_{in}, \qquad S(i) = s_0(i) \oplus C(i-1) \tag{3g}$$

MAC IMPLEMENTATION

The multiplier-accumulator (MAC) operation is the key operation not only in DSP applications but also in multimedia information processing and various other applications. The MAC unit consists of a multiplier, the proposed adder and an accumulator. In this paper, we used an 8-bit array multiplier. The MAC inputs are obtained from the memory location and given to the multiplier block. Each input fed from the memory location is 8 bits wide. When the inputs are given to the multiplier, it computes the product of the two 8-bit values, and hence the output is 16 bits wide.

Fig.4 MAC Unit

The multiplier output is given as the input to the proposed carry select adder, which performs the addition. The output of the proposed carry select adder is 17 bits wide, i.e. 16 sum bits plus one carry bit. The output is then given to the accumulator. The output of the accumulator is either taken out or fed back as one of the inputs to the carry select adder. Fig.4 shows the basic architecture of the MAC unit.
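The datapath described above can be sketched as a small bit-level software model. This is only an illustrative sketch, not the hardware design evaluated in this paper: the function names `array_multiply`, `csla_add` and `mac` are hypothetical, the initial conditions in (3b) and (3c) are read as the carry entering bit 0 (0 for CG0, 1 for CG1), and the accumulator is assumed to feed only its low 16 bits back to the adder.

```python
def array_multiply(a, b, n=8):
    """Unsigned n x n product as the sum of partial products a_i * b_j * 2^(i+j)."""
    product = 0
    for i in range(n):
        for j in range(n):
            a_i = (a >> i) & 1
            b_j = (b >> j) & 1
            product += (a_i & b_j) << (i + j)
    return product


def csla_add(a, b, cin, n=16):
    """n-bit addition following (3a)-(3g); returns (sum, cout)."""
    # (3a) HSG: half-sum and half-carry words
    s0 = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(n)]
    c0 = [((a >> i) & 1) & ((b >> i) & 1) for i in range(n)]

    # (3b), (3c) CG0 and CG1: carry words for input-carry 0 and 1.
    # Assumption: the initial conditions are the carries entering bit 0.
    c10, c11 = [], []
    prev0, prev1 = 0, 1
    for i in range(n):
        prev0 = c0[i] | (s0[i] & prev0)
        prev1 = c0[i] | (s0[i] & prev1)
        c10.append(prev0)
        c11.append(prev1)

    # (3d), (3e) CS unit as AND-OR logic instead of a MUX.
    # This is valid because c10[i] == 1 implies c11[i] == 1.
    c = [c10[i] | (cin & c11[i]) for i in range(n)]

    cout = c[n - 1]                                      # (3f)
    s_bits = [s0[0] ^ cin]                               # (3g), LSB
    s_bits += [s0[i] ^ c[i - 1] for i in range(1, n)]    # (3g), FSG
    total = sum(bit << i for i, bit in enumerate(s_bits))
    return total, cout


def mac(pairs, n=8):
    """Accumulate the products of n-bit operand pairs; returns a (2n+1)-bit value."""
    width = 2 * n
    mask = (1 << width) - 1
    acc = 0
    for a, b in pairs:
        product = array_multiply(a, b, n)
        # Assumption: only the low 2n bits of the accumulator are fed back.
        total, cout = csla_add(product, acc & mask, 0, width)
        acc = (cout << width) | total
    return acc
```

For example, `mac([(3, 5), (10, 20), (255, 255)])` accumulates 15 + 200 + 65025. Comparing `csla_add` with ordinary integer addition over all 4-bit operands is a quick way to check that the AND-OR form of the CS unit matches the 2-to-1 MUX selection.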
Fig.7 Area Comparison
Fig.8 DAT Comparison
Layout of MAC by using Proposed SQRT CSLA:

CONCLUSION

We have eliminated all the redundant logic operations of the conventional CSLA and proposed a new logic formulation for the CSLA. The proposed CSLA design involves significantly less area and delay than the recently proposed BEC-based CSLA. Due to the small carry-output delay, the proposed CSLA is a good design for the SQRT adder. Based on these results, the CONV-CSLA, BEC, CBL, and proposed adders were placed in the MAC unit, and the proposed adder achieved better results. In future, the array multiplier can be replaced with any low-power multiplier, and the design can be extended to different bit widths.

REFERENCES

[1] Basant Kumar Mohanty and Sujitkumar Patel, "Area-delay-power efficient carry select adder," IEEE Transactions on Circuits and Systems, vol. 61, no. 6, pp. 418-422, Jun. 2014.

[4] O. J. Bedrij, "Carry-select adder," IRE Trans. Electron. Comput., vol. EC-11, no. 3, pp. 340-344, Jun. 1962.

[8] I.-C. Wey, C.-C. Ho, Y.-S. Lin, and C. C. Peng, "An area-efficient carry select adder design by sharing the common Boolean logic term," in Proc. IMECS, 2012, pp. 1-4.

[9] S. Manju and V. Sornagopal, "An efficient SQRT architecture of carry select adder design by common Boolean logic," in Proc. VLSI ICEVENT, 2013, pp. 1-5.