Documente Academic
Documente Profesional
Documente Cultură
dB(3:0) , then the BCD carry-out dcout is given by (6) and the
decimal sum dS(3:0) is obtained by (7)
TABLE I
C OMPARISON R ESULTS
TABLE II
C OMPARISON R ESULTS
IV. R ESULTS
In this section, two different implementations of the novel
1-digit BCD adder are characterized. The first one adopts the
clock scheme used in [16] and [23]–[25], whereas in the second
implementation, the 2-D wave clocking (2DDWAVE) mecha-
nism proposed in [28] is applied. Finally, a 2-digit BCD adder propagated through the n − 2 subsequent DAs to be finally
designed as described here is presented. All the proposed circ- absorbed by the most significant DA that furnishes the digit
uits were implemented using the QCADesigner tool [27] adopt- n−1
dS(3:0) and the carry-out dcoutn . Therefore, the worst case de-
ing the simulation setup as in [6]. The first implemented layout lay τw of the n-digit decimal adder depicted in Fig. 6 is given by
is reported in Fig. 4. The generic operation is performed within (8), where τg and τp are the delays required to generate dcout1
12 clock phases (i.e., three clock cycles). Moreover, the adder and to propagate it to the next DA. The most significant DA in
uses only 1065 cells within an overall area of just 0.89 μm2 . the chain will absorb the received carry signal with the delay τa
Data summarized in Table I allow a fair comparison with
existing competitor circuits to be performed. When compared τw = τg + (n − 2) · τp + τa . (8)
to the CFA-based circuit presented in [23], the 1-digit BCD
adder proposed here occupies ∼35% less area and exhibits a Table II reports the generation, propagation, and absorption
computational time ∼36% lower. With respect to the CLA- delays, expressed in terms of the number of clock phases, of
based adder, the area reduction achieved with the novel circuit the novel 1-digit BCD adder here proposed and those achieved
is ∼52%, even though its computational time is 20% higher. It by the CFA- and the CLA-based adders described in [23]. The
is worth underlining that this time overhead is due to the limited aforementioned given data were then exploited to estimate the
wire lengths adopted within the novel adder, in which, in overall computational delay τw of the compared architectures
contrast to [23], wires longer than 16 cascaded cells are avoided. versus n. Results reported in Fig. 6 clearly show that the
Nevertheless, the advantages offered by the novel BCD adder proposed QCA-based BCD adder architecture can reach better
can be better appreciated considering that the high-precision speed performances than its competitors.
computations, in which decimal arithmetic is required, usually Two DAs designed as proposed here have been cascaded to
process n-digit operands, with n > 1. An n-digit decimal adder implement the 2-digit decimal adder. The produced layout uses
can be structured as illustrated in Fig. 5, in which n 1-digit 2574 cells, occupies an overall area of 2.74 μm2 , and operates
adders (DAs) are cascaded in a ripple-carry fashion. The ith within 18 clock phases. It is worth noting that it is more than
DA, with i = 0, . . . , n − 1, receives as inputs the digits dAi(3:0) twice as large as the 1-digit adder because additional cells
i are needed to properly synchronize input and output signals,
and dB(3:0) of the operands together with the carry signal maintaining the geometric regularity of the layout.
i
dcout produced by the previous DA and furnishes the digit As deeply explained in [28], the zone clocking scheme
i
dS(3:0) of the decimal sum and the carry dcouti+1 . adopted within QCA-based designs can significantly affect the
As it is well known, adder architectures based on the ripple- topology of the underlying clock distribution network, thus
carry logic perform the most delay critical addition when the leading to more or less feasible designs. In particular, some
carry signal dcout1 , generated by the least significant DA, is clocking mechanisms allow the clocking circuitry required
COCORULLO et al.: DESIGN OF EFFICIENT BCD ADDERS IN QUANTUM-DOT CELLULAR AUTOMATA 579
R EFERENCES
[1] C. S. Lent, P. D. Tougaw, W. Porod, and G. H. Bernestein, “Quantum
cellular automata,” Nanotechnology, vol. 4, no. 1, pp. 49–57, 1993.
[2] V. Pudi and K. Sridharan, “New decomposition theorems on majority
logic for low-delay adder design in quantum dot cellular automata,”
IEEE Trans. Circuits Syst. II, Exp. Brief , vol. 59, no. 10, pp. 678–682,
Oct. 2012.
[3] H. Cho and E. E. Swartzlander, Jr., “Adder design and analyses for
quantum-dot cellular automata,” IEEE Trans. Nanotechnol., vol. 6, no. 3,
pp. 374–383, May 2007.
Fig. 7. Adopted clocking scheme. [4] H. Cho and E. E. Swartzlander, Jr., “Adder and multiplier design in
quantum-dot cellular automata,” IEEE Trans. Comput., vol. 58, no. 6,
pp. 721–727, Jun. 2009.
[5] V. Pudi and K. Sridharan, “Efficient design of a hybrid adder in quantum-
dot cellular automata,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst.,
vol. 19, no. 9, pp. 1535–1548, Sep. 2011.
[6] S. Perri, P. Corsonello, and G. Cocorullo, “Area-delay efficient binary
adders in QCA,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst.,
vol. 22, no. 5, pp. 1174–1179, May 2014.
[7] L. Lu, W. Liu, M. O’Neill, and E. E. Swartzlander, Jr., “QCA systolic array
design,” IEEE Trans. Comput., vol. 62, no. 3, pp. 548–560, Mar. 2013.
[8] S. Perri, P. Corsonello, and G. Cocorullo, “Design of efficient binary com-
parators in quantum-dot cellular automata,” IEEE Trans. Nanotechnol.,
vol. 13, no. 2, pp. 192–202, Feb. 2014.
[9] S. Perri and P. Corsonello, “New methodology for the design of effic
ient binary addition in QCA,” IEEE Trans. Nanotechnol., vol. 11, no. 6,
pp. 1192–1200, Jun. 2012.
[10] C. A. Krygowski et al., “Functional verification of the IBM system z10
processor chipset,” IBM J. Res. Dev., vol. 53, no. 1, pp. 1–11, Jan. 2009.
Fig. 8. New BCD adder based on the 2DDWAVE clocking scheme. [11] B. Sinharoy et al., “IBM POWER8 processor core microarchitecture,”
IBM J. Res. Dev., vol. 59, no. 1, pp. 1–21, Jan./Feb. 2015.
to polarize the QCA cells to be significantly simplified, and [12] L. Eisen et al., “IBM POWER6 accelerators: VMX and DFU,” IBM J.
consequently, they lead to more feasible designs. Res. Dev., vol. 51, no. 6, pp. 1–21, Nov. 2007.
In order to demonstrate that the BCD adder here proposed [13] Sparc64TM X/X+ Specification Ver. 29.0 Fujitsu, Kawasaki, Japan,
Jan. 2015. [Online]. Available: http://www.fujitsu.com
can be efficiently implemented with different clocking schemes, [14] “SilMinds DFP adder units,” SilMinds, Cairo, Egypt. [Online]. Available:
it was designed by also applying the 2DDWAVE mechanism http://www.silminds.com/ip-products/dfp-units/56
proposed in [28] and partitioning the cells as schematized in [15] IEEE Standard for Floating-Point Arithmetic, IEEE Working Group of the
Fig. 7, where the size of each clock zone is also shown in terms Microprocesser Standards Subcommittee, IEEE Std. 754-1985, 2008.
[16] M. Gladshtein, “Quantum-dot cellular automata serial decimal adder,”
of the number of cells. IEEE Trans. Nanotechnol., vol. 10, no. 6, pp. 1377–1382, Jun. 2011.
The BCD adder implemented in this way is illustrated in [17] E. M. Schwarz, J. S. Kapernick, and M. FCowlishaw, “Decimal floating-
Fig. 8. As summarized in Table I, the adder performs its generic point support on the IBM system z10 processor,” IBM J. Res. Dev., vol. 53,
no. 1, pp. 1–10, Jan. 2010.
operation within 14 clock phases (i.e., 3 1/2 clock cycles) [18] R. D. Kenney and M. J. Schulte, “High-speed multioperand decimal
and uses 1196 cells within an overall area of only 1.36 μm2 . adders,” IEEE Trans. Comput., vol. 54, no. 8, pp. 953–963, Aug. 2005.
Delay and area overheads due to the different partitioning [19] C. Tsen, S. Gonzalez-Navarro, M. J. Schulte, and K. Compton, “Hard-
approach used with the 2-D clocking scheme are the reasonable ware designs for binary integer decimal-based rounding,” IEEE Trans.
Comput., vol. 60, no. 5, pp. 614–627, May 2011.
price to pay to simplify the clock circuitry and to achieve [20] L. Han and S. B. Ko, “High-speed parallel decimal multiplication with
higher feasibility. Nevertheless, if the CFA- and the CLA- redundant internal encodings,” IEEE Trans. Comput., vol. 62, no. 5,
based counterparts will be realized using the 2-D clocking pp. 956–968, May 2013.
[21] A. A. Bayrakci and A. Akkas, “Reduced delay BCD adder,” in Proc. IEEE
scheme, then it would be reasonable to expect similar impacts Int. Conf. Appl.-Specific, Syst., Archit. Process., Montreal, QC, Canada,
on performances and area. Thus, the proposed circuit would Jul. 2007, pp. 266–271.
maintain the advantages as discussed earlier. [22] C. Sundaresan, C. V. S. Chaitanya, P. R. Venkateswaran, S. Bhat, and
J. M. Kumar, “Modified reduced delay BCD adder,” in Proc. IEEE
Int. Conf. Biomed. Eng. Informat., Shanghai, China, Oct. 2011,
V. C ONCLUSION pp. 2148–2151.
A new design approach has been presented and demonstrated [23] W. Liu, L. Lu, M. O’Neill, and E. E. Swartzlander, “Cost-efficient decimal
adder design in quantum-dot cellular automata,” in Proc. IEEE Int. Symp.
to achieve efficient QCA-based implementations of decimal Circuits Syst., Seoul, South Korea, May 2012, pp. 1347–1350.
adders. Unconventional logic formulations and purpose- [24] M. Taghizadeh, M. Askari, and K. Fardad, “BCD computing structures
designed logic modules here proposed allow outperforming in quantum-dot cellular automata,” in Proc. IEEE Int. Conf. Comput.
Commun. Eng., Kuala Lampur, Malaysia, May 2008, pp. 1042–1045.
decimal adders known in the literature. In fact, the new 1-digit [25] F. Kharbash and G. M. Chaudhry, “The design of quantum-dot cellular
BCD adder exhibits computational delay and area occupancy automata decimal adder,” in Proc. IEEE Int. Multitopic Conf., Karachi,
up to 36% and 52% lower than existing competitors. These Pakistan, Dec. 2008, pp. 71–75.
advantages become even more evident when two n-digit dec- [26] V. Pudi and K. Sridharan, “Low complexity design of ripple carry and
Brent-Kung adders in QCA,” IEEE Trans. Nanotechnol., vol. 11, no. 1,
imal numbers must be summed. The 2-digit adder designed pp. 105–119, Jan. 2012.
as proposed here operates within only 18 clock phases and [27] K. Walus and G. A. Jullien, “Design tools for an emerging SoC tech-
occupies an area of just 2.74 μm2 . Finally, by exploiting the nology: Quantum-dot cellular automata,” Proc. IEEE, vol. 94, no. 6,
pp. 1225–1244, Jun. 2006.
2-D wave clocking scheme, a more feasible implementation [28] V. Vankamamidi, M. Ottavi, and F. Lombardi, “Two-dimensional schemes
of the new adder has been obtained without compromising the for clocking/timing of QCA circuits,” IEEE Trans. Comput.-Aided Des.
advantages achieved over its direct competitors. Integr. Circuits Syst., vol. 27, no. 1, pp. 34–44, Jan. 2008.