Documente Academic
Documente Profesional
Documente Cultură
5, MAY 2014
Abstract— Recently, a novel dual mode logic (DML) family was A basic DML gate is composed of any static logic fam-
proposed. This logic allows operation in two modes: 1) static and ily gate, which can be a conventional CMOS gate, and an
2) dynamic modes. DML gates, which can be switched between additional transistor. DML gates have a very simple and
these modes on-the-fly, feature very low power dissipation in
the static mode and high performance in the dynamic mode. intuitive structure, requiring unconventional sizing method-
A basic DML gate is very simple and is composed of any ology to achieve the desired performance. Conventional LE
static logic family gate and an additional clocked transistor. In methodology cannot be used with the DML family as
this paper, we introduce the logical effort (LE) methodology it does not consider its unconventional sizing rules and
for the CMOS-based DML family. The proposed methodology topology.
allows path length minimization, delay optimization, and delay
estimation of DML logic. This is done by development of complete The objective of this paper is to develop a simple method
and approximated LE models, which allows easy extraction of for minimizing delays and achieving an optimized number of
design optimization parameters, such as optimum number of stages in logical paths containing CMOS-based DML gates.
stages, gates sizing factors, and delay estimations. The proposed A unified LE method is introduced for the delay evaluation
optimization is shown for the dynamic mode of operation. and optimization of logic paths constructed with DML logic
Theoretical mathematical analysis is presented, and efficiency of
the proposed methodology is shown in a standard 40-nm CMOS gates. DML-LE answers complete (unapproximate) design
process. problems, which can be solved numerically, and simplifies
these problems to a straightforward and easy computational
Index Terms— Dual mode logic, high performance, logical
effort, low power, optimization. problem [approximate and semiapproximate (SA) solutions]
with a unified analytic model. With this model, we can
I. I NTRODUCTION estimate the minimum to maximum error under delay approx-
imation and the error in the target optimum number of stages
L OGIC optimization and timing estimations are basic
tasks for digital circuit designers. The logical effort (LE)
method was first presented by Sutherland et al. [1]–[4] for easy
for a given logic function. The efficiency of the developed
method is shown by a comparison of the theoretical results,
achieved using the proposed method with simulation results of
and fast evaluation and optimization of delay in CMOS logic
the Cadence Virtuoso optimizer tool using a standard 40-nm
paths. Because of its elegance, the LE method has become a
technology.
very popular tool for designing and education purposes and
The rest of this paper is organized as follows: a review
is adopted to be the basis for several computer-aided-design
of the DML family is described in Section II. DML-LE
tools [3]–[7]. Although LE is mainly used for standard CMOS
model for simple inverter chains is developed in Section III
logic, it is also shown to be useful for other logic families, such
with three different levels of approximations. In Section IV
as the pass transistor logic [8].
we compare the methods by simplicity and accuracy. The
Recently, we proposed the novel dual mode logic (DML),
dependence of the optimum number of stages is also described
which provides the designer with a very high level of flexibility
in Section IV, which provides an intuitive graphical visual-
[9]–[11]. It allows on-the-fly switching between two modes
ization of the problem. DML-LE is expanded to complex
of operation: 1) static and 2) dynamic modes. In the static
nets containing branching in Section V. In Section VI, the
mode, DML gates achieve very low power dissipation, with
efficiency of the DML-LE theoretical optimization is examined
some degradation in performance, as compared with standard
for a standard 40-nm process. A discussion and conclu-
CMOS. On the other hand, dynamic operation of DML gates
sions of the use and applicability of DML-LE are given in
achieves very high speed at the expense of increased power
Section VII.
dissipation.
Manuscript received October 18, 2012; revised February 15, 2013; accepted
March 30, 2013. Date of publication May 16, 2013; date of current version II. DML OVERVIEW
April 22, 2014.
I. Levi and A. Belenky are with the Department of Electrical and com- As previously mentioned, a basic DML gate architecture is
puter Engineering, Ben-Gurion University, Beer-Sheva 84105, Israel (e-mail: composed of a static gate and an additional transistor M1,
itamarlevi@gmail.com; belenky@ee.bgu.ac.il). whose gate is connected to a global clock signal [9]–[11]. In
A. Fish is with the Faculty of Engineering, Bar-Ilan University, Ramat Gan
52900, Israel (e-mail: afish@ee.bgu.ac.il). this paper, we specifically focus on DML gates that utilize
Digital Object Identifier 10.1109/TVLSI.2013.2257902 conventional CMOS gates for the static gate implementation.
1063-8210 © 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
LEVI et al.: LE FOR CMOS-BASED DML GATES 1043
TABLE I
This is the most optimal and accurate solution for DML
I NVERTER C HAIN S IZING FACTORS Si OF THE C A M ETHOD
inverter chain sizing. However, solving it is a very exhaust-
ing task. This unapproximated solution (CS) is much more
S1 S2 S3 S4 S5 SN S N +1
complex than a simple CMOS LE optimal solution, which
√ √ √ N N
is derived with no assumptions and approximations, as is 1 μn/ p A0.5 A μn/ p A1.5 A2 μn/ p A( 2 −0.5) A2
shown in Appendix I. DML CS method complexity is due
to a nonstandard sizing of transistors, connected in parallel to
the Clk-ed transistor. By differentiating d D/dsi = 0 and following the same
Following assumptions will be used in the rest of this procedure (Section B) for all odd i (11) and all even i (12):
paper. First, as solved in last section, the first gate of any
examined chain will be minimum sized, i.e., S1 =1. S1 can be Si Si+1 1
= (11)
generalized to any possible size in accordance with any input Si−1 Si μn/ p
capacitance. Second, even number of stages N is assumed. Si Si−1
= μn/ p . (12)
This is due to the topology of DML chains that basically Si−1 Si
consists of Type B gates following Type A gates. However,
The sizing factors solution to this CA approach is quite
the solution for the chain, that has an odd number of stages,
similar to standard CMOS solution. Similarly to CMOS, the
can be easily derived using the same methodology.
upsizing factor is constant, however all even stages are factored
√
by an additional μn/ p . For the N-size chain, the sizing
D. Complete Approximated (CA) Method for the Sizing factors can be written in series as in Table I where A is
Factors of DML Inverter Chain expressed in (14). In CMOS, the sizing factors are derived
To reduce the complexity of the LE method, a CA from the load to input capacitance ratio, whereas in DML, they
solution, which trades off the complexity and accuracy, is are determined by the ratio of the first to last sizing factors
developed. √
CLoad N
It is previously discussed that (5) describes a general delay In CMOS: F = = f N , f = F. (13)
expression for the whole chain, assuming an even number of Cin,g
inverters N. The CA method assumes that the contribution SN+1 N S N+1 N
In DML: = A 2 , FDML = = f DML 2 ,
of minimal transistors to the drain and gate capacitances S1 S1
N
is negligible in comparison with 2Si and with Si+1 for all A = f DML = 2 FDML (14)
stages of the chain. As shown in Section V, neglecting these
transistors, for complex gates improves the accuracy with where assuming S1 = 1, S N+1 can be extracted from
respect to inverters. Next, (5) can be expressed by
(S N+1 + 1) W L min SN+1 + 1 CLoad
= = . (15)
D= Di (S1 + 1) W L min 2 Cin,g
N ⎛
The presented methodology supports calculation of sizing
⎜ (2S +1) Si+1 +1
factors for a given N size chain of inverters. The proposed
⎜ i
= t p0_DML ⎜ γ + method will be extended to derive the optimal chain length
⎝ 3Si 2Si
odd _i Nopt under a given load capacitance. The problem will be
Type _ A ⎞ defined in a similar manner derived by Sutherland et al.
[1]–[3].
(2Si +1) Si+1 +1 ⎟
⎟ Consider a path of DML gates containing n 1 stages, to
+ μn/ p γ + ⎟. (9)
3Si 2Si ⎠ which we append n 2 additional DML inverters to obtain a path
even_i
Type _ B with N = n 1 + n 2 stages. We assume the original n 1 stages
These assumptions are justified only when the output load cannot be altered, besides scaling, as they perform necessary
capacitance of the chain is large. The large load capacitance logic functions. However, any positive n 2 can be chosen to
affects the sizing factors Si . When Si is increasing, i increases, reduce the delay of the chain. In addition, assuming that the
along the chain, this approximation will increase in accuracy optimum length will be larger than n 1 , we further assume
for large i ’s. After simplification, (9) can be rewritten as that n 2 is even and the logic function would not be altered.
follows: It is known that adding CMOS inverters to a path does not
affect the electrical effort of the path. As the sizing of DML
D= Di inverters is quite different then CMOS and they are added in
N the consequent Type A → Type B structures, the electrical
= t p0
⎛_D M L ⎞ effort of the path is changed by adding these DML buffers.
Using the term for LE in (3), it can be shown that LE of the
⎜
⎜ 2
2 Si+1 ⎟⎟ DML A − B inverter pair can be approximated by (for large
Si+1
×⎜ γ + + μn/ p γ + ⎟.
⎝ 3 2Si 3 2Si ⎠ i ’s) as follows:
odd _i even_i
Type _ A Type _ B 1 μn/p
(10) LEinvA · LEinvB | LARGSi s ∼
= · .
2 2
1046 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 22, NO. 5, MAY 2014
⎧ ⎫
⎪
⎪ ⎪
⎪
⎪
⎪ ⎪
⎪
⎪
⎪ LE_DML ⎪
⎪
⎪
⎪ p_DML ⎪
⎪
⎪
⎪ ⎪
⎪
⎨ N
⎪ X S_gate Si +1 bi CLoad_on
⎪
⎬
S_drain Si +N min _drain
D = t p0 + · · (19)
⎪
⎪ 2si 2 · si b_DM L X S_gate Si +1 ⎪⎪
DML ⎪
⎪
⎪
⎪ ⎪
⎪
⎪
⎪
⎪
⎪ ⎪
⎪
⎪ ( f ·b)_D M L ⎪ ⎪
⎪
⎪
⎩ ⎪⎭
EF_DML
LEVI et al.: LE FOR CMOS-BASED DML GATES 1049
i ’s in (26) as follows:
Si Si+1 (i )X i+1
= (25)
Si−1 Si μn/ p (i − 1)X i
Si Si+1 (i )X i+1
= μn/ p (26)
Si−1 Si (i − 1)X i
where for simplicity the next terms are defined as follows:
LEDML_i bDML_i
i = ; X s_gate_i = X i .
X s_gate_i Fig. 8. Six CMOS inverter chain delay as a function of the normalized load
capacitance for the DML CS method and for Cadence optimizer solution.
The sizing factors series, Si , are shown in Table IV.
The required sizing factors (Table IV) are extracted from
SN+1 X N+1 ! N !
N
E FDML = N L E DML_i bDML_i (27)
S1 X 1
i−1 i−1
TABLE IV
C OMPLEX G ATES C HAIN S IZING FACTORS (Si ) FOR THE CA M ETHOD
S1 S2 S3 S4 S N +1
√ √
μn/ p 1 μn/ p 1
1 EF EF2 EF3 EF N
1 X 2 1 2 X 2 X 3 1 2 3 X 2 X 3 X 4 1 2 3 . . . N X 2 X 3 . . . X N +1
Fig. 13. DML and CMOS complex network delays, optimized using DML
and CMOS LEs, accordingly. In addition, the optimizer results for both
methods are presented.
VII. C ONCLUSION
A novel LE approach for CMOS-based DML logic networks
was presented. The proposed approach allowed an efficient
optimization of DML logic networks for maximum perfor-
mance in the dynamic mode of operation, which was the
focus of this paper. DML logic, optimized according to the
proposed LE methods, allowed extended flexibility in opti-
mizing various structures of DML networks. This optimization
utilized the DML inherent properties of significantly reduced
parasitic capacitance and ultralow power dissipation in the
Fig. 12. DML complex network scheme.
static operation mode [9]–[11]. This paper presented three
different approaches, which traded off between computation
overhead are extremely low, as compared with the optimizer complexity and accuracy. The complex CS method was only
solution. addressed for error analysis of the other methods. The CA
To investigate in-depth the performance of the SA, CS, method was identical to CMOS LE computation with very
and CA techniques, the deviation in delay from the complete small error and the SA method was also identical to the CMOS
solution, which is optimal for DML, is calculated, as shown LE computation aiding one more lookup table (which easily
in Fig. 11. derived for all cases and loads). We showed that with these
As expected, the CA precision is improved for larger loads. tools only a design can achieve very high performance results.
The precision of the SA is also very high for cases of small Advantages and drawbacks of each one of the methods were
loads (large N). Therefore, the SA method presents a good discussed. Simulation results, carried out in a standard 40-nm
tradeoff between computation complexity and precision for process, proved the efficiency of the proposed approach and
these cases. compared it with existing CMOS LE.
To evaluate the performance of an optimized DML complex
logic network, which includes branches (schematically shown A PPENDIX I
in Fig. 12) it is compared with the same CMOS network.
CMOS LE OVERVIEW
Although the CMOS network is optimized using a standard
LE, the DML network is sized according to the SA DML-LE LE is a simplified method of transistor sizing optimization
methodology. Fig. 13 shows the results of this comparison. to achieve an improved metrics of a combinational logic. The
As can be seen, application of DML LE effort in complex detailed description of the conventional LE semantics can
network results in maximum error of only ∼4.5% for small be found in [1], [2], [23]. These techniques are extended to
loads, as compared with the Cadence optimizer results. This include advance aspects of latest technologies such as tempera-
is very close to the ∼3.8% error, achieved by the CMOS LE ture/voltage [14], low voltage [15], [16], interconnect inclusion
optimization. [17], [18], energy-delay optimization [19], and complex cells
LEVI et al.: LE FOR CMOS-BASED DML GATES 1051
fitting [20]. According to this method, the gate normalized Conventional LE provides the well-explored solution for the
delay of stage i (Di) in a gates chain can be expressed as a upsizing of a given CMOS gates chain. The upsizing factors
sum of the stage (fi) and parasitic ( pi ) efforts as follows: and amount of gates required in the chain are constrained by
CLOAD, logical functions, area, delay, and power requirements
Di = f i + p i (31)
[1], [2], [21]. Initially, the chain delay is estimated as follows:
where f i = gi · h i · bi , gi is the LE of the stage; h i is the
N
N
electrical effort of the stage as follows: D = tpd = Di = tp0 ( pi ∗ γ + EFi ). (42)
Cout_i 1 1
hi (32) The optimal chain sizing considers upsizing each stage by
Cin_i
an optimal electrical effort (EFopt), which is given by
and bi is the branching effort of the stage as follows: #
√ ! !
Con_path_i + Coff_path_i N
EFopt = PE = N F ∗ LEi ∗ bi (43)
bi = . (33)
Con_path_i
where PE is the path effort and F is the Cload to input
We will use the terminology presented in [8]. According capacitance ratio. For a given chain, containing N CMOS
to this terminology, the LE of the stage is marked by LEi gates, N is not necessarily equal to the optimal number of
and the electrical effort will be marked by f i . The equation is stages, Nopt . If N < Nopt , a number of inverters can be added
normalized, for all parameters, to a simple inverter therefore to better fit the stage effort of the path and therefore to improve
as follows: its delay. For N < Nopt , EFopt is given by
Rgate_i C D,gate_i
pi . (34) EFopt = 3.6 (for γ = 1). (44)
Rinv C D,inv √
Rgate _i C G,gate_i For this case, Nopt is given by Nopt
PE = EFopt .
LEi = gi . (35) For N > Nopt , EFopt can be approximated as follows:
Rinv C G,inv
√
N
Using this terminology, the delay of a gate in stage i is as PE = EFopt . (45)
follows: EFopt (γ , Pinv dependent) may not be feasible in any path,
Di = tpd_i where N and PE are constrained.
⎛ ⎞
E Fi A PPENDIX II
⎜ Con_ pat h_i ⎟
⎜ ⎟ SA M ETHOD FOR THE S IZING FACTORS OF DML
= t p0 ⎜ pi ∗ γ + LEi ∗ bi ∗ ⎟
⎝ Cin_gate_i ⎠ I NVERTER C HAIN
hi Here, a SA approach is presented. The goal of the SA
= t p0 ( pi ∗ γ + EFi ) (36) approach, which is a compromise between the methods pre-
EFi = LEi .bi .h i (37) sented in Sections C and D, is to achieve relatively high
0.69Rinv Cd_inv precision with a reduced computation effort in respect to the
t P0 = CS method.
γ
Omitting all terms of the gate/drain capacitances in (5)
t p.inv = 0.69Rinv Cd_inv (38)
may lead to an increased error in calculation of the delay
where γ is a process parameter, deduced from (Section III-D). This error is mainly because of the first
Cd_inv and the second terms of (5). Therefore, the SA approach
= C g_inv . (39) proposes to approximate only terms starting from stage i = 3.
γ
Therefore, (5) turns to (46), shown at the bottom of the page.
In CMOS logic, the pMOS PUN to nMOS PDN sizing Differentiating (46), as shown at the bottom of the page, by
ratio is marked by β. β exists because of holes and electrons all Si and equating to zero leads to the following set of N
mobility difference (40) [21]. With a gate average resistance, expressions as follows:
PUN and PDN resistance ratio of (41) is as follows:
" dD S2 (γ + 1 + S3 )
# =0→ = μn/ p
Reqp μe d S2 S1 S2
βopt ≈ = (40) Si Si+1
Reqn μp ∀(odd_i > 1), (3, 5, 7 · · · ) : =
$ % Si−1 Si .μn/ p
R
Reqn + βeqp μe Si Si+1 .μn/ p
Rinv
; Reqp
Reqn . (41) ∀(even_i > 2), (4, 6, 8 · · · ) : = . (47)
2 μp Si−1 Si
⎛ ⎞
(2S1 +1) odd_i > 1
(S2 +1) (2Si +1) (Si+1 +1)
⎜ 3S1 γ + 2S1 +
Type_A : 3, 5, 7 . . . 3S1 γ + 2Si + ⎟
D = N Di = t p0_DML ⎜
⎝ $ % $ % ⎟
⎠ (46)
(2S2 +1) (S3 +1) even_i > 2 (Si+1 +1)
+ μn/ p 3S2 γ + 2S2 + μn/ p (2S3Si +1) γ +
Type_B : 4, 6, 8 . . . i 2Si
1052 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 22, NO. 5, MAY 2014
TABLE V
I NVERTER C HAIN S IZING FACTORS Si OF THE SA M ETHOD
S1 S2 S3 S4 S5 S N −1 SN
N−1
N −1
A0.5 A1.5 A 2
1 S2 ( A1 ) √ 1
μn/ p S2 ( A1 ) A1 S2 ( A1 ) √ 1
μn/ p S2 ( A1 ) A2 S
1 2 ( A1 ) √1
μn/ p S2 ( A1 )
The solution to this equation results in the Si sizing factors. where b = (A1 + 4(1 + γ ))−0.5 .
where S2 is the solution to the quadratic equation as follows: To obtain the optimal number of stages Nopt, we further
numerically solve (54) for A1 and substitute A1 in (53).
A1 μn/ p + A1 μn/ p + 4(γ + 1)μn/ p
S2 (A1 ) = (48)
2
R EFERENCES
and A1 is given by
[1] I. E. Sutherland and R. F. Sproull, “Logical effort: Designing for speed
2 on the back of an envelope,” in Proc. Univ. California/Santa Cruz Conf.
SN+1 2 N
A1 = √ . (49) Adv. Res. VLSI , 1991, pp. 1–16.
S1 1 + 1 + 4(γ + 1)/A1 [2] I. E. Sutherland, B. Sproull, and D. Harris, Logical Effort—Designing
Fast CMOS Circuits. San Mateo, CA, USA: Morgan Kaufmann, 1999.
Calculation of A1 requires extraction of SN+1 from CLoad [3] A. Kabbani, D. Al-Khalili, and A. J. Al-Khalili, “Delay analysis of
(15). The set of (47) can be solved for any CLoad and any N CMOS gates using modified logical effort model,” IEEE Trans. Com-
using M ATLAB or a similar tool to produce one lookup table put. Aided Design Integr. Circuits Syst., vol. 24, no. 6, pp. 937–947,
Jun. 2005.
for increased user convenience/automation. [4] B. Lasbouygues, S. Engels, R. Wilson, P. Maurine, N. Azemard, and
As can be seen, the difference between the SA and CA D. Auvergne, “Logical effort model extension to propagation delay
methods is the addition of the 4(γ + 1)μn/p term in (48). representation,” IEEE Trans. Comput. Aided Design Integr. Circuits
Syst., vol. 25, no. 9, pp. 1677–1684, Sep. 2006.
To conclude, to utilize the SA method, the sizing factors [5] S. K. Karandikar and S. S. Sapatnekar, “Technology mapping using
should be calculated from the FDML and A1 metrics as follows: logical effort for solving the load-distribution problem,” IEEE Trans.
N Comput. Aided Design Integr. Circuits Syst., vol. 27, no. 1, pp. 45–58,
A1 = f DML = 2 FDML (50) Jan. 2008.
S N+1 2 N [6] S. K. Karandikar and S. S. Sapatnekar, “Logical effort based technol-
FDML = √ = f DML
2
. (51) ogy mapping,” in Proc. IEEE/ACM Int. Conf. Comput. Aided Design,
S1 1 + 1 + 4(γ + 1)/A1 Nov. 2004, pp. 419–422.
[7] P. Rezvani, A. H. Ajami, M. Pedram, and H. Savoj, “LEOPARD: A
To calculate the optimal chain length Nopt , under a given Logical Effort-based fanout OPtimizer for ARea and delay,” in Proc.
CLoad , (50) and (51) are substituted in (46) to obtain the delay IEEE/ACM Int. Conf. Comput. Aided Design, Dig. Tech., Nov. 1999,
D as follows: pp. 516–519.
[8] J. M. Rabaey, A. P. Chandrakasan, and B. Nikolic, Digital Integrated
D = t0 p_D M L Circuits: A Design Perspective. Upper Saddle River, NJ, USA: Pearson
⎛(2s +1)
(2s2 +1) (s3 +1) ⎞
Education, 2003, ch. 4, p. 222.
1 (s2 +1) [9] A. Kaizerman, S. Fisher, and A. Fish, “Subthreshold dual mode logic,”
γ + +μn/ p γ +
⎜ 3s1 2s1 3s2 2s2 ⎟ IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 21, no. 5, pp.
⎜ & '⎟ 979–983, May 2013.
⎜ √ ⎟.
⎝ (N −2) 2 3 (N −2) μn/ p A10.5 ⎠ [10] I. Levi, O. Bass, A. Kaizerman, A. Belenky, and A. Fish, “High speed
+ γ (1+μn/ p ) + 2 dual mode logic carry look ahead adder,” in Proc. IEEE Int. Symp.
2 32 2 2 Circuits Syst., May 2012, pp. 3037–3040.
[11] I. Levi, A. Kaizerman, and A. Fish, “Low voltage dual mode logic:
(52) Model analysis and parameter extraction,” Excepted Elsevier, Microelec-
tron. J., vol. 12, no. 1, Jan. 2012.
Consequently, S1 − S3 from (48), Table V is substituted [12] N. F. Goncalves and H. De Man, “NORA: A racefree dynamic CMOS
in (52), which is then, using (53), N differentiated and zero technique for pipelined logic structures,” IEEE J. Solid-State Circuits,
equated as follows: vol. 18, no. 3, pp. 261–266, Jun. 1983.
[13] D. Harris and M. A. Horowitz, “Skew-tolerant domino circuits,” IEEE
J. Solid-State Circuits, vol. 32, no. 11, pp. 1702–1711, Nov. 1997.
2 ln S (1+√2SAN+1
1 +4(1+γ )) [14] C.-H. Wu, S.-H. Lin, and H. Chiueh, “Logical effort model extension
N A1 = 1
. (53) with temperature and voltage variations,” in Proc. 14th Int. Workshop
ln(A1 ) Thermal Invest. ICs Syst. THERMINIC, Sep. 2008, pp. 85–88.
Finally, we reach (54) that only contain A1 as follows: [15] M.-H. Chang, C.-Y. Hsieh, M.-W. Chen, and W. Hwang, “Logical effort
⎡ ⎡ ⎡ ⎤ ⎤ models with voltage and temperature extensions in super-/near-/sub-
threshold regions,” in Proc. VLSI Design, Autom. Test (VLSI-DAT), Int.
⎢√μ ⎢(A−0.5 + b) ⎢ 1 + − (1 + γ ) ⎥ ⎥ Symp., Apr. 2011, pp. 1–4.
⎢ n/ p ⎣ 1 ⎣ $ %2 ⎦ ⎥ [16] J. Keane, H. Eom, T.-H. Kim, S. Sapatnekar, and C. Kim, “Subthreshold
⎢ 8 2S ( A ) ⎥
⎢ √ 2
μn/ p
1
⎥ logical effort: A systematic framework for optimal subthreshold device
⎢ . ⎥ sizing,” in Proc. 43rd ACM/IEEE Design Autom. Conf., Jul. 2006,
⎢ 1 −0.5 ⎥
⎢ −0.5 A1 ⎥ pp. 425–428.
⎢ − A1 + N(A1 ). ∗ ⎥ [17] A. Morgenshtein, E. G. Friedman, R. Ginosar, and A. Kolodny, “Cor-
⎢ 4 4 ⎥
⎢ ⎥ rections to unified logical effort-a method for delay evaluation and
⎢⎡ ⎤ & ' ⎥ minimization in logic paths with RC interconnect,” IEEE Trans. Very
⎢ √ ⎥
⎢ ln(A1 ) γ μn/ p ∗ A10.5 ⎥ Large Scale Integr. (VLSI) Syst., vol. 18, no. 8, pp. 689–696, Aug. 2010.
⎣⎣ ⎦+ (1+μn/ p )+ = 0⎦ [18] A. Morgenshtein, E. G. Friedman, R. Ginosar, and A. Kolodny, “Unified
4b.(1+γ ) N( A1 )
A (1+b)
2 − A1
2 2 logical effort-a method for delay evaluation and minimization in logic
1 paths with interconnect,” IEEE Trans. Very Large Scale Integr. (VLSI)
(54) Syst., vol. 18, no. 5, pp. 689–696, May 2010.
LEVI et al.: LE FOR CMOS-BASED DML GATES 1053
[19] M. Alioto, E. Consoli, and G. Palumbo, “General strategies to design Alexander Fish (M’06) received the B.Sc. degree
nanometer flip-flops in the energy-delay space,” IEEE Trans. Circuits in electrical engineering from the Technion, Israel
Syst. I, Regular Papers, vol. 57, no. 7, pp. 1583–1596, Jul. 2010. Institute of Technology, Haifa, Israel, in 1999, and
[20] H. El-Masry and D. Al-Khalili, “Cell stack length using an the M.Sc. and Ph.D. (summa cum laude) degrees
enhanced logical effort model for a library-free paradigm,” in from Ben-Gurion University, Beer-Sheva, Israel, in
Proc. 18th IEEE Int. Conf. Electron. Circuits Syst., Dec. 2011, 2002 and 2006, respectively.
pp. 703–706. He was a Post-Doctoral Fellow with the ATIPS
[21] J. M. Rabaey, A. P. Chandrakasan, and B. Nikolic, Digital Integrated Laboratory, University of Calgary, Calgary, AB,
Circuits: A Design Perspective. Upper Saddle River, NJ, USA: Prentice- Canada, from 2006 to 2008. From 2008 to 2013,
Hall, 2003, p. 761. he headed the Low Power Circuits and Systems
[22] W. P. N. Chen, P. Su, J. S. Wang, C. H. Lien, C. H. Chang, K. Goto, and Lab, VLSI Systems Center, Ben-Gurion University’s
C. H. Diaz, “A new series resistance and mobility extraction method by Department of Electrical and Computer Engineering. In 2012, he joined the
BSIM model for nano-scale MOSFETs,” in Proc. VLSI Technol. Syst. Faculty with Bar-Ilan University, Ramat Gan, Israel, as an Associate Professor,
Appl. Int. Symp., Apr. 2006, pp. 1–2. and the Head of the Faculty of Engineering’s Nano-Electronics Track and the
Energy Efficient Electronics and Applications (EA 3 ) Labs. He has authored
over 90 scientific papers and patent applications. He has published two book
chapters. His current research interests include low voltage digital design,
energy efficient SRAM and Flash memory arrays, low power CMOS image
Itamar Levi received the B.Sc. degree in electrical sensors, and low power design techniques for digital and analog VLSI chips.
and computer engineering from Ben-Gurion Univer- Dr. Fish was a co-author of two papers received the Best Paper Finalist
sity, Beer-Sheva, Israel, in 2002. He is currently Awards at ICECS’04 and ISCAS’05 conferences. He received the Young
pursuing the M.Sc. degree in electrical and computer Innovator Award for Outstanding Achievements in the field of Information
engineering with Ben-Gurion University. Theories and Applications by ITHEA in 2005. In 2006 and 2012, he was
He has been with the VLSI Systems Center, Ben- honored with the Engineering Faculty Dean Teaching Excellence recognition
Gurion, since 2011, where he is responsible for with Ben-Gurion University. He serves as an Editor-in-Chief for the MDPI
various aspects of VLSI systems design: low-energy Journal of Low Power Electronics and Applications and as an Associate Editor
design, high-probability low energy logic families, for the IEEE S ENSORS and IEEE A CCESS J OURNALS . He was a co-organizer
adiabatic logic, and digital logic architecture. of many special issues and sessions in IEEE Journals and conferences.