CMOS Transistor Sizing for Minimizing Energy-Delay Product

CMOS Transistor Sizing for Minimization of Energy-Delay Product
Christophe Tretz and Charles Zukowski Department of Electrical Engineering Columbia University in the City of New York, New York NY 10027-6699 (212) 854-8478, tiger@vlsi.columbia.edu
Abstract
In this paper, we revisit three of the well known optimization results in CMOS transistor sizing with the energy-delay product as a new metric. We study the absolute sizes of and the ratio between n-channel and p-channel transistor widths in uniform logic, the optimal distance between repeaters in an RC line, and the optimum number of inverter stages, along with their sizes, needed to drive a large load capacitance. Results, both theoretical and numerical, show that in general the optimum solutions for energy-delay lead to smaller designs than the ones obtained for minimum delay.
1. Introduction

There are a number of well known results for CMOS transistor sizing based on rough RC timing models and the assumption that minimum delay is the goal. Examples include the ideal p-channel to n-channel ratio of $\sqrt{\mu_n/\mu_p}$ and the use of $\ln(C_L/C_0)$ stages with geometric width progression to drive a large load. These do not reflect any real situation exactly, but they provide a starting point for designer intuition and design iteration. Now that power considerations are becoming more important due to increased densities and portable applications, there is a need to consider the effects of energy efficiency on the old formulas.

In CMOS circuits, where energy is primarily used only during transitions, i.e. during actual computation, the fundamental metric related to power consumption is energy/operation, or power normalized to unit operating frequency. To evaluate a logic gate or logic chain design for power, we thus consider the energy required to propagate a single transition. Energy cannot generally be considered in isolation, however. As proposed in [1], we use the energy-delay product as our ultimate cost function for two reasons. First, it provides a simple way to consider delay along with energy, since low power is rarely the only goal in logic circuit design. Second, since adjusting the power supply voltage (including its form in adiabatic circuits) can trade speed directly for lower energy, it provides a good metric for energy efficiency in an optimization problem where the power supply is assumed fixed. A circuit with an improved energy-delay product can have its supply voltage adjusted to achieve lower energy for the same delay, or a lower delay for the same energy.

In [2], numerical algorithms to solve simultaneously driver and wire sizing for speed and power optimization are presented. This method, however, uses a weighting relation to optimize energy and delay separately. As a result, its optimum depends on the choice of the weights and differs from the energy-delay product optimum we present in this paper.

With the energy-delay product as a new metric, we revisit three of the well known optimization results in CMOS transistor sizing. First, we consider the sizing of n-channel and p-channel transistor widths, including their ratio. To match rising and falling delays, one generally uses the mobility ratio for a width ratio. To minimize simple pair delay, however, the optimum ratio is the square root of the mobility ratio; the loading effect of the p-channel transistor in every stage is thus balanced against its weaker drive at a logic stage with a rising output. Considering energy as well, there can be an even stronger push towards minimum-size transistors.

Second, we consider the optimal distance between repeaters in an RC line, a balance between the delay arising from extra stages and the advantage of cutting a line with a delay that increases quadratically with distance. Energy considerations favor a reduction in the number of repeaters, each of which increases the capacitance that must be charged and discharged.

Third, we consider the optimum number of inverter stages, along with their sizes, needed to drive a large capacitive load. Larger stage ratios must be used near the load when energy effects are considered, because in the conventional minimum-delay solution, most of the energy is consumed by the last stage.
0-8186-7502-0/96 $5.00 © 1996 IEEE
In the second section, we briefly review the first-order CMOS energy and delay models, and cover our basic assumptions. In the next three sections, we explore the effects of changing the cost function from delay to energy-delay product in each of the three situations mentioned above. Finally, we discuss some of the implications in the conclusion.
2. First Order Energy and Delay Models

For both the energy and delay of a CMOS logic gate, we consider only first-order inverter models that capture the basic trade-offs involved. This has been found to be very useful as a starting point in optimizations for delay alone, even for more general logic gates. Logic gates of course have more complex switches, but to first order complex switches can be modeled by single equivalent transistors.

For the energy model, we assume that the dominant component is associated with charging the load capacitance. For a standard inverter this is simply [3]:

$E = C_L V_{dd}^2$ (EQ 1)

In this paper we study only the effects of modifying the transistor sizing, so we assume that the power supply voltage is fixed. As a result, the energy is proportional to the capacitive load. In the case where one inverter is driving another, this load is then proportional to the sum of the pulldown and pullup widths in the following inverter, assuming that all transistors are of minimum length, as is almost always the case:

$E \propto (W_{n,i+1} + W_{p,i+1})$ (widths from load devices) (EQ 2)

For the delay model, we assume that the rising and falling delays are proportional to the load capacitance, and inversely proportional to the width of the corresponding driving transistor. If we assume that the transistors are sized to match rise and fall delay, and wire parasitics are insignificant, as we do in sections 4 and 5 for simplicity, we get [4]:

$t \propto \frac{W_{n,i+1}}{W_{n,i}}$ (EQ 3)

where $W_{n,i}$ and $W_{n,i+1}$ represent the transistor width of the driver and load respectively. To include wire parasitics, as well as the fanout off the critical path as we do in section 3, we use $C_{wire}$ (in units of microns of equivalent transistor width) to represent the parasitic capacitance associated with interconnect wiring and fanout. In the special case of a chain of identical inverters, which could represent a path through some random logic, when $C_{wire}$ is included, in general we obtain a pair delay (one stage rising and the other falling) of:

$t_{pair} \propto (W_n + W_p + C_{wire}) \left( \frac{1}{\mu_n W_n} + \frac{1}{\mu_p W_p} \right)$ (EQ 4)

3. Optimum Sizing of n-channel and p-channel Transistor Widths

The first optimization result we will revisit is for the optimum absolute device size, including the width ratio between n- and p-channel transistors, in random CMOS logic. To do so, we make the standard simplifying assumption that we have an infinite string of identical CMOS inverters, possibly with some typical wire and/or fanout capacitance between each stage.

When optimizing the ratio for minimum delay, if we assume that we have no wire parasitics, we get the standard result. By taking the gradient of the delay with respect to one of the widths, we end up with the following ratio of p-channel to n-channel transistor widths [5]:

$\frac{W_p}{W_n} = \sqrt{\frac{\mu_n}{\mu_p}}$ (EQ 5)

and with MOSIS 2 micron process parameters [6], we obtain a ratio of 1.6.

Considering now the optimum for the energy-delay product, we can use the same method as the one used to find the minimum delay. The expression of this product is given by:

$E \cdot t \propto (W_n + W_p)^2 \left( \frac{1}{\mu_n W_n} + \frac{1}{\mu_p W_p} \right)$ (EQ 6)

Keeping $W_n$ fixed, we can find the optimum of the product by setting its gradient with respect to $W_p$ equal to zero. Doing so, we end up with a quadratic equation having two real roots, of which only one is positive:

$\frac{\partial}{\partial W_p}(E \cdot t) = 0 \Leftrightarrow 2\frac{\mu_p}{\mu_n} W_p^2 + W_p W_n - W_n^2 = 0$ (EQ 7)

or for the desired ratio:

$\frac{W_p}{W_n} = \frac{\mu_n}{4\mu_p} \left( \sqrt{1 + 8\frac{\mu_p}{\mu_n}} - 1 \right)$ (EQ 8)

However, we can see in this derivation that we could have kept $W_p$ fixed; doing so, we end up with a completely symmetrical quadratic equation, and solving this time for $W_n$, we obtain:

$\frac{W_n}{W_p} = \frac{\mu_p}{4\mu_n} \left( \sqrt{1 + 8\frac{\mu_n}{\mu_p}} - 1 \right)$ (EQ 9)

This second result is obviously different from the first one, and using MOSIS 2 micron parameters, we see that the first expression would give an optimum ratio $W_p/W_n$ of 0.66, while the second expression would give an optimum ratio of 2.85. As a note, we end up with the same expression for the minimum delay case with both approaches.

We can then plot a surface proportional to the energy-delay product as a function of both widths, as well as a 2D plot showing the level lines of the function, giving us a better view of the optimum values for the minimum energy-delay product (figure 1a and 1b). We can clearly see both optimum ratios. For the sake of completeness, we also show the same plots obtained for the minimum delay (figure 2a and 2b), on which we also show the optimum ratio. In this case, we can see that we have two minima for the energy-delay, depending on whether we choose to fix the size of the n- or p-channel transistor. Interestingly, the global minimum with given technological constraints is obtained for minimum size devices.

FIGURE 1. Optimum for the energy-delay product, a) mapping, b) level lines

FIGURE 2. Optimum for the delay, a) mapping, b) level lines

If we also consider the effects of parasitic wiring capacitance, something that is becoming more important for small geometries like submicron or deep submicron processes, we must add a constant term to the expression for load capacitance. This also can model fanout within a critical path. In [1], Horowitz shows that we reach the optimum energy-delay roughly when the contribution of the gate capacitance is equal to the contribution of the wire capacitance. Here, we have a more complete result, as we are showing that when we include the wire capacitances in the model, we can reach a global minimum for the energy-delay product for which the width of the pullup is larger than the width of the pulldown, as we can see from the equation when we add a significant constant $C_{wire}$ in the expression. The optimum solutions for either $W_n$ or $W_p$ fixed are more complex than the previous ones, and they are:

[equation not legible in the source] (EQ 10)

We can see in figure 3 the global optimum for the energy-delay when we include the extra capacitance in the model. This reflects a wire capacitance that is equivalent to a gate capacitance from a transistor that is 4 times minimum width. For this global minimum, we have a gate with the p-channel transistor wider than the n-channel transistor, but none of the optimum widths are minimum. We have this relation because the mobility of electrons is higher than the mobility of holes. We added on this figure the curve of the various global minima obtained for various values of $C_{wire}$, located in between the minimum for delay (1.6) and energy-delay with fixed $W_p$ (2.85). These three optimums (fixed $W_n$, fixed $W_p$, or including $C_{wire}$) can be used to optimize standard cell designs using scalable cells.
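As a quick numerical check of the two fixed-width optima without wire capacitance, the following sketch evaluates the positive roots of the two quadratics and compares them with a brute-force scan of the first-order energy-delay expression. The mobility ratio of 2.56 is an assumption, inferred here from the quoted minimum-delay ratio of 1.6 (1.6 squared) rather than taken from process data, and all function names are illustrative.

```python
import math

# Assumption: mu_n/mu_p = 1.6**2 = 2.56, inferred from the quoted
# minimum-delay width ratio of 1.6 (not a published process value).
MU_RATIO = 2.56


def energy_delay(wn, wp, mu_ratio=MU_RATIO):
    """First-order pair energy-delay product, arbitrary units, no wire load."""
    load = wn + wp  # energy is proportional to the load capacitance
    # Pair delay with mu_p normalized to 1 and mu_n = mu_ratio.
    pair_delay = load * (1.0 / (mu_ratio * wn) + 1.0 / wp)
    return load * pair_delay


def ratio_fixed_wn(mu_ratio=MU_RATIO):
    """Wp/Wn from the positive root of 2*(mu_p/mu_n)*Wp^2 + Wp*Wn - Wn^2 = 0."""
    k = 1.0 / mu_ratio
    return (math.sqrt(1.0 + 8.0 * k) - 1.0) / (4.0 * k)


def ratio_fixed_wp(mu_ratio=MU_RATIO):
    """Wp/Wn when Wp is held fixed and the symmetric quadratic is solved for Wn."""
    wn_over_wp = (math.sqrt(1.0 + 8.0 * mu_ratio) - 1.0) / (4.0 * mu_ratio)
    return 1.0 / wn_over_wp


def scan_argmin(f, lo, hi, steps=20001):
    """Brute-force argmin of f on an evenly spaced grid over [lo, hi]."""
    xs = (lo + (hi - lo) * i / (steps - 1) for i in range(steps))
    return min(xs, key=f)


if __name__ == "__main__":
    # Fixed Wn = 1: scan Wp. Fixed Wp = 1: scan Wn.
    wp_opt = scan_argmin(lambda wp: energy_delay(1.0, wp), 0.1, 5.0)
    wn_opt = scan_argmin(lambda wn: energy_delay(wn, 1.0), 0.1, 5.0)
    print("fixed Wn: closed form", ratio_fixed_wn(), "scan", wp_opt)
    print("fixed Wp: closed form", ratio_fixed_wp(), "scan", 1.0 / wn_opt)
```

With this assumed mobility ratio the two closed forms evaluate to about 0.66 and 2.82, close to the 0.66 and 2.85 quoted in the text; the small gap on the second value would be consistent with a slightly larger process mobility ratio.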
FIGURE 3. Optimum for the energy-delay product, including parasitics, a) mapping, b) level lines
4. Optimum Distance Between Repeaters in an RC Line

We now examine the problem of breaking up a long resistive line with repeaters to improve its performance. We know that as the line gets longer, its delay increases quadratically. If we use a repeater, the repeater introduces some extra delay, but also restores the signal rise time. If we solve this problem to achieve minimum delay, we can derive an optimum number of stages N and an optimum size for the repeaters [5]. Assuming equal rise and fall times (i.e. $W_p = (\mu_n/\mu_p) W_n$):

[equation not legible in the source] (EQ 11)

where $R_L$ and $C_L$ are per unit length parameters of the long resistive line, $W_{min}$ is the minimum transistor width, and $R_0$ and $C_0$ are parameters of a minimum size inverter; the time constants $\tau_L$ and $\tau$ are given by $\tau_L = R_L C_L$, $\tau = R_0 C_0$ and [...].

If we are now minimizing the energy-delay product, we need the expressions of both delay and energy for this problem, and using the same approach we have:

[equation not legible in the source] (EQ 12)

Using the gradient of the product with respect to both the number of repeaters and the width of these repeaters, we end up with a system of two equations in N and $W_n$:

$\frac{\partial}{\partial N}(E \cdot t) = 0, \quad \frac{\partial}{\partial W_n}(E \cdot t) = 0$ (EQ 13)

This system has no simple closed-form solutions for N or $W_n$. We can however plot the product, and find the optimum values numerically, in order to compare them to the ones obtained for the minimum delay. We have in figure 4 both the surface representing the energy-delay product, and the level lines of this surface as a function of N and $W_n$, and in figure 5, the same curves for the delay alone for comparison. From these figures, we clearly see that in order to achieve the minimum energy-delay product, we need to use fewer repeaters of smaller size than what we would use to achieve the minimum delay.

FIGURE 4. Optimum for the energy-delay product, a) mapping, b) level lines

FIGURE 5. Optimum for the delay, a) mapping, b) level lines

5. Optimum Number of Inverter Stages to Drive a Large Capacitive Load

Often it is desired to drive large load capacitances such as long buses, I/O buffers, or ultimately, pads and off-chip capacitive loads. This is achieved by using a chain of inverters where each successive inverter is made larger than the previous one in order to achieve maximum performance. When delay is the main concern, we can find that the optimum values for the number of stages M and the ratio g between two adjacent stages are [5]:

$M = \ln(C_L/C_0), \quad g = e$ (EQ 14)

When we consider the minimum energy-delay product, we can immediately see that we have a more complex
problem. In order to find the minimum delay, we were using the fact that we will obtain the minimum delay if each stage has the same delay. We cannot transpose this assumption to the energy-delay problem. An immediate consequence is that we will not have a constant ratio between successive stages. We intuitively already know this result as we know that the last stage will dominate the energy consumption of the delay-minimized inverter chain, and thus, we do not want to have constant ratios between stages in order to reduce the size of the last stages as much as possible. Qualitatively, we want to have larger ratios at the end of the chain to reduce the energy consumption. This means that from a problem with two unknown parameters (M and g), we end up with a problem with (M+1) unknown parameters. The expressions for delay, energy and energy-delay product (after simplification) are:

$t = \tau \sum_{i=0}^{M-1} g_i$

$E = C_L V_{dd}^2 \sum_{i=0}^{M-1} \frac{1}{\prod_{j=i+1}^{M-1} g_j}$

$E \cdot t = \tau C_L V_{dd}^2 \left( \sum_{i=0}^{M-1} g_i \right) \left( \sum_{i=0}^{M-1} \frac{1}{\prod_{j=i+1}^{M-1} g_j} \right)$ (EQ 15)

Knowing that the product of all stage ratios should be equal to the ratio between the driven load $C_L$ and the starting load $C_0$, we can solve this problem. We used the Lagrange multipliers method [7] to find the optimum values of the various ratios $g_i$ and of the number of stages. The set of equations used for the optimization process are the various derivatives with respect to each ratio, as well as the product of the ratios used as the constraint imposed on the function to optimize. We thus have the following system to solve to find the ratios for a fixed number of stages M.

$\Gamma = \left( \sum_{i=0}^{M-1} g_i \right) \left( \sum_{i=0}^{M-1} \frac{1}{\prod_{j=i+1}^{M-1} g_j} \right)$

$H = -\prod_{i=0}^{M-1} g_i + \frac{C_L}{C_0} = 0$

$\frac{\partial \Gamma}{\partial g_i} = \lambda \frac{\partial H}{\partial g_i}, \quad i = 0, \ldots, M-1$ (EQ 16)
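For a fixed number of stages M, the constrained minimization can also be approached numerically. The sketch below is not the Lagrange-multiplier solver described in the text; it swaps in a simple pairwise exchange (coordinate descent) that keeps the product of the ratios fixed at every step. It assumes a first-order model in which the delay is the sum of the stage ratios (in units of tau) and the energy is the sum of all driven capacitances (in units of C_0 V_dd^2), with no self-loading and with every ratio constrained to be at least 1. The function names are hypothetical.

```python
import math


def chain_cost(ratios):
    """Return (delay, energy, energy*delay) for the given stage ratios.

    Units: delay in tau, energy in C_0 * Vdd^2 (assumed first-order model).
    """
    delay = sum(ratios)
    cap, energy = 1.0, 0.0
    for g in ratios:
        cap *= g          # capacitance driven by this stage, in units of C_0
        energy += cap
    return delay, energy, delay * energy


def _golden_min(f, lo, hi, tol=1e-10):
    """Golden-section search for the minimizer of a unimodal f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)


def optimize_ratios(load_ratio, stages, sweeps=200, tol=1e-12):
    """Minimize energy*delay over ratios with prod(ratios) == load_ratio."""
    # Work with log-ratios, starting from the geometric (constant-ratio) sizing.
    logs = [math.log(load_ratio) / stages] * stages
    best = chain_cost([math.exp(v) for v in logs])[2]
    for _ in range(sweeps):
        for i in range(stages - 1):
            total = logs[i] + logs[i + 1]   # the pair product stays fixed

            def cost(x):
                trial = logs[:i] + [x, total - x] + logs[i + 2:]
                return chain_cost([math.exp(v) for v in trial])[2]

            # Bounds 0..total keep both ratios of the pair >= 1.
            logs[i] = _golden_min(cost, 0.0, total)
            logs[i + 1] = total - logs[i]
        current = chain_cost([math.exp(v) for v in logs])[2]
        if best - current <= tol * best:
            break
        best = current
    return [math.exp(v) for v in logs]


if __name__ == "__main__":
    ratios = optimize_ratios(100.0, 2)
    print("ratios:", ratios, "cost:", chain_cost(ratios))
```

Working in log-ratios makes each one-dimensional exchange a convex problem under this model, which is why a plain golden-section search is enough for the inner step.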
It is then just a matter of iteration to also find the globally optimum number of stages by applying the same method to different values of M and comparing the results. As the [...] complex and slow if the [...] becomes significant, we can create a set of curves that allows a designer to easily find a good first set of ratios for a given load capacitance. In figure 6 we show one such set of curves as an example. In this case, we [...] values of M that match the minimum delay solution, for easy comparison between the two situations. As we can see, while the first ratios are smaller for the energy-delay case, the last ones are a lot larger; as a result, we have smaller inverters in the chain, thus also using less area. The table in figure 6 gives the delay and energy-delay values for various [...]; the first row of each box gives the values obtained for the minimum delay approach, the second row for the sizing obtained for minimum energy-delay. All values are given in base units (i.e., unit capacitance for the [...], unit time for the delay, and unit energy-time for the [...]).
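The iteration over the number of stages can be illustrated with a much cruder search than the full optimization. The sketch below restricts the ratios to a linear taper in log space, which is an assumption made only to keep the example short and is not the family of solutions used in the paper, and compares the result with the conventional minimum-delay sizing in base units. The model (delay as the sum of ratios, energy as the sum of driven capacitances, no self-loading) and all names are illustrative.

```python
import math


def chain_metrics(ratios):
    """Return (delay, energy*delay) in base units: tau and tau * C_0 * Vdd^2."""
    delay = sum(ratios)
    cap, energy = 1.0, 0.0
    for g in ratios:
        cap *= g          # capacitance driven by this stage, in units of C_0
        energy += cap
    return delay, delay * energy


def min_delay_sizing(load_ratio):
    """Conventional sizing: about ln(C_L/C_0) stages with a constant ratio."""
    stages = max(1, round(math.log(load_ratio)))
    return [load_ratio ** (1.0 / stages)] * stages


def tapered_sizing(load_ratio, stages, slope):
    """Ratios whose logarithm grows by `slope` per stage (assumed family).

    The product of the ratios equals load_ratio for any slope.
    """
    mean = math.log(load_ratio) / stages
    start = mean - slope * (stages - 1) / 2.0
    return [math.exp(start + slope * i) for i in range(stages)]


def best_tapered_sizing(load_ratio, max_stages=12, grid=200):
    """Scan stage count and taper slope; keep the lowest energy-delay product."""
    best = None
    for stages in range(1, max_stages + 1):
        mean = math.log(load_ratio) / stages
        # Largest slope that still keeps the first ratio >= 1.
        max_slope = 2.0 * mean / (stages - 1) if stages > 1 else 0.0
        for k in range(grid + 1):
            ratios = tapered_sizing(load_ratio, stages, max_slope * k / grid)
            product = chain_metrics(ratios)[1]
            if best is None or product < best[0]:
                best = (product, ratios)
    return best[1]


if __name__ == "__main__":
    load = 1000.0  # C_L / C_0, an arbitrary example value
    for name, sizing in (("min. delay", min_delay_sizing(load)),
                         ("min. prod.", best_tapered_sizing(load))):
        delay, product = chain_metrics(sizing)
        print(name, len(sizing), "stages, delay", delay, "product", product)
```

Under these assumptions the search settles on fewer stages than the minimum-delay chain and on ratios that grow toward the load, in line with the qualitative behaviour described above; the numbers it prints are not meant to reproduce the table in figure 6.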
FIGURE 6. [curves and table; legible table residue: columns number, approach, delay, product; row min. prod., 16.4, 2730]

6. Conclusions

In the previous sections, we revisited three well known results in CMOS transistor sizing with the energy-delay product as a new metric to minimize instead of the standard delay metric. Contrary to the minimum delay case, the results are not as simple, and numerical solutions are necessary in some cases. However, the obtained results are interesting, as significant energy-delay improvements can be achieved using them. A common result is that in general, smaller is better to improve the energy-delay product, and from the various results, we can see that we do not need to sacrifice much speed to improve the energy-delay. Other results to minimize delay for particular situations have been derived, and it would be interesting to also find how they change when the energy-delay product is minimized instead.

An interesting aspect of these results comes with the optimum ratio to achieve minimum pair energy-delay product presented in section 3. The optimum curves derived can be used to design optimum standard cells in scalable CMOS using the various design constraints imposed by the system. Another interesting result is that most chains of gates will have fewer and smaller stages to achieve optimum energy-delay product when we use the general optimization algorithm of equation 16, meaning that we reduce the overall area, thus increasing the wirability or the available area for additional overhead circuitry used to implement more complex control schemes to further reduce power dissipation in the circuit.

References

[1] Mark Horowitz, Thomas Indermaur & Ricardo Gonzales, Low-Power Digital Design, IEEE Symposium on Low Power Electronics, 1994, pp 8-11.
[2] J. Cong & C.-K. Koh, Simultaneous Driver and Wire Sizing for Performance and Power Optimization, IEEE Trans. on VLSI Systems, Vol. 2, No. 4, December 1994, pp 408-425.
[3] A. Chandrakasan & R. Brodersen, Minimizing Power Consumption in Digital CMOS Circuits, Proceedings of the IEEE, Vol. 83, No. 4, April 1995, pp 498-523.
[4] N.H.E. Weste & K. Eshraghian, Principles of CMOS VLSI Design: A Systems Perspective, Second Edition, Addison-Wesley Publishing Company, 1993.
[5] L. A. Glasser & D. W. Dobberpuhl, The Design and Analysis of VLSI Circuits, Addison-Wesley Publishing Company, 1985.
[6] MOSIS User Manual, Version 4.0, 1995.
[7] George B. Thomas, Calculus and Analytical Geometry, Addison-Wesley Publishing Company, 1972.
