An Evolutionary Algorithm for Low Power VLSI Cell Placement

Mahmood R. Minhas
Information and Computer Science Department
King Fahd University of Petroleum and Minerals
Dhahran 31261, Saudi Arabia
E-mail: minhas@ccse.kfupm.edu.sa

Abstract - With the increasing use of battery-operated mobile electronic devices, VLSI circuit designers are continuously focusing on approaches to low power design. We present an evolutionary cell placement technique for low power VLSI standard cell placement. The proposed technique is based on two evolutionary algorithms, namely Tabu Search (TS) and Genetic Algorithm (GA). Experiments were carried out using representative circuits from the ISCAS-85/89 benchmark suite. For comparison, we also implemented GA for our problem and compared the placement results of the proposed technique with those of GA. The comparison shows that the proposed technique outperforms GA both in the quality of the final placement solution and in CPU run time requirements.

I. Introduction

The need for low power driven VLSI design has emerged rapidly in the past few years. While optimizing a circuit design for power consumption, other design objectives such as performance and interconnect wire length also need to be taken care of. This has led to the development of techniques that aim to optimize all these design goals simultaneously. Previously, the focus was on optimizing interconnect wire length and performance, and a large number of efforts targeting either one or both of these objectives are reported in the literature [1], [2]. Some work has been reported on optimizing power consumption while treating wire length and performance as constraints [3], [4]. Recently, some efforts targeting simultaneous optimization of all three objectives have also been reported [5], [6].

The rest of this paper is organized as follows. In the next section, we present the problem and formulate the cost functions. Section III presents the implementation details of our proposed approach. The experimental results and comparison are presented in Section IV.

II. VLSI Cell Placement and Cost Functions

VLSI design is a complex process and is carried out at certain abstraction levels [7]. The design process starts from an abstract idea, each intermediate step refines the design further, and the process ends with the fabrication of a new chip. The problem of power optimization can be addressed at a higher level as well as at a lower level, e.g., the physical level [8]. In this work, we address the problem in the placement step at the physical level. Placement is an important step in VLSI physical design, responsible for the arrangement of cells on a layout surface so as to optimize certain objectives while satisfying some constraints. Standard cell placement is a special case where all the cells to be placed have equal height.

[Figure: design flow from an idea through the behavioral/architectural, register transfer/logic, and cell/mask levels to fabrication of a new chip, with the corresponding CAD subproblems (architectural, logical, and physical design) and generic CAD tools at each level.]
Fig. 1. Various steps in VLSI design process.

We address the problem at the cell placement level with the objectives of optimizing power consumption, timing performance (delay), and wire length while considering layout width as a constraint. Formally, the problem can be stated as follows. A set of cells or modules $M = \{m_1, m_2, \ldots, m_n\}$ and a set of signals $S = \{s_1, s_2, \ldots, s_k\}$ is given. Moreover, a set of signals $S_{m_i}$, where $S_{m_i} \subseteq S$, is associated with each module $m_i \in M$. Similarly, a set of modules $M_{s_j}$, where $M_{s_j} = \{m_i \mid s_j \in S_{m_i}\}$ is called a signal net, is associated with each signal $s_j \in S$. Also, a set of locations $L = \{L_1, L_2, \ldots, L_p\}$, where $p \ge n$, is given. The problem is to assign each $m_i \in M$ to a unique location $L_j$, such that all of our objectives are optimized subject to our constraint [7].

A. Cost Functions

We now formulate cost functions for our three objectives and for the width constraint.

* Wire length Cost: The interconnect wire length of each net in the circuit is estimated, and the total wire length is then computed by adding all these individual estimates:

$$Cost_{wire} = \sum_{i \in M} l_i \qquad (1)$$

where $l_i$ is the wire length estimate for net $i$ and $M$ denotes the total number of nets in the circuit (which is the same as the number of modules for single output cells).
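As an illustration of Equation 1, the following minimal sketch sums per-net wire length estimates over a toy placement. The paper does not say which estimator is used for $l_i$; the half-perimeter bounding box used here, the coordinate representation, and all names (`hpwl`, `wire_cost`, the module and net labels) are assumptions made only for this example.

```python
from typing import Dict, List, Tuple

Location = Tuple[float, float]  # (x, y) coordinates of a placed cell


def hpwl(net: List[str], placement: Dict[str, Location]) -> float:
    """Half-perimeter of the bounding box of a net's cells (assumed estimator)."""
    xs = [placement[m][0] for m in net]
    ys = [placement[m][1] for m in net]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def wire_cost(nets: Dict[str, List[str]], placement: Dict[str, Location]) -> float:
    """Equation 1: sum of the per-net wire length estimates l_i."""
    return sum(hpwl(cells, placement) for cells in nets.values())


# Hypothetical toy instance: three modules and two signal nets.
placement = {"m1": (0.0, 0.0), "m2": (3.0, 0.0), "m3": (3.0, 4.0)}
nets = {"s1": ["m1", "m2"], "s2": ["m1", "m2", "m3"]}
total = wire_cost(nets, placement)
print(total)
```

Any other per-net estimator could be substituted for `hpwl` without changing `wire_cost`, since Equation 1 only requires one estimate per net.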

0-7803-8294-3/04/$20.00 ©2004 IEEE 1540

Authorized licensed use limited to: NATIONAL INSTITUTE OF TECHNOLOGY ROURKELA. Downloaded on May 9, 2009 at 02:38 from IEEE Xplore. Restrictions apply.
* Power Cost: The power consumption $P_i$ of a net $i$ in a circuit can be given as:

$$P_i \simeq \frac{1}{2} \cdot C_i \cdot V_{DD}^2 \cdot f \cdot S_i \cdot \beta \qquad (2)$$

where $C_i$ is the total capacitance of net $i$, $V_{DD}$ is the supply voltage, $f$ is the clock frequency, $S_i$ is the switching probability of net $i$, and $\beta$ is a technology dependent constant.

Assuming a fixed supply voltage and clock frequency, the above equation reduces to the following:

$$P_i \simeq C_i \cdot S_i \qquad (3)$$

The capacitance $C_i$ of cell $i$ is given as:

$$C_i = C_i^r + \sum_{j \in M_i} C_j^g \qquad (4)$$

where $C_j^g$ is the input capacitance of gate $j$ and $C_i^r$ is the interconnect capacitance at the output node of cell $i$.

At the placement phase, only the interconnect capacitance $C_i^r$ can be manipulated, while $C_j^g$ comes from the properties of the cell library used and is thus independent of placement. Moreover, $C_i^r$ depends on the wire length of net $i$, so Equation 3 can be written as:

$$P_i \simeq l_i \cdot S_i \qquad (5)$$

The cost function for total power consumption in the circuit can be given as:

$$Cost_{power} = \sum_{i \in M} P_i = \sum_{i \in M} (l_i \cdot S_i) \qquad (6)$$

* Delay Cost: The delay cost is determined by the delay along the longest path in a circuit. The delay $T_\pi$ of a path $\pi$ consisting of nets $\{v_1, v_2, \ldots, v_k\}$ is expressed as:

$$T_\pi = \sum_{i=1}^{k-1} (CD_i + ID_i) \qquad (7)$$

where $CD_i$ is the switching delay of the cell driving net $v_i$ and $ID_i$ is the interconnect delay of net $v_i$. The placement phase affects only $ID_i$, because $CD_i$ is a technology dependent parameter and is independent of placement. Using the RC delay model, $ID_i$ is given as:

$$ID_i = (LF_i + R_i^r) \times C_i \qquad (8)$$

where $LF_i$ is the load factor of the driving block, which is independent of layout, $R_i^r$ is the interconnect resistance of net $v_i$, and $C_i$ is the load capacitance of cell $i$ given in Equation 4.

The delay cost function can be written as:

$$Cost_{delay} = \max_{\pi} \{T_\pi\} \qquad (9)$$

* Width Cost: The width cost is given by the maximum of all the row widths in the layout. We constrain the layout width not to exceed the average row width $w_{avg}$ by more than a certain positive ratio $\alpha$, where $w_{avg}$ is the minimum possible layout width, obtained by dividing the total width of all the cells in the layout by the number of rows in the layout. Formally, we can express the width constraint as:

$$Width - w_{avg} \le \alpha \times w_{avg} \qquad (10)$$

* Overall Fuzzy Cost Function: Since we are optimizing three objectives simultaneously, we need a cost function that represents the effect of all three objectives in the form of a single quantity. We propose the use of fuzzy logic to integrate these multiple, possibly conflicting objectives into a scalar cost function. Fuzzy logic allows us to describe the objectives in terms of linguistic variables. Fuzzy rules are then used to find the overall cost of a placement solution. In this work, we have used the following fuzzy rule:

IF a solution has
SMALL wire length AND
LOW power consumption AND
SHORT delay
THEN it is a GOOD solution.

Fig. 2. Membership functions.

The above rule is translated to an and-like OWA fuzzy operator [9], and the membership $\mu(x)$ of a solution $x$ in the fuzzy set GOOD solution is given as:

$$\mu(x) = \begin{cases} \beta \cdot \min_{j=p,d,l} \{\mu_j(x)\} + (1 - \beta) \cdot \frac{1}{3} \sum_{j=p,d,l} \mu_j(x), & \text{if } Width - w_{avg} \le \alpha \cdot w_{avg}, \\ 0, & \text{otherwise.} \end{cases} \qquad (11)$$

Here $\mu_j(x)$ for $j = p, d, l$ are the membership values in the fuzzy sets LOW power consumption, SHORT delay, and SMALL wire length respectively, and $\beta$ is a constant in the range $[0, 1]$. The solution that results in the maximum value of $\mu(x)$ is reported as the best solution found by the search heuristic. The membership functions for the fuzzy sets LOW power consumption, SHORT delay, and SMALL wire length are shown in Figure 2. We can vary the preference of an objective $j$ in the overall membership function by changing the value of $g_j$. The lower bounds $O_j$ for

different objectives are computed as given in Equations 12-15:

$$O_l = \sum_{i=1}^{n} l_i^* \quad \forall v_i \in \{v_1, v_2, \ldots, v_n\} \qquad (12)$$

$$O_p = \sum_{i=1}^{n} S_i \, l_i^* \quad \forall v_i \in \{v_1, v_2, \ldots, v_n\} \qquad (13)$$

$$O_d = \sum_{j=1}^{k} (CD_j + ID_j^*) \quad \forall v_j \in \{v_1, v_2, \ldots, v_k\} \text{ in path } \pi_c \qquad (14)$$

$$O_{width} = \frac{\sum_{i=1}^{n} Width_i}{\#\text{ of rows in layout}} \qquad (15)$$

where $O_j$ for $j \in \{l, p, d, width\}$ are the optimal costs for wire length, power, delay, and layout width respectively, $n$ is the number of nets in the layout, $l_i^*$ is the optimal wire length of net $v_i$, $CD_i$ is the switching delay of the cell $i$ driving net $v_i$, $ID_i^*$ is the optimal interconnect delay of net $v_i$ calculated with the help of $l_i^*$, $S_i$ is the switching probability of net $v_i$, $\pi_c$ is the most critical path with respect to optimal interconnect delays, $k$ is the number of nets in $\pi_c$, and $Width_i$ is the width of the individual cell driving net $v_i$.

III. The Proposed Technique and Implementation Details

In this section, we first present the proposed algorithm and then discuss its implementation details for low power VLSI placement.

An interesting novel idea is the introduction of a population of solutions, instead of a single solution, in the Tabu Search algorithm (TS) [10]. This is likely to enhance the power of TS by allowing it to visit the search space in a parallel fashion. The algorithm starts by taking a random initial population of solutions. Then, for each individual in the population, a certain number of neighbor solutions are generated and the best neighbor is found. A characteristic of the move leading to the best neighbor solution is stored in a tabu list. There are as many tabu lists as the number of solutions in the population, i.e., $N_T$. The reason for keeping $N_T$ tabu lists is that the series of moves for each individual in the population is different. Therefore, each series should be stored in a separate list so that a tabu list restricts the cyclic moves on its corresponding individual only. However, a single aspiration level (AL) is shared by all the individuals: a tabu move on an individual solution is allowed only if it results in a solution that is better than the overall best solution.

The above process continues for a certain number of iterations, and a record is kept of the $N_T$ individual best solutions obtained from perturbing the $N_T$ individual initial solutions. Then either these best solutions or the current solutions are passed to GA for further optimization. These semi-optimized individuals are likely to produce good offspring by mating with one another. GA is then run for a given number of generations on these passed solutions, and a record of the best individuals is kept. Again, either the current individuals or the best ones are passed back to TS. The switching between TS and GA is repeated a given number of times.

A. Solution Encoding

A placement solution is an arrangement of cells on a two dimensional layout surface, so we decided to represent a solution in the form of a 2-D grid. Due to the varying widths of the cells in a circuit, the rows cannot all have an equal number of cells. This fact disturbs our two dimensional representation. For instance, consider a circuit comprising 11 cells 1, 2, 3, ..., 11. A possible layout may be as below:

    3  5  8  6
    9 10
    7 11  1
    4  2

In order to make it a perfect grid, we fill the empty locations with dummy cells represented by distinct negative integers, as shown below. These negative numbers are used for encoding purposes as well as for the appropriate application of genetic operators such as crossover, and they do not play any role in the cost computation of the solution.

    3  5  8  6
    9 10 -1 -2
    7 11  1 -3
    4  2 -4 -5

In the initialization step, random encoded strings are generated. For encoding purposes, we use a square grid having $L$ slots, such that $L \ge N$, where $N$ is the number of cells in the circuit. Each cell is assigned a positive integer value. Also, $L - N$ dummy cells are created, and each dummy cell is assigned a negative integer in such a way that no two dummy cells have the same value. To generate the string, the first row of the grid is placed first in the string, followed by the next row, and so on.

B. Cost Evaluation

Since we are addressing a multiobjective optimization problem in which we are trying to minimize three mutually conflicting objectives, we need a measure that can quantify the overall quality of a solution with respect to all three objectives collectively. A conventional approach to this problem is the use of a weighted sum. This approach is not used in our implementation because it is known to have certain problems. For instance, it is difficult to find values for the weights, as these heavily affect the relative importance of the objectives. In this approach, the costs of all the objectives are first ... Fuzzy logic provides a convenient approach and is hence used in this work. In this scheme, each solution is assigned a fitness value between 0 and 1 that is equal to its membership value in the fuzzy set of acceptable solutions. This membership value is computed using Equation 11.
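The encoding of Section III-A and the fitness of Equation 11 can be sketched together as follows. This is a minimal illustration, not the author's implementation. The function names are hypothetical, `encode` pads every row to the widest row rather than enforcing a square grid, and the piecewise-linear membership shape (1 up to the optimal cost, falling linearly to 0 at $g_j$ times the optimal cost) is an assumption, since the figure defining the membership functions is not reproduced in the text. The default `alpha = 0.2` is the value reported in Section IV; `beta` must be supplied by the caller.

```python
from typing import Dict, List


def encode(rows: List[List[int]]) -> List[int]:
    """Pad ragged rows with distinct negative dummy cells, then flatten row by row."""
    width = max(len(r) for r in rows)
    string: List[int] = []
    dummy = -1
    for row in rows:
        string.extend(row)
        for _ in range(width - len(row)):
            string.append(dummy)  # dummy cells play no role in cost computation
            dummy -= 1
    return string


def membership(cost: float, optimal: float, goal: float) -> float:
    """Assumed shape: 1 while cost/optimal <= 1, linear fall to 0 at cost/optimal = goal."""
    ratio = cost / optimal
    if ratio <= 1.0:
        return 1.0
    if ratio >= goal:
        return 0.0
    return (goal - ratio) / (goal - 1.0)


def fitness(mu: Dict[str, float], width: float, w_avg: float,
            beta: float, alpha: float = 0.2) -> float:
    """Equation 11: and-like OWA of mu_p, mu_d, mu_l; 0 if the width constraint fails."""
    if width - w_avg > alpha * w_avg:
        return 0.0
    values = [mu["p"], mu["d"], mu["l"]]
    return beta * min(values) + (1.0 - beta) * sum(values) / len(values)


# The 11-cell layout from Section III-A, one list per row.
string = encode([[3, 5, 8, 6], [9, 10], [7, 11, 1], [4, 2]])

# Hypothetical membership values for one candidate placement.
score = fitness({"p": 0.5, "d": 1.0, "l": 0.75}, width=110.0, w_avg=100.0, beta=0.5)
print(string, score)
```

With `beta` close to 1 the fitness is dominated by the worst objective, while with `beta` close to 0 it approaches the plain average of the three memberships.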

TABLE I
A COMPARISON BETWEEN THE PROPOSED TECHNIQUE AND GA RESULTS

         |                GA                 |      The Proposed Algorithm
Circuit  | L (μm)   P        D (ps)  T (s)   | L (μm)   P        D (ps)  T (s)
s2081    | 2426     388      113     2341    | 2162     356      110     426
s298     | 4062     838      130     2922    | 3454     631      125     362
s386     | 6824     1665     193     3945    | 6329     1486     191     656
s641     | 17812    4532     740     21982   | 12433    2882     664     2143
s832     | 21015    4787     395     7206    | 18451    4257     351     1929
s953     | 31004    5027     235     11221   | 25967    4239     212     1846
s1196    | 48729    14755    372     16120   | 38574    11526    334     3276
s1238    | 50387    15035    396     16208   | 39065    11342    351     3667
s1488    | 69792    17346    784     21434   | 56148    14135    669     5157
s1494    | 69223    17169    771     26032   | 54914    13763    668     5643
c3540    | 310996   109850   924     57724   | 164245   57905    703     33247
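As a quick arithmetic check on Table I, the sketch below transcribes the rows and tests whether every entry of the proposed algorithm is lower than the corresponding GA entry. It also provides a helper for the percentage reduction of each metric relative to GA. The names `TABLE_I` and `reductions` are hypothetical, and the c3540 delay of the proposed algorithm is read here as 703 ps, which is an interpretation of the extracted table.

```python
from typing import List, Tuple

Metrics = Tuple[float, float, float, float]  # (L, P, D, T)

# (circuit, GA metrics, proposed-algorithm metrics) transcribed from Table I.
TABLE_I: List[Tuple[str, Metrics, Metrics]] = [
    ("s2081", (2426, 388, 113, 2341), (2162, 356, 110, 426)),
    ("s298", (4062, 838, 130, 2922), (3454, 631, 125, 362)),
    ("s386", (6824, 1665, 193, 3945), (6329, 1486, 191, 656)),
    ("s641", (17812, 4532, 740, 21982), (12433, 2882, 664, 2143)),
    ("s832", (21015, 4787, 395, 7206), (18451, 4257, 351, 1929)),
    ("s953", (31004, 5027, 235, 11221), (25967, 4239, 212, 1846)),
    ("s1196", (48729, 14755, 372, 16120), (38574, 11526, 334, 3276)),
    ("s1238", (50387, 15035, 396, 16208), (39065, 11342, 351, 3667)),
    ("s1488", (69792, 17346, 784, 21434), (56148, 14135, 669, 5157)),
    ("s1494", (69223, 17169, 771, 26032), (54914, 13763, 668, 5643)),
    ("c3540", (310996, 109850, 924, 57724), (164245, 57905, 703, 33247)),  # D read as 703
]


def reductions(ga: Metrics, proposed: Metrics) -> Metrics:
    """Percentage reduction of each of L, P, D, T relative to the GA value."""
    return tuple(100.0 * (g - p) / g for g, p in zip(ga, proposed))


# True when the proposed algorithm is strictly lower on every metric of every circuit.
all_better = all(p < g for _, ga, prop in TABLE_I for g, p in zip(ga, prop))

for name, ga, prop in TABLE_I:
    print(name, [round(r, 1) for r in reductions(ga, prop)])
print("proposed better everywhere:", all_better)
```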

IV. Experimental Results and Discussion

The proposed technique is applied to the placement of some circuits from the ISCAS-85/89 benchmark suites. For comparison, a well-known evolutionary algorithm, GA, was also implemented and applied to the same circuits, and the performance of the proposed algorithm is compared with that of GA. The costs of the best solutions generated by both approaches are listed in Table I. Here "L", "P" and "D" represent the wire length, power and delay costs respectively, and "T" represents execution time in seconds. Layout width was constrained not to exceed 1.2 times the average row width by fixing the value of $\alpha$ in Equation 10 to 0.2. This constraint is satisfied in all the results shown here.

The results of GA are obtained with the best settings of its various algorithmic parameters. The parameter settings of the proposed technique used to achieve these results are as follows. A total of 5000 iterations are run, comprising 2000 TS iterations and 3000 GA generations. The switch from TS to GA is made only once. The population size $N_T$ used in the TS part is 4, while in the GA part the population size is 16 chromosomes. This fine tuning of parameters was made after careful study of the results obtained with different settings. The population size in the TS part was reduced after observing that a large population size increases the run time of the TS part without providing any significant performance gain. By adopting this measure, the run time requirements of the proposed technique are significantly lowered.

It can be observed from the results that in all cases the proposed technique produced solutions that are better in quality than those obtained from GA. Also, the run time requirements of the proposed technique are considerably lower than those of GA.

V. Conclusions

We presented an evolutionary technique for low power VLSI cell placement. The proposed technique is applied to the placement of some circuits from the ISCAS-85/89 benchmark suites. The comparison shows the superiority of the proposed technique over GA in terms of placement quality and CPU run time requirements.

Acknowledgment:
The author thanks King Fahd University of Petroleum & Minerals, Dhahran, Saudi Arabia, for all assistance and support provided for carrying out this research.

References
[1] K. Shahookar and P. Mazumder. VLSI Cell Placement Techniques. ACM Computing Surveys, 2(23):143-220, June 1991.
[2] K. Chaudhary, A. Srinivasan, and E. S. Kuh. Ritual: A Performance-driven Placement Algorithm. IEEE Transactions on Circuits and Systems-II, 11(39):825-840, November 1992.
[3] H. Vaishnav and Massoud Pedram. PCUBE: A Performance Driven Placement Algorithm for Low Power Design. IEEE Design Automation Conference, with Euro-VHDL, pages 72-77, 1993.
[4] Glenn Holt and Akhilesh Tyagi. GEEP: A Low Power Genetic Algorithm Layout System. IEEE 39th Midwest Symposium on Circuits and Systems, 3:1337-1340, August 1996.
[5] Sadiq M. Sait, Habib Youssef, Aiman Al-Maleh, and Mahmood R. Minhas. Iterative Heuristics for Multiobjective VLSI Standard Cell Placement. INNS-IEEE International Joint Conference on Neural Networks (IJCNN), 3:2224-2229, July 2001.
[6] Sadiq M. Sait, Mahmood R. Minhas, and Junaid A. Khan. Performance and low power driven VLSI standard cell placement using Tabu search. In Proceedings of the IEEE Congress on Evolutionary Computation (CEC), 1:372-377, May 2002.
[7] Sadiq M. Sait and Habib Youssef. VLSI Physical Design Automation: Theory and Practice. McGraw-Hill Book Company, Europe, 1995.
[8] Srinivas Devadas and Sharad Malik. A Survey of Optimization Techniques Targeting Low Power VLSI Circuits. 32nd ACM/IEEE Design Automation Conference, 1995.
[9] Ronald R. Yager. On ordered weighted averaging aggregation operators in multicriteria decision making. IEEE Transactions on Systems, Man, and Cybernetics, 18(1), January 1988.
[10] Sadiq M. Sait and Habib Youssef. Iterative Computer Algorithms with Applications in Engineering: Solving Combinatorial Optimization Problems. IEEE Computer Society Press, California, December 1999.
