
ACKNOWLEDGEMENTS

I am fortunate to have the opportunity to present this report on A NOVEL LOOK AHEAD CLOCK
GATING FOR POWER SAVING. I take this opportunity to remember and acknowledge the cooperation
extended by several individuals, out of which this thesis evolved.

I wish to express my deep sense of gratitude to my guide Dr. V. Venkata Rao, Professor
and HOD, Dept. of ECE, Narasaraopeta Engineering College, Narasaraopet for his valuable
guidance to do this project.

I wish to express my sincere thanks to Sri. J. Narasimha Rao, Co-ordinator, PG courses,
Dept. of ECE, for offering valuable suggestions and constant encouragement throughout the
project.

I am grateful to Dr. B. V. Rama Mohan Rao, Principal, Narasaraopeta Engineering College,
for his moral support.

I also express my sincere thanks to Sri. M. V. Koteswara Rao, Chairman, Narasaraopeta
Engineering College, Narasaraopet for providing excellent infrastructure facilities
to complete my course.

I am very much grateful to all the Faculty members of Electronics and Communication
Engineering Department, who helped me to make this project a successful one.

I would like to thank my Parents who helped me in completing my project.

Finally, I would like to extend my sincere thanks to all my Friends for their generous help
in various ways for the completion of this report.

Mr. P. SAMBASIVA RAO

Roll No: 13471D3805

ABSTRACT

Clock gating is one of the most popular power-saving techniques. It is used in many
synchronous circuits to reduce dynamic power dissipation and is extremely helpful in
decreasing the power wasted by digital circuits. This project proposes a new technique,
Look Ahead Clock Gating (LACG), which avoids the drawbacks of the existing methods. The
existing clock gating approaches are synthesis-based clock gating, data-driven clock gating,
and clock gating with auto-gated flip-flops (AGFF). The proposed Look Ahead Clock Gating has
been shown to be very useful in reducing the clock switching power. Similar to data-driven
gating, it is capable of stopping the majority of redundant clock pulses. However, it has the
big advantage of avoiding the tight timing constraints of AGFF and data-driven gating, by
allotting a full clock cycle for the enabling signals to be computed and to propagate to their
gates. Furthermore, unlike data-driven gating, whose optimization requires knowledge of the
FFs' data toggling vectors, LACG is independent of those vectors and of the target
application. LACG reduces power by 50% relative to the auto-gated FF, which in turn consumes
50% less power than the data-driven method.

CONTENTS
LIST OF FIGURES V

LIST OF ABBREVIATIONS VIII

Chapter 1: INTRODUCTION

1.1 Introduction 1
1.2 Objective of the project 4
1.3 Literature survey 4
1.4 Organization of the report 5

Chapter 2: TECHNIQUES FOR POWER REDUCTION

2.1 Techniques for reducing dynamic power 6

2.1.1 Gate size 6

2.1.2 Control synthesis 7

2.1.3 Clock gating 7

2.1.4 Voltage and frequency scaling 11

Chapter 3: INTEGRATION OF POWER REDUCTION TECHNIQUES

3.1 Clock Gating 12

3.1.1 AND gate 12

3.1.2 NOR gate 14

3.1.3 Latch based AND gate clock gating 16

3.1.4 Latch based NOR gate clock gating 18

3.1.5 Mux based clock gating 19

3.1.6 New approach for clock gate 21

3.2 Integration of clock gating 24

Chapter 4: APPLICATION OF INTEGRATED TECHNIQUES

4.1 A chain of 5 flip flops 28

4.2 Clock gating method 29

4.3 Clock gating without runtime power gating method 31

4.3.1 Combinational logic power model 32

4.3.2 Sequential logic power model 32

4.3.3 Power analysis of non CG & PBSC circuits 32

4.4 Auto gated flip flops 33

Chapter 5: SOFTWARE TOOL AND SIMULATION RESULT

5.1 SOFTWARE TOOL 36

5.1.1 S Edit 36

5.1.2 T Spice 37

5.1.3 W Edit 37

5.1.4 L Edit 37

5.1.5 LVS 38

5.2 Steps to be followed to design a circuit and estimate the power dissipation of the
circuit 38

Chapter 6: CONCLUSION AND FUTURE SCOPE

6.1 Conclusions 47

6.2 Future scope 47

REFERENCES 48

PUBLICATION 49

LIST OF FIGURES
Fig. 1.1: Leakage power components in CMOS 2
Fig. 2.1: In its simplest form, Clock Gating can be implemented by finding out the signal that determines whether the latch will have a new data at the end of the cycle. If not, the clock is disabled using the signal. 10
Fig. 2.2: In pipelined designs, the effectiveness of Clock Gating can be multiplied. If the inputs to a pipeline stage remain the same, then the clock to the later stages can also be frozen. 10
Fig. 3.1(A): Basic Counter (Negative edge triggered). 13
Fig. 3.1(B): Normal output of the counter without Clock Gating. 13
Fig. 3.2(A): Clock Gating using AND gate Circuit. 13
Fig. 3.2(B): Output of Counter when Counter is Negative edge triggered. 14
Fig. 3.2(C): Wrong Output due to Glitch, when counter is Positive edge triggered. 14
Fig. 3.2(D): Right Output when counter is Positive edge triggered. 14
Fig. 3.2(E): Hazards Problem when AND Clock Gating Circuitry used. 14
Fig. 3.3(A): Clock Gating using NOR gate Circuit. 15
Fig. 3.3(B): Incorrect Output of Counter when Counter is Positive edge triggered. 16
Fig. 3.3(C): Output of Counter when enable changes from Positive edge to next Positive edge but Counter is Negative edge Triggered. 16
Fig. 3.3(D): Correct Output of Counter when counter is Positive edge triggered. 16
Fig. 3.3(E): Hazards Problem when NOR Gate is used for Clock Gating. 16
Fig. 3.4(A): Clock Gating of Negative edge counter using Negative Latch Based AND gate Circuit. 17
Fig. 3.4(B): Normal output of Negative edge Counter when Negative Latch based AND Gated Clock is used. 17
Fig. 3.4(C): Output of Negative edge counter when there are some random Hazards at En. 17
Fig. 3.4(D): Clock Gating of Positive edge counter using Positive Latch Based AND gate Circuit. 18
Fig. 3.4(E): Output of counter when latch is positive and counter also Positive edge triggered. 18
Fig. 3.5(A): Clock Gating of Negative edge counter using Positive Latch Based NOR gate Circuit. 18
Fig. 3.5(B): Normal output of Negative edge Counter when Positive Latch based OR Gated Clock is used. 19
Fig. 3.5(C): Output of Negative edge counter when there are some random Hazards at En. 19
Fig. 3.5(D): Clock Gating of Negative edge counter using Negative Latch Based NOR gate Circuit. 19
Fig. 3.5(E): Output of counter when latch is negative and counter also Negative edge triggered. 19
Fig. 3.6(A): Logic of MUX Based Gated Clock. (B) Counter using MUX Based Clock Gating. 20
Fig. 3.7(A): Output of Negative edge triggered Counter with MUX Based Clock Gating. 20
Fig. 3.7(B): Output of Positive edge triggered Counter with MUX Based Clock Gating. 20
Fig. 3.8(A): Generation of Gated Clock When Negative Latch is used. 21
Fig. 3.8(B): Generation of Gated Clock When Positive Latch is used. 21
Fig. 3.9(A): Output of Negative edge Counter with gated clock for circuit shown in Fig. 3.8(A). 23
Fig. 3.9(B): Output of Positive edge Counter with gated clock for circuit shown in Fig. 3.8(A). 23
Fig. 3.10(A): Output of Negative edge Counter with gated clock for circuit shown in Fig. 3.8(B). 23
Fig. 3.10(B): Output of Positive edge Counter with gated clock for circuit shown in Fig. 3.8(B). 23
Fig. 3.11: Conventional Clock Gating 24
Fig. 3.12: Modified Clock Gating 25
Fig. 3.13: Power Gating 26
Fig. 3.14: Integration of CG and RTPG 27
Fig. 4.1: Typical non-CG circuit 28
Fig. 4.2: Traditional XOR-based CG circuitry 29
Fig. 4.3: Bus-Specific-Clock-gating 29
Fig. 4.4: Basic scheme of proposed OBSC technique 30

Fig. 4.5: Clock Gating without Run Time Power Gating method 32
Fig. 4.6: LACG of general logic 33
Fig. 5.1: Schematic design of an inverter circuit in S-edit 36
Fig. 5.2: Waveforms of the inputs and outputs of the inverter circuit 37
Fig. 5.3: File menu in S-edit Window 38
Fig. 5.4: To name the design and select the folder for the design 38
Fig. 5.5: For defining the cells for different schematic designing in one project to name the schematic design of the cells 39
Fig. 5.6: To name the schematic design of the cells 39
Fig. 5.7: Adding the libraries for the design 40
Fig. 5.8: To browse the library files 40
Fig. 5.9: After libraries are added, they are displayed in library window 41
Fig. 5.10: Designing the circuit by integrating the components from the library 41
Fig. 5.11: Naming the input and output ports in the circuit 42
Fig. 5.12: The wiring connections of the circuit are done by using the wire tool 42
Fig. 5.13: The circuit can be made as a symbol by update symbol 43
Fig. 5.14: By using drawing tools in toolbar the symbols can be modified 44
Fig. 5.15: Waveforms of the inputs and outputs of the circuit designed 45

LIST OF ABBREVIATIONS
AGFF Auto Gated Flip-Flops

ASIC Application Specific Integrated Circuit

BSC Bus Specific Clock Gating

CG Clock Gating

CMOS Complementary Metal Oxide Semiconductor

DDCG Data Driven Clock-Gating

DIBL Drain Induced Barrier Lowering

FF Flip-Flops

FSM Finite State Machines

IC Integrated Circuit

ITRS International Technology Roadmap for Semiconductors

LACG Look Ahead Clock-Gating

LVS Layout Versus Schematic

MOSFET Metal Oxide Semiconductor Field Effect Transistor

MUX Multiplexer

OBSC Optimized Bus Specific Clock Gating

PBSC Partial Bus Specific Clock Gating

PG Power Gating

RT Resistor Transistor

RTL Register Transfer Level

RTPG Run Time Power Gating

VLSI Very Large Scale Integration

CHAPTER 1
INTRODUCTION
1.1 Introduction
Energy dissipation is a very critical parameter that has to be taken into account during
the design of Very Large Scale Integration (VLSI) circuits. With the rapid progress in
semiconductor technology, chip density and operation frequency have increased, making the
power consumption in battery-operated portable devices a major concern. High power
consumption reduces the battery service life. Reducing power dissipation is a design goal even
for non-portable devices since excessive power dissipation results in increased packaging and
cooling costs as well as potential reliability problems.

There are two major approaches to designing power-efficient Complementary Metal Oxide
Semiconductor (CMOS) circuits: technology choices and design choices. The former include
research on new materials and the reduction of supply voltages, threshold voltages, and doping
levels. The latter include algorithms, data encoding style, pipelining, parallelism, Clock
Gating, and other low-power techniques. This work studies the impact of both topology and
technology choices on the power consumption of logic gates used in standard cell libraries.

Portable electronic devices tend to be much more complex than a single VLSI chip.
They contain many components, ranging from digital and analog to electro-mechanical and
electro-chemical. Dynamic power management, which refers to the selective shut-off or
slow-down of system components that are idle or underutilized, has proven to be a particularly
effective technique for reducing power dissipation in such systems. Incorporating a dynamic
power management scheme in the design of an already-complex system is a difficult process
that may require many design iterations, careful debugging and validation.

Integrated Circuit (IC) power dissipation consists of different components depending on
the circuit operating mode. First, the switching or dynamic power component dominates during
the active mode of operation. Second, there are two primary leakage sources, the active
component and the standby leakage component. The standby leakage may be made
significantly smaller than the active leakage by changing the body bias conditions or by Power
Gating.

For the most recent CMOS feature sizes (e.g., 90nm and 65nm), leakage power
dissipation has become an overriding concern for VLSI circuit designers. International
Technology Roadmap for Semiconductors (ITRS) reports that leakage power dissipation may
come to dominate total power consumption.

Power consumption in CMOS consists of dynamic and static (leakage) components.
Dynamic power is consumed when transistors are switching, and static power is consumed
regardless of transistor switching. Dynamic power consumption was previously the single
largest concern for low-power chip designers, since dynamic power accounted for 90% or more
of the total chip power. Therefore, many previously proposed techniques, such as voltage and
frequency scaling, focused on dynamic power reduction. However, as the feature size shrinks,
e.g., to 0.09 µm and 0.065 µm, static power has become a great challenge for current and future
technologies. Leakage in a CMOS circuit arises from several mechanisms. Fig. 1.1 shows the
different types of leakage components. They are:

A. Sub-threshold leakage (weak inversion current)
B. Gate oxide leakage (Tunneling current)
C. Channel punch through
D. Drain induced barrier lowering

Fig. 1.1: Leakage power components in CMOS

Sub-threshold conduction, also called sub-threshold leakage or sub-threshold drain
current, is the current that flows between the source and drain of a Metal Oxide Semiconductor
Field Effect Transistor (MOSFET) when the transistor is in the sub-threshold region, or
weak-inversion region, that is, for gate-to-source voltages below the threshold voltage. When
the technology feature size scales down, the supply voltage and threshold voltage also scale
down. Sub-threshold leakage power increases exponentially as the threshold voltage decreases.
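
The exponential dependence can be illustrated numerically. The following is a minimal sketch, assuming the common first-order model I_sub = I0 * exp(-Vt / (n * vT)) for the off-state current at zero gate-to-source voltage, where vT = kT/q is the thermal voltage. The pre-factor `i0` and the slope factor `n` are illustrative assumptions, not values taken from this report.

```python
import math

BOLTZMANN = 1.380649e-23            # J/K
ELECTRON_CHARGE = 1.602176634e-19   # C


def thermal_voltage(temp_kelvin=300.0):
    """Thermal voltage vT = kT/q in volts."""
    return BOLTZMANN * temp_kelvin / ELECTRON_CHARGE


def subthreshold_current(v_th, i0=1e-6, n=1.5, temp_kelvin=300.0):
    """First-order off-state leakage at V_gs = 0 (i0 and n are assumed values)."""
    return i0 * math.exp(-v_th / (n * thermal_voltage(temp_kelvin)))


if __name__ == "__main__":
    # Each 0.1 V reduction in threshold voltage multiplies leakage by the same factor.
    for v_th in (0.5, 0.4, 0.3, 0.2):
        print(f"Vt = {v_th:.1f} V -> I_sub ~ {subthreshold_current(v_th):.3e} A")
```

Under this model every fixed step down in threshold voltage multiplies the leakage by a constant factor, which is why threshold scaling makes static power grow so quickly.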

Next is gate oxide leakage. The gate oxide, which serves as the insulator between the
gate and the channel, should be made as thin as possible to increase the channel conductivity
and performance. But as the gate oxide is made thinner, the barrier it presents becomes weaker,
and under a positive gate voltage carriers can tunnel through the oxide. Therefore, current
flows through the oxide. This is also known as tunneling current.

Punch through in a MOSFET is an extreme case of channel length modulation where
the depletion layers around the drain and source regions merge into a single depletion region.
The field underneath the gate then becomes strongly dependent on the drain-source voltage, as
is the drain current. Punch through causes a rapidly increasing current with increasing
drain-source voltage. This effect is undesirable as it increases the output conductance and
limits the maximum operating voltage of the device.

Drain Induced Barrier Lowering (DIBL) refers to the reduction of the threshold voltage
of the transistor at higher drain voltages. The combined charge in the depletion region of the
device and that in the channel of the device is balanced by three electrode charges: the gate, the
source and the drain. As drain voltage is increased, the depletion region of the p-n junction
between the drain and body increases in size and extends under the gate, so the drain assumes a
greater portion of the burden of balancing depletion region charge, leaving a smaller burden for
the gate. As a result, the charge present on the gate retains charge balance by attracting more
carriers into the channel, an effect equivalent to lowering the threshold voltage of the device.

1.2 Objective of the project

The main objective of this project is to implement the look ahead clock gating method, which
reduces the clock switching power.

1.3 Literature Survey

CMOS technology is one of the mainstays of VLSI design. In 0.18 µm and larger
technologies, dynamic power is one of the main contributors to total power consumption. But as
the technology feature size shrinks, static (leakage) power dominates dynamic power, so
designers have proposed several methods to reduce leakage. The base technique of Power Gating
provides no leakage reduction, but it saves the state and has minimum area and delay. The
Sleep Transistor technique is the most common method for achieving ultra-low leakage, but it
destroys the state and increases delay and area.

The Forced Stack technique is another method, and it can save the state. But in this
technique dynamic power consumption is increased, and it cannot use a high threshold voltage
without increasing the delay. By combining these two prior techniques, the Sleepy Stack
approach was proposed. It reduces leakage much like the sleep transistor technique, but its
main advantage over the sleep transistor technique is that it saves the logic state. However,
the Sleepy Stack approach comes with area and delay overhead and is slower than the other
techniques. The Sleepy Keeper approach is worth considering for its propagation delay and
static power performance. The Variable Body Biasing approach can be used for efficient area
and dynamic power dissipation.

In the base technique of Clock Gating, an output correctness problem arises due to
glitches and hazards. To eliminate hazards, latch based AND Clock Gating can be used. Until
now, these two have been the commonly used techniques. Hence we sought a new method
that offers a better tradeoff between power, area, and delay.

1.4 Organization of the report
The project report contains six chapters. Following this introduction, the rest of the
report is organized as follows:

Chapter 2 presents the techniques for power reduction. The integration of power
reduction techniques is discussed in chapter 3. The applications of the integrated techniques
are given in chapter 4. The software tools used and the simulation results are discussed in
chapter 5. Chapter 6 presents the conclusions and future scope of the project.

CHAPTER 2

TECHNIQUES FOR POWER REDUCTION


2.1 Techniques for Reducing Dynamic Power

The dynamic power of a circuit in which all the transistors switch exactly once per
clock cycle will be $\tfrac{1}{2} C V^{2} F$, where $C$ is the switched capacitance, $V$ is the
supply voltage, and $F$ is the clock frequency. However, most of the transistors in a circuit
rarely switch from input changes. Hence, a constant called the activity factor $A$
($0 \le A \le 1$) is used to model the average switching activity in the circuit. Using $A$, the
dynamic power of a circuit composed of CMOS transistors can be estimated as:

$$P = A C V^{2} F$$

The importance of this equation lies in pointing us towards the fundamental
mechanisms for reducing switching power. The first is to reduce the activity factor $A$, so
that fewer nodes switch in each cycle. The second fundamental scheme is to reduce the
load capacitance, $C_L$. This can be done by using small transistors with low capacitances in
non-critical parts of the circuit. Reducing the frequency of operation $F$ will cause a linear
reduction in dynamic power, but reducing the supply voltage $V_{dd}$ will cause a quadratic
reduction. Some of the established and effective mechanisms for dynamic power reduction are
discussed below.
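
The linear and quadratic dependences can be checked with a few lines of arithmetic. This is a minimal sketch of the estimate P = A C V^2 F; the function name `dynamic_power` and the numerical values are illustrative assumptions, not measurements from this project.

```python
def dynamic_power(activity, capacitance, vdd, freq):
    """Estimate dynamic power in watts as P = A * C * V^2 * F."""
    if not 0.0 <= activity <= 1.0:
        raise ValueError("activity factor must lie in [0, 1]")
    return activity * capacitance * vdd ** 2 * freq


if __name__ == "__main__":
    # Assumed example values: A = 0.2, C = 1 nF, V = 1.2 V, F = 500 MHz.
    base = dynamic_power(0.2, 1e-9, 1.2, 500e6)
    half_freq = dynamic_power(0.2, 1e-9, 1.2, 250e6)
    half_vdd = dynamic_power(0.2, 1e-9, 0.6, 500e6)
    print(f"baseline            : {base:.4f} W")
    print(f"half the frequency  : {half_freq / base:.2f} x baseline")
    print(f"half the supply     : {half_vdd / base:.2f} x baseline")
```

Halving the frequency halves the estimate, while halving the supply voltage cuts it to one quarter.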

2.1.1 Gate size

The power dissipated by a gate is directly proportional to its capacitive load $C_L$, whose
main components are:

A. Output capacitance of the gate itself (due to parasitics),
B. The wire capacitance, and
C. Input capacitance of the gates in its fan-out.

The output and input capacitances of gates are proportional to the gate size. Reducing
the gate size reduces its capacitance, but increases its delay. Therefore, in order to preserve the
timing behavior of the circuit, not all gates can be made smaller; only the ones that do not
belong to a critical path can be slowed down.

2.1.2 Control Synthesis

Most control circuits are conceived as Finite State Machines (FSM), formally defined as
graphs where the nodes represent states, and directed edges, labeled with inputs and outputs,
describe the transition relation between states. The state machine is eventually implemented
using a state register and combinational logic that takes in the current state and the current
inputs and computes the outputs and the new state, which is then written into the state register at the
end of the cycle. The binary values of the inputs and outputs of the FSM are usually determined
by external requirements, while the state encoding is left to the designer. Depending on the
complexity of the circuit, a large fraction of the power is consumed due to the switching of the
state register; this power is very dependent on the selected state encoding.

The objective of low power approaches is therefore to choose a state encoding that
minimizes the switching power of the state register. Given a state encoding, the power
consumption can be modeled as:

$$P = \tfrac{1}{2} V_{dd}^{2} \, f \, C_{sr} \, E_{sr}$$

where $f$ is the clock frequency of the state machine, $C_{sr}$ is the effective capacitance of
the state register, and $E_{sr}$ is the expected state register switching activity. If $S$ is the
set of all states, we can estimate $E_{sr}$ as:

$$E_{sr} = \sum_{i, j \in S} p_{ij} \, h_{ij}$$

where $p_{ij}$ is the probability of a transition between states $i$ and $j$, and $h_{ij}$ is the
Hamming distance between the codes of states $i$ and $j$.
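
The effect of the state encoding on the expected switching activity can be illustrated with a small example. The sketch below is hypothetical: it assumes a four-state machine that cycles S0, S1, S2, S3 with equal transition probabilities, and compares a plain binary encoding against a Gray encoding. The state names, probabilities, and electrical values are assumptions for illustration only.

```python
def hamming_distance(a, b):
    """Number of bit positions in which two state codes differ."""
    return bin(a ^ b).count("1")


def expected_switching(transitions, encoding):
    """E_sr = sum over transitions (i, j) of p_ij * h_ij."""
    return sum(p * hamming_distance(encoding[i], encoding[j])
               for (i, j), p in transitions.items())


def state_register_power(vdd, freq, c_sr, e_sr):
    """P = 1/2 * Vdd^2 * f * C_sr * E_sr."""
    return 0.5 * vdd ** 2 * freq * c_sr * e_sr


# Assumed FSM: a 4-state cycle, each transition taken with probability 0.25.
TRANSITIONS = {("S0", "S1"): 0.25, ("S1", "S2"): 0.25,
               ("S2", "S3"): 0.25, ("S3", "S0"): 0.25}
BINARY = {"S0": 0b00, "S1": 0b01, "S2": 0b10, "S3": 0b11}
GRAY = {"S0": 0b00, "S1": 0b01, "S2": 0b11, "S3": 0b10}

if __name__ == "__main__":
    for name, enc in (("binary", BINARY), ("gray", GRAY)):
        e_sr = expected_switching(TRANSITIONS, enc)
        power = state_register_power(vdd=1.2, freq=100e6, c_sr=50e-15, e_sr=e_sr)
        print(f"{name:6s}: E_sr = {e_sr:.2f}, P = {power:.3e} W")
```

For this cyclic machine the Gray code changes exactly one bit per transition, so its expected switching activity is lower than that of the binary code, where two of the four transitions flip both bits.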

2.1.3 Clock Gating

Clock signals are omnipresent in synchronous circuits. The clock signal is used in a
majority of the circuit blocks, and since it switches every cycle, it has an activity factor of 1.
Consequently, the clock network ends up consuming a huge fraction of the on-chip dynamic
power. Clock Gating has been heavily used in reducing the power consumption of the clock
network by limiting its activity factor. Fundamentally, Clock Gating reduces the dynamic
power dissipation by disconnecting the clock from an unused circuit block.

Fig. 2.1: In its simplest form, Clock Gating can be implemented by finding out the signal that determines
whether the latch will have a new data at the end of the cycle. If not, the clock is disabled using the signal.

Traditionally, the system clock is connected to the clock pin on every flip-flop in the
design. This results in three major components of power consumption:

A. Power consumed by combinatorial logic whose values are changing on each clock edge;
B. Power consumed by flip-flops; this has a non-zero value even if the inputs to the flip-flops
are steady and the internal state of the flip-flops is constant;
C. Power consumed by the clock buffer tree in the design.

Clock Gating has the potential of reducing both the power consumed by flip-flops and the
power consumed by the clock distribution network.

Clock Gating works by identifying groups of flip-flops sharing a common enable signal
(which indicates that a new value should be clocked into the flip-flops). This enable signal is
ANDed with the clock to generate the gated clock, which is fed to the clock ports of all of the
flip-flops that had the common enable signal.

In Fig. 2.1, the control signal encodes whether the latch retains its earlier value or takes
a new input. This signal is ANDed with the clock signal to generate the gated clock for the
latch. This transformation preserves the functional correctness of the circuit, and therefore
does not increase the burden of verification. This simple transformation can reduce the dynamic
power of a synchronous circuit by 5 to 10%.
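
The transformation can be sketched at the cycle level. The model below is a simplified assumption: it ignores glitches and timing, treats each list element as one clock cycle, and compares a register that is clocked every cycle (loading new data only when enabled) with the same register driven by a gated clock. The function names and input vectors are hypothetical.

```python
def run_ungated(data, enable, init=0):
    """Register clocked every cycle; it loads new data only when en = 1."""
    q, pulses, trace = init, 0, []
    for d, en in zip(data, enable):
        pulses += 1              # the clock pin toggles in every cycle
        q = d if en else q       # recirculate the old value when disabled
        trace.append(q)
    return trace, pulses


def run_gated(data, enable, init=0):
    """Same register with gated clock = clk AND en."""
    q, pulses, trace = init, 0, []
    for d, en in zip(data, enable):
        if en:                   # a clock pulse reaches the register only when enabled
            pulses += 1
            q = d
        trace.append(q)
    return trace, pulses


if __name__ == "__main__":
    data = [3, 7, 7, 1, 4, 4, 9, 2]
    enable = [1, 0, 0, 1, 0, 0, 0, 1]
    plain_trace, plain_pulses = run_ungated(data, enable)
    gated_trace, gated_pulses = run_gated(data, enable)
    print("same register contents:", plain_trace == gated_trace)
    print("clock pulses ungated/gated:", plain_pulses, gated_pulses)
```

Both versions hold the same register contents in every cycle, which reflects the claim that functional correctness is preserved, while the gated register receives a clock pulse only in the enabled cycles.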

There are several considerations in implementing Clock Gating. First, the enable signal
should remain stable while the clock is high and should switch only when the clock is in its
low phase. Second, to guarantee correct functioning of the logic driven by the gated clock,
the gated clock should be turned on in time and glitches on it should be avoided. Third,
the AND gate may result in additional clock skew.

For high-performance design with short-clock cycle time, the clock skew could be
significant and needs to be taken into careful consideration. An important consideration in the
implementation of Clock Gating for Application Specific Integrated Circuit (ASIC) designers is
the granularity of Clock Gating.

Clock Gating in its simplest form is shown in Fig. 2.1. At this level, it is relatively easy
to identify the enable logic. In a pipelined design, the effect of Clock Gating can be multiplied.
If the inputs to one pipeline stage remain the same, then all the later pipeline stages can also be
frozen.

Fig. 2.2: In pipelined designs, the effectiveness of Clock Gating can be multiplied. If the inputs to a pipeline
stage remain the same, then the clock to the later stages can also be frozen.
Figure 2.2 shows the same Clock Gating logic being used for gating multiple pipeline
stages. This is a multi-cycle optimization with multiple implementation tradeoffs, and can save
significant power, typically reducing switching activity by 15 to 25%. Apart from pipeline latches,
Clock Gating is also used for reducing power consumption in dynamic logic. Dynamic CMOS
logic is sometimes preferred over static CMOS for building high speed circuitry such as
execution units and address decoders. Unlike static logic, dynamic logic uses a clock to
implement the combinational circuits.

Dynamic logic works in two phases, precharge and evaluate. During precharge (when
the clock signal is low) the load capacitance is charged. During evaluate phase (clock is high)
depending on the inputs to the pull-down logic, the capacitance is discharged.
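
The two phases can be sketched with a behavioral model. The example below assumes a dynamic two-input NAND gate as the pull-down logic; the choice of gate and the function names are illustrative assumptions, not circuits from this report.

```python
def dynamic_nand(clk, a, b, node=1):
    """Return the output-node value after one clock phase of a dynamic 2-input NAND."""
    if clk == 0:
        return 1        # precharge: the load capacitance is charged high
    if a and b:
        return 0        # evaluate: the pull-down network conducts and discharges the node
    return node         # evaluate: no path to ground, the node keeps its charge


def simulate(phases):
    """Apply a list of (clk, a, b) phases and collect the output-node values."""
    node, trace = 1, []
    for clk, a, b in phases:
        node = dynamic_nand(clk, a, b, node)
        trace.append(node)
    return trace


if __name__ == "__main__":
    phases = [(0, 1, 1), (1, 1, 1), (0, 0, 0), (1, 1, 0)]
    print(simulate(phases))
```

Because the node is recharged on every precharge phase, the clock input of such a gate switches in every cycle, which is why gating its clock when the result is not needed saves power.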

2.1.4 Voltage and Frequency Scaling

Dynamic power is proportional to the square of the operating voltage. Therefore,
reducing the voltage significantly reduces the power consumption. Furthermore, since
frequency is directly proportional to supply voltage, the frequency of the circuit can also be
lowered, and thereby a cubic power reduction is possible. However, the delay of a circuit also
depends on the supply voltage as follows.

$$\tau = \frac{k \, C_L \, V_{dd}}{(V_{dd} - V_t)^{2}}$$

where $\tau$ is the circuit delay, $k$ is the gain factor, $C_L$ is the load capacitance,
$V_{dd}$ is the supply voltage, and $V_t$ is the threshold voltage. Thus, by reducing the
voltage, although we can achieve cubic power reduction, the execution time increases. The main
challenge in achieving power reduction through voltage and frequency scaling is therefore to
obtain power reduction while meeting all the timing constraints.
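
The tradeoff can be made concrete with a short calculation. This is a minimal sketch, assuming the delay has the form k * C_L * V_dd / (V_dd - V_t)^2 and that frequency is scaled in proportion to the supply voltage; the threshold voltage, load capacitance, and gain factor used here are illustrative assumptions.

```python
def circuit_delay(vdd, v_t=0.3, c_load=1e-13, k=1.0):
    """Delay model: k * C_L * Vdd / (Vdd - Vt)^2 (k, C_L and Vt are assumed values)."""
    if vdd <= v_t:
        raise ValueError("supply voltage must exceed the threshold voltage")
    return k * c_load * vdd / (vdd - v_t) ** 2


def relative_power(voltage_scale):
    """Relative dynamic power when V and F are both scaled by the same factor."""
    return voltage_scale ** 3


if __name__ == "__main__":
    nominal, scaled = 1.2, 0.9
    slowdown = circuit_delay(scaled) / circuit_delay(nominal)
    saving = relative_power(scaled / nominal)
    print(f"delay grows by a factor of {slowdown:.4f}")
    print(f"dynamic power falls to {saving:.4f} of nominal")
```

With these assumed numbers, lowering the supply from 1.2 V to 0.9 V reduces the dynamic power estimate to well under half of nominal but makes the circuit noticeably slower, which is the timing constraint problem described above.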

CHAPTER 3

INTEGRATION OF POWER REDUCTION TECHNIQUES


3.1 Clock Gating

Clock Gating is a technique that can be used to control the power dissipated by the clock
net. In synchronous digital circuits the clock net is responsible for a significant part of the
power dissipation (up to 40%). Clock Gating reduces unwanted switching on parts of the clock
net by disabling the clock. Register-Transfer Level (RTL) Clock Gating is the most common
technique used for optimization and improving efficiency, but it still leaves open the question
of how efficiently a design is clock gated. Gated clocking is a widely accepted technique for
optimizing power and can be applied at the system level, gate level, and RTL. Clock Gating
saves power by not clocking a register if there is no change in its state. The clock
continuously consumes power because it toggles the registers and their associated logic. So, to
reduce power consumption, Clock Gating shuts off the clock while the system maintains its
current state.

There are five different techniques for Clock Gating as discussed below:

3.1.1 AND Gate

Initially many authors suggested using an AND gate for Clock Gating because of its simple
logic. In a sequential circuit, one two-input AND gate is inserted in the logic for Clock
Gating. One input to the AND gate is the clock, while the second input is a signal used to
control the output (that is, it controls the sequential circuit's clock). For experimental
purposes we take a simple counter, shown in Fig. 3.1(A), as the sequential circuit.
Figure 3.1(B) shows the output waveform of the regular counter: initially, with reset = '0', the
counter is initialized to "0", and after that, with reset = '1', the counter increments at each
Negative edge of the clock.

Figure 3.2(A) shows the Clock Gating technique for the counter, implemented by inserting
one AND gate. Figure 3.2(B) shows the output of the counter when it is Negative edge triggered
and enable ('en') is asserted from one Negative edge of the clock to the next; in this case the
output of the counter changes one clock cycle after en = '1'.

From Figure 3.2(C) we observe that when the counter is Positive edge triggered and
enable is asserted from one Positive edge to the next, the counter increments one extra time.
This is caused by a tiny glitch that appears when enable goes low, owing to the longer fall
time of enable, and the output in this case is wrong.

Figure 3.2(D) shows that for a Positive edge triggered system, when enable is asserted
from one Negative edge of the clock to the next, the counter increments only once, at the
Positive edge of the clock, because when enable goes low the clock is at its Negative edge, not
its Positive edge. Figure 3.2(E) shows a major problem with hazards: any hazard on enable can
be passed on to Gclock when clock = '1'. This situation is particularly dangerous and could
jeopardize the correct functioning of the entire system.
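
The hazard problem can be reproduced with sampled waveforms. The sketch below is a simplified assumption: each clock phase is divided into four samples, gate delays are ignored, and a one-sample hazard is injected on enable while the clock is high. The waveform values are hypothetical and are not taken from the simulations in this report.

```python
def and_gate(clk, en):
    """Sample-by-sample AND of the clock and enable waveforms (gated clock)."""
    return [c & e for c, e in zip(clk, en)]


def rising_edges(wave):
    """Count the 0 -> 1 transitions in a sampled waveform."""
    return sum(1 for prev, cur in zip(wave, wave[1:]) if prev == 0 and cur == 1)


# Three clock cycles, four samples per phase, clock starts low.
CLK = ([0] * 4 + [1] * 4) * 3

# Clean enable: changes only while the clock is low, spans the second high phase.
EN_CLEAN = [0] * 8 + [1] * 8 + [0] * 8

# Same enable with a one-sample hazard during the first high phase of the clock.
EN_HAZARD = list(EN_CLEAN)
EN_HAZARD[5] = 1

if __name__ == "__main__":
    print("pulses with clean enable :", rising_edges(and_gate(CLK, EN_CLEAN)))
    print("pulses with hazard       :", rising_edges(and_gate(CLK, EN_HAZARD)))
```

With the clean enable the gated clock carries a single pulse, but the hazard passes straight through the AND gate while the clock is high and appears as an extra rising edge on the gated clock, which an edge-triggered counter would count.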

Fig. 3.1(A): Basic Counter (Negative edge triggered).

Fig. 3.1(B): Normal output of the counter without Clock Gating.

Fig. 3.2(A): Clock Gating using AND gate Circuit.

Fig. 3.2(B): Output of Counter when Counter is Negative edge triggered.

Fig. 3.2(C): Wrong Output due to Glitch, when counter is Positive edge triggered.

Fig. 3.2(D): Right Output when counter is Positive edge triggered.

Fig. 3.2(E): Hazards Problem when AND Clock Gating Circuitry used.

3.1.2 NOR Gate

The NOR gate is a very suitable choice for Clock Gating where actions need to be
performed on the Positive edge of the global clock. For the analysis using a NOR gate, the
circuit connection is shown in Figure 3.3(A); in this figure we can observe that the counter
works when enable is turned "ON".

Figure 3.3(B) shows the waveform for the incorrect output of the counter when enable
changes to '1' at the Negative edge of the clock. The incorrect output is due to a small
glitch: when enable goes low at the Negative edge of the clock, the counter increments one
extra time.

Figure 3.3(C) shows the output of the counter when enable changes from one Positive edge
to the next Positive edge but the counter is Negative edge triggered. Figure 3.3(D) shows the
correct output of the Positive edge triggered counter, because enable changes from one Positive
edge of the clock to the next Positive edge of the clock. Figure 3.3(E) shows a major problem
with hazards: any hazard on enable can be passed on to Gclock when clock = '0'. This
situation is particularly dangerous and could jeopardize the correct functioning of the
entire system.

Fig. 3.3(A): Clock Gating using NOR gate Circuit.

Fig. 3.3(B): Incorrect Output of Counter when Counter is Positive edge triggered.

Fig. 3.3(C): Output of Counter when enable changes from Positive edge to next Positive edge but Counter is Negative edge Triggered.

Fig. 3.3(D): Correct Output of Counter when counter is Positive edge triggered.

Fig. 3.3(E): Hazards Problem when NOR Gate is used for Clock Gating.

3.1.3 Latch Based AND Gate Clock Gating

The Latch Based AND Gated Clock circuit is shown in Figure 3.4(A). To overcome the
previous problems of incorrect output, the enable signal 'En' is applied through a latch instead
of being connected directly to the AND gate. The latch is needed for correct behavior, because
En might have hazards that must not propagate through the AND gate when the global clock is
'1'. However, the delay of the logic that computes En may lie on the critical path of the
circuit, and its effect must be taken into account during timing verification.

It is clear from Figure 3.4(B) that the counter takes one extra clock cycle to start
changing its state, after which it works normally until En is de-asserted; it then again takes
one extra clock cycle to stop changing its state.

Figure 3.4(C) verifies that unwanted outputs due to Hazards at En are avoided. The waveform
in Figure 3.4(E) shows that when the controlling latch is positive and the counter is also
Positive edge triggered, the output of the counter is incorrect: because of a tiny glitch, it
increments once more even after enable has been turned down.
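The hazard filtering described above can be illustrated with a small behavioral sketch. It is not the simulated Tanner circuit: it assumes a sampled binary waveform, a latch that is transparent while the clock is 0 and holds while the clock is 1, and hypothetical function names chosen only for this illustration.

```python
def count_rising_edges(signal):
    """Count 0 -> 1 transitions in a sampled binary waveform."""
    return sum(1 for prev, cur in zip(signal, signal[1:]) if prev == 0 and cur == 1)


def and_gated_clock(clk, en):
    """Plain AND gating: the enable drives the AND gate directly."""
    return [c & e for c, e in zip(clk, en)]


def latch_and_gated_clock(clk, en):
    """Latch based AND gating (assumed model): the latch follows En while
    clk is 0 and holds its value while clk is 1, so a hazard on En during
    the high phase never reaches the AND gate."""
    latched = 0
    gated = []
    for c, e in zip(clk, en):
        if c == 0:              # latch is transparent during the low phase
            latched = e
        gated.append(c & latched)
    return gated


# Four samples per clock period: low, low, high, high.
clk = [0, 0, 1, 1] * 4

# En stays low except for a one-sample hazard while the clock is high.
hazard_en = [0] * 16
hazard_en[7] = 1

# A clean enable that changes only while the clock is low.
clean_en = [0] * 4 + [1] * 8 + [0] * 4

print("plain AND, hazard on En :", count_rising_edges(and_gated_clock(clk, hazard_en)))
print("latch AND, hazard on En :", count_rising_edges(latch_and_gated_clock(clk, hazard_en)))
print("latch AND, clean En     :", count_rising_edges(latch_and_gated_clock(clk, clean_en)))
```

In this model the plain AND gate passes the hazard to the gated clock as a spurious pulse, while the latch based version suppresses it and still passes the clock pulses that fall inside the clean enable window.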

Fig. 3.4 (A): Clock Gating of Negative edge counter using Negative Latch Based AND gate Circuit.

Fig. 3.4 (B): Normal output of Negative edge Counter when Negative Latch based AND Gated Clock is used.

Fig. 3.4 (C): Output of Negative edge counter when there are some random Hazards at En.

Fig. 3.4 (D): Clock Gating of Positive edge counter using Positive Latch Based AND gate Circuit.

Fig. 3.4 (E): Output of counter when latch is positive and counter also Positive edge triggered

3.1.4 Latch Based NOR Gate Clock Gating

The Latch based NOR Gated Clock scheme is shown in Figure 3.5(A). Here the enable signal is
applied through a latch instead of being connected directly to the NOR gate. We can observe
from Figure 3.5(B) that the counter takes one extra clock cycle to start changing its state,
after which it works normally until En is de-asserted; it then again takes one extra clock
cycle to stop changing its state. Figure 3.5(C) verifies that unwanted outputs due to Glitches
at En are avoided. Figure 3.5(D) shows the Clock Gating of a Negative edge counter using the
Negative Latch Based NOR gate circuit.

The waveform in Figure 3.5(E) shows the case when the controlling Latch is negative and the
Counter is also Negative edge triggered. The output of the counter is incorrect: because of a
tiny glitch caused by the fall time delay of enable, it increments once more even after enable
has been turned down.

Fig. 3.5 (A): Clock Gating of Negative edge counter using Positive Latch Based NOR gate Circuit.

Fig. 3.5 (B): Normal output of Negative edge Counter when Positive Latch based NOR Gated Clock is used.

Fig. 3.5 (C): Output of Negative edge counter when there are some random Hazards at En.

Fig. 3.5 (D): Clock Gating of Negative edge counter using Negative Latch Based NOR gate Circuit.

Fig. 3.5 (E): Output of counter when latch is negative and counter also Negative edge triggered.

3.1.5 MUX Based Clock Gating

In Multiplexer (MUX) based Clock Gating, a multiplexer closes and opens a feedback loop around
a basic D-type flip-flop under control of the enable signal, as shown in Figure 3.6(A). Because
the resulting circuit is simple, robust, and compliant with the rules of synchronous design,
this is a safe and often reasonable choice. On the negative side, this approach takes one
fairly expensive multiplexer per bit and consumes more power, because any toggling of the clock
input of a disabled flip-flop wastes energy in discharging and recharging the associated node
capacitances for nothing.

The capacitance of the clock input is not the only contribution, as any clock edge causes
further nodes to toggle within the flip-flop itself. Figure 3.7(A) shows the waveform of the
Negative edge triggered Counter and Figure 3.7(B) that of the Positive edge triggered Counter.
We can observe from these waveforms that while En is ON the counter increments at each Negative
or Positive edge of the clock, respectively, and when En goes Low the counter holds its state.
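As a rough behavioral sketch of the MUX based scheme, the counter below receives a clock edge in every cycle, and a 2:1 multiplexer decides whether the flip-flops reload their old value or take the incremented one. The function name and the 4-bit width are assumptions made for illustration only.

```python
def mux_enable_counter(enable_per_cycle, width=4):
    """Model a counter whose flip-flops are clocked on every cycle.

    A 2:1 multiplexer in front of the flip-flops selects the incremented
    value when En is 1 and feeds the current value back when En is 0.
    Returns the counter value after each clock edge.
    """
    count = 0
    history = []
    for en in enable_per_cycle:
        incremented = (count + 1) % (1 << width)
        count = incremented if en else count   # the multiplexer
        history.append(count)
    return history


enable = [1, 1, 0, 0, 1]
values = mux_enable_counter(enable)
print("counter values :", values)
print("clock edges    :", len(enable), "(the clock is never stopped)")
```

The state is held correctly while En is low, but the clock still toggles in every cycle, which is the power drawback noted above.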

Fig. 3.6(A): Logic of MUX Based Gated Clock. (B) Counter using MUX Based Clock Gating.

Fig. 3.7(A): Output of Negative edge triggered Counter with MUX Based Clock Gating.

Fig. 3.7(B): Output of Positive edge triggered Counter with MUX Based Clock Gating.

Fig.3.8 (A): Generation of Gated Clock When Negative Latch is used.

Fig. 3.8 (B): Generation of Gated Clock When Positive Latch is used

3.1.6 New Approach for Clock Gating

In this section, we discuss a new design that saves more power. The new Gated Clock
Generation Circuit is shown in Figures 3.8(A) and 3.8(B), using a Negative Latch and a Positive
Latch respectively. This circuit saves power because the controlling device's clock is OFF both
when the Target device's clock is ON and when the Target device's clock is OFF. In this way we
save more power by avoiding unnecessary switching on the clock net.

To understand the working of the circuit, consider Figure 3.8(A), in which an input signal
named 'En' is provided to the latch. When En turns to '1', GEN is still '0', so the XNOR
produces x='0', which goes to the first clock generation logic; this logic generates the clock
for the controlling device (Latch). The first logic is an OR gate with the Global Clock at its
other input. It generates a clock pulse that drives the controlling latch when 'x' turns to
'0'.

In the next clock pulse GEN turns to '1'. The second clock generation logic is an AND gate
with GEN and the Global clock at its inputs; when GEN goes to '1' it generates the clock pulses
that go to the target device. Since GEN is '1', the XNOR produces x='1', so the OR gate holds
CClk constantly high until En turns to '0'. In this way GClk keeps running while CClk stays at
a constant '1', which means the latch holds its state without any switching.
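The sequence described for Figure 3.8(A) can be traced with a small behavioral sketch. It assumes x = XNOR(En, GEN), CClk = x OR clk, GClk = GEN AND clk, and a negative latch that copies En into GEN whenever CClk is low; the sampling scheme and the function names are illustrative assumptions, not the simulated circuit.

```python
def xnor(a, b):
    """Two-input XNOR on 0/1 values."""
    return 1 if a == b else 0


def simulate_fig_3_8a(clk_samples, en_samples):
    """Trace GClk and count how often the controlling latch is clocked.

    Assumed model: the latch is transparent while CClk is 0, so each
    time CClk goes low the latch output GEN takes the value of En.
    """
    gen = 0
    latch_clock_pulses = 0
    gclk_trace = []
    for clk, en in zip(clk_samples, en_samples):
        x = xnor(en, gen)
        cclk = x | clk              # first clock generation logic (OR)
        if cclk == 0:               # latch clocked only when En != GEN
            latch_clock_pulses += 1
            gen = en
        gclk_trace.append(gen & clk)  # second clock generation logic (AND)
    return gclk_trace, latch_clock_pulses


clk = [1, 0] * 6
en = [0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
gclk, pulses = simulate_fig_3_8a(clk, en)
print("GClk               :", gclk)
print("latch clock pulses :", pulses)
```

In this model the latch is clocked only twice, once when En rises and once when it falls, while the target device keeps receiving GClk pulses in between.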

The circuit shown in Figure 3.8(B) performs a similar sequence of operations to the circuit
shown in Figure 3.8(A). When En turns to '1', GEN is still '0', so the XOR produces x='1',
which goes to the first clock generation logic; this logic generates the clock for the
controlling device (Latch). The first logic is an AND gate with the Global Clock at its other
input. It generates a clock pulse that drives the controlling latch when 'x' turns to '1'.

In the next clock pulse GEN turns to '1'. The second clock generation logic is an OR gate
with Q and the Global clock at its inputs; when Q goes to '0' it generates the clock pulses
that go to the target device. Since GEN is '1', the XOR produces x='0', so the AND gate holds
CClk constantly LOW until En turns to '0'. In this way GClk keeps running while CClk stays at
a constant '0', which means the latch holds its state without any switching.

The output of the Counter for the circuit in Figure 3.8(A) is shown in Figures 3.9(A) and
3.9(B). In Figure 3.9(A) enable changes from one Negative edge to the next and the target is
Negative edge triggered; in Figure 3.9(B) enable changes from one Positive edge to the next and
the target is Positive edge triggered. The delay before the counter changes state differs
between the two cases, but the output is correct in both, which solves the problem that
persists in the first four types of Clock Gating.

The output of the Counter for the circuit in Figure 3.8(B) is shown in Figures 3.10(A) and
3.10(B). In Figure 3.10(A) enable changes from one Negative edge to the next and the target is
Negative edge triggered; in Figure 3.10(B) enable changes from one Positive edge to the next
and the target is Positive edge triggered. Again the delay before the counter changes state
differs between the two cases, but the output is correct in both, which solves the problem that
persists in the first four types of Clock Gating. Thus one can avoid more switching and save
power.

Fig 3.9(A) Output of Negative edge Counter with gated clock for circuit shown in Fig. 3.8(A).

Fig 3.9(B): Output of Positive edge Counter with gated clock for circuit shown in Fig. 3.8(A).

Fig. 3.10(A): Output of Negative edge Counter with gated clock for circuit shown in Fig.3.8(B).

Fig. 3.10(B): Output of Positive edge Counter with gated clock for circuit shown in Fig.3.8(B).

3.2 Integration of Clock Gating

Clock Gating is the most common and widely used technique to reduce dynamic power, and
Power Gating is the dominant technique to reduce standby leakage power. As active leakage
power becomes more and more important it also requires care.

Clock Gating is accomplished by using Clock Gating Integrated Cell (CGIC) which gates
the clock to the sequential elements present in its fan-out when the enable signal is logic 0.
Power Gating structures may be of two types: Simple Power Gating and State Retention Power
Gating.

Using the former technique, the output of the logic gates slowly leaks the charge at the
output and thereby when the Sleep signal is de-asserted, one cannot predict the logic value at
the output. The latter technique is able to retain the state at the output which was last present
before asserting the Sleep signal.

Let's take up a few plausible scenarios:

Case I - Normal Case: only conventional Clock Gating is employed, as depicted in the figure
below.

Fig. 3.20: Conventional Clock Gating

Case II - When one does not need to retain the states of the combinatorial cells or the
sequential elements. One possible scenario is a standalone IP that does not communicate with
any other IP on the SoC. Here one can use simple Power Gating, where the SLEEP signal is
derived from the CGIC itself using a latch, as depicted in the figure below. Doing so saves
both dynamic and leakage power.

Fig. 3.21: Modified Clock Gating

Case III - When one does not need to retain the states of the combinatorial cells, but the
sequential outputs need to be safe-stated. A possible use case is one where only the sequential
outputs communicate with other IPs on the SoC. This can be accomplished by using State
Retention Flip Flops instead of conventional flip-flops.

Case IV - When both the combinatorial cells and the sequential cells interact with other IPs,
but the previous value is not required. Since this is a classic case of interaction between a
"switchable" power domain and an "always ON" power domain, it entails the use of isolation
cells at such power domain crossings. Note that in such a case the isolation cell is always
placed in the always ON power domain, i.e., it receives its VDD supply from the always ON
power domain supply. This is because the isolation cell can function while the switchable
power domain is OFF only if it still receives a power supply.

Fig. 3.22: Power Gating

Isolation Cells can be simple cells such as an AND or an OR gate, which receive one input
such that the output value is controllable irrespective of the second input coming from the
switchable power domain: for example, logic 0 for an AND gate and logic 1 for an OR gate.
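The clamping behavior of such isolation cells can be checked with a short sketch. The function names and the convention that the control input comes from the always ON domain are assumptions made for illustration.

```python
def and_isolation(domain_signal, control):
    """AND-type isolation cell: control = 0 clamps the output to 0."""
    return domain_signal & control


def or_isolation(domain_signal, control):
    """OR-type isolation cell: control = 1 clamps the output to 1."""
    return domain_signal | control


# Whatever value leaks out of the switched-off domain, the output is fixed.
for floating_value in (0, 1):
    print(floating_value, and_isolation(floating_value, 0), or_isolation(floating_value, 1))
```

With the control input released (1 for the AND cell, 0 for the OR cell) both cells simply pass the signal from the switchable domain.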

To distinguish it from traditional Power Gating (PG), which is used to reduce standby leakage,
PG used to minimize active leakage power in operation mode is referred to as Run Time Power
Gating (RTPG). Clock Gating (CG) is a technique used to gate the unnecessary clock toggles of a
register. During the clock gated period some components perform redundant operations, and RTPG
puts these components to sleep. Several researchers have focused on the integration of CG and
RTPG.

All of their designs are based on clock gated designs generated after synthesis, and they
evaluate the feasibility of RTPG according to the signal activity of the design. However, it is
possible that a design cannot be clock gated during synthesis, in which case their approach
cannot be used. Moreover, CG should also be analyzed against the signal activity of the design
to determine whether dynamic power is actually reduced: if dynamic power increases, the total
power may increase even if active leakage power is reduced. In this method, we use an
activity-driven fine-grained CG and RTPG integration, which can reduce dynamic power and active
leakage power simultaneously. An activity-driven Optimized Bus Specific Clock Gating (OBSC) is
used to maximize dynamic power reduction at RT level before synthesis. It selects only a subset
of flip-flops (FFs) to be gated, and the problem of gated FF selection is reduced from
exponential to linear complexity.

Fig. 3.23: Integration of CG and RTPG.

After the OBSC is applied to the design, the components performing redundant
operations during the clock gated period are determined by forward traversing the circuit from
the gated FF outputs. These components will be power gated using the clock enable signal
generated by OBSC only if the implementation of RTPG can reduce active leakage power.

CHAPTER 4
APPLICATION OF THE INTEGRATED TECHNIQUES
We have proposed an activity-driven fine-grained CG and RTPG integration, which can
reduce dynamic power and active leakage power simultaneously. In this application, we use a
common CG structure (a Latch based AND Clock Gating approach) to reduce the dynamic power.
The partial BSC (PBSC afterward) circuit is used because it consumes much less power. We use a
common PG structure (a Variable Body Biasing technique) to reduce the leakage power. The sleep
signal of the RTPG we focus on is generated by CG in operation mode; it is used to turn off the
components that are executing redundant operations in operation mode.

The Latch based AND Clock Gating approach and the Variable Body Biasing technique can
be implemented successfully in logic design. To verify this, they are applied to a chain of
five flip-flops, chosen because the flip-flop is the most basic logic circuit in CMOS
technology.

4.1 A chain of 5 Flip-flops

Fig. 4.1: Typical non-CG circuit

Flip-flops are the basic storage elements used extensively in all kinds of digital designs.
In particular, digital designs nowadays often adopt intensive pipelining techniques and employ
many FF-rich modules. It is also estimated that the power consumption of the clock system,
which consists of clock distribution networks and storage elements, is as high as 20%-45% of
the total system power. The delay flip-flop (DFF) forms an integral part of a digital system
and is used to construct the sequential part of the circuit with low power and low area.

4.2 Clock gating method

Fig. 4.2: Traditional XOR-based CG circuitry

Fig. 4.3: Bus-Specific-Clock-gating (BSC)

Fig.4.2 is the traditional XOR-based CG circuitry [we call it Bus-Specific-Clock-gating
(BSC) afterwards]. BSC circuit compares the inputs and outputs, and gates the clock when they
are equal. BSC can be used as a final CG option to reduce dynamic power when no CG can be
applied during synthesis. However, BSC is far from optimal in terms of dynamic power
minimization, and the partial BSC (PBSC afterward) circuit [Fig.4.3] may have much less
power.

Optimized Bus Specific Clock Gating is a very effective technique for maximizing dynamic
power reduction. It works by comparing the inputs and outputs of the FFs and gating the clock
when they are equal, but it chooses only a subset of the flip-flops (FFs) to be gated. With N
FFs in the non-CG circuit, each FF can be chosen as gated or non-gated, so 2^N CG solutions are
possible; OBSC reduces this exponential selection problem to linear complexity.

Fig 4.4: Basic scheme of proposed OBSC technique.

Assume that all the FFs are chosen to be gated initially; the problem is then to determine
which FFs should be excluded from gating. Heuristically, the FF with the maximum output data
toggle rate should be excluded from gating first, because a maximum output data toggle rate
means that the fewest clock toggles would be gated, so gating that FF reduces power least or
may even increase it.

More formally, the FF with the maximum output toggle rate is excluded from gating
first, then the FF with the second largest output toggle rate, and so on until all the
FFs are excluded (i.e., the original non-CG circuit). This exclusion process yields N+1
possible CG solutions, which is linear complexity.
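The exclusion heuristic above can be sketched as follows. The candidate generation follows the text (sort by output toggle rate, exclude the most active FFs first, N+1 candidates); the power model `toy_power`, its overhead value, and the FF names are purely hypothetical stand-ins for a real power estimate.

```python
def obsc_candidates(toggle_rates):
    """Return the N+1 candidate sets of gated FFs.

    FFs are excluded from gating in order of decreasing output toggle
    rate: candidate k gates every FF except the k most active ones.
    """
    order = sorted(toggle_rates, key=toggle_rates.get, reverse=True)
    return [frozenset(order[k:]) for k in range(len(order) + 1)]


def toy_power(toggle_rates, gated, overhead=0.2):
    """Hypothetical power model used only for this illustration.

    An ungated FF costs 1.0 (its clock always toggles). A gated FF costs
    the activity of the shared gated clock, approximated here by the
    largest toggle rate in the gated group, plus a fixed gating overhead.
    """
    ungated_power = float(len(toggle_rates) - len(gated))
    if not gated:
        return ungated_power
    group_activity = max(toggle_rates[name] for name in gated)
    return ungated_power + len(gated) * (group_activity + overhead)


def choose_gated_set(toggle_rates, power_model=toy_power):
    """Evaluate the N+1 candidates and keep the cheapest one."""
    return min(obsc_candidates(toggle_rates),
               key=lambda gated: power_model(toggle_rates, gated))


rates = {"ff0": 0.9, "ff1": 0.1, "ff2": 0.05, "ff3": 0.5}
print("candidates evaluated:", len(obsc_candidates(rates)))
print("FFs chosen for gating:", sorted(choose_gated_set(rates)))
```

Only N+1 candidates are evaluated instead of all 2^N subsets, which is the reduction to linear complexity described in the text.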

4.3 Clock Gating without Run Time Power Gating method

4.3.1.Combinational Logic Power Model:

If the logic is part of a synchronous digital system controlled by a global clock, the average
dynamic power dissipated by the gate, $P^{comb}_{avg}$, can be expressed as

$$P^{comb}_{avg} = 0.5 \cdot \frac{V_{dd}^{2}}{T_{cyc}} \cdot TR \cdot C$$

Where Vdd is the supply voltage, Tcyc is the global clock period, TR is toggle rate of the gate
output, and C is the gate output capacitance.

Among these four parameters, only Vdd and Tcyc can be determined in advance from the
technology and design information, and they can be treated as constants in the estimation
process. TR depends on both the logic function being performed and the statistical properties of
the primary inputs. When the output of a combinational logic gate toggles every clock cycle,
its TR is 1, and the power dissipated by this gate is defined as the unit power
$P^{comb}_{unit}$. As a result,

$$P^{comb}_{avg} = P^{comb}_{unit} \cdot TR$$

where $P^{comb}_{unit}$ is a function of C and can be determined once the circuit structure is
known.
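A short numerical sketch of the combinational power model is given below. The supply voltage, clock period, capacitance and toggle rate are illustrative assumptions, not values taken from the simulated design.

```python
def average_comb_power(vdd, t_cyc, toggle_rate, capacitance):
    """Average dynamic power of a gate: 0.5 * Vdd^2 / Tcyc * TR * C (watts)."""
    return 0.5 * vdd ** 2 / t_cyc * toggle_rate * capacitance


def unit_comb_power(vdd, t_cyc, capacitance):
    """Unit power: the average power when the output toggles every cycle (TR = 1)."""
    return average_comb_power(vdd, t_cyc, 1.0, capacitance)


# Assumed example values: 1.8 V supply, 10 ns clock period, 10 fF load.
VDD = 1.8
T_CYC = 10e-9
C_LOAD = 10e-15
TR = 0.25

p_unit = unit_comb_power(VDD, T_CYC, C_LOAD)
p_avg = average_comb_power(VDD, T_CYC, TR, C_LOAD)
print(f"unit power    = {p_unit:.3e} W")
print(f"average power = {p_avg:.3e} W")
```

The average power is the unit power scaled by the toggle rate, which matches the second relation above.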

4.3.2. Sequential Logic Power Model:

The sequential logic is normally a latch or a D Flip Flop. The power estimation of the
sequential logic cannot be evaluated by the above technique. For a D FF/latch, its operation per
each clock cycle can be classified into four categories:

A. OP_I both clock and input data toggle;


B. OP_II only clock toggles;
C. OP_III only input data toggles;
D. OP_IV neither clock nor input data toggles.

Fig. 4.5: Clock Gating without Run Time Power Gating method

4.3.3. Power Analysis of Non-CG and PBSC Circuits

Suppose the toggle rate of the ith (1 <= i <= N) FF input is TR_i in Fig. 4.1. Since the
circuit power is analyzed at RTL, there are no glitches at the FF input. Consequently, the
toggle rate of the ith FF output is also TR_i. On the other hand, for a register that
consists of n FFs, suppose the toggle rate of the register bus input (or output) is Tn. Tn
should be no smaller than any TR_j.
A. Power of the Non-CG Circuit [Refer to Fig. 4.2]
For the non-CG circuit, the clock signal is always toggling, so the operation of each FF
in the non-CG circuit is either OP_I or OP_II. The operation time distribution of the ith FF
is tOP_I = TR_i and tOP_II = 1 - TR_i.
B. Power of the PBSC Circuit [Refer to Fig. 4.3]
Unlike the non-CG circuit, the PBSC circuit contains combinational logic. With the toggle rate
information of each FF, the output toggle rate of each combinational gate can be obtained. For
example, the output toggle rate of the n-fan-out AND gate is 2Tn, because both the rising and
falling edges of the clock are required to latch one toggled bus input whose toggle rate is Tn.
Similarly, the output toggle rates of the ith XOR gate and the n-input OR gate can be figured
out as 2TR_i and Tn.
For the sequential logic in the PBSC circuit, the operation distribution of each non-gated
FF is the same as in the non-CG circuit. For the ith (1 <= i <= n) gated FF, the distribution
of its operation time is tOP_I = TR_i (when its input toggles), tOP_II = Tn - TR_i (when its
input does not toggle but other gated FF inputs toggle), tOP_III = 0, and tOP_IV = 1 - Tn (when
no input toggles and the clock is gated). For the latch, the operation distribution cannot be
represented by the toggle rate information, so we assume its maximum power consumption, Plmax.
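The operation time distributions above can be written down directly; the sketch below uses hypothetical function names and example toggle rates, and only checks that the four fractions of each FF add up to one clock cycle.

```python
def non_gated_ff_distribution(tr_i):
    """Fractions of time a non-gated FF spends in OP_I..OP_IV.

    Its clock always toggles, so only OP_I (clock and data toggle)
    and OP_II (only clock toggles) occur.
    """
    if not 0.0 <= tr_i <= 1.0:
        raise ValueError("toggle rate must lie in [0, 1]")
    return {"OP_I": tr_i, "OP_II": 1.0 - tr_i, "OP_III": 0.0, "OP_IV": 0.0}


def gated_ff_distribution(tr_i, t_n):
    """Fractions of time the ith gated FF spends in OP_I..OP_IV.

    tr_i is the FF input toggle rate and t_n the toggle rate of the
    register bus, which cannot be smaller than tr_i.
    """
    if not 0.0 <= tr_i <= t_n <= 1.0:
        raise ValueError("expected 0 <= TR_i <= Tn <= 1")
    return {"OP_I": tr_i, "OP_II": t_n - tr_i, "OP_III": 0.0, "OP_IV": 1.0 - t_n}


# Assumed example: the FF toggles in 10% of the cycles, the bus in 30%.
gated = gated_ff_distribution(0.1, 0.3)
plain = non_gated_ff_distribution(0.1)
print("gated FF    :", gated)
print("non-gated FF:", plain)
```

For the gated FF, the OP_IV fraction is the share of cycles in which its clock is stopped, which is where the dynamic power saving comes from.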

4.4.Auto-gated flip-flops:
The basic circuit used for LACG is the Auto-Gated Flip-Flop (AGFF) illustrated in Fig

The FF's master latch becomes transparent on the falling edge of the clock, and its
output must stabilize no later than a setup time before the arrival of the clock's rising edge,
when the master latch becomes opaque and the XOR gate indicates whether or not the slave latch
should change its state. If it should not, its clock pulse is stopped; otherwise the pulse is
passed. In [12] a significant power reduction was reported for register-based small circuits,
such as counters, where the input of each FF depends on the output of its predecessor in the
register.
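The auto-gating decision can be sketched at cycle level as below. This is a simplified behavioral model with assumed names: it treats the master latch output as equal to the sampled D input and counts how many clock pulses actually reach the slave latch.

```python
def agff_run(data_per_cycle, initial_q=0):
    """Cycle-level model of an auto-gated flip-flop.

    In each cycle the XOR of the master latch output and the current
    slave output Q decides whether the slave clock pulse is passed.
    Returns the Q value after each cycle and the number of passed pulses.
    """
    q = initial_q
    slave_clock_pulses = 0
    outputs = []
    for d in data_per_cycle:
        master = d                      # value captured by the master latch
        if master ^ q:                  # XOR = 1: the slave must change state
            q = master
            slave_clock_pulses += 1     # the clock pulse is passed
        # XOR = 0: the pulse is stopped and Q is simply held
        outputs.append(q)
    return outputs, slave_clock_pulses


data = [0, 0, 1, 1, 1, 0]
q_values, pulses_passed = agff_run(data)
print("Q per cycle        :", q_values)
print("slave clock pulses :", pulses_passed, "of", len(data))
```

The output follows the data exactly as in an ordinary flip-flop, but the slave latch is clocked only in the cycles where its state changes.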

AGFF can also be used for general logic, but with two major drawbacks. Firstly, only the
slave latches are gated, leaving half of the clock load ungated. Secondly, serious timing
constraints are imposed on those FFs residing on critical paths, which prevents their gating.

LACG takes AGFF a leap forward, addressing three goals: stopping the clock pulse in the
master latch as well, making it applicable to large and general designs, and avoiding the tight
timing constraints. LACG is based on using the XOR output in Fig. 4 to generate the clock
enabling signals of other FFs in the system whose data depend on that FF. There is a problem,
though: the XOR output is valid only during a narrow window around the clock rising edge,
bounded by the FF's setup time before the edge and its clock-to-output contamination delay
after it.

After that delay the XOR output is corrupted and eventually turns to zero. To be valid during
the entire positive half cycle it must be latched, as shown in Fig. 5(a). Fig. 5(b) is the
symbol of the enhanced AGFF with the XOR output. The power consumed by the new latch can be
reduced by gating its clock input. Such gating has been proposed in [16]; it involves
additional XOR and OR gates and is useful when the clock switching probability is high. It is
subsequently shown that this probability is very low, so the latch clock is not gated further.

Fig. 4.6 illustrates how LACG works. We distinguish between target and source FFs: a target
FF depends on its source FFs. It is required that the logic driving a target FF has no input
from outside the block. Consider the set of the XOR outputs of the source FFs and the set of
their corresponding outputs. The source FFs can be found by traversing the logic paths backward
from the target FF's input to the source FF outputs, which can be performed either on the RTL
or on the net-list description of the underlying system.

The logic tree whose root is the target FF's input and whose leaves are the source FF outputs
is sometimes called the logic cone of the target [11]. Consider two successive clock cycles,
where the time ticks refer to the rising edges of the clock pulses.
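The backward traversal that finds the source FFs of a target can be sketched on a toy net-list. The dictionary representation, the node names and the function name are assumptions made for illustration; a real flow would work on the RTL or gate-level net-list of the design.

```python
def find_source_ffs(target_input, fanin, ff_outputs):
    """Collect the FF outputs that form the leaves of a target's logic cone.

    fanin maps each net to the nets driving it; ff_outputs is the set of
    nets that are flip-flop outputs. The walk starts at the target FF's
    data input and stops whenever it reaches a flip-flop output.
    """
    sources = set()
    visited = set()
    stack = [target_input]
    while stack:
        net = stack.pop()
        if net in visited:
            continue
        visited.add(net)
        if net in ff_outputs:
            sources.add(net)            # leaf of the logic cone
            continue
        stack.extend(fanin.get(net, []))
    return sources


# Toy net-list: D_target = g1(Q_a, g2(Q_b, Q_c)); Q_z is unrelated.
toy_fanin = {
    "D_target": ["g1"],
    "g1": ["Q_a", "g2"],
    "g2": ["Q_b", "Q_c"],
}
toy_ff_outputs = {"Q_a", "Q_b", "Q_c", "Q_z"}

print(sorted(find_source_ffs("D_target", toy_fanin, toy_ff_outputs)))
```

Only the FFs that actually feed the target's logic cone are returned, so an unrelated FF such as Q_z in this toy example does not contribute to the target's enabling signal.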

Fig. 4.6: LACG of general logic.

CHAPTER 5

SOFTWARE TOOL AND SIMULATION RESULTS


5.1 Software Tool

Tanner EDA helps transform ideas into designs. It provides a cost-efficient software
platform that is powerful enough to handle complex designs. Tanner EDA's continued innovation
makes its tools an effective solution that grows with a company as its performance needs
change. Tanner EDA consists of various tools, namely S-Edit, T-Spice, W-Edit, L-Edit and
LVS.

5.1.1 S-edit

S-Edit is the schematic design tool. It lets you check your design for common errors such as
undriven nets, unconnected pins and nets driven by multiple outputs, so you can catch errors
early, before running simulations.

Fig. 5.1: Schematic design of an inverter circuit in S-edit

5.1.2 T-spice

T-Spice lets you precisely characterize circuit behavior using virtual data measurements.
For greater efficiency and productivity, T-Spice gives you control over your simulation process
through an easy-to-use graphical interface.

5.1.3 W-edit

The W-Edit waveform analysis tool is a comprehensive viewer for comparing, displaying and
analyzing simulation results. W-Edit is dynamically linked to T-Spice and S-Edit
with a run-time update feature that displays simulation results as they are being generated and
allows waveform cross-probing directly in the schematic editor for faster design cycles.

Fig. 5.2: Waveforms of the inputs and outputs of the inverter circuit

5.1.4 L-edit

Layout is essentially a drawing process. L-Edit gives you the flexibility and control you
need to master the editing process.

5.1.5 LVS

LVS (Layout Versus Schematic) compares the net list generated from the schematic with the net
list generated from the layout. If the compared parameters match, it is an indication that the
designed layout is ready for fabrication.

5.2 Steps to be followed to design a circuit and estimate the power dissipation
of the circuit
1. Open S-edit, then select File> New > New Design.

Fig. 5.3: File menu in S-edit Window

2. Give a name for the design & select the required folder for the design.

Fig. 5.4: To name the design and select the folder for the design.

3. Then select Cell > New View for drawing the schematic.

Fig. 5.5: For defining the cells for different schematic designs in one project
4. Then give a name for the schematic & click OK.

Fig. 5.6: To name the schematic design of the cells

5. Then click on add in the libraries window for adding the Tanner library file.

Fig. 5.7: Adding the libraries for the design

6. Then browse for the library file in MY DOCUMENTS/TANNER TOOLS V13.0/LIBRARIES/ALL/ALL.TANNER

Fig. 5.8: To browse the library files

Fig. 5.9: After libraries are added, they are displayed in library window.
7. Place the nMOS of inverter as shown in the figure, from Devices library.

Fig. 5.10: Designing the circuit by integrating the components from the library.

8. Similarly place the pMOS, place supply (VDD) & ground (GND) from MISC library.

9. To edit the parameters of a device, select the device & press CTRL+E; the parameters can
then be edited in the Properties window on the right.

10. Then place the Input & Output ports. You can give any name to the ports before placing them.

Fig. 5.11: Naming the input and output ports in the circuit
11. Now using the wire tool, make the circuit connections of the inverter.

Fig. 5.12: The wiring connections of the circuit are done by using the wire tool.

12. The BULK pin of the pMOS should be connected to VDD. Similarly, connect the BULK pin of
the nMOS to GND.

13. Then specify a symbol for the inverter by selecting, Cell >Update Symbol.

Fig. 5.13: The circuit can be made as a symbol by update symbol

14. Default symbol will be a square. You can modify the symbol by using drawing tools in the
toolbar.

Fig. 5.14: By using drawing tools in toolbar the symbols can be modified

15. After drawing the symbol, save the design & close it.

The next step is to create the test circuit:

A. Create a new design.
B. Then add the previous design.
C. All the default libraries will be included along with it.
D. Now drag & drop the INVERTER cell symbol into the design.
E. Place the DC voltage source for VDD from the SPICE ELEMENTS library.
F. Before placing it, change the voltage as required. (For 180nm technology, VDD is
about 2.5V.)
G. Now for the input to the inverter, choose the Interface as BIT. Change the properties
as needed.
H. After placing, right click on an empty region.
I. Then place a load capacitor.
J. Wire the devices.
K. Save the design.
L. Browse for the required model file & select OK.
M. Select TRANSIENT/FOURIER ANALYSIS. Set the maximum time step as 2n
& the stop time as required (here 200n).
N. To print the voltages, select PRINT VOLTAGE from the SPICE_COMMANDS
library and place it on the required nodes (here on the input & output nodes).
O. Save the design and click on the START SIMULATION icon. The simulated output waveform
will open if the simulation completes without any errors.

16. Alternatively, click on OPEN IN T-SPICE to view the net list & to simulate directly in T-Spice.
Save the net list in the required location before simulating in T-Spice. This is the net list of
our inverter test circuit, in which the inverter is instantiated using a sub-circuit definition.
Then insert the power estimation command in the simulation settings and save it.

17. To see the power analysis output, open the *.out file of the corresponding net list.
18. Go to the end of the out file to see the power analysis result.

Fig. 5.15: Waveforms of the inputs and outputs of the circuit designed.
Power results:
Average power consumed: 5.943817e-005 watts
Max power: 3.357000e-003 watts at time 4.2e-008
Min power: 9.752878e-011 watts at time 2e-008

CHAPTER 6

CONCLUSION AND FUTURE SCOPE

6.1 CONCLUSION
Look-ahead clock gating has been shown to be very useful in reducing the clock
switching power. Similar to data-driven gating, it is capable of stopping the majority of
redundant clock pulses. It has, however, the big advantage of avoiding the tight timing
constraints of AGFF and data-driven gating, by allotting a full clock cycle for the enabling
signals to be computed and to propagate to their gates. Furthermore, unlike data-driven gating,
whose optimization requires knowledge of the FFs' data toggling vectors, LACG is independent of
those vectors and of the target application. The power consumed with LACG is 50% lower than
with the auto-gated FF, which in turn consumes 50% less power than the data-driven method.

6.2 FUTURE SCOPE


Clock gating is used to reduce dynamic power consumption. In future, static power
reduction techniques can be integrated with it in order to reduce both kinds of power loss in
CMOS circuits.

REFERENCES

[1] V. G. Oklobdzija, Digital System Clocking: High-Performance and Low-Power Aspects.
New York, NY, USA: Wiley, 2003.

[2] P. Sambasivarao and V. Venkata Rao, "A novel look ahead clock gating for power saving," in
International Conference on Advances in Signal Processing and Communications
2K15 (NECICASPC-2K15), 5th Dec. 2016.

[3] M. S. Hosny and W. Yuejian, "Low power clocking strategies in deep submicron
technologies," in Proc. IEEE Int. Conf. Integr. Circuit Design

[4] C. Chunhong, K. Changjun, and S. Majid, "Activity-sensitive clock tree construction for low
power," in Proc. ISLPED, 2002, pp. 279-282.

[5] A. Farrahi, C. Chen, A. Srivastava, G. Tellez, and M. Sarrafzadeh, "Activity-driven clock
design," IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 20, no. 6, pp. 705-714,
Jun. 2001.

[6] W. Shen, Y. Cai, X. Hong, and J. Hu, "Activity and register placement aware gated clock
network design," in Proc. ISPD, 2008, pp. 182-189.

[7] Synopsys Design Compiler, Version E-2010.12-SP2.

[8] S. Wimer and I. Koren, "The optimal fan-out of clock network for power minimization by
adaptive gating," IEEE Trans. VLSI Syst., vol. 20, no. 10, pp. 1772-1780, Oct. 2012.

[9] M. Donno, E. Macii, and L. Mazzoni, "Power-aware clock tree planning," in Proc. ISPD,
2004, pp. 138-147.

[10] S. Wimer and I. Koren, "Design flow for flip-flop grouping in data-driven clock gating,"
IEEE Trans. VLSI Syst., to be published.

[11] Shmuel Wimer and Arye Albahari, "A look-ahead clock gating based on auto-gated
flip-flops," IEEE Transactions on Circuits and Systems, vol. 61, no. 5, May 2014.

A NOVEL LOOK-AHEAD CLOCK GATING FOR POWER SAVING
P. Samba Siva Rao¹, V. Venkata Rao², B. V. Rama Mohana Rao³
¹PG Student, Department of ECE, ²Professor & HOD, Department of ECE, ³Principal
Narasaraopeta Engineering College, Narasaraopet, Guntur (Dt), Andhra Pradesh
¹pssr453@gmail.com, ²vvenkatarao2k9tec@gmail.com

ABSTRACT

Clock gating is one of the power saving techniques. It is a popular technique used in many
synchronous circuits for reducing dynamic power dissipation and is extremely helpful for
decreasing the power wasted by digital circuits. This paper proposes a new technique of
look-ahead clock gating. It avoids the drawbacks of the previously existing methods. The
present systems for clock gating are synthesis-based clock gating, information driven clock
gating and clock gating on auto-gated flip-flops; however, each of these techniques has some
disadvantages. This project addresses the drawbacks of the existing systems with Look-Ahead
Clock Gating (LACG), which combines all three. LACG computes the clock enabling signals of
every FF one cycle ahead of time, based on the current cycle data of those FFs on which it
depends. It avoids the tight timing constraints of AGFF and data-driven gating by allotting a
full clock cycle for the computation of the enabling signals and their propagation. The
simulation will be done in Tanner EDA 13.0v T-Spice at 0.18um.

I. INTRODUCTION

One of the major dynamic power consumers in computing and consumer electronics products is
the system's clock signal, typically responsible for 30% to 70% of the total dynamic
(switching) power consumption [1]. Several techniques to reduce the dynamic power have been
developed, of which clock gating is predominant. Ordinarily, when a logic unit is clocked, its
underlying sequential elements receive the clock signal regardless of whether or not their data
will toggle in the next cycle. With

shot to extend clock gating opportunities. Gated clock is a common method for reducing power
dissipation in synchronous digital systems. Using this method, the clock is not applied to the
flip-flop when the circuit is idle. We call the above methods data-driven. Synthesis-based
clock gating is the most widely used method by EDA tools [7]. The utilization of the clock
pulses, measured by the data-to-clock toggling ratio, left after the employment of
synthesis-based gating may still be very low. Fig. 1 depicts the average data-to-clock toggling
ratio [8], obtained by extensive power simulations of 61 blocks comprising 200k FFs, taken from
a 32 nm high-end 64-bit chip [9] [10]. Those are mostly control blocks of the data path,
register file and memory management units of the processor. The technology parameters used
throughout the paper are of a 22 nm low-leakage process technology.

II. EXISTING DATA DRIVEN CLOCK GATING

To address the above redundancy, a method called data-driven clock gating was proposed for
flip-flops (FFs). There, the clock signal driving a flip-flop is gated when the FF's state is
not subject to change in the next clock cycle. In an attempt to reduce the overhead of the
gating logic, several flip-flops are driven by the same clock signal, generated by ORing
the enabling signals of the individual flipflops . Data-
clock gating, the clock signals area unit AND based
driven gating affected from a very short time-
with expressly predefined signals. Clock gating is
window. The cumulative delay of the XOR, OR, latch
used the least bit levels: system design, block style,
and the AND gate must not increased the setup time
logic style and gates[2],[3]. manyways to require
of the Flipflop.
advantage of this system area unit represented[4]-[6],
with all of them counting on varied heuristics in a

49
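As a rough behavioural illustration of the data-driven scheme described in Section II, the
following Python sketch models a group of flip-flops that share one gated clock. The function
name `simulate_data_driven_gating` and the stimulus format are assumptions made for this
example and are not taken from the paper. Gate delays, the enable latch and the setup-time
constraint are not modelled; the sketch only counts how many clock pulses reach the FFs.

```python
from typing import List, Tuple


def simulate_data_driven_gating(d_inputs: List[List[int]]) -> Tuple[int, int]:
    """Count clock pulses reaching a group of FFs without and with gating.

    d_inputs[t][i] is the D input of flip-flop i in cycle t (hypothetical format).
    Returns (ungated_pulses, gated_pulses), counted per flip-flop.
    """
    if not d_inputs:
        return 0, 0
    n = len(d_inputs[0])
    q = [0] * n  # assumed reset state: all FFs hold 0
    ungated = gated = 0
    for d in d_inputs:
        # Per-FF enable: XOR of the present output Q and the incoming data D.
        enables = [di ^ qi for di, qi in zip(d, q)]
        # One clock gater per group: OR of the individual enables.
        group_enable = any(enables)
        ungated += n  # without gating, every FF receives every pulse
        if group_enable:
            # Gated clock = clk AND group_enable: the whole group is clocked.
            gated += n
            q = list(d)
        # Otherwise the pulse is suppressed; Q already equals D for every FF.
    return ungated, gated


# Two FFs over six cycles; only cycles 1 and 4 change any state.
stimulus = [[0, 0], [1, 0], [1, 0], [1, 0], [1, 1], [1, 1]]
print(simulate_data_driven_gating(stimulus))  # (12, 4)
```

In this toy stimulus only two of the six cycles change any FF state, so the shared gater
passes 4 of the 12 per-FF clock pulses. Grouping more FFs under one gater reduces the gating
overhead but lowers the chance that the whole group is idle in the same cycle.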
FIG1: DATA DRIVEN CLOCK GATING

Clock enabling signals are very well understood at the system level and thus can effectively
be defined to capture the periods where functional blocks and modules do not need to be
clocked. Those are later automatically synthesized into clock enabling signals at the gate
level. In many cases, clock enabling signals are manually added for every FF as part of a
design methodology. Still, when modules are clocked at a high level or at gate level, the
state transitions of their underlying FFs depend on the data being processed. It is important
to note that the entire dynamic power consumed by a system stems from the periods where the
modules' clock signals are enabled. Fig. 1 shows the FFs' toggling activity in an arithmetic
block designed in 22-nm technology, taken from a DSP core for transmission and wireless
baseband applications. The statistics are obtained from extensive simulations of typical
modes of operation, consisting of 240-K clock cycles. The time window in which the FFs' clock
signal is enabled is given as 100%, and it is responsible for the entire dynamic power
consumed by that block. The clock enabling signals are obtained by RTL synthesis and manual
insertions. As Fig. 1 shows, an FF toggled its state only 2.9% of the clock-enabled time
period, on average, so over 97% of the clock pulses driving FFs are useless.

III. DRAWBACK OF DATA DRIVEN METHOD

Data-driven gating suffers from a very short time window in which the gating circuitry can
work properly. The cumulative delay of the XOR, OR, latch and AND gate must not exceed the
setup time of the FF. Such constraints may exclude 5%-10% of the FFs from being gated due to
their presence on timing-critical paths. The exclusion percentage increases with the number
of critical paths, a situation that arises when transistors on non-critical paths are
downsized or switched to high threshold voltage (HVT) for further power savings.

IV. AUTOGATED FLIPFLOP

Flip-flops change their content only at the rising or falling edge of the enable signal.
After the rising or falling edge of the enable signal, the flip-flop's content remains
constant even if the input changes. In a conventional D flip-flop, the clock signal always
flows into the flip-flop irrespective of whether the input changes or not. Part of the clock
energy is consumed by the internal clock buffer to control the transmission gates
unnecessarily.

FIG2: AUTOGATED FLIPFLOP

Hence, if the input of the flip-flop is identical to its output, the switching of the clock
can be suppressed to conserve power. The auto-gated flip-flop design is illustrated in
Fig. 2. This block consists of a master and slave latch combination forming the flip-flop,
together with the gating logic. The falling edge of the FF's clock pulse gives the input
signal time to settle ahead of the triggering edge. The XOR gate indicates whether the state
of the slave latch has to change, that is, whether its clock should be enabled. Like
data-driven clock gating, this latch and flip-flop arrangement is subject to timing
constraints. The level of the clock signal enables the pulses at the triggering edges of the
input, so the timing of the gating is critical for enabling the master-slave flip-flop.

V. DRAWBACK OF AUTOGATED FLIP FLOPS

There are two major drawbacks. Firstly, only the slave latches are gated, leaving half of the
clock load ungated. Secondly, serious timing constraints are imposed on those FFs residing on
critical paths, which prevents their gating.

VI. LOOK AHEAD CLOCK GATING
Look-Ahead Clock Gating computes the clock enabling signals of each FF one cycle ahead of
time, based on the present-cycle data of those FFs on which it depends. Similarly to
data-driven gating, it is capable of stopping the majority of redundant clock pulses. It
has, however, the big advantage of avoiding the tight timing constraints of AGFF and
data-driven gating, by allotting a full clock cycle for the enabling signals to be computed
and propagated to their gate. Furthermore, unlike data-driven gating, whose optimization
requires knowledge of the FFs' data toggling vectors, LACG is independent of those. AGFF can
also be used for general logic, but with two major drawbacks. Firstly, only the slave latches
are gated, leaving half of the clock load ungated. Secondly, serious timing constraints are
imposed on those FFs residing on critical paths, which prevents their gating.

FIG3: LOOK AHEAD CLOCK GATING

LACG takes AGFF a leap forward, addressing three goals: stopping the clock pulse also in the
master latch, making it applicable for large and general designs, and avoiding the tight
timing constraints. LACG is based on using the XOR output in Fig. 3 to generate clock
enabling signals of other FFs in the system, whose data depend on that FF.

Fig4: ENHANCED AGFF WITH XOR OUTPUT USED FOR LACG

ADVANTAGE OF LACG

Look-ahead clock gating has been shown to be very useful in reducing the clock switching
power. Similar to data-driven gating, it is capable of stopping the majority of redundant
clock pulses. It has, however, the big advantage of avoiding the tight timing constraints of
AGFF and data-driven gating, by allotting a full clock cycle for the enabling signals to be
computed and propagated to their gate.

RESULTS

Fig5: Output of Data Driven Clock Gating

Fig6: Output of Look Ahead Clock Gating

Fig7: Shift register design by using LACG

Simulation Results:

Circuit                              Power Consumption
Data Driven Clock Gating             1.931668e-008 watts
Look Ahead Clock Gating              1.322850e-008 watts
Shift register design using LACG     3.018285e-008 watts

CONCLUSION

Look-ahead clock gating has been shown to be very useful in reducing the clock switching
power. Similar to data-driven gating, it is capable of stopping the majority of redundant
clock pulses. It has, however, the big advantage of avoiding the tight timing constraints of
AGFF and data-driven gating, by allotting a full clock cycle for the enabling signals to be
computed and propagated to their gate. Furthermore, unlike data-driven gating, whose
optimization requires knowledge of the FFs' data toggling vectors, LACG is independent of
those, and it is also independent of the target application. The power in LACG is reduced to
50% of that of the auto-gated FF, which in turn consumes 50% less power than the data-driven
method.

REFERENCES

[1] V. G. Oklobdzija, Digital System Clocking: High-Performance and Low-Power Aspects. New
York, NY, USA: Wiley, 2003.

[2] L. Benini, A. Bogliolo, and G. De Micheli, "A survey on design techniques for
system-level dynamic power management," IEEE Trans. VLSI Syst., vol. 8, no. 3, pp. 299–316,
Jun. 2000.

[3] M. S. Hosny and W. Yuejian, "Low power clocking strategies in deep submicron
technologies," in Proc. IEEE Int. Conf. Integr. Circuit Design

[4] C. Chunhong, K. Changjun, and S. Majid, "Activity-sensitive clock tree construction for
low power," in Proc. ISLPED, 2002, pp. 279–282.

[5] A. Farrahi, C. Chen, A. Srivastava, G. Tellez, and M. Sarrafzadeh, "Activity-driven
clock design," IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 20, no. 6,
pp. 705–714, Jun. 2001.

[6] W. Shen, Y. Cai, X. Hong, and J. Hu, "Activity and register placement aware gated clock
network design," in Proc. ISPD, 2008, pp. 182–189.

[7] Synopsys Design Compiler, Version E-2010.12-SP2.

[8] S. Wimer and I. Koren, "The optimal fan-out of clock network for power minimization by
adaptive gating," IEEE Trans. VLSI Syst., vol. 20, no. 10, pp. 1772–1780, Oct. 2012.

[9] M. Donno, E. Macii, and L. Mazzoni, "Power-aware clock tree planning," in Proc. ISPD,
2004, pp. 138–147.

[10] S. Wimer and I. Koren, "Design flow for flip-flop grouping in data-driven clock
gating," IEEE Trans. VLSI Syst., to be published.

[11] Shmuel Wimer and Arye Albahari, "A look-ahead clock gating based on auto-gated
flip-flops," IEEE Transactions on Circuits and Systems, vol. 61, no. 5, May 2014.

AUTHORS

Mr. P. Samba Siva Rao completed his B.Tech degree in Electronics and Communications
Engineering at PPDV Engineering College, Vijayawada, Krishna (Dt), Andhra Pradesh, India in
2013 and is pursuing an M.Tech in Digital Electronics and Communication Systems at
Narasaraopeta Engineering College, Narasaraopet, Guntur (Dt), Andhra Pradesh, India. His
research area is VLSI (Communication Systems). P. Samba Siva Rao may be reached at
pssr453@gmail.com.

Dr. V. Venkata Rao has his M.E in Microwave and Radar Engineering from Osmania University,
Hyderabad. He obtained his Ph.D. from JNTU, Hyderabad. Presently he is working as Professor
and Head of the Department of Electronics and Communication Engineering at Narasaraopeta
Engineering College, Narasaraopet, Guntur District, Andhra Pradesh, India. He has over 20
years of experience in R&D, industry and teaching. He has published and presented 50 research
papers in International/National Journals and Conferences. He is a life member of ISTE. His
research areas include Global Positioning System, Image Processing and Embedded Systems.
Presently he is guiding one research scholar.

Dr. B. V. Rama Mohana Rao, the Principal of the college, leads by example with impressive
qualifications, which include an M.Tech and a Ph.D from I.I.T. Kharagpur. He has to his
credit more than 20 years of teaching experience and around 15 years of experience in
academic administration. His areas of interest are VLSI and signal and image processing. He
has published papers in 50 journals and conferences. He is a life member of ISTE.
