
CHAPTER 5

DELAY LINE-BASED PLL ARCHITECTURES

5.1 INTRODUCTION

Clock skew is one of the bottlenecks in realizing high-speed,
high-performance digital systems. It results from differences in propagation
delay between signal paths, which usually depend on Process, Voltage,
Temperature and Loading (PVTL) effects. The common approach to clock
processing, such as de-skewing and frequency multiplication, is based on
Phase Locked Loops (PLLs) or Delay Locked Loops (DLLs). In recent years,
interest in All-Digital DLLs (ADDLLs) has increased. A delay locked loop is a
feedback control system that equalizes the phase of two delayed copies of the
same clock signal. The DLL is useful for compensating the clock distribution
delays that arise in many system configurations (Maymandi-Nejad and Sachdev
2003). The motivation for designing an all-digital DLL is to ensure that the
clock signal (and hence the input vectors) received from off-chip via the pad
circuits is synchronized with the distributed and buffered clock signal at
the flip-flops (and hence the synchronous datapath signals) within the core
of a CMOS Integrated Circuit (IC).

As shown in Figure 5.1, a DLL can be used to synchronize the phase of the
clock that is distributed by the clock tree to the clock inputs of all
flip-flops within the core of the IC (i.e. the buffered leaf node outputs of
the clock tree) with the phase of the clock signal (CLK_IN) just inside the
clock input pad. The clock signal from the clock input pad is usually the
most convenient clock for latching data signals just inside the data input
pads. Without active synchronization, the phases of the two clocks will
likely differ because of the uncertain and IC-specific propagation delay
through the clock tree buffer network. The role of the DLL is to adjust the
output tap of a programmable digital delay line at the root of the clock tree
so that the phase at the leaf nodes of the clock tree matches the phase of
CLK_IN just inside the clock pad receiver circuit (Hsiang-Hui Chang et al
2003). A DLL is also useful in other common situations in digital system
design.

For example, a DLL can be used to ensure that the clocks that are
distributed by multiple, separately balanced clock trees within a custom IC or
a Field-Programmable Gate Array (FPGA) are synchronized at all the flip-
flop inputs.

[Block diagram: CLK pin, DLL, CLK buffers and circuit, with the feedback
clock CLK_FB returned to the DLL]

Figure 5.1 Delay locked loop location in an integrated circuit



In DLL operation, the search for the optimal clock delay resembles the
operation of an Analog-to-Digital Converter (ADC). A DLL essentially converts
a phase difference into digital form, whereas an ADC converts a voltage into
digital form. Thus, some algorithms used in ADCs can also be used in DLL
design.

The ADDLL in this work uses the proposed Modified Variable Successive
Approximation Register (MVSAR) control algorithm to achieve fast locking and
closed-loop operation, and to perform a binary search without the
harmonic-locking issue. The digital delay line is implemented as a cascade of
256 inverter pairs. This work concentrates mainly on reducing the time to
lock of the ADDLL.

5.2 WHY AN ALL-DIGITAL DLL

Earlier, analog implementations provided lower jitter. However, analog
circuits are more difficult to scale, while digital circuits are easily
scalable. In most digital implementations the jitter is proportional to the
gate delay, so the jitter scales down in advanced technologies. As a result,
jitter in digital implementations catches up with and even surpasses that of
analog implementations. Analog circuits consume DC current even in standby,
whereas in an ADDLL it is possible to stop all toggling. In addition, no
capacitors are used in ADDLLs. Typical ADDLLs require a few thousand gates,
and the area scales with each technology generation.

Analog circuits are limited by the minimum operating voltage, since some
circuits require a minimum biasing voltage for proper operation. In
low-voltage CMOS chips, the analog DLL may require a supply voltage higher
than that of the rest of the chip. In an ADDLL, where CMOS gates are used,
the voltage can be scaled to very low levels, the same levels as the other
CMOS circuits. Digital implementations are also much easier to port to other
technologies. They are also portable to structured arrays and gate arrays
merely by routing the existing gates in the arrays.

In an analog DLL, the delay line is controlled by an analog control signal
held on a loop filter. Since the delay line is controlled continuously, the
phase error between the input and output clocks tends to be smaller than in
digital DLLs. Besides, analog DLLs tend to have smaller chip area and lower
power dissipation. However, because of their analog nature and the smaller
noise margin of analog design, analog DLLs are susceptible to process
variations and supply noise.

Digital DLLs, in turn, have their own particular strengths. They are well
known for their shorter locking time and wide-range operation. They can be
roughly divided into three kinds: counter-controlled schemes, the flash
Time-to-Digital Converter (TDC) architecture and the binary search algorithm
(also called the successive approximation register-controlled scheme). The
first and the third concepts are very similar. The difference is that the
first uses more clock cycles to lock, while the third has lower locking
resolution because of its open-loop characteristics. The second concept is a
direct transform method; it can theoretically achieve locking within only one
clock cycle, but requires much more hardware. Finally, there are also hybrid
DLL designs that combine digital and analog techniques. These designs tend to
have larger area than comparable digital DLLs. There is always a compromise
between complexity and locking performance in DLL design.

5.3 CONTROLLING ALGORITHMS FOR ADDLL

The controlling algorithms for the All-Digital Delay Locked Loop (ADDLL) to
be discussed include:

1. Basic counter type algorithm

2. Variable Successive Approximation Register (VSAR) algorithm

3. Modified Variable Successive Approximation Register (MVSAR) algorithm

5.3.1 Basic counter type algorithm

The basic type DLL uses a counter type control algorithm with feedback to
achieve the desired phase relationship between an input clock (CLKIN) and a
feedback clock (CLKFB). As shown in Figure 5.2, the Phase Detector (PD)
produces two error signals (D and U) which indicate the relative timing order
(early or late) of the feedback clock (CLKFB) with respect to the input clock
(CLKIN).

[Block diagram: the PD receives CLKFB and produces U and D for the Up/Down
counter, which controls the DLL delay line between CLK IN and CLK OUT]

Figure 5.2 ADDLL with basic counter type algorithm



When the U (or D) signal is activated, the counter increments (decrements) to
increase (decrease) the delay through the programmable delay line. With
proper DLL controller design, the counter value eventually finds the two best
tap settings and then alternates repeatedly between them. Because the counter
moves one tap at a time and an N-bit delay line has 2^N taps, the time to
lock of the counter type DLL increases exponentially with the number of
control bits, which is a basic constraint in high-speed IC designs.
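
The growth of the lock time with the number of control bits can be seen in a
small behavioural model. The Python sketch below is only illustrative: the
function name, the linear delay model (code + 1) * tap_delay and the
arbitrary time units are assumptions made here, not part of the design
described in this chapter, and the alternation between two taps after lock is
not modelled.

```python
def counter_lock_cycles(t_clk, tap_delay, n_bits, start_code=0):
    """Count PD comparisons until an up/down counter first reaches lock.

    Hypothetical model: delay(code) = (code + 1) * tap_delay, the counter
    moves one code per comparison, and lock is declared as soon as the
    delay reaches or exceeds one clock period.
    """
    max_code = (1 << n_bits) - 1
    code = start_code
    cycles = 0
    while (code + 1) * tap_delay < t_clk:
        if code == max_code:
            raise ValueError("delay line too short for this clock period")
        code += 1      # U asserted: lengthen the delay by one tap
        cycles += 1
    return code, cycles


# Worst case for each width: the lock point sits at the last tap.
for n_bits in (4, 6, 8):
    t_clk = float(1 << n_bits)          # clock period in units of one tap delay
    code, cycles = counter_lock_cycles(t_clk, 1.0, n_bits)
    print(f"{n_bits}-bit delay line: {cycles} counter steps to lock")
```

Under these assumptions the worst case is 2^N - 1 counter steps (15, 63 and
255 for 4, 6 and 8 bits), so each additional control bit roughly doubles the
time to lock.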

5.3.2 Variable SAR algorithm

The aim of the variable successive approximation register controlled
algorithm is to reduce the time to lock and to perform a binary search
without the harmonic-locking issue. The delay produced by the delay line
should be such that CLKFB does not exceed the comparison point shown in
Figure 5.3.

[Timing diagram: Ckref with the comparison point marked, shown for the
correct locking case and for the false locking case]

Figure 5.3 Harmonic-locking issue



Otherwise the DLL will lock at harmonics of the clock period, i.e. false
locking. In the VSAR controller, the delay of the delay line increases
gradually from the minimum and never exceeds twice the input clock period,
so harmonic locking is no longer an issue.
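
The reason a phase detector cannot tell a harmonic lock from a correct one is
that it only sees the delay modulo one clock period. The short Python sketch
below illustrates this; the function name and the 100 MHz example clock are
hypothetical choices made for the illustration.

```python
def phase_error_ps(delay_ps, t_clk_ps):
    """Distance from the delayed edge to the nearest reference edge.

    The phase detector only observes the delay modulo one clock period,
    so whole-period multiples of the delay are indistinguishable.
    """
    remainder = delay_ps % t_clk_ps
    return min(remainder, t_clk_ps - remainder)


T_CLK_PS = 10_000  # hypothetical 100 MHz input clock, in picoseconds

for periods in (1, 2, 3):
    delay = periods * T_CLK_PS
    print(f"delay = {periods} x T_CLK -> phase error {phase_error_ps(delay, T_CLK_PS)} ps")
```

A delay of one, two or three clock periods gives the same zero phase error,
which is why the search range itself has to be limited, as the VSAR
controller does.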

The flowchart of the VSAR algorithm is shown in Figure 5.4. The controller
consists of an M-bit VSAR unit and an (N-M)-bit conventional SAR unit, where
N is the total number of control bits for the digital-controlled delay line
(DCDL) and 1 ≤ M ≤ N.

[Flowchart: initialisation K = 1; conventional (N-M+K)-bit SAR algorithm; if
locked, closed-loop operation; if not locked and K = M, failure; otherwise
K = K+1 and the search is repeated]

Figure 5.4 Flowchart of VSAR algorithm

Initially, the conventional (N-M)-bit SAR units borrow one bit, the LSB of
the VSAR units, as their MSB to perform an (N-M+1)-bit binary search. After
the binary search is done, a judgment circuit examines the lock state. If the
unlock state is detected, the conventional SAR units borrow one more LSB from
the VSAR units and try to lock again, i.e. an (N-M+2)-bit binary search. Each
borrowed bit is equivalent to doubling the length of the delay line. The
operation repeats until the DLL is locked correctly or the number of borrowed
bits reaches M. Once the lock state is confirmed, the VSAR controller is
transformed into a counter for closed-loop operation.

A simplified timing diagram for N=5 and M=3 is illustrated in Figure 5.5 to
explain the operation. When the START signal is asserted, the VSAR units lend
one LSB to the 1-bit conventional SAR unit to perform a 2-bit binary search.
If the DLL fails to lock, the SAR circuit is reset for the next binary
search, the number of bits lent by the VSAR units increases by 1 and a 3-bit
binary search starts. The entire operation stops once a delay corresponding
to one clock period is achieved.
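
The widening search can be sketched as a behavioural model. The Python code
below is a simplified illustration, not the controller implemented in this
work: it follows the (N-M+K)-bit search widths of the flowchart in Figure 5.4,
and it assumes an ideal comparison of the line delay against one clock
period, a linear delay model code * tap_delay, a lock criterion of "residual
error below one tap", and a hypothetical 100 ps tap delay. All function names
are invented for the sketch.

```python
def sar_search(width, t_clk, tap_delay):
    """Binary search for the largest code whose delay does not exceed t_clk."""
    code = 0
    for bit in reversed(range(width)):
        trial = code | (1 << bit)
        if trial * tap_delay <= t_clk:   # CLKFB not yet past the comparison point
            code = trial                 # keep the bit
    return code


def vsar_lock(t_clk, tap_delay, n=8, m=4):
    """Run (N-M+K)-bit searches for K = 1..M until one of them locks.

    Returns (code, k, comparisons) on lock, or None if K reaches M
    without a lock. One comparison is counted per searched bit.
    """
    comparisons = 0
    for k in range(1, m + 1):
        width = n - m + k                # bits borrowed so far widen the search
        code = sar_search(width, t_clk, tap_delay)
        comparisons += width
        if t_clk - code * tap_delay < tap_delay:   # judgment circuit: locked
            return code, k, comparisons
    return None


# Hypothetical numbers: 100 ps per tap, 100 MHz input clock (10 000 ps).
print(vsar_lock(t_clk=10_000, tap_delay=100))   # (100, 3, 18)
```

In this example the 5-bit and 6-bit searches saturate below one clock period
and fail, and the 7-bit search locks at code 100, for a total of
5 + 6 + 7 = 18 comparisons. Because the range only doubles after a failed
search, the line delay at lock stays below two clock periods.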

Figure 5.5 Timing diagram for the VSAR algorithm (N = 5 and M = 3)

5.3.3 Modified variable SAR algorithm

The lock time of the DLL can be reduced further by modifying the VSAR
controller with one more control loop, as shown in Figure 5.6. Here U and D
are the phase detector outputs.

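[Flowchart: as in Figure 5.4 (initialisation K = 1, conventional (N-M+K)-bit
SAR algorithm, lock test, K = M test, closed-loop operation or failure), with
an additional loop that tests "U = 0 && D = 1?" and increments K (K = K+1)
while K has not reached M]

Figure 5.6 Flowchart of the proposed MVSAR algorithm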

Figure 5.7 Timing diagram for the MVSAR algorithm (N = 5 and M = 3)

To explain the operation, a simplified timing diagram for N=5 and M=3 is
illustrated in Figure 5.7. When the START signal is asserted, the MVSAR units
lend one LSB and the controller then checks whether U=0 and D=1, i.e. whether
CLKOUT lags CLKIN by a phase difference of 180° to 360°. Once the DLL finds
this condition it transforms into a VSAR; otherwise it lends one more LSB and
checks for the condition again, as shown in Figure 5.6. The entire operation
stops once a delay corresponding to one clock period is achieved.
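
The saving from the extra loop can be illustrated with a rough behavioural
sketch in Python. Several details are assumptions made only for this sketch
and are not stated in the text: the pre-check is taken to sample the phase
detector once per lent bit with only the MSB of the current search range set
(half of the range), the delay model is code * tap_delay, the lock criterion
is a residual error below one tap, and the tap delay of 100 ps is
hypothetical. All function names are invented here.

```python
def sar_search(width, t_clk, tap_delay):
    """Binary search for the largest code whose delay does not exceed t_clk."""
    code = 0
    for bit in reversed(range(width)):
        trial = code | (1 << bit)
        if trial * tap_delay <= t_clk:
            code = trial
    return code


def lag_is_180_to_360(delay, t_clk):
    """True for U = 0 and D = 1: CLKOUT lags CLKIN by 180 to 360 degrees."""
    return 2 * (delay % t_clk) > t_clk


def mvsar_lock(t_clk, tap_delay, n=8, m=4):
    """Pre-check loop followed by VSAR-style searches.

    Returns (code, k, comparisons) on lock, or None on failure.
    """
    comparisons = 0
    k = 1
    # Extra loop: one PD sample per lent bit instead of a full binary search.
    while k <= m:
        width = n - m + k
        half_range = 1 << (width - 1)    # ASSUMPTION: sample with only the MSB set
        comparisons += 1
        if lag_is_180_to_360(half_range * tap_delay, t_clk):
            break                        # range is long enough: hand over to VSAR
        k += 1
    else:
        return None                      # condition never met up to K = M
    # VSAR behaviour from the current K onwards.
    while k <= m:
        width = n - m + k
        code = sar_search(width, t_clk, tap_delay)
        comparisons += width
        if t_clk - code * tap_delay < tap_delay:
            return code, k, comparisons
        k += 1
    return None


# Hypothetical numbers: 100 ps per tap, 100 MHz input clock (10 000 ps).
print(mvsar_lock(t_clk=10_000, tap_delay=100))   # (100, 3, 10)
```

Under these assumptions the two ranges that are too short each cost a single
phase detector sample instead of a full binary search, so the same lock point
(code 100) is reached in 3 + 7 = 10 comparisons rather than the 5 + 6 + 7 = 18
that a search at every width would need.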

5.4 ARCHITECTURE OF AN ALL-DIGITAL DELAY LOCKED LOOP

The complete All-Digital Delay-Locked Loop (ADDLL) consists of four parts:

1. Digital Controlled Delay Line (DCDL)

2. Phase Detector (PD)

3. VSAR controller

4. Divide-by-2 frequency divider

5.4.1 ADDLL using MVSAR algorithm

As shown in Figure 5.8, the ADDLL has an 8-bit DCDL and a VSAR controller
with 4-bit VSAR units and 4-bit SAR units. Here, CLKIN is the input clock,
CLKOUT is the output clock de-skewed by the ADDLL, CLKSAR is the clock of the
VSAR controller, taken from the output of the divide-by-2 frequency divider,
and START is used to start the controller.
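
[Block diagram: CLKIN feeds the Digital Controlled Delay Line (DCDL), which
produces CLKOUT; the phase detector output D goes to the MVSAR controller,
made up of a 4-bit VSAR unit (Tap [7:4]) and a 4-bit SAR unit (Tap [3:0]);
the controller has a START input, is clocked by CLKSAR from the /2 divider
and sets Tap [7:0] of the DCDL]

Figure 5.8 Block diagram of ADDLL using modified variable SAR algorithm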

5.4.2 Digital Controlled Delay Line (DCDL)

The maximum programmable delay line length of 256 taps was selected to allow
one period of the clock to fit into the delay line even if the tap delay were
only 50% of what one would predict from nominal values. Both the size of the
tap increment and the effective length of the delay line were made
programmable to provide more robust fall-back modes for DLL operation.
Different maximum delay line lengths were provided to guard against the
possibility that clock pulses might turn into runt pulses or disappear
entirely during propagation along the delay line (Maymandi-Nejad and Sachdev
2003).

[Block diagram: CLK_IN drives a delay chain with taps 0, 1, ..., 255; a
256 : 1 MUX controlled by Tap_SEL [7:0] selects one tap as CLK_OUT]

Figure 5.9 A 256-tap maximum Digital Controlled Delay Line (DCDL)

The operating frequency of the ADDLL depends on the inverter delays used in
the DCDL, so with this 256-tap programmable delay line the operating range of
the input clock is defined as follows, where $T_{CLK}$ is the input clock
period and $T_{INV}$ is the delay of one inverter:

Maximum operating clock period: $T_{CLK} < 512 \times T_{INV}$

Minimum operating clock period: $T_{CLK} > (30 \text{ to } 50) \times T_{INV}$
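
As a worked illustration of these two bounds, the Python sketch below
converts an inverter delay into clock period and frequency limits. The
function names are invented for the sketch, the factor for the lower bound is
a parameter because the text gives a range of 30 to 50, and the 100 ps
inverter delay is a hypothetical value, not a figure from this design.

```python
def clock_period_bounds_ps(t_inv_ps, taps=256, min_factor=50):
    """Return (t_min_ps, t_max_ps) for the input clock period.

    t_max: 256 taps of one inverter pair each, i.e. 512 inverter delays.
    t_min: min_factor inverter delays (the text quotes 30 to 50).
    """
    t_max_ps = 2 * taps * t_inv_ps
    t_min_ps = min_factor * t_inv_ps
    return t_min_ps, t_max_ps


def frequency_bounds_mhz(t_inv_ps, taps=256, min_factor=50):
    """Convert the period bounds to (f_min_mhz, f_max_mhz)."""
    t_min_ps, t_max_ps = clock_period_bounds_ps(t_inv_ps, taps, min_factor)
    return 1e6 / t_max_ps, 1e6 / t_min_ps   # 1 / ps expressed in MHz


# Hypothetical inverter delay of 100 ps.
t_min, t_max = clock_period_bounds_ps(100)
f_min, f_max = frequency_bounds_mhz(100)
print(f"period between {t_min} ps and {t_max} ps")
print(f"frequency between {f_min:.2f} MHz and {f_max:.2f} MHz")
```

With the assumed 100 ps inverter delay the bounds evaluate to clock periods
between 5 000 ps and 51 200 ps, i.e. roughly 19.5 MHz to 200 MHz.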

As shown in Figure 5.9, the individual taps in the delay line were
balanced with dummy NAND gate loads to minimize clock pulse erosion due
to different rise and fall times.

5.4.3 Phase detector using D-flip flops

A conventional phase detector can be a simple D flip-flop with a binary
output. With such a detector the control code changes back and forth between
two or three different values even when the DLL is locked. To avoid this
phenomenon, the phase detector is realized by two D flip-flops as shown in
Figure 5.10, in which CLKIN and CLKOUT are sampled by each other. After the
VSAR controller is transformed into a counter, the control code changes only
when exactly one of the two signals is logic one. The counter stops if the
signals are identical (Hsiang-Hui Chang and Shen-Iuan Liu 2005). A sample
output waveform of the PD is illustrated in Figure 5.11.
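
The mutual sampling can be illustrated with an idealised Python model. The
wiring assumed here (U is CLKIN sampled at the rising edge of CLKOUT, D is
CLKOUT sampled at the rising edge of CLKIN), the 50% duty cycle and the
function names are assumptions for the sketch; flip-flop setup and hold
behaviour near coincident edges is not modelled.

```python
def clk_level(t_ps, t_clk_ps):
    """Ideal 50% duty-cycle clock: high during the first half of each period."""
    return 1 if 2 * (t_ps % t_clk_ps) < t_clk_ps else 0


def phase_detector(delay_ps, t_clk_ps):
    """Return (U, D) when CLKOUT is CLKIN delayed by delay_ps.

    ASSUMED wiring: U = CLKIN sampled on a rising edge of CLKOUT,
                    D = CLKOUT sampled on a rising edge of CLKIN.
    """
    u = clk_level(delay_ps, t_clk_ps)    # CLKIN level at a CLKOUT edge (t = delay)
    d = clk_level(-delay_ps, t_clk_ps)   # CLKOUT level at t = 0 is CLKIN at -delay
    return u, d


T_CLK_PS = 10_000  # hypothetical 100 MHz clock

for lag_deg in (90, 270):
    delay = T_CLK_PS * lag_deg // 360
    print(f"lag {lag_deg} deg -> (U, D) = {phase_detector(delay, T_CLK_PS)}")
```

In this model a lag of 90° gives (U, D) = (1, 0) and a lag of 270° gives
(0, 1), which is consistent with reading U = 0 and D = 1 as CLKOUT lagging
CLKIN by 180° to 360°.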

[Schematic: two D flip-flops fed by CLKIN and CLKOUT, with outputs U and D]

Figure 5.10 Phase Detector (PD) using D flip-flops

Figure 5.11 Phase Detector (PD) output for a sample CLK IN and CLK OUT

5.4.4 MVSAR controller

The proposed modified variable SAR controller is realized by 4-bit MVSAR
units and 4-bit conventional SAR units as shown in Figure 5.12. At first, the
MVSAR starts with a 4-bit SAR search. If it fails to satisfy the required
condition, the controller starts a 5-bit SAR search. This process is repeated
until the ADDLL locks or the maximum of 8 available SAR bits is reached.

[Register layout: MVSAR units, M bits [7:4]; SAR units, (N-M) bits [3:0]]

Figure 5.12 MVSAR controller (8-bit) with 4-bit MVSAR units and
4-bit SAR units

After the first locking of the ADDLL, the MVSAR controller transforms into a
counter to track changes in CLKIN due to Process, Voltage, Temperature and
Loading (PVTL) variations.

5.4.5 Divide-by-2 frequency divider

The VSAR algorithm seems to have a longer lock time than the
conventional one. However, this may not be true. The key is the Division
Ratio (DR) which is the frequency ratio between the MVSAR controller clock
(CLKSAR) and the input clock (CLKIN).

The DR for a conventional SAR controller is expressed as

$DR > \lfloor T_{LOOP}/T_{CLK} \rfloor + 1$

where $\lfloor T_{LOOP}/T_{CLK} \rfloor$ denotes the Gaussian (floor)
operation, $T_{CLK}$ is the input clock period and $T_{LOOP}$ is the loop
delay, which includes the propagation delays of the delay line, the PD, the
MVSAR controller and so on. The DR of the MVSAR controller is set to the
minimum of 2. In the MVSAR controller, the delay of the delay line increases
gradually from the minimum and never exceeds twice the input clock period.
Thus, harmonic locking is no longer an issue.
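
As a small numerical illustration, the Python sketch below evaluates the
right-hand side of the DR expression for a few loop delays. The function name
and the example delays are hypothetical, and the sketch only computes the
bound itself; it does not model the controller.

```python
def division_ratio_bound(t_loop_ps, t_clk_ps):
    """Right-hand side of the DR expression: floor(T_LOOP / T_CLK) + 1."""
    return t_loop_ps // t_clk_ps + 1


T_CLK_PS = 10_000  # hypothetical 100 MHz input clock

for t_loop_ps in (4_000, 15_000, 35_000):
    bound = division_ratio_bound(t_loop_ps, T_CLK_PS)
    print(f"T_LOOP = {t_loop_ps} ps -> floor(T_LOOP/T_CLK) + 1 = {bound}")
```

The bound grows by one for every additional clock period of loop delay, so a
loop delay that stays below two clock periods keeps the bound at 2 or less.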

5.5 SIMULATION RESULTS


5.5.1 Basic counter type ADDLL

The CADENCE RTL schematic of the counter type DLL is shown in Figure 5.13 and
the counter controller in Figure 5.14. The counter controller increments or
decrements the tap [7:0] bits to select the inverter chain in the DCDL
according to the output of the phase detector.

Figure 5.13 RTL schematic for basic counter type ADDLL

Figure 5.14 RTL schematic for counter controller in basic counter type
ADDLL

5.5.2 VSAR type ADDLL

The CADENCE RTL schematic of the VSAR type DLL is shown in Figure 5.15 and
the VSAR controller in Figure 5.16. Up to the first lock, the VSAR controller
uses successive approximation to select the inverter chain in the DCDL
according to the output of the phase detector. After the first lock, it
transforms into a counter controller that increments or decrements the tap
[7:0] bits according to the clock skew caused by PVTL variations.

Figure 5.15 RTL schematic for VSAR type ADDLL

Figure 5.16 RTL schematic for VSAR controller in VSAR type ADDLL

5.5.3 MVSAR type ADDLL

The RTL schematics of the phase detector with two D flip-flops and of the
MVSAR controller are shown in Figures 5.17 and 5.18 respectively.

Figure 5.17 RTL schematic for phase detector in MVSAR type ADDLL

Figure 5.18 RTL schematic for MVSAR controller in MVSAR type ADDLL

Figures 5.19 and 5.20 display the RTL schematics of the 32-inverter chain, a
sub-design of the Digitally Controlled Delay-Line (DCDL), and of the complete
DCDL respectively.

Figure 5.19 RTL schematic for 32-inverter chain (sub design of DCDL)
in MVSAR type ADDLL

Figure 5.20 RTL schematic for DCDL in MVSAR type ADDLL

The divide-by-2 counter, implemented using a D flip-flop and some gates, is
shown in Figure 5.21.

Figure 5.21 RTL schematic for Divide by 2 counter in MVSAR type ADDLL

The CADENCE RTL schematic of the MVSAR type DLL is shown in Figure 5.22, and
the output waveforms obtained at frequencies of 14 MHz and 150 MHz are shown
in Figures 5.23 and 5.24 respectively.

Figure 5.22 RTL schematic for MVSAR type ADDLL

Figure 5.23 Timing diagram for MVSAR type ADDLL for fCLKIN =14
MHz

Figure 5.24 Timing diagram for MVSAR type ADDLL for fCLKIN =150
MHz

Figure 5.25 CADENCE SoC Encounter physical view (die photograph) of MVSAR
type ADDLL

The physical design of the MVSAR type ADDLL was done using CADENCE SoC
Encounter, and the physical view (die photograph) is shown in Figure 5.25.

5.6 CONCLUSION

All components of the proposed ADDLL are designed in Verilog HDL and
synthesized using the CADENCE RTL Compiler. With the available TSMC 0.18 µm
digital CMOS technology library, the ADDLL operates over a frequency range
(lock range) of 14 MHz to 170 MHz.

Over the frequency range of 14 MHz to 170 MHz, the MVSAR type ADDLL needs at
most 28 clock cycles for the first lock (at 14 MHz), compared with 62 for the
VSAR type ADDLL and 239 for the basic counter type ADDLL (Figure 5.26 and
Table 5.1). The proposed MVSAR algorithm also avoids the harmonic-locking
issue and reduces the lock time over this wide operating range.

[Plot: number of clock cycles needed for first lock (0 to 300) versus CLKIN
(0 to 180 MHz), with curves for the MVSAR, VSAR and counter algorithms]

Figure 5.26 Comparison of lock time required by ADDLL using different
algorithms

Table 5.1 compares the values of various parameters obtained using the
different algorithms for the ADDLL. The designed macro area (physical design)
is 142 × 142 Sq.µm (0.020164 Sq.mm).

Table 5.1 Comparison of different algorithms for ADDLL

| Parameter | MVSAR type ADDLL | VSAR type ADDLL | Counter type ADDLL |
|---|---|---|---|
| Digital Technology | TSMC 0.18 µm | TSMC 0.18 µm | TSMC 0.18 µm |
| Lock range (operating frequency range) in MHz | 14 to 170 | 14 to 170 | 14 to 170 |
| Max. clock cycles needed for first lock | 28 at 14 MHz | 62 at 14 MHz | 239 at 14 MHz |
| Total No. of cells | 1750 | 1760 | 1585 |
| Cell Area (Sq.µm) | 19000 | 18967 | 15448 |
| Leakage Power (nW) | 165.070 | 163.819 | 91.690 |
| Internal Power (nW) | 12241992.777 | 12996150.015 | 12278730.920 |
| Net Power (nW) | 13029264.161 | 13649589.053 | 13324282.102 |
| Switching Power (nW) | 25271256.938 | 26645739.068 | 25603013.022 |
| Max. jitter | <251.8 ps | <251.8 ps | <251.8 ps |
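
The figures in Table 5.1 can be cross-checked with a few lines of arithmetic.
The Python sketch below copies the table values and shows that, in each
column, the Switching Power entry equals Internal Power plus Net Power, that
these totals correspond to the power figures in mW quoted in Table 5.2, and
what the worst-case lock times are in microseconds. The variable and function
names are chosen for this sketch only.

```python
# Power figures from Table 5.1, in nW.
POWER_NW = {
    "MVSAR":   {"internal": 12241992.777, "net": 13029264.161, "switching": 25271256.938},
    "VSAR":    {"internal": 12996150.015, "net": 13649589.053, "switching": 26645739.068},
    "Counter": {"internal": 12278730.920, "net": 13324282.102, "switching": 25603013.022},
}

# Worst-case cycles for first lock, all reported at 14 MHz.
MAX_LOCK_CYCLES = {"MVSAR": 28, "VSAR": 62, "Counter": 239}


def internal_plus_net_mw(entry):
    """Sum of the internal and net power entries, converted from nW to mW."""
    return (entry["internal"] + entry["net"]) / 1e6


def lock_time_us(cycles, f_mhz):
    """Lock time in microseconds for a number of cycles at f_mhz."""
    return cycles / f_mhz


for name, entry in POWER_NW.items():
    total = internal_plus_net_mw(entry)
    lock = lock_time_us(MAX_LOCK_CYCLES[name], 14)
    print(f"{name}: internal + net = {total:.4f} mW, "
          f"switching = {entry['switching'] / 1e6:.4f} mW, "
          f"worst-case lock = {lock:.2f} us")

# Macro area: 142 um x 142 um expressed in square millimetres.
print(f"macro area = {142 * 142 / 1e6} Sq.mm")
```

At 14 MHz the worst-case first lock of 28 cycles corresponds to 2 µs for the
MVSAR type, against about 4.4 µs (62 cycles) for the VSAR type and about
17.1 µs (239 cycles) for the counter type.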

The ADDLLs developed using the different algorithms are compared with
state-of-the-art works in Table 5.2. The number of clock cycles required for
the first lock is much reduced when the MVSAR algorithm is used. Also, if the
ADDLL with the MVSAR algorithm is implemented using both analog and digital
components, the design may yield low jitter and reduced lock time.

Table 5.2 Comparison of proposed ADDLL with state-of-the-art works

| Parameters | MVSAR type ADDLL (proposed) | VSAR type ADDLL (modeled) | Counter type ADDLL (modeled) | Yang and Liu (2007) | Cockburn and Keith Boyle (2006) | Bum-sik Kim and Lee Sup Kim (1998) | Hsiang-Hui Chang and Shen-Iuan Liu (2005) |
|---|---|---|---|---|---|---|---|
| Technology | TSMC 0.18 µm 1P6M | TSMC 0.18 µm 1P6M | TSMC 0.18 µm 1P6M | 0.18 µm 1P6M | 0.18 µm 1P6M | 0.35 µm 1P4M | 0.18 µm 1P6M |
| Category | Digital | Digital | Digital | Digital | Digital | Digital | Digital |
| Lock range (operating frequency range) MHz | 14 ~ 170 | 14 ~ 170 | 14 ~ 170 | 40 ~ 550 | 14 ~ 166 | 100 | 2 ~ 700 |
| Max. clock cycles needed for first lock | 28 @ 14 MHz, 16 @ 170 MHz | 62 @ 14 MHz, 26 @ 170 MHz | 239 @ 14 MHz, 15 @ 170 MHz | 134 ~ 14 cycles | 122 ~ 244 cycles | <2 µs, i.e. 200 cycles | 32 cycles |
| Power (mW) | 25.27 | 26.64 | 25.60 | 12.6 @ 550 MHz | X | 3.2 @ 100 MHz | 23 @ 700 MHz |
| Peak-Peak jitter | <250 ps | <250 ps | <250 ps | 32.9 ps @ 40 MHz, 16.9 ps @ 200 MHz, 12 ps @ 550 MHz | X | <200 ps | 17 ps @ 700 MHz |
| Total No. of cells | 1750 | 1760 | 1585 | X | X | X | X |
| Cell Area (Sq.µm) | 19000 | 18967 | 15448 | | | | |
| Active Area (Sq.mm) | 0.0202 | 0.0210 | 0.0190 | 0.2 | 0.028 | 0.1 | 0.88 |
| Supply | 1.8 V | 1.8 V | 1.8 V | 1.8 V | 1.2 ~ 2.0 V | 2.0 V | 1.4 ~ 2.5 V |

X: Not mentioned

In the Category row, three of the seven entries read "Digital & analog" in
the source table; the columns they belong to are not recoverable from the
layout.
