Preface xi
Designing Bang-Bang PLLs for Clock and Data Recovery in Serial Data Transmission Systems 34
R. C. Walker
K. S. Kundert
Part II Devices
Physics-Based Closed-Form Inductance Expression for Compact Modeling of Integrated Spiral Inductors 73
S. Jenei, B. K. J. C. Nauwelaers, and S. Decoutere (IEEE Journal of Solid-State Circuits, January 2002)
The Modeling, Characterization, and Design of Monolithic Inductors for Silicon RF IC's 77
J. R. Long and M. A. Copeland (IEEE Journal of Solid-State Circuits, March 1997)
Analysis, Design, and Optimization of Spiral Inductors and Transformers for Si RF IC's 89
A. M. Niknejad and R. G. Meyer (IEEE Journal of Solid-State Circuits, October 1998)
Estimation Methods for Quality Factors of Inductors Fabricated in Silicon Integrated Circuit
Process Technologies 110
K. O (IEEE Journal of Solid-State Circuits, August 1998)
On-Chip Spiral Inductors with Patterned Ground Shields for Si-Based RF IC's 118
C. Patrick Yue and S. S. Wong (IEEE Journal of Solid-State Circuits, May 1998)
The Effects of a Ground Shield on the Characteristics and Performance of Spiral Inductors 127
S.-M. Yim, T. Chen, and K. O (IEEE Journal of Solid-State Circuits, February 2002)
Design of High-Q Varactors for Low-Power Wireless Applications Using a Standard CMOS Process 148
A.-S. Porret, T. Melly, C. C. Enz, and E. A. Vittoz (IEEE Journal of Solid-State Circuits, March 2000)
The Effect of Varactor Nonlinearity on the Phase Noise of Completely Integrated VCOs 214
J. W. M. Rogers, J. A. Macedo, and C. Plett (IEEE Journal of Solid-State Circuits, September 2000)
Measurements and Analysis of PLL Jitter Caused by Digital Switching Noise 253
P. Larsson (IEEE Journal of Solid-State Circuits, July 2001)
On-Chip Measurement of the Jitter Transfer Function of Charge-Pump Phase-Locked Loops 260
A Low-Noise, Low-Power VCO with Automatic Amplitude Control for Wireless Applications 271
M. A. Margarit, J. L. Tham, R. G. Meyer, and M. J. Deen (IEEE Journal of Solid-State Circuits, June 1999)
A Fully Integrated VCO at 2 GHz 282
M. Zannoth, B. Kolb, J. Fenk, and R. Weigel (IEEE Journal of Solid-State Circuits, December 1998)
Tail Current Noise Suppression in RF CMOS VCOs 287
P. Andreani and H. Sjöland (IEEE Journal of Solid-State Circuits, March 2002)
Low-Power Low-Phase-Noise Differentially Tuned Quadrature VCO Design in Standard CMOS 294
M. Tiebout (IEEE Journal of Solid-State Circuits, July 2001)
A Low-Phase-Noise 5GHz Quadrature CMOS VCO Using Common-Mode Inductive Coupling 310
S. L. J. Gierkink, S. Levantino, R. C. Frye, and V. Boccuzzi (European Solid-State Circuits Conference,
September 2002)
An Integrated 10/5GHz Injection-Locked Quadrature LC VCO in a 0.18-μm Digital CMOS Process 314
A. Ravi, K. Soumyanath, L. R. Carley, and R. Bishop (European Solid-State Circuits Conference, September 2002)
35-GHz Static and 48-GHz Dynamic Frequency Divider IC's Using 0.2-μm AlGaAs/GaAs-HEMT's 330
Z. Lao, W. Bronner, A. Thiede, M. Schlechtweg, A. Hulsmann, M. Rieger-Motzer, G. Kaufel, B. Raynor, and M. Sedler
(IEEE Journal of Solid-State Circuits, October 1997)
A Family of Low-Power Truly Modular Programmable Dividers in Standard 0.35-μm CMOS Technology 346
C. S. Vaucher, I. Ferencic, M. Locher, S. Sedvallson, U. Voegeli, and Z. Wang (IEEE Journal of Solid-State Circuits,
July 2000)
A 1.2 GHz CMOS Dual-Modulus Prescaler Using New Dynamic D-Type Flip-Flops 361
B. Chang, J. Park, and W. Kim (IEEE Journal of Solid-State Circuits, May 1996)
High-Speed Architecture for a Programmable Frequency Divider and a Dual-Modulus Prescaler 365
P. Larsson (IEEE Journal of Solid-State Circuits, May 1996)
A 1.6-GHz Dual Modulus Prescaler Using the Extended True-Single-Phase-Clock CMOS Circuit
Technique (E-TSPC) 370
J. N. Soares, Jr. and W. A. M. Van Noije (IEEE Journal of Solid-State Circuits, January 1999)
A 320 MHz, 1.5 mW @ 1.35 V CMOS PLL for Microprocessor Clock Generation 383
V. von Kaenel, D. Aebischer, C. Piguet, and E. Dijkstra (IEEE Journal of Solid-State Circuits, Nov. 1996)
A Low Jitter 0.3-165 MHz CMOS PLL Frequency Synthesizer for 3 V/5 V Operation 391
H. C. Yang, L. K. Lee, and R. S. Co (IEEE Journal of Solid-State Circuits, April 1997)
Low-Jitter Process-Independent DLL and PLL Based on Self-Biased Techniques 396
J. G. Maneatis (IEEE Journal of Solid-State Circuits, Nov. 1996)
A Low-Jitter PLL Clock Generator for Microprocessors with Lock Range of 340-612 MHz 406
D. W. Boerstler (IEEE Journal of Solid-State Circuits, April 1999)
A 960-Mb/s/pin Interface for Skew-Tolerant Bus Using Low Jitter PLL 413
S. Kim, K. Lee, Y. Moon, D.-K. Jeong, Y. Choi, and H. K. Lim (IEEE Journal of Solid-State Circuits, May 1997)
An All-Analog Multiphase Delay-Locked Loop Using a Replica Delay Line for Wide-Range Operation
and Low-Jitter Performance 456
Y. Moon, J. Choi, K. Lee, D.-K. Jeong, and M.-K. Kim (IEEE Journal of Solid-State Circuits, March 2000)
A Wide-Range Delay-Locked Loop with a Fixed Latency of One Clock Cycle 474
H.-H. Chang, J.-W. Lin, C.-Y. Yang, and S.-I. Liu (IEEE Journal of Solid-State Circuits, August 2002)
CMOS DLL-Based 2-V 3.2-ps Jitter 1-GHz Clock Synthesizer and Temperature-Compensated
Tunable Oscillator 493
C. J. Foley and M. P. Flynn (IEEE Journal of Solid-State Circuits, March 2001)
A 1.5 V 86 mW/ch 8-Channel 622-3125-Mb/s/ch CMOS SerDes Macrocell with Selectable Mux/Demux Ratio 499
F. Yang, J. O'Neill, P. Larsson, D. Inglis, and J. Othmer (Dig. International Solid-State Circuits Conference, Feb. 2002)
A Low-Jitter Wide-Range Skew-Calibrated Dual-Loop DLL Using Antifuse Circuitry for High-Speed DRAM 506
S. J. Kim, S. H. Hong, J.-K. Wee, J. H. Cho, P. S. Lee, J. H. Ahn, and J. Y. Chung (IEEE Journal of Solid-State Circuits,
June 2002)
Part VI RF Synthesis
An Adaptive PLL Tuning System Architecture Combining High Spectral Purity and Fast Settling Time 517
C. S. Vaucher (IEEE Journal of Solid-State Circuits, April 2000)
A 2-V 900-MHz Monolithic CMOS Dual-Loop Frequency Synthesizer for GSM Receivers 530
W. S. T. Yan and H. C. Luong (IEEE Journal of Solid-State Circuits, Feb. 2001)
A CMOS Frequency Synthesizer with an Injection-Locked Frequency Divider for a 5-GHz Wireless
LAN Receiver 543
H. R. Rategh, H. Samavati, and T. H. Lee (IEEE Journal of Solid-State Circuits, May 2000)
A Modeling Approach for Σ-Δ Fractional-N Frequency Synthesizers Allowing Straightforward Noise Analysis 578
M. H. Perrott, M. D. Trott, and C. G. Sodini (IEEE Journal of Solid-State Circuits, Aug. 2002)
A Fully Integrated CMOS Frequency Synthesizer with Charge-Averaging Charge Pump and Dual-Path
Loop Filter for PCS- and Cellular-CDMA Wireless Systems 589
Y. Koo, H. Huh, Y. Cho, J. Lee, J. Park, K. Lee, D.-K. Jeong, and W. Kim (IEEE Journal of Solid-State Circuits,
May 2002)
A 1.1-GHz CMOS Fractional-N Frequency Synthesizer With a 3-b Third-Order Σ-Δ Modulator 596
W. Rhee, B.-S. Song, and A. Ali (IEEE Journal of Solid-State Circuits, Oct. 2000)
A 27-mW CMOS Fractional-N Synthesizer Using Digital Compensation for 2.5-Mb/s GFSK Modulation 610
M. H. Perrott, T. L. Tewksbury III, and C. G. Sodini (IEEE Journal of Solid-State Circuits, Dec. 1997)
A 2.5-Gb/s Clock and Data Recovery IC with Tunable Jitter Characteristics for Use in LAN's and WAN's 635
K. Kishine, N. Ishihara, K. Takiguchi, and H. Ichino (IEEE Journal of Solid-State Circuits, June 1999)
Clock/Data Recovery PLL Using Half-Frequency Clock 643
M. Rau, T. Oberst, R. Lares, A. Rothermel, R. Schweer, and N. Menoux (IEEE Journal of Solid-State Circuits,
July 1997)
A 0.5-μm CMOS 4.0-Gbit/s Serial Link Transceiver with Data Recovery Using Oversampling 647
C.-K. K. Yang, R. Farjad-Rad, and M. A. Horowitz (IEEE Journal of Solid-State Circuits, May 1998)
A 2-1600-MHz CMOS Clock Recovery PLL with Low-Vdd Capability 656
P. Larsson (IEEE Journal of Solid-State Circuits, Dec. 1999)
SiGe Clock and Data Recovery IC with Linear-Type PLL for 10-Gb/s SONET Application 666
Y. M. Greshishchev and P. Schvan (IEEE Journal of Solid-State Circuits, Sept. 2000)
A 10-Gb/s CMOS Clock and Data Recovery Circuit with a Half-Rate Linear Phase Detector 681
J. Savoj and B. Razavi (IEEE Journal of Solid-State Circuits, May 2001)
A 10-Gb/s CMOS Clock and Data Recovery Circuit with Frequency Detection 688
J. Savoj and B. Razavi (Dig. International Solid-State Circuits Conference, Feb. 2001)
A 40-Gb/s Integrated Clock and Data Recovery Circuit in a 50-GHz fT Silicon Bipolar Technology 694
M. Wurzer, J. Bock, H. Knapp, W. Zirwas, E Schumann, and A. Felder (IEEE Journal of Solid-State Circuits,
Sept. 1999)
A Fully Integrated 40-Gb/s Clock and Data Recovery IC With 1:4 DEMUX in SiGe Technology 699
M. Reinhold, C. Dorschky, E. Rose, R. Pullela, P. Mayer, E Kunz, Y. Baeyens, T. Link, and J.-P. Mattia
(IEEE Journal of Solid-State Circuits, Dec. 2001)
Index 713
Devices and Circuits for Phase-Locked Systems
Behzad Razavi
Abstract: This tutorial deals with the design of devices such as varactors and inductors and circuits such as ring and LC oscillators. First, MOS varactors are introduced as a means of frequency control for low-voltage circuits and their modeling issues are discussed. Next, spiral inductors are studied and various geometries targeting improved Q or higher self-resonance frequencies are presented. Noise-tolerant ring oscillator topologies are then described. Finally, a procedure for the design of LC oscillators is outlined.

The design of phase-locked systems requires a thorough understanding of devices, circuits, and architectures. Intended as a continuation of [1], this tutorial provides an overview of concepts in device and circuit design for phase-locking in digital, broadband, and RF systems.

I. PASSIVE DEVICES

The demand for low-noise PLLs has encouraged extensive research on active and passive devices. In this section, we study varactors and inductors as essential components of LC oscillators.

A. Varactors

As supply voltages scale down, pn junctions become a less attractive choice for varactors. Specifically, two factors limit the dynamic range of pn-junction capacitances: (1) the weak dependence of the capacitance upon the reverse bias voltage, e.g., $C_j = C_{j0}/(1 + V_R/\phi_B)^m$, where $m \approx 0.3$; and (2) the narrow control voltage range if forward-biasing the varactor must be avoided.

As an example, consider the LC oscillator shown in Fig. 1. It is desirable to maximize the voltage swings at nodes X and Y so as to both minimize the relative phase noise and ease the design of the stage(s) driven by the VCO. On the other hand, to avoid forward-biasing the varactors significantly, VX and VY must remain above approximately Vcont - 0.4 V. Thus, the peak-to-peak swing at each node is limited to about 0.8 V. Note that the cathode terminals of the varactors also introduce substantial n-well capacitance at X and Y, further constraining the tuning range.

Fig. 1. LC oscillator using pn-junction varactors.

In contrast to pn junctions, MOS varactors are immune to forward biasing while exhibiting a sharper C-V characteristic and a wider dynamic range. If configured as a capacitor [Fig. 2(a)], a MOSFET suffers from both a nonmonotonic C-V behavior and a high channel resistance in the region between accumulation and strong inversion. To avoid these issues, an "accumulation-mode" MOS varactor is formed by placing an NMOS device inside an n-well [Fig. 2(b)]. Providing an ohmic connection between the source and drain for all gate voltages, the n-well experiences depletion of mobile charges under the oxide as the gate voltage becomes more negative. Thus, the varactor capacitance, Cvar (equal to the series combination of the oxide capacitance and the depletion region capacitance), varies as shown in Fig. 2(b). Note that for a sufficiently positive gate voltage, Cvar approaches the oxide capacitance.

Fig. 2. (a) Simple MOSFET operating as capacitor, (b) MOS varactor.

The design of MOS varactors must deal with two important issues: (1) the trade-off between the dynamic range and the channel resistance, and (2) proper modeling for circuit simulations. We now study each issue.
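The weak voltage dependence of the pn-junction capacitance and the series-combination behavior of the MOS varactor described above can be checked numerically. The following is a minimal sketch, not part of the original tutorial; the built-in potential of 0.7 V, the 0 to 1 V bias range, and the function names are assumptions chosen only for illustration.

```python
def pn_junction_capacitance(c_j0, v_r, phi_b=0.7, m=0.3):
    """C_j = C_j0 / (1 + V_R/phi_B)^m for a reverse bias v_r >= 0.

    phi_b = 0.7 V is an assumed built-in potential; m = 0.3 follows the text.
    """
    return c_j0 / (1.0 + v_r / phi_b) ** m


def series_capacitance(c_a, c_b):
    """Series combination of two capacitances (oxide and depletion region)."""
    return c_a * c_b / (c_a + c_b)


# Ratio of maximum to minimum junction capacitance over an assumed
# 0 V to 1 V reverse-bias range (normalized C_j0 = 1).
pn_ratio = pn_junction_capacitance(1.0, 0.0) / pn_junction_capacitance(1.0, 1.0)
print(f"pn-junction Cmax/Cmin over 0..1 V: {pn_ratio:.2f}")

# As the depletion capacitance grows very large (depletion region vanishing),
# the series combination approaches the oxide capacitance.
print(series_capacitance(1.0, 1e9))
```

With these assumed values the junction capacitance changes by only about 30% across a full volt of reverse bias, which illustrates why the text calls the dependence weak.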
Dynamic Range. Deep-submicron MOSFETs exhibit substantial overlap capacitance between the gate and source/drain terminals. For example, in a typical 0.13-μm technology, a transistor having minimum channel length, Lmin, displays an overlap capacitance of 0.4 fF/μm and a gate-channel capacitance of 12 fF/μm². In other words, for an effective channel length of 0.12 μm and a given width, the overlap capacitance between the gate and source/drain terminals of a varactor constitutes 2 × 0.4 fF/(0.12 × 12 fF + 2 × 0.4 fF) ≈ 36% of the total capacitance. Thus, even if the gate-channel component varies by a factor of two across the allowable voltage range, the overall dynamic range of the capacitance is given by (0.12 × 12 fF + 2 × 0.4 fF)/(0.12 × 6 fF + 2 × 0.4 fF) = 1.47.

In order to widen the varactor dynamic range, the transistor length can be increased, thereby raising the voltage-dependent component while maintaining the overlap capacitance relatively constant. This remedy, however, leads to a greater resistance between the source and drain, lowering the Q. The resistance reaches a maximum for the most negative gate-source voltage, at which the depletion region's width is maximum and the path through the n-well the longest (Fig. 3).¹ Note that the total equivalent resistance that appears in series with the varactor is equal to 1/12 of the drain-source resistance. This is because shorting the drain and source lowers the resistance by a factor of 4 and the distributed nature of the capacitance and resistance reduces it by another factor of 3 [2]. Depending on both the phase noise requirements and the Q limitations imposed by inductors, the varactor length is typically chosen between Lmin and 3Lmin.

Fig. 3. Effect of n-well resistance in MOS varactor.

Modeling. The C-V characteristics of MOS varactors can be approximated by a hyperbolic tangent function with reasonable accuracy. Using the characteristic shown in Fig. 4 and noting that tanh(±∞) = ±1, we can write

$$C_{var}(V_{GS}) = \frac{C_{max} - C_{min}}{2}\tanh\left(a + \frac{V_{GS}}{V_0}\right) + \frac{C_{max} + C_{min}}{2}. \tag{1}$$

Here, a and V0 allow fitting for the intercept and the slope, respectively, and Cmin includes the overlap capacitance.

Fig. 4. Typical MOS varactor characteristic.

The above model yields different characteristics in different circuit simulation programs! Simulation tools that analyze circuits in terms of voltages and currents (e.g., SPICE) interpret the nonlinear capacitance equation correctly. On the other hand, programs that represent the behavior of capacitors by charge equations (e.g., Cadence's Spectre) require that the model be transformed to a Q-V relationship [3]:

$$Q_{var} = \int C_{var}\, dV_{GS} \tag{2}$$

$$= \frac{C_{max} - C_{min}}{2} V_0 \ln\cosh\left(a + \frac{V_{GS}}{V_0}\right) + \frac{C_{max} + C_{min}}{2} V_{GS}, \tag{3}$$

which is then used to compute

$$I_{var} = \frac{dQ_{var}}{dt}. \tag{4}$$

If used in charge-based analyses, Eq. (1) typically overestimates the tuning range of oscillators.

B. Inductors

The design of monolithic inductors has been studied extensively. The parameters of interest include the inductance, the Q, the parasitic capacitance (i.e., the self-resonance frequency, fSR), and the area, all of which trade with each other to some extent. For a spiral structure such as that in Fig. 5, the line width, the line spacing, the number of turns, and the outer dimension are under the designer's control, chosen so as to obtain the required performance.

Fig. 5. Spiral inductor.

Quality Factor. The quality factor of monolithic inductors has been the subject of many studies. Before considering the phenomena that limit the Q, it is important to select a useful and clear definition for this quantity. For a simple inductor operating at low frequencies, the Q is defined as

¹ Fortunately, the capacitance reaches a minimum at this point, and the Q degrades only gradually.
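The overlap-capacitance arithmetic in the Dynamic Range discussion and the consistency between the tanh C-V model of Eq. (1) and the charge expression of Eq. (3) can both be verified with a few lines of code. This is an illustrative sketch only: the function names and the fitting values used in the example (Cmin = 1, Cmax = 2, a = 0.1, V0 = 0.5 V) are assumptions, and the constant of integration in Eq. (3) is taken as zero.

```python
import math

C_OV = 0.4    # overlap capacitance per side, fF per um of width (from the text)
C_GC = 12.0   # gate-channel capacitance, fF/um^2 (from the text)


def overlap_fraction(l_eff=0.12):
    """Share of total capacitance due to the two overlap regions, per um of width."""
    return 2 * C_OV / (l_eff * C_GC + 2 * C_OV)


def dynamic_range(l_eff=0.12, channel_ratio=2.0):
    """Cmax/Cmin when only the gate-channel part varies by channel_ratio."""
    c_max = l_eff * C_GC + 2 * C_OV
    c_min = l_eff * C_GC / channel_ratio + 2 * C_OV
    return c_max / c_min


def c_var(v_gs, c_min, c_max, a, v0):
    """Eq. (1): tanh approximation of the varactor C-V characteristic."""
    return 0.5 * (c_max - c_min) * math.tanh(a + v_gs / v0) + 0.5 * (c_max + c_min)


def q_var(v_gs, c_min, c_max, a, v0):
    """Eq. (3): charge obtained by integrating Eq. (1) over V_GS."""
    return (0.5 * (c_max - c_min) * v0 * math.log(math.cosh(a + v_gs / v0))
            + 0.5 * (c_max + c_min) * v_gs)


print(f"overlap share: {overlap_fraction():.0%}, dynamic range: {dynamic_range():.2f}")

# Numerical check that dQ/dV from Eq. (3) reproduces C(V) from Eq. (1).
params = dict(c_min=1.0, c_max=2.0, a=0.1, v0=0.5)   # assumed fitting values
h = 1e-5
for v in (-1.0, 0.0, 0.5):
    dq_dv = (q_var(v + h, **params) - q_var(v - h, **params)) / (2 * h)
    print(v, dq_dv, c_var(v, **params))
```

Calling `dynamic_range(l_eff=0.36)` shows how a length of 3Lmin widens the capacitance range, at the cost of the higher channel resistance noted above.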
Fig. 9. (a) On-wafer measurement of inductor using coaxial probe, (b) calibration structure.

"signal" (S) pad and the other to the "ground" (G) pads. The signal pad is sensed by the center conductor and the ground pads by the outer shield of the coaxial probe.

Since the capacitance of the pads and the wires connecting to the spiral is typically significant, the test device is accompanied by a calibration structure [Fig. 9(b)], where the spiral itself is omitted. The scattering (S) parameters of both structures are measured by means of a network analyzer across the band of interest and subsequently converted to Y parameters. Subtraction of the Y parameters of the calibration geometry from those of the device under test yields the actual characteristics of the spiral.

An alternative method of measuring the Q of inductors is illustrated in Fig. 10. Here, inductors are incorporated in an oscillator and the tail current can be controlled externally. In the laboratory measurement, the output is monitored on a spectrum analyzer while ISS is reduced so as to place the circuit at the edge of oscillation. Next, the value of ISS thus obtained is used in the simulation of the oscillator and the equivalent parallel resistance of each tank, RP, is lowered

Fig. 11. (a) Asymmetric and (b) symmetric inductors.

a moderate Q, about 5 to 6 at 5 GHz, and its interwinding capacitance does not limit the self-resonance frequency because adjacent turns sustain a small potential difference. The line spacing is therefore set to the minimum allowed by the technology.

The symmetric geometry of Fig. 11(b) provides a greater Q if stimulated differentially [4], about 7 to 10 at 5 GHz, but its interwinding capacitance is typically quite significant because of the large voltage difference between adjacent turns. For this reason, the line spacing is chosen to be twice or three times the minimum allowable value, lowering the fringe capacitance considerably but degrading the Q slightly.

In differential circuits, the use of symmetric inductors appears to save area as well. For example, two asymmetric 1-nH inductors can be replaced by a symmetric 2-nH structure, which occupies less area. However, a cascade of differential stages employing multiple symmetric inductors [Fig. 12(a)] faces routing difficulties. As illustrated in Fig. 12(b), the signal lines must travel across the spirals, impacting the

cantly. However, the capacitance between the spirals may limit the self-resonance frequency. For the two-layer structure of Fig. 13(a), the overall equivalent capacitance is given by [5]

$$C_{eq} = \frac{4C_1 + C_2}{12}. \tag{8}$$

Thus, if the bottom layer is moved down [Fig. 13(b)], then Ceq falls considerably. For example, in a typical 0.13-μm CMOS technology having eight metal layers, the geometry of Fig. 13(b) exhibits one-fifth as much capacitance as the structure in Fig. 13(a) does.

Stacked structures use lower metal layers, which typically suffer from a greater sheet resistance than the topmost layer. As explained below, the resistance can be reduced by placing spirals in parallel.

Figure 14 illustrates three other configurations aiming to improve the quality factor. In Fig. 14(a), multiple spirals are

Fig. 12. (a) Cascade of inductively-loaded differential pairs, (b) layout of first stage using a symmetric inductor, (c) layout of first stage using asymmetric inductors.
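The equivalent-capacitance expression of Eq. (8) for a two-layer stacked inductor can be explored with a short sketch. The definitions of C1 and C2 do not survive in this excerpt; the code assumes C1 is the capacitance between the two spirals and C2 the capacitance from the bottom spiral to the substrate, and the numerical values are hypothetical, picked only to show how a reduction to one-fifth can arise.

```python
def stacked_ceq(c1, c2):
    """Eq. (8): C_eq = (4*C1 + C2) / 12 for a two-layer stacked inductor.

    Assumption: c1 is the spiral-to-spiral capacitance and c2 the
    bottom-spiral-to-substrate capacitance (same units for both).
    """
    return (4.0 * c1 + c2) / 12.0


# Hypothetical values in fF: adjacent metal layers versus a bottom spiral
# moved several layers down (smaller C1, somewhat larger C2).
ceq_adjacent = stacked_ceq(c1=40.0, c2=10.0)
ceq_spaced = stacked_ceq(c1=5.0, c2=14.0)
print(ceq_adjacent, ceq_spaced, ceq_spaced / ceq_adjacent)
```

Because C1 carries a weight of four, lowering the inter-spiral capacitance dominates the result even when the substrate capacitance rises.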
Despite their relatively high noise and poor drive capability, ring oscillators are used in many high-speed applications. Several reasons justify this popularity: (1) in some cases, the oscillator must be tuned over a wide frequency range (e.g., one decade) because the system must support different data

³ In reality, the effective value of γ also depends on the drain-source voltage to some extent, further complicating the matter.

Fig. 16. (a) Constant-current ring oscillator, (b) transistor-level implementation of (a).

inverters in the ring are supplied by a current source, IDD,
rather than a voltage source, and frequency tuning is also accomplished through IDD. If IDD is designed for low sensitivity to VDD, then the oscillator remains relatively immune to supply noise, the principal advantage of this configuration over standard inverter-based rings that are directly connected to the supply voltage.

In practice, the nonidealities associated with IDD limit the supply rejection. Shown in Fig. 16(b) is a transistor implementation where M1 operates as a controlled current source. If I1 is constant, VX tracks VDD variations whereas VY does not, yielding a change in IDD through channel-length modulation in M1. Choosing long channels for M1 and M2 alleviates this issue while necessitating wide channels as well to allow a relatively small drain-source voltage for M1. However, the resulting high drain junction capacitance of M1 at Y creates a low-impedance path from VDD to this node at high frequencies. To suppress both resistive and capacitive feedthrough of VDD noise, a bypass capacitor, CB, is tied from Y to ground. However, the pole associated with this node now enters the VCO transfer function, complicating the design of the PLL.

Let us now study the response of the circuit of Figs. 15(a) and 16 to substrate noise, Vsub. In the former, Vsub manifests itself through two mechanisms (Fig. 17): (1) by modulating the drain junction capacitance of M1 and M2 and hence the delay of the stage (a static effect); and (2) by injecting a common-mode displacement current through CP (a dynamic effect). If injected slightly before or after the zero crossings of the oscillation waveform, such a current gives rise to a differential component at the drains of M1 and M2 because these transistors display unequal transconductances as they depart from equilibrium.

Fig. 17. Effect of substrate noise on a differential stage.

In the circuit of Fig. 16(b), Vsub modulates both the drain junction capacitance of the NMOS devices and their threshold voltage (and hence the transition points of the waveform). Both effects are static, making the circuit susceptible even to low-frequency noise.

It is instructive to determine the minimum supply voltage for the above two circuits. At the midpoint of switching, where the input and output differential voltages are around zero, the stage of Fig. 15(a) requires that VDD > |VGSP| + VGSN + VISS, where VGSP and VGSN denote the gate-source voltages of M3-M4 and M1-M2, respectively, and VISS is the minimum voltage necessary for ISS. Interestingly, the circuit of Fig. 16(b) imposes the same minimum supply voltage.

Another critical issue in the circuits of Figs. 15(a) and 16 relates to frequency tuning by means of current sources. The voltage-to-current (V/I) conversion required here presents difficulties at low supply voltages. In the example of Fig. 16(b), as Vcont rises and VX falls, transistor M3 eventually enters the triode region, thus making I1 supply-dependent. The useful range of Vcont is therefore given by VTHN < Vcont < VDD - |VGSP| - VTHN, suggesting the use of a wide device for M2 to minimize |VGSP|.

III. LC OSCILLATORS

LC oscillators have found wide usage in high-speed and/or low-noise systems. Extensive research on inductors, varactors, and oscillator topologies has provided the grounds for systematic design, helping to demystify the "black magic."

LC oscillators offer a number of advantages over ring structures: (a) lower phase noise for a given frequency and power dissipation; (b) greater output voltage swings, with peak levels that can exceed the supply voltage; and (c) ability to operate at higher frequencies.

However, LC VCO design requires precise device and circuit modeling because (a) the narrow tuning range calls for accurate prediction of the center frequency; (b) the phase noise is greatly affected by the quality of inductors and varactors and the noise of transistors. Also, occupying a large area, spiral inductors pick up noise from the substrate and make it difficult to incorporate many such oscillators on one chip.

The design of LC VCOs targets the following parameters: center frequency, phase noise, tuning range, power dissipation, voltage headroom, startup condition, output voltage swing, and drive capability. The last two have often received less attention, but they directly determine the design difficulty and power consumption of the stages following the oscillator. That is, a buffer placed after the VCO may consume more power than the VCO itself!

A. Design Example

As an example of VCO design, let us consider the topology shown in Fig. 18. Here, M1 and M2 present a small-signal negative resistance of -2/gm1,2 between nodes X and Y, compensating for the resistive loss in the tanks and sustaining oscillation. Each tank is modeled by a parallel RLC network, with all loss mechanisms lumped in RP.⁴

Fig. 18. LC oscillator.

⁴ For a narrow frequency range, series resistances in the tank elements can be transformed to parallel components.
The design process begins with a power budget and hence a maximum value for ISS. This is justified by the following observation. Once completed and optimized for a given power budget, the design can readily be scaled for different power levels, bearing a linear trade-off with phase noise while maintaining all other parameters constant. For example, if ISS, the width of M1 and M2, and the total tank capacitance are doubled and the inductance value is halved, the phase noise power falls by a factor of two but the frequency of oscillation and the output voltage swings remain unchanged.⁵

Since subsequent stages typically require the VCO core to provide a minimum voltage swing, Vmin, we assume M1 and M2 steer nearly all of ISS to their corresponding tanks and write ISS RP = Vmin. Thus, the minimum inductance value is given by

$$L_P = \frac{R_P}{Q\omega} \tag{9}$$

$$= \frac{V_{min}}{I_{SS} Q \omega}, \tag{10}$$

where it is assumed the tank Q is limited by that of the inductor. Note that this calculation demands knowledge of the Q before the inductance is computed, a minor issue because for a given geometry and frequency of operation, the Q is relatively independent of the inductance.

We now determine the dimensions of M1 and M2. Increasing the channel length beyond the minimum value allowed by the technology does not significantly lower γ unless the length exceeds approximately 0.5 μm. For this reason, the minimum length is usually chosen to minimize the capacitance contributed by the transistors. The transistors must be wide enough to steer most of ISS while experiencing a voltage swing of Vmin at nodes X and Y. Viewing M1 and M2 as a differential pair, we have

$$V_{min} \approx \sqrt{\frac{2 I_{SS}}{\mu_n C_{ox} W/L}} \tag{11}$$

and hence

$$W \approx \frac{2 I_{SS} L}{\mu_n C_{ox} V_{min}^2}, \tag{12}$$

but for short-channel devices, W must be obtained by simulations using proper device models. This choice of W typically guarantees a small-signal loop gain greater than unity, enabling the circuit to start at power-up.

With LP computed from Eq. (10), the total capacitance at nodes X and Y is calculated as $C_{tot} = (L_P\omega^2)^{-1}$. This capacitance includes the following fixed components: (1) the parasitic capacitance of LP, CLP; (2) the drain junction, gate-source, and gate-drain capacitances of M1 and M2, CDB + CGS + 4CGD;⁶ and (3) the input capacitance of the next stage (typically a buffer), CL. Thus, the allowable varactor capacitance is given by the difference between Ctot and the sum of these components:

$$C_{var} = (L_P\omega^2)^{-1} - C_{LP} - C_{DB} - C_{GS} - 4C_{GD} - C_L. \tag{13}$$

This expression gives the center value of the tolerable varactor capacitance. Of course, a negative Cvar means the inductance is excessively large, calling for a lower LP, a smaller RP, and hence a larger ISS. However, to steer a greater tail current, the circuit must employ wider MOS transistors, thus incurring a larger capacitance at nodes X and Y and approaching diminishing returns. This ultimately limits the frequency of oscillation in a given technology.

For a given supply voltage and oscillator topology, the varactor capacitance exhibits a known dynamic range Cvar,min < Cvar < Cvar,max, yielding a tuning range of ωmin < ωosc < ωmax, where

$$\omega_{min} = \frac{1}{\sqrt{L_P (C_{var,max} + C_{fixed})}} \tag{14}$$

$$\omega_{max} = \frac{1}{\sqrt{L_P (C_{var,min} + C_{fixed})}}, \tag{15}$$

and Cfixed = CLP + CDB + CGS + 4CGD + CL.

Figure 19(a) depicts the oscillator with MOS varactors directly tied to X and Y. Since the output common-mode level is near VDD, M3 and M4 sustain only a positive gate-source voltage (if 0 < Vcont < VDD). As seen from the C-V characteristic of Fig. 2(b), this limitation reduces the dynamic range of the capacitance by about a factor of two. As a remedy, the varactors can be capacitively coupled to X and Y, allowing independent choice of dc levels. Illustrated in Fig. 19(b), such an arrangement defines the gate voltage of MV1 and MV2 by Vb ≈ VDD/2 through large resistors R1 and R2.

Fig. 19. LC oscillator with (a) direct coupling and (b) capacitive coupling of varactors to tanks.

The coupling capacitors, CC1 and CC2, must be chosen much greater than the maximum value of Cvar so as not to limit the tuning range. For example, if CC1 = CC2 = 5Cvar,max, then the equivalent series capacitance reaches only 5C²var,max/(6Cvar,max) = 0.83Cvar,max, suffering from a 17% reduction in dynamic range. On the other hand, large coupling capacitors display significant bottom-plate capacitance, thereby loading the oscillator and limiting the tuning range.⁷

⁵ We assume that, at a given frequency, the Q is relatively independent of the inductance value.

⁶ Since CGD experiences a total voltage swing of 2Vmin, its Miller effect translates to a factor of two for each transistor.
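The design steps above, from the power budget to the tuning range, can be strung together in a short calculation. The following sketch follows Eqs. (10) and (13) to (15) and the series-coupling example; it is not the author's design tool, and every numerical value in it (5 GHz, 0.5 V swing, 2 mA, Q of 8, 0.5 pF of fixed capacitance, a ±30% varactor range) is an assumption chosen for illustration.

```python
import math


def min_inductance(v_min, i_ss, q, omega):
    """Eq. (10): L_P = V_min / (I_SS * Q * omega)."""
    return v_min / (i_ss * q * omega)


def varactor_center(l_p, omega, c_fixed):
    """Eq. (13): C_var = 1/(L_P * omega^2) - C_fixed; negative means L_P is too large."""
    return 1.0 / (l_p * omega ** 2) - c_fixed


def tuning_range(l_p, c_var_min, c_var_max, c_fixed):
    """Eqs. (14) and (15): (omega_min, omega_max) in rad/s."""
    omega_min = 1.0 / math.sqrt(l_p * (c_var_max + c_fixed))
    omega_max = 1.0 / math.sqrt(l_p * (c_var_min + c_fixed))
    return omega_min, omega_max


def coupled_capacitance(c_var, c_c):
    """Varactor seen through a series coupling capacitor C_C."""
    return c_var * c_c / (c_var + c_c)


# Assumed design targets and parasitics (hypothetical values).
omega0 = 2 * math.pi * 5e9
l_p = min_inductance(v_min=0.5, i_ss=2e-3, q=8.0, omega=omega0)
c_fixed = 0.5e-12
c_center = varactor_center(l_p, omega0, c_fixed)
w_lo, w_hi = tuning_range(l_p, 0.7 * c_center, 1.3 * c_center, c_fixed)

print(f"L_P = {l_p * 1e9:.2f} nH, center C_var = {c_center * 1e15:.0f} fF")
print(f"tuning: {w_lo / (2 * math.pi) / 1e9:.2f} to {w_hi / (2 * math.pi) / 1e9:.2f} GHz")
print(f"C_C = 5*C_var,max leaves {coupled_capacitance(1.0, 5.0):.2f} of C_var,max")
```

The fixed capacitance dilutes the varactor's range, so the resulting frequency span is noticeably narrower than the ±30% capacitance variation alone would suggest.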
It is possible to realize CC1 and CC2 as "fringe" capacitors (Fig. 20) [7] to exploit the lateral field between adjacent metal lines. This structure exhibits a bottom-plate parasitic of a few percent, but its value must usually be calculated by means of field simulators.

Fig. 20. Fringe capacitor.

The tuning range of LC VCOs must be wide enough to encompass (a) process and temperature variations; (b) uncertainties due to model inaccuracies; and (c) the frequency band of interest. In wireless communications, the last component makes the design particularly difficult, especially if a single VCO must cover more than one band. For example, in the Global System for Mobile Communication (GSM) standard, the transmit and receive bands span 890-915 MHz and 935-960 MHz, respectively. For one VCO to operate from 890 MHz to 960 MHz, the tuning range must exceed 7.8%. With another 7 to 10% required for variations and model inaccuracies, the overall tuning range reaches 15 to 18%, a value difficult to achieve. In such cases, two or more oscillators may prove necessary, but at the cost of area and signal routing issues.

The phase noise of each oscillator topology must be quantified carefully. The reader is referred to the extensive literature on the subject.

B. Digital Tuning

Our study thus far implies that it is desirable to maximize the tuning range. However, for a given supply voltage, a wider tuning range inevitably translates to a greater VCO gain, KVCO, thereby making the circuit more sensitive to disturbance ("ripple") on the control line. This effect leads to larger reference sidebands in RF synthesizers and higher jitter in timing applications. With the scaling of supply voltages, the problem of high KVCO has become more serious, calling for alternative solutions.

A number of circuit and architecture techniques have been devised to lower the sensitivity of the VCO to ripple on the control line. For example, a digital tuning mechanism can be added to perform coarse adjustment of the frequency, allowing the analog (fine) control to cover a much narrower range. Illustrated in Fig. 21(a), the idea is to switch constant capacitors into or out of the tanks, thereby introducing discrete frequency steps. The varactors then tune the frequency within each step, leading to the characteristic shown in Fig. 21(b). Note that the switches are placed between the capacitors and ground rather than between the tank and the capacitors. This permits the use of NMOS devices with a gate-source voltage equal to VDD, minimizing their on-resistance.

Fig. 21. (a) VCO with fine and coarse digital control, (b) resulting characteristics.

The above technique entails three critical issues. First, the trade-off between the on-resistance and junction capacitance of the MOS switches translates to another between the Q and the tuning range. When on, each switch limits the Q of its corresponding capacitor to $(R_{on} C_u \omega)^{-1}$. When off, each switch presents its drain junction and gate-drain capacitances, CDB + CGD, in series with Cu, constraining the lower bound of the capacitance to Cu(CDB + CGD)/(Cu + CDB + CGD) rather than zero. In other words, wider switches degrade the overall Q to a lesser extent but at the cost of narrowing the discrete frequency steps.

The second issue relates to potential "blind" zones in the characteristic of Fig. 21(b). As exemplified by Fig. 22, if the discrete step resulting from switching out one unit capacitor is greater than the range spanned continuously by the varactors, then the oscillator fails to assume the frequency values between f1 and f2 for any combination of the digital and analog controls. For this reason, the discrete steps must be sufficiently small to ensure overlap between consecutive bands.⁸

Fig. 22. Blind zone resulting from insufficient fine tuning range.

The third issue stems from the loop settling speed. As described below, the PLL takes a long time to determine how

⁸ With a finite overlap, however, more than one combination of digital and analog controls may yield a given frequency. To avoid this ambiguity, the
7
This is relatively independent of whether the bottom plates are connected loop must begin with a minimum (or maximum) value of the digital control
to nodes X and Y or to R\ and Rz. and adjust it monotonically.
11
many capacitors must be switched into the tanks. Thus, if a change in temperature or channel frequency requires a discrete frequency step, then the system using the PLL must remain idle while the loop settles.

When employed in a phase-locked loop, the oscillator of Fig. 21(a) requires additional mechanisms for setting the digital control. Figure 23 depicts an example for frequency synthesis. Here, the oscillator control voltage is monitored and compared with two low and high voltages, $V_L$ and $V_H$, respectively. If $V_{cont}$ falls below $V_L$, the oscillation frequency is excessively low,9 and one unit capacitor is switched out. Conversely, if $V_{cont}$ exceeds $V_H$, one unit capacitor is switched in. After each switching, the loop settles and, if still unlocked, continues to undergo discrete frequency steps.

Fig. 23. Synthesizer using fine and coarse frequency control.

9 We assume the frequency increases with $V_{cont}$.
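The quantities that govern the digital tuning trade-offs above can be evaluated numerically. The following is a minimal sketch, not a design procedure: the function names and all component values are hypothetical, and the tuning range is taken relative to the lower band edge, an assumption that appears to reproduce the 7.8% figure quoted for GSM.

```python
import math


def fractional_tuning_range(f_low_hz, f_high_hz):
    """Tuning range relative to the lower band edge (assumed convention)."""
    return (f_high_hz - f_low_hz) / f_low_hz


def on_state_q(r_on_ohm, c_u_farad, freq_hz):
    """Q of a unit capacitor in series with an on switch: 1 / (R_on * C_u * omega)."""
    omega = 2.0 * math.pi * freq_hz
    return 1.0 / (r_on_ohm * c_u_farad * omega)


def off_state_capacitance(c_u, c_db, c_gd):
    """C_u in series with the switch parasitics (C_DB + C_GD) when the switch is off."""
    c_par = c_db + c_gd
    return c_u * c_par / (c_u + c_par)


def has_blind_zone(discrete_step_hz, varactor_span_hz):
    """True if one coarse step exceeds the range covered continuously by the varactors."""
    return discrete_step_hz > varactor_span_hz


if __name__ == "__main__":
    # GSM example from the text: one VCO covering 890 MHz to 960 MHz.
    print(f"GSM tuning range: {100 * fractional_tuning_range(890e6, 960e6):.2f} %")

    # Hypothetical unit cell: 5-ohm switch, 100 fF unit capacitor, 1 GHz.
    print(f"on-state Q: {on_state_q(5.0, 100e-15, 1e9):.1f}")
    print(f"off-state C: {off_state_capacitance(100e-15, 10e-15, 5e-15) * 1e15:.2f} fF")

    # Hypothetical coarse step of 12 MHz against a 10 MHz varactor span.
    print("blind zone:", has_blind_zone(12e6, 10e6))
```

Widening the switch in such a model would lower `r_on_ohm` (raising Q) while raising `c_db` and `c_gd` (raising the off-state capacitance), which is the trade-off the text describes.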
Delay-Locked Loops - An Overview
Chih-Kong Ken Yang

C.K. Ken Yang is with University of California at Los Angeles, yang@ee.ucla.edu.

Abstract - Phase-locked loops have been used for a wide range of applications from synthesizing a desired phase or frequency to recovering the phase and frequency of an input signal. Delay-locked loops (DLLs) have emerged as a viable alternative to the traditional oscillator-based phase-locked loops. With its first-order loop characteristic, a DLL is both easier to stabilize and free of jitter accumulation. The paper describes design considerations and techniques to achieve high performance in a wide range of applications. Issues such as avoiding false lock, maintaining 50% clock duty cycle, building unlimited phase range for frequency synthesis, and multiplying the reference frequency are discussed.

I. INTRODUCTION

Many applications require accurate placement of the phase of a clock or data signal. Although simply delaying the signal could shift the phase, the phase shift is not robust to variations in processing, voltage, or temperature. For more precise control, designers incorporate the phase shift into a feedback loop that locks the output phase with an input reference signal that indicates the desired phase shift. In essence, the loop is identical to a phase-locked loop (PLL) except that phase is the only state variable and that a variable-delay line replaces the oscillator. Such a loop is commonly referred to as a delay-line phase-locked loop or delay-locked loop (DLL). As with a PLL, the goals are (1) accurate phase position or low static-phase offset, and (2) low phase noise or jitter.

Because a DLL does not contain an element of variable frequency, it historically has fewer applications than PLLs. Bazes in [1] demonstrated an example of precisely delaying a signal in generating the timing of the row and column access strobe signals for a DRAM. Another common application uses a DLL to generate a buffered clock that has the same phase as a weakly-driven input clock. Johnson in [2] synchronizes the timing of the buffered clock of a floating-point unit with the clock of a microprocessor. A similar application recovers the data of a parallel bus by generating a properly positioned sampling clock. Typically, these systems provide a sampling clock with the same sampling rate but with an arbitrary phase as compared to the data (i.e. a "mesochronous" system [4]). A clocked DRAM data bus is an example of such a system. A clock propagates with the data as one of the signals in the bus and therefore has a nominally known phase relationship with the data. However, in order to receive and buffer the clock to sample the data bus, the actual sampling clock is no longer properly aligned with the data. A DLL is commonly used to lock the phase of the buffered clock to that of the input data. The phase locking significantly reduces timing uncertainty in sampling the data, which then enables higher data rates as in [3].

Although aperiodic signals can also be delayed by the delay line in a DLL, the inputs to delay lines are typically clock signals. By using a periodic signal, the delay lines do not need arbitrarily long delays and typically only need to span the period of the clock to generate all possible phases. A data signal can be delayed by sampling the data with the appropriately delayed clock.

The motivation for using DLLs is that the design of the control loop is simplified by having only phase as the state variable. Section II reviews how such a loop is unconditionally stable and has better jitter characteristics. However, a DLL is not without its own limitations. The variable delay line has a finite delay range and finite bandwidth. Section II also discusses these design considerations. Section III describes different implementations of the variable delay line. Within the past ten years, modifications to the basic DLL architecture have enabled clock and data recovery applications in "plesiochronous" systems [4] where the sampling rates for clock and data differ by a few hundred parts-per-million in frequency. Delay lines with effectively infinite delay are also addressed in Section III.

More recently, several researchers such as [5] and [6] have introduced architectures that permit frequency multiplication based on delay lines, which further extends their use in clock generation and frequency synthesis. Section IV describes these architectures.

II. DLL CHARACTERISTICS

The basic loop building blocks are similar to those of a PLL: a phase detector, a filter, and a variable-delay line. Figure 1 illustrates the three main functional blocks. Since phase is the only state variable, a control loop higher than first-order is not needed to compensate a fixed phase error. The resulting transient impulse response is a simple exponential. Although the simple loop characteristics are an advantage that DLLs have over PLLs, the design is complicated by the additional circuitry that is needed to overcome having a limited delay range and not producing its own frequency.

A. First-order Loop

A phase detector compares the phase of the reference input and the delay-line output. The comparison yields a signal proportional to the phase error. The error is low-pass
Figure 3: Step response of PLL and DLL (with same loop characteristics).
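The first-order behaviour that Figure 3 contrasts with a PLL can be illustrated with a small discrete-time sketch. This is an assumed abstraction rather than the model used in the paper: `loop_gain` is a hypothetical per-cycle gain that lumps the phase detector, filter, and delay-line gains together, and the delay line is treated as ideal with unlimited range.

```python
def dll_step_response(loop_gain, num_cycles, phase_step=1.0):
    """Output phase of a first-order loop tracking a step in the reference phase.

    Each cycle the output moves by loop_gain times the remaining phase error,
    so the error shrinks by a factor (1 - loop_gain) per cycle.
    """
    output_phase = 0.0
    history = []
    for _ in range(num_cycles):
        error = phase_step - output_phase
        output_phase += loop_gain * error
        history.append(output_phase)
    return history


if __name__ == "__main__":
    response = dll_step_response(loop_gain=0.1, num_cycles=50)
    # With 0 < loop_gain < 1 the response rises monotonically toward the step
    # without overshoot, i.e. a sampled exponential.
    for cycle in (0, 9, 19, 49):
        print(cycle + 1, round(response[cycle], 4))
```

Because phase is the only state variable in this model, the remaining error after n cycles is simply (1 - loop_gain)^n, which is the exponential settling the text attributes to a first-order loop.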
Figure 5: Delay line phase/delay characteristic.
filter would significantly attenuate a high-frequency input clock. The attenuation increases the jitter and may even prohibit the input from reaching the output.

Even if the delay line is constrained to span only one lock point but greater than 2π, a second similar issue exists. It is difficult to design a delay line such that the adjustable range is exactly π to -π across different operating and processing conditions. If initialized at the minimum or maximum delay, the phase detector may push the loop toward either the maximum or minimum delay limit and "false-lock" to an incorrect phase.

To address false-locking, designers employ several techniques depending on the application. For systems that require a delay line with a known fixed delay, operating condition variations may be small enough such that the delay line only needs a small variable range that is less than +π and -π. For systems that lock to a fixed phase over a wide range of frequencies, one design [9] uses an auxiliary frequency-sensing loop that generates a voltage to coarsely set the delay for the given input frequency. The DLL then only fine-tunes the delay for the desired phase. For data recovery applications where the clock phase can be arbitrary with
A. Basic Delay Line

A delay line comprises a chain of variable-delay elements. Each element is controllable by either a voltage or a current. The delay of each element is proportional to its RC time constant, and changing the effective resistance or capacitance adjusts the delay.

Figure 7 depicts several examples of buffer elements. For a differential buffer, the load resistance can be an MOS transistor in the triode region [Fig. 7-(a)] where the resistance is inversely proportional to $V_{GS} - V_{th}$. Varying the gate voltage adjusts the delay of the element. A non-linear device such as a diode can also serve as a load resistance [Fig. 7-(b)]. Since the resistance varies with the current, varying the bias current of the buffer would adjust the delay. Similarly, a negative transconductance that changes with the bias current can be placed in parallel with a fixed load resistance [Fig. 7-(c)]. The varying negative transconductance changes the effective load resistance and hence varies the delay. Because nonlinear elements have resistances that depend on both voltage and current, they can be more sensitive to supply noise.

Figure 7: Six different delay elements.

For push-pull type elements such as inverters, the delay can be changed by changing the rate at which the output capacitance is charged [Fig. 7-(d)]. An adjustable current source limits the peak current of an inverter and varies the delay. An alternative method regulates the supply voltage of the inverters and uses the control voltage to set the supply voltage [Fig. 7-(e)]. The effective switching resistance varies with the supply voltage. Instead of changing the resistance, the effective capacitance can also be made adjustable [Fig. 7-(f)]. A transistor that behaves as an adjustable resistance can be used to decouple an explicit output capacitance. The larger the resistance, the less capacitance is seen at the output.

Figure 8 illustrates the delay versus control voltage for a resistively-controlled delay element. For the element of Fig. 7-(d), either $V_{GS} - V_{th}$ or the bias current can be zero and, therefore, a single element's delay can span from the minimum buffer delay to infinity. However, since the time constant is proportional to the delay, a long delay setting would significantly attenuate a high-frequency clock. Delay lines with a wide range for high clock frequencies require a large number of broadband delay elements.

Unlike resistive control, the maximum delay in a capacitively-controlled element [Fig. 7-(f)] is proportional to $R(C_{int} + C_{exp})$ and the minimum delay is proportional to $R C_{int}$, where $C_{int}$ is the intrinsic capacitance of the buffer and the load of the subsequent stage, and $C_{exp}$ is the explicit capacitance added to the circuit. Because of the limited range per buffer, obtaining a wide delay range involves a large number of buffers. The maximum delay of each buffer is chosen to avoid attenuating the signal. In designs where the clock has a large voltage swing, the transistor in series with the explicit capacitance no longer appears as a variable resistor because the device enters saturation and cut-off. For these buffers, the control voltage determines the fraction of current and period of time in which the buffer's current charges the explicit capacitance.

Figure 8: Delay versus voltage for two different delay buffer elements: types (d) and (f) of Fig. 7.

An example of the delay versus control voltage for a capacitively-controlled element is overlaid in Fig. 8. Most
delay elements exhibit some nonlinearity. As a result, the delay-line gain, $K_{DL}$, is a function of the delay. Because a DLL is unconditionally stable, the loop still functions with the varying loop parameter. However, more linear elements are better for designs that require a constant loop bandwidth. To compensate for the variable $K_{DL}$, designers add programmability to the loop-filter capacitor.

The control signal for either type of delay element can be digital. In a digital implementation [11], the current source is binary weighted and switched by a digital word. For capacitively-controlled elements, the capacitance can be binary weighted and switched. A nearly all-digital DLL is then possible by using a simple counter to replace the analog integrating filter.

B. Phase Interpolation

Instead of only using the clock phase at the end of a delay line, an earlier clock phase can be tapped from the middle of a delay line. Some applications require the delay line to produce a delay that is a fixed fraction of the input-clock period. Figure 9 shows one implementation that uses a DLL to lock the input clock to the output. A 180° phase detector would guarantee the absolute delay of a delay line to be a half-cycle. Tapping from different points on the delay line provides different phases. As shown in Fig. 9, for a 45° phase shift, the clock can be tapped from the first delay stage of a 4-stage differential delay line. If an arbitrary phase is needed, each delay stage can be tapped and multiplexers can select the nearest desired phase. The number of delay elements quantizes the phase step and limits the resolution [12]. Fine phase resolution requires longer delay lines. Yet, the resolution is limited at high clock frequencies because the maximum number of delay elements needed to span 180° is limited.

Figure 9: 180°-locked DLL to generate intermediate phases that are a fraction of a cycle.

An arbitrary intermediate phase can be obtained by "interpolating" between two clock phases that are tapped from a delay line. Depending on the weighting, an interpolator produces a clock that has a programmable output phase in between the input clock phases. As long as discrete clock phases that span the entire cycle are available as inputs, any phase for the interpolator's output is possible. Multiplexers are needed to select the phases to interpolate between. For example, with phases tapped from a 4-stage delay line, if the desired output clock phase is 120°, the interpolator inputs would be from the second and third delay elements.

Interpolators essentially perform a weighted average of the input phases. As shown in Fig. 10, ideally, the two input phases drive two integrators which charge a single output. The weighting of the average is by the relative currents of the two integrators. When α = 1, the output clock phase depends only on $ck_{in0}$. When α = 0.5, i.e. the current is split equally between the two integrators, the output phase is additionally delayed by half the phase difference. As illustrated in Fig. 10, the phase of the interpolated output ($ck_{out01}$) falls between the phases of the non-interpolated outputs ($ck_{out0}$ and $ck_{out1}$).

Figure 10: Phase interpolator design by shorting of the output of two integrators/buffers.

With ideal integrators, the interpolation is linear, resulting in a constant $K_{DL}$. Alternatively, an interpolator can effectively be formed with buffer elements instead of integrators. By weighting the drive strength or current of two buffer elements whose outputs are shorted together, one can adjust the output phase. Because the output is not integrated, the resulting interpolation is slightly nonlinear and depends on (1) the phase difference between the inputs and (2) the slew rate (or time constant) of the input and output signals [13]. Figure 11 depicts the linearity of the interpolation for two different input phase separations, s = τ and s = 2τ, where τ is the buffer's time constant. The larger phase spacing results in greater nonlinearity. Similar to RC delay elements, the interpolation can be digitally controlled. Since the weighting of the interpolation depends on the proportional current, the current sources of the integrators or buffers can be digitally weighted and programmed.

In a design for clock and data recovery by [3], quadrature clocks are interpolated to generate an intermediate clock phase within a quadrant. Figure 12 illustrates the mostly analog architecture. An analog control
Figure 11: Buffer based phase interpolator linearity (output phase versus current partition α).

Figure 12: Infinite-range delay line based on phase rotation.
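The tap selection and ideal (integrator-based) interpolation described in the phase interpolation discussion can be sketched as follows. This is an illustrative model only: the function names are hypothetical, the delay line is assumed to be locked to 180° with equal stage delays, and the buffer-based nonlinearity shown in Figure 11 is not modelled.

```python
def tap_phases(num_stages, locked_span_deg=180.0):
    """Phase at the output of each stage of a delay line locked to locked_span_deg."""
    step = locked_span_deg / num_stages
    return [step * (stage + 1) for stage in range(num_stages)]


def interpolate_phase(phase0_deg, phase1_deg, alpha):
    """Ideal interpolation: weighted average, with alpha = 1 selecting phase0."""
    return alpha * phase0_deg + (1.0 - alpha) * phase1_deg


def select_taps(target_deg, phases):
    """Return (index0, index1, alpha) for the adjacent taps that bracket target_deg.

    Indices are zero-based positions in `phases`.
    """
    for i in range(len(phases) - 1):
        low, high = phases[i], phases[i + 1]
        if low <= target_deg <= high:
            alpha = (high - target_deg) / (high - low)
            return i, i + 1, alpha
    raise ValueError("target phase is outside the range covered by the taps")


if __name__ == "__main__":
    phases = tap_phases(4)                     # 45, 90, 135, 180 degrees
    i0, i1, alpha = select_taps(120.0, phases)
    # For 120 degrees the bracketing taps are the second and third stages.
    print("stages:", i0 + 1, i1 + 1, "alpha:", round(alpha, 4))
    print("output:", interpolate_phase(phases[i0], phases[i1], alpha))
```

With `alpha = 0.5` the same function returns the midpoint of the two input phases, matching the statement that an equal current split delays the output by half the phase difference.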
Figure 14: Oversampled data recovery architecture.

Figure 15: Clock/data recovery using startable oscillator.
restore a clock's duty cycle, the output clock requires correction circuitry. To use DLLs in plesiochronous systems, the delay line must have even more circuitry to achieve an unlimited delay range. In clock multiplication applications, very careful matching in the DLL components is critical to eliminate reference tones. In the many designs that have addressed these subtleties, DLLs have demonstrated low-jitter clock outputs for a variety of clock generation and data recovery applications.

REFERENCES
Solid-State Circuits, vol 34, no 12, Dec. 1999, pp. 1951-60
[16] Cordell, R., "A 45-Mbit/s CMOS VLSI Digital Phase Aligner," IEEE Journal of Solid-State Circuits, vol 23, no 2, Apr. 1988, pp. 323-28
[17] Lee, K., et al., "A CMOS Serial Link For Fully Duplexed Data Communication," IEEE Journal of Solid-State Circuits, vol 30, no 4, Apr. 1995, pp. 353-64
[18] Yang, C.K., et al., "A 0.5-μm CMOS 4.0-Gb/s Serial Link Transceiver with Data Recovery Using Oversampling," IEEE Journal of Solid-State Circuits, vol 33, no 5, May 1998, pp. 713-22
[19] Weinlader, D., et al., "An Eight Channel 36-GS/s CMOS Timing Analyzer," IEEE ISSCC Dig. of Tech. Papers, Feb. 2000, San Francisco, pp. 170-1
[20] Maneatis, J., M. Horowitz, "Precise Delay Generation Using Coupled Oscillators," IEEE Journal of Solid-State Circuits, vol 28, no 12, Dec. 1993, pp. 1273-82
[21] Gray, C., et al., "A Sampling Technique and Its CMOS Implementation with 1Gb/s Bandwidth and 25ps Resolution," IEEE Journal of Solid-State Circuits, vol 29, no 3, Mar. 1994, pp. 340
[22] Ota, Y., et al., "High-Speed, Burst-Mode, Packet Capable Optical Receiver and Instantaneous Clock Recovery for Optical Bus Operation," IEEE Journal of Lightwave Technology, vol 12, no 2, Feb. 1994, pp. 325-330
[23] Foley, D., M.P. Flynn, "CMOS DLL-Based 2-V 3.2ps Jitter 1-GHz Clock Synthesizer and Temperature-Compensated Tunable Oscillator," IEEE Journal of Solid-State Circuits, vol 36, no 3, Mar. 2001, pp. 417-23
[24] Kim, C., I. Hwang, S.M. Kang, "Low-Power Small-Area +/-7.28ps Jitter 1GHz DLL-Based Clock Generator," IEEE ISSCC Dig. of Tech. Papers, Feb. 2002, San Francisco, Session 8.3
[25] Farjad-rad, R., et al., "A 0.2-2GHz 12mW Multiplying DLL for Low-Jitter Clock Synthesis in Highly-Integrated Data Communication Chips," IEEE ISSCC Dig. of Tech. Papers, Feb. 2002, San Francisco, Session 4.5
[26] Ye, S., L. Jansson, I. Galton, "A Multiple-Crystal Interface PLL with VCO Realignment to Reduce Phase Noise," IEEE ISSCC Dig. of Tech. Papers, Feb. 2002, San Francisco, Session 4.6
[27] Kim, J., et al., "A Low-Jitter Mixed-Mode DLL for High-Speed DRAM Applications," IEEE Journal of Solid-State Circuits, vol 35, no 10, Oct. 2000, pp. 1430-3
Delta-Sigma Fractional-N Phase-Locked Loops
Ian Galton

The author is with the Department of Electrical and Computer Engineering, University of California at San Diego, La Jolla, CA, USA.

I. INTRODUCTION

Over the last decade, delta-sigma (ΔΣ) fractional-N phase-locked loops (PLLs) have become widely used for frequency synthesis in consumer-oriented electronic communications products such as cellular phones and wireless LANs. Unlike an integer-N PLL, the output frequency of a ΔΣ fractional-N PLL is not limited to integer multiples of a reference frequency. The core of a ΔΣ fractional-N PLL is similar to an integer-N PLL, but it incorporates additional digital circuitry that allows it to accurately interpolate between integer multiples of the reference frequency. The tuning resolution depends only on the complexity of the digital circuitry, so considerable flexibility and programmability is achieved. A single ΔΣ fractional-N PLL often can be used for local oscillator generation in applications that would otherwise require a cascade of two or more integer-N PLLs. Moreover, the fine tuning resolution makes it possible to perform digitally-controlled frequency modulation for generation of continuous-phase (e.g., FSK and MSK) transmit signals, thereby simplifying wireless transmitters. These benefits come at the expense of increased digital complexity and somewhat increased phase noise relative to integer-N PLLs. However, with the relentless progress in silicon VLSI technology optimized for digital circuitry, this tradeoff is increasingly attractive, especially in consumer products which tend to favor cost reduction over performance.

This paper presents a tutorial on ΔΣ fractional-N PLLs. It is assumed that the reader has a working knowledge of integer-N PLLs. The paper builds on this knowledge by presenting the additional concepts required to understand ΔΣ fractional-N PLLs. The limitations of integer-N PLLs with respect to tuning resolution are described in Section II. The key ideas underlying fractional-N PLLs in general and ΔΣ fractional-N PLLs in particular are presented in Section III. The primary innovation in ΔΣ fractional-N PLLs relative to other types of fractional-N PLLs is the use of ΔΣ modulation. Therefore, a self-contained introduction to ΔΣ modulation as it relates to ΔΣ fractional-N PLLs is presented in Section IV. A ΔΣ fractional-N PLL linearized model is derived in Section V and compared to the corresponding model for integer-N PLLs. A design example is presented to demonstrate how the model is used in practice. Design issues that arise in ΔΣ fractional-N PLLs but not integer-N PLLs are presented in Section VI, and recently developed enhancements to ΔΣ fractional-N PLLs that allow wideband digital modulation of the VCO are presented in Section VII.

Figure 1: A typical integer-N PLL.

II. INTEGER-N PLL LIMITATIONS

An example of a typical integer-N PLL for frequency synthesis is shown in Figure 1 [1], [2]. Its purpose is to generate a spectrally pure periodic output signal with a frequency of $N f_{ref}$, where N is an integer and $f_{ref}$ is the frequency of the reference signal. The example PLL consists of a phase-frequency detector (PFD), a charge pump, a lowpass loop filter, a voltage controlled oscillator (VCO), and an N-fold digital divider. The PFD compares the positive-going edges of the reference signal to those from the divider and causes the charge pump to drive the loop filter with current pulses whose widths are proportional to the phase difference between the two signals. The pulses are lowpass filtered by the loop filter and the resulting waveform drives the VCO. Within the loop bandwidth phase noise from the VCO is suppressed and outside the loop bandwidth most of the other noise sources are suppressed, so the
/ . - 4 0 kHz fw - 2.402 GHz Phase/
Charge Loop
+ *MHz<wk Freq.
^-492 Phase/ Charge Loop
Detector Pump Filter
f**r2-403 GHz
Freq. VCO 19.68 MHz (on average)
Detector Pump Filter
^•60050 + 2 5 * _r
d
Shift Register with 51
ones and 441 zeros y\»\
Figure 2: An example integer-N PLL for generation of the Bluetooth wireless
LAN RF channel frequencies. Figure 3: A fractional-.^ PLL that generates non-integer multiples of the refer-
ence frequency, but has phase noise consisting of large spurious tones.
PLL can be designed to generate a spectrally pure output sig-
nal at any integer multiple of the reference frequency,/*/. scribed in the next section, a singlefractional-TVPLL can be
As indicated by the timing diagram in Figure 1, the loop used.
filter is updated by the charge pump once every reference pe-
riod. This discrete-time behavior places an upper limit on the III. THE IDEA BEHIND AS FRACTIONAL-//PLLs
loop bandwidth of approximately fnj/l0 above which the PLL
tends to be unstable [1]. In integrated circuit PLLs, it is com- In this section, the example problem of generating the sec-
mon to further limit the bandwidth to approximately f^/20 to ond Bluetooth channel frequency, 2.403 GHz, with a reference
allow for process and temperature variations. frequency of 19.68 MHz is used as a vehicle with which to
The output frequency can be changed by changing N, but explain the idea behind AE fractional-N PLLs. First, a pair of
N must be an integer, so the output frequency can be changed "bad"fractional-TVPLLs are presented that achieve the desired
only by integer multiples of the reference frequency. If finer frequency but have poor phase noise performance. Then the
tuning resolution is required the only option is to reduce the AEfractional-TVPLL technique is presented as a means of im-
reference frequency. Unfortunately, this tends to reduce the proving the phase noise performance.
maximum practical loop bandwidth, thereby increasing the The output frequency of an integer-N PLL with a reference
settling time of the PLL, the noise contributed by the VCO, frequency of 19.68 MHz is 2.40096 GHz when the divider
and the in-band portions of the noise contributed by the refer- modulus, N, is set to 122 and 2.42064 GHz when N is set to
ence source, the PFD, the charge pump, and the divider. 123. The problem is that to achieve the desired frequency of
This fundamental tradeoff between bandwidth and tuning 2.403 GHz, TV would have to be set to the non-integer value of
resolution in integer-Af PLLs creates problems in many appli- 122 + 51/492. This cannot be implemented directly because
cations. For example, a PLL that can be tuned from 2.402 the divider modulus must be an integer value. However the
GHz to 2.480 GHz in steps of 1 MHz is required to generate divider modulus can be updated each reference period, so one
the local oscillator signal in a direct conversion Bluetooth option is to switch between N = 122 and N= 123 such that the
transceiver [3]. An integer-N PLL capable of generating the average modulus over many reference periods converges to
local oscillator signal from a commonly used crystal oscillator 122 + 51/492. In this case, the resulting average PLL output
frequency, 19.68 MHz, is shown in Figure 2. A reference frequency of f_ref = 40 kHz, the greatest common divisor of the crystal frequency and the set of desired output frequencies, is obtained by dividing the crystal oscillator signal by 492. The resulting PLL output frequency is 60050 + 25k times the reference frequency, where k is an integer used to select the desired frequency step.

The PLL achieves the desired output frequencies, but its bandwidth is limited to approximately 2 kHz, i.e., f_ref/20. Unfortunately, with such a low bandwidth the settling time exceeds the 200 µs limit specified in the Bluetooth standard, and the phase noise contributed by the VCO would be unacceptably high if it were implemented in present-day CMOS technology. One solution is to use a 1 MHz reference signal, but this requires the crystal frequency to be an integer multiple of 1 MHz, or another PLL to generate a 1 MHz reference frequency. Unfortunately, in low cost consumer electronics applications such as Bluetooth, it is often desirable to be compatible with all of the popular crystal frequencies, so restricting the crystal frequencies to multiples of 1 MHz is not always an option. In such cases, an additional PLL capable of generating the 1 MHz reference signal with very little phase noise from any of the crystal frequencies is required, or, as de-

frequency is 2.403 GHz as desired. This is the fundamental idea behind most fractional-N PLLs [4].

While dynamically switching the divider modulus solves the problem of achieving non-integer multiples of the reference frequency, a price is paid in the form of increased phase noise. During each reference period the difference between the actual divider modulus and the average, i.e., ideal, divider modulus represents error that gets injected into the PLL and results in increased phase noise. As described below, the amount by which the phase noise is increased depends upon the characteristics of the sequence of divider moduli.

For example, in the fractional-N PLL shown in Figure 3, the divider modulus is set each reference period to 122 or 123 such that over each set of 492 consecutive reference periods it is set to 122 a total of 441 times and 123 a total of 51 times. Thus, the average modulus is 122 + 51/492 as required. The sequence of moduli is periodic with a period of 492, so it repeats at a rate of 40 kHz. Consequently, the difference between the actual divider moduli and their average is a periodic sequence with a repeat rate of 40 kHz, so the resulting phase noise is periodic and is comprised of spurious tones at integer multiples of 40 kHz. Many of the spurious tones occur at low frequencies, and they can be very large. Unfortunately, the
only way to suppress the tones is to have a very small PLL bandwidth, which negates the potential benefit of the fractional-N technique.

[Figure 4: A fractional-N PLL that generates non-integer multiples of the reference frequency, but has a large amount of in-band phase noise. The reference is 19.68 MHz, the output is 2.403 GHz on average, and the divider modulus is 122 + y[n], where a randomized pulse density modulator sets y[n] to 1 with probability 51/492 and to 0 with probability 441/492.]

One way to eliminate spurious tones is to introduce randomness to break up the periodicity in the sequence of moduli while still achieving the desired average modulus. For example, as shown in Figure 4, a digital block can be used to generate a sequence, y[n], that approximates a sampled sequence of independent random variables that take on values of 0 and 1 with probabilities 441/492 and 51/492, respectively. During the nth reference period the divider modulus is set to 122 + y[n], so the sequence of moduli has the desired average yet its power spectral density (PSD) is that of white noise. Thus, instead of contributing spurious tones, the modified technique introduces white noise. Unfortunately, the portion of the white noise within the PLL's bandwidth is integrated by the PLL transfer function, so the overall phase noise contribution again can be significant unless the PLL bandwidth is small.

In each fractional-N PLL example presented above, the sequence, y[n], can be written as y[n] = x + em[n], where x is the desired fractional part of the modulus, i.e., x = 51/492, and em[n] is undesired zero-mean quantization noise caused by using integer moduli in place of the ideal fractional value. In the first example, em[n] is periodic and therefore consists of spurious tones at multiples of 40 kHz. In the second example, em[n] is white noise. Each PLL attenuates the portion of em[n] outside its bandwidth, but the portion within its bandwidth is not significantly attenuated. Unfortunately, in each example em[n] contains significant power at low frequencies, so it contributes substantial phase noise unless the PLL bandwidth is very low.

A ΔΣ fractional-N PLL avoids this problem by generating the sequence of moduli such that the quantization noise has most of its power in a frequency band well above the desired bandwidth of the PLL [5], [6], [7]. An example ΔΣ fractional-N PLL is shown in Figure 5. The PLL core is similar to those of the previous fractional-N PLL examples, but in this case y[n] is generated by a digital ΔΣ modulator. The details of how the ΔΣ modulator works are presented in the next section, but its purpose is to coarsely quantize its input sequence, x[n], such that y[n] is integer-valued and has the form: y[n] = x[n - 2] + em[n], where em[n] is dc-free quantization noise with most of its power outside the PLL bandwidth. In this example, x[n] consists of the desired fractional modulus value, 51/492, plus a small, pseudo-random, 1-bit sequence. As described in the next section, the pseudo-random sequence is necessary to avoid spurious tones in the ΔΣ modulator's quantization noise, but its amplitude is very small so it does not appreciably increase the phase noise of the PLL.

[Figure 5: A ΔΣ fractional-N PLL example. The ΔΣ modulator input includes a pseudo-random bit sequence taking values in {0, 2^-17}, and its output y[n] takes values in {-1, 0, 1, 2}. Inset: simulated PLL phase noise for a 50 kHz loop bandwidth and a 500 kHz loop bandwidth.]

Also shown in Figure 5 are PSD plots of the output phase noise arising from ΔΣ modulator quantization noise, em[n], in two computer simulated versions of the example ΔΣ fractional-N PLL, one with a 50 kHz loop bandwidth and the other with a 500 kHz loop bandwidth. As shown in the next section, the PSD of em[n] increases with frequency, so the phase noise PSD corresponding to the 50 kHz bandwidth PLL is significantly smaller than that corresponding to the 500 kHz bandwidth PLL. For example, the former easily meets the requirements for a local oscillator in a direct conversion Bluetooth transceiver, but the latter falls short of the requirements by at least 23 dB.

IV. DELTA-SIGMA MODULATION OVERVIEW

As mentioned above, a digital ΔΣ modulator performs coarse quantization in such a way that the inevitable error introduced by the quantization process, i.e., the quantization noise, is attenuated in a specific frequency band of interest. There are many different ΔΣ modulator architectures. Most use coarse uniform quantizers to perform the quantization with feedback around the quantizers to suppress the quantization noise in particular frequency bands. Therefore, to illustrate the ΔΣ modulator concept, first a specific uniform quantizer example is considered in isolation, and then a specific ΔΣ modulator architecture that incorporates the uniform quantizer is presented.

A. An Example Uniform Quantizer

The input-output characteristic of the example uniform quantizer is shown in Figure 6. It is a 9-level quantizer with integer valued output levels. For each input value with a magnitude less than 4.5, the quantizer generates the corresponding output sample by rounding the input value to the nearest integer. For each input value greater than 4.5 or less than -4.5, the quantizer sets its output to 4 or -4, respectively; such values are said to overload the quantizer. By defining the quantization noise as eq[n] = y[n] - r[n], the quantizer can be viewed without approximation as an additive noise source as illustrated in the figure.
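The quantizer rule described above can be written in a few lines. The following is only a minimal sketch: the function names `quantize9` and `quantization_noise` are made up for illustration, and the treatment of inputs that fall exactly halfway between two integers (rounded up here) is an assumption, since the text does not specify it.

```python
import math

def quantize9(r):
    """9-level uniform quantizer with integer output levels -4, ..., 4.

    Inputs with magnitude below 4.5 are rounded to the nearest integer;
    larger inputs overload the quantizer and saturate at +4 or -4.
    Ties (e.g. r = 0.5) are rounded up, which is an assumption.
    """
    return max(-4, min(4, math.floor(r + 0.5)))

def quantization_noise(r):
    """Additive-noise view of the quantizer: e_q = y - r, so y = r + e_q exactly."""
    return quantize9(r) - r

if __name__ == "__main__":
    for r in (0.4, 2.6, -1.2, 4.4, 7.3, -5.0):
        print(f"r = {r:5.1f}  y = {quantize9(r):2d}  e_q = {quantization_noise(r):+.2f}")
```

Within the no-overload range the magnitude of e_q never exceeds 0.5, whereas an overloading input such as 7.3 produces a much larger error, which is why overload is avoided in practice.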
[Figure 6: A 9-level quantizer example, showing the input-output characteristic, the no-overload range, and the equivalent additive noise model with e = y - r.]

[Figure: A 48 kHz sinusoid plus white noise (SNR = 100 dB), sampled at 48 MHz, applied to the 9-level quantizer and then to a lowpass filter (BW = 500 kHz), with signals observed at points (a) and (b).]

[Figure 8: A ΔΣ modulator example.]

can be used to circumvent this problem. The structure incorporates the same 9-level quantizer presented above, but in this case the quantizer is preceded by two delaying discrete-time integrators (i.e., accumulators), and surrounded by two feedback loops [8], [9]. Each discrete-time integrator has a transfer function of z^-1/(1 - z^-1), which implies that its nth output sample is the sum of all its input samples for times k < n. With the quantizer represented as an additive noise source as depicted in Figure 6, the ΔΣ modulator can be viewed as a two-input, single-output, linear time-invariant, discrete-time system. It is straightforward to verify that

$$y[n] = x[n-2] + e_m[n], \tag{1}$$

where em[n] is the overall quantization noise of the ΔΣ modu-

To illustrate the behavior of the ΔΣ modulator, suppose that the same 48 Msample/s input sequence considered above is
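Relation (1) can be checked numerically. The sketch below is an illustration under stated assumptions, not the circuit of Figure 8: the feedback gains (1 into the first integrator, 2 into the second) are the textbook choice for a second-order modulator built from two delaying integrators, and the overall noise is taken to be the second difference of the quantizer noise, em[n] = eq[n] - 2eq[n-1] + eq[n-2], which is the standard second-order result rather than an expression quoted from this text. The dither sequence mentioned in Section III is omitted, and all function names are hypothetical.

```python
from fractions import Fraction
from math import floor

def quantize9(r):
    """9-level quantizer: round to the nearest integer, saturate at +/-4."""
    return max(-4, min(4, floor(r + Fraction(1, 2))))

def second_order_dsm(x):
    """Return (y, e_q) for the input sequence x, starting from zero state."""
    v1 = v2 = Fraction(0)              # integrator (accumulator) states
    x_prev, y_prev = Fraction(0), 0    # previous-sample input and output
    y_out, e_q = [], []
    for x_n in x:
        # Delaying integrators: each new state depends only on previous samples.
        v2 = v2 + v1 - 2 * y_prev      # second integrator, assumed feedback gain of 2
        v1 = v1 + x_prev - y_prev      # first integrator, assumed feedback gain of 1
        y_n = quantize9(v2)
        y_out.append(y_n)
        e_q.append(y_n - v2)           # quantizer noise e_q = y - r
        x_prev, y_prev = x_n, y_n
    return y_out, e_q

def second_difference(e_q):
    """e_q[n] - 2 e_q[n-1] + e_q[n-2], with zero initial conditions."""
    padded = [Fraction(0), Fraction(0)] + list(e_q)
    return [padded[n + 2] - 2 * padded[n + 1] + padded[n] for n in range(len(e_q))]

# Constant input equal to the fractional modulus used in the examples.
x = [Fraction(51, 492)] * 4920
y, e_q = second_order_dsm(x)
e_m = second_difference(e_q)
x_delayed = [Fraction(0), Fraction(0)] + x[:-2]

identity_holds = all(y[n] == x_delayed[n] + e_m[n] for n in range(len(x)))
mean_y = Fraction(sum(y), len(y))
print("y[n] = x[n-2] + e_m[n] for every n:", identity_holds)
print("output levels used:", sorted(set(y)))
print("mean of y:", float(mean_y), " target:", 51 / 492)
```

Exact rational arithmetic (`Fraction`) is used so that the identity can be tested with equality rather than a tolerance. Because em[n] is a second difference, its running sum stays bounded, which is why the average of y[n] converges to the input value.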
[Figure: A 48 kHz sinusoid plus white noise (SNR = 100 dB), sampled at 48 MHz, applied to the second-order ΔΣ modulator and then to a lowpass filter (BW = 500 kHz), with signals observed at points (a) and (b). PSD plots are shown against frequency in Hz and waveforms against time in units of 1/(48 MHz).]

[Figure 10: The ΔΣ fractional-N PLL with the details of a commonly used loop filter and a timing diagram relating to the charge pump output.]
ing the nth reference period the charge pump output is a current pulse of amplitude I or -I and duration |t_n - τ_n|, where t_n and τ_n are the times of the charge pump output transitions triggered by the positive-going edges of the divider output and reference signal, respectively. Therefore, the average current sourced or sunk by the charge pump during the nth reference period is I·(t_n - τ_n)/T_ref. In practice, the PFD is usually designed such that, except for a possible constant offset, this result holds even though the current sources have finite rise and fall times [2].

The first step in deriving the model is to develop an expression for t_n - τ_n. Ideally, τ_n = nT_ref, but phase noise introduced by the reference source and PFD cause it to have the form

[Equation (3): not recoverable from the source text.]

where θ_ref(t) and θ_PFD(t) are the reference source and PFD phase noise functions, respectively. If the VCO output were ideal its positive-going edges would be spaced at uniform intervals of T_ref/(N + α), where α is the fractional part of the modulus (e.g., α = 51/492 in Figure 5). Therefore, ideally,

[Equation: not recoverable from the source text.]

but in practice it deviates because of VCO phase noise, θ_VCO(t), divider phase noise, θ_div(t), and instantaneous deviations of the VCO control voltage from its ideal average value of (N + α)/(T_ref K_VCO), where K_VCO is the VCO gain in units of Hz/Volt. As a result,

[Equation: not recoverable from the source text.]

which reduces to

[Equation (4): not recoverable from the source text.]

Subtracting (3) from (4) yields an expression for the average current sourced or sunk by the charge pump during the nth reference period:

[Equation (5), an expression for I(t_n - τ_n)/T_ref: not recoverable from the source text.]

As mentioned above, the phase noise terms are assumed to have bandwidths that are much smaller than the reference frequency. Consequently, the sampling of the phase noise functions in (5) can be neglected, and the charge pump output can be modeled as a smoothly varying function of time with an average value over each reference period equal to that of (5). With these approximations, (5) implies that

[Equation (6): not recoverable from the source text.]

where one of the signals on the right-hand side is the result of discrete-time integrating and converting to continuous-time the quantity, y[n] - α.

The ΔΣ fractional-N PLL linearized model follows directly from (6) and Figure 10. It is shown in Figure 11, where i_n(t) represents the noise contributed by the charge pump current sources and the loop filter, and Z(s) is the transfer function of the loop filter. The model specifies the phase noise transfer functions and loop dynamics of the PLL. For example, the model implies that

$$\frac{\theta_{PLL}(s)}{\theta_{ref}(s)} = (N+\alpha)\,\frac{T(s)}{1+T(s)} \quad\text{and}\quad \frac{\theta_{PLL}(s)}{\theta_{VCO}(s)} = \frac{1}{1+T(s)}, \tag{7}$$

where

$$T(s) = \frac{K_{VCO}\, I\, Z(s)}{(N+\alpha)\, s} \tag{8}$$

is the loop gain of the PLL. For the loop filter shown in Figure 10, the transfer function is

$$Z(s) = \frac{1}{C_1+C_2}\cdot\frac{1+sRC_2}{s\left[1+sRC_1C_2/(C_1+C_2)\right]}. \tag{9}$$

[Figure 11: The ΔΣ fractional-N PLL linearized model. Except for the shaded region the model is identical to the corresponding integer-N PLL model.]

B. Differences Between the ΔΣ Fractional-N and Integer-N PLL Models

The shaded region in Figure 11 indicates the part of the model that is specific to ΔΣ fractional-N PLLs; except for the shaded region the model is identical to the corresponding model for integer-N PLLs. Therefore, each phase noise transfer function in an integer-N PLL is identical to the corresponding phase noise transfer function in a ΔΣ fractional-N PLL, except every occurrence of N in the former is replaced by N + α in the latter. In most cases, N ≫ 1 and α < 1, so N + α ≈ N and the corresponding transfer functions in integer-N and ΔΣ fractional-N PLLs are nearly identical in practice. Similarly, the loop dynamics and stability issues are nearly the same in ΔΣ fractional-N PLLs and integer-N PLLs.
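The transfer functions in (7) through (9) are straightforward to evaluate numerically. The sketch below is an illustration only. It assumes the forms T(s) = K_VCO·I·Z(s)/((N + α)s) and Z(s) = (1 + sRC2)/(s(C1 + C2)[1 + sRC1C2/(C1 + C2)]), borrows its component values from the system design example in Section V-C (I = 200 µA, K_VCO = 200 MHz/V, R = 960 Ω, C2 = 23 nF, C1 = 480 pF, N + α = 122 + 51/492), and uses the unity-gain frequency of T as a stand-in for the loop bandwidth. All function names are hypothetical.

```python
import cmath
import math

# Values quoted in the design example of Section V-C.
I_CP = 200e-6                    # charge pump current I, in A
K_VCO = 200e6                    # VCO gain, in Hz/V
N_PLUS_ALPHA = 122 + 51 / 492    # average divider modulus N + alpha
R = 960.0                        # loop filter resistor, in ohms
C1 = 480e-12                     # small loop filter capacitor, in F
C2 = 23e-9                       # large loop filter capacitor, in F

def loop_filter_z(s):
    """Loop filter transfer function Z(s), assumed form of equation (9)."""
    c_sum = C1 + C2
    return (1 + s * R * C2) / (s * c_sum * (1 + s * R * C1 * C2 / c_sum))

def loop_gain(s):
    """Loop gain T(s), assumed form of equation (8)."""
    return K_VCO * I_CP * loop_filter_z(s) / (N_PLUS_ALPHA * s)

def ref_to_out(s):
    """Reference phase noise transfer function, first part of (7)."""
    t = loop_gain(s)
    return N_PLUS_ALPHA * t / (1 + t)

def vco_to_out(s):
    """VCO phase noise transfer function, second part of (7)."""
    return 1 / (1 + loop_gain(s))

def unity_gain_frequency(f_lo=1e3, f_hi=1e7):
    """Bisect on a log scale for |T(j*2*pi*f)| = 1; |T| decreases with f here."""
    for _ in range(100):
        f_mid = math.sqrt(f_lo * f_hi)
        if abs(loop_gain(2j * math.pi * f_mid)) > 1:
            f_lo = f_mid
        else:
            f_hi = f_mid
    return math.sqrt(f_lo * f_hi)

def phase_margin_deg(f_c):
    """Phase margin: 180 degrees plus the phase of T at the crossover frequency."""
    return 180 + math.degrees(cmath.phase(loop_gain(2j * math.pi * f_c)))

f_c = unity_gain_frequency()
print(f"unity-gain frequency: {f_c / 1e3:.1f} kHz")
print(f"phase margin: {phase_margin_deg(f_c):.1f} degrees")
print(f"|theta_PLL/theta_ref| at 10 Hz: {abs(ref_to_out(2j * math.pi * 10)):.2f}")
print(f"|theta_PLL/theta_VCO| at 100 MHz: {abs(vco_to_out(2j * math.pi * 1e8)):.4f}")
```

With these values the crossover should land close to 50 kHz with a phase margin somewhat above 70°, in line with the figures quoted in the design example. The reference-path gain approaches N + α at low frequencies (reference noise is multiplied up and lowpass filtered), while the VCO-path gain approaches 1 at high frequencies (VCO noise is highpass filtered).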
[Figure 12: The example ΔΣ fractional-N PLL and frequency plan for generation of the Bluetooth wireless LAN RF channel frequencies. The reference is 19.68 MHz, the divider modulus is N + y[n] with y[n] in {-1, 0, 1, 2}, and the input of the second-order digital ΔΣ modulator is m/492 plus a pseudo-random bit sequence taking values in {0, 2^-17}. Frequency plan:
• To get k = 0, 1, ..., or 18: set N = 122, m = k·25 + 26
• To get k = 19, 20, ..., or 38: set N = 123, m = (k - 19)·25 + 9
• To get k = 39, 40, ..., or 57: set N = 124, m = (k - 39)·25 + 17
• To get k = 58, 59, ..., or 78: set N = 125, m = (k - 58)·25]

The primary difference between the ΔΣ fractional-N and integer-N PLL models is the signal path corresponding to the ΔΣ modulator shown in the shaded region of Figure 11. The sequence, y[n] - α, consists of ΔΣ modulator quantization noise, em[n], which, as described previously, gives rise to phase error in the PLL output. For the example second-order ΔΣ modulator it follows from the results presented in Section IV and the ΔΣ fractional-N PLL model equations presented above that the PLL phase noise component resulting from em[n] has a PSD given by

[Equation (10), an expression in dBc/Hz of the form 10·log[...]: not recoverable from the source text.]

The argument of the log function has the form of a highpass function times a lowpass function, which is consistent with the claim in Section III that the PLL lowpass filters the primarily high frequency quantization noise from the ΔΣ modulator. It follows from (10) that the phase noise resulting from em[n] can be decreased by reducing the PLL bandwidth or increasing the reference frequency. If a higher-order ΔΣ modulator is used, an equation similar to (10) results except that the exponent of the sinusoid is greater than two. This reduces the in-band portion of the quantization noise, but increases the out-of-band portion, which, depending upon the loop parameters of the PLL, can result in a somewhat lower overall phase noise. However, the PLL loop filter is highly constrained to maintain PLL stability, so the phase noise reduction that can be achieved by increasing the order of the ΔΣ modulator is limited in most applications [16].

C. A System Design Example

The PLL bandwidth and the phase margin both depend upon the loop gain, T(s), which, for the loop filter shown in Figure 10, depends upon the parameters f_ref, N, I, K_VCO, R, C1, and C2. Usually, f_ref and N are dictated by the application, and I and K_VCO are, at least partially, dictated by circuit design choices. This leaves the loop filter components as the main variables with which to set the desired PLL bandwidth, phase margin, and ΔΣ modulator quantization noise suppression. The process is demonstrated below for the ΔΣ fractional-N PLL presented in Section III to generate the local oscillator frequencies in a direct conversion Bluetooth wireless LAN transceiver. The PLL is shown in Figure 12 with additional detail regarding the frequency plan. As described previously, the desired output frequencies are f_VCO = 2.402 GHz + k MHz for k = 0, ..., 78, and the crystal reference frequency is 19.68 MHz. Each of the 79 possible output frequencies is chosen by selecting m and N as indicated in the figure. In each case, the divider modulus is restricted to the set of four integers {N - 1, N, N + 1, N + 2}. The combinations of m and N were chosen to achieve the desired output frequencies yet keep the signals at the input of the ΔΣ modulator sufficiently small so as not to overload the ΔΣ modulator [11].

Typical requirements for such a PLL are that the loop bandwidth must be greater than 40 kHz, the phase margin must be greater than 60°, and the PLL phase noise must be less than -120 dBc/Hz at offsets from the carrier of 3 MHz and above. Assume that the VCO, divider, PFD, and charge pump circuits have been designed such that the overall PLL phase noise specification can be met provided the phase noise contributed by the ΔΣ modulator and loop filter are each less than -130 dBc/Hz at offsets from the carrier of 3 MHz and above. Furthermore, assume that the VCO and charge pump circuits are such that K_VCO and I are 200 MHz/V and 200 µA, respectively, and that the loop filter has the form shown in Figure 10. Thus, the remaining design task is to choose the loop filter components such that the bandwidth, phase margin, and phase noise specifications are met.

The PLL phase margin, bandwidth, and phase noise arising from ΔΣ modulator quantization noise can be derived from the linearized model equations, (7) through (10). While this can be done directly, it involves the solution of third order equations which can be messy. Alternatively, approximate solutions of the equations can be derived that provide better intuition [21]. A particularly convenient set of approximate solutions are

$$PM \approx \tan^{-1}\!\left(\frac{b-1}{2\sqrt{b}}\right), \tag{11}$$

$$f_{BW} \approx \frac{I K_{VCO} R}{2\pi N}\cdot\frac{b-1}{b}, \tag{12}$$

[Equation (13): not recoverable from the source text.]

and

[Equation (14), an expression in dBc/Hz of the form 10·log[...] involving sin²(πf/f_ref): not recoverable from the source text.]

where PM is the phase margin of the PLL, f_BW is the 3 dB bandwidth of the PLL, and b = 1 + C2/C1 is a measure of the separation between the two loop filter capacitors [22]. The derivations assume that b is greater than about 10, and (14) is valid for frequencies greater than (C2 + C1)/(2πRC2C1).

These equations are sufficient to determine appropriate loop filter component values. For example, suppose b is set to
49, so, as indicated by (11), the phase margin is approximately 70°. Solving (14) with the phase noise set to -130 dBc/Hz at f = 3 MHz indicates that f_BW ≈ 50 kHz. Therefore, the phase noise resulting from ΔΣ modulator quantization noise is sufficiently suppressed with a 50 kHz bandwidth and a phase margin of 70°. With this information (12) can be solved to find R = 960 Ω, with which (13) and the definition of b can be used to calculate C2 = 23 nF and C1 = 480 pF. It is straightforward to verify that the phase noise introduced by the loop filter resistor (the only noise source in the loop filter) is well below -130 dBc/Hz at offsets from the carrier of 3 MHz and above as required.

Figure 13 shows PSD plots of the phase noise arising from ΔΣ modulator quantization noise for the example PLL with the loop filter component values derived above. The heavy curve was calculated directly from the linearized model equations (7) through (10). The light curve was obtained through a behavioral computer simulation of the PLL. As is evident from the figure, the two curves agree very well, which suggests that the approximations made in obtaining the linearized model are reasonable.

[Figure 13: Simulated and calculated PSD plots of the phase noise arising from ΔΣ modulator quantization noise for the example ΔΣ fractional-N PLL. Curves: "exact" simulation and linearized model; horizontal axis: frequency in Hz, from 10^5 to 10^8.]

An effect that does not have a counterpart in integer-N PLLs is the presence of zeros in the PSD of the phase noise arising from ΔΣ modulator quantization noise at multiples of the reference frequency. These zeros are a result of the discrete-to-continuous-time conversion of the ΔΣ modulator quantization noise; each zero is a sampling image of the dc zero imposed on the quantization noise by the ΔΣ modulator.

VI. ΔΣ FRACTIONAL-N PLL SPECIFIC PROBLEMS

One of the most significant problems specific to ΔΣ fractional-N PLLs is that they can be sensitive to modulus-dependent divider delays. In practice, each positive-going divider edge is separated from the VCO edge that triggered it by a propagation delay. Ideally, this propagation delay is independent of the corresponding divider modulus, in which case it introduces a constant phase offset but does not otherwise contribute to the phase noise. However, if the propagation delay depends upon the divider modulus and the number of ΔΣ modulator output levels is greater than two, the effect is that of a hard non-linearity applied to the ΔΣ modulator quantization noise. This tends to fold out-of-band ΔΣ modulator quantization noise to low frequencies and introduce spurious tones, which can significantly increase the PLL phase noise. The problem is analogous to that of multi-bit digital-to-analog converter step-size mismatches in analog ΔΣ data converters [23]. Unfortunately, circuit simulations are required to evaluate the severity of the problem on a case by case basis as both the extent of any modulus-dependent delays and their effect on the PLL phase noise are difficult to predict using hand analysis.

There are two well-known solutions to this problem. One solution is to resynchronize the divider output to the nearest VCO edge or at least a higher-frequency edge obtained from within the divider circuitry [22], [24]. The resynchronization erases memory of modulus-dependent delays and noise introduced within the divider circuitry, but care must be taken to ensure that the signal used for resynchronization is itself free of modulus-dependent delays. The primary drawback of the approach is that it increases power consumption.

The other solution is to use a ΔΣ modulator with single-bit (i.e., two-level) quantization. In this case, modulus-dependent delays give rise to phase error at the output of the divider that consists of a constant offset plus a scaled version of the ΔΣ modulator quantization noise. Since, by design, the ΔΣ modulator quantization noise has most of its power outside the PLL bandwidth, the modulus-dependent delays increase the phase noise only slightly. Unfortunately, ΔΣ modulators with single-bit quantization tend not to perform as well as ΔΣ modulators with multi-bit (i.e., more than two-level) quantization. For example, if the 9-level quantizer in the 48 Msample/s ΔΣ modulator example presented in Section IV were replaced by a one-bit quantizer, the dynamic range of the ΔΣ modulator in the zero to 500 kHz band would be reduced from 88.5 dB to approximately 65 dB. Moreover, unlike the 9-level quantizer case, the additive noise from the single-bit quantizer would not be white and would be correlated with the input sequence. Its variance would be input dependent and it would contain spurious tones.

These problems can be mitigated by using a higher-order ΔΣ modulator architecture to more aggressively suppress the in-band portion of the additive noise from the two-level quantizer. However, to maintain stability in a higher-order ΔΣ modulator with single-bit quantization, the useful input range of the ΔΣ modulator input signal must be reduced and more poles and zeros must be introduced within the feedback loop as compared to a multi-bit design with a comparable dynamic range. Even then, the problem of spurious tones persists, and it is difficult to predict where they will appear except through extensive simulation. Furthermore, to compensate for the restricted input range of the ΔΣ modulator the reference frequency must be large enough that all of the desired PLL output frequencies can be achieved. This can severely limit design flexibility. For example, if the magnitude of the ΔΣ modulator input signal were limited to less than 0.5 in the case
of the Bluetooth local oscillator application considered above, the reference frequency would have to be greater than 79 MHz. Otherwise, it would not be possible to generate all the Bluetooth channel frequencies.

Another issue specific to ΔΣ fractional-N PLLs is that modulus switching increases the average duration over which the charge pump current sources are turned on each period relative to integer-N PLLs. For comparison, consider a ΔΣ fractional-N PLL and an integer-N PLL with the same N (where N ≫ α), the same f_ref, and identical loop components. It follows from (5) that

[Equation (15): not recoverable from the source text.]

The last term in (15), which is caused by having the ΔΣ modulator switch the divider modulus, represents a significant increase in the time during which the charge pump current sources are turned on each reference period. Consequently, the phase noise arising just from charge pump current source noise is larger in the ΔΣ fractional-N PLL by

$$A \log_{10}\!\left[\frac{\text{Average fractional-}N\text{ PLL charge pump ``on time''}}{\text{Average integer-}N\text{ PLL charge pump ``on time''}}\right]\ \text{dB},$$

where A is a constant between 10 and 20. The value of A depends upon the autocorrelation of the charge pump current source noise. For example, if the current source noise in successive charge pump pulses is completely uncorrelated, then A is 10. Near the other extreme, A is close to 20.

VII. TECHNIQUES TO WIDEN ΔΣ FRACTIONAL-N PLL LOOP BANDWIDTHS

A transmitter with virtually any modulation format can be implemented using D/A conversion to generate analog baseband or IF signals and upconversion to generate the final RF signal. However, many of the commonly used modulation formats in wireless communication systems such as MSK and FSK involve only frequency or phase modulation of a single carrier [25]. In such cases, the transmitted signal can be generated by modulating a radio frequency (RF) VCO, thereby eliminating the need for conventional upconversion stages and much of the attendant analog filtering. At least two approaches have been successfully implemented in commercial wireless transmitters to date. One is based on open-loop VCO modulation, and the other is based on ΔΣ fractional-N synthesis.

An example of a commercial transmitter that uses the open-loop VCO modulation technique is presented in [26] and [27], in this case for a DECT cordless telephone. Between transmit bursts, the desired center frequency is set relative to a reference frequency by enclosing the VCO within a conventional PLL. During each transmit burst the VCO is switched out of the PLL and the desired frequency modulation is applied directly to its input. The primary limitation of the approach is that it tends to be highly sensitive to noise and interference from other circuits. For example, in [27], the required level of isolation precluded the implementation of a single-chip transmitter. Furthermore, the modulation index of the transmitted signal depends upon the absolute tolerances of the VCO components, which are often difficult to control in low-cost VLSI technologies and can also drift rapidly over time.

In principle, ΔΣ fractional-N PLLs can avoid these problems by modulating the VCO within the PLL. This can be done by driving the input of the digital ΔΣ modulator with the desired frequency modulation of the transmitted signal. The primary limitation is that the bandwidth of the PLL must be narrow enough that the quantization noise from the ΔΣ modulator is sufficiently attenuated, but sufficiently high to allow for the modulation. For instance, the phase noise PSD of the example ΔΣ fractional-N PLL shown in Figure 5 with a 50 kHz loop bandwidth meets the necessary phase noise specifications when used as a local oscillator in a conventional upconversion stage within a Bluetooth wireless LAN transmitter. However, if the Bluetooth transmitter is to be implemented by modulating the VCO through the digital ΔΣ modulator, then the loop bandwidth of the PLL must be approximately 500 kHz. Unfortunately, when the loop bandwidth of the fractional-N PLL shown in Figure 5 is widened to 500 kHz, the resulting phase noise becomes too large to meet the Bluetooth transmit requirements.

Nevertheless, commercial transmitters with VCO modulation through ΔΣ fractional-N synthesizers are beginning to be deployed, especially in low-performance, low-cost wireless systems such as Bluetooth wireless LANs [28]. Facilitating this trend are various solutions that have been devised in recent years to allow for wideband VCO modulation in ΔΣ fractional-N PLLs without incurring the phase noise penalty mentioned above. One of the solutions is to keep the loop bandwidth relatively low, but pre-emphasize (i.e., highpass filter) the digital phase modulation signal prior to the digital ΔΣ modulator [29]. Unfortunately, this approach requires the highpass response of the digital pre-emphasis filter to be a reasonably close match to the inverse of the closed-loop filtering imposed by the largely analog PLL. Another of the solutions is to use a high-order loop filter in the PLL with a sharp lowpass response [30]. Increasing the order of the loop filter increases the attenuation of out-of-band quantization noise, which allows for higher-order ΔΣ modulation to reduce in-band quantization noise, thereby allowing the loop bandwidth to be increased without increasing the total phase noise. However, as described in [30], this necessitates the use of a Type 1 PLL, which significantly complicates the design of the phase detector. Yet another solution is to use a narrow loop bandwidth but modulate the VCO both through the digital ΔΣ modulator and through an auxiliary modulation port at the VCO input [28]. The idea is to apply the low-frequency modulation components at the ΔΣ modulator input and the high-frequency modulation components directly to the VCO. Again, matching is an issue, but it has proven to be manageable at least for low-end applications such as Bluetooth transceivers.

VIII. CONCLUSION

The additional concepts and issues associated with ΔΣ
fractional-N PLLs for frequency synthesis relative to integer-N PLLs have been presented. It has been shown that ΔΣ fractional-N PLLs provide tuning resolution limited only by digital logic complexity, and, in contrast to integer-N PLLs, increased tuning resolution does not come at the expense of reduced bandwidth. Since one of the main innovations in a ΔΣ fractional-N PLL is the use of a ΔΣ modulator to control the divider modulus, the relevant concepts underlying ΔΣ modulation have been described in detail. A linearized model has been derived from first principles and a design example has been presented to illustrate how the model is used in practice. Techniques for wideband digital modulation of the VCO within a delta-sigma fractional-N PLL have also been presented.

ACKNOWLEDGEMENTS

The author is grateful to Sudhakar Pamarti, Eric Siragusa, and Ashok Swaminathan for their helpful discussions and advice regarding this paper.

REFERENCES

1. F. M. Gardner, "Charge-pump phase-lock loops," IEEE Transactions on Communications, vol. COM-28, pp. 1849-1858, November 1980.
2. B. Razavi, Design of Analog CMOS Integrated Circuits, McGraw Hill, 2001.
3. Bluetooth Wireless LAN Specification, Version 1.0, 2000.
4. U. L. Rohde, Microwave and Wireless Synthesizers Theory and Design, John Wiley & Sons, 1997.
5. B. Miller, B. Conley, "A multiple modulator fractional divider," Annual IEEE Symposium on Frequency Control, vol. 44, pp. 559-568, March 1990.
6. B. Miller, B. Conley, "A multiple modulator fractional divider," IEEE Transactions on Instrumentation and Measurement, vol. 40, no. 3, pp. 578-583, June 1991.
7. T. A. Riley, M. A. Copeland, T. A. Kwasniewski, "Delta-sigma modulation in fractional-N frequency synthesis," IEEE Journal of Solid-State Circuits, vol. 28, no. 5, pp. 553-559, May 1993.
8. S. K. Tewksbury, R. W. Hallock, "Oversampled, linear predictive and noise-shaping coders of order N > 1," IEEE Transactions on Circuits and Systems, vol. CAS-25, pp. 436-447, July 1978.
9. G. Lainey, R. Saintlaurens, P. Serin, "Switched-capacitor second-order noise-shaping coder," IEE Electronics Letters, vol. 19, pp. 149-150, February 1983.
10. I. Galton, "Granular quantization noise in a class of delta-sigma modulators," IEEE Transactions on Information Theory, vol. 40, no. 3, pp. 848-859, May 1994.
ory, vol. 38, no. 3, pp. 1015-1028, May 1992.
12. I. Galton, "One-bit dithering in delta-sigma modulator-based D/A conversion," Proc. of the IEEE International Symposium on Circuits and Systems, 1993.
13. S. W. Golomb, Shift Register Sequences. Laguna Hills, CA: Aegean Park Press, 1982.
14. E. J. McCluskey, Logic Design Principles. Englewood Cliffs, NJ: Prentice-Hall, 1986.
15. S. K. Tewksbury, R. W. Hallock, "Oversampled, linear predictive and noise-shaping coders of order N > 1," IEEE Transactions on Circuits and Systems, vol. CAS-25, pp. 436-447, July 1978.
16. W. Rhee, B. S. Song, A. Ali, "A 1.1-GHz CMOS fractional-N frequency synthesizer with a 3-b third-order ΔΣ modulator," IEEE Journal of Solid-State Circuits, vol. 35, no. 10, pp. 1453-1460, October 2000.
17. W. L. Lee, C. G. Sodini, "A topology for higher order interpolative coders," Proceedings of the 1987 IEEE International Symposium on Circuits and Systems, vol. 2, pp. 459-462, May 1987.
18. K. C.-H. Chao, S. Nadeem, W. L. Lee, C. G. Sodini, "A higher order topology for interpolative modulators for oversampling A/D converters," IEEE Transactions on Circuits and Systems, vol. 37, no. 3, pp. 309-318, March 1990.
19. Y. Matsuya, K. Uchimura, A. Iwata, T. Kobayashi, M. Ishikawa, T. Yoshitome, "A 16-bit oversampling A-to-D conversion technology using triple integration noise shaping," IEEE Journal of Solid-State Circuits, vol. SC-22, pp. 921-929, December 1987.
20. K. Uchimura, T. Hayashi, T. Kimura, A. Iwata, "Oversampling A-to-D and D-to-A converters with multistage noise shaping modulators," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-36, pp. 1899-1905, December 1988.
21. J. Craninckx, M. S. J. Steyaert, "A fully integrated CMOS DCS-1800 frequency synthesizer," IEEE Journal of Solid-State Circuits, vol. 33, pp. 2054-2065, December 1998.
22. S. Pamarti, "Techniques for Wideband Fractional-N Phase-Locked Loops," PhD Dissertation, University of California, San Diego, 2003.
23. S. R. Norsworthy, R. Schreier, G. C. Temes, Eds., Delta-Sigma Data Converters, Theory, Design, and Simulation, New York: IEEE Press, 1997.
24. L. Lin, L. Tee, P. R. Gray, "A 1.4 GHz differential low-noise CMOS frequency synthesizer using a wideband PLL architecture," IEEE ISSCC Digest of Technical Papers, pp. 204-205, Feb. 2000.
25. J. G. Proakis, Digital Communications, fourth ed., McGraw Hill, 2000.
26. S. Heinen, S. Beyer, J. Fenk, "A 3.0 V 2 GHz transmitter
11. N. He, F. Kuhlmann, A. Buzo, "Multiloop sigma-delta IC for digital radio communication with integrated
quantization," IEEE Transactions on Information The- VCO's," Digest of Technical Papers, IEEE International
Solid-State Circuits Conference, vol. 38, pp. 150-151,
32
Feb. 1995. mW CMOS fractional-N synthesizer using digital com-
27. S. Heinen, K. Hadjizada, U. Matter, W. Geppert, V. pensation for 2.5-Mb/s GFSK modulation," IEEE Jour-
Thomas, S. Weber, S. Beyer, J. Fenk, E. Matshke, "A 2.7 nal of Solid-State Circuits, vol. 32, no. 12, pp. 2048-
V 2.5 GHz bipolar chipset for digital wireless communi- 2059, Dec. 1997.
cation," Digest of Technical Papers, IEEE International 30. S. Willingham, M. Perrott, B. Setterberg, A. Grzegorek,
Solid-State Circuits Conference, vol. 40, pp. 306-307, B. McFarland, "An integrated 2.5GHz LA frequency
Feb. 1997. synthesizer with 5 ns settling and 2Mb/s closed loop
28. N. Filiol, et. al., "A 22 mW Bluetooth RF transceiver modulation," Digest of Technical Papers, IEEE Interna-
with direct RF modulation and on-chip IF filtering," Di- tional Solid-State Circuits Conference, vol. 43, pp. 200-
gest of Technical Papers, IEEE International Solid-State 201, Feb. 2000.
Circuits Conference, vol. 43, pp. 202-203, Feb. 2001.
29. M. H. Perrott, T. L. Tewksbury III, C. G. Sodini, "A 27-
33
Designing Bang-Bang PLLs for Clock and Data
Recovery in Serial Data Transmission Systems
Richard C. Walker

Abstract - Clock recovery using phase-locked loops (PLL) with binary (bang-bang) or ternary-quantized phase detectors has become increasingly common starting with the advent of fully monolithic clock and data recovery (CDR) circuits in the late 1980s. Bang-bang CDR circuits have the unique advantages of inherent sampling phase alignment, adaptability to multi-phase sampling structures, and operation at the highest speed at which a process can make a working flip-flop. This paper gives insight into the behavior of the nonlinear bang-bang PLL loop dynamics, giving approximate equations for loop jitter, recovered clock spectrum, and jitter tracking performance as a function of various design parameters. A novel analysis shows that the bang-bang loop output jitter grows as the square-root of the input jitter as contrasted with the linear dependence of the linear PLL.

I. INTRODUCTION

Prior to the advent of fully monolithic designs, clock recovery was traditionally performed with some variant of the circuit in Fig. 1. The clock frequency component was typically extracted from [...] coaxial delay lines for setting the timing of the recovered sampling clock with respect to the data eye [1].

Fig. 1. Traditional non-monolithic clock and data recovery architecture.

Early monolithic CDR designs imitated these discrete block diagrams. The propagation delay differences between data and clock paths could be ignored as long as the gate delay skew was a negligible fraction of the total bit time, or unit interval. The need for higher link speeds grew faster than Moore's law, and as clock frequencies approached the effective $f_T$ of the active devices, it became increasingly difficult to maintain an optimum sampling phase alignment between the recovered clock and the data over process, temperature, data-rate, and voltage variations.

A second problem was that most linear phase detectors produced narrow pulses with widths proportional to the phase error between the timing of the data and the clock [2], [3]. These narrow pulses required a process speed in excess of that required to simply sample data at a given rate. The timing skew and speed of linear phase detector circuits then became the limiting factor for aggressive designs.

Both these difficulties are eliminated by a family of circuits which simultaneously retime data and measure phase error by using matched flip-flops to sample both the middle of each data bit and the transitions between the data bits. Fig. 2 shows such an [...]
[Figures: data frame structure (training sequence, control word, 16 data words); comparison plot of published designs, legend: Linear PLL, BBPLL]
and the introduction of a ternary hold mode are considered in a later section.

The first-order BB PLL of Fig. 5 can be rendered into a block diagram for analysis as shown in Fig. 6. The loop phase error [...] $\epsilon_n \in \{\pm 1\}$. (In the case of a ternary data-driven phase detector, $\epsilon_n$ may be set to 0 when it is not possible to make a determination of phase error due to consecutive identical bits in the data stream. The consequence of this "hold" state is treated in a later section.) The error signal drives the VCO through an attenuator $\beta$ to produce a change in frequency of $f_{bb} = \beta K_{vco}$. From time $t_n$ until time $t_{n+1}$, the VCO operates at one of the two frequencies given by $f_{nom} + \epsilon_n f_{bb}$.

Fig. 6. Block diagram of first order loop showing definition of signal [...]

Because the VCO frequency changes on each cycle, the system has non-uniform sampling times. The time of phase sample $n+1$ is

$$t_{n+1} = t_n + \frac{1}{f_{nom} + \epsilon_n f_{bb}}$$

In a typical CDR, $f_{bb}$ is on the order of 0.1% of $f_{nom}$, so that an analysis assuming uniform time steps of $t_{update} = 1/f_{nom}$ is sufficiently accurate for most purposes. However, for loop analyses requiring exact charge pump balance, such as wide-range loop pull-in without a frequency detector, these non-uniform sampling times must be accounted for.

With the uniform time step approximation, the VCO phase changes up or down (or "walks off") by

$$\theta_{bb} = 2\pi (f_{bb}/f_{nom}) \text{ radians}$$

during each update period.

In summary, the first order loop obeys a simple set of discrete time difference equations:

$$\theta_d(t_n) = \theta_d(0) + 2\pi\,\delta f\, t_n + \phi(t_n)$$

$$\epsilon_n = \mathrm{sign}[\theta_d(t_n) - \theta_v(t_n)]$$

$$\theta_v(t_{n+1}) = \theta_v(t_n) + \epsilon_n \theta_{bb} \quad (2)$$

Fig. 7. Simulated response of first-order PLL to a range of input frequencies (horizontal axis: time in µseconds; regions marked out of lock, in lock, out of lock).

The phase detector duty cycle, and therefore its average output voltage, are proportional to the loop frequency error. Fig. 7 shows a simulated loop with a range of input frequencies. The loop is "locked" whenever the input frequency is bracketed by the two VCO frequencies. The rapid alternation between frequencies slightly too high and slightly too low creates a bounded hunting jitter ($J_{pp}$).
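The first-order difference equations can be iterated directly to see both behaviors described here: bounded hunting jitter with a decision duty cycle proportional to the frequency error while in lock, and unbounded phase walk-off when the input frequency is outside the range bracketed by the two VCO frequencies. The following is a minimal sketch under the uniform time step approximation; the function name and the normalized parameter `delta_f_norm` (input frequency offset as a fraction of $f_{nom}$) are illustrative assumptions, and input jitter $\phi(t_n)$ is taken as zero.

```python
import math


def simulate_first_order(n_steps, theta_bb, delta_f_norm=0.0):
    """Iterate the first-order bang-bang loop with uniform time steps.

    theta_bb     : VCO phase step per update, in radians
    delta_f_norm : input frequency offset divided by f_nom (assumed name),
                   so the data phase advances 2*pi*delta_f_norm per update
    Returns the phase error and the +/-1 decision at every update.
    """
    theta_v = 0.0
    errors, decisions = [], []
    for n in range(n_steps):
        theta_d = 2.0 * math.pi * delta_f_norm * n   # data phase, no jitter
        err = theta_d - theta_v
        eps = 1 if err >= 0 else -1                  # binary phase detector
        errors.append(err)
        decisions.append(eps)
        theta_v += eps * theta_bb                    # VCO walks up or down
    return errors, decisions


if __name__ == "__main__":
    f_bb_norm = 1e-3                                 # f_bb = 0.1% of f_nom
    theta_bb = 2.0 * math.pi * f_bb_norm
    for offset in (0.5e-3, 2e-3):                    # inside / outside lock range
        err, eps = simulate_first_order(4000, theta_bb, offset)
        print("offset/f_bb =", offset / f_bb_norm,
              "mean decision =", sum(eps) / len(eps),
              "peak error in theta_bb units =",
              max(abs(e) for e in err) / theta_bb)
```

In lock, the mean decision settles at the ratio of the frequency offset to $f_{bb}$, which is the duty-cycle property the second-order loop exploits.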
The derivative of the input data phase deviation, $d[\phi(t)]/dt$, adds to the frequency error that must be tolerated by the loop. Assuming $\delta f = 0$, then for $\phi(t) = A\sin(2\pi f_{mod} t)$, the maximum amplitude $A$ of phase modulation at frequency $f_{mod}$ before onset of slew-rate limiting is $f_{bb}/f_{mod}$. Fig. 8 demonstrates [...] frequency deviation to exceed $\pm f_{bb}$. The loop stops toggling and goes into slew rate limiting, leading to a transient phase error.

A. Summary of First-Order Loop

The first-order bang-bang loop has only one degree of freedom. Jitter generation, lock range, and jitter tolerance are all inconveniently controlled by one parameter, $f_{bb}$. This situation can be improved by using a second control loop to dynamically adjust the nominal VCO frequency $f_{nom}$ to be equal to the incoming data frequency. Because the phase detector duty cycle is proportional to the loop frequency error, this dynamic centering of VCO frequency can be accomplished by adjusting the VCO center frequency in a feedback loop to drive the phase detector duty cycle $C$ to 50%. This decouples the lock range from jitter tolerance and jitter generation, giving more design freedom.

[Fig. 9: loop schematic with proportional (BB) branch, integral branch, and VCO]

[...] ratio of these two is the stability factor of the loop

$$\xi = \frac{\Delta\theta_{proportional}}{\Delta\theta_{integral}} = \frac{2\beta\tau}{t_{update}}$$

The reader should be careful not to confuse the bang-bang loop stability factor $\xi$ with the linear loop damping factor $\zeta$ [5].

The discrete time difference equations for the second-order loop can be written as

$$\theta_d(t_n) = \theta_d(0) + 2\pi\,\delta f\, t_n + \phi(t_n)$$

$$\epsilon_n = \mathrm{sign}[\theta_d(t_n) - \theta_v(t_n)] \quad (3)$$

$$\theta_v(t_{n+1}) = \theta_v(t_n) + \theta_{bb}\left(\epsilon_n + \frac{\epsilon_n}{\xi} + \frac{2}{\xi}\sum_{i=0}^{n-1}\epsilon_i\right) \quad (4)$$
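A short simulation can illustrate how the integral branch recenters the VCO so that the phase detector duty cycle returns to 50%. The sketch below works in normalized units (phase in units of $\theta_{bb}$, time in update periods) and rests on an assumption about the update rule: each decision moves the phase by one proportional step plus $1/\xi$ from the integral ramp during that update, and every earlier decision contributes $2/\xi$ per update through the accumulated integral frequency. The function and variable names are illustrative, not taken from the paper.

```python
def simulate_second_order(n_steps, xi, freq_step):
    """Second-order bang-bang loop in normalized units.

    xi        : stability factor (proportional step / integral step)
    freq_step : input frequency offset in units of f_bb, i.e. the data
                phase advances freq_step * theta_bb per update
    Returns phase error, +/-1 decisions, and the integral-branch frequency
    (in units of f_bb) at every update.
    """
    theta_v = 0.0
    acc = 0                          # running sum of past decisions
    errors, decisions, f_int_trace = [], [], []
    for n in range(n_steps):
        err = freq_step * n - theta_v
        eps = 1 if err >= 0 else -1
        f_int = 2.0 * acc / xi       # frequency stored on the integrator
        errors.append(err)
        decisions.append(eps)
        f_int_trace.append(f_int)
        # proportional step + integral ramp this update + stored frequency
        theta_v += eps * (1.0 + 1.0 / xi) + f_int
        acc += eps
    return errors, decisions, f_int_trace


if __name__ == "__main__":
    err, eps, f_int = simulate_second_order(3000, xi=100.0, freq_step=0.5)
    print("peak |error| / theta_bb:", max(abs(e) for e in err))
    print("final integral frequency / f_bb:", sum(f_int[-200:]) / 200)
    print("late-time mean decision:", sum(eps[-1000:]) / 1000)
```

With a frequency step smaller than $f_{bb}$, the phase error stays within a couple of phase steps throughout, while the integral frequency slowly converges on the input offset and the mean decision returns toward zero.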
B. Simulations of Second-Order Loop

Fig. 10 shows two block diagrams for the second-order loop. The upper diagram is a straightforward translation of the schematic in Fig. 9. The lower diagram is a topological re-arrangement [...]

Fig. 10. Two equivalent second-order bang-bang loop block diagrams. The proportional phase-control signal flow is highlighted with a dashed line, and the integral frequency-control loop with a solid line.

Fig. 11. Second-order loop response to instantaneous frequency step smaller than $f_{bb}$.

Fig. 11 shows the second-order loop responding to a step change in input frequency $f_{in}$, producing a slow response $f_{int}$ in the outer integral loop. The resulting phase error $\Delta\theta_1$ is tracked by the inner bang-bang loop $\theta_v$ to produce the final sampler phase error $\theta_e$. Notice that, unlike linear PLLs, if the power-supply noise-induced VCO frequency modulation is limited to $\pm f_{bb}$, then there is no jitter accumulation or phase transient at the sampling flip-flop.

Fig. 12. Second-order loop response to instantaneous frequency step larger than $f_{bb}$.

Fig. 12 is a simulation in which the input frequency step is bigger than $f_{bb}$, so the loop goes into slew rate limiting, leading to a [...]

[...] with $n = t/t_{update}$. The time of the first zero crossing approaches $\Delta$ as $\xi \to \infty$, consistent with a first-order loop. In general, the second-order loop is quicker to reach zero phase error than the first-order loop, but pays for this with an oscillatory overshoot. As a conservative rule of thumb, the magnitude of the oscillatory transient of a second-order step response can be considered bounded by the simple linear transient of the first-order loop. The time required to reach steady state, given a step of $\Delta$, is always less than or equal to $\Delta$ timesteps, independent of $\xi$.
Fig. 13. Noise-free loop response to a phase step with stability factor $\xi$ as a parameter (horizontal axis: time / $t_{update}$).

Fig. 15. Second-order loop response to large sinusoidal input jitter (horizontal axis: time in µseconds).
Many systems, such as SONET, specify jitter tolerance in the form of a sinusoidal jitter at various frequencies.

Fig. 14. Second-order loop response to sinusoidal input jitter (horizontal axis: time in µseconds).

Fig. 14 shows the loop response with a sinusoidal input phase jitter $\phi(t)$. The outer integral loop tracks the input jitter at $\Delta\theta_1$ with a slight phase lag. The resulting phase error $\Delta\theta_2$ is tracked by the inner bang-bang loop $\theta_v$ to produce the final sampler phase error $\theta_e$. The duty-cycle of the PD output varies with the slope of $\Delta\theta_2$, which is proportional to the instantaneous frequency error of the outer loop.

In Fig. 15, the phase modulation is increased until the instantaneous frequency error exceeds the inner loop's ability to track. Slew-rate limiting produces a tracking error at the sampler $\theta_e$. A CDR would normally be designed such that slewing would never occur for any valid signal allowed by a particular standard. The next two sections develop an analytic expression for slope overload so that a loop can be easily designed to never slew for signals meeting a typical frequency-domain jitter tolerance specification.

IV. SLOPE OVERLOAD

A. Delta-Sigma Analogy

Before developing an analytic equation for slope overload, it is helpful to introduce a further rearrangement of block diagram II from Fig. 10. Fig. 16 transforms the loop by pulling two integrators through the last summing node prior to the quantizer. The update time interval is set to 1. The definition for bang-bang frequency step, $f_{bb} = \beta K_v V_\phi$, and stability factor, $\xi = 2\beta\tau/t_{update}$, are also substituted in.

Fig. 16. Redrawing of the loop to show inner ΔΣ modulator operating on the loop frequency error.

The shaded area in Fig. 16 shows how the proportional feedback loop can be thought of as an inner ΔΣ modulator producing a phase detector duty cycle proportional to the VCO frequency error [6], [7].

Fig. 17 summarizes an analysis of the first order delta-sigma (after [8]). When the loop is not in slew rate limiting, or in a periodic limit-cycle, the quantizer (e.g., PD) can be replaced with a unity gain element and a noise source $Q(z)$ with the same $A\sin(2\pi f t_{update})/(2\pi f t_{update})$ noise characteristics as a random binary bitstream. Both these constraints are met in practice as the VCO phase noise is sufficient to eliminate any deterministic limit cycles, and the loop is designed to never slew rate limit on any conforming input signal. This insight is critical as
it allows linear analysis to be applied whenever the bang-bang loop is not in slew rate limiting.

Fig. 17. Simplified analysis of delta-sigma circuit. With input $X(z)$, loop filter $H(z)$ (integration), and quantization noise $Q(z)$, the output is $Y(z) = \frac{H(z)}{1+H(z)}X(z) + \frac{1}{1+H(z)}Q(z)$.

B. Expression for Slope Overload

A closed-form analysis of slope overload can now be derived. Referring to Fig. 16, the system slews when $|\Delta F| > f_{bb}$. Assuming no slew rate limiting, we can use the results from the ΔΣ analysis to justify replacing the loop quantizer with a unity gain element. The maximum input phase jitter in UI as a function of frequency, $\Theta_j(s)$, normalized to $\theta_{bb}$, can then be calculated using Laplace transforms.

We want to find an input excitation $F(s)$ for which $|\Delta F| = f_{bb}$ at all frequencies. The inner ΔΣ of Fig. 16 has a linearized transfer function of $1/(s + f_{bb})$. Using standard feedback loop theory, the expression for $\Delta F$ can then be written as

$$\Delta F = \frac{F}{1 + \left(\frac{2 f_{bb}}{\xi s}\right)\left(\frac{1}{s + f_{bb}}\right)}$$

[...] maximum normalized input phase as a function of normalized frequency

$$\left|\frac{\Theta_j^{max}(s)}{\theta_{bb}}\right| = \left|\frac{s^2 + s + 2/\xi}{s^3 + s^2}\right|$$

This is a curious bootstrapped analysis, in that it assumes a lack of slewing to justify the linearization which permits the computation of the onset of slew rate limiting.

Fig. 18 shows a good agreement between this expression and simulated loop performance in which slewing is defined as a contiguous sequence of ten or more identical phase-error indications. This expression can be used to design a loop for a given jitter tolerance. The tolerance plots are single-pole slope for high $\xi$ and high jitter frequency, becoming double-pole at lower frequencies and small $\xi$. At high frequencies, all of the curves become asymptotic to the single-pole tolerance of a first-order bang-bang PLL. The operating region below each of these curves is where the ΔΣ approximation is valid, and where a linear loop analysis is justified.

Fig. 18. Normalized amplitude of sinusoidal jitter just sufficient to cause slope overload as a function of normalized jitter frequency and with $\xi$ as a parameter. Points shown are from numerical simulation.

Fig. 19. Loop redrawn replacing phase detector with unity gain element and additive quantization noise. (Noise inputs: source phase noise; BB phase noise of form $A\sin(x)/x$; VCO open loop phase noise.)
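The delta-sigma view of the inner loop can be checked with a generic first-order, 1-bit modulator: it integrates the difference between its input and its own ±1 output, and the average output tracks the input as long as the input magnitude stays below the feedback level. The sketch below is a textbook modulator, not the paper's loop; reading the input as the frequency error normalized to $f_{bb}$ is an assumption made for illustration, under which an input magnitude above 1 corresponds to slope overload.

```python
def first_order_delta_sigma(x):
    """Generic 1-bit first-order delta-sigma modulator.

    x : sequence of input samples, normalized so that +/-1 is full scale
        (read here as frequency error divided by f_bb, an assumption)
    Returns the +/-1 output stream.
    """
    acc = 0.0                              # integrator state
    y = []
    for xn in x:
        out = 1.0 if acc >= 0 else -1.0    # 1-bit quantizer
        y.append(out)
        acc += xn - out                    # integrate input minus feedback
    return y


if __name__ == "__main__":
    in_range = first_order_delta_sigma([0.3] * 10000)
    print("mean output for input 0.3:", sum(in_range) / len(in_range))
    overload = first_order_delta_sigma([1.5] * 100)
    print("all outputs +1 when overloaded:", all(v == 1.0 for v in overload))
```

Below overload the integrator state stays bounded, which is what justifies replacing the quantizer with unity gain plus additive noise; once the input exceeds the feedback level the output saturates and the integrator runs away, which is the slewing condition.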
V. JITTER GENERATION

With these insights, it is possible to accurately predict the loop jitter generation in the frequency domain. Fig. 19 is a redrawing of the loop replacing the phase detector by a unity gain element, and [...] generally taken to be the spectrum of the clock driving the data source or BERT, or in the case of a clock multiplying circuit, the spectrum of the reference clock corrected by 20 times the log of the loop frequency multiplication ratio.

Fig. 20. Example computation of loop jitter generation spectrum with parameters from [11]. (Traces: source phase noise, VCO phase noise, BB phase noise, total; computed phase noise compared with measured phase noise; horizontal axis 1k to 1G.)

The phase noise power is given by

$$S_{RMS} = \int_0^{f_{max}} [\ldots]\, df$$

The RMS jitter in unit intervals is then

$$J_{RMS} = \mathrm{atan}\left(\sqrt{S_{RMS}}\right)/\pi$$

It should be noted that the linearized loop model is only suitable for computation of the jitter spectrum but not for computing the actual sampling point phase error or other time-domain transient response. The linearized response only covers the dynamics of the outer frequency tracking loop, but does not capture the extra tracking of the internal nonlinear ΔΣ core.

VI. GAUSSIAN INPUT NOISE

Fig. 21 is a plot of output jitter vs input jitter with $\xi$ as a parameter. For convenience, all jitter sigmas are normalized to $\theta_{bb}$, the loop phase step size. The total loop output jitter can be approximated by three regions of operation:

$$J_{total} = J_{idle} + J_{lin} + J_{walk}$$

In Region I, the output jitter is independent of input jitter $\sigma_j$. This occurs when the self-generated hunting jitter exceeds the input jitter. The RMS jitter in this region is empirically determined to be well approximated by $J_{idle} \approx [\ldots] + (1.65/\xi)$.

In Region II, the output jitter is proportional to the input jitter. This occurs when the input jitter is so high that, for a given $\xi$, the bang-bang dynamic is unable to control the second-order portion of the loop. This leads to large quadratic trajectories in the phase domain, causing the loop phase to "hunt" towards the limits of the input jitter distribution. As the loop phase nears the limits of the input jitter distribution, the bang-bang hunting has more effect on stabilizing the second-order loop. In this region, the output jitter is proportional to the input jitter: $J_{lin} \approx 2\sigma_j/(1 + \xi)$.

In Region III, the output RMS jitter $J_{walk}$ is approximately equal to $0.7\sqrt{\sigma_j}$. This surprising result says that loops with large $\xi$ have output jitter which grows as the square root of the input jitter. Contrast this with a linear PLL which simply low-pass filters the input jitter and thus has an output jitter which grows linearly with the input jitter.

An approximate analysis of loop jitter can shed light on this curious square-root dependence of output jitter on input jitter. Assume a zero-mean input jitter distribution with a sigma $\sigma_j$. Using a linearized approximation to the standard probability distribution function, the probability of getting an "early" phase error indication for small loop phase deviation $\Delta\theta$ is approximately
$$P_{early} \approx \frac{1}{2} + \frac{\Delta\theta}{\sigma_j\sqrt{2\pi}}$$

The expected phase change in the loop after one update time is

$$\theta_{bb}\left((1 - P_{early}) - P_{early}\right) = -\frac{2\,\Delta\theta}{\sigma_j\sqrt{2\pi}}\,\theta_{bb}$$

The discrete time equation for the average evolution of loop phase under the condition of a small input phase error can then be expressed as

$$\Delta\theta_{n+1} = \Delta\theta_n\left(1 - \frac{2\theta_{bb}}{\sigma_j\sqrt{2\pi}}\right)$$

This equation has the same form as a discrete time approximation to the capacitor voltage in an RC lowpass filter. By analogy, when time is expressed in units of loop update times, any transient phase error in the bang-bang loop can then be said to decay to zero with a time constant of $\tau = \sigma_j\sqrt{2\pi}/(2\theta_{bb})$.

This "lowpass" loop characteristic is being driven by random energy from the early/late phase detector output. A related problem is the computation of the baseline wander voltage generated by passing a random NRZ data stream through a coupling capacitor. It can be shown that the sigma on the capacitor voltage is given by $\sigma_{BLW} = V_{pp}\sqrt{t_{bit}/(8\tau)}$. Extending this analogy to the loop, we can consider the output of the phase detector as a 50% duty-cycle random NRZ data stream. Given that the output from each "bit" must cause a loop phase change of $\theta_{bb}$, we can compute that the effective $V_{pp}$ to satisfy our loop difference equation must be $\sqrt{2\pi}\,\sigma_j$. We can then compute the loop jitter by using the analogous baseline wander expression with the effective loop $V_{pp}$ and $\tau$. The result is

$$\frac{1}{2}\sqrt{\theta_{bb}\,\sigma_j\sqrt{2\pi}}$$

VII. DATA-DRIVEN PHASE DETECTORS

Unless the data contains a guaranteed periodic transition, the CDR will be required to lock onto random transitions embedded in the data stream. The effects of runlength and transition density on loop performance must then be considered. The effect of these two data attributes is dependent on the type of phase detector used.

Most modern codes use some variation of Alexander's phase detector [9] shown in Fig. 22. Two matched flip-flops form the front-end of Alexander's phase detector, with the first flip-flop driven on the rising edge of the 50% duty-cycle clock, and the second flip-flop driven on the falling edge of the same clock. (Using a fully-differential monolithic ring-oscillator, it is possible to achieve a very precise 50% duty-cycle clock source.) When the loop is locked, the rising-edge retiming flip-flop samples the center of each data bit and produces a retimed data bit at (A) and the following retimed bit at (B). The falling-edge flip-flop functions as a phase detector by sampling the transition (T) between the data bits (A,B). To improve the circuit's operating speed, the (T) sample is delayed an extra half bit time by a latch so that the logic on (A,T,B) has a full bit time for resolution.

Fig. 22. Modified form of Alexander's ternary-quantized phase detector for NRZ data along with a typical charge pump for driving the VCO tuning input.

The transition sample is then compared to the surrounding data bits to determine whether the clock sampling phase is early or late to derive a binary-quantized (bang-bang) or ternary phase error indication. A truth-table for the logic in Fig. 22 is given in Table 1.
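The decision logic on the (A,T,B) samples can be sketched as a small function. Table 1 is not reproduced here, so the sketch below is an illustration built on two assumptions: the state number is the 3-bit value with A as the most significant bit and B as the least, which matches the text's description of states 0, 2, 5, and 7, and the early/late assignment follows the timing argument that a transition sample equal to the earlier bit (A) was taken before the data edge. The string labels and the function name are hypothetical; the mapping of early/late onto UP/DOWN depends on the charge pump polarity and is left out.

```python
def alexander_pd(a, t, b):
    """Classify one (A, T, B) sample triple from an Alexander-style detector.

    a : earlier retimed data bit (0 or 1)
    t : transition sample taken between the two data bits
    b : later retimed data bit
    Returns "hold", "early", "late", or "invalid".
    """
    if a == b:
        # No data transition: nothing to measure. A transition sample that
        # disagrees with both neighbours is the normally impossible case.
        return "hold" if t == a else "invalid"
    # Data transition present: T matching the earlier bit means the clock
    # sampled before the edge (early); matching the later bit means late.
    return "early" if t == a else "late"


if __name__ == "__main__":
    for state in range(8):
        a, t, b = (state >> 2) & 1, (state >> 1) & 1, state & 1
        print(state, (a, t, b), alexander_pd(a, t, b))
```

A binary detector would replace the "hold" outcome with a repeat of the last valid indication, while a ternary detector passes it through to the charge pump as a hold state.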
The states 2 and 5 in Table 1 correspond to the normally impossible condition of sampling a "1" midway between two "0" bits. A custom truth table can use these states to detect either a high bit-error-rate condition [10], a VCO running grossly too slowly (e.g., lump these states into the "late" condition), or taken as an indication that a link has locked onto its own VCO crosstalk, perhaps by amplification of power supply noise by pick up from a high-gain optical transimpedance amplifier [11].

Since the mid-bit samples (A,B) straddle the (T) transition sample, it is also possible to detect the lack of a transition. This condition corresponds to states 0 and 7 in Table 1. This information can be used to create an extra ternary hold-state in the PD output, causing the charge pump to hold its value during long runlengths. Both binary and ternary PDs will be discussed in turn, along with their implications on loop performance.

A. Run-length and Latency

Binary phase detectors have no hold state, so the PD continues to put out the last valid phase error indication during long data runlengths. In this situation, the loop idling jitter will be multiplied from the expected value by the maximum runlength of the data. For example, an 8B/10B code has a maximum code runlength of 5 and will have a peak jitter walk-off five times the value of that computed for a "10" repetitive data pattern. The average RMS jitter will be a function of the runlength distributions of each particular code. There is also a trade-off in effective stability factor as a repetitive pattern such as "11110000" will be equivalent to a loop with an effective update time 4 times larger than the expected $t_{update} = 1/f_{nom}$. Since the stability factor is inversely dependent on update time, it is possible for binary PDs to become unstable with data patterns containing very long runs due to the delay in timely phase-error feedback.

Due to the loop latency $\lambda$, the loop overshoots zero phase by $\lambda^2/\xi - S\lambda$, where $S$ is the slope at $t = 0$, before the "braking" effect of the proportional branch starts to act. The onset of catastrophic instability occurs when $A_1 > A_0$, for this implies exponential growth of the acquisition transient. The convergence is guaranteed whenever $\xi > 2\lambda$.

Although usable for tightly constrained block codes such as 8B/10B, binary phase detectors are essentially unusable for codes such as 10Gb Ethernet 64b/66b or SONET which can have very long runlengths of up to 66 or 80 bits, respectively.

B. Ternary Phase-Detector

The 3-state, or ternary phase detector provides superior jitter performance for data with long runs [12]. Ternary PDs neither charge nor discharge the loop filter during long runs, causing the loop to hold the current estimate of the data frequency. Such loops effectively "stop time" during long runs.

If the charge pump does not have a hold-mode, it is possible to emulate a ternary loop, with some loss of performance, by continuously toggling the phase-detector output to approximately maintain the current charge pump voltage during long runs.

The peak idling jitter for ternary loops is unchanged from the simple 100% transition density analysis. The RMS jitter will be reduced by the average transition density. Because the loop phase cannot change during hold mode, the jitter tolerance will be derated by the average transition density. This can easily be taken into account by increasing $\theta_{bb}$ appropriately for the characteristics of the code to be used.

C. VCO Tuning Bandwidth

The previous analyses all assumed an infinite VCO tuning bandwidth for the proportional tuning input. A VCO time-constant, $\tau_{vco}$, can slightly reduce hunting jitter if it is small compared to the loop update time.

VIII. CONCLUSION

Bang-bang CDR circuits have the unique advantages of inherent sampling phase alignment, adaptability to multi-phase sampling structures, and operation at the highest speed at which a process can make a working flip-flop. Approximate equations for loop jitter, recovered clock spectrum, and jitter tracking performance as a function of various design parameters have been derived. The median-tracking property of the bang-bang loop, resulting in an output jitter that grows as the square root of the input jitter, has been presented.

ACKNOWLEDGMENT

The author is grateful for the contributions of Birdy Amrutur, Bill Brown, John Corcoran, Craig Corsetto, Dave DiPietro, Brian Donoghue, Jeff Galloway, Andrew Grzegorek, Tom Hornak, Jim Homer, Tom Knotts, Benny Lai, Adolf Leiter, Bill McFarland, Charles Moore, Rasmus Nordby, Cheryl Owen, Pat Petruno, Kent Springer, Guenter Steinbach, Hugh Wallace, Bin Wu, J.T. Wu, and Chu Yen for technical discussions and helpful insights into bang-bang loop behavior.

REFERENCES

[1] C. B. Armitage, "SAW Filter Retiming in the AT&T 432 Mb/s Lightwave Regenerator," in Conference Proceedings: AT&T Bell Labs, pp. 102-103, Sept. 3-6, 1984.
[2] C. R. Hogge, Jr., "A Self Correcting Clock Recovery Circuit," IEEE Transactions on Electron Devices, vol. ED-32, no. 12, pp. 2704-2706, Dec. 1985.
[3] J. Tani, D. Crandall, J. Corcoran and T. Hornak, "Parallel Interface ICs for 120Mb/s Fiber Optic Links," in ISSCC Digest of Technical Papers, pp. 190-191,390, Feb. 1987.
[4] R. C. Walker, T. Hornak, C. Yen and K. H. Springer, "A Chipset for Gigabit Rate Data Communication," in Proceedings of the 1989 Bipolar Circuits and Technology Meeting, pp. 288-290, September 18-19, 1989.
[5] F. Gardner, Phaselock Techniques, New York: John Wiley & Sons, 1979, pp. 8-14.
[6] I. Galton, "Higher-order Delta-Sigma Frequency-to-Digital Conversion," in Proceedings of IEEE International Symposium on Circuits and Systems, pp. 441-444, May 30-June 2, 1994.
[7] I. Galton, "Analog-Input Digital Phase-Locked Loops for Precise Frequency and Phase Demodulation," Transactions on Circuits and Systems-II: Analog and Digital Signal Processing, vol. 42, no. 10, pp. 621-630, Oct. 1995.
[8] M. W. Hauser, "Principles of Oversampling A/D Conversion," J. Audio Eng. Soc., vol. 39, no. 1/2, pp. 3-26, Jan./Feb. 1991.
[9] J. D. H. Alexander, "Clock Recovery from Random Binary Signals," Electronics Letters, vol. 11, no. 22, pp. 541-542, Oct. 1975.
[10] J. Hauenschild, D. Friedrich, J. Herrle, J. Krug, "A Two-Chip Receiver for Short Haul Links up to 3.5Gb/s with PIN-Preamp Module and CDR-DMUX," in ISSCC Digest of Technical Papers, pp. 308-309,452, Feb. 1996.
[11] R. C. Walker, C. Stout and C. Yen, "A 2.488Gb/s Si-Bipolar Clock and Data Recovery IC with Robust Loss of Signal Detection," in ISSCC Digest of Technical Papers, pp. 246-247,466, Feb. 1997.
[12] N. Ishihara and Y. Akazawa, "A Monolithic 156 Mb/s Clock and Data Recovery PLL Circuit Using the Sample-and-Hold Technique," IEEE Journal of Solid-State Circuits, vol. 29, no. 12, pp. 1566-1571, Dec. 1994.
[13] D. Chen and M. O. Baker, "A 1.25 Gb/s, 460mW CMOS Transceiver for Serial Data Communication," in ISSCC Digest of Technical Papers, pp. 242-243,465, Feb. 1997.
[14] L. DeVito, J. Newton, R. Goughwell, J. Bulzacchelli and F. Benkley, "A 52MHz and 155 MHz Clock-Recovery PLL," in ISSCC Digest of Technical Papers, pp. 142-143,306, Feb. 1991.
[15] J. F. Ewen, A. X. Widmer, M. Soyuer, K. R. Wrenner, B. Parker and H. A. Ainspan, "Single-Chip 1062Mbaud CMOS Transceiver for Serial Data Communication," in ISSCC Digest of Technical Papers, pp. 32-33,336, Feb. 1995.
[16] A. Fiedler, R. Mactaggart, J. Welch and S. Krishnan, "A 1.0625Gbps Transceiver with 2x-Oversampling and Transmit Signal Pre-Emphasis," in ISSCC Digest of Technical Papers, pp. 238-239,464, Feb. 1997.
[17] B. Guo, A. Hsu, Y. Wang and J. Kubinec, "125Mb/s CMOS All-Digital Data Transceiver Using Synchronous Uniform Sampling," in ISSCC Digest of Technical Papers, pp. 112-113, Feb. 1994.
[18] Y. M. Greshishchev, P. Schvan, J. L. Showell, M. Xu, J. J. Ojha and J. E. Rogers, "A Fully Integrated SiGe Receiver IC for 10-Gb/s Data Rate," IEEE Journal of Solid State Circuits, vol. 35, no. 12, pp. 1949-1957, Dec. 2000.
[19] R. Gu, J. M. Tran, H. Lin, A. Yee and M. Izzard, "A 0.5-3.5Gb/s Low-Power Low-Jitter Serial Data CMOS Transceiver," in ISSCC Digest of Technical Papers, pp. 352-353,478, Feb. 1999.
[20] J. Hauenschild, C. Dorshcky, T. W. Mohrenfels and R. Seitz, "A 10Gb/s BiCMOS Clock and Data Recovery 1:4-Demultiplexer in a Standard Plastic Package with External VCO," in ISSCC Digest of Technical Papers, pp. 202-203,445, Feb. 1996.
[21] T. He and P. Gray, "A Monolithic 480 Mb/s AGC/Decision/Clock Recovery Circuit in 1.2 µm CMOS," IEEE Journal of Solid State Circuits, vol. 28, no. 12, pp. 1314-1320, Dec. 1993.
[22] P. Larsson, "A 2-1600MHz 1.2-2.5V CMOS Clock-Recovery PLL with Feedback Phase-Selection and Averaging Phase-Interpolation for Jitter Reduction," in ISSCC Digest of Technical Papers, pp. 356-357, Feb. 1999.
[23] B. Lai, and R. C. Walker, "A Monolithic 622Mb/s Clock [36] R. C. Walker, K. Hsieh, T. A. Knotts and C. Yen, "A
Extraction Data Retiming Circuit," in JSSCC Digest of lOGb/s Si-Bipolar TX/RX Chipset for Computer Data
Technical Papers, pp. 144,145, Feb. 1991. Transmission," in ISSCC Digest of Technical Papers, pp.
[24] T. H. Lee, and J. F. Bulzacchelli, "A 155MHz Clock 302-303,450, Feb. 1998.
Recovery Delay- and Phase-Locked Loop," IEEE Journal [37] R. C. Walker, J. Wu, C. Stout, B. Lai, C. Yen, T. Hornak
of Solid State Circuits vol. 27, no. 12, pp. 1736-1746, Dec. and P. Petruno, "A 2-Chip 1.5Gb/s Bus-Oriented Serial
1992. Link Interface," in ISSCC Digest of Technical Papers, pp.
[25] R. H. Leonowich, and J. M. Steininger, "A 45-MHz 226-227,291, Feb. 1992.
CMOS phase/frequency-locked loop timing recovery cir- [38] C. K. Yang, and M. A. Horowitz, "0.8um CMOS 2.5Gb/
cuit," in ISSCC Digest of Technical Papers, pp. 14-15,278- s Oversampled Receiver for Serial Links," IEEE Journal
279, Feb. 1988. of Solid State Circuits vol. 31, no. 12, pp. 20150-2023,
[26] I. Lee, C. Yoo, W. Kim, S. Chai and W. Song, "A Dec. 1996.
622Mb/s CMOS Clock Recovery PLL with Time- Inter-
leaved Phase Detector Array," in ISSCC Digest of Techni-
Richard Walker was born in San Rafael
cal Papers, pp. 198-199,444, Feb. 1996. CA, in 1960. He received the B.S.
[27] M. Meghelli, B. Parker, H. Ainspan and M. Soyuer, "A degree in Engineering and Applied
SiGe BiCMOS 3.3V Clock and Data Recovery Circuit for Science from the California Institute of
lOGb/s Serial Transmission Systems," in ISSCC Digest of Technology in 1982, and an M.S.
Technical Papers, pp. 56-57, Feb. 2000. degree in Computer SciencefromCali-
fornia State University, Chico, CA in
[28] T. Morikawa, M. Soda, S. Shiori, T. Hashimoto, F. Sato 1992. Rick joined Agilent Laboratories
and K. Emura, "A SiGe Single-Chip 3.3V Receiver IC for (formerly Hewlett-Packard Laborato-
lOGb/s Optical Communication System," in ISSCC Digest ries) in 1981, where he is currently a
of Technical Papers, pp. 380-381,481, Feb. 1999. Principal Project Engineer. Since that
time, he has worked in the areas of
[29] A. Pottbacker, and U. Langmann, "An 8GHz Silicon broadband-cable modem design, solid-
Bipolar Clock-Recovery and Data-Regenerator IC," IEEE state laser characterization, phase-locked-loop theory, linecode
Journal of Solid State Circuits vol. 29, no. 12, pp. 1572- design, and gigabit-rate serial data transmission. He holds 15 U.S.
1576, Dec. 1994. patents.
[30] M. Reinhold, C. Dorschky, F. Pullela, E. Rose, P.
Mayer, P. Paschke, Y. Baeyens, J. Mattia and F. Kunz, "A
Fully-Integrated 40Gb/s Clock and Data Recovery / 1:4
DEMUX IC in SiGe Technology," in ISSCC Digest of
Technical Papers, pp. 84-85,435, Feb. 2001.
[31] M. Soyuer, and H. A. Ainspan, "A Monolithic 2.3 Gb/s
lOOmW Clock and Data Recovery Circuit," in ISSCC
Digest of Technical Papers, pp. 158-159,282, Feb. 1993.
[32] S. Ueno, K. Watanabe, T. Kato, T. Shinohara, K.
Mikami, T. Hashimoto, A. Takai, K. Washio, R. Takeyar
and T. Harada, "A Single-Chip lOGb/s Transceiver LSI
using SiGe SOI/BiCMOS," in ISSCC Digest of Technical
Papers, pp. 82-83,435, Feb. 2001.
[33] H. Wang, and R. Nottenburg, "A lGb/s CMOS Clock
and Data Recovery Circuit," in ISSCC Digest of Technical
Papers, pp. 354-355,477, Feb. 1999.
[34] P. Wallace, R. Bayruns, J. Smith, T. Laverick and R.
Shuster, "A GaAs 1.5Gb/s Clock Recovery and Data
Retiming Circuit," in ISSCC Digest of Technical Papers,
pp. 192-193, Feb. 1990.
[35] Z. Wang, M. Berroth, J. Seibel, P. Hofinann, A. Huls-
mann, Kohler, B. Raynor and J. Schneider, "19GHz
Monolithic Integrated Clock Recovery Using PLL and
0.3um Gate-Length Quantum-Well HEMTs," in ISSCC
Digest of Technical Papers, pp. 118-119, Feb. 1994,
45
Predicting the Phase Noise and Jitter of PLL-Based
Frequency Synthesizers
Kenneth S. Kundert
Ken Kundert is with Cadence Design Systems, San Jose, California, kundert@cadence.com.

Abstract: Two methodologies are presented for predicting the phase noise and jitter of a PLL-based frequency synthesizer using simulation that are both accurate and efficient. The methodologies begin by characterizing the noise behavior of the blocks that make up the PLL using transistor-level RF simulation. For each block, the phase noise or jitter is extracted and applied to a model for the entire PLL.

I. INTRODUCTION

Phase-locked loops (PLLs) are used to implement a variety of timing related functions, such as frequency synthesis, clock and data recovery, and clock de-skewing. Any jitter or phase noise in the output of the PLL used in these applications generally degrades the performance margins of the system in which it resides and so is of great concern to the designers of such systems. Jitter and phase noise are different ways of referring to an undesired variation in the timing of events at the output of the PLL. They are difficult to predict with traditional circuit simulators because the PLL generates repetitive switching events as an essential part of its operation, and the noise performance must be evaluated in the presence of this large-signal behavior. SPICE is useless in this situation as it can only predict the noise in circuits that have a quiescent (time-invariant) operating point. In PLLs the operating point is at best periodic, and is sometimes chaotic. Recently a new class of circuit simulators has been introduced that are capable of predicting the noise behavior about a periodic operating point [1]. SpectreRF is the most popular of this class of simulators and, because of the algorithms used in its implementation, is likely to be the best suited for this application [2]. These simulators can be used to predict the noise performance of PLLs. The ideas presented in this paper allow those simulators to be applied even to those PLLs that have chaotic operating points.

A. Frequency Synthesis

The focus of this paper is frequency synthesis. The block diagram of a PLL operating as a frequency synthesizer is shown in Figure 1 [3]. It consists of a reference oscillator (OSC), a phase/frequency detector (PFD), a charge pump (CP), a loop filter (LF), a voltage-controlled oscillator (VCO), and two frequency dividers (FDs). The PLL is a feedback loop that, when in lock, forces $f_{\text{fb}}$ to be equal to $f_{\text{ref}}$. Given an input frequency $f_{\text{in}}$, the frequency at the output of the PLL is

$$f_{\text{out}} = \frac{N}{M} f_{\text{in}} \tag{1}$$

where M is the divide ratio of the input frequency divider, and N is the divide ratio of the feedback divider. By choosing the frequency divide ratios and the input frequency appropriately, the synthesizer generates an output signal at the desired frequency that inherits much of the stability of the input oscillator. In RF transceivers, this architecture is commonly used to generate the local oscillator (LO) at a programmable frequency that tunes the transceiver to the desired channel by adjusting the value of N.

Fig. 1. The block diagram of a frequency synthesizer.

B. Direct Simulation

In many circumstances, SpectreRF† can be directly applied to predict the noise performance of a PLL. To make this possible, the PLL must at a minimum have a periodic steady state solution. This rules out systems such as bang-bang clock and data recovery circuits and fractional-N synthesizers because they behave in a chaotic way by design. It also rules out any PLL that is implemented with a phase detector that has a dead zone. A dead zone has the effect of opening the loop and letting the phase drift seemingly at random when the phase of the reference and the output of the voltage-controlled oscillator (VCO) are close. This gives these PLLs a chaotic nature.

† Spectre is a registered trademark of Cadence Design Systems.

To perform a noise analysis, SpectreRF must first compute the steady-state solution of the circuit with its periodic steady state (PSS) analysis. If the PLL does not have a periodic solution, as the cases described above do not, then it will not converge. There is an easy test that can be run to determine if a circuit has a periodic steady-state solution. Simply perform a transient analysis until the PLL approaches steady state and
then observe the VCO control voltage. If this signal consists of frequency components at integer multiples of the reference frequency, then the PLL has a periodic solution. If there are other components, it does not. Sometimes it can be difficult to identify the undesirable components if the components associated with the reference frequency are large. In this case, use the strobing feature of Spectre's transient analysis to eliminate all components at frequencies that are multiples of the reference frequency. Do so by strobing at the reference frequency. In this case, if the VCO control voltage varies in any significant way the PLL does not have a periodic solution.

If the PLL has a periodic solution, then in concept it is always possible to apply SpectreRF directly to perform a noise analysis. However, in some cases it may not be practical to do so. The time required for SpectreRF to compute the noise of a PLL is proportional to the number of circuit equations needed to represent the PLL in the simulator times the number of time points needed to accurately render a single period of the solution times the number of frequencies at which the noise is desired. When applying SpectreRF to frequency synthesizers with large divide ratios, the number of time points needed to render a period can become problematic. Experience shows that divide ratios greater than ten are often not practical to simulate. Of course, this varies with the size of the PLL.

For PLLs that are candidates for direct simulation using SpectreRF, simply configure the simulator to perform a PSS analysis followed by a periodic noise (PNoise) analysis. The period of the PSS analysis should be set to the period of the reference signal, $1/f_{\text{ref}}$, with $f_{\text{ref}}$ as defined in Figure 1. The PSS stabilization time (tstab) should be set long enough to allow the PLL to reach lock. This process was successfully followed on a frequency synthesizer with a divide ratio of 40 that contained 2500 transistors, though it required several hours for the complete simulation [4].

C. When Direct Simulation Fails

The challenge still remains, how does one predict the phase noise and jitter of PLLs that do not fit the constraints that enable direct simulation? The remainder of this paper attempts to answer that question for frequency synthesizers, though the techniques presented are general and can be applied to other types of PLLs by anyone who is sufficiently determined.

D. Monte Carlo-Based Methods

Demir proposed an approach for simulating PLLs whereby a PLL is described using behavioral models simulated at a high level [5, 6]. The models are written such that they include jitter in an efficient way. He also devised a simulation algorithm based on solving a set of nonlinear stochastic differential equations that is capable of characterizing the circuit-level noise behavior of blocks that make up a PLL [6, 7]. Finally, he gave formulas that can be used to convert the results of the noise simulations on the individual blocks into values for the jitter parameters for the corresponding behavioral models [8]. Once everything is ready, simulation of the PLL occurs with the blocks of the PLL being described with behavioral models that exhibit jitter. The actual jitter or phase noise statistics are observed during this simulation. Generally tens to hundreds of thousands of cycles are simulated, but the models are efficient so the time required for the simulation is reasonable.

This approach allows prediction of PLL jitter behavior once the noise behavior of the blocks has been characterized. However, it requires the use of an experimental simulator that is not readily available to characterize the jitter of the blocks.

In an earlier series of papers [9, 10], the relevant ideas of Demir were adapted to allow use of a commercial simulator, Spectre [11], and an industry standard modeling language, Verilog-A† [12]. These ideas are further refined in the later half of this paper.

† Verilog is a registered trademark of Cadence Design Systems licensed to Accellera.

E. Predicting Noise in PLLs

There are two different approaches to modeling noise in PLLs. One approach is to formulate the models in terms of the phase of the signals, producing what are referred to as phase-domain models. In the simplest case, these models are linear and analyzed easily in the frequency domain, making it simple to use the model to predict phase noise, even in the presence of flicker noise or other noise sources that are difficult to model in the time domain. Phase-domain models are described in the first half of this paper.

The process of predicting the phase noise of a PLL using phase-domain models involves:
1. Using SpectreRF to predict the noise of the individual blocks that make up the PLL.
2. Building high-level behavioral models of each of the blocks that exhibit phase noise.
3. Assembling the blocks into a model of the PLL.
4. Simulating the PLL to find the phase noise of the overall system.

The other approach formulates the models in terms of voltage, which are referred to as voltage-domain models. The advantage of voltage-domain models is that they can be refined to implementation. In other words, as the design process transitions to being more of a verification process, the abstract behavioral models initially used can be replaced with detailed gate- or transistor-level models in order to verify the PLL as implemented.

A voltage-domain model is strongly nonlinear and never has a quiescent operating point, making it incompatible with a SPICE-like noise analysis. Often such models have a periodic operating point and so can be analyzed with small-signal RF noise analysis (SpectreRF), but it is also common for that not to be the case. For example, a fractional-N synthesizer does
not have a periodic operating point. Occasionally, the circuit is sensitive enough that the noise affects the large-signal behavior of the PLL, such as with bang-bang clock-and-data recovery PLLs, which invalidates any use of small-signal noise analysis.

Modeling large-signal noise in a voltage-domain model as a voltage or a current is problematic. Such signals are very small and continuously and very rapidly varying. Extremely tight tolerances and small time steps are required to accurately resolve such signals with simulation. To overcome these problems, the noise is instead represented using the effect it has on the timing of the transitions within the PLL. In other words, the noise is added to the circuit in the form of jitter. In this case there is no need for either small time steps or tight tolerances.

The process of predicting the jitter of a PLL with voltage-domain models involves:
1. Using SpectreRF to predict the noise of the individual blocks that make up the PLL.
2. Converting the noise of the block to jitter.
3. Building high-level behavioral models of each of the blocks that exhibit jitter.
4. Assembling the blocks into a model of the PLL.
5. Simulating the PLL to find the jitter of the overall system.

The simple linear phase-domain model described in the first part of this paper, and the nonlinear voltage-domain model described in the second part, represent the two ends of a continuum of models. Generally, the phase-domain models are considerably more efficient, but the voltage-domain models do a better job of capturing the details of the behavior of the loop, details such as the signal capture and escape processes. The phase-domain models can be made more general by making them nonlinear and by analyzing them in the time domain. It is common to use such models with fractional-N synthesizers. Conversely, simplifications can be made to the voltage-domain models to make them more efficient. It is even possible to use both voltage- and phase-domain models for different parts of the same loop. One might do so to retain as much efficiency as possible while allowing part of the design to be refined to implementation level. In general it is best to understand both approaches well, and use ideas from both to construct the most appropriate approach for your particular situation.

II. PHASE-DOMAIN MODEL

It is widely understood that simulating PLLs is expensive because the period of the VCO is almost always very short relative to the time required to reach lock. This is particularly true with frequency synthesizers, especially those with large multiplication factors. The problem is that a circuit simulator must use at least 10-20 time points for every period of the VCO for accurate rendering, and the lock process often involves hundreds or thousands of cycles at the input to the phase detector. With large divide ratios, this can translate to hundreds of thousands of cycles of the VCO. Thus, the number of time points needed for a single simulation could range into the millions.

This is all true when simulating the PLL in terms of voltages and currents. When doing so, one is said to be using voltage-domain models. However, that is not the only option available. It is also possible to formulate models based on the phase of the signals. In this case, one would be using phase-domain models. The high frequency variations associated with the voltage-domain models are not present in phase-domain models, and so simulations are considerably faster. In addition, when in lock the phase-domain-based models generally have constant-valued operating points, which simplifies small-signal analysis, making it easier to study the closed-loop dynamics and noise performance of the PLL using either AC or noise analysis.

A linear phase-domain model of a frequency synthesizer is shown in Figure 2. Such a model is suitable for modeling the behavior of the PLL to small perturbations when the PLL is in lock as long as you do not need to know the exact waveforms and instead are interested in how small perturbations affect the phase of the output. This is exactly what is needed to predict the phase noise performance of the PLL.

Fig. 2. Linear time-invariant phase-domain model of the synthesizer shown in Figure 1.

The derivation of the model begins with the identification of those signals that are best represented by their phase. Many blocks have large repetitive input signals with their outputs being primarily sensitive to the phase of their inputs. It is the signals that drive these blocks that are represented as phase. They are identified using a $\phi$ variable in Figure 2. Notice that this includes all signals except those at the inputs of the LF and VCO.

The models of the individual blocks will be derived by assuming that the signals associated with each of the phase variables are pulse trains. Though generally the case, it is not a requirement. It simply serves to make it easier to extract the models. Define $\Pi(t_0, \tau, T)$ to be a periodic pulse train where one of the pulses starts at $t_0$ and the pulses have duration $\tau$ and period $T$ as shown in Figure 3. This signal transitions between 0 and 1 if $\tau$ is positive, and between 0 and $-1$ if $\tau$ is negative. The phase of this signal is defined to be $\phi = 2\pi t_0/T$. In many cases, the duration of the pulses is of no interest, in which case $\Pi(t_0, T)$ is used as a short hand. This occurs because the input
that the signal is driving is edge triggered. For simplicity, we assume that such inputs are sensitive to the rising edges of the signal, that $t_0$ specifies the time of a rising edge, and that the signal is transitioning between 0 and 1.

Fig. 3. The pulse train waveform represented by $\Pi(t_0, \tau, T)$, shown for $\tau > 0$ and $\tau < 0$.

The input source produces a signal $v_{\text{in}} = \Pi(t_0, T)$. Since this is the input, $t_0$ is arbitrary. As such, we are free to set its phase $\phi$ to any value we like.

Given a signal $v_i = \Pi(t_0, T)$ a frequency divider will produce an output signal $v_o = \Pi(t_0, NT)$ where N is the divide ratio. The phase of the input is $\phi_i = 2\pi t_0/T$ and the phase of the output is $\phi_o = 2\pi t_0/(NT)$ and so the phase transfer characteristic of a divider is

$$\phi_o = \frac{\phi_i}{N}. \tag{2}$$

There are many different types of phase detectors that can be used, each requiring a somewhat different model. Consider a simple phase-frequency detector combined with a charge pump [13]. In this case, the detector takes two inputs, $v_i = \Pi(t_i, T)$ and $v_o = \Pi(t_o, T)$ and produces an output $i_{\text{cp}} = I_{\max}\Pi(t_o, t_i - t_o, T)$ where $I_{\max}$ is the maximum output current of the charge pump. The output of the charge pump immediately passes through a low pass filter that is designed to suppress signals at frequencies of $1/T$ and above, so in most cases the pulse nature of this signal can be ignored in favor of its average value, $\langle i_{\text{cp}} \rangle$. Thus, the transfer characteristic of the combined PFD/CP is

$$\langle i_{\text{cp}} \rangle = I_{\max}\frac{t_i - t_o}{T} = I_{\max}\frac{\phi_i - \phi_o}{2\pi} = \frac{K_{\text{det}}}{2\pi}(\phi_i - \phi_o) \tag{3}$$

where $K_{\text{det}} = I_{\max}$. Of course, this is only valid for $|\phi_i - \phi_o| < 2\pi$ at the most. The behavior outside this range depends strongly on the type of phase detector used [3]. Even within this range, the phase detector may be better modeled with a nonlinear transfer characteristic. For example, there can be a flat spot in the transfer characteristics near 0 if the detector has a dead zone. However it is generally not productive to model the dead zone in a phase-domain model.†

† This phase-domain model is a continuous-time model that ignores the sampling nature of the PFD. A dead zone interacts with the sampling nature of the PFD to create a chaotic limit cycle behavior that is not modeled with the phase-domain model. This chaotic behavior creates a substantial amount of jitter, and for this reason, most modern phase detectors are designed such that they do not exhibit dead zones.

The model of (3) is a continuous-time approximation to what is inherently a discrete-time process. The phase detector does not continuously monitor the phase difference between its two input signals, rather it outputs one pulse per cycle whose width is proportional to the phase difference. Using a continuous time approximation is generally acceptable if the bandwidth of the loop filter is much less than $f_{\text{ref}}$ (generally less than $f_{\text{ref}}/10$ is sufficient). In practical PLLs this is almost always the case. It is possible to develop a detailed phase-domain PFD model that includes the discrete-time effects, but it would run more slowly and the resulting phase-domain model of the PLL would not have a quiescent operating point, which makes it more difficult to analyze.

The voltage-controlled oscillator, or VCO, converts its input voltage to an output frequency, and the relationship between input voltage and output frequency can be represented as

$$f_{\text{out}} = F(v_c). \tag{4}$$

The mapping from voltage to frequency is designed to be linear, so a first-order model is often sufficient,

$$f_{\text{out}} = K_{\text{vco}} v_c. \tag{5}$$

It is the output phase that is needed in a phase-domain model,

$$\phi_{\text{out}}(t) = 2\pi \int K_{\text{vco}} v_c(\tau)\, d\tau \tag{6}$$

or in the frequency domain,

$$\phi_{\text{out}}(\omega) = \frac{2\pi K_{\text{vco}}}{j\omega} v_c(\omega). \tag{7}$$

A. Small-Signal Stability

This completes the derivation of the phase-domain models for each of the blocks. Now the full model is used to help predict the small-signal behavior of the PLL. Start by using Figure 2 to write a relationship for its loop gain. Begin by defining

$$G_{\text{fwd}} = \frac{\phi_{\text{out}}}{\phi_{\text{diff}}} = \frac{K_{\text{det}}}{2\pi} H(\omega) \frac{2\pi K_{\text{vco}}}{j\omega} = \frac{K_{\text{det}} K_{\text{vco}} H(\omega)}{j\omega} \tag{8}$$

to be the forward gain,

$$G_{\text{rev}} = \frac{\phi_{\text{fb}}}{\phi_{\text{out}}} = \frac{1}{N} \tag{9}$$

to be the feedback factor, and

$$T = G_{\text{fwd}} G_{\text{rev}} \tag{10}$$

to be the loop gain. The loop gain is used to explore the small-signal stability of the loop. In particular, the phase margin is an important stability metric. It is the negative of the difference between the phase shift of the loop at unity gain and 180°, the phase shift that makes the loop unstable. It should be no less than 45° [14]. When concerned about phase noise or jitter, the phase margin is typically 60° or more to reduce peaking in the closed-loop gain, which results in excess phase noise.
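The loop-gain definitions (8)-(10) and the phase-margin criterion can be checked numerically. The following is only a minimal sketch under stated assumptions: the loop filter H(ω) is taken to be the impedance of a resistor in series with C1, all in parallel with C2 (this section does not specify the topology), the component values are the defaults that appear in Listing 1, and the values chosen for K_det, K_vco and N are hypothetical, picked so that the crossover lands between the filter's zero and pole. The function names are illustrative and are not part of the paper.

```python
import cmath
import math


def loop_filter_impedance(w, r, c1, c2):
    """Assumed H(w): (R in series with C1) in parallel with C2, in ohms."""
    s = 1j * w
    z_series = r + 1.0 / (s * c1)
    z_shunt = 1.0 / (s * c2)
    return z_series * z_shunt / (z_series + z_shunt)


def loop_gain(w, kdet, kvco, n, r, c1, c2):
    """Loop gain T = G_fwd * G_rev, following eqs. (8)-(10)."""
    h = loop_filter_impedance(w, r, c1, c2)
    g_fwd = kdet * kvco * h / (1j * w)  # eq. (8); kvco in Hz/V, kdet in A
    g_rev = 1.0 / n                     # eq. (9)
    return g_fwd * g_rev                # eq. (10)


def unity_gain_frequency(gain, w_lo=1.0, w_hi=1e12, iters=200):
    """Bisect on a log scale; assumes |T| falls monotonically through 1."""
    for _ in range(iters):
        w_mid = math.sqrt(w_lo * w_hi)
        if abs(gain(w_mid)) > 1.0:
            w_lo = w_mid
        else:
            w_hi = w_mid
    return math.sqrt(w_lo * w_hi)


def phase_margin_deg(gain):
    """Return (crossover in rad/s, phase margin in degrees)."""
    wc = unity_gain_frequency(gain)
    return wc, 180.0 + math.degrees(cmath.phase(gain(wc)))


def example_gain(w):
    # Hypothetical loop: K_det = 100 uA, K_vco = 10 MHz/V, N = 34.
    return loop_gain(w, kdet=100e-6, kvco=10e6, n=34,
                     r=10e3, c1=1e-9, c2=200e-12)


wc, pm = phase_margin_deg(example_gain)
print(f"crossover = {wc:.4g} rad/s, phase margin = {pm:.2f} deg")
```

With this filter the zero sits at 1/(R·C1) and the pole at (C1+C2)/(R·C1·C2), so the best achievable margin is fixed by the capacitor ratio alone. Raising it toward the 60° suggested for low-noise designs would require a larger C1/C2 ratio rather than a change in gain.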
B. Noise Transfer Functions

In Figure 4 various sources of noise have been added. These noise sources can represent either the noise created by the blocks due to intrinsic noise sources (thermal, shot, and flicker noise sources), or the noise coupled into the blocks from external sources, such as from the power supplies, the substrate, etc. Most are sources of phase noise, and denoted $\phi_{\text{in}}$, $\phi_{\text{fdm}}$, $\phi_{\text{fdn}}$, and $\phi_{\text{vco}}$ because the circuit is only sensitive to phase at the point where the noise is injected. The one exception is the noise produced by the PFD/CP, which in this case is considered to be a current, and denoted $i_{\text{det}}$.

Fig. 4. Linear time-invariant phase-domain model of the synthesizer shown in Figure 2 with representative noise sources added. The $\phi$'s represent various sources of noise.

Then the transfer functions from the various noise sources to the output are

$$G_{\text{ref}} = \frac{\phi_{\text{out}}}{\phi_{\text{ref}}} = \frac{G_{\text{fwd}}}{1 + T} = \frac{N G_{\text{fwd}}}{N + G_{\text{fwd}}}, \tag{11}$$

$$G_{\text{vco}} = \frac{\phi_{\text{out}}}{\phi_{\text{vco}}} = \frac{1}{1 + T} = \frac{N}{N + G_{\text{fwd}}}, \tag{12}$$

$$G_{\text{in}} = \frac{\phi_{\text{out}}}{\phi_{\text{in}}} = \frac{1}{M}\frac{G_{\text{fwd}}}{1 + T} = \frac{1}{M}\frac{N G_{\text{fwd}}}{N + G_{\text{fwd}}}, \tag{13}$$

and by inspection,

$$G_{\text{fdn}} = \frac{\phi_{\text{out}}}{\phi_{\text{fdn}}} = -G_{\text{ref}}, \tag{14}$$

$$G_{\text{fdm}} = \frac{\phi_{\text{out}}}{\phi_{\text{fdm}}} = G_{\text{ref}}, \tag{15}$$

$$G_{\text{det}} = \frac{\phi_{\text{out}}}{i_{\text{det}}} = \frac{2\pi G_{\text{ref}}}{K_{\text{det}}}. \tag{16}$$

On this last transfer function, we have simply referred $i_{\text{det}}$ to the input by dividing through by the gain of the phase detector.

These transfer functions allow certain overall characteristics of phase noise in PLLs to be identified. As $\omega \to \infty$, $G_{\text{fwd}} \to 0$ because of the VCO and the low-pass filter, and so $G_{\text{ref}}, G_{\text{det}}, G_{\text{fdm}}, G_{\text{fdn}}, G_{\text{in}} \to 0$ and $G_{\text{vco}} \to 1$. At high frequencies, the noise of the PLL is that of the VCO. Clearly this must be so because the low-pass LF blocks any feedback at high frequencies.

As $\omega \to 0$, $G_{\text{fwd}} \to \infty$ because of the $1/(j\omega)$ term from the VCO. So at DC, $G_{\text{ref}}, G_{\text{fdm}}, G_{\text{fdn}} \to N$, $G_{\text{in}} \to N/M$ and $G_{\text{vco}} \to 0$. At low frequencies, the noise of the PLL is contributed by the OSC, PFD/CP, FD_M and FD_N and the noise from the VCO is diminished by the gain of the loop.

Consider further the asymptotic behavior of the loop and the VCO noise at low offset frequencies ($\omega \to 0$). Oscillator phase noise in the VCO results in the power spectral density $S_{\phi_{\text{vco}}}$ being proportional to $1/\omega^2$, or $S_{\phi_{\text{vco}}} \sim 1/\omega^2$ (neglecting flicker noise). If the LF is chosen such that $H(\omega) \sim 1$, then $G_{\text{fwd}} \sim 1/\omega$, and the contribution from the VCO to the output noise power, $G_{\text{vco}}^2 S_{\phi_{\text{vco}}}$, is finite and nonzero. If the LF is chosen such that $H(\omega) \sim 1/\omega$, as it typically is when a true charge pump is employed, then $G_{\text{fwd}} \sim 1/\omega^2$ and the noise contribution to the output from the VCO goes to zero at low frequencies.

C. Noise Model

One predicts the phase noise exhibited by a PLL by building and applying the model shown in Figure 4. The first step in doing so is to find the various model parameters, including the level of the noise sources, which generally involves either direct measurement or simulating the various blocks with an RF simulator, such as SpectreRF. Use periodic noise (or PNoise) analysis to predict the output noise that results from stochastic noise sources contained within the blocks using simulation. Use a periodic AC or periodic transfer function (PAC or PXF) to compute the perturbation at the output of a block due to noise sources outside the block, such as on supplies.

Once the model parameters are known, it is simply a matter of computing the output phase noise of the PLL by applying the equations in Section II-B to compute the contributions to $\phi_{\text{out}}$ from every source and summing the results. Be careful to account for correlations in the noise sources. If the noise sources are perfectly correlated, as they might be if the ultimate source of noise is in the supplies or substrate, then use a direct sum. If the sources produce completely uncorrelated noise, as they would when the ultimate source of noise is random processes within the devices, use a root-mean-square sum.

Alternatively, one could build a Verilog-A model and use simulation to determine the result. The top-level of such a model is shown in Listing 1. It employs noisy phase-domain models for each of the blocks. These models are given in Listings 3-7 and are described in detail in the next few sections (III-VI). In this example, the noise sources are coded into the models, but the noise parameters are not set at the top level to simplify the model. To predict the phase noise performance of the loop in lock, simply specify these parameters in Listing 1 and perform a noise analysis. To determine the effect of injected noise, first refer the noise to the output of one of the blocks, and then add a source into the netlist of Listing 1 at the appropriate place and perform an AC analysis.
Listing 1 — Phase-domain model for a PLL configured as a frequency synthesizer.

    `include "discipline.h"

    module pll(out);
    output out;
    phase out;
    parameter integer m = 1 from [1:inf);    // input divide ratio
    parameter real Kdet = 1 from (0:inf);    // detector gain
    parameter real Kvco = 1 from (0:inf);    // VCO gain
    parameter real c1 = 1n from (0:inf);     // loop filter C1
    parameter real c2 = 200p from (0:inf);   // loop filter C2
    parameter real r = 10K from (0:inf);     // loop filter R
    parameter integer n = 1 from [1:inf);    // fb divide ratio
    phase in, ref, fb;
    electrical c;

        oscillator OSC(in);
        divider #(.ratio(m)) FDm(in, ref);
        phaseDetector #(.gain(Kdet)) PD(ref, fb, c);
        loopFilter #(.c1(c1), .c2(c2), .r(r)) LF(c);
        vco #(.gain(Kvco)) VCO(c, out);
        divider #(.ratio(n)) FDn(out, fb);
    endmodule

Listings 1 and 3-7 have phase signals, and there is no phase discipline in the standard set of disciplines provided by Verilog-A or Verilog-AMS in discipline.h. There are several different resolutions for this problem. Probably the best solution is to simply add such a discipline, given in Listing 2, either to discipline.h as assumed here or to a separate file that is included as needed. Alternatively, one could use the rotational discipline. It is a conservative discipline that includes torque as a flow nature, and so is overkill in this situation. Finally, one could simply use either the electrical or the voltage discipline. Scaling for voltage in volts and phase in radians is similar, and so it will work fine except that the units will be reported incorrectly. Using the rotational discipline would require that all references to the phase discipline be changed to rotational in the appropriate listings. Using either the electrical or voltage discipline would require that both the name of the discipline be changed from phase to either electrical or voltage, and the name of the access function be changed from Theta to V.

Listing 2 — Signal flow discipline definition for phase signals (the nature Angle is defined in discipline.h).

    `include "discipline.h"
    discipline phase
        potential Angle;
    enddiscipline

III. OSCILLATORS

Oscillators are responsible for most of the noise at the output of the majority of well-designed frequency synthesizers. This is because oscillators inherently tend to amplify noise found near their oscillation frequency and any of its harmonics. The reason for this behavior is covered next, followed by a description of how to characterize and model the noise in an oscillator. The origins of oscillator phase noise are described in a conceptual way here. For a detailed description, see the papers by Kaertner or Demir et al. [15, 16, 17].

A. Oscillator Phase Noise

Nonlinear oscillators naturally produce high levels of phase noise. To see why, consider the trajectory of a fully autonomous oscillator's stable periodic orbit in state space. In steady state, the trajectory is a stable limit cycle, $v$. Now consider perturbing the oscillator with an impulse and assume that the deviation in the response due to the perturbation is $\Delta v$, as shown in Figure 5. Separate $\Delta v$ into amplitude and phase variations,

$$\Delta v(t) = [1 + \alpha(t)]\, v\!\left(t + \frac{\phi(t)}{2\pi f_o}\right) - v(t). \tag{17}$$

where $v$ represents the unperturbed $T$-periodic output voltage of the oscillator, $\alpha$ represents the variation in amplitude, $\phi$ is the variation in phase, and $f_o = 1/T$ is the oscillation frequency.

Fig. 5. The trajectory of an oscillator shown in state space with and without a perturbation $\Delta v$. By observing the time stamps ($t_0, \ldots, t_6$) one can see that the deviation in amplitude dissipates while the deviation in phase does not.

Since the oscillation is stable and the duration of the disturbance is finite, the deviation in amplitude eventually decays away and the oscillator returns to its stable orbit ($\alpha(t) \to 0$ as $t \to \infty$). In effect, there is a restoring force that tends to act against amplitude noise. This restoring force is a natural consequence of the nonlinear nature of the oscillator that acts to suppress amplitude variations.

The oscillator is autonomous, and so any time-shifted version of the solution is also a solution. Once the phase has shifted due to a perturbation, the oscillator continues on as if never disturbed except for the shift in the phase of the oscillation. There is no restoring force on the phase and so phase deviations accumulate. A single perturbation causes the phase to permanently shift ($\phi(t) \to \Delta\phi$ as $t \to \infty$). If we neglect any short term time constants, it can be inferred that the impulse response of the phase deviation $\phi(t)$ can be approximated with
a unit step $s(t)$. The phase shift over time for an arbitrary input disturbance $u$ is

$$\phi(t) \sim \int_{-\infty}^{\infty} s(t-\tau)\,u(\tau)\,d\tau = \int_{-\infty}^{t} u(\tau)\,d\tau, \tag{18}$$

This shows that in all oscillators the response to any form of perturbation, including noise, is amplified and appears mainly in the phase. The amplification increases as the frequency of the perturbation approaches the frequency of oscillation in proportion to $1/\Delta f$ (or $1/\Delta f^2$ in power).

Notice that there is only one degree of freedom: the phase of the oscillator as a whole. There is no restoring force when the phase of all signals associated with the oscillator shift together; however, there would be a restoring force if the phase of signals shifted relative to each other. This observation is significant in oscillators with multiple outputs, such as quadrature or ring oscillators. The dominant phase variations appear identically in all outputs, whereas relative phase variations between the outputs are naturally suppressed by the oscillator or added by subsequent circuitry and so tend to be much smaller [8].

B. Characterizing Oscillator Phase Noise

Above it was shown that oscillators tend to convert perturbations from any source into a phase variation at their output whose magnitude varies with $1/\Delta f$ (or $1/\Delta f^2$ in power). Now assume that the perturbation is from device noise in the form of white and flicker stochastic processes. The oscillator's response will be characterized first in terms of the phase noise $S_\phi$, and then because phase noise is not easily measured, in terms of the normalized voltage noise $\mathcal{L}$. The result will be a small set of easily extracted parameters that completely describe the response of the oscillator to white and flicker noise sources. These parameters are used when modeling the oscillator.

Assume that the perturbation consists of white and flicker noise and so has the form

$$S_u(\Delta f) \sim 1 + \frac{f_c}{\Delta f}. \tag{20}$$

at $\Delta f = 1$ Hz and $f_c$ is the flicker noise corner frequency. As shown in Figure 6, $n$ is extracted by simply extrapolating to 1 Hz from a frequency where the noise from the white sources dominates.

Fig. 6. Extracting the noise parameters, $n$, $a$, and $f_c$, for an oscillator. The parameter $a$ is an alternative to $n$ where $n = a f_o^2$. It is used later. The graph is plotted on a log-log scale.

$S_\phi$ is not directly observable and often difficult to find, so now $S_\phi$ is related to $\mathcal{L}$, the power spectral density of the output voltage noise $S_v$ normalized by the power in the fundamental tone. $S_v$ is directly available from either measurement with a spectrum analyzer or from RF simulators, and $\mathcal{L}$ is defined as

$$\mathcal{L}(\Delta f) = \frac{S_v(f_o + \Delta f)}{2|V_1|^2}, \tag{22}$$

where $V_1$ is the fundamental Fourier coefficient of $v$, the output signal. It satisfies

$$v(t) = \sum_{k=-\infty}^{\infty} V_k e^{j 2\pi k f_o t}. \tag{23}$$

In (41) of [15], Demir et al. show that for a free-running oscillator perturbed only by white noise sources†

$$\mathcal{L}(\Delta f) = \frac{1}{2}\,\frac{n}{n^2\pi^2 + \Delta f^2}, \tag{24}$$

which is a Lorentzian process with corner frequency of

$$f_{corner} = n\pi = a\pi f_o^2. \tag{25}$$

At frequencies above the corner,
C. Phase-Domain Models for the Oscillators

The phase-domain models for the reference and voltage-controlled oscillators are given in Listings 3 and 4. The VCO model is based on (6). Perhaps the only thing that needs to be explained is the way that phase noise is modeled in the oscillators. Verilog-AMS provides the flicker_noise function for modeling flicker noise, which has a power spectral density proportional to $1/f^\alpha$ with $\alpha$ typically being close to 1. However, Verilog-AMS does not limit $\alpha$ to being close to one, making this function well suited to modeling oscillator phase noise, for which $\alpha$ is 2 in the white-phase noise region and close to 3 in the flicker-phase noise region (at frequencies below the flicker noise corner frequency). Alternatively, one could dispense with the noise parameters and use the noise_table function in lieu of the flicker_noise functions to use the measured noise results directly. The "wpn" and "fpn" strings passed to the noise functions are labels for the noise sources. They are optional and can be chosen arbitrarily, though they should not contain any white space. wpn was chosen to represent white phase noise and fpn stands for flicker phase noise.

Listing 3 — Phase-domain oscillator noise model.

    `include "discipline.h"

    module oscillator(out);
    output out;
    phase out;
    parameter real n = 0 from [0:inf);
        // white output phase noise at 1 Hz (rad^2/Hz)
    parameter real fc = 0 from [0:inf);
        // flicker noise corner frequency (Hz)

    analog begin
        Theta(out) <+ flicker_noise(n, 2, "wpn")
                    + flicker_noise(n*fc, 3, "fpn");
    end
    endmodule

Listing 4 — Phase-domain VCO noise model.

    `include "discipline.h"
    `include "constants.h"

    module vco(in, out);
    input in; output out;
    voltage in;
    phase out;
    parameter real gain = 1 from (0:inf);
        // transfer gain, Kvco (Hz/V)
    parameter real n = 0 from [0:inf);
        // white output phase noise at 1 Hz (rad^2/Hz)
    parameter real fc = 0 from [0:inf);
        // flicker noise corner frequency (Hz)

    analog begin
        Theta(out) <+ 2*`M_PI*gain*idt(V(in));
        Theta(out) <+ flicker_noise(n, 2, "wpn")
                    + flicker_noise(n*fc, 3, "fpn");
    end
    endmodule

When interested in the effect of signals coupled into the oscillator through the supplies or the substrate, one would compute the transfer function from the interfering source to the phase output of the oscillator using either a PAC or PXF analysis. Again, one would simply assume that the perturbation in the output of the oscillator is completely in the phase, which is true except at very high offset frequencies. One then employs (12) and (13) to predict the response at the output of the PLL.

IV. LOOP FILTER

Even in the phase-domain model for the PLL, the loop filter remains in the voltage domain and is represented with a full circuit-level model, as shown in Listing 5. As such, the noise behavior of the filter is naturally included in the phase-domain model without any special effort assuming that the noise is properly included in the resistor model.

Listing 5 — Loop filter model.

    `include "discipline.h"

    module loopFilter(n);
    electrical n;
    ground gnd;    // see footnote †
    parameter real c1 = 1n from (0:inf);
    parameter real c2 = 200p from (0:inf);
    parameter real r = 10K from (0:inf);
    electrical int;

        capacitor #(.c(c1)) C1(n, gnd);
        capacitor #(.c(c2)) C2(n, int);
        resistor #(.r(r)) R(int, gnd);
    endmodule

† The ground statement is not currently supported in Cadence's Verilog-A implementation, so instead ground is explicitly passed into the module.

V. PHASE DETECTOR AND CHARGE PUMP

As with the VCO, the noise of the PFD/CP as needed by the phase-domain model is found directly with simulation. Simply drive the block with a representative periodic signal, perform a PNoise analysis, and measure the output noise current. In this case, a representative signal would be one that produces periodic switching at the output. This is necessary to capture the noise present during the switching process. Generally the noise appears as in Figure 7, in which case the noise is parameterized with $n$ and $f_c$. $n$ is the noise power density at frequencies above the flicker noise corner frequency, $f_c$, and below the noise bandwidth of the circuit.

The phase-domain model for the PFD/CP is given in Listing 6. It is based on (3). Alternatively, as before one could
Fig. 7. Extracting the noise parameters, $n$ and $f_c$, for the PFD/CP.

A. Cyclostationary Noise

Formally, the term cyclostationary implies that the autocorrelation function of a stochastic process varies with $t$ in a periodic fashion [19, 20], which in practice is associated with a periodic variation in the noise power of a signal. In general, the noise produced by all of the nonlinear blocks in a PLL is strongly cyclostationary. To understand why, consider the noise produced by a logic circuit, such as the inverter shown in Figure 8. The noise at the output of the inverter, $n_{out}$,
circuit is periodically sampled to create a discrete-time random sequence, as shown in Figure 9. SpectreRF then computes the power-spectral density of the sequence. The sample time would be adjusted to coincide with the desired threshold crossings. Since the $T$-periodic cyclostationary noise process is sampled every $T$ seconds, the resulting noise process is stationary. Furthermore, the noise present at times other than at the sample points is completely ignored.

Fig. 9. Strobed noise. The lower waveform is a highly magnified view of the noise present at the strobe points in $v_n$, which are chosen to coincide with the threshold crossings in $v$.

B. Converting to Phase Noise

The act of converting the noise from a continuous-time process to a discrete-time process by sampling at the threshold crossings makes the conversion into phase noise easier. If $v_n$ is the continuous-time noisy response, and $v$ is the noise-free response (response with the noise sources turned off), then‡

$$n_i = v_n(iT) - v(iT). \tag{27}$$

Then if $v_n$ is noisy because it is corrupted with a phase noise process $\phi$, then

$$v_n(t) = v\!\left(t + \frac{\phi(t)}{2\pi f_o}\right). \tag{28}$$

Assume the phase noise $\phi$ is small and linearize $v$ using a Taylor series expansion

$$v_n(t) \cong v(t) + \frac{dv(t)}{dt}\,\frac{\phi(t)}{2\pi f_o} \tag{29}$$

and

$$n_i \cong v(iT) + \frac{dv(iT)}{dt}\,\frac{\phi_i}{2\pi f_o} - v(iT) = \frac{dv(iT)}{dt}\,\frac{\phi_i}{2\pi f_o}. \tag{30}$$

Finally, $\phi_i$ can be found from $n_i$ using

$$\phi_i = 2\pi f_o\, n_i \Big/ \frac{dv(iT)}{dt}. \tag{31}$$

$v$ is $T$-periodic, which makes $dv(iT)/dt$ a constant, and so

$$S_\phi(f) = \left[\frac{2\pi f_o}{dv(iT)/dt}\right]^2 S_n(f), \tag{32}$$

where $S_n(f)$ and $S_\phi(f)$ are the power spectral densities of the $n_i$ and $\phi_i$ sequences.

† The strobed-noise feature of SpectreRF is also referred to as its time-domain noise feature.

‡ It is assumed that the sequence $n_i$ is formed by sampling the noise at $iT$, which implies that the threshold crossings also occur at $iT$. In practice, the crossings will occur at some time offset from $iT$. That offset is ignored. It is done without loss of generality with the understanding that the functions $v$ and $v_n$ can always be reformulated to account for the offset.

C. Phase-Domain Model for Dividers

To extract the phase noise of a divider, drive the divider with a

assure that the maxsidebands parameter is set sufficiently large to capture all significant noise folding. A large value will slow the simulation. To reduce the number of sidebands needed, use $T$ as small as possible. $S_\phi(f)$ is then computed from (32). Generally the noise appears as in Figure 10. Notice that the noise is periodic in $f$ with period $1/T$ because $n$ is a discrete-time sequence with period $T$. The parameters $n$ and $f_c$ for the divider are extracted as illustrated. The high frequency roll-off is generally ignored because it occurs above the frequency range of interest.

Fig. 10. Extracting the noise parameters, $n$ and $f_c$, for the divider.

With ripple counters, one usually only characterizes one stage at a time and combines the phase noise from each stage by assuming that the noise in each stage is independent (true for device noise, would not be true for noise coupling into the divider from external sources). The variation due to phase noise accumulates; however, it is necessary to account for the increasing period of the signals at each stage along the ripple counter. Consider an intermediate stage of a $K$-stage ripple counter. The total phase noise at the output of the ripple counter that results due to the phase noise $S_{\phi_k}$ at the output of stage $k$ is $(T_k/T_K)^2 S_{\phi_k}$. So the total phase noise at the output of the ripple counter is

$$S_{\phi_{out}} = \frac{1}{T_K^2} \sum_{k=0}^{K} T_k^2\, S_{\phi_k}, \tag{33}$$

where $S_{\phi_0}$ and $T_0$ are the phase noise and signal period at the input to the first stage of the ripple counter.

With undesired variations in the supplies or in the substrate the resulting phase noise in each stage would be correlated, so one would need to compute the transfer function from the signal source to the phase noise of each stage and combine in a vector sum.

Unlike in ripple counters, phase noise does not accumulate with each stage in synchronous counters. Phase noise at the output of a synchronous counter is independent of the number of stages and consists only of the noise of its clock along with the noise of the last stage.

The phase-domain model for the divider, based on (2), is given in Listing 7. As before, one could use the noise_table function in lieu of the white_noise and flicker_noise functions to use the measured noise results directly.

Listing 7 — Phase-domain divider noise model.

    `include "discipline.h"

    module divider(in, out);
    input in; output out;
    phase in, out;
    parameter real ratio = 1 from (0:inf);   // divide ratio
    parameter real n = 0 from [0:inf);
        // white output phase noise (rad^2/Hz)
    parameter real fc = 0 from [0:inf);
        // flicker noise corner frequency (Hz)

    analog begin
        Theta(out) <+ Theta(in) / ratio;
        Theta(out) <+ white_noise(n, "wpn")
                    + flicker_noise(n*fc, 1, "fpn");
    end
    endmodule

The constraint on the loop bandwidth imposed by the required frequency resolution is eliminated if the divide ratio $N$ is not limited to be an integer. This is the idea behind fractional-N synthesis. In practice, one cannot directly implement a frequency divider that implements non-integer divide ratio except in a few very restrictive cases, so instead a divider that is capable of switching between two integer divide ratios is used, and one rapidly alternates between the two values in such a way that the time-average is equal to the desired non-integer divide ratio [13]. A block diagram for a fractional-N synthesizer is shown in Figure 11. Divide ratios of $N$ and $N + 1$ are used, where $N$ is the first integer below the desired divide ratio, and $N + 1$ is the first integer above. For example, if the desired divide ratio is 16.25, then one would alternate between the ratios of 16 and 17, with the ratio of 16 being used 75% of the time. Early attempts at fractional-N synthesis alternated between integer divide ratios in a repetitive manner, which resulted in noticeable spurs in the VCO output spectrum. More recently, ΔΣ modulators have been used to generate a random sequence with the desired duty cycle to control the multi-modulus dividers [21]. This has the effect of trading off the spurs for an increased noise floor; however, the ΔΣ modulator can be designed so that most of the power in its output sequence is at frequencies that are above the loop bandwidth, and so are largely rejected by the loop.

Fig. 11. Block diagram of a fractional-N synthesizer (blocks: OSC, PFD, CP, LF, VCO, FD ÷N, N+1, and Mod; signals: $f_{ref}$ and $f_{out}$).

with $j$ assumed to be a zero-mean process and $v$ assumed to be a $T$-periodic function. $j$ has units of seconds and can be interpreted as a noise in time. Alternatively, it can be reformulated as a noise in phase, or phase noise, using
$$\phi(t) = 2\pi f_o\, j(t), \tag{35}$$

where $f_o = 1/T$ and

$$v_n(t) = v\!\left(t + \frac{\phi(t)}{2\pi f_o}\right). \tag{36}$$

Listing 8 — Phase-domain fractional-N divider model.

    `include "discipline.h"

    module divider(in, out);
    input in; output out;
    phase in, out;
    parameter real ratio = 1 from (0:inf);    // divide ratio
    parameter real n = 0 from [0:inf);
        // white output phase noise (rad^2/Hz)
    parameter real fc = 0 from [0:inf);
        // flicker noise corner frequency (Hz)
    parameter real bw = 1 from (0:inf);       // ΔΣ mod bandwidth
    parameter integer order = 1 from (0:9);   // ΔΣ mod order
    parameter real fmax = 10*bw from (bw:inf);
        // maximum frequency of concern

    analog begin
        Theta(out) <+ Theta(in) / (ratio + noise_table([
            0, n,
            bw, n,
            fmax, n*pow((fmax/bw),order)
        ], "dsn"));
    end
    endmodule

A. Jitter Metrics

Define $\{t_i\}$ as the sequence of times for positive-going zero crossings, henceforth referred to as transitions, that occur in $v_n$. The various jitter metrics characterize the statistics of this sequence.

The simplest metric is the edge-to-edge jitter, $J_{ee}$, which is the variation in the delay between a triggering event and a response event. When measuring edge-to-edge jitter, a clean jitter-free input is assumed, and so the edge-to-edge jitter $J_{ee}$ is

$$J_{ee}(i) = \sqrt{\operatorname{var}(t_i)}. \tag{37}$$

Edge-to-edge jitter assumes an input signal, and so is only defined for driven systems. It is an input-referred jitter metric, meaning that the jitter measurement is referenced to a point on a noise-free input signal, so the reference point is fixed. No such signal exists in autonomous systems. The remaining jitter metrics are suitable for both driven and autonomous systems. They gain this generality by being self-referred, meaning that the reference point is on the noisy signal for which the jitter is being measured. These metrics tend to be a bit more complicated because the reference point is noisy, which acts to increase the measured jitter.

Edge-to-edge jitter is also a scalar jitter metric, and it does not convey any information about the correlation of the jitter between transitions. The next metric characterizes the correlations between transitions as a function of how far the transitions are separated in time.

Define $J_k(i)$ to be the standard deviation of $t_{i+k} - t_i$,

$$J_k(i) = \sqrt{\operatorname{var}(t_{i+k} - t_i)}. \tag{38}$$

$J_k(i)$ is referred to as k-cycle jitter or long-term jitter†. It is a measure of the uncertainty in the length of $k$ cycles and has units of time. $J_1$, the standard deviation of the length of a single period, is often referred to as the period jitter, and is denoted $J$, where $J = J_1$.

† Some people distinguish between k-cycle jitter and long-term jitter by defining the long-term jitter $J_\infty$ as being the k-cycle jitter $J_k$ as $k \to \infty$.

Another important jitter metric is cycle-to-cycle jitter. Define $T_i = t_{i+1} - t_i$ to be the period of cycle $i$. Then the cycle-to-cycle jitter $J_{cc}$ is

$$J_{cc}(i) = \sqrt{\operatorname{var}(T_{i+1} - T_i)}. \tag{39}$$

Cycle-to-cycle jitter is like edge-to-edge jitter in that it is a scalar jitter metric that does not contain information about the correlation in the jitter between distant transitions. However, it differs in that it is a measure of short-term jitter that is relatively insensitive to long-term jitter [22]. As such, cycle-to-cycle jitter is the only jitter metric that is suitable for use when flicker noise is present. All other metrics are unbounded in the presence of flicker noise.

If $j(t)$ is either stationary or $T$-cyclostationary, then $\{t_i\}$ is stationary, meaning that these metrics do not vary with $i$, and so $J_{ee}(i)$, $J_k(i)$, and $J_{cc}(i)$ can be shortened to $J_{ee}$, $J_k$, and $J_{cc}$. These jitter metrics are illustrated in Figure 12.

Fig. 12. The various jitter metrics: edge-to-edge jitter, $J_{ee}(i) = \sqrt{\operatorname{var}(\delta_i)}$; k-cycle jitter, $J_k(i) = \sqrt{\operatorname{var}(t_{i+k} - t_i)}$; and cycle-to-cycle jitter, $J_{cc}(i) = \sqrt{\operatorname{var}(T_{i+1} - T_i)}$.

B. Types of Jitter

The type of jitter produced in PLLs can be classified as being from one of two canonical forms. Blocks such as the PFD, CP, and FD are driven, meaning that a transition at their output is a direct result of a transition at their input. The jitter
exhibited by these blocks is referred to as synchronous jitter; it is a variation in the delay between when the input is received and the output is produced. Blocks such as the OSC and VCO are autonomous. They generate output transitions not as a result of transitions at their inputs, but rather as a result of the previous output transition. The jitter produced by these blocks is referred to as accumulating jitter; it is a variation in the delay between an output transition and the subsequent output transition. Table I previews the basic characteristics of these two types of jitter. The formulas for jitter given in this table are derived in the next two sections.

TABLE I: THE TWO CANONICAL FORMS OF JITTER.

    Jitter Type     Circuit Type             Period Jitter
    synchronous     driven (PFD/CP, FD)      $J_{ee} = \sqrt{\operatorname{var}(n_v(t_c))} \,/\, (dv(t_c)/dt)$
    accumulating    autonomous (OSC, VCO)    $J = \sqrt{aT}$

$$J_k(i) = \sqrt{\operatorname{var}(t_{i+k} - t_i)}, \tag{43}$$

$$J_k(i) = \sqrt{\operatorname{var}\big([(i+k)T + j_{sync}(t_{i+k})] - [iT + j_{sync}(t_i)]\big)}, \tag{44}$$

$$J_k(i) = \sqrt{2\operatorname{var}(j_{sync}(t_i))}. \tag{45}$$

$$J_k(i) = \sqrt{2}\, J_{ee}(i). \tag{46}$$

Since $j_{sync}(t)$ is $T$-cyclostationary, $\operatorname{var}(j_{sync}(t_i))$ is independent of $i$, and so are $J_{ee}$ and $J_k$. The factor of $\sqrt{2}$ in (46) stems from the length of an interval including the independent variation from two transitions. From (46), $J_k$ is independent of $k$, and so

$$J_k = J \quad \text{for } k = 1, 2, \ldots \tag{47}$$

Using similar arguments, one can show that with simple synchronous jitter,

$$J_{cc} = \sqrt{3}\, J. \tag{48}$$

Generally, the jitter produced by the PFD/CP and FDs is well approximated by simple synchronous jitter if one can neglect flicker noise.
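The ratios between the jitter metrics for simple synchronous jitter can be checked with a small Monte Carlo experiment. The sketch below assumes that each transition is displaced from its ideal time $iT$ by an independent zero-mean Gaussian delay, and estimates $J_{ee}$, $J_k$, and $J_{cc}$ directly from their definitions. The function names, the sample count, and the values of `sigma` and `period` are illustrative assumptions, not taken from the paper.

```python
import math
import random

def std(values):
    """Population standard deviation using compensated summation."""
    mean = math.fsum(values) / len(values)
    return math.sqrt(math.fsum((x - mean) ** 2 for x in values) / len(values))

def synchronous_jitter_metrics(sigma, period, n_edges=100_000,
                               k_values=(1, 5, 50), seed=1):
    """Estimate J_ee, J_k and J_cc for simple synchronous jitter."""
    rng = random.Random(seed)
    # Independent Gaussian displacement of every edge (assumed model).
    j_sync = [rng.gauss(0.0, sigma) for _ in range(n_edges)]
    t = [i * period + j for i, j in enumerate(j_sync)]

    j_ee = std(j_sync)  # spread of each edge about its ideal time
    j_k = {k: std([t[i + k] - t[i] for i in range(n_edges - k)])
           for k in k_values}
    periods = [t[i + 1] - t[i] for i in range(n_edges - 1)]
    j_cc = std([periods[i + 1] - periods[i] for i in range(len(periods) - 1)])
    return j_ee, j_k, j_cc

j_ee, j_k, j_cc = synchronous_jitter_metrics(sigma=1e-12, period=1e-9)
for k, value in j_k.items():
    # Each ratio is expected to be close to sqrt(2), independent of k.
    print(f"J_{k} / J_ee = {value / j_ee:.3f}")
# For independent edge displacements this ratio is expected near sqrt(3).
print(f"J_cc / J = {j_cc / j_k[1]:.3f}")
```

Because every interval is bounded by two independently displaced edges, the k-cycle jitter does not grow with $k$, which is the property that distinguishes synchronous from accumulating jitter.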
Generally $n_v$ is not stationary, but cyclostationary (refer back to Section VI-A). It is only important to know when the noisy periodic signal $v_n(t)$ crosses the threshold, so the statistics of $n_v$ are only significant at the time when $v_n(t)$ crosses the threshold,

$$\operatorname{var}(j_{sync}(t_c)) = \frac{\operatorname{var}(n_v(t_c))}{\left(dv(t_c)/dt\right)^2}. \tag{50}$$

The jitter is computed from (42) using (49) or (50),

$$J_{ee} = \frac{\sqrt{\operatorname{var}(n_v(t_c))}}{dv(t_c)/dt}. \tag{51}$$

To compute $\operatorname{var}(n_v(t_c))$, one starts by driving the circuit with a representative periodic signal, and then sampling $v(t)$ at intervals of $T$ to form the ergodic sequence $\{v(t_i)\}$ where $t_i = t_c$ for some $i$. Then the variance is found by computing the power spectral density of the sequence and integrating it from $f = -f_o/2$ to $f_o/2$. Recall that the noise is periodic in $f$ with period $f_o = 1/T$ because $n$ is a discrete-time sequence with period $T$.

In practice, this is done by using the strobed noise capability of SpectreRF† to compute the power spectral density of the sequence. When the strobed noise feature is active, the noise produced by the circuit is periodically sampled to create a discrete-time random sequence, as shown in Figure 9. SpectreRF then computes the power-spectral density of the sequence. The sample time should be adjusted to coincide with the desired threshold crossings. Since the $T$-periodic cyclostationary noise process is sampled every $T$ seconds, the resulting noise process is stationary. Furthermore, the noise present at times other than at the sample points is completely ignored.

† The strobed-noise feature of SpectreRF is also referred to as its time-domain noise feature.

1) Extracting the Jitter of Dividers: To extract the jitter of a divider, drive the divider with a representative periodic input signal and perform a PSS analysis to determine the threshold crossing times and the slew rate ($dv/dt$) at these times. Then use SpectreRF's strobed PNoise analysis to compute $S_n(f)$. The sample point should be set to coincide with the point where the output signal crosses the threshold of the subsequent stage (the phase detector) in the appropriate direction. When running PNoise analysis, assure that the maxsidebands parameter is set sufficiently large to capture all significant noise folding. A large value will slow the simulation. To reduce the number of sidebands needed, use $T$ as small as possible. SpectreRF computes the power spectral density, which is integrated to compute the total noise at the sample points,

$$\operatorname{var}(n_v(t_c)) = \int_0^{f_o/2} S_n(f, t_c)\,df. \tag{52}$$

With ripple counters, one usually only characterizes one stage at a time. The total jitter due to noise in the ripple counter is then computed by assuming that the jitter in each stage is independent (again, this is true for device noise, but not for noise coupling into the divider from external sources) and taking the square-root of the sum of the square of the jitter on each stage.

Unlike in ripple counters, jitter does not accumulate with synchronous counters. Jitter in a synchronous counter is independent of the number of stages and consists only of the jitter of its clock along with the jitter of the last stage.

2) Extracting the Jitter of the Phase Detector: The PFD/CP is not followed by a threshold. Rather, it feeds into the LF, which is sensitive to the noise emitted by the CP at all times, not just during transitions. This argues that the noise of the PFD/CP be modeled as a continuous noise current. However, as mentioned earlier, doing so is problematic for simulators and would require very tight tolerances and small time steps. So instead, the noise of the PFD/CP is referred back to its inputs. The inputs of the PFD/CP are edge triggered, so the noise can be referred back as jitter.

To extract the input-referred jitter of a PFD/CP, drive both inputs with periodic signals with offset phase so that the PFD/CP produces a representative output. Use SpectreRF's PNoise analysis to compute the output noise over the total bandwidth of the PFD/CP (in this case, use the conventional noise analysis rather than the strobed noise analysis). Choose the frequency range of the analysis so that the total noise at frequencies outside the range is negligible. Thus, the noise should be at least 40 dB down and dropping at the highest frequency simulated. Integrate the noise over frequency and apply the Wiener-Khinchin theorem [24] to determine

$$\operatorname{var}(n) = \int_{-\infty}^{\infty} S_n(f)\,df, \tag{53}$$

the total output noise current squared [19]. Then either calculate or measure the effective gain of the PFD/CP, $K_{det}$. Scale the gain so that it has the units of amperes per second. Then divide the total output noise current by the gain and account for there being two transitions per cycle over which to distribute the noise to determine the input-referred jitter for the PFD/CP,

$$J_{ee,\,PFD/CP} = \frac{T}{2\pi K_{det}} \sqrt{\frac{\operatorname{var}(n)}{2}}. \tag{54}$$

As before, when running PNoise analysis, assure that the maxsidebands parameter is set sufficiently large to capture all significant noise folding. A large value will slow the simulation. To reduce the number of sidebands needed, use $T$ as small as possible.

Accumulating jitter is exhibited by autonomous systems, such as oscillators, that generate a stream of spontaneous output
transitions. In the PLL, the OSC and VCO exhibit accumulat- Similarly,
ing jitter. Accumulating jitter is characterized by an undesired
Jcc = 727. (59)
variation in the time since the previous output transition, thus
the uncertainty of when a transition occurs accumulates with Generally, the jitter produced by the OSC and VCO are well
every transition. Compared with a jitter free signal, the fre- approximated by simple accumulating jitter if one can neglect
quency of a signal exhibiting accumulating jitter fluctuates flicker noise.
randomly, and the phase drifts without bound. Thus, the jitter
appears as a modulation of the frequency of the output, which A. Extracting Accumulating Jitter
is why it is sometimes referred to as frequency modulated or The jitter in autonomous blocks, such as the OSC or VCO, is
FM jitter. almost completely due to oscillator phase noise. Oscillator
Again assume that T| be a stationary or T-cyclostationary pro- phase noise is a variation in the phase of the oscillator as it
cess, then proceeds along its limit cycle.
In order to determine the period jitter / of vn(f) for a noisy
$$j_{\rm acc}(t) = \int_0^t \eta(\tau)\,d\tau \tag{55}$$

$$v_n(t) = v(t + j_{\rm acc}(t)) \tag{56}$$

exhibits accumulating jitter. While $\eta$ is cyclostationary and so has bounded variance, (55) shows that the variance of $j_{\rm acc}$, and hence the phase difference between $v(t)$ and $v_n(t)$, is unbounded.

If $\eta$ is further restricted to be a white Gaussian stationary or $T$-cyclostationary random process, then $v_n(t)$ exhibits simple accumulating jitter. In this case, the process $\{j_{\rm acc}(iT)\}$ that results from sampling $j_{\rm acc}$ every $T$ seconds is a discrete Wiener process and the phase difference between $v(iT)$ and $v_n(iT)$ is a random walk [19]. As shown next, simple accumulating jitter corresponds to oscillator phase noise that results from white noise sources.

The essential characteristic of simple accumulating jitter is that the incremental jitter that accumulates over each cycle is independent or uncorrelated. Autonomous circuits exhibit simple accumulating jitter if they are broadband and if the noise sources are white, Gaussian and small. The sources are considered small if the circuit responds linearly to the noise, though at the same time the circuit may be responding nonlinearly to the oscillation signal. An autonomous circuit is considered broadband if there are no secondary resonant responses close in frequency to the primary resonance.*

For systems that exhibit simple accumulating jitter, each transition is relative to the previous transition, and the variation in the length of each period is independent, so the variance in the time of each transition accumulates,

$$J_k = \sqrt{k}\,J \quad \text{for } k = 0, 1, 2, \ldots, \tag{57}$$

where

$$J = \sqrt{\operatorname{var}\bigl(j_{\rm acc}(t_i + T) - j_{\rm acc}(t_i)\bigr)}. \tag{58}$$

... oscillator, assume that it exhibits simple accumulating jitter so that $\eta$ in (55) is a white Gaussian $T$-cyclostationary noise process (this excludes flicker noise) with a power spectral density of

$$S_\eta(f) = a, \tag{60}$$

and an autocorrelation function of

$$R_\eta(t_1, t_2) = a\,\delta(t_1 - t_2), \tag{61}$$

where $\delta$ is a Dirac delta function. Then

$$j_{\rm acc}(t) = \int_0^t \eta(\tau)\,d\tau \tag{62}$$

is a Wiener process [19], which has an autocorrelation function of

$$R_{j_{\rm acc}}(t_1, t_2) = a\,\min(t_1, t_2). \tag{63}$$

The period jitter is the standard deviation of the variation in one period, and so

$$J^2 = \operatorname{var}\bigl(j_{\rm acc}(t + T) - j_{\rm acc}(t)\bigr), \tag{64}$$

$$J^2 = E\bigl[(j_{\rm acc}(t + T) - j_{\rm acc}(t))^2\bigr], \tag{65}$$

$$J^2 = E\bigl[j_{\rm acc}(t + T)^2 - 2\,j_{\rm acc}(t + T)\,j_{\rm acc}(t) + j_{\rm acc}(t)^2\bigr], \tag{66}$$

$$J^2 = E\bigl[j_{\rm acc}(t + T)^2\bigr] - 2E\bigl[j_{\rm acc}(t + T)\,j_{\rm acc}(t)\bigr] + E\bigl[j_{\rm acc}(t)^2\bigr], \tag{67}$$

$$J^2 = R_{j_{\rm acc}}(t + T, t + T) - 2R_{j_{\rm acc}}(t + T, t) + R_{j_{\rm acc}}(t, t), \tag{68}$$

$$J^2 = a(t + T) - 2at + at, \tag{69}$$

$$J = \sqrt{aT}. \tag{70}$$

We now have a way of relating the jitter of the oscillator to the PSD of $\eta$. However, $\eta$ is not measurable, so instead the jitter is related to the phase noise $S_\phi$. To do so, consider simple accumulating jitter written in terms of phase,
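As a numerical check of (57) and (70), the sketch below models simple accumulating jitter in discrete time as independent Gaussian period deviations and estimates the k-cycle jitter from their running sum. The function names, seed, and trial count are illustrative assumptions and are not part of the paper; the values of a and T are the ones quoted for OSC in the Example section.

```python
import math
import random
import statistics


def period_jitter_from_a(a, period):
    """Equation (70): J = sqrt(a * T)."""
    return math.sqrt(a * period)


def k_cycle_jitter(period_jitter, k, trials=20000, seed=1):
    """Estimate J_k as the standard deviation of the timing error
    accumulated over k periods, assuming each period contributes an
    independent Gaussian deviation with standard deviation J."""
    rng = random.Random(seed)
    errors = [
        sum(rng.gauss(0.0, period_jitter) for _ in range(k))
        for _ in range(trials)
    ]
    return statistics.pstdev(errors)


if __name__ == "__main__":
    # a = 1e-14 and f0 = 25 MHz are the OSC values from the Example section.
    J = period_jitter_from_a(1e-14, 1.0 / 25e6)
    for k in (1, 4, 16):
        ratio = k_cycle_jitter(J, k) / J
        print(f"k={k:2d}  J_k/J = {ratio:.3f}  sqrt(k) = {math.sqrt(k):.3f}")
```

The estimated ratio should track the square root of k to within the sampling error of the estimate, which is the random-walk behavior described above.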
From (26),

$$a = 2\cdot 10^{-11}\,\frac{10^{10}}{(1.1\times 10^{9})^2} = 165.3\times 10^{-21}. \tag{75}$$

The period jitter $J$ is then computed from (70),

$$J = \sqrt{aT} = \sqrt{\frac{a}{f_0}} = \sqrt{\frac{165.3\times 10^{-21}}{1.1\ \text{GHz}}} = 12.3\ \text{fs}. \tag{76}$$

In this example, the noise was extracted for the VCO alone. In practice, the LF is generally combined with the VCO before extracting the noise so that the noise of the LF is accounted for.

† At one point it was mistakenly suggested in the documentation for SpectreRF that maxsidebands should be set to 0 for oscillators. This causes SpectreRF to ignore all noise folding and results in a significant underestimation of the total noise.

XI. JITTER OF A PLL

Fig. 15. Long-term jitter ($J_k$) for an idealized PLL as a function of ... (horizontal axis: $\log(k)$)

A feature of Verilog-A allows especially simple modeling of synchronous jitter. The transition() function, which is used to model signal transitions between discrete levels, provides a delay argument that can be dithered on every transition. The delay argument must not be negative, so a fixed delay that is greater than the maximum expected deviation of the jitter must be included. This approach is suitable for any model that exhibits synchronous jitter and generates discrete-valued outputs. It is used in the Verilog-A divider module shown in Listing 9, which models synchronous jitter with (41) where $j_{\rm sync}$ is a stationary white discrete-time Gaussian random process. It is also used in Listing 10, which models a simple PFD/CP.

1) Frequency Divider Model: The model, given in Listing 9, operates by counting input transitions. This is done in the
Listing 9. Frequency divider that models synchronous jitter.

Listing 10. PFD/CP model with synchronous jitter.

Listing 11. Fixed frequency oscillator with accumulating jitter.

Fig. 16. Block diagram of VCO behavioral model that includes jitter.

The jitter is modeled as a random variation in the frequency of the VCO. However, the jitter is specified as a variation in the period, thus it is necessary to relate the variation in the period to the variation in the frequency. Assume that without jitter, the period is divided into $K$ equal intervals of duration $\tau = T/K = 1/(K f_0)$. The frequency deviation will be updated every interval and held constant during the intervals. With jitter, the duration of an interval is

$$\tau_i = \tau + \Delta\tau_i. \tag{77}$$

$\Delta\tau$ is a random variable with variance ...

Listing 12. VCO model that includes accumulating jitter.

    `include "discipline.h"
    `include "constants.h"

    module vco (out, in);

    input in; output out; electrical out, in;

    parameter real Vmin=0;
    parameter real Vmax=Vmin+1 from (Vmin:inf);
    parameter real Fmin=1 from (0:inf);
    parameter real Fmax=2*Fmin from (Fmin:inf);
    parameter real Vlo=-1, Vhi=1;
    parameter real tt=0.01/Fmax from (0:inf);
    parameter real jitter=0 from [0:0.25/Fmax);    // period jitter
    parameter real ttol=1u/Fmax from (0:1/Fmax);

    real freq, phase, dT;
    integer n, seed;

    analog begin
        @(initial_step) seed = -561;

        // compute the freq from the input voltage
        freq = (V(in) - Vmin)*(Fmax - Fmin) / (Vmax - Vmin) + Fmin;

        // bound the frequency (this is optional)
        if (freq > Fmax) freq = Fmax;
        if (freq < Fmin) freq = Fmin;

        // add the phase noise
        freq = freq/(1 + dT*freq);

        // phase is the integral of the freq modulo 2*pi
        phase = 2*`M_PI*idtmod(freq, 0.0, 1.0, -0.5);

        // update jitter twice per period
        // 1.414=sqrt(K), K=2 jitter updates/period
        @(cross(phase + `M_PI/2, +1, ttol) or
          cross(phase - `M_PI/2, +1, ttol)) begin
            dT = 1.414*jitter*$dist_normal(seed,0,1);
            n = (phase >= -`M_PI/2) && (phase < `M_PI/2);
        end

        // generate the output
        V(out) <+ transition(n ? Vhi : Vlo, 0, tt);
    end
    endmodule

Listing 13. Quadrature Differential VCO model that includes accumulating jitter.

    `include "discipline.h"
    `include "constants.h"

    module quadVco (PIout, NIout, PQout, NQout, Pin, Nin);

    electrical PIout, NIout, PQout, NQout, Pin, Nin;
    output PIout, NIout, PQout, NQout;
    input Pin, Nin;

    parameter real Vmin=0;
    parameter real Vmax=Vmin+1 from (Vmin:inf);
    parameter real Fmin=1 from (0:inf);
    parameter real Fmax=2*Fmin from (Fmin:inf);
    parameter real Vlo=-1, Vhi=1;
    parameter real jitter=0 from [0:0.25/Fmax);    // period jitter
    parameter real ttol=1u/Fmax from (0:1/Fmax);
    parameter real tt=0.01/Fmax;

    real freq, phase, dT;
    integer i, q, seed;

    analog begin
        @(initial_step) seed = 133;

        // compute the freq from the input voltage
        freq = (V(Pin,Nin) - Vmin) * (Fmax - Fmin) / (Vmax - Vmin) + Fmin;

        // bound the frequency (this is optional)
        if (freq > Fmax) freq = Fmax;
        if (freq < Fmin) freq = Fmin;

        // add the phase noise
        freq = freq/(1 + dT*freq);

        // phase is the integral of the freq modulo 2*pi
        phase = 2*`M_PI*idtmod(freq, 0.0, 1.0, -0.5);

        // update jitter at the four phase crossings spaced pi/2 apart
        // 2=sqrt(K), K=4 jitter updates per period
        @(cross(phase - 3*`M_PI/4, +1, ttol) or
          cross(phase - `M_PI/4, +1, ttol) or
          cross(phase + `M_PI/4, +1, ttol) or
          cross(phase + 3*`M_PI/4, +1, ttol)) begin
            dT = 2*jitter*$dist_normal(seed,0,1);
            i = (phase >= -3*`M_PI/4) && (phase < `M_PI/4);
            q = (phase >= -`M_PI/4) && (phase < 3*`M_PI/4);
        end

        // generate the I and Q outputs
        V(PIout) <+ transition(i ? Vhi : Vlo, 0, tt);
        V(NIout) <+ transition(i ? Vlo : Vhi, 0, tt);
        V(PQout) <+ transition(q ? Vhi : Vlo, 0, tt);
        V(NQout) <+ transition(q ? Vlo : Vhi, 0, tt);
    end
    endmodule

Conceptually, a model that includes jitter should be just as efficient as one that does not because jitter does not increase the activity of the models, it only affects the timing of particular events. However, if jitter causes two events that would normally occur at the same time to be displaced so that they are no longer coincident, then a circuit simulator will have to use more time points to resolve the distinct events and so will run more slowly. For this reason, it is desirable to combine jitter sources to the degree possible.

To make the HDL models even faster, rewrite them in either Verilog-HDL or Verilog-AMS. Be sure to set the time resolution to be sufficiently small to prevent the discrete nature of time in these simulators from adding an appreciable amount of jitter.

1) Including Synchronous Jitter into OSC: One can combine the output-referred noise of FD$_M$ and FD$_N$ and the input-referred noise of the PFD/CP with the output noise of OSC. A modified fixed-frequency oscillator model that supports two jitter parameters and the divide ratio $M$ is given in Listing 14 (more on the effect of the divide ratio on jitter in the next section). The accJitter parameter is used to model the accumulating jitter of the reference oscillator, and the syncJitter parameter is used to model the synchronous jitter of FD$_M$, FD$_N$ and PFD/CP. Synchronous jitter is modeled in the oscillator without using a nonzero delay in the transition function. This is a more efficient approach because it avoids generating two unnecessary events per period. To get full benefit from this optimization, a modified PFD/CP given in Listing 15 is used. This model runs more efficiently by removing support for jitter and the td parameter.

2) Merging the VCO and FD$_N$: If the output of the VCO is not used to drive circuitry external to the synthesizer, if the divider exhibits simple synchronous jitter, and if the VCO exhibits simple accumulating jitter, then it is possible to include the frequency division aspect of the FD$_N$ as part of the
VCO by simply adjusting the VCO gain and jitter. If the divide ratio of FD$_N$ is large, the simulation runs much faster because the high VCO output frequency is never generated. The Verilog-A model for the merged VCO and FD$_N$ is given in Listing 16. It also includes code for generating a logfile containing the length of each period. The logfile is used in Section XIII when determining $S_{\phi_{\rm VCO}}$, the power spectral density of the phase of the VCO output.

Recall that the synchronous jitter of FD$_M$ and FD$_N$ has already been included as part of OSC, so the divider model incorporated into the VCO is noiseless and the jitter at the output of the noiseless divider results only from the VCO jitter. Since the divider outputs one pulse for every $N$ pulses at its input, the variance in the output period is the sum of the variance in $N$ input periods. Thus, the period jitter at the output, $J_{\rm FD}$, is $\sqrt{N}$ times larger than the period jitter at the input, $J_{\rm VCO}$, or

$$J_{\rm FD} = \sqrt{N}\,J_{\rm VCO}. \tag{82}$$

Thus, to merge the divider into the VCO, the VCO gain must be reduced by a factor of $N$, the period jitter increased by a factor of $\sqrt{N}$, and the divider model removed.

After simulation, it is necessary to refer the computed results, which are from the output of the divider, to the output of the VCO, which is the true output of the PLL. The period jitter at the output of the VCO, $J_{\rm VCO}$, can be computed with (82).

To determine the effect of the divider on $S_\phi(\omega)$, square both sides of (82) and apply (70):

$$a_{\rm VCO}\,T_{\rm VCO} = \frac{a_{\rm FD}\,T_{\rm FD}}{N}. \tag{83}$$

$T_{\rm VCO} = T_{\rm FD}/N$, and so

$$a_{\rm VCO} = a_{\rm FD}. \tag{84}$$

From (72),

$$\frac{S_{\phi_{\rm VCO}}}{f_{\rm VCO}^2} = \frac{S_{\phi_{\rm FD}}}{f_{\rm FD}^2}. \tag{85}$$

Finally, $f_{\rm VCO} = N f_{\rm FD}$, and so

$$S_{\phi_{\rm VCO}} = N^2\,S_{\phi_{\rm FD}}. \tag{86}$$

Once FD$_N$ is incorporated into the VCO, the VCO output signal is no longer observable; however, the characteristics of the VCO output are easily derived from (82) and (86), which are summarized in Table II.

It is interesting to note that while the frequency at the output of FD$_N$ is $N$ times smaller than at the output of the VCO, except for scaling in the amplitude, the spectrum of the noise close to the fundamental is to a first degree unaffected by the presence of FD$_N$. In particular, the width of the noise spectrum is unaffected by FD$_N$. This is extremely fortuitous, because it means that the number of cycles we need to simulate is independent of the divide ratio $N$. Thus, large divide ratios do not affect the total simulation time.

TABLE II: CHARACTERISTICS OF VCO OUTPUT RELATIVE TO THE OUTPUT OF FD$_N$ ASSUMING THE VCO EXHIBITS SIMPLE ACCUMULATING JITTER AND THE FD$_N$ IS NOISE FREE.

Listing 14. Fixed-frequency oscillator with accumulating and synchronous jitter.

    `include "discipline.h"

    module osc (out);

    output out; electrical out;

    parameter real freq=1 from (0:inf);
    parameter real ratio=1 from (0:inf);
    parameter real Vlo=-1, Vhi=1;
    parameter real tt=0.01*ratio/freq from (0:inf);
    parameter real accJitter=0 from [0:0.1/freq);           // period jitter
    parameter real syncJitter=0 from [0:0.1*ratio/freq);    // edge-to-edge jitter

    integer n, accSeed, syncSeed;
    real next, dT, dt, accSD, syncSD;

    analog begin
        @(initial_step) begin
            accSeed = 286;
            syncSeed = -459;
            accSD = accJitter*sqrt(ratio/2);
            syncSD = syncJitter;
            next = 0.5/freq + $abstime;
        end

        @(timer(next + dt)) begin
            n = !n;
            dT = accSD*$dist_normal(accSeed,0,1);
            dt = syncSD*$dist_normal(syncSeed,0,1);
            next = next + 0.5*ratio/freq + dT;
        end

        V(out) <+ transition(n ? Vhi : Vlo, 0, tt);
    end
    endmodule

Listing 15. PFD/CP without jitter.

    `include "discipline.h"

    module pfd_cp (out, ref, vco);

    input ref, vco; output out; electrical ref, vco, out;

    parameter real Iout=100u;
    parameter integer dir=1 from [-1:1] exclude 0;
        // dir = 1 for positive edge trigger
        // dir = -1 for negative edge trigger
    parameter real tt=1n from (0:inf);
    parameter real ttol=1p from (0:inf);

    integer state;

    analog begin
        @(cross(V(ref), dir, ttol)) begin
            if (state > -1) state = state - 1;
        end
        @(cross(V(vco), dir, ttol)) begin
            if (state < 1) state = state + 1;
        end

        I(out) <+ transition(Iout * state, 0, tt);
    end
    endmodule

Listing 16. VCO with FD$_N$.

    `include "discipline.h"
    ...
        // bound the frequency (this is optional)
        if (freq > Fmax) freq = Fmax;
        if (freq < Fmin) freq = Fmin;

        // apply the frequency divider, add the phase noise
        freq = (freq / ratio)/(1 + dT * freq / ratio);

        // phase is the integral of the freq modulo 1
        phase = idtmod(freq, 0.0, 1.0, -0.5);

        // update jitter twice per period
        @(cross(phase - 0.25, +1, ttol)) begin
            dT = delta * $dist_normal(seed, 0, 1);
            Vout = Vhi;
        end
        @(cross(phase + 0.25, +1, ttol)) begin
            dT = delta * $dist_normal(seed, 0, 1);
            Vout = Vlo;
            if ($abstime >= outStart) $fstrobe(fp, "%0.10e", $abstime - prev);
            prev = $abstime;
        end
        V(out) <+ transition(Vout, 0, tt);
    end
    endmodule

The synthesizer is simulated using the netlist from Listing 18 and the Verilog-A descriptions in Listings 14-16, modifying them as necessary to fit the actual circuit. The simulation should cover an interval long enough to allow accurate Fourier analysis at the lowest frequency of interest ($F_{\min}$). With deterministic signals, it is sufficient to simulate for $K$ cycles after the PLL settles if $F_{\min} = 1/(TK)$. However, for these signals, which are stochastic, it is best to simulate for $10K$ to $100K$ cycles to allow for enough averaging to reduce the uncertainty in the result.

One should not simply apply an FFT to the output signal of the VCO/FD$_N$ to determine $\mathcal{L}(\Delta f)$ for the PLL. The result would be quite inaccurate because the FFT samples the waveform at evenly spaced points, and so misses the jitter of the transitions. Instead, $\mathcal{L}(\Delta f)$ can be measured with Spectre's Fourier Analyzer, which uses a unique algorithm that does accurately resolve the jitter [11]. However, it is slow if many frequencies are needed and so is not well suited to this application.

Unlike $\mathcal{L}(\Delta f)$, $S_\phi(\Delta f)$ can be computed efficiently. The Verilog-A code for the VCO/FD$_N$ given in Listing 16 writes the length of each period to an output file named periods.m. Writing the periods to the file begins after an initial delay, specified using outStart, to allow the PLL to reach steady state. This file is then processed by Matlab from MathWorks using the script shown in Listing 17. This script computes $S_\phi(\Delta f)$, the power spectral density of $\phi$, using Welch's method [28]. The frequency range is from $f_{\rm out}/2$ to $f_{\rm out}/{\rm nfft}$. The script computes $S_\phi(\Delta f)$ with a resolution bandwidth of rbw. Normally, $S_\phi(\Delta f)$ is given with a unity resolution bandwidth. To compensate for a non-unity resolution bandwidth, broadband signals such as the noise should be divided by rbw. Signals with bandwidth less than rbw, such as the spurs generated by leakage in the CP, should not be scaled. The script processes the output of VCO/FD$_N$. The results of the script must be further processed using the equations in Table II to remove the effect of FD$_N$.

Listing 17. Matlab script used for computing $S_\phi(\Delta f)$. These results must be further processed using Table II to map them to the output of the VCO.

    % Process period data to compute S_phi(delta f)
    echo off;
    nfft=512;          % should be power of two
    winLength=nfft;
    overlap=nfft/2;
    winNBW=1.5;        % Noise bandwidth given in bins

    % Load the data from the file generated by the VCO
    load periods.m;

    % output estimates of period and jitter
    T=mean(periods);
    J=std(periods);
    maxdT = max(abs(periods-T))/T;
    fprintf('T = %.3gs, F = %.3gHz\n', T, 1/T);
    fprintf('Jabs = %.3gs, Jrel = %.2g%%\n', J, 100*J/T);
    fprintf('max dT = %.2g%%\n', 100*maxdT);
    fprintf('periods = %d, nfft = %d\n', length(periods), nfft);

    % compute the cumulative phase of each transition
    phases=2*pi*cumsum(periods)/T;

    % compute power spectral density of phase
    [Sphi,f]=psd(phases,nfft,1/T,winLength,overlap,'linear');

    % correct for scaling in PSD due to FFT and window
    Sphi=winNBW*Sphi/nfft;

    % plot the results (except at DC)
    K = length(f);
    semilogx(f(2:K),10*log10(Sphi(2:K)));
    title('Power Spectral Density of VCO Phase');
    xlabel('Frequency (Hz)');
    ylabel('S phi (dB/Hz)');
    rbw = winNBW/(T*nfft);
    RBW=sprintf('Resolution Bandwidth = %.0f Hz (%.0f dB)', rbw, 10*log10(rbw));
    imtext(0.5,0.07, RBW);

XIV. EXAMPLE

These ideas were applied to model and simulate a PLL acting as a frequency synthesizer. A synthesizer was chosen with $f_{\rm ref}$ = 25 MHz, $f_{\rm out}$ = 2 GHz, and a channel spacing of 200 kHz. As such, $M$ = 125 and $N$ = 10,000.

The noise of OSC is -95 dBc/Hz at 100 kHz. Applying (74) to compute $a$, where $\mathcal{L}(\Delta f) = 316\times 10^{-12}$, $\Delta f$ = 100 kHz, and $f_o$ = 25 MHz, gives $a = 10^{-14}$. The period jitter $J$ is then computed from (70), giving $J$ = 20 ps.

The noise of VCO is -48 dBc/Hz at 100 kHz. Applying (74) and (70) with $\mathcal{L}(\Delta f) = 1.59\times 10^{-5}$, $\Delta f$ = 100 kHz, and $f_o$ = 2 GHz, gives $a = 7.9\times 10^{-14}$ and a period jitter of $J$ = 6.3 ps.

The period jitter of the PFD/CP and FDs was found to be 2 ns. The FDs were included into the oscillators, which suppresses the high frequency signals at the input and output of the synthesizer. The netlist is shown in Listing 18. The results (compensated for non-unity resolution bandwidth (-28 dB) and for the suppression of the dividers (80 dB)) are shown in Figures 17-20. The simulation took 7.5 minutes for 450k time-points on an HP 9000/735. The use of a large number of time points was motivated by the desire to reduce the level of uncertainty in the results. The period jitter in the PLL was found to be 9.8 ps at the output of the VCO.

Listing 18. Spectre netlist for PLL synthesizer.

    // PLL-based frequency synthesizer that models jitter
    simulator lang=spectre

    ahdl_include "osc.va"       // Listing 14
    ahdl_include "pfd_cp.va"    // Listing 15
    ahdl_include "vco.va"       // Listing 16

    Osc (in) osc freq=25MHz ratio=125 \
        accJitter=20ps syncJitter=2ns
    PFD (err in fb) pfd_cp Iout=500ua
    C1 (err c) capacitor c=3.125nF
    R (c 0) resistor r=10k
    C2 (c 0) capacitor c=625pF
    VCO (fb err) vco Fmin=1GHz Fmax=3GHz \
        Vmin=-4 Vmax=4 ratio=10000 \
        jitter=6ps outStart=10ms
    JitterSim tran stop=60ms

[Block diagram of the netlist: Osc & ÷125 drives node in, PFD & CP drives node err, VCO & ÷10,000 drives node fb.]
Fig. 17. Noise of the closed-loop PLL at the output of the VCO when only the reference oscillator exhibits jitter (CL) versus the noise of the reference oscillator mapped up to the VCO frequency when operated open loop (OL). (Horizontal axis: 300 Hz to 100 kHz.)

Fig. 18. Noise of the closed-loop PLL at the output of the VCO when only the VCO exhibits jitter (CL) versus the noise of the VCO when operated open loop (OL). (Horizontal axis: 300 Hz to 100 kHz.)

Fig. 20. Closed-loop PLL noise performance compared to the open-loop noise performance of the individual components that make up the PLL. The achieved noise is slightly larger than what is expected from the components due to peaking in the response of the PLL. (Horizontal axis: 300 Hz to 100 kHz.)

XV. CONCLUSION

A methodology for modeling and simulating the phase noise and jitter performance of phase-locked loops was presented. The simulation is done at the behavioral level, and so is efficient enough to be applied in a wide variety of applications. The behavioral models are calibrated from circuit-level noise simulations, and so the high-level simulations are accurate. Behavioral models were presented in the Verilog-A language; however, these same ideas can be used to develop behavioral models in purely event-driven languages such as Verilog-HDL and Verilog-AMS. This methodology is flexible enough to be used in a broad range of applications where phase noise and jitter are important.

REFERENCES
[5] A. Demir, E. Liu, A. Sangiovanni-Vincentelli, and I. Vassiliou. "Behavioral simulation techniques for phase/delay-locked systems." Proceedings of the IEEE Custom Integrated Circuits Conference, pp. 453-456, May 1994.
[6] A. Demir, E. Liu, and A. Sangiovanni-Vincentelli. "Time-domain non-Monte-Carlo noise simulation for nonlinear dynamic circuits with arbitrary excitations." IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 15, no. 5, pp. 493-505, May 1996.
[7] A. Demir, A. Sangiovanni-Vincentelli. "Simulation and modeling of phase noise in open-loop oscillators." Proceedings of the IEEE Custom Integrated Circuits Conference, pp. 445-456, May 1996.
[8] A. Demir, A. Sangiovanni-Vincentelli. Analysis and Simulation of Noise in Nonlinear Electronic Circuits and Systems. Kluwer Academic Publishers, 1997.
[9] Ken Kundert. "Modeling and simulation of jitter in phase-locked loops." In Analog Circuit Design: RF Analog-to-Digital Converters; Sensor and Actuator Interfaces; Low-Noise Oscillators, PLLs and Synthesizers, Rudy J. van de Plassche, Johan H. Huijsing, Willy M.C. Sansen, Kluwer Academic Publishers, November 1997.
[10] Ken Kundert. "Modeling and simulation of jitter in PLL frequency synthesizers." Available from www.designers-guide.com.
[11] Kenneth S. Kundert. The Designer's Guide to SPICE and Spectre. Kluwer Academic Publishers, 1995.
[12] Verilog-A Language Reference Manual: Analog Extensions to Verilog-HDL, version 1.0. Open Verilog International, 1996. Available from www.eda.org/verilog-ams.
[13] Ulrich L. Rohde. Digital PLL Frequency Synthesizers. Prentice-Hall, Inc., 1983.
[14] Paul R. Gray and Robert G. Meyer. Analysis and Design of Analog Integrated Circuits. John Wiley & Sons, 1992.
[15] A. Demir, A. Mehrotra, and J. Roychowdhury. "Phase noise in oscillators: a unifying theory and numerical methods for characterization." IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications, vol. 47, no. 5, pp. 655-674, May 2000.
[16] F. Kaertner. "Determination of the correlation spectrum of oscillators with low noise." IEEE Transactions on Microwave Theory and Techniques, vol. 37, no. 1, pp. 90-101, Jan. 1989.
[17] F. X. Kaertner. "Analysis of white and $f^{-\alpha}$ noise in oscillators." International Journal of Circuit Theory and Applications, vol. 18, pp. 485-519, 1990.
[18] G. Vendelin, A. Pavio, U. Rohde. Microwave Circuit Design. J. Wiley & Sons, 1990.
[19] W. Gardner. Introduction to Random Processes: With Applications to Signals and Systems. McGraw-Hill, 1989.
[20] Joel Phillips and Ken Kundert. "Noise in mixers, oscillators, samplers, and logic: an introduction to cyclostationary noise." Proceedings of the IEEE Custom Integrated Circuits Conference, CICC 2000. The paper and presentation are both available from www.designers-guide.com.
[21] T. A. D. Riley, M. A. Copeland, and T. A. Kwasniewski. "Delta-sigma modulation in fractional-N frequency synthesis." IEEE Journal of Solid-State Circuits, vol. 28, no. 5, pp. 553-559, May 1993.
[22] Frank Herzel and Behzad Razavi. "A study of oscillator jitter due to supply and substrate noise." IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, vol. 46, no. 1, pp. 56-62, Jan. 1999.
[23] T. C. Weigandt, B. Kim, and P. R. Gray. "Jitter in ring oscillators." 1994 IEEE International Symposium on Circuits and Systems (ISCAS-94), vol. 4, pp. 27-30, 1994.
[24] A. Papoulis. Probability, Random Variables, and Stochastic Processes. McGraw-Hill, 1991.
[25] J. J. Rael and A. A. Abidi. "Physical processes of phase noise in differential LC oscillators." Proceedings of the IEEE Custom Integrated Circuits Conference, CICC 2000.
[26] J. McNeill. "Jitter in Ring Oscillators." IEEE Journal of Solid-State Circuits, vol. 32, no. 6, June 1997.
[27] H. Chang, E. Charbon, U. Choudhury, A. Demir, E. Felt, E. Liu, E. Malavasi, A. Sangiovanni-Vincentelli, and I. Vassiliou. A Top-Down Constraint-Driven Methodology for Analog Integrated Circuits. Kluwer Academic Publishers, 1997.
[28] A. Oppenheim, R. Schafer. Digital Signal Processing. Prentice-Hall, 1975.
[29] F. Harris. "On the use of windows for harmonic analysis with the discrete Fourier transform." Proceedings of the IEEE, vol. 66, no. 1, January 1978.
IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 31, NO. 3, MARCH 1996 331
Abstract: This paper presents a study of phase noise in two inductorless CMOS oscillators. First-order analysis of a linear oscillatory system leads to a noise shaping function and a new definition of Q. A linear model of CMOS ring oscillators is used to calculate their phase noise, and three phase noise phenomena, namely, additive noise, high-frequency multiplicative noise, and low-frequency multiplicative noise, are identified and formulated. Based on the same concepts, a CMOS relaxation oscillator is also analyzed. Issues and techniques related to simulation of noise in the time domain are described, and two prototypes fabricated in a 0.5-µm CMOS technology are used to investigate the accuracy of the theoretical predictions. Compared with the measured results, the calculated phase noise values of a 2-GHz ring oscillator and a 900-MHz relaxation oscillator at 5 MHz offset have an error of approximately 4 dB.

Manuscript received October 30, 1995; revised December 17, 1995.
The author was with AT&T Bell Laboratories, Holmdel, NJ 07733 USA. He is now with Hewlett-Packard Laboratories, Palo Alto, CA 94304 USA.
Publisher Item Identifier S 0018-9200(96)02456-0.
0018-9200/96$05.00 © 1996 IEEE

I. INTRODUCTION

Voltage-controlled oscillators (VCO's) are an integral part of phase-locked loops, clock recovery circuits, and frequency synthesizers. Random fluctuations in the output frequency of VCO's, expressed in terms of jitter and phase noise, have a direct impact on the timing accuracy where phase alignment is required and on the signal-to-noise ratio where frequency translation is performed. In particular, RF oscillators employed in wireless transceivers must meet stringent phase noise requirements, typically mandating the use of passive LC tanks with a high quality factor (Q). However, the trend toward large-scale integration and low cost makes it desirable to implement oscillators monolithically. The paucity of literature on noise in such oscillators together with a lack of experimental verification of underlying theories has motivated this work.

This paper provides a study of phase noise in two inductorless CMOS VCO's. Following a first-order analysis of a linear oscillatory system and introducing a new definition of Q, we employ a linearized model of ring oscillators to obtain an estimate of their noise behavior. We also describe the limitations of the model, identify three mechanisms leading to phase noise, and use the same concepts to analyze a CMOS relaxation oscillator. In contrast to previous studies where time-domain jitter has been investigated [1], [2], our analysis is performed in the frequency domain to directly determine the phase noise. Experimental results obtained from a 2-GHz ring oscillator and a 900-MHz relaxation oscillator indicate that, despite many simplifying approximations, lack of accurate MOS models for RF operation, and the use of simple noise models, the analytical approach can predict the phase noise with approximately 4 to 6 dB of error.

The next section of this paper describes the effect of phase noise in wireless communications. In Section III, the concept of Q is investigated and in Section IV it is generalized through the analysis of a feedback oscillatory system. The resulting equations are then used in Section V to formulate the phase noise of ring oscillators with the aid of a linearized model. In Section VI, nonlinear effects are considered and three mechanisms of noise generation are described, and in Section VII, a CMOS relaxation oscillator is analyzed. In Section VIII, simulation issues and techniques are presented, and in Section IX the experimental results measured on the two prototypes are summarized.

II. PHASE NOISE IN WIRELESS COMMUNICATIONS

Phase noise is usually characterized in the frequency domain. For an ideal oscillator operating at $\omega_0$, the spectrum assumes the shape of an impulse, whereas for an actual oscillator, the spectrum exhibits "skirts" around the center or "carrier" frequency (Fig. 1). To quantify phase noise, we consider a unit bandwidth at an offset $\Delta\omega$ with respect to $\omega_0$, calculate the noise power in this bandwidth, and divide the result by the carrier power.

To understand the importance of phase noise in wireless communications, consider a generic transceiver as depicted in Fig. 2, where the receiver consists of a low-noise amplifier, a band-pass filter, and a downconversion mixer, and the transmitter comprises an upconversion mixer, a band-pass filter, and a power amplifier. The local oscillator (LO) providing the carrier signal for both mixers is embedded in a frequency synthesizer. If the LO output contains phase noise, both the downconverted and upconverted signals are corrupted. This is illustrated in Fig. 3(a) and (b) for the receive and transmit paths, respectively.

Referring to Fig. 3(a), we note that in the ideal case, the signal band of interest is convolved with an impulse and thus translated to a lower (and a higher) frequency with no change in its shape. In reality, however, the wanted signal may be accompanied by a large interferer in an adjacent channel, and the local oscillator exhibits finite phase noise. When the two signals are mixed with the LO output, the downconverted band consists of two overlapping spectra, with the wanted signal suffering from significant noise due to the tail of the interferer. This effect is called "reciprocal mixing."

Shown in Fig. 3(b), the effect of phase noise on the transmit path is slightly different. Suppose a noiseless receiver is to
Fig. 1. Phase noise in an oscillator.

[Figure residue: block labels for a low-noise amplifier, band-pass filters, a frequency synthesizer, an amplifier, and a frequency control input; the captions were not recovered.]
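The normalization described in Section II, noise power in a unit bandwidth at a given offset divided by the carrier power, is usually quoted in dBc/Hz. A minimal sketch of that conversion follows; the function name and the example powers are hypothetical and chosen only for illustration.

```python
import math


def phase_noise_dbc_per_hz(noise_power_1hz_w, carrier_power_w):
    """Noise power measured in a 1 Hz bandwidth at the chosen offset,
    relative to the carrier power, expressed in dBc/Hz."""
    if noise_power_1hz_w <= 0 or carrier_power_w <= 0:
        raise ValueError("powers must be positive")
    return 10.0 * math.log10(noise_power_1hz_w / carrier_power_w)


if __name__ == "__main__":
    # Hypothetical reading: 1 mW carrier, 1 pW of noise in 1 Hz at the offset.
    print(f"{phase_noise_dbc_per_hz(1e-12, 1e-3):.1f} dBc/Hz")
```

A noise-to-carrier ratio of one part in a billion per hertz, as in the hypothetical reading, corresponds to about -90 dBc/Hz.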
III. DEFINITIONS OF Q

The quality factor, Q, is usually defined within the context of second-order systems with (damped) oscillatory behavior. Illustrated in Fig. 5 are three common definitions of Q. For an RLC circuit, Q is defined as the ratio of the center frequency and the two-sided -3-dB bandwidth. However, if the inductor is removed, this definition cannot be applied. A more general definition is: $2\pi$ times the ratio of the stored energy and the dissipated energy per cycle, and can be measured by applying a step input and observing the decay of oscillations at the output. Again, if the circuit has no oscillatory behavior (e.g., contains no inductors), it is difficult to define "the energy dissipated per cycle." In a third definition, an LC oscillator is considered as a feedback system and the phase of the open-loop transfer function is examined at resonance. For a simple LC circuit such as that in Fig. 4, it can be easily shown that the Q of the tank is equal to $0.5\,\omega_0\,d\Phi/d\omega$, where $\omega_0$ is the resonance frequency and $d\Phi/d\omega$ denotes the slope of the phase of the transfer function with respect to frequency. Called the "open-loop Q" herein, this definition has an interesting interpretation if we recall that for steady oscillations, the total phase shift around the loop must be precisely 360°. Now, suppose the oscillation frequency slightly deviates from $\omega_0$. Then, if the phase slope is large, a significant change in the phase shift arises, violating the condition of oscillation and forcing the frequency to return to $\omega_0$. In other words, the open-loop Q is a measure of how much the closed-loop system opposes variations in the frequency of oscillation. This concept proves useful in our subsequent analyses.

While the third definition of Q seems particularly well-suited to oscillators, it does fail in certain cases. As an example, consider the two-integrator oscillator of Fig. 6, where the open-loop transfer function is simply

$$H(s) = -\left(\frac{\omega_0}{s}\right)^2 \tag{1}$$

yielding $\Phi = \angle H(s = j\omega) = 0$, and $Q = 0$. Since this circuit does indeed oscillate, this definition of Q is not useful here.

Fig. 5. Common definitions of Q. (Legible panel: (3) $Q = \dfrac{\omega_0}{2}\,\dfrac{d\Phi}{d\omega}$.)

Fig. 6. Two-integrator oscillator.

Fig. 7. Linear oscillatory system.

IV. LINEAR OSCILLATORY SYSTEM

Oscillator circuits in general entail "compressive" nonlinearity, fundamentally because the oscillation amplitude is not defined in a linear system. When a circuit begins to oscillate, the amplitude continues to grow until it is limited by some other mechanism. In typical configurations, the open-loop gain of the circuit drops at sufficiently large signal swings, thereby preventing further growth of the amplitude.

In this paper, we begin the analysis with a linear model. This approach is justified as follows. Suppose an oscillator employs strong automatic level control (ALC) such that its oscillation amplitude remains small, making the linear approximation
334 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 31, NO. 3, MARCH 1996
valid. Since the ALC can be relatively slow, the circuit parameters can be considered time-invariant for a large number of cycles. Now, let us gradually weaken the effect of ALC so that the oscillator experiences increasingly more “self-limiting.” Intuitively, we expect that the linear model yields reasonable accuracy for soft amplitude limiting and becomes gradually less accurate as the ALC is removed. Thus, the choice of this model depends on the error that it entails in predicting the response of the actual oscillator to various sources of noise, an issue that can be checked by simulation (Section VIII). While adequate for the cases considered here, this approximation must be carefully examined for other types of oscillators.

To analyze phase noise, we treat an oscillator as a feedback system and consider each noise source as an input (Fig. 7). The phase noise observed at the output is a function of: 1) sources of noise in the circuit and 2) how much the feedback system rejects (or amplifies) various noise components. The system oscillates at $\omega = \omega_0$ if the transfer function goes to infinity at this frequency, i.e., if $H(j\omega_0) = -1$. For frequencies close to the carrier, $\omega = \omega_0 + \Delta\omega$, the open-loop transfer function can be approximated as

spectral density is shaped by

This is illustrated in Fig. 8. As we will see later, (6) assumes a simple form for ring oscillators.

To gain more insight, let $H(j\omega) = A(\omega) \exp[j\Phi(\omega)]$, and hence

(7)

Since for $\omega \approx \omega_0$, $A = 1$, (6) can be written as

We define the open-loop Q as

Combining (8) and (9) yields
Fig. 9. Oscillatory system with nonunity-gain feedback. Fig. 11. Linearized model of CMOS VCO.
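The noise-shaping argument of the linear model can be checked numerically. The sketch below is only an illustration under stated assumptions: it assumes a unity-gain negative-feedback loop, Y/X = H/(1 + H), a hypothetical three-pole open-loop response chosen so that H(jω0) = -1, and an open-loop Q taken as (ω0/2)|dH/dω|, which reduces to the 0.5 ω0 dΦ/dω form of Section III when the magnitude slope is zero. The names `W_P`, `A0`, `H`, and the helper functions are not from the paper.

```python
import math

W_P = 1.0                     # stage pole frequency (normalized, hypothetical)
A0 = 2.0                      # per-stage gain chosen so that |H(j*W0)| = 1
W0 = math.sqrt(3.0) * W_P     # frequency at which the loop phase reaches 180 degrees


def H(w):
    """Hypothetical three-pole open-loop response with H(j*W0) = -1."""
    return A0 ** 3 / (1 + 1j * w / W_P) ** 3


def closed_loop(w):
    """Assumed unity-gain negative feedback: Y/X = H / (1 + H)."""
    return H(w) / (1 + H(w))


def dH_dw(w, h=1e-6):
    """Central-difference estimate of dH/dw."""
    return (H(w + h) - H(w - h)) / (2 * h)


def open_loop_q():
    """Assumed open-loop Q = (W0 / 2) * |dH/dw| evaluated at W0."""
    return 0.5 * W0 * abs(dH_dw(W0))


def shaping_exact(dw):
    """|Y/X|^2 at an offset dw from W0, evaluated directly."""
    return abs(closed_loop(W0 + dw)) ** 2


def shaping_approx(dw):
    """First-order estimate: (W0 / dw)^2 / (4 Q^2)."""
    return (W0 / dw) ** 2 / (4 * open_loop_q() ** 2)


if __name__ == "__main__":
    dw = 1e-3 * W0
    print("open-loop Q        :", open_loop_q())
    print("exact |Y/X|^2      :", shaping_exact(dw))
    print("first-order |Y/X|^2:", shaping_approx(dw))
```

For small offsets the two estimates agree closely, which is the sense in which a larger open-loop Q corresponds to stronger rejection of noise near the carrier in this model.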
$$V_{out1}(t) \propto \alpha_2 A_0 A_n \cos(\omega_0 \pm \omega_n)t$$
$$V_{out2}(t) \propto \alpha_3 A_0 A_n^2 \cos(\omega_0 - 2\omega_n)t$$
$$V_{out3}(t) \propto \alpha_3 A_0^2 A_n \cos(2\omega_0 - \omega_n)t.$$

Note that $V_{out1}(t)$ appears in band if $\omega_n$ is small, i.e., if it is a low-frequency component, but in a fully differential configuration, $V_{out1}(t) = 0$ because $\alpha_2 = 0$. Also, $V_{out2}(t)$ is negligible because $A_n \ll A_0$, leaving $V_{out3}(t)$ as the only significant cross-product.

This simplified one-stage analysis predicts the frequency of the components in response to injected noise, but not their magnitude. When noise is injected into the oscillator, the magnitude of the observed response at $\omega_n$ and $2\omega_0 - \omega_n$ depends on the noise shaping properties of the feedback

C. Low-Frequency Multiplicative Noise

Since the frequency of oscillation in Fig. 10 is a function of the tail current in each differential pair, noise components in this current modulate the frequency, thereby contributing phase noise [classical frequency modulation (FM)]. Depicted in Fig. 13, this effect can be significant because, in CMOS oscillators, $\omega_0$ must be adjustable by more than ±20% to compensate for process variations, thus making the frequency quite sensitive to noise in the tail current. This mechanism is illustrated in Fig. 14.

To quantify this phenomenon, we find the sensitivity or “gain” of the VCO, defined as $K_{VCO} = d\omega_{out}/dI_{SS}$ in Fig. 13, and use a simple approximation. If the noise per unit bandwidth in $I_{SS}$ is represented as a sinusoid with the same
RAZAVI: A STUDY OF PHASE NOISE IN CMOS OSCILLATORS 331
Fig. 14. Low-frequency multiplicative noise.
power: $I_n \cos \omega_n t$, then the output signal of the oscillator can be written as

Since $K_{VCO}$ can be easily evaluated in simulation or measurement, (20) is readily calculated.

It is seen that modulation of the carrier brings the low-frequency noise components of the tail current to the band around $\omega_0$. Thus, flicker noise in $I_n$ becomes particularly important.

In the differential stage of Fig. 3(b), two sources of low-frequency multiplicative noise can be identified: noise in $I_{SS}$ and noise in $M_5$ and $M_6$. For comparable device size, these two sources are of the same order and must be both taken into account.

Fig. 15. Gain stage with (a) stationary and (b) cyclostationary noise.

Fig. 16. Addition of output voltages of N oscillators.

D. Cyclostationary Noise Sources

As mentioned previously, the devices in the signal path exhibit cyclostationary noise behavior, requiring the use of periodically varying noise statistics in analysis and simulations. To check the accuracy of the stationary noise approximation, we perform a simple, first-order simulation on the two cases depicted in Fig. 15. In Fig. 15(a), a sinusoidal current source with an amplitude of 2 nA is connected between the drain and source of $M_1$ to represent its noise with the assumption that $M_1$ carries half of $I_{SS}$. In Fig. 15(b), the current source is also a sinusoid, but its amplitude is a function of the drain current of $M_1$. Since MOS thermal noise current (in the saturation region) is proportional to $\sqrt{g_m}$, we use a nonlinear dependent source in SPICE [7] as $I_n(t) = \alpha \sqrt{g_m} \sin \omega_n t$, where $\omega_n = 2\pi \times 980$ MHz. The factor $\alpha$ is chosen such that $I_n(t) = 2\ \text{nA} \times \sin \omega_n t$ when $I_{D1}(t) = I_{SS}/2$ (balanced condition). Simulations indicate that the sideband magnitudes in the two cases differ by less than 0.5 dB.

It is important to note that this result may not be accurate for other types of oscillators.

E. Power-Noise Trade-off

As with other analog circuits, oscillators exhibit a trade-off between power dissipation and noise. Intuitively, we note that if the output voltages of N identical oscillators are added in phase (Fig. 16), then the total carrier power is multiplied by $N^2$, whereas the noise power increases by N (assuming noise sources of different oscillators are uncorrelated). Thus, the phase noise (relative to the carrier) decreases by a factor N at the cost of a proportional increase in power dissipation.

Using the equations developed above, we can also formulate this trade-off. For example, from (16), since $G_m R \approx 2$, we have

To reduce the total noise power by N, $G_m$ must increase by the same factor. For any active device, this can be accomplished by increasing the width and the bias current by N. (To maintain the same frequency of oscillation, the load resistor is reduced by N.) Therefore, for a constant supply voltage, the power dissipation scales up by N.
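The intuitive argument for the power-noise trade-off (N identical oscillators added in phase) can be written as a short calculation. This is a minimal sketch; the function name and the example value of -100 dBc/Hz are hypothetical and serve only to illustrate the scaling.

```python
import math


def combined_phase_noise_dbc(single_dbc: float, n: int) -> float:
    """Relative phase noise when n identical oscillators are summed in phase.

    Carrier voltages add coherently, so carrier power scales by n**2.
    Uncorrelated noise powers add, so noise power scales by n.
    The noise-to-carrier ratio therefore improves by a factor n.
    """
    carrier_gain = n ** 2
    noise_gain = n
    return single_dbc + 10 * math.log10(noise_gain / carrier_gain)


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        # Hypothetical single-oscillator phase noise of -100 dBc/Hz.
        print(n, round(combined_phase_noise_dbc(-100.0, n), 2))
```

Each doubling of N (and hence of power dissipation) buys about 3 dB of phase noise under these assumptions.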
TABLE I
COMPARISON OF THREE-STAGE AND FOUR-STAGE RING OSCILLATORS

                     3-Stage VCO    4-Stage VCO
Power Dissipation    1.8 mW         3.6 mW

Fig. 17. Substrate and supply noise in gain stage.
supply and substrate noise (Fig. 17). Two phenomena account for this. First, device mismatches degrade the symmetry of the circuit. Second, the total capacitance at the common source of the differential pair (i.e., the source junction capacitance of $M_1$ and $M_2$ and the capacitance associated with the tail current source) converts the supply and substrate noise to current, thereby modulating the delay of the gain stage. Simulations indicate that even if the tail current source has a high dc output impedance, a 1-mV$_{pp}$ supply noise component at 10 MHz generates sidebands 60 dB below the carrier at $\omega_0 \pm (2\pi \times 10\ \text{MHz})$.

After lengthy calculations, we have

and

This assumption is justified by decomposing C into two series capacitors, each one of value 2C, and monitoring the midpoint voltage. The common-mode swing at this node is approximately 18 dB below the differential swings at the source of $M_1$ and $M_2$.
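The supply-noise figure quoted above (sidebands 60 dB below the carrier for a 10-MHz tone) can be translated into an equivalent frequency deviation. The sketch below assumes that the sidebands come purely from small-index frequency modulation, for which each sideband sits at β/2 relative to the carrier, and that “60 dB below the carrier” applies to each sideband. Both are assumptions for illustration, and the function names are hypothetical.

```python
def fm_index_from_sideband(sideband_dbc: float) -> float:
    """Modulation index beta for narrowband FM, given one sideband level in dBc.

    For small beta, each sideband amplitude is beta/2 of the carrier amplitude.
    """
    return 2.0 * 10.0 ** (sideband_dbc / 20.0)


def peak_deviation_hz(sideband_dbc: float, fm_hz: float) -> float:
    """Peak frequency deviation = beta * modulating frequency."""
    return fm_index_from_sideband(sideband_dbc) * fm_hz


if __name__ == "__main__":
    beta = fm_index_from_sideband(-60.0)
    print("modulation index (rad):", beta)
    print("peak deviation (Hz)   :", peak_deviation_hz(-60.0, 10e6))
```

Under these assumptions, the 1-mV peak-to-peak supply tone corresponds to a peak frequency deviation of roughly 20 kHz.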
Fig. 20. Simple simulation revealing effect of pulse waveforms, (a) single sinusoidal source and (b) sinusoidal source along with a square wave generator.

of 30 ps for 8 µs, and the output was processed in MATLAB to obtain the spectrum. Since simulations of the linear model yield identical results to the equations derived above, we will not distinguish between the two hereafter.

Shown in Fig. 22 are the output spectra of the linear model and actual circuit of a three-stage oscillator in 0.5-µm CMOS technology with a 2-nA, 980-MHz sinusoidal current injected into the signal path (the drain of one of the differential pairs).

IX. EXPERIMENTAL RESULTS

A. Measurements

Two different oscillator configurations have been fabricated in a 0.5-µm CMOS technology to compare the predictions in this paper with measured results. Note that there are three sets of results: theoretical calculations based on linear models but including multiplicative noise, simulated predictions based on the actual CMOS oscillators, and measured values.

The first circuit is a 2.2-GHz three-stage ring oscillator. Fig. 24 shows one stage of the circuit along with the measured device parameters. The sensitivity of the output frequency to the tail current of each stage is about 0.43 MHz/µA. The measured spectrum is depicted in Fig. 25(a) and (b) with two different horizontal scales. Due to lack of data on the flicker noise of the process, we consider only thermal noise at relatively large frequency offsets, namely, 1 MHz and 5 MHz.

It is important to note that low-frequency flicker noise causes the center of the spectrum to fluctuate constantly. Thus, as the resolution bandwidth (RBW) of the spectrum analyzer is reduced [from 1 MHz in Fig. 25(a) to 100 kHz in Fig. 25(b)], the carrier power is subject to more averaging and appears to decrease. To maintain consistency with calculations, in which
Fig. 22. Simulated output spectra of (a) linear model and (b) actual circuit of a three-stage CMOS oscillator.

Fig. 23. Simulated output spectra of (a) linear model and (b) actual circuit of a four-stage CMOS oscillator.

Fig. 25. Measured output spectrum of ring oscillator (10 dB/div. vertical scale). (a) 5 MHz/div. horizontal scale and 1 MHz resolution bandwidth, (b) 1 MHz/div. horizontal scale and 100 kHz resolution bandwidth.
Correspondence

Corrections to “A General Theory of Phase Noise in Electrical Oscillators”¹

However, note that the discussion following (29) is still valid. The factor $c_0^2/2\Gamma_{rms}^2$ should be changed to $(c_0/2\Gamma_{rms})^2$ in the following instances:
1) p. 185, second column, last paragraph;
2) p. 190, second column, first paragraph;
3) p. 190, second column, second paragraph.
Nevertheless, the expression used to calculate the $\Gamma_{rms}$ to predict phase noise of ring oscillators is based on a simulation that takes this effect into account automatically, and therefore the predictions are still valid. The authors regret any confusion this error may have caused.

Manuscript received February 27, 1998.
The authors are with the Center for Integrated Systems, Stanford University, Stanford, CA 94305-4070 USA.
Publisher Item Identifier S 0018-9200(98)03730-5.
¹ A. Hajimiri and T. H. Lee, IEEE J. Solid-State Circuits, vol. 33, pp. 179–194, Feb. 1998.

Comments on “A 64-Point Fourier Transform Chip for Video Motion Compensation Using Phase Correlation”¹

described below.

II. MATRIX TRANSPOSERS AND BIT REVERSERS

Block serial/parallel or parallel/serial converters, sometimes called matrix transpose or corner turn buffers, are used in many systems. They perform a matrix transpose on data blocks by exchanging rows and columns. Fig. 2 (from [7]) shows, from upper left to lower right, the flow of data through a 4 × 4 shift-based transposer. The rotator lines show where data will be routed on the next clock cycle and the output is the transpose of the input. The switching action was noted in [7] and [8] and rotator designs can be found in [4], [7], and [8]. Although Fig. 6(b)¹ is also an 8 × 8 transposer, it is being used in a somewhat unusual way. By providing a complex (real and

Manuscript received January 31, 1997; revised March 5, 1998.
The author was with the Naval Undersea Warfare Center, Newport, RI 02841 USA. He is now at 33 Everett Street, Newport, RI 02840 USA.
Publisher Item Identifier S 0018-9200(98)03731-7.
¹ C. C. W. Hui, T. J. Ding, J. V. McCanny, and R. F. Woods, IEEE J. Solid-State Circuits, vol. 31, pp. 1751–1761, Nov. 1996.
Abstract: A companion analysis of clock jitter and phase noise of single-ended and differential ring oscillators is presented. The impulse sensitivity functions are used to derive expressions for the jitter and phase noise of ring oscillators. The effect of the number of stages, power dissipation, frequency of oscillation, and short-channel effects on the jitter and phase noise of ring oscillators is analyzed. Jitter and phase noise due to substrate and supply noise is discussed, and the effect of symmetry on the upconversion of 1/f noise is demonstrated. Several new design insights are given for low jitter/phase-noise design. Good agreement between theory and measurements is observed.

Index Terms: Design methodology, jitter, noise measurement, oscillator noise, oscillator stability, phase jitter, phase-locked loops, phase noise, ring oscillators, voltage-controlled oscillators.

I. INTRODUCTION

Due to their integrated nature, ring oscillators have become an essential building block in many digital and communication systems. They are used as voltage-controlled oscillators (VCO’s) in applications such as clock recovery circuits for serial data communications [1]–[4], disk-drive read channels [5], [6], on-chip clock distribution [7]–[10], and integrated frequency synthesizers [10], [11]. Although they have not found many applications in radio frequency (RF), they can be used for some low-tier RF systems.

Recently, there has been some work on modeling jitter and phase noise in ring oscillators. References [12] and [13] develop models for the clock jitter based on time-domain treatments for MOS and bipolar differential ring oscillators, respectively. Reference [14] proposes a frequency-domain approach to find the phase noise based on a linear time-invariant model for differential ring oscillators with a small number of stages.

In this paper, we develop a parallel treatment of frequency-domain phase noise [15] and time-domain clock jitter for ring oscillators. We apply the phase-noise model presented in [16] to obtain general expressions for jitter and phase noise of the ring oscillators.

The next section briefly reviews the phase-noise model presented in [16]. In Section III, we apply the model to timing jitter and develop an expression for the timing jitter of oscillators, while Section IV provides the derivation of a closed-form expression to calculate the rms value of the impulse sensitivity function (ISF). Section V introduces expressions for jitter and phase noise in single-ended and differential ring oscillators in long- and short-channel regimes of operation. Section VI describes the effect of substrate and supply noise as well as the noise due to the tail-current source in differential structures. Section VII explains the design insights obtained from this treatment for low jitter/phase-noise design. Section VIII summarizes the measurement results.

II. PHASE NOISE

The output of a practical oscillator can be written as

$$V_{out}(t) = A(t)\, f[\omega_0 t + \phi(t)] \qquad (1)$$

where the function $f$ is periodic in $2\pi$ and $A(t)$ and $\phi(t)$ model fluctuations in amplitude and phase due to internal and external noise sources. The amplitude fluctuations are significantly attenuated by the amplitude limiting mechanism, which is present in any practical stable oscillator and is particularly strong in ring oscillators. Therefore, we will focus on phase variations, which are not quenched by such a restoring mechanism.

As an example, consider the single-ended ring oscillator with a single current source on one of the nodes shown in Fig. 1. Suppose that the current source consists of an impulse of current with area $\Delta q$ (in coulombs) occurring at time $t = \tau$. This will cause an instantaneous change in the voltage of that node, given by

$$\Delta V = \frac{\Delta q}{C_{node}} \qquad (2)$$

where $C_{node}$ is the effective capacitance on that node at the time of charge injection. This produces a shift in the transition time. For small $\Delta V$, the change in the phase is proportional to the injected charge

$$\Delta\phi = \Gamma(\omega_0 \tau)\, \frac{\Delta V}{V_{swing}} = \Gamma(\omega_0 \tau)\, \frac{\Delta q}{q_{max}} \qquad (3)$$

where $V_{swing}$ is the voltage swing across the capacitor and $q_{max} = C_{node} V_{swing}$. The dimensionless function $\Gamma(x)$ is the time-varying proportionality constant and is periodic in $2\pi$. It is large when a given perturbation causes a large phase shift and small where it has a small effect [16]. Since $\Gamma(x)$ thus represents the sensitivity of every point of the waveform to a perturbation, $\Gamma(x)$ is called the impulse sensitivity function.

The time dependence of the ISF can be demonstrated by considering two extreme cases. The first is when the impulse is injected during a transition; this will result in a large phase shift. As the other case, consider injecting an impulse while the output is saturated to either the supply or the ground. This impulse will have a minimal effect on the phase of the oscillator, as shown in Fig. 2.

Manuscript received April 8, 1998; revised November 2, 1998.
A. Hajimiri is with the California Institute of Technology, Pasadena, CA 91125 USA.
S. Limotyrakis and T. H. Lee are with the Center for Integrated Systems, Stanford University, Stanford, CA 94305 USA.
Publisher Item Identifier S 0018-9200(99)04200-6.
0018–9200/99$10.00 1999 IEEE
HAJIMIRI et al.: JITTER AND PHASE NOISE IN RING OSCILLATORS 791
Being interested in its phase $\phi(t)$, we can treat an oscillator as a system that converts voltages and currents to phase. As is evident from the discussion leading to (3), this system is linear for small perturbations. It is also time variant, no matter how small the perturbations are.

Unlike amplitude changes, phase shifts persist indefinitely, since subsequent transitions are shifted by the same amount. Thus, the phase impulse response of an oscillator is a time-varying step. Also note that as long as the introduced change in the voltage due to the current impulse is small, the resultant phase shift is linearly proportional to the injected charge, and hence the transfer function from current to phase is linear.

The unit impulse response of the system is defined as the amount of phase shift per unit current impulse [16]. Based on the foregoing argument, we obtain the following time-dependent impulse response:

(4)

where $u(t)$ is a unit step.

Knowing the response to an impulse, we can calculate $\phi(t)$ in response to any injected current using the superposition integral

(5)
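The impulse-to-phase picture described above can be sketched numerically. The code below is only an illustration under stated assumptions: it takes the phase step caused by a charge impulse as Γ(ω0τ)·Δq/q_max, uses a purely sinusoidal ISF as a hypothetical stand-in for a real oscillator's ISF, and evaluates the superposition integral with a midpoint Riemann sum. All function names and numerical values are hypothetical.

```python
import math


def isf(x: float) -> float:
    """Hypothetical ISF: a pure sinusoid, periodic in 2*pi (illustration only)."""
    return math.sin(x)


def phase_step(dq: float, tau: float, w0: float, q_max: float) -> float:
    """Assumed phase shift caused by a charge impulse dq injected at time tau."""
    return isf(w0 * tau) * dq / q_max


def excess_phase(current, t_end: float, w0: float, q_max: float,
                 steps: int = 20000) -> float:
    """Midpoint-rule estimate of (1/q_max) * integral of isf(w0*tau) * i(tau)."""
    dt = t_end / steps
    total = 0.0
    for k in range(steps):
        tau = (k + 0.5) * dt
        total += isf(w0 * tau) * current(tau) * dt
    return total / q_max


if __name__ == "__main__":
    f0 = 1e9                      # hypothetical oscillation frequency (Hz)
    w0 = 2 * math.pi * f0
    period = 1 / f0
    q_max = 1e-13                 # hypothetical maximum charge swing (C)

    # Same impulse, two injection times: peak and zero of the assumed ISF.
    print(phase_step(1e-16, period / 4, w0, q_max))
    print(phase_step(1e-16, 0.0, w0, q_max))

    # Current at the oscillation frequency accumulates phase over one period;
    # a dc current does not, because this assumed ISF has zero average.
    print(excess_phase(lambda t: 1e-6 * math.sin(w0 * t), period, w0, q_max))
    print(excess_phase(lambda t: 1e-6, period, w0, q_max))
```

The contrast between the two injection times mirrors the two extreme cases discussed for Fig. 2: the same charge produces a large phase shift where the ISF is large and almost none where it is near zero.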
792 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 34, NO. 6, JUNE 1999
(6)
(7)
Fig. 5. ISF for ring oscillators of the same frequency with different number of stages.
(10)
(11)
Fig. 6. Approximate waveform and ISF for ring oscillator.
Using (10) and (11), the proportionality constant in (8) is calculated to be
(12)
Fig. 8. RMS values of the ISF’s for various single-ended ring oscillators versus number of stages.
On the other hand, stage delay is proportional to the rise time

(14)

where the normalized stage delay is related to the rise time through a proportionality constant $\eta$, which is typically close to one, as can be seen in Fig. 7.

The period is $2N$ times longer than a single stage delay

(15)

Using (13) and (15), the following approximate expression for $\Gamma_{rms}$ is obtained:

(16)

Note that the $1/N^{1.5}$ dependence of $\Gamma_{rms}$ is independent of the value of $\eta$. Fig. 8 illustrates $\Gamma_{rms}$ for the ISF shown in Fig. 5 with plus signs on log–log axes. The solid line shows the fit obtained from (16). To verify the generality of (16), we maintain a fixed channel length for all the devices in the inverters while varying the number of stages to allow different frequencies of oscillation. Again, $\Gamma_{rms}$ is calculated, and is shown in Fig. 8 with circles. We also repeat the first experiment with a different supply voltage (3 V as opposed to 5 V), and the result is shown with crosses. As can be seen, the values of $\Gamma_{rms}$ are almost identical for these three cases.

It should not be surprising that $\Gamma_{rms}$ is primarily a function of $N$ because the effects of variations in other parameters, such as $q_{max}$ and device noise, have already been decoupled from $\Gamma(x)$, and thus the ISF is a unitless, frequency- and amplitude-independent function.

Equation (16) is valid for differential ring oscillators as well, since in its derivation no assumption specific to single-ended oscillators was made. Fig. 9 shows the $\Gamma_{rms}$ for three sets of differential ring oscillators, with a varying number of stages (4–16). The data shown with plus signs correspond to oscillators in which the total power dissipation and the drain voltage swing are kept constant by scaling the tail-current sources and load resistors as $N$ changes. Members of the second set of oscillators have a fixed total power dissipation and fixed load resistors, which result in variable swings and for which data are shown with circles. The third case is that of a fixed tail current for each stage and constant load resistors, whose data are illustrated using crosses. Again, in spite of the diverse variations of the frequency and other circuit parameters, the $1/N^{1.5}$ dependency of $\Gamma_{rms}$ and its independence from other circuit parameters still holds. In the case of a differential ring oscillator, the best-fit approximation for $\Gamma_{rms}$ is shown with the solid line in Fig. 9. A similar result can be obtained for bipolar differential ring oscillators.

Although $\Gamma_{rms}$ decreases as the number of stages increases, one should not prematurely conclude that the phase noise can be reduced using a larger number of stages because the number of noise sources, as well as their magnitudes, also increases for a given total power dissipation and frequency of oscillation.

In the case of asymmetric rising and falling edges, both $\Gamma_{rms}$ and $\Gamma_{dc}$ will change. As shown in Appendix B, the $1/f^3$ corner of the phase-noise spectrum is inversely proportional to the number of stages. Therefore, the $1/f^3$ corner can be reduced either by making the transitions more symmetric in terms of rise and fall times or by increasing the number of stages. Although the former always helps, the latter has other implications on the phase noise in the $1/f^2$ region, as will be shown in the following section.
Fig. 9. RMS values of the ISF’s for various differential ring oscillators versus number of stages.
times the value given by (6). Taking only these inevitable noise sources into account, (6), (16), (18), (21), and (22) result in the following expressions for phase noise and jitter:

(23)

(24)

where $V_{char}$ is the characteristic voltage of the device. For long-channel mode of operation, it is defined as

Any extra disturbance, such as substrate and supply noise, or noise contributed by extra circuitry or asymmetry in the waveform will result in a larger number than (23) and (24). Note that lowering threshold voltages reduces the phase noise, in agreement with [12]. Therefore, the minimum achievable phase noise and jitter for a single-ended CMOS ring oscillator, assuming that all symmetry criteria are met, occurs for zero threshold voltage

(25)

(28)

The frequency of oscillation can be approximated by

(29)

Using (28) and (29), we obtain the same expressions for phase noise and jitter as given by (23) and (24), except for a new $V_{char}$

(30)

which results in a larger phase noise and jitter than in the long-channel case. Again, note the absence of any dependency on the number of stages.

B. Differential CMOS Ring Oscillators

Now consider a differential MOS ring oscillator with resistive load. The total power dissipation is

$$P = N I_{tail} V_{DD} \qquad (31)$$

where $N$ is the number of stages, $I_{tail}$ is the tail bias current of the differential pair, and $V_{DD}$ is the supply voltage. The frequency of oscillation can be approximated by

(32)

(35)

Equations (34) and (35) are valid in both long- and short-channel regimes of operation with the right choice of $V_{char}$.

Note that, in contrast with the single-ended ring oscillator, a differential oscillator does exhibit a phase noise and jitter dependency on the number of stages, with the phase noise
(36)
Therefore, the uncertainties add up in amplitude rather than power, resulting in a region with a slope of one in the log–log plot of jitter even in the absence of external noise sources such as substrate and supply noise.

VII. DESIGN IMPLICATIONS

One can use (23) and (34) to compare the phase-noise performance of single-ended and differential MOS ring oscillators. As can be seen for $N$ stages, the phase noise of the differential ring oscillator is larger than that of a single-ended oscillator of equal power dissipation and oscillation frequency by a factor that grows with $N$. Since the minimum $N$ for a regular ring oscillator is three, even a properly designed differential CMOS ring oscillator underperforms its single-ended counterpart, especially for a larger number of stages. This difference is even more pronounced if proper precautions to reduce the noise of the tail current are not taken. However, the differential ring oscillator may still be preferred in IC’s because of the lower sensitivity to substrate and supply noise, as well as lower noise injection into other circuits on the same chip. The decision to use differential versus single-ended ring oscillators should be based on both of these considerations.

The common-mode sensitivity problem in a single-ended ring oscillator can be mitigated to some extent by using two identical ring oscillators laid out close to each other that oscillate out of phase because of small coupling inverters [19]. Single-ended configurations may be used in a less noisy environment to achieve better phase-noise performance for a given power dissipation.

As shown in Appendix B, asymmetry of the rising and falling edges degrades phase noise and jitter by increasing the $1/f^3$ corner frequency. Thus, every effort should be taken to make the rising and falling edges symmetric. By properly adjusting the symmetry properties, one can suppress or even eliminate low-frequency-noise upconversion [16]. As shown in [16], differential symmetry is insufficient, and the symmetry of each half circuit is important. One practical method to achieve this symmetry is to use more linear loads, such as resistors or linearized MOS devices. This method reduces the 1/f noise upconversion and substrate and supply coupling [20]. Another revealing implication, shown in Appendix A, is the reduction of the $1/f^3$ corner frequency as $N$ increases. Hence for a process with large 1/f noise, a larger number of stages may be helpful.

One question that frequently arises in the design of ring oscillators is the optimum number of stages for minimum jitter and phase noise. As seen in (23), for single-ended CMOS ring oscillators, the phase noise and jitter in the $1/f^2$ region are not a strong function of the number of stages. However, if the symmetry criteria are not well satisfied and/or the process has a large 1/f noise, a larger $N$ will reduce the jitter. In general, the choice of the number of stages must be made on the basis of several design criteria, such as 1/f noise effect, the desired maximum frequency of oscillation, and the influence of external noise sources, such as supply and substrate noise, that may not scale with $N$.

The jitter and phase noise behavior are different for differential ring oscillators. As (34) suggests, jitter and phase noise increase with an increasing number of stages. Hence if the 1/f noise corner is not large, and/or proper symmetry measures have been taken, the minimum number of stages (three or four) should be used to give the best performance. This recommendation holds even if the power dissipation is not a primary issue. It is not fair to argue that burning more power in a larger number of stages allows the achievement of better phase noise, since dissipating the same total power in a smaller number of stages results in better jitter/phase noise as long as it is possible to maximize the total charge swing.

Another insight one can obtain from (34) and (35) is that the jitter of a MOS differential ring oscillator at a given power dissipation and oscillation frequency is smaller than that of a differential bipolar ring oscillator, at least for today’s range of circuit and process parameters. As we go to shorter channel lengths, the characteristic voltage for the MOS devices given by (30) becomes smaller, and thus phase noise degrades with scaling. Bipolar ring oscillators do not suffer from this problem.

LC oscillators generally have better phase noise and jitter compared to ring oscillators for two reasons. First, a ring oscillator stores a certain amount of energy in the capacitors during every cycle and then dissipates all the stored energy during the same cycle, while an LC resonator dissipates only $2\pi/Q$ of the total energy stored during one cycle. Thus, for a given power dissipation in steady state, a ring oscillator suffers from a smaller maximum charge swing $q_{max}$. Second, in a ring oscillator, the device noise is maximum during the transitions, which is the time where the sensitivity, and hence the ISF, is the largest [16].

VIII. EXPERIMENTAL RESULTS

The phase-noise measurements in this section were performed using three different systems: an HP 8563E spectrum analyzer with phase-noise measurement capability, an RDL NTS-1000A phase-noise measurement system, and an HP E5500 phase-noise measurement system. The jitter measurements were performed using a Tektronix CSA 803A communication signal analyzer.

Tables I–III summarize the phase-noise measurements. All the reported phase-noise values are at a 1-MHz offset from the carrier, chosen to achieve the largest dynamic range in the measurement. Table I shows the measurement results for three different inverter-chain ring oscillators. These oscillators are made of the CMOS inverters shown in Fig. 12(a), with no frequency tuning mechanism. The output is taken from one node of the ring through a few stages of tapered inverters. Oscillators number 1 and 2 are fabricated in a 2-µm 5-V CMOS process, and oscillator number 3 is fabricated in a 0.25-µm 2.5-V process. The second column shows the number of stages in each of the oscillators. The W/L ratios of the NMOS and PMOS devices, as well as the supply voltages, the total measured supply currents, and the frequencies of oscillation are shown next. The phase-noise prediction using (23) and (6), together with the measured phase noise, are shown in the last three columns.
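The first reason given in Section VII for the advantage of LC oscillators (a ring oscillator dissipates all of its stored energy each cycle, while an LC resonator dissipates only 2π/Q of it) can be put into numbers. The sketch below is a back-of-the-envelope illustration; the function names and the example values (10 mW, 1 GHz, Q = 10) are hypothetical and are not measurements from the paper.

```python
import math


def ring_stored_energy(power_w: float, f0_hz: float) -> float:
    """Ring oscillator: the energy stored per cycle equals the energy dissipated per cycle."""
    return power_w / f0_hz


def lc_stored_energy(power_w: float, f0_hz: float, q: float) -> float:
    """LC resonator: only 2*pi/Q of the stored energy is dissipated each cycle."""
    dissipated_per_cycle = power_w / f0_hz
    return dissipated_per_cycle * q / (2 * math.pi)


def stored_energy_ratio(q: float) -> float:
    """Stored energy of an LC tank relative to a ring at equal power and frequency."""
    return q / (2 * math.pi)


if __name__ == "__main__":
    p, f0, q = 10e-3, 1e9, 10.0   # hypothetical operating point
    print("ring stored energy (J):", ring_stored_energy(p, f0))
    print("LC stored energy (J)  :", lc_stored_energy(p, f0, q))
    print("ratio                 :", stored_energy_ratio(q))
```

At equal power and frequency, the stored energy (and with it the available charge swing) of the LC tank exceeds that of the ring by Q/2π under these assumptions, so the advantage grows with the tank Q.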
TABLE I
INVERTER-CHAIN RING OSCILLATORS
TABLE II
CURRENT-STARVED INVERTER-CHAIN RING OSCILLATORS
As an illustrative example, we will show the details of phase-noise calculations for oscillator number 3. Using (16) to calculate Γ_rms, the phase noise can be obtained from (6). We calculate the noise power when the stage is halfway through a transition. At this point, the drain current is simulated to be 3.47 mA. An E_c of 4 × 10^6 V/m and a γ of 2.5 is used in (28) to obtain a noise power of […] A²/Hz. The total capacitance on each node is […] fF, and hence q_max = […] fC. There is one such noise source on each node; therefore, the phase noise is N times the value given by (6), which results in L{[…] MHz} = […] dBc/Hz.
Table II summarizes the data obtained for current-starved ring oscillators with the cell structure shown in Fig. 12(b), all implemented in the same 0.25-μm 2.5-V process. Ring oscillators with a different number of stages were designed with roughly constant oscillation frequency and total power dissipation. Frequency adjustment is achieved by changing the channel length, while total power dissipation control is performed by changing device width. The W/L ratios of the inverter and the tail NMOS and PMOS devices are shown in Table II. The node […] is kept at […] while node […] is at 0 V. The measured total current dissipation and the frequency of oscillation can be found in columns 7 and 8. Phase-noise calculations based on (23) and (6) are in good agreement with the measured results. The die photo of the chip containing these oscillators is shown in Fig. 13. The slightly superior phase noise of the three-stage ring oscillator (number 4) can be attributed to lower oscillation frequency and longer channel length (and hence smaller γ).
Table III summarizes the results obtained for differential ring oscillators of various sizes and lengths with the inverter topology shown in Fig. 12(c), covering a large span of frequencies up to 5.5 GHz. All these ring oscillators are implemented in the same 0.25-μm 2.5-V process, and all the oscillators, except the one marked with N/A, have the tuning circuit shown. The resistors are implemented using an unsilicided polysilicon layer. The main reason to use poly resistors is to reduce 1/f noise upconversion by making the waveform on each node closer to the step response of an RC network, which is more symmetrical. The value of these load resistors and the W/L ratios of the differential pair are shown in Table III. A fixed 2.5-V power supply is used, resulting in different total power dissipations. As before, the measured phase noise is in good agreement with the predicted phase noise using (34) and (6). The die photo of oscillator number 26 can be found in Fig. 14.
To illustrate further how one obtains the phase-noise predictions shown in Table III, we elaborate on the phase-noise
800 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 34, NO. 6, JUNE 1999
TABLE III
DIFFERENTIAL RING OSCILLATORS
calculations for oscillator number 12. The noise current due to one of differential pair NMOS devices is given by (28). The total capacitance on each node in the balanced case is […] fF, and the simulated voltage swing is 1.208 V; therefore, q_max = […] fC. In the balanced case, this current is half of the tail current, i.e., […] mA, and therefore the noise current of the NMOS device has a single-sideband spectral density of […] A²/Hz. The thermal noise due to the load resistor is […] A²/Hz; therefore, the total current noise density is given by […] A²/Hz. For a differential ring oscillator with N stages, there is one such noise source on each node; therefore, the phase noise is 2N times the value given by (6), which results in L{[…] MHz} = […] dBc/Hz. The total power dissipation is […] mW, and […]. Therefore, with an η of 0.9, (34) predicts a phase noise of L{[…] MHz} = […] dBc/Hz.
Timing jitter for oscillator number 12 can be measured using the setup shown in Fig. 15. The oscillator output is divided into two equal-power outputs using a power splitter. The CSA 803A is not capable of showing the edge it uses to trigger, as there is a 21-ns minimum delay between the triggering transition and the first acquired sample. To be able to look at the triggering edge and perhaps the edges before that, a delay line of approximately 25 ns is inserted in the signal path in front of the sampling head. This way, one may look at the exact edge used to trigger the signal. If the sampling head and the power splitter were noiseless, this edge would show no jitter. However, the power splitter and the sampling head introduce noise onto the signal, which cannot be easily
(39)
Fig. 16. RMS jitter versus measurement interval for the four-stage, 2.8-GHz differential ring oscillator (oscillator number 12).
These bias voltages are chosen in such a way as to keep a constant oscillation frequency while changing only the ratio of rise time to fall time. The 1/f³ corner of the phase noise is measured for different ratios of the pullup and pulldown currents while keeping the frequency constant. One can observe a sharp reduction in the corner frequency at the point of symmetry in Fig. 17.
Fig. 17. Phase noise versus symmetry voltage for oscillator number 7.
IX. CONCLUSION
An analysis of the jitter and phase noise of single-ended and differential ring oscillators was presented. The general noise model, based on the ISF, was applied to the case of ring oscillators, resulting in a closed-form expression for the phase noise and jitter of ring oscillators [(6), (23), (34)]. The model was used to perform a parallel analysis of jitter and phase noise for ring oscillators. The effect of the number of stages
APPENDIX A
RELATIONSHIP BETWEEN JITTER AND PHASE NOISE
The phase jitter is
(40)
where
(41)
Therefore
(42)
For a white-noise current source, the autocorrelation function is […]; therefore
(43)
which is
for […] (44)
(45)
where E[·] represents the expected value. Since the autocorrelation function of φ(t) is defined as
(46)
the timing jitter in (45) can be written as
(47)
(48)
where S_φ(f) represents the power spectrum of φ(t). Therefore, (47) results in the following relationship between clock jitter and phase noise:
(49)
It may be useful to know that S_φ(f) can be approximated by L{f} for large offsets [22]. As can be seen from the foregoing, the rms timing jitter has less information than the phase-noise spectrum and can be calculated from phase noise using (49). However, unless extra information about the shape of the phase-noise spectrum is known, the inverse is not possible in general.
In the special case where the phase noise is dominated by white noise, L{Δf} and κ are given by (6) and (12). Therefore, κ can be expressed in terms of phase noise in the 1/f² region as
(50)
where L{Δf} is the phase noise measured in the 1/f² region at an offset frequency of Δf and f0 is the oscillation frequency. Therefore, based on (8), the rms cycle-to-cycle jitter will be given by
Fig. 18. Approximate waveform and the ISF for asymmetric rising and falling edges.
APPENDIX B
NONSYMMETRIC RISING AND FALLING EDGES
We approximate the ISF in this Appendix by the function depicted in Fig. 18. The rms value of the ISF is
(52)
where f'_rise and f'_fall are the maximum slope during the rising and falling edge, respectively, and A represents the asymmetry of the waveform and is defined as
(53)
noting that
(54)
Combining (52) and (54) results in the following:
(55)
which reduces to (16) in the special case of A = 1, i.e., symmetric rising and falling edges. The dc value of the ISF, Γ_dc, can be calculated from Fig. 18 in a similar manner and is given by
(56)
Using (7), the 1/f³ corner is given by
(57)
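As a numerical illustration of the relationship between jitter and phase noise discussed in Appendix A, the following Python sketch assumes that (49) takes the standard form σ_τ² = (8/ω0²) ∫ S_φ(f) sin²(πfτ) df over positive f, with S_φ(f) a two-sided spectrum, and that S_φ(f) ≈ L{f} = c/f² in the white-noise-dominated region. Under those assumptions the integral evaluates to σ_τ = κ√τ with κ = (Δf/f0)·sqrt(L{Δf}). The function name, the −95 dBc/Hz phase noise, the 10-ns interval, and the integration limits are hypothetical choices for illustration; only the 2.8-GHz oscillation frequency is taken from oscillator number 12.

```python
import math

def rms_jitter(s_phi, f0, tau, f_max, n_steps=100_000):
    """Evaluate sigma_tau = sqrt((8/w0^2) * int_0^f_max S_phi(f) sin^2(pi f tau) df).

    s_phi : callable, assumed two-sided phase spectrum in rad^2/Hz
    f0    : oscillation frequency in Hz
    tau   : measurement interval in s
    f_max : upper integration limit in Hz (truncates the infinite integral)
    """
    w0 = 2.0 * math.pi * f0
    df = f_max / n_steps
    total = 0.0
    for k in range(n_steps):
        f = (k + 0.5) * df  # midpoint rule; never evaluates s_phi at f = 0
        total += s_phi(f) * math.sin(math.pi * f * tau) ** 2 * df
    return math.sqrt(8.0 * total) / w0

# Hypothetical numbers, for illustration only.
f0 = 2.8e9        # oscillation frequency (Hz)
offset = 1e6      # offset frequency (Hz)
l_dbc = -95.0     # assumed phase noise at that offset (dBc/Hz)
c = 10 ** (l_dbc / 10) * offset ** 2  # S_phi(f) = c / f^2
tau = 10e-9       # measurement interval (s)

numeric = rms_jitter(lambda f: c / f ** 2, f0, tau, f_max=500 / tau)
kappa = (offset / f0) * math.sqrt(10 ** (l_dbc / 10))
closed_form = kappa * math.sqrt(tau)
print(f"numeric {numeric:.3e} s, closed form {closed_form:.3e} s")
```

The two values agree to within the truncation error of the finite upper limit, which illustrates why the rms jitter grows as the square root of the measurement interval when only white noise is present.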
IV. SIGNAL INJECTION IN THE LOOP
The second option for the generation of jitter is the injection of a test signal at the input of the loop filter as shown in Fig. 7. This signal source, represented here by the variable […], is injected through a second charge pump with gain […]. It should be understood that this signal source is not a jittery digital signal but an analog signal embedded in a 1-b digital signal encoded using a ΔΣ modulator. However, this signal, when referred back to the input, is equivalent to input
It is thus equivalent within a multiplicative constant to the jitter transfer function […]. For a spectral test such as the jitter transfer function, the test signal is a sinewave encoded into a single bit. The quantization noise is concentrated at high frequencies and is filtered out as explained in the previous section. It is important to note that the clock period for signal injection must be an integer multiple of the reference signal period to prevent aliasing of the quantization noise back in the PLL passband. This condition implies that the signal
486 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 33, NO. 3, MARCH 1998
Fig. 12. (a) Circuit to evaluate […] rad jitter threshold. (b) Typical waveforms.
Fig. 13. Signal-to-noise ratio of the output jitter versus the PLL relative bandwidth.
TABLE I
EXPERIMENT PARAMETERS
multibit output [refer to Fig. 10(b)] for the purpose of digital phase modulation. It should be noted that apart from the quantizer, ΔΣ modulator circuitry is not duplicated as it will be operated in time-shared mode at double speed for multibit operation.
The input to the PLL can be set to accommodate both jitter generation methods. For the loop jitter injection scheme, a 100-kHz square wave is presented to the input of the phase detector. The input can also be the same signal phase modulated with the help of an 800-kHz test clock. Eight jitter steps are therefore possible, resulting in a 3-b quantization.
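The arithmetic behind the eight jitter steps can be sketched as follows. This is a minimal illustration that assumes each phase step equals one period of the test clock, so that the number of steps per input period is the ratio of the two frequencies; the function name and its return values are hypothetical.

```python
import math

def phase_modulation_resolution(f_input_hz, f_test_clock_hz):
    """Return (steps, bits, step_in_radians) for edge placement on a test-clock grid.

    Assumes the test clock frequency is an integer multiple of the input
    frequency, so one input period spans a whole number of test-clock periods.
    """
    steps = round(f_test_clock_hz / f_input_hz)
    if not math.isclose(steps * f_input_hz, f_test_clock_hz):
        raise ValueError("test clock must be an integer multiple of the input frequency")
    bits = math.log2(steps)
    step_rad = 2.0 * math.pi / steps
    return steps, bits, step_rad

steps, bits, step_rad = phase_modulation_resolution(100e3, 800e3)
print(steps, bits, round(step_rad, 4))  # 8 3.0 0.7854
```

With the 100-kHz input and the 800-kHz test clock quoted above, each step corresponds to one eighth of the input period, or π/4 rad.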
Various jitter threshold circuits are also implemented on the FPGA. The […] jitter threshold circuit of Fig. 12(a) will be employed in conjunction with the loop jitter injection. On the other hand, thresholds of […] and […] are implemented for the digital phase modulation method, making use of the higher frequency test clock.
For each test, a warm-up stage of 2^14 data cycles is executed to remove transients before a 2^16 data cycle test stage is performed. The error threshold is set to 64, corresponding to a BER of 10^-3. A control module built around a finite state machine selects the amplitude of the input jitter for the ensuing test according to the output of the jitter threshold circuit, using the binary search algorithm. At each frequency point, the amplitude is resolved to an accuracy of 15 b within 13 s. The entire digital circuitry for all the experiments requires 81% of the resources of an XC4010 FPGA. This experimental setup is connected to a workstation through I/O modules to allow a driving software to set the low-pass ΔΣ oscillator frequency as well as read the amplitude.
IX. EXPERIMENTAL RESULTS
The jitter transfer function measurement was carried out for both the jitter injection and the digital phase modulation techniques on two different PLL configurations with different bandwidths and damping values. Table I summarizes the main parameters of these experiments. The same current amplitude was used for both charge pumps ([…]). The transfer functions are presented in the continuous-time domain as this is more typical of what can be found in industry.
The results of the experiments on the first configuration are shown in Fig. 15. A measured jitter transfer function for each jitter generation method is displayed. The dotted line represents the theoretical jitter transfer function as predicted from the direct measurement of the components. The phase modulation scheme used a […] threshold for this experiment. The curve shows a 0.4-dB offset which can be attributed mostly to the static jitter of the PLL. For the other jitter creation scheme, the signal injection clock was chosen to be 50 kHz, that is half the PLL rate, in order to demonstrate the flexibility in selecting this parameter. The offset is larger, possibly because of mismatch between the two charge pumps realized out of discrete transistors. The first two columns of Table II summarize the features of the curves after removal of the offsets. Both methods yield similar results for the PLL bandwidth and jitter peaking. The theoretical predictions are slightly off, most probably because of the parasitics of the setup which were not accounted for in the calculations.
The jitter transfer functions measured in the second experiment are shown in Fig. 16. This PLL exhibits a larger bandwidth and is more damped. It can be seen that the curve obtained here with the jitter injection technique is of lesser quality. This came about because the larger bandwidth yields a lower output jitter SNR. From the graph of Fig. 13, it can be seen that this SNR is barely over 20 dB. On the other hand, the digital phase modulation still shows a smooth curve because of the 3-b quantization which results in lower jitter
noise levels at the output. Again, the meaningful parameters are summarized in the two right-most columns of Table II.
Fig. 15. Jitter transfer function for experiment 1.
TABLE II
RESULTS SUMMARY
Fig. 16. Jitter transfer function for experiment 2.
X. IMPLEMENTATION
For any integrated measurement scheme, the area overhead is obviously a major concern. While the digital portion of the experimental setup uses a large portion of the FPGA, a much more compact implementation is possible. Indeed, a ΔΣ oscillator was selected as the signal generator because of its versatility as a complete jitter transfer function was
XI. CONCLUSIONS
We have presented a PLL jitter transfer function measurement technique which is entirely digital except for the possible addition of a charge pump. The technique is suitable for on-chip measurement since it does not require trimming and the silicon overhead is small. Two methods were introduced for the creation of jitter, allowing tradeoffs between test clock frequency on one side and loading, complexity, and accuracy on the other side. Experimental results were presented which suggest this scheme could be successfully implemented on silicon.
ACKNOWLEDGMENT
The authors acknowledge the suggestions of B. Gerson from PMC Sierra.
REFERENCES
[1] F. M. Gardner, “Charge-pump phase-lock loops,” IEEE Trans. Commun., vol. COMM-28, pp. 1849–1857, Nov. 1980.
[2] L. DeVito, “A versatile clock recovery architecture and monolithic implementation,” in Monolithic Phase-Locked Loops and Clock Recovery Circuits: Theory and Design, B. Razavi, Ed. New York: IEEE Press, 1996.
[3] E. H. Armstrong, “A method of reducing disturbances in radio signaling by a system of frequency modulation,” in Proc. IRE, May 1936, vol. 24, no. 5, pp. 689–740.
[4] P. Goteti, G. Devarayanadurg, and M. Soma, “DFT for embedded charge-pump PLL systems incorporating IEEE 1149.1,” in Proc. IEEE 1997 CICC, Santa Clara, CA, May 1997, pp. 210–213.
[5] M. W. Hauser, “Principles of oversampling A/D conversion,” J. Audio Eng. Soc., vol. 39, nos. 1/2, pp. 3–26, Jan./Feb. 1991.
[6] M. F. Toner and G. W. Roberts, “A BIST scheme for an SNR, gain tracking, and frequency response test of sigma–delta ADC,” IEEE Trans. Circuits Syst.–II, vol. 41, pp. 1–15, Jan. 1995.
[7] J. P. Hein and J. W. Scott, “z-domain model for discrete-time PLL’s,” IEEE Trans. Circuits Syst., vol. 35, pp. 1393–1400, Nov. 1988.
[8] J. Tierney, C. M. Rader, and B. Gold, “A digital frequency synthesizer,” IEEE Trans. Audio Electroacoustic, vol. 19, pp. 48–57, 1971.
[9] J. G. Kenney and L. R. Carley, “Design of multi-bit noise shaping data converter,” Analog Integrated Circuits and Signal Processing J., May 1993, vol. 3, no. 3, pp. 259–272.
[10] A. K. Lu, G. W. Roberts, and D. A. Johns, “A high-quality analog oscillator using oversampling D/A conversion techniques,” IEEE Trans. Circuits Syst.–II, vol. 41, pp. 437–444, July 1994.
[11] E. M. Hawrysh and G. W. Roberts, “An integration of memory-based analog signal generation into current DFT architectures,” in Proc. 1996 ITC, Washington, DC, Oct. 1996, pp. 528–537.
[12] B. Dufort and G. W. Roberts, “Signal generation using periodic single and multi bit sigma–delta modulated streams,” in Proc. IEEE 1997 ITC, Washington, DC, Nov. 1997, pp. 396–405.
[13] C. A. Sharpe, “A 3-state phase detector can improve your next PLL design,” EDN Mag., pp. 55–59, Sept. 1976.
VEILLETTE AND ROBERTS: ON-CHIP MEASUREMENT OF THE JITTER TRANSFER FUNCTION 491
Benoît R. Veillette (S’97) was born in Trois-Rivières, Québec, Canada, on January 1, 1971. He received the B.Eng. (Honors) degree and the M.Eng. degree from McGill University, Montréal, PQ, Canada, in 1993 and 1995, respectively. He is now completing the Ph.D. degree in electrical engineering from the same institution.
His current research interests are in delta–sigma modulation, analog integrated circuits for communications, and mixed-signal testing.
Gordon W. Roberts (S’85–M’85) was born in Toronto, Canada, in 1959. He received the B.A.Sc. degree in electrical engineering from the University of Waterloo in 1983 and the M.Eng. and Ph.D. degrees also in electrical engineering from the University of Toronto in 1986 and 1989.
In 1989 he joined the faculty of McGill University where he is presently an Associate Professor. He co-authored several text books and has contributed seven chapters to various edited volumes related to analog IC design and test.
He is presently an Associate Editor of the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS and Editor of the IEEE Design and Test Magazine. He has received numerous department and faculty awards for teaching, as well as several IEEE awards for his work related to mixed-signal testing.
Physical Processes of Phase Noise in Differential LC Oscillators
Systems Laboratory
University of California
Los Angeles, CA 90095-1594
Introduction
There is an unprecedented interest among circuit designers today to obtain insight into mechanisms of phase noise in LC oscillators. For only with this insight is it possible to optimize oscillator circuits using low-quality integrated resonators to comply with the exacting phase noise specifications of modern wireless systems. Various numerical simulators are now available to assist the circuit designer [1], [2], [3], in some cases accompanied by qualitative interpretations [4]. At present, therefore, the situation of the oscillator designer is similar to the designer of amplifiers who is equipped only with SPICE, but who lacks physical insight and methods for simple yet accurate analysis with which to optimize a circuit.
Over the years, various attempts at phase noise analysis have produced results that are variations on Leeson’s classic “heuristic derivation without formal proof” [5], [6]. These analyses are based on a linear model of an LC resonator in steady-state oscillation through application of either feedback or negative conductance. The results confirm Leeson by showing that phase noise is proportional to noise-to-carrier ratio and inversely to the square of resonator quality factor. However, without knowledge of the constant of proportionality, which Leeson leaves as an unspecified noise factor, the actual phase noise cannot be predicted.
It is now well understood that the large-signal periodic switching of a self-limited oscillator [7] underpins this noise factor [8]. At first sight, an accurate noise analysis of an oscillator subject to periodic bias currents appears intractable, however by using sensible approximations Huang has solved this problem for a Colpitts oscillator [9] and obtained good agreement between analysis and measurements of thermally induced phase noise. The mechanisms of flicker noise upconversion, which are important in CMOS oscillators, remain obscure.
In this paper we concentrate on an understanding of the popular differential LC oscillator. We introduce simple models to capture the nonlinear processes that convert voltage or current thermal noise in resistors or transistors into phase noise in the oscillator. The analysis does not require hypothetical elements, such as limiters or amplitude control loops, to fully explain phase noise. A simple expression at the end accurately specifies thermally induced phase noise, and lends substance to Leeson’s original hypothesis.
Next, the upconversion of flicker noise into phase noise is traced to mechanisms first identified in the 1930’s, but apparently since forgotten. Unlike thermally induced phase noise, which appears as phase modulation sidebands, flicker noise is shown to upconvert by bias-dependent frequency modulation. The results are validated against SpectreRF simulations and measurements on two differential CMOS oscillators tuned by resonators with very different Q’s.
Recognizing Phase Noise
For the purposes of analysis, a noise spectrum is considered as consisting of uncorrelated sinewaves in a 1 Hz bandwidth at any given frequency. Voltage or current noise produces amplitude and phase fluctuations when superimposed on a periodic signal (from now on, a large sinewave V0 sin(2πf0t)). This is clearly seen [10] by isolating one sinewave vn in the noise spectrum, say at a frequency offset +fm from the sinewave frequency f0. Figure 1 shows this as a phasor vn rotating relative to the sinewave phasor V0, which is then decomposed into two equal collinear phasors at +fm, and two anti-phase conjugate phasors which are assigned a negative relative frequency −fm. Grouping the phasors pairwise as ±fm, it is seen that one pair modulates the amplitude of the sinewave with time (AM), while the other sweeps its phase (PM). Thus, half of any additive noise on a sinewave produces phase noise, the other half amplitude noise. When sin(ω0t) is accompanied either by noise sinewave phasors ±sin(ω0+ωm)t, ∓sin(ω0−ωm)t or by ±cos(ω0+ωm)t, ±cos(ω0−ωm)t, then phase noise alone is present.
Simple Model of the Differential Oscillator
This paper treats the well-known tail-current biased differential LC oscillator (Figure 2). In steady state, the differential pair acts as a negative conductance that switches the tail current I_T into the LC resonator. Owing to filtering in the LC circuit, the square wave of current creates a sinusoidal voltage across the resonator of amplitude (4/π)I_T R. This voltage drives the differential pair into switching, thus sustaining oscillation. In a CMOS oscillator the amplitude may build up to several volts, eventually limited by the supply voltage.
In previous work on noise in mixers [11], we have shown how a simple model of the switching differential pair is sufficient to explain all frequency translations of noise. This model is used here. Suppose that some noise (vn) accompanies the resonator sinewave. Assuming that a small fraction of the resonator voltage around the zero crossing is enough to fully switch the differential pair, then the noise simply advances or retards the instant of zero crossing (Figure 3(a)). The randomly pulse-width modulated current at the switch output may be decomposed into the original periodic square wave in the absence of noise, superimposed with pulses of constant height but random width (Figure 3(b)). In turn, these pulses may be approximated by a train of impulses at twice the oscillation frequency multiplying the original noise waveform vn(t) (Figure 3(c)).
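The AM/PM grouping of sidebands described under Recognizing Phase Noise can be checked numerically. The sketch below adds a small pair of sine sidebands to a unit carrier sin(ω0t) and estimates how far the amplitude and the phase move, using in-phase and quadrature projections averaged over each carrier cycle. It is an illustration only: the frequencies, the sideband level, and the function name are arbitrary assumptions, and the cycle-by-cycle averaging is accurate only because the offset is much smaller than the carrier frequency.

```python
import math

def am_pm_deviation(a_upper, a_lower, f0=100.0, fm=1.0, samples_per_cycle=64):
    """Peak amplitude and phase deviation of
    sin(w0 t) + a_upper*sin((w0+wm) t) + a_lower*sin((w0-wm) t),
    estimated by I/Q demodulation averaged over each carrier cycle."""
    w0, wm = 2 * math.pi * f0, 2 * math.pi * fm
    n_cycles = int(round(f0 / fm))  # carrier cycles in one modulation period
    dt = 1.0 / (f0 * samples_per_cycle)
    am_dev = pm_dev = 0.0
    for cycle in range(n_cycles):
        i_sum = q_sum = 0.0
        for k in range(samples_per_cycle):
            t = (cycle * samples_per_cycle + k) * dt
            s = (math.sin(w0 * t)
                 + a_upper * math.sin((w0 + wm) * t)
                 + a_lower * math.sin((w0 - wm) * t))
            i_sum += s * math.sin(w0 * t)
            q_sum += s * math.cos(w0 * t)
        i_avg = 2 * i_sum / samples_per_cycle  # in-phase component
        q_avg = 2 * q_sum / samples_per_cycle  # quadrature component
        am_dev = max(am_dev, abs(math.hypot(i_avg, q_avg) - 1.0))
        pm_dev = max(pm_dev, abs(math.atan2(q_avg, i_avg)))
    return am_dev, pm_dev

eps = 0.01
am_equal, pm_equal = am_pm_deviation(+eps, +eps)  # sine sidebands with equal signs
am_opp, pm_opp = am_pm_deviation(+eps, -eps)      # sine sidebands with opposite signs
print(f"equal signs:    AM {am_equal:.4f}, PM {pm_equal:.4f} rad")
print(f"opposite signs: AM {am_opp:.4f}, PM {pm_opp:.4f} rad")
```

For a sine carrier, sine sidebands of equal sign move almost only the amplitude, while sine sidebands of opposite sign move almost only the phase, each by about twice the sideband level.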
0-7803-5809-0/00/$10.00 © 2000 IEEE. IEEE 2000 CUSTOM INTEGRATED CIRCUITS CONFERENCE
Thermally Induced Phase Noise
Resonator Noise
Now consider a current source i_n sin((ω0+ωm)t+φ) representing noise in the loss conductance of the resonator, where i_n² = 4kT/R. According to the model above, this modulates the zero crossing instants of the differential pair, producing a current which, in addition to the usual square wave, also consists of current pulses sampling this noise at 2ω0. After sampling, frequency components appear at ω0±ωm, 3ω0±ωm, ... However, usually the resonator will filter the 3rd and higher harmonics, leaving ω0±ωm as the only important terms. These will induce a symmetric voltage response in the resonator, and through feedback arrive at steady state. The steady-state oscillation, in general, is of the form:
v_out = V0 sin ω0t + A sin(ω0 − ωm)t + B cos(ω0 − ωm)t + C sin(ω0 + ωm)t + D cos(ω0 + ωm)t
and here A = −C = i_n × (Lω0²/4ωm), while B = D = 0. The relative signs of A and C prove that the steady-state response to current noise in the resonator’s resistor is phase noise in the oscillator. The single-sideband phase noise density is found by the ratio of the sideband power at a given frequency to the power in the fundamental oscillation frequency. Thus, the thermally induced phase noise density due to resonator loss is:
[…]
where N1 = 2, the number of loss sources (in the left and right resonators) and N2 = 4 because uncorrelated quadrature noise originating at ω0±ωm contributes to SSB phase noise at offset ωm.
Tail Current Noise
The switching action of the differential pair commutates noise in the tail currents like a single-balanced mixer. The noise is translated up and down in frequency, and enters the resonator. The resulting voltage drives the differential pair, the noise components modulating the zero crossing instants. The resulting impulses of current feed back into the resonator. The steady-state solution is found by solving simultaneous equations of a form that anticipates the end result, much like in any feedback circuit.
The single-balanced mixer shows the largest conversion gain around the fundamental switching frequency, 1/3rd the current conversion gain around the 3rd harmonic, and so on. Therefore, only mixing by the fundamental at ω0 is important. Noise originating in the tail current at ωm upconverts to ω0±ωm. Similarly, noise at 2ω0±ωm downconverts to ω0±ωm.
Analysis shows that the upconversion produces coefficients A = C, B = −D, both of which indicate AM only. It should be noted that AM noise superimposed on the resonator fundamental frequency does not modulate the zero crossings of the switching differential pair, and therefore does not propagate in the feedback loop back into the resonator. However, the downconversion results in phase noise only, with A = −C, and B = D = 0. The phase noise caused by thermal noise originally at 2ω0 is:
[…]
where γ is the noise factor of a single FET, classically 2/3. It is important to note that the AM noise resulting from upconversion, if impressed across a varactor at the resonator, will modulate the varactor, thus the oscillation frequency by AM-to-FM conversion [12]. Although the process is different, the resulting sidebands are indistinguishable from PM noise sidebands. Unlike the other mechanisms of phase noise, this effect depends on the varactor characteristics and VCO tuning range and it may be significant only in certain situations.
Differential Pair Noise
Noise originating in the differential pair is unlike the previous two cases. There, only certain parts of the noise spectrum contributed significantly to the total phase noise. White noise in the resonator is filtered at harmonics of the resonant frequency. White noise in the tail current only experiences a significant conversion gain around the second harmonic of the oscillation frequency. However, the simple model says that an impulse train samples white noise in the differential pair, which if true, will cause it to accumulate without bound at any specified offset frequency ωm.
In reality, any practical differential pair requires a non-zero input voltage excursion to switch, and this is provided by the oscillation waveform across the resonator. Therefore, noise in the differential pair is actually not sampled by impulses, but by time windows of finite width. The window height is proportional to transconductance, and width is set by tail current, and slope of the oscillation waveform at zero crossing. The input-referred noise spectral density of the differential pair is inversely proportional to transconductance. Thus, the narrower the sampling window, that is, the larger the sampling bandwidth, the lower the noise spectral density [11]. Analysis shows that the noise bandwidth product is constant, and produces pure phase noise. After taking into account the accumulation of frequency translations throughout the sampling bandwidth, the following compact yet exact expression is reached:
[…]
We note that [8] has arrived at a similar analysis for the first two sources of noise, but was unable to obtain a closed-form expression for this last term.
Proving Leeson’s Hypothesis
Leeson originally postulated that thermally induced phase noise in any oscillator takes the form:
[…]
where F is an unspecified noise factor. By summing the expressions obtained above for thermally induced phase noise arising from the resonator, differential pair and tail bias current, respectively, for the differential oscillator Leeson’s noise factor is:
[…]
amplitude of oscillation is smaller than the power supply, the dif- However, this is not the only mechanism of indirect FM. At RF,
ferential pair acts as a pure current switch driving the resonator active device capacitance is also significant, and it no longer appears
and V,=(4/x)RIT [13]. Then the second term comprising F sim- as a pure negative resistance to the resonator. For example, the dif-
plifies to 2y. This means that as tail current increases and assuming ferential pair commutates current flowing in the capacitor C, at
gmblrSR is held constant, the noise factor remains constant and phase the tail, which presents a negative capacitor (or, equivalently, an
noise improves as V i , that is, as I,’. This has been observed by inductor in a narrowband sense) at the differential output (Figure
others [131. However, beyond a critical tail current the amplitude 6). This speeds up the oscillation frequency. Flicker noise in the
Vuis pegged constant, limited by supply voltage. Further increases differential pair FETs modulates the duty cycle of commutation,
in I, will cause the differential pair’s contribution to noise factor to and therefore the effective negative capacitance. Here, too, Grosz-
rise, degrading phase noise proportionally to I, (Figure 4). There- kowski gives a method of systematic analysis [16], which captures
fore, for least phase noise the tail current should be just enough to the reactive components in the active devices by measuring the area
drive the amplitude to its maximum possible value. n enclosed by hysteresis in the dynamic negative resistance curve.
Flicker Noise Upconversion
Close-in to the oscillation frequency, the slope of the phase noise
_
Aw -
w,
- -- n
2Q2w,L
+q-
n2(1- n’)
2Q2 n=2 (1 n’)’ n2/Q’
.m:
+
spectrum in all CMOS VCO's turns from -20 to -30 dB/decade. This is ascribed to the upconversion of flicker noise in FETs. To understand this, let us first see if the analysis above explains this upconversion.

Flicker noise in the tail current source at frequency $\omega_m$ indeed upconverts to $\omega_0 \pm \omega_m$ and enters the resonator, but as AM, not PM noise. Therefore, in the absence of a high gain varactor to convert AM to FM, flicker noise in the tail current will not appear as phase noise. Next consider flicker noise in the differential pair. The preceding analysis says that this modulates zero crossings, and injects a noise current into the resonator consisting of flicker noise sampled by an impulse train with frequency $2\omega_0$. Thus noise originating at frequency $\omega_m$ produces currents at $\omega_m$ and at $2\omega_0 \pm \omega_m$. Both frequencies are strongly attenuated in the resonator, and neither explains flicker-induced phase noise at $\omega_0 \pm \omega_m$. One can only conclude that the mechanisms of flicker noise upconversion are quite different than for thermally induced phase noise.

Fundamental Sources of FM in Oscillators

In 1934, Groszkowski [15], while studying electronic oscillators, realized that the steady-state oscillation frequency seldom coincides with the natural frequency of the resonator which tunes the oscillator. He found that the discrepancy arises because the active device in the oscillator, such as the differential pair current switch in the circuit considered here, drives the resonator with a harmonic-rich waveform. The harmonics will flow into the lower impedance capacitor (Figure 5) and upset the exact reactive power balance between the L and the C required for steady state. Now the frequency of oscillation must shift down until the reactive power in the inductor increases to equal the reactive power in the capacitor due to the fundamental and all harmonics. The shift, $\Delta\omega$, is:

$$\frac{\Delta\omega}{\omega_0} = \frac{1}{2Q^2}\sum_{n=2}^{\infty}\frac{n^2(1-n^2)}{(1-n^2)^2 + n^2/Q^2}\, m_n^2$$

where $m_n$ is the normalized level of the $n$th harmonic. $\Delta\omega$ is the sum of all negative terms, which means that oscillation frequency slows down with more harmonic content. Now the harmonic content at the output of a periodically switching differential pair is a function of the tail current. In the autonomous oscillator, the drive to the differential pair is also a function of tail current. The sensitivity $\partial\omega/\partial I_T$ is responsible for an "indirect" FM [7] due to flicker noise in $I_T$.

Thus the sensitivity of the reactance to bias current or offset voltage in the differential pair is estimated, which is another means whereby flicker noise modulates the frequency of oscillation.

Validation of Analysis

The phase noise model was validated on two CMOS differential LC oscillators. One oscillator uses a low Q, on-chip inductor, while the other uses off-chip inductors with large Q. Flicker noise is modelled as a bias-independent, gate-referred voltage source [14]. The measured data and SpectreRF simulations are plotted with predictions based on this paper. Excellent agreement (Figure 7) is found across the entire spectrum, which encompasses thermally induced phase noise and upconverted flicker noise.

References

[1] K. S. Kundert, "Introduction to RF simulation and its application," IEEE Journal of Solid-State Circuits, pp. 1298-319, 1999.
[2] A. Demir, A. Mehrotra, and J. Roychowdhury, "Phase noise in oscillators: a unifying theory and numerical methods for characterisation," in Design Automation Conference, San Francisco, pp. 26-31, 1998.
[3] B. De Smedt and G. Gielen, "Accurate simulation of phase noise in oscillators," in European Solid-State Circuits Conference, pp. 208-11, 1997.
[4] A. Hajimiri and T. H. Lee, "A general theory of phase noise in electrical oscillators," IEEE Journal of Solid-State Circuits, vol. 33, no. 2, pp. 179-94, 1998.
[5] D. B. Leeson, "A Simple Model of Feedback Oscillator Noise Spectrum," Proceedings of the IEEE, vol. 54, pp. 329-330, 1966.
[6] J. Craninckx and M. Steyaert, "Low-noise voltage-controlled oscillators using enhanced LC-tanks," IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, vol. 42, no. 12, pp. 794-804, 1995.
[7] K. K. Clarke and D. T. Hess, Communication Circuits: Analysis and Design. Malabar, FL: Krieger, 1971.
[8] C. Samori, A. L. Lacaita, F. Villa, and F. Zappa, "Spectrum folding and phase noise in LC tuned oscillators," IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, vol. 45, no. 7, pp. 781-90, 1998.
[9] Q. Huang, "On the exact design of RF oscillators," CICC Proceedings, pp. 41-4, 1998.
[10] W. P. Robins, Phase Noise in Signal Sources. London: Peter Peregrinus, 1982.
[11] H. Darabi and A. Abidi, "Noise in CMOS Mixers: A Simple Physical Model," IEEE Journal of Solid-State Circuits, vol. 35, no. 1, in press, 2000.
[12] C. Samori, A. L. Lacaita, A. Zanchi, S. Levantino, and F. Torrisi, "Impact of Indirect Stability on Phase Noise Performance of Fully-Integrated LC Tuned VCOs," in European Solid-State Circuits Conference, Duisburg, Germany, pp. 202-205, 1999.
[13] A. Hajimiri and T. H. Lee, "Phase Noise in CMOS Differential LC oscillators," in Symposium on VLSI Circuits, Honolulu, HI, pp. 48-51, 1998.
[14] J. Chang, A. A. Abidi, and C. R. Viswanathan, "Flicker Noise in CMOS Transistors from Subthreshold to Strong Inversion at Various Temperatures," IEEE Transactions on Electron Devices, vol. 41, pp. 1965-1971, 1994.
[15] J. Groszkowski, "The Interdependence of Frequency Variation and Harmonic Content, and the problem of Constant-Frequency Oscillators," Proc. of the IRE, vol. 21, no. 7, pp. 958-981, 1934.
[16] J. Groszkowski, Frequency of Self-Oscillations. Oxford: Pergamon Press, 1964.
Figure 1. Noise phasor added to a sinewave decomposes into PM and AM sidebands.

Figure 2. Differential LC oscillator biased by tail current.

Figure 3. (a) Noise at input of differential pair modulates instants of zero crossing. (b) Output current consists of square wave, plus random noise pulses. (c) Noise pulses modelled as a train of impulses sampling noise waveform.

Figure 4. Increasing tail current first causes amplitude to rise, until limited by supply. Phase noise diminishes with rising amplitude, then worsens due to higher noise factor. (Horizontal axis: bias current, $I_T$; curves: phase noise and oscillation amplitude.)

Figure 5. Harmonics of oscillating current flow into capacitor, increasing its reactive energy. Steady state frequency shifts down until inductor energy balances.

Figure 6. Capacitors associated with active device appear as reactances across the resonator, shifting frequency.

Figure 7. Validation of the analysis presented in this paper. Measured phase noise is compared with predictions from analysis, and with SpectreRF simulations. (a) 0.35-µm CMOS 1.1 GHz oscillator using resonator with loaded Q of 6. (b) 0.25-µm CMOS 830 MHz oscillator using discrete inductor with loaded Q of 25. (Horizontal axis in both panels: offset frequency, kHz, from 1 to 1000; vertical axis: phase noise.)
IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 33, NO. 2, FEBRUARY 1998 179
Abstract— A general model is introduced which is capable of making accurate, quantitative predictions about the phase noise of different types of electrical oscillators by acknowledging the true periodically time-varying nature of all oscillators. This new approach also elucidates several previously unknown design criteria for reducing close-in phase noise by identifying the mechanisms by which intrinsic device noise and external noise sources contribute to the total phase noise. In particular, it explains the details of how $1/f$ noise in a device upconverts into close-in phase noise and identifies methods to suppress this upconversion. The theory also naturally accommodates cyclostationary noise sources, leading to additional important design insights. The model reduces to previously available phase noise models as special cases. Excellent agreement among theory, simulations, and measurements is observed.

Index Terms— Jitter, oscillator noise, oscillators, oscillator stability, phase jitter, phase locked loops, phase noise, voltage controlled oscillators.

Manuscript received December 17, 1996; revised July 9, 1997. The authors are with the Center for Integrated Systems, Stanford University, Stanford, CA 94305-4070 USA. Publisher Item Identifier S 0018-9200(98)00716-1. 0018–9200/98$10.00 © 1998 IEEE

I. INTRODUCTION

THE recent exponential growth in wireless communication has increased the demand for more available channels in mobile communication applications. In turn, this demand has imposed more stringent requirements on the phase noise of local oscillators. Even in the digital world, phase noise in the guise of jitter is important. Clock jitter directly affects timing margins and hence limits system performance.

Phase and frequency fluctuations have therefore been the subject of numerous studies [1]–[9]. Although many models have been developed for different types of oscillators, each of these models makes restrictive assumptions applicable only to a limited class of oscillators. Most of these models are based on a linear time invariant (LTI) system assumption and suffer from not considering the complete mechanism by which electrical noise sources, such as device noise, become phase noise. In particular, they take an empirical approach in describing the upconversion of low frequency noise sources, such as $1/f$ noise, into close-in phase noise. These models are also reduced-order models and are therefore incapable of making accurate predictions about phase noise in long ring oscillators, or in oscillators that contain essential singularities, such as delay elements.

Since any oscillator is a periodically time-varying system, its time-varying nature must be taken into account to permit accurate modeling of phase noise. Unlike models that assume linearity and time-invariance, the time-variant model presented here is capable of proper assessment of the effects on phase noise of both stationary and even of cyclostationary noise sources.

Noise sources in the circuit can be divided into two groups, namely, device noise and interference. Thermal, shot, and flicker noise are examples of the former, while substrate and supply noise are in the latter group. This model explains the exact mechanism by which spurious sources, random or deterministic, are converted into phase and amplitude variations, and includes previous models as special limiting cases.

This time-variant model makes explicit predictions of the relationship between waveform shape and $1/f$ noise upconversion. Contrary to widely held beliefs, it will be shown that the $1/f^3$ corner in the phase noise spectrum is smaller than the $1/f$ noise corner of the oscillator's components by a factor determined by the symmetry properties of the waveform. This result is particularly important in CMOS RF applications because it shows that the effect of inferior $1/f$ device noise can be reduced by proper design.

Section II is a brief introduction to some of the existing phase noise models. Section III introduces the time-variant model through an impulse response approach for the excess phase of an oscillator. It also shows the mechanism by which noise at different frequencies can become phase noise and expresses with a simple relation the sideband power due to an arbitrary source (random or deterministic). It continues by explaining how this approach naturally lends itself to the analysis of cyclostationary noise sources. It also introduces a general method to calculate the total phase noise of an oscillator with multiple nodes and multiple noise sources, and shows how this method can help designers to spot the dominant source of phase noise degradation in the circuit. It concludes with a demonstration of how the presented model reduces to existing models as special cases. Section IV gives new design implications arising from this theory in the form of guidelines for low phase noise design. Section V concludes with experimental results supporting the theory.

II. BRIEF REVIEW OF EXISTING MODELS AND DEFINITIONS

The output of an ideal sinusoidal oscillator may be expressed as $V_{out}(t) = A\cos[\omega_0 t + \phi]$, where $A$ is the amplitude, $\omega_0$ is the frequency, and $\phi$ is an arbitrary, fixed phase reference. Therefore, the spectrum of an ideal oscillator with no random fluctuations is a pair of impulses at $\pm\omega_0$. In a practical oscillator, however, the output is more generally given by

$$V_{out}(t) = A(t)\cdot f[\omega_0 t + \phi(t)] \qquad (1)$$

where $\phi(t)$ and $A(t)$ are now functions of time and $f$ is a periodic function with period $2\pi$. As a consequence of the fluctuations represented by $\phi(t)$ and $A(t)$, the spectrum of a practical oscillator has sidebands close to the frequency of oscillation, $\omega_0$.

There are many ways of quantifying these fluctuations (a comprehensive review of different standards and measurement methods is given in [4]). A signal's short-term instabilities are usually characterized in terms of the single sideband noise spectral density. It has units of decibels below the carrier per hertz (dBc/Hz) and is defined as

$$\mathcal{L}_{total}\{\Delta\omega\} = 10\cdot\log\left[\frac{P_{sideband}(\omega_0 + \Delta\omega,\ 1\ \mathrm{Hz})}{P_{carrier}}\right] \qquad (2)$$

Fig. 1. Typical plot of the phase noise of an oscillator versus offset from carrier.

The semi-empirical model proposed in [1]–[3], known also as the Leeson–Cutler phase noise model, is based on an LTI assumption for tuned tank oscillators. It predicts the following behavior for $\mathcal{L}\{\Delta\omega\}$:

$$\mathcal{L}\{\Delta\omega\} = 10\cdot\log\left\{\frac{2FkT}{P_s}\cdot\left[1 + \left(\frac{\omega_0}{2Q_L\Delta\omega}\right)^2\right]\cdot\left(1 + \frac{\Delta\omega_{1/f^3}}{|\Delta\omega|}\right)\right\} \qquad (3)$$

where $F$ is an empirical parameter (often called the "device excess noise number"), $k$ is Boltzmann's constant, $T$ is the absolute temperature, $P_s$ is the average power dissipated in the resistive part of the tank, $\omega_0$ is the oscillation frequency, $Q_L$ is the effective quality factor of the tank with all the loadings in place (also known as loaded $Q$), $\Delta\omega$ is the offset from the carrier and $\Delta\omega_{1/f^3}$ is the frequency of the corner between the $1/f^3$ and $1/f^2$ regions, as shown in the sideband spectrum of Fig. 1. The behavior in the $1/f^2$ region can be obtained by applying a transfer function approach as follows. The impedance of a parallel RLC, for $\Delta\omega \ll \omega_0$, is easily calculated to be

$$Z(\omega_0 + \Delta\omega) \approx R_p\cdot\frac{1}{1 + j2Q_L\dfrac{\Delta\omega}{\omega_0}} \qquad (4)$$
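The Leeson–Cutler expression (3) is simple enough to evaluate directly. The following is a minimal sketch, assuming offsets are given in hertz (only frequency ratios enter, so hertz and rad/s are interchangeable here); the function name, parameter names, and the example operating point are hypothetical and do not come from the paper.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K


def leeson_phase_noise_dbc(delta_f, f0, q_loaded, p_sig, noise_factor,
                           corner_f=0.0, temperature=290.0):
    """Evaluate the Leeson-Cutler model of (3), in dBc/Hz.

    delta_f      offset from the carrier (Hz)
    f0           oscillation frequency (Hz)
    q_loaded     loaded quality factor of the tank
    p_sig        average power dissipated in the tank resistance (W)
    noise_factor empirical fitting parameter F
    corner_f     1/f^3 corner offset (Hz); 0 disables the flicker term
    """
    thermal = 2.0 * noise_factor * K_BOLTZMANN * temperature / p_sig
    tank = 1.0 + (f0 / (2.0 * q_loaded * delta_f)) ** 2
    flicker = 1.0 + corner_f / abs(delta_f)
    return 10.0 * math.log10(thermal * tank * flicker)


# Hypothetical operating point: 1 GHz carrier, loaded Q of 10, 1 mW, F = 2.
params = dict(f0=1e9, q_loaded=10.0, p_sig=1e-3, noise_factor=2.0)
slope_1f2 = (leeson_phase_noise_dbc(1e5, **params)
             - leeson_phase_noise_dbc(1e6, **params))
slope_1f3 = (leeson_phase_noise_dbc(1e3, corner_f=1e5, **params)
             - leeson_phase_noise_dbc(1e4, corner_f=1e5, **params))
```

With these assumed numbers, `slope_1f2` comes out close to 20 dB per decade and `slope_1f3` close to 30 dB per decade, which are the two slopes sketched in Fig. 1. Note that `noise_factor` and `corner_f` remain fitting parameters in this model.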
Fig. 4. (a) Impulse injected at the peak, (b) impulse injected at the zero crossing, and (c) effect of nonlinearity on amplitude and phase of the oscillator in state-space.

(6)

Note that the factor of 1/2 arises from neglecting the contribution of amplitude noise. Although the expression for the noise in the $1/f^2$ region is thus easily obtained, the expression for the $1/f^3$ portion of the phase noise is completely empirical. As such, the common assumption that the $1/f^3$ corner of the phase noise is the same as the $1/f$ corner of device flicker noise has no theoretical basis.

The above approach may be extended by identifying the individual noise sources in the tuned tank oscillator of Fig. 2 [8]. An LTI approach is used and there is an embedded assumption of no amplitude limiting, contrary to most practical cases. For the RLC circuit of Fig. 2, [8] predicts the following:

(7)

where $A$ is yet another empirical fitting parameter, and $R_{eff}$ is the effective series resistance, given by

(8)

where $R_L$, $R_C$, $R_p$, and $C$ are shown in Fig. 2. Note that it is still not clear how to calculate $A$ from circuit parameters. Hence, this approach represents no fundamental improvement over the method outlined in [3].

III. MODELING OF PHASE NOISE

A. Impulse Response Model for Excess Phase

An oscillator can be modeled as a system with $n$ inputs (each associated with one noise source) and two outputs that are the instantaneous amplitude and excess phase of the oscillator, $A(t)$ and $\phi(t)$, as defined by (1). Noise inputs to this system are in the form of current sources injecting into circuit nodes and voltage sources in series with circuit branches. For each input source, both systems can be viewed as single-input, single-output systems. The time and frequency-domain fluctuations of $A(t)$ and $\phi(t)$ can be studied by characterizing the behavior of two equivalent systems shown in Fig. 3.

Note that both systems shown in Fig. 3 are time variant. Consider the specific example of an ideal parallel LC oscillator shown in Fig. 4. If we inject a current impulse $i(t)$ as shown, the amplitude and phase of the oscillator will have responses similar to that shown in Fig. 4(a) and (b). The instantaneous voltage change $\Delta V$ is given by

$$\Delta V = \frac{\Delta q}{C_{tot}} \qquad (9)$$

where $\Delta q$ is the total injected charge due to the current impulse and $C_{tot}$ is the total capacitance at that node. Note that the current impulse will change only the voltage across the capacitor and will not affect the current through the inductor. It can be seen from Fig. 4 that the resultant change in $A(t)$ and $\phi(t)$ is time dependent. In particular, if the impulse is applied at the peak of the voltage across the capacitor, there will be no phase shift and only an amplitude change will result, as shown in Fig. 4(a). On the other hand, if this impulse is applied at the zero crossing, it has the maximum effect on the excess phase $\phi(t)$ and the minimum effect on the amplitude, as depicted in Fig. 4(b). This time dependence can also be observed in the state-space trajectory shown in Fig. 4(c). Applying an impulse at the peak is equivalent to a sudden jump in voltage at the corresponding point of the trajectory, which results in no phase change and changes only the amplitude, while applying an impulse at the point corresponding to the zero crossing results only in a phase change without affecting the amplitude. An impulse applied sometime between these two extremes will result in both amplitude and phase changes.

There is an important difference between the phase and amplitude responses of any real oscillator, because some form of amplitude limiting mechanism is essential for stable oscillatory action. The effect of this limiting mechanism is pictured as a closed trajectory in the state-space portrait of the oscillator shown in Fig. 4(c). The system state will finally approach this trajectory, called a limit cycle, irrespective of its starting point [10]–[12]. Both an explicit automatic gain control (AGC) and the intrinsic nonlinearity of the devices act similarly to produce a stable limit cycle. However, any fluctuation in the phase of the oscillation persists indefinitely, with a current noise impulse resulting in a step change in phase, as shown in Fig. 3. It is important to note that regardless of how small the injected charge, the oscillator remains time variant.

Having established the essential time-variant nature of the systems of Fig. 3, we now show that they may be treated as linear for all practical purposes, so that their impulse responses $h_\phi(t, \tau)$ and $h_A(t, \tau)$ will characterize them completely.

The linearity assumption can be verified by injecting impulses with different areas (charges) and measuring the resultant phase change. This is done in the SPICE simulations of the 62-MHz Colpitts oscillator shown in Fig. 5(a) and the five-stage 1.01-GHz, 0.8-µm CMOS inverter chain ring oscillator shown in Fig. 5(b). The results are shown in Fig. 6(a) and (b), respectively. The impulse is applied close to a zero crossing, where it has the maximum effect on phase. As can be seen, the current-phase relation is linear for values of charge up to 10% of the total charge on the effective capacitance of the node of interest. Also note that the effective injected charges due to actual noise and interference sources in practical circuits are several orders of magnitude smaller than the amounts of charge injected in Fig. 6. Thus, the assumption of linearity is well satisfied in all practical oscillators.

Fig. 5. (a) A typical Colpitts oscillator and (b) a five-stage minimum size ring oscillator.

Fig. 6. Phase shift versus injected charge for oscillators of Fig. 5(a) and (b).

It is critical to note that the current-to-phase transfer function is practically linear even though the active elements may have strongly nonlinear voltage-current behavior. However, the nonlinearity of the circuit elements defines the shape of the limit cycle and has an important influence on phase noise that will be accounted for shortly.

We have thus far demonstrated linearity, with the amount of excess phase proportional to the ratio of the injected charge to the maximum charge swing across the capacitor on the node, i.e., $\Delta q / q_{max}$. Furthermore, as discussed earlier, the impulse response for the first system of Fig. 3 is a step whose amplitude depends periodically on the time $\tau$ when the impulse is injected. Therefore, the unit impulse response for excess phase can be expressed as

$$h_\phi(t, \tau) = \frac{\Gamma(\omega_0 \tau)}{q_{max}}\, u(t - \tau) \qquad (10)$$

where $q_{max}$ is the maximum charge displacement across the capacitor on the node and $u(t)$ is the unit step. We call $\Gamma(x)$ the impulse sensitivity function (ISF). It is a dimensionless, frequency- and amplitude-independent periodic function with period $2\pi$ which describes how much phase shift results from applying a unit impulse at time $t = \tau$. To illustrate its significance, the ISF's together with the oscillation waveforms for a typical LC and ring oscillator are shown in Fig. 7. As is shown in the Appendix, $\Gamma(x)$ is a function of the waveform or, equivalently, the shape of the limit cycle which, in turn, is governed by the nonlinearity and the topology of the oscillator.

Given the ISF, the output excess phase $\phi(t)$ can be calculated using the superposition integral

$$\phi(t) = \int_{-\infty}^{\infty} h_\phi(t, \tau)\, i(\tau)\, d\tau = \frac{1}{q_{max}} \int_{-\infty}^{t} \Gamma(\omega_0 \tau)\, i(\tau)\, d\tau \qquad (11)$$
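The timing dependence and the linearity claim can both be checked on the simplest possible case. The sketch below assumes an ideal lossless LC tank with normalized voltage $\cos x$, for which the ISF works out to $\Gamma(x) = -\sin x$ (an assumption of this example, not a result quoted from the paper); an injected charge shifts only the capacitor-voltage coordinate of the state, and the exact phase step is compared with the ISF prediction $\Gamma(x)\,\Delta q / q_{max}$. All function names are hypothetical.

```python
import math


def isf_ideal_lc(x):
    """Assumed ISF of a lossless LC tank whose voltage is cos(x)."""
    return -math.sin(x)


def phase_step_isf(delta_q, q_max, x):
    """Phase step predicted by the linear ISF model."""
    return isf_ideal_lc(x) * delta_q / q_max


def phase_step_exact(delta_q, q_max, x):
    """Exact phase step for the ideal tank.

    The state is (cos x, sin x) in normalized voltage/current coordinates.
    The charge impulse moves only the voltage coordinate by delta_q/q_max.
    """
    dv = delta_q / q_max
    new_angle = math.atan2(math.sin(x), math.cos(x) + dv)
    step = new_angle - x
    # Wrap the difference into (-pi, pi].
    return (step + math.pi) % (2.0 * math.pi) - math.pi


q_max = 1.0
at_peak = phase_step_exact(0.05, q_max, 0.0)           # impulse at voltage peak
at_zero = phase_step_exact(0.01, q_max, math.pi / 2)   # impulse at zero crossing
pred_zero = phase_step_isf(0.01, q_max, math.pi / 2)
exact_10pct = phase_step_exact(0.10, q_max, math.pi / 2)
pred_10pct = phase_step_isf(0.10, q_max, math.pi / 2)
```

In this idealized model an impulse at the peak produces no phase step, an impulse at the zero crossing produces the largest one, and even a charge equal to 10% of `q_max` stays within about 1% of the linear prediction.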
HAJIMIRI AND LEE: GENERAL THEORY OF PHASE NOISE IN ELECTRICAL OSCILLATORS 183
Fig. 7. Waveforms and ISF's for (a) a typical LC oscillator and (b) a typical ring oscillator.
(17)
(13)
Equation (13) allows computation of $\phi(t)$ for an arbitrary input current $i(t)$ injected into any circuit node, once the various Fourier coefficients of the ISF have been found.

As an illustrative special case, suppose that we inject a low frequency sinusoidal perturbation current $i(t)$ into the node of interest at a frequency of $\Delta\omega \ll \omega_0$:

$$i(t) = I_0 \cos(\Delta\omega\, t) \qquad (14)$$

where $I_0$ is the maximum amplitude of $i(t)$. The arguments of all the integrals in (13) are at frequencies higher than $\Delta\omega$ and are significantly attenuated by the averaging nature of the integration, except the term arising from the first integral, which involves $c_0$. Therefore, the only significant term in $\phi(t)$ will be

$$\phi(t) \approx \frac{I_0 c_0 \sin(\Delta\omega\, t)}{2 q_{max} \Delta\omega} \qquad (15)$$

As a result, there will be two impulses at $\pm\Delta\omega$ in the power spectral density of $\phi(t)$, denoted as $S_\phi(\omega)$.

As an important second special case, consider a current at a frequency close to the carrier injected into the node of interest, given by $i(t) = I_1 \cos[(\omega_0 + \Delta\omega)t]$. A process similar to that of the previous case occurs except that the spectrum of

B. Phase-to-Voltage Transformation

So far, we have presented a method for determining how much phase error results from a given current $i(t)$ using (13). Computing the power spectral density (PSD) of the oscillator output voltage $S_v(\omega)$ requires knowledge of how the output voltage relates to the excess phase variations. As shown in Fig. 8, the conversion of device noise current to output voltage may be treated as the result of a cascade of two processes. The first corresponds to a linear time variant (LTV) current-to-phase converter discussed above, while the second is a nonlinear system that represents a phase modulation (PM), which transforms phase to voltage. To obtain the sideband power around the fundamental frequency, the fundamental harmonic of the oscillator output $\cos[\omega_0 t + \phi(t)]$ can be used as the transfer function for the second system in Fig. 8. Note this is a nonlinear transfer function with $\phi(t)$ as the input.

Substituting $\phi(t)$ from (17) into (1) results in a single-tone phase modulation for output voltage, with $\phi(t)$ given by (17). Therefore, an injected current at $n\omega_0 + \Delta\omega$ results in a pair of equal sidebands at $\omega_0 \pm \Delta\omega$ with a sideband power relative to the carrier given by

$$P_{SBC}(\Delta\omega) = 10\cdot\log\left(\frac{I_n c_n}{4 q_{max} \Delta\omega}\right)^2 \qquad (18)$$
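As a quick numerical illustration of the sideband relation (18), here is a minimal sketch that takes the sideband-to-carrier ratio as $(I_n c_n / 4 q_{max} \Delta\omega)^2$ for a sinusoid of amplitude $I_n$ injected near the $n$th harmonic. The function name and the example values (current amplitude, Fourier coefficient, charge swing) are hypothetical placeholders rather than figures from the paper.

```python
import math


def sideband_power_dbc(i_n, c_n, q_max, delta_omega):
    """Sideband power relative to the carrier, in dBc.

    i_n          amplitude of the injected sinusoidal current (A)
    c_n          nth Fourier coefficient of the ISF (dimensionless)
    q_max        maximum charge swing on the node (C)
    delta_omega  offset of the injected tone from n*omega_0 (rad/s)
    """
    ratio = abs(i_n * c_n) / (4.0 * q_max * delta_omega)
    return 20.0 * math.log10(ratio)


# Hypothetical values: 10 uA tone, c_n = 1, 100 fC swing, 50 MHz offset.
offset = 2.0 * math.pi * 50e6
p_near = sideband_power_dbc(10e-6, 1.0, 100e-15, offset)
p_far = sideband_power_dbc(10e-6, 1.0, 100e-15, 2.0 * offset)
```

Doubling the offset lowers the sideband by about 6 dB, which reflects the $\Delta\omega$ in the denominator: the current-to-phase conversion behaves as an integrator.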
Fig. 9. Simulated power spectrum of the output with current injection at (a) $f_m = 50$ MHz and (b) $f_0 + f_m = 1.06$ GHz.

Fig. 10. Simulated and calculated sideband powers for the first ten coefficients.
This process is shown in Fig. 8. Appearance of the frequency deviation $\Delta\omega$ in the denominator of (18) underscores that the impulse response $h_\phi(t, \tau)$ is a step function and therefore behaves as a time-varying integrator. We will frequently refer to (18) in subsequent sections.

Applying this method of analysis to an arbitrary oscillator, a sinusoidal current injected into one of the oscillator nodes at a frequency $n\omega_0 + \Delta\omega$ results in two equal sidebands at $\omega_0 \pm \Delta\omega$, as observed in [9]. Note that it is necessary to use an LTV model because an LTI model cannot explain the presence of a pair of equal sidebands close to the carrier arising from sources at frequencies $n\omega_0 + \Delta\omega$, because an LTI system cannot produce any frequencies except those of the input and those associated with the system's poles. Furthermore, the amplitude of the resulting sidebands, as well as their equality, cannot be predicted by conventional intermodulation effects. This failure is to be expected since the intermodulation terms arise from nonlinearity in the voltage (or current) input/output characteristic of active devices of the form $v_{out} = \alpha_1 v_{in} + \alpha_2 v_{in}^2 + \cdots$. This type of nonlinearity does not directly appear in the phase transfer characteristic and shows itself only indirectly in the ISF.

It is instructive to compare the predictions of (18) with simulation results. A sinusoidal current of 10 µA amplitude at different frequencies was injected into node 1 of the 1.01-GHz ring oscillator of Fig. 5(b). Fig. 9(a) shows the simulated power spectrum of the signal on node 4 for a low frequency input at $f_m = 50$ MHz. This power spectrum is obtained using the fast Fourier transform (FFT) analysis in HSPICE 96.1. It is noteworthy that in this version of HSPICE the simulation artifacts observed in [9] have been properly eliminated by calculation of the values used in the analysis at the exact points of interest. Note that the injected noise is upconverted into two equal sidebands at $f_0 + f_m$ and $f_0 - f_m$, as predicted by (18). Fig. 9(b) shows the effect of injection of a current at $f_0 + f_m = 1.06$ GHz. Again, two equal sidebands are observed at $f_0 + f_m$ and $f_0 - f_m$, also as predicted by (18).

Simulated sideband power for the general case of current injection at $n f_0 + f_m$ can be compared to the predictions of (18). The ISF for this oscillator is obtained by the simulation method of the Appendix. Here, $q_{max}$ is equal to $C_{node} V_{swing}$, where $C_{node}$ is the average capacitance on each node of the circuit and $V_{swing}$ is the maximum swing across it. For this oscillator, $C_{node}$ = … fF and $V_{swing}$ = … V, which results in $q_{max}$ = … fC. For a sinusoidal injected current of amplitude … µA, and an $f_m$ of 50 MHz, Fig. 10 depicts the simulated and predicted sideband powers. As can be seen from the figure, these agree to within 1 dB for the higher power sidebands. The discrepancy in the case of the low power sidebands (… – …) arises from numerical noise in the simulations, which represents a greater fractional error at lower sideband power. Overall, there is satisfactory agreement between simulation and the theory of conversion of noise from various frequencies into phase fluctuations.

C. Prediction of Phase Noise Sideband Power

Now we consider the case of a random noise current $i_n(t)$ whose power spectral density has both a flat region and a $1/f$ region, as shown in Fig. 11. As can be seen from (18) and the foregoing discussion, noise components located near integer multiples of the oscillation frequency are transformed to low frequency noise sidebands for $S_\phi(\omega)$, which in turn become close-in phase noise in the spectrum of $S_v(\omega)$, as illustrated in Fig. 11. It can be seen that the total $S_\phi(\omega)$ is given by the sum of phase noise contributions from device noise in the vicinity of the integer multiples of $\omega_0$, weighted by the coefficients $c_n$. This is shown in Fig. 12(a) (logarithmic frequency scale). The resulting single sideband spectral noise density $\mathcal{L}\{\Delta\omega\}$ is plotted on a logarithmic scale in Fig. 12(b). The sidebands in the spectrum of $S_\phi(\omega)$, in turn, result in phase noise sidebands in the spectrum of $S_v(\omega)$ through the PM mechanism discussed in the previous subsection. This process is shown in Figs. 11 and 12.

The theory predicts the existence of $1/f^3$, $1/f^2$, and flat regions for the phase noise spectrum. The low-frequency noise sources, such as flicker noise, are weighted by the coefficient $c_0$ and show a $1/f^3$ dependence on the offset frequency, while
(a)
(b)
(23)
(19)
Now, according to Parseval's relation we have

(20)

where $\Gamma_{rms}$ is the rms value of $\Gamma(x)$. As a result

(21)

This equation represents the phase noise spectrum of an arbitrary oscillator in the $1/f^2$ region of the phase noise spectrum. For a voltage noise source in series with an inductor, $q_{max}$ should be replaced with $\Phi_{max}$, where $\Phi_{max}$ represents the maximum magnetic flux swing in the inductor.

We may now investigate quantitatively the relationship between the device $1/f$ corner and the $1/f^3$ corner of the phase noise. It is important to note that it is by no means

The $1/f^3$ phase noise corner, $\Delta\omega_{1/f^3}$, is the frequency where the sideband power due to the white noise given by (21) is equal to the sideband power arising from the $1/f$ noise given by (23), as shown in Fig. 12. Solving for $\Delta\omega_{1/f^3}$ results in the following expression for the $1/f^3$ corner in the phase noise spectrum:

(24)

This equation together with (21) describe the phase noise spectrum and are the major results of this section. As can be seen, the $1/f^3$ phase noise corner due to internal noise sources is not equal to the $1/f$ device noise corner, but is smaller by a factor determined by $c_0$ and $\Gamma_{rms}$. As will be discussed later, $c_0$ depends on the waveform and can be significantly reduced if certain symmetry properties exist in the waveform of the oscillation. Thus, poor $1/f$ device noise need not imply poor close-in phase noise performance.
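The dependence of the corner on waveform symmetry can be made concrete with a small numerical experiment. This sketch samples an ISF over one period, computes its dc and rms values, and forms the ratio $(\Gamma_{dc}/\Gamma_{rms})^2$ with $\Gamma_{dc} = c_0/2$. Using that ratio as the corner scaling is an assumption of this example (it is the commonly quoted form, and its constant should be checked against (24)); the two test ISFs and all function names are hypothetical.

```python
import math


def isf_dc_and_rms(isf, samples=4096):
    """Return (dc, rms) of a 2*pi-periodic ISF from uniform samples."""
    values = [isf(2.0 * math.pi * k / samples) for k in range(samples)]
    dc = sum(values) / samples
    rms = math.sqrt(sum(v * v for v in values) / samples)
    return dc, rms


def corner_scaling(isf, samples=4096):
    """Assumed scaling of the 1/f^3 corner relative to the 1/f corner."""
    dc, rms = isf_dc_and_rms(isf, samples)
    return (dc / rms) ** 2


def symmetric_isf(x):
    # Hypothetical ISF with equal positive and negative lobes (zero dc value).
    return math.sin(x)


def asymmetric_isf(x):
    # Hypothetical ISF with a small dc offset, as an asymmetric waveform gives.
    return math.sin(x) + 0.1


scale_symmetric = corner_scaling(symmetric_isf)
scale_asymmetric = corner_scaling(asymmetric_isf)
```

For the symmetric case the ratio is zero to within rounding, so the $1/f$ noise is not upconverted at all in this idealization, while a dc offset of only 0.1 already gives a ratio of about 0.02.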
Fig. 13. Collector voltage and collector current of the Colpitts oscillator of Fig. 5(a).

Fig. 14. $\Gamma(x)$, $\Gamma_{eff}(x)$, and $\alpha(x)$ for the Colpitts oscillator of Fig. 5(a).
D. Cyclostationary Noise Sources

In addition to the periodically time-varying nature of the system itself, another complication is that the statistical properties of some of the random noise sources in the oscillator may change with time in a periodic manner. These sources are referred to as cyclostationary. For instance, the channel noise of a MOS device in an oscillator is cyclostationary because the noise power is modulated by the gate source overdrive which varies with time periodically. There are other noise sources in the circuit whose statistical properties do not depend on time and the operation point of the circuit, and are therefore called stationary. Thermal noise of a resistor is an example of a stationary noise source.

A white cyclostationary noise current $i_n(t)$ can be decomposed as [13]:

$$i_n(t) = i_{n0}(t)\cdot\alpha(\omega_0 t) \qquad (25)$$

where $i_n(t)$ is a white cyclostationary process, $i_{n0}(t)$ is a white stationary process and $\alpha(\omega_0 t)$ is a deterministic periodic function describing the noise amplitude modulation. We define $\alpha(\omega_0 t)$ to be a normalized function with a maximum value of 1. This way, $\overline{i_{n0}^2}$ is equal to the maximum mean square noise power, $\overline{i_n^2(t)}$, which changes periodically with time. Applying the above expression for $i_n(t)$ to (11), $\phi(t)$ is given by

$$\phi(t) = \int_{-\infty}^{t} i_n(\tau)\,\frac{\Gamma(\omega_0\tau)}{q_{max}}\, d\tau = \int_{-\infty}^{t} i_{n0}(\tau)\,\frac{\alpha(\omega_0\tau)\,\Gamma(\omega_0\tau)}{q_{max}}\, d\tau \qquad (26)$$

As can be seen, the cyclostationary noise can be treated as a stationary noise applied to a system with an effective ISF given by

$$\Gamma_{eff}(x) = \Gamma(x)\cdot\alpha(x) \qquad (27)$$

where $\alpha(x)$ can be derived easily from device noise characteristics and operating point. Hence, this effective ISF should be used in all subsequent calculations, in particular, calculation of the coefficients $c_n$.

Note that there is a strong correlation between the cyclostationary noise source and the waveform of the oscillator. The maximum of the noise power always appears at a certain point of the oscillatory waveform, thus the average of the noise may not be a good representation of the noise power.

Consider as one example the Colpitts oscillator of Fig. 5(a). The collector voltage and the collector current of the transistor are shown in Fig. 13. Note that the collector current consists of a short period of large current followed by a quiet interval. The surge of current occurs at the minimum of the voltage across the tank where the ISF is small. Functions $\Gamma(x)$, $\Gamma_{eff}(x)$, and $\alpha(x)$ for this oscillator are shown in Fig. 14. Note that, in this case, $\Gamma_{eff}(x)$ is quite different from $\Gamma(x)$, and hence the effect of cyclostationarity is very significant for the LC oscillator and cannot be neglected.

The situation is different in the case of the ring oscillator of Fig. 5(b), because the devices have maximum current during the transition (when $\Gamma(x)$ is at a maximum, i.e., the sensitivity is large) at the same time the noise power is large. Functions $\Gamma(x)$, $\Gamma_{eff}(x)$, and $\alpha(x)$ for the ring oscillator of Fig. 5(b) are shown in Fig. 15. Note that in the case of the ring oscillator $\Gamma(x)$ and $\Gamma_{eff}(x)$ are almost identical. This indicates that the cyclostationary properties of the noise are less important in the treatment of the phase noise of ring oscillators. This unfortunate coincidence is one of the reasons why ring oscillators in general have inferior phase noise performance compared to a Colpitts LC oscillator. The other important reason is that ring oscillators dissipate all the stored energy during one cycle.

E. Predicting Output Phase Noise with Multiple Noise Sources

The method of analysis outlined so far has been used to predict how much phase noise is contributed by a single noise source. However, this method may be extended to multiple noise sources and multiple nodes, as individual contributions by the various noise sources may be combined by exploiting superposition. Superposition holds because the first system of Fig. 8 is linear.
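The contrast drawn in Section III-D between the Colpitts and ring oscillators can be sketched numerically with the effective ISF of (27), $\Gamma_{eff}(x) = \Gamma(x)\,\alpha(x)$. Everything below is an illustrative assumption rather than data from the paper: the ISF is taken as $-\sin x$ (voltage $\cos x$), and the two noise modulating functions are hypothetical raised-cosine pulses normalized to a peak of 1, one centered on the voltage minimum (where the ISF is zero) and one centered on a transition (where the ISF magnitude peaks).

```python
import math


def isf(x):
    # Assumed ISF for a tank voltage of cos(x).
    return -math.sin(x)


def alpha_at_voltage_minimum(x):
    # Hypothetical noise modulation peaking at x = pi (Colpitts-like case).
    return ((1.0 - math.cos(x)) / 2.0) ** 4


def alpha_at_transition(x):
    # Hypothetical noise modulation peaking at x = pi/2 (ring-like case).
    return ((1.0 + math.sin(x)) / 2.0) ** 4


def effective_isf_rms(alpha, samples=4096):
    """rms value of Gamma_eff(x) = Gamma(x) * alpha(x) over one period."""
    total = 0.0
    for k in range(samples):
        x = 2.0 * math.pi * k / samples
        total += (isf(x) * alpha(x)) ** 2
    return math.sqrt(total / samples)


rms_colpitts_like = effective_isf_rms(alpha_at_voltage_minimum)
rms_ring_like = effective_isf_rms(alpha_at_transition)
```

With these assumed shapes the effective rms value is roughly half as large when the noise burst coincides with the insensitive part of the cycle, which is the qualitative point made about the Colpitts oscillator.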
The actual method of combining the individual contributions requires attention to any possible correlations that may exist among the noise sources. The complete method for doing so may be appreciated by noting that an oscillator has a current noise source in parallel with each capacitor and a voltage noise source in series with each inductor. The phase noise in the output of such an oscillator is calculated using the following method.

1) Find the equivalent current noise source in parallel with each capacitor and an equivalent voltage source in series with each inductor, keeping track of correlated and noncorrelated portions of the noise sources for use in later steps.
2) Find the transfer characteristic from each source to the output excess phase. This can be done as follows.
   a) Find the ISF for each source, using any of the methods proposed in the Appendix, depending on the required accuracy and simplicity.
   b) Find $\Gamma_{rms}$ and $\Gamma_{dc}$ (rms and dc values) of the ISF.
3) Use $\Gamma_{rms}$ and $\Gamma_{dc}$ coefficients and the power spectrum of the input noise sources in (21) and (23) to find the phase noise power resulting from each source.
4) Sum the individual output phase noise powers for uncorrelated sources and square the sum of phase noise rms values for correlated sources to obtain the total noise power below the carrier.

Note that the amount of phase noise contributed by each noise source depends only on the value of the noise power density $\overline{i_n^2}/\Delta f$, the amount of charge swing across the effective capacitor it is injecting into, $q_{max}$, and the steady-state oscillation waveform across the noise source of interest. This observation is important since it allows us to attribute a definite contribution from every noise source to the overall phase noise. Hence, our treatment is both an analysis and design tool, enabling designers to identify the significant contributors to phase noise.

Fig. 15. $\Gamma(x)$, $\Gamma_{eff}(x)$, and $\alpha(x)$ for the ring oscillator of Fig. 5(b).

F. Existing Models as Simplified Cases
As asserted earlier, the model proposed here reduces to earlier models if the same simplifying assumptions are made.

$\frac{\overline{i_n^2}}{\Delta f} = \frac{4kT}{R_p}, \qquad q_{max} = C\,V_{max}$ (28)

where $R_p$ is the parallel resistor, $C$ is the tank capacitor, and $V_{max}$ is the maximum voltage swing across the tank. Equation (19) reduces to

$\mathcal{L}\{\Delta\omega\} = 10\log\left[\frac{kT}{V_{max}^2}\cdot\frac{1}{R_p\,(C\omega_0)^2}\cdot\left(\frac{\omega_0}{\Delta\omega}\right)^2\right]$ (29)

Since [8] assumes equal contributions from amplitude and phase portions to $\mathcal{L}\{\Delta\omega\}$, the result obtained in [8] is two times larger than the result of (29).
Assuming that the total noise contribution in a parallel tank oscillator can be modeled using an excess noise factor $F$ as in [3], (29) together with (24) result in (6). Note that the generalized approach presented here is capable of calculating the fitting parameters used in (3), ($F$ and $\Delta\omega_{1/f^3}$) in terms of coefficients of ISF and device $1/f$ noise corner, $\omega_{1/f}$.

IV. DESIGN IMPLICATIONS
Several design implications emerge from (18), (21), and (24) that offer important insight for reduction of phase noise in the oscillators. First, they show that increasing the signal charge displacement $q_{max}$ across the capacitor will reduce the phase noise degradation by a given noise source, as has been noted in previous works [5], [6].
In addition, the noise power around integer multiples of the oscillation frequency has a more significant effect on the close-in phase noise than at other frequencies, because these noise components appear as phase noise sidebands in the vicinity of the oscillation frequency, as described by (18). Since the contributions of these noise components are scaled by the Fourier series coefficients $c_n$ of the ISF, the designer should seek to minimize spurious interference in the vicinity of $n\omega_0$ for values of $n$ such that $c_n$ is large.
Criteria for the reduction of phase noise in the $1/f^3$ region are suggested by (24), which shows that the $1/f^3$ corner of the phase noise is proportional to the square of the coefficient $c_0$. Recalling that $c_0$ is twice the dc value of the (effective) ISF function, namely

$c_0 = \frac{1}{\pi}\int_0^{2\pi}\Gamma_{eff}(x)\,dx$ (30)

it is clear that it is desirable to minimize the dc value of the ISF. As shown in the Appendix, the value of $c_0$ is closely related to certain symmetry properties of the oscillation
188 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 33, NO. 2, FEBRUARY 1998
waveform. One such property concerns the rise and fall times; the ISF will have a large dc value if the rise and fall times of the waveform are significantly different. A limited case of this for odd-symmetric waveforms has been observed [14]. Although odd-symmetric waveforms have small $c_0$ coefficients, the class of waveforms with small $c_0$ is not limited to odd-symmetric waveforms.
To illustrate the effect of a rise and fall time asymmetry, consider a purposeful imbalance of pull-up and pull-down rates in one of the inverters in the ring oscillator of Fig. 5(b). This is obtained by halving the channel width $W_n$ of the NMOS device and doubling the width $W_p$ of the PMOS device of one inverter in the ring. The output waveform and corresponding ISF are shown in Fig. 16(a) and (b). As can be seen, the ISF has a large dc value. For comparison, the waveform and ISF at the output of a symmetrical inverter elsewhere in the ring are shown in Fig. 16(c) and (d). From these results, it can be inferred that the close-in phase noise due to low-frequency noise sources should be smaller for the symmetrical output than for the asymmetrical one. To investigate this assertion, the results of two SPICE simulations are shown in Fig. 17. In the first simulation, a sinusoidal current source of amplitude 10 μA at $f_m = 50$ MHz is applied to one of the symmetric nodes of the oscillator. In the second experiment, the same source is applied to the asymmetric node. As can be seen from the power spectra of the figure, noise injected into the asymmetric node results in sidebands that are 12 dB larger than at the symmetric node.
Note that (30) suggests that upconversion of low frequency noise can be significantly reduced, perhaps even eliminated, by minimizing $\Gamma_{dc}$, at least in principle. Since $\Gamma_{dc}$ depends on the waveform, this observation implies that a proper choice of waveform may yield significant improvements in close-in phase noise. The following experiment explores this concept by changing the ratio of $W_p$ to $W_n$ over some range, while injecting 10 μA of sinusoidal current at 100 MHz into one node. The sideband power below carrier as a function of the $W_p$ to $W_n$ ratio is shown in Fig. 18. The SPICE-simulated sideband power is shown with plus symbols and the sideband power as predicted by (18) is shown by the solid line. As can be seen, close-in phase noise due to upconversion of low-frequency noise can be suppressed by an arbitrary factor, at least in principle. It is important to note, however, that the minimum does not necessarily correspond to equal transconductance ratios, since other waveform properties influence the value of $\Gamma_{dc}$. In fact, the optimum $W_p$ to $W_n$ ratio in this particular example is seen to differ considerably from that used in conventional ring oscillator designs.
The importance of symmetry might lead one to conclude that differential signaling would minimize $c_0$. Unfortunately, while differential circuits are certainly symmetrical with respect to the desired signals, the differential symmetry disappears for the individual noise sources because they are independent of each other. Hence, it is the symmetry of each half-circuit that is important, as is demonstrated in the differential ring oscillator of Fig. 19. A sinusoidal current of 100 μA at 50 MHz injected at the drain node of one of the buffer stages results in two equal sidebands, 46 dB below carrier, in the power spectrum of the differential output. Because of the voltage dependent conductance of the load devices, the individual waveform on each output node is not fully symmetrical and consequently, there will be a large

Fig. 16. (a) Waveform and (b) ISF for the asymmetrical node. (c) Waveform and (d) ISF for one of the symmetrical nodes.
Fig. 17. Simulated power spectrum with current injection at $f_m = 50$ MHz for (a) asymmetrical node and (b) symmetrical node.
Fig. 18. Simulated and predicted sideband power for low frequency injection
Fig. 20. Measured sideband power versus injected current at $f_m = 100$ kHz, $f_0 + f_m = 5.5$ MHz, $2f_0 + f_m = 10.9$ MHz, $3f_0 + f_m = 16.3$ MHz.
Fig. 21. Measured sideband power versus $f_m$, for injections in vicinity of multiples of $f_0$.
Fig. 23. Phase noise measurements for a five-stage single-ended CMOS ring oscillator. $f_0 = 232$ MHz, 2-μm process technology.
Fig. 24. Phase noise measurements for an 11-stage single-ended CMOS ring oscillator. $f_0 = 115$ MHz, 2-μm process technology.
Fig. 25. Effect of symmetry in a seven-stage current-starved single-ended CMOS VCO. $f_0 = 60$ MHz, 2-μm process technology.
Fig. 26. Sideband power versus the voltage controlling the symmetry of the waveform. Seven-stage current-starved single-ended CMOS VCO. $f_0 = 50$ MHz, 2-μm process technology.
Fig. 27. Phase noise measurements for a four-stage differential CMOS ring oscillator. $f_0 = 200$ MHz, 0.5-μm process technology.
constant at 60 MHz. As can be seen, making the waveform more symmetric has a large effect on the phase noise in the $1/f^3$ region without significantly affecting the $1/f^2$ region.
Another experiment on the same circuit is shown in Fig. 26, which shows the phase noise power spectrum at a 10 kHz offset versus the symmetry-controlling voltage. For all the data points, the control voltages are adjusted to keep the oscillation frequency at 50 MHz. As can be seen, the phase noise reaches a minimum by adjusting the symmetry properties of the waveform. This reduction is limited by the phase noise in the $1/f^2$ region and the mismatch in transistors in different stages, which are controlled by the same control voltages.
The seventh experiment is performed on a four-stage differential ring oscillator, with PMOS loads and NMOS differential stages, implemented in a 0.5-μm CMOS process. Each stage is tapped with an equal-sized buffer. The tail current source has a quiescent current of 108 μA. The total capacitance on each of the differential nodes is calculated to be fF and the voltage swing is V, which results in fF. The total channel noise current on each node is A²/Hz. Using these numbers for , the phase noise in the $1/f^2$ region is predicted to be , or −103.2 dBc/Hz at an offset of 1 MHz, while the measurement in Fig. 27 shows a phase noise of −103.9 dBc/Hz, again in agreement with prediction. Also note that despite differential symmetry, there is a distinct $1/f^3$ region in the phase noise spectrum, because each half circuit is not symmetrical.
The eighth experiment investigates cyclostationary effects in the bipolar Colpitts oscillator of Fig. 5(a), where the conduction angle is varied by changing the capacitive divider ratio $n$ while keeping the effective parallel capacitance constant to maintain an $f_0$ of 100 MHz. As can be seen in Fig. 28, increasing $n$ decreases the conduction angle, and thereby reduces the effective $\Gamma_{rms}$, leading to an initial decrease in phase noise. However, the oscillation amplitude is approximately given by , and therefore decreases for large values of $n$. The phase noise ultimately increases for large $n$ as a consequence. There is thus a definite value of $n$ (here, about 0.2) that minimizes the phase noise. This result provides a theoretical basis for the common rule-of-thumb that one should
Fig. 28. Sideband power versus capacitive division ratio. Bipolar LC Colpitts
oscillator f0 = 100 MHz.
Fig. 29. State-space trajectory of an nth-order oscillator.
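As a rough numerical illustration of how waveform asymmetry produces a dc term in the ISF, the closed-form second-order expression from the Appendix, $\Gamma(x) = f'(x)/(f'^2(x) + f''^2(x))$, can be sampled over one period. This is a sketch under stated assumptions: the test waveform $f(x) = \cos x + a\sin 2x$, the value of $a$, and the helper names are illustrative choices and do not come from the paper.

```python
import math

def isf_second_order(fp, fpp, n=4096):
    """Sample Gamma(x) = f'(x) / (f'(x)^2 + f''(x)^2) at n points in [0, 2*pi)."""
    xs = [2 * math.pi * k / n for k in range(n)]
    return [fp(x) / (fp(x) ** 2 + fpp(x) ** 2) for x in xs]

def dc_and_rms(samples):
    """Return (Gamma_dc, Gamma_rms) of one sampled period of the ISF."""
    n = len(samples)
    dc = sum(samples) / n
    rms = math.sqrt(sum(s * s for s in samples) / n)
    return dc, rms

# Ideal sinusoid f(x) = cos(x): f' = -sin(x), f'' = -cos(x).
sym = dc_and_rms(isf_second_order(lambda x: -math.sin(x),
                                  lambda x: -math.cos(x)))

# Hypothetical asymmetric waveform f(x) = cos(x) + a*sin(2x).
a = 0.02  # illustrative asymmetry, not a value from the paper
asym = dc_and_rms(isf_second_order(
    lambda x: -math.sin(x) + 2 * a * math.cos(2 * x),
    lambda x: -math.cos(x) - 4 * a * math.sin(2 * x)))

c0_sym, c0_asym = 2 * sym[0], 2 * asym[0]  # c_0 is twice the dc value
```

For the ideal sinusoid the dc value vanishes and the rms value is $1/\sqrt{2}$; for the asymmetric test waveform a first-order expansion in $a$ gives a dc value of about $3a$, so even a small second-harmonic asymmetry yields a nonzero $c_0$.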
the “speed”
(32)
(33)
where $|\dot{\vec{v}}|$ is the norm of the first derivative of the waveform vector and $\dot{v}_i$ is the derivative of the $i$th node voltage. Equation (34), together with the normalized waveform function $f$ defined in (1), result in the following:
(35)
where $f_i'$ represents the derivative of the normalized waveform on node $i$, hence

$\Gamma_i(x) = \frac{f_i'(x)}{\sum_{j=1}^{n} f_j'^2(x)}$ (36)

It can be seen that this expression for the ISF is maximum during transitions (i.e., when the derivative of the waveform function is maximum), and this maximum value is inversely proportional to the maximum derivative. Hence, waveforms with larger slope show a smaller peak in the ISF function.
In the special case of a second-order system, one can use the normalized waveform $f$ and its derivative as the state variables, resulting in the following expression for the ISF:

$\Gamma(x) = \frac{f'(x)}{f'^2(x) + f''^2(x)}$ (37)

where $f''$ represents the second derivative of the function $f$. In the case of an ideal sinusoidal oscillator $f'' = -f$, so that $\Gamma(x) = f'(x)$, which is consistent with the argument of Section III. This method has the attribute that it computes the ISF from the waveform directly, so that simulation over only one cycle of $f$ is required to obtain all of the necessary information.

C. Calculation of ISF Based on the First Derivative
This method is actually a simplified version of the second approach. In certain cases, the denominator of (36) shows little variation, and can be approximated by a constant. In such a case, the ISF is simply proportional to the derivative of the waveform. A specific example is a ring oscillator with $N$ identical stages. The denominator may then be approximated by
(38)
Fig. 30 shows the results obtained from this method compared with the more accurate results obtained from methods A and B. Although this method is approximate, it is the easiest to use and allows a designer to rapidly develop important insights into the behavior of an oscillator.

ACKNOWLEDGMENT
The authors would like to thank T. Ahrens, R. Betancourt, R. Farjad-Rad, M. Heshami, S. Mohan, H. Rategh, H. Samavati, D. Shaeffer, A. Shahani, K. Yu, and M. Zargari of Stanford University and Prof. B. Razavi of UCLA for helpful discussions. The authors would also like to thank M. Zargari, R. Betancourt, B. Amruturand, J. Leung, J. Shott, and Stanford Nanofabrication Facility for providing several test chips. They are also grateful to Rockwell Semiconductor for providing access to their phase noise measurement system.

REFERENCES
[1] E. J. Baghdady, R. N. Lincoln, and B. D. Nelin, “Short-term frequency stability: Characterization, theory, and measurement,” Proc. IEEE, vol. 53, pp. 704–722, July 1965.
[2] L. S. Cutler and C. L. Searle, “Some aspects of the theory and measurement of frequency fluctuations in frequency standards,” Proc. IEEE, vol. 54, pp. 136–154, Feb. 1966.
[3] D. B. Leeson, “A simple model of feedback oscillator noise spectrum,” Proc. IEEE, vol. 54, pp. 329–330, Feb. 1966.
[4] J. Rutman, “Characterization of phase and frequency instabilities in precision frequency sources; Fifteen years of progress,” Proc. IEEE, vol. 66, pp. 1048–1174, Sept. 1978.
[5] A. A. Abidi and R. G. Meyer, “Noise in relaxation oscillators,” IEEE J. Solid-State Circuits, vol. SC-18, pp. 794–802, Dec. 1983.
[6] T. C. Weigandt, B. Kim, and P. R. Gray, “Analysis of timing jitter in CMOS ring oscillators,” in Proc. ISCAS, June 1994, vol. 4, pp. 27–30.
[7] J. McNeil, “Jitter in ring oscillators,” in Proc. ISCAS, June 1994, vol. 6, pp. 201–204.
[8] J. Craninckx and M. Steyaert, “Low-noise voltage controlled oscillators using enhanced LC-tanks,” IEEE Trans. Circuits Syst.–II, vol. 42, pp. 794–904, Dec. 1995.
[9] B. Razavi, “A study of phase noise in CMOS oscillators,” IEEE J. Solid-State Circuits, vol. 31, pp. 331–343, Mar. 1996.
[10] B. van der Pol, “The nonlinear theory of electric oscillations,” Proc. IRE, vol. 22, pp. 1051–1086, Sept. 1934.
[11] N. Minorsky, Nonlinear Oscillations. Princeton, NJ: Van Nostrand, 1962.
[12] P. A. Cook, Nonlinear Dynamical Systems. New York: Prentice Hall, 1994.
[13] W. A. Gardner, Cyclostationarity in Communications and Signal Processing. New York: IEEE Press, 1993.
[14] H. B. Chen, A. van der Ziel, and K. Amberiadis, “Oscillator with odd-symmetrical characteristics eliminates low-frequency noise sidebands,” IEEE Trans. Circuits Syst., vol. CAS-31, Sept. 1984.
[15] J. G. Maneatis, “Precise delay generation using coupled oscillators,” IEEE J. Solid-State Circuits, vol. 28, pp. 1273–1282, Dec. 1993.
[16] C. K. Yang, R. Farjad-Rad, and M. Horowitz, “A 0.6-μm CMOS 4Gb/s transceiver with data recovery using oversampling,” in Symp. VLSI Circuits, Dig. Tech. Papers, June 1997.
[17] D. DeMaw, Practical RF Design Manual. Englewood Cliffs, NJ: Prentice-Hall, 1982, p. 46.

Thomas H. Lee (M’83) received the S.B., S.M., Sc.D. degrees from the Massachusetts Institute of Technology (MIT), Cambridge, in 1983, 1985, and 1990, respectively.
He worked for Analog Devices Semiconductor, Wilmington, MA, until 1992, where he designed high-speed clock-recovery PLL’s that exhibit zero jitter peaking. He then worked for Rambus Inc., Mountain View, CA, where he designed the phase- and delay-locked loops for 500 MB/s DRAM’s. In 1994, he joined the faculty of Stanford University, Stanford, CA, as an Assistant Professor, where he is primarily engaged in research into microwave applications for silicon IC technology, with a focus on CMOS IC’s for wireless communications.
Dr. Lee was recently named a recipient of a Packard Foundation Fellowship award and is the author of The Design of CMOS Radio-Frequency Integrated Circuits (Cambridge University Press). He has twice received the “Best Paper” award at ISSCC.
Transactions Briefs
A Study of Oscillator Jitter Due
to Supply and Substrate Noise
Frank Herzel and Behzad Razavi
I. INTRODUCTION
High-speed digital circuits such as microprocessors and memories
employ phase locking at the board-chip interface to suppress timing
skews between the on-chip clock and the system clock [1]–[3].
Fabricated on the same substrate as the rest of the circuit, the
phase-locked loop (PLL) must typically operate from the global
supply and ground busses, thus experiencing both substrate and
supply noise. The noise manifests itself as jitter at the output of the
PLL, primarily through various mechanisms in the voltage-controlled
oscillator (VCO). As exemplified by measured results reported in the
literature, we show that the contribution of device electronic noise
to jitter is typically much less than that due to supply and substrate
noise.
This paper describes the effect of supply and substrate noise on
the performance of single-ended and differential ring oscillators,
providing insights that prove useful in the design of other types of
oscillators as well. Section II summarizes the oscillators studied in
this work and Section III defines various types of jitter. Sections IV
and V quantify the jitter due to thermal noise in the oscillation loop and frequency-modulating noise, respectively. Sections VI and VII apply the developed results to the analysis of supply and substrate noise, and Section VIII presents the dependence of jitter upon parameters such as device size, the number of stages, and power dissipation.

II. RING OSCILLATORS UNDER INVESTIGATION
In this paper, we investigate both single-ended ring oscillators (SERO’s) and differential ring oscillators (DRO’s). The latter are much more important in digital circuit applications, since DRO’s are less affected by supply and substrate noise. The circuit topologies are shown in Fig. 1 for the SERO and in Fig. 2 for the DRO.
The simulations were performed with the SPICE parameters of a 0.6-μm CMOS technology. We employed the minimum gate length throughout the paper. Furthermore, unless indicated otherwise, we use the following parameters for the differential stage: $W = 80$ μm, $R_L = 1$ kΩ, $I_{SS} = 1$ mA, $C_L = 0$; $V_{DD} = 3$ V. The rms value of $\Delta V_{DD}$ was chosen to be 71 mV, corresponding to a peak amplitude of 100 mV for a sinusoidal perturbation.

Fig. 2. Differential ring oscillator: (a) block diagram and (b) implementation of one stage.

III. DEFINITIONS OF JITTER
We consider the output voltage $V_{out}(t)$ of an oscillator in the steady state. The time point of the $n$th minus-to-plus zero crossing of $V_{out}(t)$ is referred to as $t_n$. The $n$th period is then defined as $T_n = t_{n+1} - t_n$. For an ideal oscillator, this time difference is independent of $n$, but in reality it varies with $n$ as a result of noise in the circuit. This results in a deviation $\Delta T_n = T_n - T$ from the mean period $T$. The quantity $\Delta T_n$ is an indication of jitter.

Manuscript received October 1, 1997; revised August 2, 1998. This paper was recommended by Associate Editor B. H. Leung.
F. Herzel was with the Electrical Engineering Department, University of California at Los Angeles, Los Angeles, CA 90095, USA, on leave from the Institute for Semiconductor Physics, Frankfurt, Oder, Germany.
B. Razavi is with the Electrical Engineering Department, University of California, Los Angeles, CA 90095 USA.
Publisher Item Identifier S 1057-7130(99)01471-8.
1057–7130/99$10.00 © 1999 IEEE
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—II: ANALOG AND DIGITAL SIGNAL PROCESSING, VOL. 46, NO. 1, JANUARY 1999 57
Fig. 3. Illustration of (a) long-term jitter and (b) cycle-to-cycle jitter.

More specifically, absolute jitter or long-term jitter

digital systems.

$\Delta T_c = \lim_{N\to\infty}\sqrt{\frac{1}{N}\sum_{n=1}^{N}\Delta T_n^2}.$ (2)

Cycle jitter describes the magnitude of the period fluctuations, but it contains no information about the dynamics.
The third type of jitter considered here is cycle-to-cycle jitter [Fig. 3(b)] given by

$\Delta T_{cc} = \lim_{N\to\infty}\sqrt{\frac{1}{N}\sum_{n=1}^{N}(T_{n+1} - T_n)^2}$ (3)

because the latter type hardly changes when the oscillator is placed in the loop.
A more general quantification of the jitter is possible by means of the steady-state autocorrelation function (ACF) defined as

$C_{\Delta T}(m) = \lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\left(\Delta T_{n+m}\,\Delta T_n\right).$ (4)

To obtain an intuitive understanding of this quantity, we insert (4) with $m = 0$ in (2), obtaining

$\Delta T_c^2 = C_{\Delta T}(0).$ (5)

Equation (5) states that the ACF with zero argument is the squared cycle jitter. For a nonzero argument, the ACF decreases with increasing $m$, finally approaching zero for $m \to \infty$. This indicates that the timing error $\Delta T_n$ has a finite memory. In order to express the cycle-to-cycle jitter by the ACF, we rewrite (3) as

$\Delta T_{cc}^2 = \lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\left(\Delta T_{n+1} - \Delta T_n\right)^2 = 2C_{\Delta T}(0) - 2C_{\Delta T}(1).$ (6)

As derived in the Appendix, for white noise sources in the oscillator, the single-sideband phase noise $S_\phi$ (phase noise with respect to the carrier) can be expressed in terms of the cycle-to-cycle jitter according to

$S_\phi(\omega) = \frac{(\omega_0^3/4\pi)\,\Delta T_{cc}^2}{(\omega - \omega_0)^2 + (\omega_0^3/8\pi)^2\,\Delta T_{cc}^4}$ (7)

where $\omega_0$ is the oscillation frequency and $\omega - \omega_0$ is the offset frequency. The Appendix also shows that the cycle-to-cycle jitter can be deduced from the phase noise according to
$\Delta T(t) = -\frac{V_m K_0}{f_0^2}\cos\omega_m t.$ (13)

Multiplying this expression by $\Delta T(t + \tau)$ and averaging the result with respect to $t$, we obtain the steady-state ACF

$\overline{\Delta T(t + \tau)\,\Delta T(t)} = \frac{V_m^2 K_0^2}{2 f_0^4}\cos\omega_m\tau.$ (14)
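To make the definitions concrete, the sketch below computes cycle jitter and cycle-to-cycle jitter from a list of zero-crossing times, following (2) and (3), and compares them with the closed forms that follow from (14) together with (5) and (6) when the lag of one period is taken as $\tau = 1/f_0$. The function names and all numerical values are illustrative assumptions, not taken from the paper.

```python
import math

def jitter_from_crossings(t):
    """Cycle jitter (2) and cycle-to-cycle jitter (3) from crossing times t_n."""
    periods = [b - a for a, b in zip(t, t[1:])]            # T_n = t_{n+1} - t_n
    mean_period = sum(periods) / len(periods)
    dev = [p - mean_period for p in periods]               # Delta T_n
    cycle = math.sqrt(sum(d * d for d in dev) / len(dev))
    diffs = [b - a for a, b in zip(periods, periods[1:])]  # T_{n+1} - T_n
    cycle_to_cycle = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return cycle, cycle_to_cycle

def quasi_static_fm_jitter(Vm, K0, f0, fm):
    """Closed forms from (14): C(0) for (5), and 2C(0) - 2C(1) for (6)."""
    amp = Vm * K0 / f0 ** 2
    cycle = amp / math.sqrt(2)
    cycle_to_cycle = amp * math.sqrt(1 - math.cos(2 * math.pi * fm / f0))
    return cycle, cycle_to_cycle

def modulated_crossings(Vm, K0, f0, fm, n_periods):
    """Crossing times whose periods follow (13), sampled once per nominal period."""
    T0, amp = 1 / f0, Vm * K0 / f0 ** 2
    t = [0.0]
    for n in range(n_periods):
        t.append(t[-1] + T0 - amp * math.cos(2 * math.pi * fm * n * T0))
    return t

# Hypothetical numbers: 1 GHz oscillator, 100 MHz/V gain, 100 mV noise at 20 MHz.
params = dict(Vm=0.1, K0=1e8, f0=1e9, fm=2e7)
measured = jitter_from_crossings(modulated_crossings(n_periods=5000, **params))
predicted = quasi_static_fm_jitter(**params)
```

In this model the cycle jitter does not depend on the noise frequency, whereas the cycle-to-cycle jitter shrinks as $f_m$ becomes small compared with $f_0$, because consecutive periods are then disturbed almost identically.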
Fig. 6. Oscillation frequency of (a) the single-ended ring oscillator and (b) the differential ring oscillator as a function of static supply voltage.
Fig. 7. Cycle jitter and cycle-to-cycle jitter of (a) the SERO and (b) the DRO as a function of supply voltage noise frequency. The solid lines represent the quasi-static FM expressions.
Fig. 10. Illustration of the equivalence of supply and substrate noise for the DRO.
Fig. 12. Illustration of the relationship between power consumption and noise for (a) device electronic noise and (b) supply noise.
Fig. 14. Grounded shield used under the capacitor to block substrate noise.

In this section, we study jitter as a function of three parameters: transistor gate width, power dissipation, and the number of stages. To make meaningful comparisons, the circuit is modified in each case such that the frequency of oscillation remains constant. These parameters also affect the thermal jitter to some extent, but, considering the vastly different designs reported in [5] and [6], we note that this type of jitter still remains negligible.

A. Effect of Transistor Gate Width
The differential three-stage ring oscillator of Fig. 2 begins to oscillate for $W \approx 30$ μm.
Fig. 11 shows the effect of the gate width on the jitter, where the oscillation frequency is kept constant by adjusting $C_L$ in Fig. 2. The jitter reaches a minimum for $W \approx 80$ μm. For large $W$, the value of $C_L$ must be reduced so as to maintain the same oscillation frequency, yielding a larger voltage-dependent fraction due to drain and source junctions of each device and hence a higher sensitivity to noise.

By contrast, the effect of supply and substrate noise on the jitter of a given oscillator topology is relatively independent of the power drain. This can be understood with the aid of the conceptual illustrations in Fig. 12, where the output voltages of $N$ identical oscillators are added in phase. In Fig. 12(a), only the device electronic noise is considered [5]. Since the noise in each oscillator is uncorrelated, the output noise voltage is $\sqrt{N}$ times that of each oscillator, whereas the output signal voltage is $N \times V_j$. In Fig. 12(b), on the other hand, all oscillators are disturbed by the same noise source, thus exhibiting completely correlated noise. That is, both the noise voltage and the signal voltage are increased by a factor of $N$.
To confirm the above observation, the gate width and tail current were decreased while the load resistance was increased proportionally. Table I shows that the jitter is quite constant.

TABLE I
IMPACT OF POWER CONSUMPTION
TABLE II
THREE-STAGE VERSUS SIX-STAGE OSCILLATOR
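The scaling argument of Fig. 12 can be reduced to a one-line calculation: when $N$ identical oscillator outputs are added in phase, the signal voltage grows by $N$, while the noise voltage grows by $\sqrt{N}$ if the noise is uncorrelated and by $N$ if it is common to all oscillators. The sketch below simply evaluates the resulting change in signal-to-noise ratio; the function name is hypothetical.

```python
import math

def snr_gain_db(n_oscillators, correlated_noise):
    """SNR change (dB) when n identical oscillator outputs are added in phase."""
    signal_gain = n_oscillators
    # Common (e.g., supply) noise adds in voltage; independent device noise adds in power.
    noise_gain = n_oscillators if correlated_noise else math.sqrt(n_oscillators)
    return 20 * math.log10(signal_gain / noise_gain)

device_noise_gain = snr_gain_db(4, correlated_noise=False)
supply_noise_gain = snr_gain_db(4, correlated_noise=True)
```

With four oscillators the device-noise case improves by about 6 dB while the common-noise case does not improve at all, in line with the observation that jitter due to supply and substrate noise is largely independent of the power drain.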
IX. CONCLUSION
We have investigated the timing jitter in oscillators subject to
supply and substrate noise. For digital timing applications, the effect
of supply and substrate noise on the jitter is typically much more
pronounced than that of thermal noise. For supply and substrate noise,
we have derived analytical relationships between the cycle-to-cycle
jitter and the low-frequency sensitivity of the oscillation frequency
to supply or substrate noise. These relationships have been verified
by means of numerical calculations for single-ended and differential
CMOS ring oscillators. For differential ring oscillators, we have
investigated the dependence of the jitter on the transistor gate width,
power consumption, and the number of stages. As a special result,
we have found that in applications where the required oscillation
frequency is lower than the maximum speed of the technology, a
three-stage ring oscillator with additional load capacitances gives the
lowest jitter.
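For the thermal-noise case treated in the Appendix, the link between cycle-to-cycle jitter and phase noise can be checked with a few lines of arithmetic. The sketch below inverts (32) to obtain the diffusion constant $D$, evaluates the Lorentzian (25), and compares it with the far-offset form (36). It assumes white noise, so that consecutive period errors are uncorrelated and $\Delta T_{cc}^2 = 2\Delta T_c^2$ by (6); the function names and the numerical values are illustrative only.

```python
import math

def diffusion_constant(dT_cc, f0):
    """D from cycle-to-cycle jitter, inverting (32): dT_cc^2 = 8*pi*D / w0^3."""
    w0 = 2 * math.pi * f0
    return w0 ** 3 * dT_cc ** 2 / (8 * math.pi)

def ssb_phase_noise(offset_hz, D):
    """Lorentzian (25), evaluated at w - w0 = 2*pi*offset_hz."""
    dw = 2 * math.pi * offset_hz
    return 2 * D / (dw ** 2 + D ** 2)

def ssb_phase_noise_far(offset_hz, dT_c, f0):
    """Far-offset form (36): f0^3 * dT_c^2 / (f - f0)^2."""
    return f0 ** 3 * dT_c ** 2 / offset_hz ** 2

# Hypothetical example: 1 GHz oscillator with 1 ps rms cycle-to-cycle jitter.
f0, dT_cc, offset = 1e9, 1e-12, 1e6
D = diffusion_constant(dT_cc, f0)
s_lorentz = ssb_phase_noise(offset, D)
s_far = ssb_phase_noise_far(offset, dT_cc / math.sqrt(2), f0)
s_far_dbc = 10 * math.log10(s_far)
```

Both expressions give roughly −93 dBc/Hz at a 1 MHz offset for these numbers, and they agree closely because the offset is far larger than $D$.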
APPENDIX
JITTER AND PHASE NOISE DUE TO THERMAL AND SHOT NOISE
The output voltage of an oscillator can be written as

$V(t) = V_0\cos[\omega_0 t + \phi(t)]$ (18)

where $V_0$ is the amplitude, $\omega_0$ is the oscillation frequency, and $\phi(t)$ is the slowly varying excess phase. The excess frequency is

at $\phi(0)$ with the variance

$\sigma_\phi^2 = 2Dt.$ (22)

As evident from (22), the variance diverges with time. The autocorrelation of $V(t)$ is known [12] and reads

$\langle V(t + \tau)V(t)\rangle = \frac{V_0^2}{2}\exp(-D|\tau|)\cos(\omega_0\tau).$ (23)

Performing the Fourier transformation, we obtain the one-sided power spectral density

$S_V(\omega) = V_0^2\,\frac{D}{(\omega - \omega_0)^2 + D^2}.$ (24)

This quantity is often normalized to $V_0^2/2$ and referred to as relative phase noise with respect to the carrier [5] or as single-sideband phase noise [8], given by

$S_\phi(\omega) = \frac{2D}{(\omega - \omega_0)^2 + D^2}.$ (25)

For $\omega - \omega_0 \gg D$, we obtain from (25)

$S_\phi(\omega) \approx \frac{2D}{(\omega - \omega_0)^2}.$ (26)

Next, we will relate the cycle-to-cycle jitter and the single-sideband phase noise to each other. Note that the stationary Wiener process has no memory and the increments in different time intervals are statistically independent [11]. Therefore, the rms mean increment of the excess phase $\phi(t)$ within one cycle, i.e., the cycle jitter of the phase, equals the increment of $\phi(t)$ between $t = 0$ and $t = T$. Thus, from (22), we obtain the phase cycle jitter as

$\Delta\phi_c = \sqrt{2DT}.$ (27)

The excess phase change during the $n$th cycle is referred to as $\Delta\phi_n$. The $n$th oscillation period is defined by the relation

$2\pi f_0 T_n = 2\pi + \Delta\phi_n.$ (28)

For the deviation of the $n$th period $T_n$ from the mean period $T = 1/f_0$, we then find

$\Delta T_n = \frac{\Delta\phi_n}{2\pi f_0} = \Delta\phi_n\,\frac{T}{2\pi}.$ (29)

Hence, the cycle jitter $\Delta T_c$ of the period during one cycle is related to $\Delta\phi_c$ according to

$\Delta T_{cc}^2 = \frac{8\pi}{\omega_0^3}\,D$ (32)

with

$\omega_0 = \frac{2\pi}{T}.$ (33)

On the other hand, the phase noise can be expressed by the cycle-to-cycle jitter by inserting (32) in (25), yielding

$S_\phi = \frac{(\omega_0^3/4\pi)\,\Delta T_{cc}^2}{(\omega - \omega_0)^2 + (\omega_0^3/8\pi)^2\,\Delta T_{cc}^4}.$ (35)

A similar expression has been derived for ring oscillators in [10] and reads in our notation

$S_\phi = \frac{f_0^3\,\Delta T_c^2}{(f - f_0)^2}$ (36)

where $f_0 = \omega_0/2\pi$. Equation (36) turns out to be a special case of (35) for $\omega - \omega_0 \gg D$.
The absolute jitter increases proportionally to the square root of the measurement interval $\Delta t$ as evident from (22). Hence, the absolute phase jitter is

$\Delta\phi_{abs} = \sqrt{2D\,\Delta t} = \sqrt{2D}\,\sqrt{\Delta t}.$ (37)

Using (32), the proportionality constant can be related to the cycle-to-cycle jitter according to

$\sqrt{2D} = \sqrt{2}\,\pi f_0^{3/2}\,\Delta T_{cc}.$ (38)

[1] I. A. Young, J. K. Greason, and K. L. Wong, “A PLL clock generator with 5 to 110 MHz of lock range for microprocessors,” IEEE J. Solid-State Circuits, vol. 27, pp. 1599–1607, Nov. 1992.
[2] J. Alvarez, H. Sanchez, G. Gerosa, and R. Countryman, “A wide-bandwidth low-voltage PLL for PowerPC™ microprocessors,” IEEE J. Solid-State Circuits, vol. 30, pp. 383–391, Apr. 1995.
[3] R. Bhagwan and A. Rogers, “A 1 GHz dual-loop microprocessor PLL with instant frequency shifting,” in IEEE Proc. ISSCC, San Francisco, CA, Feb. 1997, pp. 336–337.
[4] D. B. Leeson, “A simple model of feedback oscillator noise spectrum,” in Proc. IEEE, pp. 329–330, Feb. 1966.
[5] B. Razavi, “A study of phase noise in CMOS oscillators,” IEEE J. Solid-State Circuits, vol. 31, pp. 331–343, Mar. 1996.
(ISCAS’94), London, U.K., June 1994, vol. 4, pp. 27–30.
[11] C. W. Gardiner, Handbook of Stochastic Methods. Berlin: Springer-Verlag, 1983.
[12] R. L. Stratonovich, Topics in the Theory of Random Noise. New York: Gordon and Breach, 1967.
Fig. 2. SNR degradation due to the phase noise and spurious level.

Because of the feedthrough and modulation of the reference signal, two spurious tones appear at the away from the desired output frequency, as shown in Fig. 2. The derivation of the spurious-tone specification is similar to that of the phase noise except that the channel bandwidth is not considered in this case. The spurious-tone specification can be expressed as follows:

SNR
dBc at 1.6 MHz
dBc for offset MHz. (4)

D. Switching Time
In GSM 900 systems, time-division multiple-access (TDMA) is adopted within each frequency channel. As shown in Fig. 3, each frequency channel is divided into eight time slots. Signals are received and transmitted in time slot #1 and slot #4,

III. DUAL-LOOP DESIGN
To reduce the switching time and the chip area of a synthesizer, a high loop bandwidth and a high reference frequency are desired. Moreover, to suppress the phase-noise contribution of the reference signal and reduce frequency-divider complexity, a lower frequency-division ratio is desirable. Therefore, a dual-loop frequency synthesizer is proposed [6]. As shown in Fig. 4, the dual-loop design consists of two reference signals and two phase-locked loops (PLLs) in cascade configuration. In the feedback path of the high-frequency loop, a mixer is adopted to provide the frequency shift. The output frequency of the synthesizer is expressed as follows:
(5)
where the parameters are the frequencies of the two reference signals and the frequency division ratios.
206 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 36, NO. 2, FEBRUARY 2001
Due to the dual-loop architecture, the comparison frequencies of the
low-frequency and high-frequency loops are scaled up from 200 kHz to
1.6 and 11.3 MHz, respectively. Therefore, the loop bandwidths of both
PLLs can be increased so that the switching time and the chip area can
be reduced. Compared to single-loop integer-N designs, the
frequency-division ratio of the programmable divider is reduced from
4326–4449 to 226–349. Such a reduction in the division ratio
significantly simplifies the frequency-divider design and reduces the
phase-noise contribution of the input reference signal.
In the proposed dual-loop synthesizer, the divide-by-32 divider and
the high-frequency loop together greatly attenuate the phase noise
and the spurious tones of the low-frequency loop. As such, the
low-frequency loop can be designed to have a larger loop bandwidth
and a loop filter as small as one-fifth of the loop filter in the
high-frequency loop. The low-frequency loop requires additional
components, including the phase-frequency detector (PFD1), the charge
pump (CP1), and the frequency divider, but they are all quite small
and have very little impact on the chip area. In addition, VCO1 is
implemented by a ring oscillator, which occupies a much smaller chip
area compared to VCO2. Altogether, the dual-loop design requires no
more than 25% overhead in the chip area compared to a fractional-N
design with the same loop bandwidth.
Although the input-reference frequency of the low-frequency loop is
scaled up by eight times, the required frequency range of the
oscillator VCO1 in the low-frequency loop is also scaled up, from 25
to 200 MHz. On the other hand, the phase noise of the ring oscillator
is attenuated by the frequency divider and is then amplified by the
high-frequency loop; the total phase-noise attenuation from VCO1
output to the synthesizer output is 18 dB. Consequently, this
voltage-controlled oscillator (VCO) requires a high operating
frequency (600 MHz), a wide frequency range (200 MHz), and a low
phase noise (−103 dBc/Hz at 600 kHz). A novel ring VCO design that
meets all of these tough specifications will be presented in the next
section.
IV. CIRCUIT IMPLEMENTATION
This section discusses the design considerations and circuit
implementation of the major building blocks that are unique and
critical to the proposed dual-loop synthesizer, namely the two VCOs,
the frequency dividers, the charge pump, and the loop filters.
Detailed analysis and design of other building blocks will not be
presented, either because they can be found elsewhere or because they
are straightforward.
A. Ring Oscillator VCO1
The schematic of the proposed two-stage ring oscillator and its delay
cell to meet the required specification as described in Section III
are shown in Fig. 5(a) and (b), respectively. The delay cell consists
of nMOS transistors as input transconductors, cross-coupled pMOS
transistors for maintaining oscillation, diode-connected pMOS
transistors, and a bias transistor for frequency tuning. The source
nodes of transistors are connected to supply to maximize its output
amplitude, which also helps suppress noise sources by turning them
off more often [7] and thus further enhances the phase-noise
performance.
The half circuit of the delay cell is shown in Fig. 5(c). By equating
the delay-cell voltage gain to be unity, the oscillating frequency of
the ring oscillator can be expressed as follows:
(6)
where is transconductance, is channel conductance, and is the total
capacitance at output node. Oscillation starts when the is large
enough to overcome the output load .
When control voltage V, transistors are turned on to cancel , and the
oscillator operates at maximum frequency. When control voltage ,
transistor are
YAN AND LUONG: MONOLITHIC CMOS DUAL-LOOP FREQUENCY SYNTHESIZER 207
Fig. 5. Circuit implementation of the ring oscillator VCO1. (a) Ring oscillator. (b) Delay cell. (c) Half circuit of delay cell.
(7)
TABLE I
SYSTEM DESIGN OF PROGRAMMABLE-FREQUENCY DIVIDER N
Fig. 10. Circuit implementation of the charge pump and the loop filter.
and pole time constant of the corresponding loop filters,
respectively. is the phase-detector gain, is the VCO gain, and , and
are the total capacitance, zero time constant, and pole time constant
of the corresponding loop filters, respectively. Since the transfer
function is a low-pass function, the reference phase noise is highly
attenuated at high offset frequency. It also shows that the close-in
phase noise of the reference signals is amplified by the
frequency-division ratio. In this work, the division ratio is reduced
from 4449 to 349, and the phase-noise contribution from the reference
signals is suppressed by dB.
Another important source of the close-in phase noise is the
charge-pump noise. The transfer functions between the charge-pump
noise current to the output phase noise can be derived to be
(9)
It shows that small frequency-division ratio and large phase-detector
gain or large charge-pump current are preferred for phase-noise
consideration. Another factor not included in (9) which also affects
the charge-pump noise is its turn-on time. In this proposed
synthesizer, the charge-pump turn-on time is designed to be equal to
1/10 of the input period so that it is long enough to eliminate the
phase-frequency detector (PFD) dead-zone problem, but at the same
time is short enough to minimize the charge-pump phase-noise
contribution.
The phase-noise contribution of the loop-filter resistors in both
PLLs can be estimated using their equivalent noise currents as
follows:
(10)
which are bandpass functions with peaks appearing between the zero
and the pole of the loop filter. To suppress the phase-noise peaking,
large loop-filter capacitors are desired at the cost of large chip
area.
For the phase-noise contribution of the VCOs, the transfer function
between the VCO phase noise and output phase noise can be found to be
(11)
which are high-pass functions. Therefore, the far-offset phase noise
of the synthesizer is dominated by the VCO phase noise. Since the
loop bandwidths of both PLLs are designed in a range of tens of
kilohertz to achieve spurious-level specification, the far-offset
phase noise of the PLLs only depends on the VCO phase noise itself.
To evaluate the overall phase-noise performance, the relationship
between the phase noise of the low-frequency loop and that of the
synthesizer output can be written as
(12)
which shows that there exists dB close-in phase-noise suppression for
the low-frequency loop.
The estimated phase noise of the whole synthesizer is −81.4 dBc/Hz at
20.9 kHz and −123.8 dBc/Hz at 600 kHz. The contribution of each
component is shown in Fig. 12, which shows that the close-in phase
noise (<100 kHz) is dominated by the charge pump CP1 and loop filter
LF1, while the far-offset phase noise (>100 kHz) is dominated by the
LC oscillator.
V. EXPERIMENTAL RESULTS
The dual-loop frequency synthesizer is implemented in a standard
0.5-µm CMOS technology. Linear capacitors are put under all the bias
pins to serve as on-chip bypass capacitors.
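The decibel figures in the phase-noise analysis above can be reproduced from the division ratios quoted in the text. The sketch below is a back-of-the-envelope check, not the authors' derivation: it assumes that close-in phase noise scales as 20·log10 of a frequency-division ratio, and it infers the gain of the high-frequency loop from the quoted 18-dB net attenuation, which the surviving text does not state directly. All names are illustrative.

```python
import math

def ratio_to_db(ratio):
    """Phase-noise scaling in dB for a frequency ratio (20*log10 rule)."""
    return 20.0 * math.log10(ratio)

# Reference phase-noise benefit of lowering the division ratio 4449 -> 349.
ref_suppression_db = ratio_to_db(4449 / 349)

# Attenuation of VCO1 phase noise by the divide-by-32 divider.
divider_attenuation_db = ratio_to_db(32)

# The text quotes 18 dB net attenuation from VCO1 to the output, so the
# high-frequency loop must give back the difference (inferred value).
quoted_net_attenuation_db = 18.0
implied_hf_gain_db = divider_attenuation_db - quoted_net_attenuation_db
implied_hf_ratio = 10 ** (implied_hf_gain_db / 20.0)

print(f"Reference-noise suppression: {ref_suppression_db:.1f} dB")
print(f"Divide-by-32 attenuation:    {divider_attenuation_db:.1f} dB")
print(f"Implied HF-loop gain:        {implied_hf_gain_db:.1f} dB "
      f"(ratio of about {implied_hf_ratio:.1f})")
```

The division-ratio reduction works out to about 22 dB, and the inferred multiplication in the high-frequency loop is close to a factor of four.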
B. Measurement of Varactors
The measurement results of the pn-junction varactor at
900 MHz are shown in Fig. 15. As the p+ diffusion of the
varactors used in the LC oscillator is connected to the output
of the LC oscillator core, the varactors are biased at 1.16 V, which
is the dc bias of the oscillator core during the measurement. The
measured capacitance is close to the estimated results in the
reverse-biased region. The series resistance is around 2 Ω
due to the minimum junction spacing and the nonminimum
junction width. The quality factor is around 30 in the operating
region of the oscillator.
Fig. 14. Measurement results and equivalent circuit model of the spiral inductors at 900 MHz.
Fig. 15. Measurement results and bias condition of the pn-junction varactors at 900 MHz.
the measured peak-to-flat close-in phase noise in Fig. 18 is around
15 dB, which is quite close to that of the estimated value in
Fig. 12. Lastly, it is observed experimentally that the close-in
phase noise is changed as the charge-pump current of the
high-frequency loop is adjusted. Unfortunately, the phase-noise
contribution of CP2 and LF2 cannot be measured individually.
F. Measured Spurious Tones of the Frequency Synthesizer
Fig. 19 shows the measured spurious levels of the dual-loop frequency
synthesizer at 865.2 MHz, which are −79.5 dBc at 1.6 MHz, −82.0 dBc
at 11.3 MHz, and −82.83 dBc at 16 MHz. At 11.3 MHz, the spurious
level is only 6 dB above the requirement. However, the predicted
spurious level at 1.6 MHz should be below −90 dBc and the one at
16 MHz should not exist [16].
H. Performance Evaluation
TABLE II
PERFORMANCE SUMMARY OF THE PROPOSED SYNTHESIZER
respectively. For the same reason, the total loop-filter capacitance
can be smaller than 60 pF, which greatly reduces chip area. However,
the situation would be much different if channel programmability is
included.
Although the proposed synthesizer consists of two loop filters, the
chip area is only slightly larger than that of the design in [3] due
to the use of linear capacitors and silicide-blocked resistors.
Compared to the designs in [1] and [3], the spurious levels are
between −75 and −85 dBc, which indicates that CMOS designs suffer the
same problem from the substrate coupling between the reference signal
and VCO. However, for the close-in phase noise, the proposed
dual-loop synthesizer suffers from the 15-dB increase at the peak due
to the charge pumps and loop filters as discussed in Section V-E.
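The power estimate for an optional third PLL, discussed as part of this performance evaluation, can be tallied in a few lines. The sketch below takes the quoted figures at face value (2.5 mW for the third VCO, 0.3 mW for the divide-by-128 divider, a 640-µA charge-pump current that is on for about 1/10 of the input period) and assumes, as a simplification not stated in the text, that the charge pump draws its average current from the 2-V supply. The variable names are hypothetical.

```python
# Third-PLL power budget from the figures quoted in the text.
SUPPLY_V = 2.0                 # supply voltage of the synthesizer
VCO3_POWER_W = 2.5e-3          # quoted estimate for the third VCO
DIVIDER_POWER_W = 0.3e-3       # quoted estimate for the divide-by-128 divider
CP_PEAK_CURRENT_A = 640e-6     # charge-pump current after the 100x increase
CP_DUTY = 1 / 10               # turn-on time relative to the input period

# Average charge-pump current over one input period.
cp_avg_current_a = CP_PEAK_CURRENT_A * CP_DUTY

# Assumption: the charge pump is powered from the 2-V supply.
cp_power_w = cp_avg_current_a * SUPPLY_V

total_power_w = VCO3_POWER_W + DIVIDER_POWER_W + cp_power_w

print(f"Average charge-pump current: {cp_avg_current_a * 1e6:.0f} uA")
print(f"Charge-pump power:           {cp_power_w * 1e3:.3f} mW")
print(f"Total additional power:      {total_power_w * 1e3:.3f} mW")
```

With these inputs the total stays just under 3 mW, in line with the bound stated in the text.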
TABLE III
PERFORMANCE COMPARISON OF RECENT WORK ON FULLY INTEGRATED FREQUENCY SYNTHESIZERS
In addition to the close-in phase noise, the far-offset phase-noise
requirement would also be more stringent by the same amount. Assuming
that VCO1 is adopted in the third PLL, its phase noise could be
improved to be −116 dBc/Hz at 600 kHz at a 204-MHz operation. With a
30.6-dB filtering effect of the high-frequency loop, the phase-noise
contribution by the VCO of the third PLL would become −146.6 dBc/Hz
at 600 kHz, which would have negligible effect on the overall phase
noise.
Basically, the implementation of the third PLL would be similar to
that of the low-frequency loop, and its chip area would be less than
10% of the total area of the dual-loop synthesizer. Since the third
VCO operates at half the frequency of VCO1, its power consumption
would be 25% of that of VCO1 (2.5 mW). Similarly, since the
divide-by-128 divider also operates at half of the frequency, it
would consume only half the power as compared to the divider
(0.3 mW). Although the charge-pump current should be increased by 100
times to 640 µA, the average current would be only 64 µA since the
turn-on time is only around 1/10 of the input period. In conclusion,
by introducing the third PLL to generate the second reference signal,
the additional power required would be less than 3 mW, and the
increase in the total chip area would still be less than 10%.
VI. CONCLUSION
A 900-MHz monolithic CMOS dual-loop frequency synthesizer with good
phase-noise performance for GSM receivers is presented. Compared to
other fully integrated synthesizer designs, this proposed synthesizer
operates at much lower supply voltage and consumes approximately the
same power with frequency normalization. Implemented in a standard
0.5-µm CMOS technology and at 2-V supply voltage, the synthesizer has
a power consumption of 34 mW. At 900 MHz, the measured phase noise is
−121.8 dBc/Hz at 600-kHz frequency offset, and the spurious level is
−82 dBc at 11.3 MHz. Due to the substrate coupling and testing setup,
additional spurious levels are measured to be −79.5 dBc at 1.6 MHz
and −82.8 dBc at 16 MHz. The chip area is less than 2.64 mm². Even if
a third PLL is implemented to generate the second reference
frequency, the increase in the total power consumption and the total
chip area would be negligibly small.
REFERENCES
[1] J. Craninckx and M. Steyaert, “A fully integrated CMOS DCS-1800 frequency synthesizer,” IEEE J. Solid-State Circuits, vol. 33, pp. 2054–2065, Dec. 1998.
[2] A. Ali and J. L. Tham, “A 900-MHz frequency synthesizer with integrated LC voltage-controlled oscillator,” Proc. IEEE Int. Solid-State Circuits Conf., vol. 1, pp. 390–391, 1996.
[3] J. F. Parker and D. Ray, “A 1.6-GHz CMOS PLL with on-chip loop filter,” IEEE J. Solid-State Circuits, vol. 33, pp. 337–343, Mar. 1998.
[4] “Digital cellular telecommunications system (Phase 2+); Radio transmission and reception (GSM 5.05),” European Telecommunications Standards Institute, 1996.
[5] J. Craninckx and M. Steyaert, Wireless CMOS Frequency Synthesizer Design. Norwell, MA: Kluwer, 1998, pp. 201–202.
[6] T. Aytur and J. Khoury, “Advantages of dual-loop frequency synthesizers for GSM applications,” in Proc. IEEE Int. Symp. Circuits and Systems, 1997.
[7] C. H. Park and B. Kim, “A low-noise 900-MHz VCO in 0.6-µm CMOS,” in Proc. Symp. VLSI Circuits, 1998.
[8] A. Hajimiri, S. Limotyrakis, and T. H. Lee, “Phase noise in multi-gigahertz CMOS ring oscillators,” in Proc. IEEE 1998 Custom Integrated Circuit Conf., 1998, pp. 49–52.
[9] “Oscillator noise analysis in SpectreRF, application note to SpectreRF,” CADENCE, 1998.
[10] R. B. Merrill, T. W. Lee, H. You, R. Rasmussen, and L. A. Moberly, “Optimization of high-Q integrated inductors for multilevel metal CMOS,” in Proc. Int. Electronic Device Meeting, 1995, pp. 983–986.
[11] A. Hajimiri and T. H. Lee, “A general theory of phase noise in electrical oscillators,” IEEE J. Solid-State Circuits, pp. 179–194, Feb. 1998.
[12] J. Yuan and C. Svensson, “High-speed CMOS circuit technique,” IEEE J. Solid-State Circuits, vol. 24, pp. 62–70, Feb. 1989.
[13] B. Razavi, RF Microelectronics. Englewood Cliffs, NJ: Prentice Hall, 1997.
[14] P. Larsson, “High-speed architecture for a programmable frequency divider and a dual-modulus prescaler,” IEEE J. Solid-State Circuits, vol. 31, pp. 744–748, May 1996.
[15] I. A. Young, J. K. Greason, and K. L. Wong, “PLL clock generator with 5 to 110 MHz of lock range for microprocessors,” IEEE J. Solid-State Circuits, vol. 27, pp. 1599–1607, Nov. 1992.
[16] W. S. T. Yan, “A 2-V 900-MHz monolithic CMOS dual-loop frequency synthesizer for GSM receivers,” M.Phil. thesis, Hong Kong University of Science and Technology, 1999. [Online]. Available: http://www.ee.ust.hk/~eetak
[17] T. Fredrich, “Direct phase noise measurements using a modern spectrum analyzer,” Microwave J., vol. 35, pp. 94–114, Aug. 1992.
William S. T. Yan received the Bachelor and Master’s degrees in
electrical and electronics engineering from the Hong Kong University
of Science and Technology, Hong Kong, in 1996 and 1999, respectively.
He is currently with Maxim Integrated Products, Sunnyvale, CA. His
current interests are in the areas of high-frequency
integrated-circuit design.
Howard C. Luong (M’91) received the B.S. (high honors), M.S., and
Ph.D. degrees in electrical engineering and computer sciences from
the University of California, Berkeley, in 1988, 1990, and 1994,
respectively. For his Master’s thesis, he worked on MOS analog
multipliers with scaling technologies. For his Ph.D. dissertation, he
designed and fabricated a superconductive flash-type
analog-to-digital converter that operated at multi-gigahertz clock
and input frequencies.
Since September 1994, he has been with the electrical and electronics
engineering faculty at the Hong Kong University of Science and
Technology, where he has been the Faculty-In-Charge of the Analog
Research Lab and the Associate Director of the EEE Undergraduate
Program Committee. His research interests are in high-performance
analog and RF integrated circuits for wireless and portable
communications.
Dr. Luong has served as an Associate Editor for IEEE TRANSACTIONS ON
CIRCUITS AND SYSTEMS II. He received the Faculty Teaching Excellence
Appreciation Award from the Hong Kong University of Science and
Technology School of Engineering in 1995, 1996, and 2000.
2048 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 32, NO. 12, DECEMBER 1997
I. INTRODUCTION
Fig. 1. Methods of frequency modulation upconversion: (a) mixer based, (b)
integration is lost.
Finally, approach (c) can be viewed as indirect modulation of the VCO
through appropriate control of a frequency synthesizer that sets the
VCO frequency and yields the simplest transmitter solution of those
presented. The synthesizer has a digital input which allows
elimination of the D/A converter that is required when directly
modulating the VCO. Since the synthesizer controls the VCO during
modulation, the problem of frequency drift during modulation is
eliminated. Also, isolation requirements at the VCO input are greatly
reduced at frequencies within the PLL bandwidth. The primary obstacle
faced with this architecture is that a severe constraint is placed on
the maximum achievable data rate due to the reliance on feedback
dynamics to perform modulation.
This paper presents a compensation method and key circuits that allow
modulation of a frequency synthesizer at rates that are over an order
of magnitude faster than its bandwidth. Application of the technique
allows a high data rate (>1 Mb/s) transmitter with good spectral
efficiency to be realized with only two components: a frequency
synthesizer and a digital transmit filter. By avoiding additional
components such as mixers and D/A converters in the modulation path,
a low power transmitter architecture is achieved. Since off-chip
filters are not required, high integration is accomplished as well.
The technique can be used in transmitter applications where frequency
modulation is desired, and a moderate tolerance is allowed on the
modulation index. (When using compensation, the accuracy of the
modulation index, which is defined as the ratio of the peak-to-peak
frequency deviation of the transmitter output to its data rate, is
limited by variations in the open-loop gain of the PLL [7].) To
provide proof of concept of the technique, we present results from a
1.8-GHz prototype that supports Gaussian frequency shift keying
(GFSK) modulation, the same modulation method used in DECT, at data
rates in excess of 2.5 Mb/s.
We begin by reviewing a fractional-N synthesizer method presented in
[8]–[11] that provides a convenient structure with which to apply the
technique. It is shown that high data rates and good noise
performance are difficult to achieve with this topology. A method is
proposed to overcome these problems, followed by discussion of issues
that ensue from its use. A description of key circuits in the
prototype is then given, which include an on-chip loop filter, a
64-modulus divider, and a pipelined, digital Σ–Δ modulator. Finally,
experimental results are presented and conclusions made.
II. BACKGROUND
The fractional-N approach to frequency synthesis enables fast
dynamics to be achieved within the phase-locked loop (PLL) by
allowing a high reference frequency [8]; a very useful benefit when
attempting to modulate the synthesizer. High resolution is achieved
with this approach by allowing noninteger divide values to be
realized through dithering; it has been shown that low spurious noise
can be obtained by using a high-order Σ–Δ modulator to perform this
operation [8], [10], [11]. This approach leads to a simple
synthesizer structure that is primarily digital in nature, and is
referred to as a fractional-N synthesizer with noise shaping.
Using this fractional-N approach, it is straightforward to realize a
transmitter that performs phase/frequency modulation in a continuous
manner by direct modulation of the synthesizer. Fig. 2 illustrates a
simple transmitter capable of Gaussian minimum shift keying (GMSK)
modulation [9]. The binary data stream is first convolved with a
digital finite impulse response (FIR) filter that has a Gaussian
shape. (Physical implementation of this filter can be accomplished
with a ROM whose address lines are controlled by consecutive samples
of the data and time information generated by a counter.) The digital
output of this filter is then summed with a nominal divide value and
fed into the input of a digital Σ–Δ converter, the output of which
controls the instantaneous divide value of the PLL. The nominal
divide value sets the carrier frequency, and variation of the divide
value causes the output frequency to be modulated according to the
input data. Assuming that the PLL dynamics have sufficiently high
bandwidth, the characteristics of the modulation waveform are
determined primarily by the digital FIR filter and thus accurately
set.
Fig. 3 depicts a linearized model of the synthesizer dynamics in the
frequency domain. The digital transmit filter confines the modulation
data to low frequencies, the Σ–Δ modulator adds quantization noise
that is shaped to high frequencies, and the PLL acts as a low-pass
filter that passes the input but attenuates the Σ–Δ quantization
noise. In the figure, is
calculated as
(1)
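The transmit path described in the background section (binary data, Gaussian FIR filter, summation with a nominal divide value) can be sketched in plain Python. This is a minimal illustration rather than the prototype's filter: the bandwidth-time product, oversampling ratio, filter span, and peak divide-value offset are all assumed values, and the nominal divide value of 90 is simply the 1.8-GHz carrier divided by the 20-MHz reference frequency mentioned in the measured results. The function names are hypothetical.

```python
import math

def gaussian_taps(bt=0.5, samples_per_bit=8, span_bits=4):
    """Sample a Gaussian pulse and normalize it to unit DC gain."""
    # Standard deviation of the pulse, in bit periods, for a Gaussian
    # filter with bandwidth-time product bt.
    sigma = math.sqrt(math.log(2.0)) / (2.0 * math.pi * bt)
    length = span_bits * samples_per_bit + 1
    center = (length - 1) / 2.0
    taps = [math.exp(-0.5 * ((k - center) / (samples_per_bit * sigma)) ** 2)
            for k in range(length)]
    total = sum(taps)
    return [t / total for t in taps]

def divide_value_stream(bits, nominal=90.0, peak_offset=0.0125,
                        bt=0.5, samples_per_bit=8):
    """Filter NRZ data with the Gaussian FIR and add the nominal divide value."""
    taps = gaussian_taps(bt, samples_per_bit)
    # Hold each bit for samples_per_bit samples, mapped to +1 / -1.
    nrz = [1.0 if b else -1.0 for b in bits for _ in range(samples_per_bit)]
    stream = []
    for n in range(len(nrz)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:  # causal FIR: ignore samples before the start
                acc += h * nrz[n - k]
        stream.append(nominal + peak_offset * acc)
    return stream

if __name__ == "__main__":
    values = divide_value_stream([1, 0, 1, 1, 0, 0, 1, 0])
    print(f"min divide value: {min(values):.4f}")
    print(f"max divide value: {max(values):.4f}")
```

In the architecture described above, a stream like this would then drive the digital Σ–Δ modulator, whose output selects the instantaneous integer divide value.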
TABLE I
THEORETICALLY ACHIEVABLE DATA RATES
USING COMPENSATION FOR SECOND-ORDER PLL
(6)
A. Achievable Data Rates
Equation (6) reveals that the signal swing of increases in proportion
to for large values of
Since is the ratio of the modulation data rate to the bandwidth of
the PLL, we see that high data rates lead to large signal swings of
the modulation signal when using compensation. Intuitively, this
behavior makes sense since the attenuation of must be overcome by the
compensated
B. Matching Issues
In practice, mismatch will occur between the compensation filter and
PLL dynamics. While the compensation filter is digital and therefore
fixed, the PLL dynamics are analog in nature and sensitive to process
and temperature variations. Fig. 6 illustrates that a parasitic
pole/zero pair occurs when the bandwidth of the PLL is too high; a
similar situation occurs when its bandwidth is too low. As will be
seen in the results sections, the parasitic pole/zero pair causes
intersymbol interference (ISI) and modulation deviation error. To
mitigate this problem in the prototype, an on-chip loop filter with
accurate time constants was implemented, and open-loop gain
control was used to accurately place the overall pole and zero
positions of the PLL transfer function.
An additional issue related to mismatch arises from practical
concerns in the PLL implementation. The achievement
of a large dynamic range in the charge pump is aided by
including an integrator in the loop filter (see Section IV-B2),
which yields an overall PLL transfer function as
(7)
IV. IMPLEMENTATION
To show proof of concept of the proposed compensation method, the
system depicted in Fig. 7 was built using a custom CMOS fractional-N
synthesizer that contains several key circuits. Included are an
on-chip, continuous-time filter that requires no tuning or external
components, a digital MASH Σ–Δ modulator with six output bits that
achieves low power operation through pipelining, and a 64-modulus
divider that supports any divide value between 32 and 63.5 in half
cycle increments. An external divide-by-two prescaler is used so that
the CMOS divider input operates at half the VCO frequency, which
modifies the range of divide values to include all integers between
64 and 127.
Fig. 8 displays a die photograph of the custom IC, which was
fabricated in a 0.6-µm, double-poly, double-metal, CMOS process with
threshold voltages of V and V. The entire die is 3 mm by 3 mm, and
its power dissipation is 27 mW. Table II lists the power consumed by
individual circuits. The power supply values given in Table II were
chosen to be as low as possible to minimize power dissipation; at the
cost of higher power dissipation, all circuits could be powered by a
single 3.3-V supply.
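The divide-value range described for the 64-modulus divider can be enumerated directly. The sketch below lists every value from 32 to 63.5 in half-cycle steps and then doubles each one to account for the external divide-by-two prescaler; it only restates the arithmetic in the text, and the function name is illustrative.

```python
from fractions import Fraction

def core_divide_values(lowest=Fraction(32), highest=Fraction(127, 2),
                       step=Fraction(1, 2)):
    """Enumerate the divide values of the CMOS divider in half-cycle steps."""
    values = []
    value = lowest
    while value <= highest:
        values.append(value)
        value += step
    return values

core = core_divide_values()

# The divide-by-two prescaler halves the divider's input frequency, so the
# effective division ratio seen by the VCO is twice the core value.
effective = [int(2 * v) for v in core]

print(f"number of moduli: {len(core)}")
print(f"core range: {float(core[0])} to {float(core[-1])}")
print(f"effective range: {effective[0]} to {effective[-1]}")
```

The enumeration yields 64 moduli, and the effective ratios are the consecutive integers from 64 to 127.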
PERROTT et al.: CMOS FRACTIONAL- SYNTHESIZER USING DIGITAL COMPENSATION 2053
This assumption is reasonable while the modulation signal is applied;
we have found that setting the least significant bit (LSB) of the
modulator high also helps to achieve this condition by forcing the
internal states of the MASH structure to constantly change.
A fact that does not appear to have been appreciated in the
literature is that the digital MASH Σ–Δ structure is highly amenable
to pipelining. This is a useful technique when seeking a low power
implementation since it allows the supply voltage to be reduced by
virtue of the fact that the required throughput can be achieved with
lower circuit speed.
To pipeline the MASH structure, we apply a well-known technique that
has been used for adders and accumulators [17], [18]. Fig. 16
illustrates a 3-b example. Since the critical path in these
structures is their carry chain, registers are inserted in this path.
To achieve time alignment between the input and the delayed carry
information, registers are also used to skew the input bits. As
indicated in the figure, we refer to this operation as “pipe
shifting” the input. The adder output is realigned in time by
performing an “align shift” of its bits as shown. (Note that shading
is applied to the adder block in Fig. 16 as a reminder that its bits
are skewed in time.) The same pipelining approach can be applied to
digital accumulators since there is no feedback from higher to lower
bits.
Since its basic building blocks are adders and accumulators, a MASH
Σ–Δ modulator of any order can be pipelined using this technique.
Using the symbols introduced in the previous two figures, Fig. 17
depicts a pipelined, second-order MASH topology. Each first-order Σ–Δ
is realized as a pipelined accumulator with feedback removed from the
most significant bits in its output. The output of the second stage
is fed into the filter, which is implemented with two pipelined
adders and a delay element. A delay is inserted between these two
adders in order to pipeline their sum path, which requires a matching
delay in the path above for time alignment. A delay must also be
included in the output path of the first Σ–Δ stage to compensate for
the time delay incurred through the second stage. Since a signal once
placed in the “pipe shifted domain” can be sent through any number of
cascaded, pipelined adders and/or integrators, only one pipe shift
and align shift are needed in the entire structure.
Fig. 18 illustrates the implementation of the overall digital path
using pipelining. To save area, the circuits were pipelined every two
bits as opposed to one, and pipe shifting was not applied to the
carrier frequency signal since it is constant during modulation. To
achieve flexibility, the compensated digital transmit filter was
implemented in software, as opposed to a ROM, and the resulting
digital data stream fed into the custom CMOS IC.
V. MEASURED PERFORMANCE
The primary performance criteria by which a transmitter is judged are
its accuracy in modulation and its noise performance. We now describe
the characterization of the prototype in relation to these issues.
Fig. 19 shows measured eye diagrams from the prototype using an HP
89441A modulation analyzer. To illustrate the impact of mismatch
between the compensation filter and PLL dynamics, measurements were
taken under three different values of open-loop gain. These results
indicate that the modulation performance of the transmitter is quite
good even when the open-loop gain is in error by 25%; the effects of
TABLE III
VALUES OF NOISE SOURCES WITHIN PLL
Fig. 20. Expanded view of PLL system.
this gain error are to produce a moderate amount of ISI and an error
in the modulation deviation.
An explanation of the observed ISI and deviation error is given in
[7]. In brief, the resulting mismatch creates a parasitic pole/zero
pair that occurs near the cutoff frequency of the PLL (84 kHz in this
case); the resulting transfer function seen by the data can be viewed
as the sum of a low-pass and an all-pass filter. ISI is introduced as
data excites the impulse response of the low-pass filter, and
modulation deviation error occurs since the magnitude of the all-pass
is changed according to the amount of mismatch present.
Fig. 20 displays the dominant sources of noise in the prototype;
their values are displayed in Table III. Many of these values were
obtained through ac simulation of the relevant circuits in HSPICE.
Note that all noise sources other than are assumed to be white, so
that the values of their variance suffice for their description. This
assumption is only approximate for the VCO noise in the prototype, as
will be seen in the measured data.
Based on measurements, the input referred noise of the VCO was
calculated in the table from the expression
dBc/Hz at MHz
(10)
(11)
where k is Boltzmann’s constant, and T is temperature in degrees
Kelvin.
Assuming that each of the noise sources in Fig. 20 are independent of
each other, we can express the overall phase noise spectral density
at the transmitter output as
(12)
where and are the noise contributions from the dominant voltage,
current, and quantization noise sources. Based on the values in
Table III and the model in Fig. 20, we obtain
(13)
where
In the case of the division of by two is an approximation based on
the fact that the dominant charge pump noise source is switched in
and out at each opamp input with a nominal duty cycle of 50%. Note
that is given by (2).
A plot of the spectra in (13) is shown in Fig. 21(a). Computation of
these spectra assumed the parameter values
Fig. 21. Noise spectra of synthesizer: (a) calculated: (1) charge pump induced, S_Φ(f); (2) VCO and opamp induced, S_Φ(f); (3) Σ–Δ induced, S_Φ(f); (4) overall, S_Φ(f); and (b) measured synthesizer and open-loop VCO noise.
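The Σ–Δ-induced noise in curve (3) comes from dithering the divide value, and the basic mechanism can be illustrated with a behavioral model of a second-order MASH modulator built from two accumulators, like the topology described in the implementation section. The sketch below is the generic textbook MASH 1-1 structure, not the pipelined circuit of the prototype: it ignores pipeline delays, and the accumulator width and input word are arbitrary illustrative choices.

```python
def mash_1_1(input_word, modulus, cycles):
    """Behavioral second-order MASH: two cascaded accumulators.

    Each stage is an accumulator whose carry-out is its 1-bit output.
    The second stage accumulates the residue of the first, and its carry
    is differenced (1 - z^-1) before being added to the first carry.
    """
    state1 = 0
    state2 = 0
    carry2_prev = 0
    outputs = []
    for _ in range(cycles):
        state1 += input_word
        carry1 = state1 // modulus      # 0 or 1 while input_word < modulus
        state1 %= modulus

        state2 += state1                # second stage sees the first residue
        carry2 = state2 // modulus
        state2 %= modulus

        outputs.append(carry1 + carry2 - carry2_prev)
        carry2_prev = carry2
    return outputs

if __name__ == "__main__":
    modulus = 1024                      # hypothetical 10-bit accumulators
    word = 333                          # requested fraction = 333 / 1024
    out = mash_1_1(word, modulus, 4 * modulus)
    print(f"output levels used: {sorted(set(out))}")
    print(f"average output: {sum(out) / len(out):.6f}")
    print(f"requested fraction: {word / modulus:.6f}")
```

The output sequence hops among a few integer levels, yet its long-run average tracks the requested fraction, which is how a dithered integer divider realizes a noninteger divide value while pushing the quantization error to high frequencies.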
listed in Fig. 20 and Table III, and described by (7) with
kHz kHz
kHz
As seen in this diagram, the noise from the charge pump dominates at
low frequencies, and the influence of the Σ–Δ quantization noise
dominates at high frequencies.
Fig. 21(b) shows measured noise results from the transmitter
prototype taken with an HP 3048A phase noise measurement system. (The
spurious content of the Σ–Δ modulator was reduced to negligible
levels by feeding a binary data stream into the LSB of the modulation
path so that the internal states of the Σ–Δ were randomized; the
binary data stream was designed to have relatively flat spectral
characteristics and negligible levels of spurious energy at
frequencies greater than 10 kHz.) The resulting spectrum compares
quite well with the calculated curve in Fig. 21(a), especially at
high frequency offsets close to 5 MHz. At lower frequencies in the
range of 100 kHz, the measured noise is within about 3 dB
of the predicted value; the higher discrepancy in this region might
be attributed to the fact that was calculated without considering the
offset or transient response of the charge pump and/or the possible
inaccuracy of the HSPICE device models at low currents. Note that the
spur at 20-MHz offset (the reference frequency), which is due to the
50% nominal duty cycle of the PFD, is less than −60 dBc.
Fig. 21(b) demonstrates that the unmodulated transmitter has an
output spectrum of −132 dBc/Hz at 5-MHz offset from the carrier. At
this frequency offset, simulations reveal that the output spectrum of
the modulated transmitter is equal to when its data rate is close to
the DECT rate of 1 Mb/s [7]. This being the case, the transmitter
satisfies the DECT noise specification of −131 dBc/Hz at 5-MHz
offset; eye diagrams for data rates close to 1 Mb/s are found in [7].
VI. CONCLUSION
A digital compensation method and key circuits were presented that
allow modulation of a frequency synthesizer at rates over an order of
magnitude faster than its bandwidth. Using this technique, a
transmitter prototype was built that achieves 2.5-Mb/s data rate
modulation using GFSK modulation at a carrier frequency of 1.8 GHz.
Measured results indicate that the architecture can achieve the
modulation and noise performance required by the DECT standard with a
structure that is highly integrated and has low power dissipation. In
particular, the mostly digital design requires no off-chip filters,
no mixers, and no D/A converters in the modulation path. Further, the
structure contains only the core components required of a narrowband,
spectrally efficient transmitter: a frequency synthesizer and a
digital transmit filter.
[8] T. A. Riley, M. A. Copeland, and T. A. Kwasniewski, “Delta-sigma modulation in fractional-N frequency synthesis,” IEEE J. Solid-State Circuits, vol. 28, pp. 553–559, May 1993.
[9] T. A. Riley and M. A. Copeland, “A simplified continuous phase modulator technique,” IEEE Trans. Circuits Syst. II, vol. 41, pp. 321–328, May 1994.
[10] B. Miller and B. Conley, “A multiple modulator fractional divider,” in Proc. 44th Annual Symp. on Frequency Control, May 1990, pp. 559–567.
[11] B. Miller and B. Conley, “A multiple modulator fractional divider,” IEEE Trans. Instrum. Meas., vol. 40, pp. 578–583, June 1991.
[12] J. Candy and G. Temes, Oversampling Delta-Sigma Data Converters. New York: IEEE Press, 1992.
[13] Y. Tsividis and J. Voorman, Integrated Continuous-Time Filters. New York: IEEE Press, 1993.
[14] T. Kamoto, N. Adachi, and K. Yamashita, “High-speed multi-modulus prescaler IC,” in 1995 Fourth IEEE Int. Conf. Universal Personal Communications. Record. Gateway to the 21st Century, 1995, pp. 991, 325-8.
[15] J. Craninckx and M. S. Steyaert, “A 1.75-GHz/3-V dual-modulus divide-by-128/129 prescaler in 0.7-µm CMOS,” IEEE J. Solid-State Circuits, vol. 31, pp. 890–897, July 1996.
[16] M. Thamsirianunt and T. A. Kwasniewski, “A 1.2-µm CMOS implementation of a low-power 900-MHz mobile radio frequency synthesizer,” in IEEE Custom IC Conf., 1994, pp. 16.2/1-4.
[17] S.-J. Jou, C.-Y. Chen, E.-C. Yang, and C.-C. Su, “A pipelined multiplier-accumulator using a high-speed, low-power static and dynamic full adder design,” IEEE J. Solid-State Circuits, vol. 32, pp. 114–118, Jan. 1997.
[18] F. Lu and H. Samueli, “A 200-MHz CMOS pipelined multiplier-accumulator using a quasidomino dynamic full-adder cell design,” IEEE J. Solid-State Circuits, vol. 28, pp. 123–132, Feb. 1993.
Michael H. Perrott (S’97) was born in Austin, TX, in 1967. He
received the B.S.E.E. degree from New Mexico State University, Las
Cruces, in 1988, and the M.S. and Ph.D. degrees in electrical
engineering and computer science from Massachusetts Institute of
Technology, Cambridge, in 1992 and 1997, respectively.
He currently works at Hewlett-Packard Laboratories, Palo Alto, CA.
His interests include signal processing and circuit design applied to
communication systems.
ACKNOWLEDGMENT
The authors thank G. Dawe and J. Mourant for guidance in RF issues,
A. Chandrakasan for discussion on low power methods, R. Weiner for
bonding the die, B. Broughton for aid
in phase noise measurements, and M. Trott, P. Ferguson, P.
Katzin, Z. Zvonar, and D. Fague for advice.
REFERENCES
Theodore L. Tewksbury III (S’86–M’87) received
[1] P. Gray and R. Meyer, “Future directions in silicon IC’s for RF personal the S.B. degree in architecture in 1983 and the
communications,” in IEEE Custom IC Conf., 1995, pp. 83–90. M.S. and Ph.D. degrees in electrical engineering and
[2] T. Stetzler, I. Post, J. Havens, and M. Koyama, “A 2.7–4.5 V single-chip computer science in 1987 and 1992, respectively,
GSM transceiver RF integrated circuit,” in Proc. IEEE Int. Solid-State all from the Massachusetts Institute of Technology,
Circuits Conf., Feb. 1995, pp. 150–151. Cambridge. His doctoral dissertation consisted of
[3] J. Min, A. Rofougaran, H. Samueli, and A. A. Abidi, “An all-CMOS ar- an experimental and theoretical investigation of the
chitecture for a low-power frequency-hopped 900 MHz spread spectrum effects of oxide traps on the large-signal transient
transceiver,” in IEEE Custom IC Conf., 1994, pp. 16.1/1-4. performance of analog MOS circuits.
[4] S. Sheng, L. Lynn, J. Peroulas, K. Stone, I. O’Donnell, and R. Brodersen, He joined Analog Devices, Inc., in 1987 as De-
“A low-power CMOS chipset for spread-spectrum communications,” in sign Engineer for the Converter Group, where he
Proc. IEEE Int. Solid-State Circuits Conf., Feb. 1996, pp. 346–347. worked on high-speed, high-resolution data acquisition circuits for video,
[5] S. Heinen, S. Beyer, and J. Fenk, “A 3.0 V 2 GHz transmitter IC for instrumentation, and medical applications. From 1992 to 1994, as Senior Char-
digital radio communication with integrated VCO’s,” in Proc. IEEE Int. acterization Engineer, he was involved in the development of high-accuracy
Solid-State Circuits Conf., Feb. 1995, pp. 150–151. analog models for advanced bipolar, BiCMOS, and CMOS processes, with
[6] S. Heinen, K. Hadjizada, U. Matter, W. Geppert, V. Thomas, S. Weber, emphasis on the statistical modeling of manufacturing variations. In December
S. Beyer, J. Fenk, and E. Matschke, “A 2.7 V 2.5 GHz bipolar chipset for 1994, he joined the newly formed Communications Division at Analog
digital wireless communication,” in Proc. IEEE Int. Solid-State Circuits Devices as RF Design Engineer. He is presently involved in the design of
Conf., Feb. 1997, pp. 306–307. RF integrated circuits for wireless communications, including GSM, DECT,
[7] M. H. Perrott, “Techniques for high data rate modulation and low power and DBS. He is also actively involved in the development and modeling of
operation of fractional-N frequency synthesizers,” Ph.D. dissertation, advanced semiconductor technologies for RF applications, including ADRF
MIT, 1997. (Analog Devices bipolar RF process) and silicon germanium.
2060 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 32, NO. 12, DECEMBER 1997
TABLE I
POWER CONSUMPTION OF FULLY INTEGRATED
WIRELESS RECEIVERS
D. Charge Pump
Fig. 6 shows the circuit diagram of the charge pump and loop filter. The charge pump has a differential architecture. However, only a single output node drives the loop filter. To prevent the other output node from drifting to the rails when neither of the up and down signals (U and D) is active, the unity gain buffer shown in Fig. 6 is placed between the two output nodes. This buffer keeps the two output nodes at the same potential and thus reduces the charge pump offset. The power of the spurious sidebands in the synthesized output signal is thereby reduced. In this charge pump the current sources are always on and the PMOS and NMOS switches are used to steer the current from one branch of the charge pump to the other.

E. Loop Filter
Resistor $R_1$ and capacitor $C_1$ in the loop filter (Fig. 6) generate a pole at the origin and a zero at $1/(R_1 C_1)$. Capacitor $C_2$ and the combination of $R_3$ and $C_3$ are used to add extra poles at frequencies higher than the PLL bandwidth to reduce reference feedthrough and decrease the spurious sidebands at harmonics of the reference frequency. The thermal noise of $R_1$ and $R_3$, although filtered by the loop, directly modulates the VCO control voltage and can cause substantial phase noise in the VCO if the resistors are not sized properly. The capacitors and resistors of the loop filter should be properly chosen to perform the required filtering function and maintain the stability of the loop without introducing too much noise. Fig. 7 shows a linearized phase-locked loop model. In a third-order loop, the loop filter contains only $R_1$, $C_1$, and $C_2$, and its impedance can be written as

$$Z(s) = \frac{b-1}{b}\,\frac{s\tau + 1}{s C_1 \left(s\tau/b + 1\right)} \qquad (2)$$

where $\tau = R_1 C_1$ and $b = 1 + C_1/C_2$. The open loop transfer function of the third-order PLL is

$$GH(s) = \frac{K_o I_p}{2\pi M}\,\frac{b-1}{b}\,\frac{s\tau + 1}{s^2 C_1 \left(s\tau/b + 1\right)} \qquad (3)$$

where $K_o$ is the VCO gain constant, $I_p$ is the charge pump current, and $M$ is the division ratio of the feedback path. The phase margin of the loop is

$$PM = \tan^{-1}(\tau\omega_c) - \tan^{-1}\!\left(\frac{\tau\omega_c}{b}\right) \qquad (4)$$

where $\omega_c$ is the crossover frequency. By differentiating (4) with respect to $\omega_c$ it can be shown that the maximum phase margin is achieved at

$$\omega_c = \frac{\sqrt{b}}{\tau} \qquad (5)$$

and the maximum phase margin is

$$PM_{max} = \tan^{-1}\!\left(\sqrt{b}\right) - \tan^{-1}\!\left(\frac{1}{\sqrt{b}}\right) \qquad (6)$$

Notice that the maximum phase margin is only a function of $b$ (set by the ratio of $C_1$ and $C_2$), and for $C_1/C_2$ less than 1 the phase margin is less than 20°, which makes the loop practically unstable.
To complete our loop analysis we force $\sqrt{b}/\tau$ to be the crossover frequency of the loop and get

$$\frac{K_o I_p}{2\pi M}\,\frac{(b-1)\,\tau^2}{b^{3/2}\,C_1} = 1 \qquad (7)$$

Now we can define a loop filter design recipe as follows.
1) Find $K_o$ from the VCO simulation.
2) Choose a desired phase margin and find $b$ from (6).
3) Choose the loop bandwidth and find $\tau$ from (5).
4) Select $C_1$ and $I_p$ such that they satisfy (7).
5) Calculate the noise contribution of $R_1$. If the calculated noise is negligible the design is complete, otherwise go back to step four and increase $C_1$.

The same loop analysis can be repeated for a fourth-order loop. In this case the phase margin is

(8)

Fig. 7. Linearized PLL model.
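The five-step loop filter recipe above can be sketched numerically. The following is a minimal illustration, not the authors' design flow: it assumes the standard third-order charge-pump PLL relations ($\tau = R_1 C_1$, $b = 1 + C_1/C_2$, loop gain $K_o I_p Z(s) / (2\pi M s)$), and every numeric value in the example (VCO gain, charge pump current, division ratio, bandwidth) is hypothetical rather than taken from the paper.

```python
import math


def design_loop_filter(k_o_hz_per_v, i_p_a, m_div, phase_margin_deg, f_c_hz):
    """Size R1, C1, C2 of a third-order charge-pump PLL (assumed textbook model)."""
    pm = math.radians(phase_margin_deg)
    # Step 2: invert PM = atan(sqrt(b)) - atan(1/sqrt(b)),
    # which gives sqrt(b) = tan(PM) + sec(PM).
    sqrt_b = math.tan(pm) + 1.0 / math.cos(pm)
    b = sqrt_b ** 2
    # Step 3: place the crossover at the maximum-phase-margin frequency sqrt(b)/tau.
    w_c = 2.0 * math.pi * f_c_hz
    tau = sqrt_b / w_c
    # Step 4: unity loop gain at w_c fixes C1 once the charge-pump current is chosen.
    k_o = 2.0 * math.pi * k_o_hz_per_v  # VCO gain converted to rad/s/V
    c1 = (i_p_a * k_o / (2.0 * math.pi * m_div)) * (b - 1.0) / (sqrt_b * w_c ** 2)
    return {"b": b, "tau": tau, "R1": tau / c1, "C1": c1, "C2": c1 / (b - 1.0)}


def open_loop_gain(f_hz, k_o_hz_per_v, i_p_a, m_div, r1, c1, c2):
    """Complex open-loop gain of the same model, used to check the design."""
    s = 2j * math.pi * f_hz
    z = (1 + s * r1 * c1) / (s * (c1 + c2) * (1 + s * r1 * c1 * c2 / (c1 + c2)))
    return (i_p_a / (2.0 * math.pi)) * z * (2.0 * math.pi * k_o_hz_per_v / s) / m_div


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to exercise the recipe.
    d = design_loop_filter(500e6, 50e-6, 440, 50.0, 250e3)
    print({k: f"{v:.4g}" for k, v in d.items()})
```

Step 5 of the recipe (checking the thermal noise of the resistor) is deliberately left out; in this sketch, raising the charge pump current raises the computed capacitor and lowers the resistor for the same bandwidth and phase margin.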
784 IEEE JOURNAL ON SOLID-STATE CIRCUITS, VOL. 35, NO. 5, MAY 2000
RATEGH et al.: CMOS FREQUENCY SYNTHESIZER 785
Fig. 10. ILFD locking range and power consumption as a function of incident amplitude.
TABLE II
ILFD PERFORMANCE SUMMARY
TABLE III
MEASURED SYNTHESIZER PERFORMANCE

output phase noise. The ILFD phase noise measurements for offset frequencies higher than 200 kHz are not accurate due to the dominance of noise from the external amplifier.

The spurious tones at 11-MHz offset frequency from the center frequency are more than 45 dB below the carrier. The spurs at the 22-MHz offset frequency are at −54 dBc. Since the LO spacing is twice the reference frequency, the spurs at 11-MHz offset frequency fall at the edge of each channel and are less critical than the 22-MHz spurs, which are located at the center of adjacent channels. With the −54 dBc spurs at 22 MHz offset frequency, an undesired adjacent channel may be 44 dB stronger than the desired channel for a minimum 10 dB signal-to-interference ratio.

Phase noise measurements of the complete synthesizer output signal are shown in Fig. 12. The phase noise at small offset frequencies is mainly determined by the phase noise of the reference signal. The phase noise measured at offset frequencies beyond the PLL bandwidth is the inherent VCO phase noise. The phase noise at 1-MHz offset frequency is measured to be −101 dBc/Hz. The phase noise at 22 MHz offset frequency is extrapolated to be −127.5 dBc/Hz. Therefore the signal in the adjacent channel can be 43 dB stronger than that of the desired channel for a 10 dB signal-to-interference ratio.

VI. CONCLUSION
In this work we demonstrate the design of a fully integrated, 5-GHz CMOS frequency synthesizer designed for a U-NII band WLAN system. The tracking injection-locked frequency divider used as the first divider in the PLL feedback loop reduces the power consumption considerably without limiting the performance of the PLL. Table II summarizes the performance of the ILFD. The power consumption of two flip-flop based frequency dividers at 5 GHz is also listed for comparison purposes. In a 0.24-μm CMOS technology a simulated SCL flip-flop based frequency divider loaded with the same capacitance as in the ILFD consumes almost an order of magnitude more power than the ILFD with a 600-MHz locking range. The measurement results of a fast flip-flop based divider in an advanced 0.1-μm CMOS technology show a power consumption of 2.6 mW at 5 GHz [8], which is more than four times the power of the ILFD with a 600 MHz locking range.

Table III summarizes the performance of the synthesizer. The spurious sidebands at offset frequencies of twice the reference signal are more than 54 dB below the carrier. The spurs are mainly due to charge injection from the U and D signals to the loop, and can be reduced significantly by using a cascode structure for transistors M1–M4 (Fig. 6). Better matching between the up and down current sources also improves the sideband spurs. Of the 25-mW total power consumption, less than 3.8 mW is consumed by the VCO and ILFD combined. This low power consumption is achieved by the optimized design of the spiral inductors in the VCO and ILFD. The prescaler operates at 2.5 GHz and consumes 19 mW, of which about 40% is consumed in the first 2/3 dual modulus divider. Therefore the ILFD, which takes advantage of narrowband resonators, consumes an order of magnitude less power than the first 2/3 dual modulus divider, while operating at twice the frequency.

ACKNOWLEDGMENT
The authors would like to thank Dr. M. Hershenson, Dr. S. Mohan, and T. Soorapanth for their valuable technical discussions and help. They also thank National Semiconductor for fabricating the chip.

REFERENCES
[1] T. S. Aytur and B. Razavi, “A 2-GHz, 6 mW BiCMOS frequency synthesizer,” IEEE J. Solid-State Circuits, vol. 30, pp. 1457–1462, Dec. 1995.
[2] J. Craninckx and M. Steyaert, “A fully integrated CMOS DCS-1800 frequency synthesizer,” in ISSCC Dig., 1998, pp. 372–373.
[3] M. Hershenson, S. S. Mohan, S. P. Boyd, and T. H. Lee, “Optimization of inductor circuits via geometric programming,” in Design Automation Conf. Dig., June 1999, pp. 994–998.
[4] M. H. Perrott, T. L. Tewksbury, and C. G. Sodini, “A 27-mW CMOS fractional-N synthesizer using digital compensation for 2.5-Mb/s GFSK modulation,” IEEE J. Solid-State Circuits, vol. 32, pp. 2048–2059, Dec. 1997.
[5] A. S. Porret, T. Melly, and C. C. Enz, “Design of high-Q varactors for low-power wireless applications using a standard CMOS process,” in Custom Integrated Circuits Conf. Dig., May 1999, pp. 641–644.
[6] H. R. Rategh and T. H. Lee, “Superharmonic injection-locked frequency dividers,” IEEE J. Solid-State Circuits, vol. 34, pp. 813–821, June 1999.
[7] H. R. Rategh, H. Samavati, and T. H. Lee, “A 5GHz, 1mW CMOS voltage controlled differential injection-locked frequency divider,” in Custom Integrated Circuits Conf. Dig., May 1999, pp. 517–520.
[8] B. Razavi, K. F. Lee, and R. H. Yan, “Design of high-speed, low-power frequency dividers and phase-locked loops in deep submicron CMOS,” IEEE J. Solid-State Circuits, vol. 30, pp. 101–109, Feb. 1995.
[9] H. Samavati, H. R. Rategh, and T. H. Lee, “A 5GHz CMOS wireless-LAN receiver front-end,” IEEE J. Solid-State Circuits, vol. 35, pp. xxx–xxx, May 2000.
[10] D. Shaeffer, A. Shahani, S. Mohan, H. Samavati, H. Rategh, M. Hershenson, M. Xu, C. Yue, D. Eddleman, and T. Lee, “A 115-mW, 0.5-μm CMOS GPS receiver with wide dynamic-range active filters,” IEEE J. Solid-State Circuits, vol. 33, pp. 2219–2231, Dec. 1998.
[11] A. Shahani, D. Shaeffer, S. Mohan, H. Samavati, H. Rategh, M. Hershenson, M. Xu, C. Yue, D. Eddleman, and T. Lee, “Low-power dividerless frequency synthesis using aperture phase detector,” IEEE J. Solid-State Circuits, vol. 33, pp. 2232–2239, Dec. 1998.
[12] T. Soorapanth, C. P. Yue, D. K. Shaeffer, T. H. Lee, and S. S. Wong, “Analysis and optimization of accumulation-mode varactor for RF ICs,” in Symp. VLSI Circuits Dig., 1998, pp. 32–33.
[13] M. Steyaert, M. Borremans, J. Janssens, B. D. Muer, N. Itoh, J. Craninckx, J. Crols, E. Morifuji, H. S. Momose, and W. Sansen, “A single-chip CMOS transceiver for DCS-1800 wireless communications,” in ISSCC Dig., 1998, pp. 48–49.
[14] C. P. Yue, C. Ryu, J. Lau, T. H. Lee, and S. S. Wong, “A physical model for planar spiral inductors on silicon,” in IEDM Tech. Dig., 1996, pp. 6.5.1–6.5.4.
[15] C. P. Yue and S. S. Wong, “On-chip spiral inductors with patterned ground shields for Si-Based RF IC’s,” in Symp. VLSI Circuits Dig., 1997, pp. 85–86.

Hirad Samavati (S’99) received the B.S. degree in electrical engineering from Sharif University of Technology, Tehran, Iran, in 1994, and the M.S. degree in electrical engineering from Stanford University, Stanford, CA, in 1996. He currently is pursuing the Ph.D. degree at Stanford University.
During the summer of 1996, he was with Maxim Integrated Products, where he designed building blocks for a low-power infrared transceiver IC. His research interests include RF circuits and analog and mixed-signal VLSI, particularly integrated transceivers for wireless communications.
Mr. Samavati received a departmental fellowship from Stanford University in 1995 and a fellowship from the IBM Corporation in 1998. He is the winner of the ISSCC Jack Kilby outstanding student paper award for the paper “Fractal Capacitors” in 1998.
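The adjacent-channel figures quoted in the synthesizer measurement discussion follow from simple decibel bookkeeping, which the sketch below reproduces. It is an illustration only: the function names are hypothetical, and the phase-noise variant assumes a flat noise density across a channel bandwidth that this excerpt does not state, so the bandwidth is left as a caller-supplied parameter.

```python
import math


def blocker_margin_from_spur(spur_dbc, min_sir_db):
    """How much stronger (dB) an adjacent channel may be than the wanted one.

    A spur at spur_dbc translates the adjacent channel onto the wanted channel
    attenuated by |spur_dbc|; the result must stay min_sir_db below the signal.
    """
    return -spur_dbc - min_sir_db


def blocker_margin_from_phase_noise(pn_dbc_per_hz, channel_bw_hz, min_sir_db):
    """Same margin for phase noise, assuming flat noise over channel_bw_hz."""
    integrated_dbc = pn_dbc_per_hz + 10.0 * math.log10(channel_bw_hz)
    return -integrated_dbc - min_sir_db


if __name__ == "__main__":
    # -54 dBc spur and 10 dB minimum SIR, as quoted in the text.
    print(blocker_margin_from_spur(-54.0, 10.0))
    # -127.5 dBc/Hz from the text; the 24 MHz bandwidth is a hypothetical value.
    print(round(blocker_margin_from_phase_noise(-127.5, 24e6, 10.0), 1))
```

The spur case reproduces the 44 dB figure directly; the phase-noise case lands near the quoted 43 dB only for a channel bandwidth in the low tens of megahertz, which is why the bandwidth is flagged as an assumption.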
I. INTRODUCTION
WIRELESS systems, such as PCS-CDMA, cellular CDMA, and JSTD-018PCS, require the frequency synthesizer to have precise channel spacing and low phase noise to meet the overall noise specification and to prevent unwanted signal mixing of the interferer. Most existing frequency synthesizers are implemented in silicon germanium (SiGe) or bipolar technologies, and use several external devices such as temperature-compensated crystal oscillator (TCXO) and loop filter. Because of cost and power consumption requirements, fully integrated CMOS RF building blocks are crucial and have been widely explored [1], [2].

Fig. 1 shows an example of a dual-band RF transceiver architecture for PCS- and cellular CDMA. A local oscillator (LO) signal from a dual-band frequency synthesizer is fed to the first mixer of the receiver for downconversion and it is also used in the transmitter for upconversion. The noise requirement of the frequency synthesizer is determined by the blocking profile of the system, which is calculated from the power of signal and interferer, minimum signal-to-noise ratio (SNR), and bandwidth specification [3]. The lower the phase noise of the LO signal is, the less unwanted signal around the carrier is modulated within the in-band channel.

Table I shows worldwide mobile frequency standards and RF phase-locked loop (PLL) requirements. CDMA systems require fast switching time with precise accuracy of the channel frequency. The channel raster is 30 kHz in cellular and 50 kHz in PCS systems and, to support the dual-band solution of the CDMA system, the frequency resolution of the synthesizer must be 10 kHz. This is a major limiting factor in the reduction of the locking time and root mean square (rms) phase error. It also makes it difficult to achieve single-chip integration due to the loop filter that has a large time constant.

In Section II, the special features of the proposed frequency synthesizer are discussed. Section III describes several building blocks of the synthesizer. The measurement results are given in Section IV, and conclusions are presented in Section V.

II. SYSTEM ARCHITECTURE
The proposed PLL is a monolithic integrated circuit that performs dual-band RF synthesis for CDMA wireless communication applications without any external device. Fig. 2 shows the block diagram of the fractional-N-type frequency synthesizer architecture. The external reference frequency is mainly 19.68 MHz, and 19.8 and 19.2 MHz are also supported. The voltage-controlled oscillator (VCO) oscillates at 1.7 GHz for

Manuscript received July 28, 2001; revised November 9, 2001.
Y. Koo, H. Huh, Y. Cho, D.-K. Jeong, and W. Kim are with the School of Electrical Engineering and Computer Science, Seoul National University, Seoul 151-742, Korea (e-mail: ydkoo@iclab.snu.ac.kr).
J. Lee, J. Park, and K. Lee are with GCT Semiconductor, Inc., San Jose, CA 95131 USA.
Publisher Item Identifier S 0018-9200(02)03676-4.
0018-9200/02$17.00 © 2002 IEEE
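The 10-kHz resolution requirement quoted in the introduction is the largest step that lands on both the 30-kHz cellular raster and the 50-kHz PCS raster, that is, their greatest common divisor. The sketch below checks this and, as an added assumption not stated in the text, counts how many 10-kHz steps fit in each supported reference frequency (the fractional modulus that would be needed if the phase detector ran directly at the reference frequency).

```python
import math
from functools import reduce


def required_resolution_hz(rasters_hz):
    """Largest frequency step that hits every channel of every raster."""
    return reduce(math.gcd, rasters_hz)


def steps_per_reference(f_ref_hz, resolution_hz):
    """Number of resolution steps in one reference period (must divide evenly)."""
    if f_ref_hz % resolution_hz != 0:
        raise ValueError("reference is not an integer multiple of the resolution")
    return f_ref_hz // resolution_hz


if __name__ == "__main__":
    res = required_resolution_hz([30_000, 50_000])
    print(res)  # 10000
    for f_ref in (19_680_000, 19_800_000, 19_200_000):
        print(f_ref, steps_per_reference(f_ref, res))
```

An integer-N loop would need a 10-kHz comparison frequency to reach the same step, which is the large loop-filter time constant the introduction refers to.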
KOO et al.: FULLY INTEGRATED CMOS FREQUENCY SYNTHESIZER 537
TABLE I
WORLDWIDE MOBILE FREQUENCY STANDARDS AND RF-PLL REQUIREMENTS
Fig. 3. Timing diagram of reference and VCO inputs of PFD in locked state when the fraction is 1/3.
Fig. 5. Behavioral simulation of spur in fractional-N architecture. (a) With conventional scheme. (b) With proposed charge-averaging scheme.
C. Voltage-Controlled Oscillator
Two major issues in the design of the VCO are low phase noise to meet the overall noise figure criteria and high gain linearity for robust stability. Phase noise is mainly dependent on the quality factor (Q) of the LC tank [5]. Although an on-chip spiral inductor has recently been widely explored [6], a bonding-wire inductor is superior to a spiral inductor in terms of resistance, i.e., quality factor. In addition, the bonding-wire inductor has constant inductance over a wide frequency range. Fig. 9 shows bonding-wire inductor modeling. Two pads are connected to the differential output of the VCO and the ends of two lead frames are connected as a short or by external inductance, according to the operating band, PCS or cellular. The Q factor of the inductance is expressed as $Q = \omega L / R$ with parasitic components ignored. With bonding-wire parameters that are typical values in the QFN20 package, the Q factor of the inductor at 1.7 GHz is 43.

Fig. 10 shows the circuit diagram of the VCO. The oscillation frequency is controlled by the combination of fixed and variable capacitor. There are two methods that have been previously reported for implementation of the fixed capacitor. One is a metal-to-metal capacitor [1] and the other is a MOS transistor [7]. The former is used since it is superior in terms of VCO pushing characteristics. Metal-to-metal capacitors and switches are used for coarse tuning, which are controlled by the coarse tuning controller. The size of the switch should be sufficiently large in order to avoid degradation of the Q factor of the LC tank. As a variable capacitor, an accumulation-mode MOS transistor is used for fine tuning. The MOS capacitor has an inherent nonlinear capacitance. But, with the coarse tuning scheme, the control voltage moves within 0.2 V around half of the supply voltage, thereby obtaining almost linear gain.

Fig. 10. Circuit diagram of (a) VCO and (b) bias circuit.
540 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 37, NO. 5, MAY 2002
Fig. 13. Coarse tuning controller. (a) Block diagram. (b) Timing diagram of operation.
TABLE II
SUMMARY OF SYNTHESIZER PERFORMANCE
Fig. 16. Measured PLL output phase noise. (a) Cellular band. (b) PCS band.

IV. EXPERIMENTAL RESULTS AND SUMMARY
The proposed frequency synthesizer has been fabricated in a 0.35-μm CMOS technology. Fig. 14 shows the die photograph of the synthesizer with an area of 2.5 mm × 2.0 mm including pads. The circuit has been measured with a nominal 3.0-V supply and a 2.7-V worst case. The bonding wire of the QFN20 package used in the VCO has 1.36 nH of self-inductance and 0.31 nH of mutual inductance, so the total inductance at one output of the VCO is 1.05 nH. All control signals are fed through a serial interface. The VCO at the bottom and the loop filter on the left have a common analog supply and ground, and the others are all connected to the digital supply and ground. The power consumption of the total chip is 60 mW and the VCO alone dissipates 12.3 mW.

Fig. 15 shows the measured carrier spectrum with the center frequency of 980 MHz. The output power is 1.2 dBm with an inductive load, which is sufficient for the output power requirements. Fig. 16 shows the measured phase noise in the cellular and PCS band. Phase noise is −106 dBc/Hz at 100-kHz offset and −127 dBc/Hz at 1-MHz offset in the cellular band, and −104 dBc/Hz at 100-kHz offset and −121 dBc/Hz at 1-MHz offset in the PCS band. Fractional spurs are suppressed to the phase noise level. Table II shows the performance summary.

V. CONCLUSION
In this paper, we demonstrate a fully integrated CMOS frequency synthesizer designed for PCS- and cellular-CDMA wireless systems. A charge-averaging scheme for reducing fractional spurs and a dual-path loop filter architecture are proposed. The new bias circuit of the VCO compensates for the variation of output swing of the VCO caused by the variation of bonding-wire inductance, and the proposed coarse tuning technique achieves a small VCO gain and a wide operating frequency range of the VCO simultaneously. The frequency synthesizer fabricated in a 0.35-μm CMOS technology offers −127-dBc/Hz and −121-dBc/Hz phase noise at 1-MHz offset with 980 MHz and 1.76 GHz of carrier frequency, respectively.

REFERENCES
[1] A. Kral, F. Behbahani, and A. A. Abidi, “RF-CMOS oscillators with switched tuning,” in Proc. IEEE Custom Integrated Circuits Conf., May 1998, pp. 555–558.
[2] C. Lam and B. Razavi, “A 2.6-GHz/5.2-GHz frequency synthesizer in 0.4-μm CMOS technology,” in Symp. VLSI Circuits Dig. Tech. Papers, June 1999, pp. 117–120.
[3] B. Razavi, RF Microelectronics. Upper Saddle River, NJ: Prentice Hall, 1998.
[4] J. Craninckx and M. Steyaert, “A fully integrated CMOS DCS-1800 frequency synthesizer,” IEEE J. Solid-State Circuits, vol. 33, pp. 2054–2065, Dec. 1998.
[5] D. B. Leeson, “A simple model of feedback oscillator noise spectrum,” Proc. IEEE, vol. 54, pp. 329–330, Feb. 1966.
[6] S. Mohan, M. Hershenson, S. Boyd, and T. H. Lee, “Simple accurate expressions for planar spiral inductances,” IEEE J. Solid-State Circuits, vol. 34, pp. 1419–1424, Oct. 1999.
[7] J.-M. Mourant, J. Imbornone, and T. Tewksbury, “A low phase noise monolithic VCO in SiGe BiCMOS,” in IEEE Radio Frequency Integrated Circuits (RFIC) Symp. Dig., June 2000, pp. 65–68.

Yido Koo was born in Seoul, Korea, in 1973. He received the B.S. and M.S. degrees from the School of Electrical Engineering, Seoul National University, Seoul, Korea, in 1996 and 1998, respectively, where he is currently working toward the Ph.D. degree.
His research interests include RF building blocks and systems for wireless communication and high-speed interface for data communications. Currently, he is developing a low-noise frequency synthesizer for CDMA and GSM applications.

Hyungki Huh was born in Seoul, Korea. He received the B.S. and M.S. degree in electrical engineering from Seoul National University, Seoul, Korea, in 1998 and 2001, respectively, where he is currently working toward the Ph.D. degree in electrical engineering.
His research interests are in the area of RF circuits and systems with emphasis on the fractional frequency synthesizer.

Yongsik Cho was born in Daegu, Korea. He received the B.S. degree in electrical engineering from Seoul National University, Seoul, Korea, in 2000, where he is currently working toward the M.S. degree in electrical engineering.
His research interests are in the area of RF circuits and systems.

Joonbae Park received the B.S. and M.S. degrees in electronics engineering and the Ph.D. degree in electrical engineering from Seoul National University, Seoul, Korea, in 1993, 1995, and 2000, respectively.
In 1998, he joined GCT Semiconductor Inc., San Jose, CA, as Director of the Analog Division. He is currently involved in the development of CMOS RF chip sets for WLL, W-CDMA, and wireless LAN. His other research interests include data converters and high-speed communication interfaces.
Dr. Park received the Best Paper Award of VLSI Design’99, Goa, India.

Kyeongho Lee was born in Seoul, Korea, in 1969. He received the B.S. and M.S. degrees in electronics engineering and the Ph.D. degree in electrical engineering from Seoul National University, Seoul, Korea, in 1993, 1995, and 2000, respectively.
He was with Silicon Image, Inc., Sunnyvale, CA, as a Member of Technical Staff, where he worked on CMOS high-bandwidth low-EMI transceivers. He is currently with GCT Semiconductor Inc., San Jose, CA, as a Co-Chief Executive Officer. His research interests include various CMOS high-speed circuits for wire/wireless communication systems and integrated CMOS RF systems.

Deog-Kyoon Jeong received the B.S. and M.S. degrees in electronics engineering from Seoul National University, Seoul, Korea, in 1981 and 1984, respectively, and the Ph.D. degree in electrical engineering and computer sciences from the University of California, Berkeley, in 1989.
From 1989 to 1991, he was with Texas Instruments Inc., Dallas, TX, where he was a Member of Technical Staff and worked on the modeling and design of BiCMOS gates and the single-chip implementation of the SPARC architecture. He joined the faculty of the Department of Electronics Engineering and Inter-University Semiconductor Research Center, Seoul National University, as an Assistant Professor in 1991. He is currently an Associate Professor of the School of Electrical Engineering, Seoul National University. His main research interests include high-speed I/O circuits, VLSI systems design, microprocessor architectures, and memory systems.
The proposed model allows straightforward noise and dynamic analyses of Σ–Δ fractional-N frequency synthesizers and other PLL applications in which the divide value is varied in time. Based on the derived model, a general parameterization is presented that further simplifies noise calculations. The framework is used to analyze the noise performance of a custom Σ–Δ synthesizer implemented in a 0.6-μm CMOS process, and accurately predicts the measured phase noise to within 3 dB over the entire frequency offset range spanning 25 kHz to 10 MHz.
Index Terms: Delta, dithering, divider, fractional-N, frequency, modeling, noise, phase-locked loop, PLL, quantization noise, sigma, synthesizer.
Fig. 1. Block diagram of a Σ–Δ frequency synthesizer.
I. INTRODUCTION
of the divide value variations is often treated in isolation of other
B. XOR-Based PFD
An XOR-based PFD is shown in Fig. 4 [13]–[15], along with associated signals that will be discussed later. Assuming the PFD is not performing frequency acquisition, the internal XOR signal is simply passed to the output, $E(t)$, so that the detector operates as an XOR phase detector. As such, the detector outputs an average error of zero when the reference and divider signals are in quadrature, and $E(t)$ is nominally a two-level square wave rather than the trilevel short-pulse waveform obtained with the tristate design. The combination of having wide pulses and only two output levels allows the XOR-based PFD to achieve high linearity, which is desirable for Σ–Δ synthesizer applications to avoid folding down Σ–Δ quantization noise [13].

To model the XOR-based PFD, we simply relate its associated signals to the tristate detector so that the previous results can be readily applied. Fig. 4 displays the signals associated with this PFD, and reveals that the output $E(t)$ can be decomposed into the sum of a square wave and a trilevel pulse waveform. The first component is independent of the input phase difference to the detector and presents a spurious noise signal to the PLL; its influence can be made negligible with proper design. The second component captures the impact of the input phase difference on the PFD output, and can be parameterized according to the width of its pulses. As with the tristate detector, the impulse approximation can be applied to obtain an expression which, apart from a phase offset, is the tristate expression multiplied by a factor of 2. Thus, if we ignore the phase offset and the square wave, the XOR-based PFD has an identical model to that of the tristate topology except that its gain is increased by a factor of 2.

Fig. 4. XOR-based PFD, associated signals, and E(t) decomposition.

C. Voltage-Controlled Oscillator
For our purposes, only two equations are needed to model the VCO. The first relates deviations in the VCO phase, defined as $\Phi_{out}(t)$, to changes in the VCO input voltage, $v_{in}(t)$. Since VCO phase is the integral of VCO frequency, and deviations in VCO frequency are calculated as $K_v v_{in}(t)$, where $K_v$ is in units of hertz per volt, we have

$$\Phi_{out}(t) = 2\pi K_v \int_{-\infty}^{t} v_{in}(\tau)\, d\tau \qquad (3)$$

The second equation relates the absolute VCO phase, defined as $\Phi_{vco}(t)$, to deviations in the VCO phase and the nominal VCO frequency $f_{vco}$:

$$\Phi_{vco}(t) = 2\pi f_{vco}\, t + \Phi_{out}(t) \qquad (4)$$

Our modeling efforts will be primarily focused on deviations in the VCO phase, so that (3) is of the most interest. However, (4) is required in the divider derivation that follows.

D. Divider
Modeling of the divider will be accomplished by first relating the PFD pulse widths, $\Delta t[k]$, to the VCO phase deviations, $\Phi_{out}(t)$, and the divide value sequence, $N[k]$. Given this relationship, the divider model is “backed out” using the PFD gain expression in (1).

We begin by noting that the divider output edges occur whenever the absolute VCO phase, $\Phi_{vco}(t)$, completes $2\pi N[k]$ radian increments of phase. As stated in (4), $\Phi_{vco}(t)$ is composed of a ramp in time, $2\pi f_{vco} t$, and phase variations, $\Phi_{out}(t)$. These statements are collectively illustrated in Fig. 5. Note that changes in $N[k]$ occur at the rising edges of the divider.
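The two VCO relations described above (phase deviation as the scaled integral of the input voltage, and absolute phase as a ramp plus that deviation) can be illustrated with a short discrete-time sketch. It is an approximation under stated assumptions: the integral is replaced by a rectangular sum with step `dt`, and the function names, the 30 MHz/V gain, and the 1.8 GHz nominal frequency are hypothetical values chosen for the example.

```python
import math


def vco_phase_deviation(v_in_samples, k_v_hz_per_v, dt):
    """Running phase deviation (rad): 2*pi*K_v times the integral of v_in."""
    phase = 0.0
    out = []
    for v in v_in_samples:
        # Frequency deviation is K_v * v (Hz); integrate it into phase.
        phase += 2.0 * math.pi * k_v_hz_per_v * v * dt
        out.append(phase)
    return out


def absolute_vco_phase(t, f_nominal_hz, phase_dev):
    """Absolute phase (rad): ramp at the nominal frequency plus the deviation."""
    return 2.0 * math.pi * f_nominal_hz * t + phase_dev


def divider_edges_completed(abs_phase_rad, n_div):
    """Divider output edges so far: one per 2*pi*N radians of VCO phase."""
    return int(abs_phase_rad // (2.0 * math.pi * n_div))


if __name__ == "__main__":
    dt, steps = 1e-9, 1000                      # 1 us of simulated time
    dev = vco_phase_deviation([0.1] * steps, 30e6, dt)
    total = absolute_vco_phase(steps * dt, 1.8e9, dev[-1])
    print(dev[-1] / (2 * math.pi), divider_edges_completed(total, 90))
```

A constant 0.1 V input with a 30 MHz/V gain shifts the frequency by 3 MHz, so the deviation grows by three cycles per microsecond. Note that the edge counter assumes a fixed divide value, whereas the divider in the text steps through a time-varying sequence.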
PERROTT et al.: MODELING APPROACH FOR Σ–Δ FRACTIONAL-N FREQUENCY SYNTHESIZERS 1031
(6)

We combine the two key equations into one formulation by substitution of (6) into (5):

(9)

We obtain the desired divider model by replacing the PFD pulse width with the PFD gain expression in (1) and assuming is zero.

(10)

F. Overall Model
We now combine the results of Section III-A–E to obtain the overall time-domain PLL model shown in Fig. 6. The PFD model is obtained from (1) and (2), the divider model from (10), and the VCO model from (3). As discussed earlier, the
1032 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 37, NO. 8, AUGUST 2002
XOR-based PFD has a factor of two larger gain than the tristate design, which is captured by the gain factor in the PFD model. For convenience in the analysis to follow, we also define an abstract signal as the output of the divider accumulation action.
Some observations are in order. First, the divider effectively samples the continuous-time output phase deviation of the VCO, and then divides its value by the nominal divide value. The output phase of the divider is influenced by the integration of deviations in the divide value. The integration of these deviations is a consequence of the fact that the divider output is a phase signal, whereas a change in the divide value causes an incremental change in the frequency of the divider output. Second, the PFD, charge pump, and loop filter translate the discrete-time error signal formed by the reference and divider phases to the continuous-time input of the VCO. These elements, along with the divider, also act as a D/A converter for mapping changes in the divide value to the VCO input.
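As a rough numerical check of the accumulation behavior described above, the sketch below computes the divider phase in two ways for a VCO running at a constant frequency (no phase deviation): by timing the divider edges against an ideal reference, and by accumulating the divide-value deviations. The function names, the sign convention (a larger divide value delays the edge and is counted as negative phase), and the example values are assumptions made for illustration; they are not taken from (10).

```python
import math
from itertools import accumulate

def divider_phase_from_edges(divide_values, f_vco, n_nom):
    """Divider phase deviation (rad) measured from edge timing.

    Edge k occurs after divide_values[0] + ... + divide_values[k] VCO cycles;
    it is compared with an ideal reference edge at (k + 1) * n_nom / f_vco.
    """
    t_ref = n_nom / f_vco
    edge_times = [cycles / f_vco for cycles in accumulate(divide_values)]
    return [-2.0 * math.pi * (t - (k + 1) * t_ref) / t_ref
            for k, t in enumerate(edge_times)]

def divider_phase_from_accumulator(divide_values, n_nom):
    """Same quantity from a running sum of divide-value deviations,
    scaled by 2*pi / n_nom (the accumulation view of the divider)."""
    deviations = [n - n_nom for n in divide_values]
    return [-2.0 * math.pi * s / n_nom for s in accumulate(deviations)]

# Example with assumed values: nominal divide value 64, dithered sequence.
sequence = [64, 65, 64, 63, 66, 62]
phase_edges = divider_phase_from_edges(sequence, 1.8e9, 64)
phase_accum = divider_phase_from_accumulator(sequence, 64)
```

The two lists agree to within floating-point rounding, which illustrates why a deviation in the divide value, a frequency-like quantity, shows up integrated in the divider's output phase.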
A. Pseudocontinuous Approximation
Consider a signal that is sampled with period and then
converted to an impulse sequence , as described by
Fig. 9. Detailed view of PLL noise sources and examples of their respective
spectral densities.
(14)
(15)
(16)
synthesizer constantly dithers the divide value at a high rate compared to the bandwidth of the PLL dynamics, which therefore extract its low-frequency content. The low-frequency content of the Σ–Δ output is, in turn, set by the Σ–Δ input, which can have arbitrarily high resolution. Thus, the Σ–Δ modulator allows the PLL output frequency to be controlled to a very high resolution independent of the reference frequency: a high reference frequency can be used while simultaneously achieving high frequency resolution.

C. Frequency-Domain Model
To obtain the frequency-domain model of a Σ–Δ synthesizer, we simply extend the PLL model in Fig. 10 to include the Σ–Δ modulator, as shown in Fig. 12. This figure depicts a general model of a Σ–Δ modulator which is characterized by its STF and NTF. The base quantization noise is assumed ideal (i.e., white) in the illustration.
Fig. 12 offers several insights into the fundamentals of Σ–Δ frequency synthesis. First, we see that the shaped Σ–Δ quantization noise passes through a digital accumulator and then the PLL dynamics before impacting the output phase of the PLL. The digital accumulator, a consequence of the integrating nature of the divider, effectively reduces the noise-shaping order of the Σ–Δ by one. The PLL dynamics act to remove the high-frequency quantization noise produced by the Σ–Δ modulator. The Σ–Δ quantization noise adds an additional noise source to those already present in the PLL, but the relationship from each noise source to the output phase remains purely a function of the PLL dynamics and the nominal divide value.

D. Quantization Noise Impact on PLL
As Fig. 12 reveals, a Σ–Δ synthesizer’s noise performance is impacted by the Σ–Δ quantization noise in addition to the intrinsic detector and VCO noise sources found in the classical PLL. Calculation of this impact is straightforward using the presented modeling approach. For example, given that the NTF of an mth-order MASH structure is (1 − z^−1)^m, we calculate the impact of its quantization noise on the PLL output using Fig. 12 and (16) as

which is also expressed as
(18)
If the quantization noise spectrum is white, then

as previously discussed. In many cases, it is not white and must be computed numerically by simulating the Σ–Δ modulator at a given value of its input.
Equation (18) shows that the Σ–Δ quantization noise is reduced in order by one due to the integrating action of the divider. Assuming the base quantization noise is white, the shaped noise rises at 20(m − 1) dB/decade at frequencies well below the reference frequency. Therefore, if the order of the PLL dynamics is chosen to be the same as the order of the Σ–Δ, the quantization noise seen at the PLL output will roll off at 20 dB/decade outside the PLL bandwidth. This rolloff characteristic matches that of the VCO noise.

VII. RESULTS
The above methodology is now used to analyze the noise performance of a prototype system described in [9], [13]. Fig. 13 displays a block diagram of the prototype, which consists of a custom CMOS fractional-N synthesizer IC that includes an XOR-based PFD, an on-chip loop filter that uses switched capacitors to set its time constant, a second-order digital MASH Σ–Δ modulator, and an asynchronous 64-modulus divider that supports any divide value between 32 and 63.5 in half-cycle increments. An external divide-by-2 prescaler is used so that the CMOS divider input operates at half the VCO frequency, which modifies the range of divide values to include all integers between 64 and 127. A computer interface is used to set the digital frequency value that is fed into the input of the Σ–Δ modulator.

Fig. 13. Block diagram of prototype system.

A. Modeling
A linearized frequency-domain model of the prototype system is shown in Fig. 14. The open-loop transfer function of the system consists of two integrators, a pole, and a zero. Additional poles and zeros occur in the system due to the effects of finite opamp bandwidth and other nonidealities,
(19)
Fig. 15. Expanded view of PLL System.
TABLE I
VALUES OF NOISE SOURCES WITHIN PLL
The parameters of the system were set such that the PLL had a bandwidth of 84 kHz:
kHz
kHz
kHz
(20)
(24)
(25)
(26)
(27)
measured results within 3 dB over a frequency offset range from 25 kHz to 10 MHz.

ACKNOWLEDGMENT
The authors would like to thank the Hong Kong University of Science and Technology, and in particular, J. Lau, P. Chan, and P. Ko, for their support in the writing of this paper.

REFERENCES
[1] T. A. Riley, M. A. Copeland, and T. A. Kwasniewski, “Delta–sigma modulation in fractional-N frequency synthesis,” IEEE J. Solid-State Circuits, vol. 28, pp. 553–559, May 1993.
[2] M. A. Copeland, “VLSI for analog/digital communications,” IEEE Commun. Mag., vol. 29, pp. 25–30, May 1991.
[3] B. Miller and B. Conley, “A multiple modulator fractional divider,” in Proc. 44th Annu. Symp. Frequency Control, May 1990, pp. 559–567.
[4] B. Miller and B. Conley, “A multiple modulator fractional divider,” IEEE Trans. Instrum. Meas., vol. 40, pp. 578–583, June 1991.
[5] W. Rhee, B.-S. Song, and A. Ali, “A 1.1-GHz CMOS fractional-N frequency synthesizer with 3-b third-order sigma–delta modulator,” IEEE J. Solid-State Circuits, vol. 35, pp. 1453–1460, Oct. 2000.
[6] B. Miller, “Technique enhances the performance of PLL synthesizers,” Microw. RF, pp. 59–65, Jan. 1993.
[7] T. Kenny, T. Riley, N. Filiol, and M. Copeland, “Design and realization of a digital delta–sigma modulator for fractional-N frequency synthesis,” IEEE Trans. Veh. Technol., vol. 48, pp. 510–521, Mar. 1999.
[8] T. A. Riley and M. A. Copeland, “A simplified continuous phase modulator technique,” IEEE Trans. Circuits Syst. II, vol. 41, pp. 321–328, May 1994.
[9] M. Perrott, T. Tewksbury, and C. Sodini, “A 27-mW CMOS fractional-N synthesizer using digital compensation for 2.5-Mb/s GFSK modulation,” IEEE J. Solid-State Circuits, vol. 32, pp. 2048–2060, Dec. 1997.
[10] S. Willingham, M. Perrott, B. Setterberg, A. Grzegorek, and W. McFarland, “An integrated 2.5-GHz sigma–delta frequency synthesizer with 5 microseconds settling and 2-Mb/s closed-loop modulation,” in Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC), Feb. 2000, pp. 200–201.
[11] N. Filiol, T. Riley, C. Plett, and M. Copeland, “An agile ISM band frequency synthesizer with built-in GMSK data modulation,” IEEE J. Solid-State Circuits, vol. 33, pp. 998–1008, July 1998.
[12] N. Filiol, C. Plett, T. Riley, and M. Copeland, “An interpolated frequency-hopping spread-spectrum transceiver,” IEEE Trans. Circuits Syst. II, vol. 45, pp. 3–12, Jan. 1998.
[13] M. H. Perrott, “Techniques for high data rate modulation and low power operation of fractional-N frequency synthesizers with noise shaping,” Ph.D. dissertation, Massachusetts Inst. Technol., Cambridge, MA, 1997.
[14] A. Hill and A. Surber, “The PLL dead zone and how to avoid it,” RF Design, pp. 131–134, Mar. 1992.
[15] M. Thamsirianunt and T. A. Kwasniewski, “A 1.2-μm CMOS implementation of a low-power 900-MHz mobile radio frequency synthesizer,” in Proc. IEEE Custom Integrated Circuits Conf. (CICC), 1994, p. 16.2.
[16] J. A. Crawford, Frequency Synthesizer Handbook. Norwood, MA: Artech, 1994.
[17] E. A. Lee and D. G. Messerschmitt, Digital Communication, 2nd ed. Norwell, MA: Kluwer, 1994.
[18] J. Candy and G. Temes, Oversampling Delta–Sigma Data Converters. New York: IEEE Press, 1992.
[19] S. Norsworthy, R. Schreier, and G. Temes, Delta–Sigma Data Converters: Theory, Design, and Simulation. New York: IEEE Press, 1997.
[20] A. Sripad and D. Snyder, “A necessary and sufficient condition for quantization errors to be uniform and white,” IEEE Trans. Acoust. Speech Signal Proc., vol. ASSP-25, pp. 442–448, Oct. 1977.
[21] W. Bennett, “Spectra of quantized signals,” Bell Syst. Tech. J., vol. 27, pp. 446–472, July 1948.
[22] D. Leeson, “A simple model of feedback oscillator noise spectrum,” Proc. IEEE, vol. 54, pp. 329–330, Feb. 1966.
[23] A. Hajimiri and T. Lee, “A general theory of phase noise in electrical oscillators,” IEEE J. Solid-State Circuits, vol. 33, pp. 179–194, Feb. 1998.
[24] M. H. Perrott, “Fast and accurate behavioral simulation of fractional-N frequency synthesizers and other PLL/DLL circuits,” in Proc. Design Automation Conf. (DAC), June 2002, pp. 498–503.

Michael H. Perrott received the B.S. degree in electrical engineering from New Mexico State University, Las Cruces, in 1988, and the M.S. and Ph.D. degrees in electrical engineering and computer science from the Massachusetts Institute of Technology (M.I.T.), Cambridge, in 1992 and 1997, respectively.
From 1997 to 1998, he was with Hewlett-Packard Laboratories, Palo Alto, CA, working on high-speed circuit techniques for Σ–Δ synthesizers. In 1999, he was a visiting Assistant Professor at the Hong Kong University of Science and Technology, where he taught a course on the theory and implementation of frequency synthesizers. From 1999 to 2001, he was with Silicon Laboratories, Austin, TX, where he developed circuit and signal-processing techniques to achieve high-performance clock and data recovery circuits. He is currently an Assistant Professor in the Department of Electrical Engineering and Computer Science at M.I.T., where his research focuses on high-speed circuit and signal processing techniques for data links and wireless applications.

Mitchell D. Trott (S’90–M’92) received the B.S. and M.S. degrees in systems engineering from Case Western Reserve University, Cleveland, OH, in 1987 and 1988, respectively, and the Ph.D. degree in electrical engineering from Stanford University, Stanford, CA, in 1992.
He was an Assistant and Associate Professor in the Department of Electrical Engineering and Computer Science at the Massachusetts Institute of Technology, Cambridge, from 1992 until 1998. He was Director of Research with ArrayComm, Inc., San Jose, CA, from 1998 to 2002. He is currently with Hewlett-Packard Laboratories, Palo Alto, CA. His research interests include multiuser communication, information theory, and coding theory.

Charles G. Sodini (S’80–M’82–SM’90–F’94) was born in Pittsburgh, PA, in 1952. He received the B.S.E.E. degree from Purdue University, Lafayette, IN, in 1974, and the M.S.E.E. and Ph.D. degrees from the University of California, Berkeley, in 1981 and 1982, respectively.
He was a Member of the Technical Staff with Hewlett-Packard Laboratories from 1974 to 1982, where he worked on the design of MOS memory and, later, on the development of MOS devices with very thin gate dielectrics. He joined the faculty of the Massachusetts Institute of Technology (M.I.T.), Cambridge, MA, in 1983, where he is currently a Professor in the Department of Electrical Engineering and Computer Science. His research interests are focused on integrated circuit and system design with emphasis on analog, RF, and memory circuits and systems. Along with Prof. R. T. Howe, he is a coauthor of an undergraduate text on integrated circuits and devices entitled Microelectronics: An Integrated Approach (Englewood Cliffs, NJ: Prentice-Hall, 1996).
Dr. Sodini held the Analog Devices Career Development Professorship at M.I.T.’s Department of Electrical Engineering and Computer Science and was awarded the IBM Faculty Development Award from 1985 to 1987. He has served on a variety of IEEE Conference Committees, including the International Electron Device Meeting, of which he was the 1989 General Chairman. He was the Technical Program Co-Chairman in 1992 and the Co-Chairman for 1993–1994 of the Symposium on VLSI Circuits. He served on the Electron Device Society Administrative Committee from 1988 to 1994. He has been a member of the Solid-State Circuits Society (SSCS) Administrative Committee since 1993 and is currently President of the SSCS.
888 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 38, NO. 6, JUNE 2003
I. INTRODUCTION
It is important to note that the problem of ripple becomes increasingly more serious as the supply voltage is scaled down and/or the operating frequency goes up. The relative magnitude of the primary sidebands at the output of the VCO is determined by the peak amplitude of the first harmonic of the ripple, the gain of the VCO, and the synthesizer reference frequency. For a given relative tuning range (e.g., 10%), the gain of LC VCOs must increase if the supply voltage goes down. For representative values of the VCO gain (in MHz/V) and the reference frequency (in MHz), the fundamental ripple amplitude must be less than 63 μV to guarantee sidebands 60 dB below the carrier.
In order to arrive at the stabilization technique, consider the PLL architecture shown in Fig. 1(b). Here, the primary charge pump drives a single capacitor while a secondary charge pump injects charge after some delay. The total current flowing through the capacitor is thus equal to
(1)
(2)
where the delay is assumed to be much smaller than the loop time constant. Consequently, the transfer function of the PFD/CP/LPF combination can be expressed as
(3)
Assuming , we have
(4)
obtaining a zero at
(5)
Proper choice of the delay can, therefore, stabilize the loop. The damping factor and the settling time of the loop can be written, respectively, as

each stage is wide enough to support such pulses, then a very large number of stages is required to obtain the necessary delay, demanding a high power dissipation.
The second issue relates to the variation of the delay with process and temperature. Since is directly proportional to , such variations can greatly affect the loop stability.
To resolve the above difficulties, the architecture is modified as shown in Fig. 2(a), where a discrete-time analog delay line is placed after and . The delay network is realized as depicted in Fig. 2(b), consisting of two interleaved master-slave sample-and-hold branches operating at half of the reference frequency. The circuit emulates the delay as follows. When is high, shares a charge packet corresponding to the previous phase comparison with while samples a level proportional to the present phase difference. In the next period, and exchange roles. The interleaved sampling network, therefore, provides a delay equal to the reference period.
The discrete-time delay technique of Fig. 2 allows a precise definition of the zero frequency without the use of resistors. To quantify the behavior of a PLL incorporating this method, we assume the loop settling time is much greater than so that the delay network can be represented by the continuous-time model shown in Fig. 2(b). Here, approximates the interleaved branches. Equation (4) can then be rewritten as
(8)
where it is assumed and the current through is neglected. This equation exhibits two interesting properties. First, if , then and the value of is “amplified” by . For example, if , then is multiplied by a factor of 10, saving substantial area. Second, the zero frequency is equal to
(9)
Fig. 2. Actual implementation of PLL with delay sampling circuit and continuous-time approximation of delay network.
underdamped settling (where the loop time constant is relatively short).
For RF synthesis, the delay network of Fig. 2(b) must be designed carefully so as to minimize ripple on the control voltage. Since in the locked condition, the voltages at nodes and are nearly equal, charge sharing between or and creates only a small ripple. Furthermore, the switches in the delay stage are realized as small, complementary devices to introduce negligible charge injection and clock feedthrough.
Comparison With Conventional Architecture: In order to quantify the advantage of the proposed architecture over the conventional PLL topology, we note that capacitor in Fig. 2 appears in parallel with or . Since the sampling capacitors are typically two to three times larger than , they suppress the charge pump nonidealities by about 9 to 12 dB.
The behavioral model shown in Fig. 3(a) is simulated in MATLAB for the two cases. As explained in Section IV, the reference frequency and the divide ratio are scaled by a factor of 100 to speed up the simulation. The nonideality of the charge pump is modeled by a constant current mismatch that is injected into the loop filter at each phase comparison instant. Fig. 3(b) depicts the settling behavior and the output spectrum for the two cases. (The plots are deliberately offset for clarity.) For approximately equal settling times, the proposed topology (Type A) achieves 10 dB lower sidebands than the conventional loop does.

III. SYNTHESIZER DESIGN
A 2.4-GHz CMOS synthesizer targeting Bluetooth applications has been designed using the stabilization technique described above. This section presents the architecture and building blocks of the synthesizer.
Shown in Fig. 4, the synthesizer uses an integer-N architecture with a feedback divider whose modulus is given by , where , , and – . With MHz, the output frequency covers the 2.4-GHz ISM band. The output of the swallow counter is pipelined by the flip-flop to allow a relaxed design for the level converter and the swallow counter. The buffer following the VCO suppresses the kickback noise of the prescaler when the modulus changes. It also avoids limiting the tuning range of the VCO by the input capacitance of the prescaler.

A. VCO Design
The VCO topology is shown in Fig. 5(a). To provide both negative and positive voltages across the MOS varactors, the sources of and are grounded and the circuit is biased on top by . The inductors are realized as shown in Fig. 5(b), with the bottom spiral moved down to metal 2 so as to reduce the parasitic capacitance [7]. Each inductor is about 14 nH, occupies an area of 180 μm × 180 μm, and exhibits a Q of 4 and a parasitic capacitance of 100 fF.
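To put the quoted inductor figures in perspective, the following back-of-the-envelope sketch applies the standard LC resonance relations to them. It is an illustration only: it assumes a 2.4-GHz operating frequency (taken from the synthesizer's target band), treats each 14-nH inductor as resonating with the capacitance at its own node, and uses hypothetical helper names; none of the computed values are reported in the paper.

```python
import math

def resonant_frequency(l_henry, c_farad):
    """Resonance of an ideal LC tank: f = 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_henry * c_farad))

def capacitance_for(f_hz, l_henry):
    """Capacitance that resonates with l_henry at f_hz."""
    return 1.0 / ((2.0 * math.pi * f_hz) ** 2 * l_henry)

def parallel_resistance(q, f_hz, l_henry):
    """Equivalent parallel tank resistance of an inductor with quality factor q."""
    return q * 2.0 * math.pi * f_hz * l_henry

L_IND = 14e-9     # inductance quoted in the text
C_PAR = 100e-15   # inductor parasitic capacitance quoted in the text
Q_IND = 4         # inductor quality factor quoted in the text
F_OSC = 2.4e9     # ASSUMED operating frequency (2.4-GHz band)

c_total = capacitance_for(F_OSC, L_IND)      # about 0.31 pF per inductor
c_remaining = c_total - C_PAR                # left for varactors, devices, wiring
f_self = resonant_frequency(L_IND, C_PAR)    # inductor self-resonance, about 4.3 GHz
r_tank = parallel_resistance(Q_IND, F_OSC, L_IND)  # about 840 ohms
```

Under these assumptions the inductor's own parasitic capacitance uses up roughly a third of the capacitance budget at 2.4 GHz, which suggests why reducing it by moving the bottom spiral to a lower metal layer is worthwhile.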
LEE AND RAZAVI: STABILIZATION TECHNIQUE FOR PHASE-LOCKED FREQUENCY SYNTHESIZERS 891
Fig. 3. (a) MATLAB behavioral simulations for the ripples on the control lines. (b) Time-domain settling and VCO output spectrum during lock for Type A
(delay-sampling loop filter) and Type B (conventional loop filter).
(13)
TABLE I
FAST SIMULATION SUMMARY

well-defined charge packet into an integrator in every period and reset the integrator when its output exceeds a certain level. Using an ideal op amp, comparator, and switches with proper choice of and , the circuit can achieve arbitrarily large divide ratios. (The duty cycle of output can be controlled by .) This technique yields another factor of 20 reduction in the simulation time, allowing the synthesizer to be simulated in less than 3 min on an Ultra 10 Sun Workstation. Table I summarizes the results of the two simulation techniques.

V. EXPERIMENTAL RESULTS
The frequency synthesizer has been fabricated in a digital 0.25-μm CMOS technology. Shown in Fig. 10 is a photograph of the die, whose active area measures 0.65 mm × 0.45 mm. The circuit has been tested in a chip-on-board assembly while running from a 2.5-V power supply. The power dissipation is 20 mW.
Fig. 11 shows the output spectrum in the locked condition. The phase noise is equal to −112 dBc/Hz at 1 MHz offset, well exceeding the Bluetooth requirement. The primary reference sidebands are at approximately −58.7 dBc. This level is lower than that achieved in [8] with differential VCO control and an 86.4-MHz reference frequency. Similarly, the designs in [9] and [10] exhibit an inferior tradeoff between the settling time and the sideband magnitudes.
Fig. 12 plots the measured settling behavior of the synthesizer when its channel number is switched by 64. Here, the channel select input is periodically switched between the two end channels and the oscillator control voltage is monitored. The settling time is about 60 μs, i.e., 60 input cycles. Table II summarizes the measured performance of the synthesizer.

VI. CONCLUSION
A PLL stabilization technique is introduced that relaxes the tradeoff between the settling time and the ripple on the control voltage, while obviating the need for resistors in the loop filter. The proposed approach creates a zero in the open loop
TABLE II
SYNTHESIZER PERFORMANCE SUMMARY
Tai-Cheng Lee was born in Taiwan, R.O.C., in
1970. He received the B.S. degree from National
Taiwan University, Taipei, Taiwan, R.O.C., in
1992, the M.S. degree from Stanford University,
Stanford, CA, in 1994, and the Ph.D. degree from
the University of California, Los Angeles, in 2001,
all in electrical engineering.
He was with LSI Logic from 1994 to 1997 as a Cir-
cuit Design Engineer. He served as an Adjunct As-
sistant Professor with the Graduate Institute of Elec-
tronics Engineering (GIEE), National Taiwan Uni-
versity, from 2001 to 2002. Since 2002, he has been with the Department of
Electrical Engineering and GIEE, National Taiwan University, where he is an
Assistant Professor. His main research interests are in high-speed mixed-signal
and analog circuit design, data converters, PLL systems, and RF circuits.
Abstract: An adaptive phase-locked loop (PLL) architecture for high-performance tuning systems is described. The architecture combines contradictory requirements posed by different performance aspects. Adaptation of loop parameters occurs continuously, without switching of loop filter components, and without interaction from outside of the tuning system. The relationships of performance aspects (settling time, phase noise, and spurious signals) to design variables (loop bandwidth, phase margin, and loop filter attenuation at the reference frequency) are presented, and the basic tradeoffs of the new concept are discussed. A circuit implementation of the adaptive PLL, optimized for use in a multiband (global) car-radio tuner IC, is described in detail. The realized tuning system achieved state-of-the-art settling time and spectral purity performance in its class (integer-N PLL’s): a signal-to-noise ratio of 65 dB, a 100-kHz spurious reference breakthrough signal under −81 dBc, and a residual settling error of 3 kHz after 1 ms, for a 20-MHz frequency step. It simultaneously fulfills the speed requirements for inaudible frequency hopping and the heavy signal-to-noise ratio specification of 64 dB.

Index Terms: Adaptive systems, FM noise, frequency synthesizers, phase-locked loops.

I. INTRODUCTION
Fast-settling-time frequency synthesizers are essential building blocks of modern communication systems. Typical examples are digital cellular mobile systems, which employ a combination of time-division duplex (TDD) and frequency-division duplex (FDD) techniques. In these systems, the downlink frequencies (base station to handsets) are placed in different bands with respect to uplink frequencies. In order to save cost and decrease the size of the handset, it is desirable to use the same frequency synthesizer to generate uplink and downlink frequencies. Requirements are that the synthesizer has to switch between bands and settle to another frequency within a predetermined time (1.7 ms for GSM and DCS-1800 systems [1]).
Car-radio receivers with optimal radio data system (RDS) performance ask for fast-settling-time tuning systems as well [2]. The RDS network transmits a list of (nationwide) alternative frequencies carrying the same program. The tuner performs a background scanning of these frequencies, so that optimum reception condition is provided when the receiver is displaced within different coverage regions. For the system to be effective, the background scanning has to be performed in a transparent (inaudible) way to the listener. A possible but expensive way to do that is to use two tuners in the receiver, with one of them being used for checking on alternative frequencies only. Single-tuner solutions, which have a much better price/performance ratio, require a tuning system architecture able to do frequency hopping in an inaudible way [2]. In other words, a fast-settling-time architecture is required for these applications.
Communication systems often pose severe requirements on the spectral purity of the tuning system local oscillator (LO) signal. There are two main reasons for this. First, to avoid problems with reciprocal mixing of adjacent channels. Reciprocal mixing decreases the receiver's selectivity and disturbs the reception of weak signals. Second, because the mixing process, which is used for down-conversion of the radio-frequency (RF) signals, superposes the phase noise of the LO on the modulation of the RF signal. Hence, the signal-to-noise ratio (SNR) at the output of the demodulator is a function of the LO's phase noise level [3].
This paper describes an adaptive tuning system architecture that combines fast settling time with excellent spectral purity performance. The architecture was optimized to be used in a global car-radio tuner IC with inaudible RDS background scanning. The integer-N frequency synthesizer has an SNR of 65 dB and a 100-kHz spurious reference breakthrough under −81 dBc at the voltage-controlled oscillator (VCO) (−87 dBc at the mixer). Residual settling error for a 20-MHz frequency step is 3 kHz after 1 ms. These results are similar to those of a fractional-N implementation [4]. The complexity of our tuning system, however, is much smaller. The adaptive phase-locked loop (PLL) was integrated in a 5-GHz, 2-μm bipolar technology. The tuning system works with 8.5-V supply voltage for the charge pumps and with 5 V for the logic functions. Total current consumption is 21 mA from the 5-V supply and 12 mA from the 8.5-V supply.
The architecture of the multiband tuner IC is described in Section II. Section III presents relationships of settling time, phase noise, and spurious signals to the design variables, namely loop bandwidth, phase margin, and loop filter attenuation at the reference frequencies. Section IV introduces the adaptive PLL architecture and discusses the advantages and tradeoffs of the concept. Section V describes the circuit implementation, and Section VI presents a summary of measured results.

Manuscript received July 23, 1999; revised November 29, 1999.
The author is with Philips Research Laboratories, Eindhoven 5656 AA The Netherlands (e-mail: Cicero.Vaucher@philips.com).
Publisher Item Identifier S 0018-9200(00)02861-4.
0018–9200/00$10.00 © 2000 IEEE
VAUCHER: ADAPTIVE PLL TUNING SYSTEM ARCHITECTURE 491
TABLE I
RECEPTION BANDS WITH CORRESPONDING TUNING SYSTEM PARAMETERS
II. MULTIBAND TUNER ARCHITECTURE
The block diagram of the global tuner IC with inaudible background scanning is shown in Fig. 1. The receiver and tuning system architectures have been defined such that all reception bands can be accessed with a single VCO and a single loop filter, without changes to the application. Mapping the frequency of the VCO to the different input bands is achieved by dividing its output frequency by different ratios, depending on the band to be received. The division is accomplished in the FM DIV and AM DIV dividers, which are set in between the VCO output and the RF mixers. Table I presents the VCO frequency and tuning system parameter settings for various reception bands, including the American Weather Band. By dividing the VCO output, the tuning resolution is 1 kHz in AM mode and 50 kHz in FM mode, despite the fact that reference frequencies are 20 kHz and 100 kHz, respectively.
Combining the different reception bands in one single application (the same VCO and same loop filter) complicates the design of the tuning system. A reception band with worst case
492 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 35, NO. 4, APRIL 2000
A. Settling Time, Loop Bandwidth, and Loop Phase Margin
Bode diagrams are a powerful tool for designing PLL tuning systems [7], [8] because they enable direct assessment of the loop's phase margin and open-loop bandwidth (0-dB frequency). Accurate and reliable results for both quantities are obtained with easy-to-implement behavioral models [9] and with fast ac simulation runs. In spite of the advantages of the “ac method,” design equations relating the settling performance of a type-2, third-order charge-pump PLL [6] (the most widely used configuration in synthesizer applications) to its open-loop bandwidth and phase margin have, to the best of our knowledge, not yet been published in the open literature.
Fig. 2 presents Bode plots of a type-2, third-order loop for different values of phase margin. Fig. 3(a) displays the transient response of such a loop for three different values of phase margin. The responses are plotted as the remaining frequency error with respect to the final value, divided by the amplitude of the frequency jump, against time normalized to the open-loop bandwidth. Fig. 3(b) presents the responses as the natural logarithm of the magnitude of that ratio, so that the impact of the phase margin on the “long-term” transient response is easily observed.

Fig. 3. Settling transient for different values of phase margin, on a normalized time axis. (a) Settling error (frequency error divided by frequency jump) versus normalized time. (b) Settling error (natural logarithm of the magnitude of that ratio) versus normalized time.

The influence of the phase margin on the settling time, obtained with transient simulations similar to those of Fig. 3, is presented in Fig. 4. The figure shows the time necessary for the logarithmic settling error to reach a numerical value of −10. The settling time decreases with increasing phase margin, reaching a minimum for phase margins around 50°. Increasing the phase margin further leads to a sharp increase in the settling time.

Fig. 4. Settling time as function of the phase margin for a settling error ratio of e^−10.

The relationship of settling time and phase margin, displayed in Fig. 4, can be understood with the help of Fig. 5. It presents the pole and zero locations of the closed-loop transfer function of a third-order loop with different values of phase margin (Bode plots presented in Fig. 2). The real part of the dominant (complex) poles approaches the negative of the open-loop 0-dB angular frequency for phase margins of about 50°. When the phase margin equals 53°, all three poles lie at that location. That is the location with the fastest damping of the transient error. The fastest response, however, is obtained with 51°. The complex parts of the poles “speed up” the settling transient a bit further (25%). For higher values of phase margin, the dominant real pole moves to the right on the real axis. This pole is responsible for the slowing down of the PLL response for phase margins above 53°. Fig. 5 shows that the dominant pole, for 60° phase margin, lies at about 0.4 times that frequency. Hence, it may be concluded that the usual practice of designing critically damped loops, which have a phase margin of about 70° [5], is not appropriate for fast-settling-time applications.

Fig. 5. Position of the closed-loop poles and zeros of a third-order PLL corresponding to different values of phase margin, as displayed in Fig. 2.

Let us consider Fig. 3(b) again. One sees that the (envelope of the) curves can be approximated by straight lines. The approach proposed here takes the phase margin into account with the help of an effective damping coefficient. By so doing, we arrive at the following approximation for the envelope of the curves of Fig. 3(b):
(1)
Numerical estimations for the effective damping coefficient can be obtained from transient simulations with the help of the following expression:
(2)
The settling time results presented in Fig. 4 lead to the numerical values for the effective damping coefficient displayed in Fig. 6. These values represent an average value, as they are obtained from a change in the logarithmic settling error of ten.

Fig. 6. Average values of the effective damping coefficient for a change in the logarithmic settling error of ten.

Manipulation of (1) results in an equation describing the minimum loop bandwidth required to achieve given settling specifications.
In (3), the quantities are the locking time (s), the amplitude of the frequency jump (Hz), and the maximum frequency error (Hz) at the locking time; the effective damping coefficient can be read from Fig. 6.
Two points about the present treatment of the transient response need further explanation. First, the presented results are based on a linear continuous-time model for the discrete-time charge-pump PLL. It is known in the literature [6] that the continuous-time approach is a good approximation for the discrete-time PLL if the reference (sampling) frequency of the loop is at least a factor of ten higher than its open-loop bandwidth. Therefore, the value of the loop bandwidth, calculated with (3), has to be checked against the loop's reference frequency. If the target ratio is smaller than ten, then actual settling behavior will deviate from the calculations.
The second point is that usual implementations of the phase frequency detector have a limited linear phase error detection range, namely, from −2π to 2π [9]. When the instantaneous phase error becomes larger than 2π, the PFD interprets the error information as the error minus 2π. This effect leads to a longer settling time than predicted with (3). The maximum value of the phase error was found to obey a relationship that involves the main divider ratio and a fitting factor for the influence of the phase margin. Numerical values for the fitting factor, obtained from transient simulations, lie in the range [0.7, 0.8]. Hence, the maximum phase error is contained in the interval ±2π when the corresponding condition on these quantities holds. If this condition is satisfied, then the (discrete-time) transient response is accurately predicted by the continuous-time linear model.
Inaudible RDS background scanning requires settling times of 1 ms, defined as a residual settling error of 6 kHz for a 20-MHz frequency jump. The nominal loop phase margin is set to 50°, which corresponds to an effective damping coefficient of five. On the other hand, it is appropriate to use a lower value for in the
fications and calculations (e.g., 2.5), to provide enough margin for variations
in the nominal values of loop bandwidth and phase margin.
Solving (3) for these settling specifications leads to a nominal
(3)
value of 3.2 kHz for the loop bandwidth .
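The design example above can be checked numerically. The following is a minimal sketch that assumes (3) has the form $f_c = \ln(f_{\mathrm{step}}/f_a)/(\zeta_e\, t_{\mathrm{lock}})$, a form inferred here from the quoted numbers rather than confirmed by the extracted text. The function names are illustrative, and the 100-kHz reference frequency is the FM-mode value implied elsewhere in the paper.

```python
import math


def min_loop_bandwidth_hz(t_lock_s, f_step_hz, f_a_hz, zeta_e):
    """Minimum loop bandwidth for a settling spec (assumed form of (3))."""
    return math.log(f_step_hz / f_a_hz) / (zeta_e * t_lock_s)


def continuous_model_ok(f_ref_hz, f_c_hz, min_ratio=10.0):
    """Rule from the text: the reference should be at least 10x the bandwidth."""
    return f_ref_hz / f_c_hz >= min_ratio


# Settling spec from the text: 1 ms, 20-MHz jump, 6-kHz residual error,
# with the conservative effective damping coefficient of 2.5.
f_c = min_loop_bandwidth_hz(t_lock_s=1e-3, f_step_hz=20e6, f_a_hz=6e3, zeta_e=2.5)
print(round(f_c))                       # about 3245 Hz, i.e. roughly 3.2 kHz
print(continuous_model_ok(100e3, f_c))  # True: 100 kHz is about 31x the bandwidth
```

With the nominal value of five instead of 2.5, the same expression gives about half that bandwidth, which is why the lower value leaves margin for tolerances.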
494 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 35, NO. 4, APRIL 2000
Fig. 7. FM noise density and residual FM for loop bandwidths of 800 Hz and 3 kHz.

The loop bandwidth that satisfies different settling requirements can be calculated with the help of (3). Settling specifications, however, often require loop bandwidths that are not optimal with respect to spectral purity performance, as will become clear in the next subsection.

B. Phase Noise Performance and Loop Bandwidth

The dependency of the total phase noise of a PLL tuning system on the phase noise of the loop components is well known in the literature [3], [5], [10]. The phase noise of the VCO is suppressed inside the loop bandwidth, whereas the (phase) noise from the other building blocks is transferred to the VCO output, multiplied by the closed-loop transfer function of the PLL: a low-pass function that suppresses their noise contribution outside the loop bandwidth. There is a “crossover point” for the loop bandwidth, where the noise contribution from the dividers and charge pump becomes dominant with respect to the noise from the VCO.

For terrestrial FM reception, the LO signal residual frequency noise (residual FM) determines the ultimate receiver's SNR performance. The SNR specification for the application is 64 dB, defined for a reference level of 22.5-kHz peak deviation with 50-μs deemphasis. Complying with the specification requires the residual FM in the LO signal to be less than 10 Hz rms.

The frequency (FM) noise density $S_f(f_m)$ of the LO signal is linked to its phase noise power density $S_\phi(f_m)$ by $S_f(f_m) = f_m^2\, S_\phi(f_m)$ [5]. $S_\phi(f_m)$ equals $2\mathcal{L}(f_m)$, where $\mathcal{L}(f_m)$ is the single-sideband noise-to-carrier ratio, so that $S_f(f_m) = 2 f_m^2\, \mathcal{L}(f_m)$. Finally, the residual FM can be calculated with

$$\text{residual FM} = \sqrt{2\int_{f_l}^{f_h} f_m^2\, \mathcal{L}(f_m)\, df_m} \qquad (4)$$

The integration limits $f_l$ and $f_h$ in (4) depend on the signal bandwidth of the application [3]. For terrestrial FM reception, the lower limit is 20 Hz and the higher is 20 kHz. Fig. 7 presents the simulated frequency noise (FM noise) power density and the residual FM, which is plotted as a function of $f_h$, with $f_l$ fixed at 20 Hz. The FM noise density and the residual FM are plotted for values of loop bandwidth of 800 Hz and of 3 kHz. For 3 kHz, the residual FM amounts to 40 Hz rms, which is 12 dB higher than the specification. A loop bandwidth of 800 Hz, on the other hand, leads to a residual FM of 8 Hz rms, which satisfies the SNR requirement.

The contributions of different noise sources to the total frequency noise density, in the case of an 800-Hz loop bandwidth, are displayed in Fig. 8. The contribution of the VCO to the residual FM equals that of the other synthesizer building blocks. This is a good compromise, and 800 Hz was chosen as the nominal loop bandwidth for in-lock situations.

The settling specification requires a bandwidth of 3.2 kHz. The SNR constraint, on the other hand, asks for 800 Hz. These conflicting requirements can be combined when the loop bandwidth is made adaptive as a function of the operating mode: frequency jump or in-lock.

Adapting the value of the loop bandwidth during frequency jumps is easily accomplished by switching the nominal value of the charge-pump current [6], [13]. This method, however, often causes disturbances in the VCO tuning voltage (the so-called secondary glitch effect) at the moment the current is switched from high to low values. These disturbances are highly undesirable, as they have to be corrected by the loop in small bandwidth mode. What is more, the “secondary glitches” may cause audible disturbances in analog systems and increase the bit error rate in digital systems.

Providing stability for a small bandwidth loop requires a transfer function zero located at low frequencies (large time constant). A low-frequency zero, however, is undesirable for operation in high bandwidth mode. It causes the phase margin to be “too” high, which increases the settling time. Note that the effective damping coefficient $\zeta_e(\phi_m)$ decreases for high values of phase margin (see Fig. 6).
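The residual-FM calculation of this subsection can be tried out numerically. The sketch below assumes that (4) is the usual integral of $2 f_m^2 \mathcal{L}(f_m)$ between 20 Hz and 20 kHz. The helper names and the white-FM test profile are hypothetical, and the SNR estimate is only a rough cross-check because it ignores the 50-μs deemphasis.

```python
import math


def residual_fm_hz(ssb_dbc_per_hz, f_low=20.0, f_high=20e3, points=2000):
    """RMS residual FM (Hz) from an SSB phase-noise profile L(f_m) in dBc/Hz.

    Trapezoidal integration of S_f = 2 * f_m**2 * L(f_m) on a log-spaced grid.
    """
    ratio = (f_high / f_low) ** (1.0 / (points - 1))
    freqs = [f_low * ratio ** i for i in range(points)]
    density = [2.0 * f * f * 10.0 ** (ssb_dbc_per_hz(f) / 10.0) for f in freqs]
    area = sum(
        0.5 * (density[i] + density[i + 1]) * (freqs[i + 1] - freqs[i])
        for i in range(points - 1)
    )
    return math.sqrt(area)


def fm_snr_db(peak_deviation_hz, residual_fm_rms_hz):
    """SNR of a sine with the given peak deviation over the residual FM."""
    return 20.0 * math.log10((peak_deviation_hz / math.sqrt(2.0)) / residual_fm_rms_hz)


def white_fm_profile(f_m):
    """Hypothetical profile: L(f_m) falls as 1/f_m**2, so S_f is flat."""
    return 10.0 * math.log10(0.0025 / (f_m * f_m))


print(residual_fm_hz(white_fm_profile))   # close to 10 Hz rms for this profile
print(fm_snr_db(22.5e3, 10.0))            # about 64 dB
print(20.0 * math.log10(40.0 / 10.0))     # about 12 dB: 40 Hz rms versus the 10-Hz limit
```

The second and third lines reproduce the 64-dB and 12-dB figures quoted in the text from the 22.5-kHz peak deviation and the 10-Hz and 40-Hz residual FM values.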
Fig. 8. Contributions from different noise sources to the total FM noise density and residual FM (20 Hz–20 kHz) with 800-Hz loop bandwidth.

Therefore, for optimal settling time and phase noise, one has not only to switch the value of the loop bandwidth but also to change the location of the zero in the transfer function.

$$A_{\mathrm{spurious}}(f_m) = 20\log_{10}\left(\frac{I_{\mathrm{ac}}(f_m)\,|Z_f(f_m)|\,K_{\mathrm{vco}}}{2 f_m}\right)\ \mathrm{dBc} \qquad (5)$$

where
$f_m$: offset frequency from the carrier (Hz);
$I_{\mathrm{ac}}(f_m)$: amplitude of ac current component with frequency $f_m$ (A);
$|Z_f(f_m)|$: impedance of the loop filter at $f_m$ (V/A);
$K_{\mathrm{vco}}$: VCO gain (Hz/V).

The value of $I_{\mathrm{ac}}(f_m)$ is twice the value of the loop-filter dc leakage current [12] in loops operating with well-designed charge pumps. In cases where the charge pump has charge-sharing problems and/or charge injection into the loop filter, $I_{\mathrm{ac}}(f_m)$ may become dominated by these second-order effects. The imperfections can lead to spurious components with (much) higher amplitudes than would be expected based on the leakage current alone.

Rearranging the above equation leads to a formula that relates the required filter attenuation at $f_m$ to the specified maximum level of spurious signals $A_{\mathrm{spurious}}$, to the dc leakage current $I_{\mathrm{leak}}$, and to the VCO gain $K_{\mathrm{vco}}$:

$$|Z_f(f_m)| = \frac{f_m}{I_{\mathrm{leak}}\,K_{\mathrm{vco}}}\,10^{A_{\mathrm{spurious}}/20} \qquad (6)$$

Fig. 10. Loop-filter configuration, charge-pump currents, and component values used in the global car-radio tuner IC.

The relevant values of $f_m$ equal $f_{\mathrm{ref}}$ and its harmonics in a standard PLL operating with a reference frequency of $f_{\mathrm{ref}}$ Hz. Therefore, the required loop-filter (trans)impedance for these frequencies can be readily calculated. The VCO gain, the spurious specification, and the expected (maximum) leakage current are known.
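As an illustration of how the loop-filter impedance follows from a spurious specification, the sketch below assumes the narrowband-FM relation in which the sideband level is half the modulation index, with the ac current equal to twice the leakage current as stated in the text. All numbers are hypothetical: a 1-nA leakage current, a spurious target of −70 dBc, and a VCO gain of 13 MHz/V (a rough average of the 150–250 MHz over 0.5–8 V tuning range quoted later in the paper).

```python
import math


def spur_level_dbc(i_ac_a, z_ohm, k_vco_hz_per_v, f_m_hz):
    """Spur level for a small ac ripple current (narrowband FM assumption)."""
    peak_deviation_hz = i_ac_a * z_ohm * k_vco_hz_per_v
    return 20.0 * math.log10(peak_deviation_hz / (2.0 * f_m_hz))


def max_filter_impedance_ohm(spur_spec_dbc, i_leak_a, k_vco_hz_per_v, f_m_hz):
    """Largest |Z_f(f_m)| meeting the spec, assuming I_ac = 2 * I_leak."""
    return f_m_hz * 10.0 ** (spur_spec_dbc / 20.0) / (i_leak_a * k_vco_hz_per_v)


# Hypothetical design point at the 100-kHz reference offset.
i_leak = 1e-9   # A, assumed leakage current
k_vco = 13e6    # Hz/V, assumed average VCO gain
z_max = max_filter_impedance_ohm(-70.0, i_leak, k_vco, 100e3)
print(z_max)                                              # roughly 2.4 kOhm
print(spur_level_dbc(2.0 * i_leak, z_max, k_vco, 100e3))  # back to about -70 dBc
```

Note that the loop bandwidth does not appear anywhere in the calculation; only the filter transimpedance at the offset frequency matters, which is the point made in the text that follows.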
Fig. 11. Bode plots of the adaptive loop during frequency jumps and in-lock.

An important conclusion to be taken from the above equations is that the amplitude of the spurious signals is not dependent on the absolute value of loop bandwidth. Instead, it is determined by the (trans)impedance of the loop filter. This means that, at least in principle, “any” spurious specification can be achieved simply by decreasing the impedance level of the loop filter. In practice, this is not a viable option because the PLL loop bandwidth is proportional to the value of the loop-filter resistor and to the charge-pump current [6].

For a constant value of the loop bandwidth, a decrease of the loop-filter impedance level requires a proportional increase of the nominal charge-pump current. This leads to difficulties in the charge-pump design and to higher power dissipation. To avoid these difficulties, more RC sections are added to the basic loop-filter configuration, so that the filter attenuation at higher frequencies is increased. Additional RC sections, however, inevitably cause phase lag at lower frequencies. The phase lag decreases the loop phase margin and increases the settling time in high-bandwidth mode.

Therefore, to provide optimal settling, low-power dissipation, and good spurious performance, one has not only to switch the value of the loop bandwidth but also to bypass (some) RC sections of the loop filter. The PLL architecture presented here complies with these requirements.

IV. ADAPTIVE PLL ARCHITECTURE

A. Basic Architecture

The basic idea is to have two loops working in parallel, as depicted in Fig. 9. Loop 1, built around PFD1 and CP1, is dimensioned for in-lock operation. Loop 2, built around PFD2, DZ, and CP2, is dimensioned for fast settling time. Loop 1 operates all the time, whereas Loop 2 is only active during tuning
actions. Loop 1 and Loop 2 share the crystal oscillator, the reference divider, and the main divider.

A smooth takeover from Loop 1, after a frequency jump, avoids “secondary glitch” effects. The high-current charge pump CP2 is only active during tuning. CP2 is controlled by the dead-zone (DZ) block. DZ generates a smooth transition into a well-defined dead zone for CP2 when lock is achieved, so that sudden disturbances of the VCO tuning voltage are avoided.

Additional freedom for optimization of the loop parameters is obtained by using two separate charge-pump outputs and by applying the charge-pump currents to different nodes of the loop filter. In this way, the location of the zeros for frequency jumps and in-lock can be set in a continuous way, without switching of loop components, which is a source of “secondary glitch” problems. Furthermore, the path from Icpl to Vtune may contain additional filtering sections for, e.g., attenuation of spurious signals and/or fractional-N quantization noise [14]. These filter sections may be bypassed by Icph to increase the phase margin in high-bandwidth mode.

B. Loop-Filter Implementation

The ideas described above are demonstrated with the help of Figs. 10 and 11. Fig. 10 presents the loop-filter configuration and component values used in the global tuner IC (Fig. 1). Fig. 11 shows the optimized Bode diagrams of the adaptive PLL (in FM mode) with the loop filter of Fig. 10.

During frequency jumps both CP1 and CP2 are active; the loop filter zero frequency is $1/(2\pi R_b C_a)$ and lies at a high frequency, matching the 0-dB open-loop frequency. It enables stability and fast tuning to be achieved. The nominal loop bandwidth in this mode is 3.2 kHz, and the phase margin is 50°. After the frequency jump only CP1 is active. The zero of the loop filter moves to a lower frequency, $1/(2\pi (R_a + R_b) C_a)$, without the switching of loop-filter components. The low-frequency zero increases the phase margin in-lock.

When the loop is in-lock, an extra pole is introduced at $1/(2\pi R_c C_c)$, which increases the 100-kHz reference suppression by about 20 dB. During frequency jumps, these elements are bypassed by CP2, increasing the phase margin in high-bandwidth mode. If the loop bandwidth were increased by simply switching the amplitude of CP1, one would end up with an unstable loop, because of a phase margin of less than 10° in high-bandwidth mode.

C. Dead-Zone Implementation

The new element in the adaptive PLL architecture is the combination of the DZ block with the high-current charge pump CP2. The function of DZ is to provide CP2 with a well-defined dead zone, specified in seconds. The dead zone is centered symmetrically around the locking position of charge pump CP1 [see Fig. 13(a)].

The logic diagram of the DZ/CP2 combination is depicted in Fig. 12. The figure shows how the different logic functions influence the duty cycle of the up and dn signals from the phase frequency detector (PFD2). At the input of DZ, the up and dn signals have a finite duty cycle, even for an in-lock situation. The finite duty cycle eliminates dead-zone problems in CP1. The XOR and AND gates are used to cancel the finite in-lock duty cycle. The processed up and dn signals are then applied to low-pass filters and slicers, whose function is to prevent pulses that have too small a duty cycle from reaching CP2. The cutoff frequency of the low-pass filters, the discrimination level of the slicers, and the turn-on time of CP2 determine the size of the dead zone around the lock position.

A tradeoff among settling performance, circuit implementation, and robustness arises when the magnitude of the dead zone has to be determined. Let us start by discussing circuit aspects.

Fig. 13. Shift in locking position as function of VCO tuning voltage.

The dead zone of charge pump CP2 should be centered around the locking position of the loop for optimum settling and spectral purity performance. The locking position, however, is a function of the output voltage of charge pump CP1. The effect is depicted in Fig. 13. One sees that, as the tuning voltage Vtune increases, there is a shift of the locking position to positive values of the time difference between the phase detector inputs. The reason lies in the finite output resistance of the active element used in CP1. Different current gains in CP1's UP and DOWN branches need to be compensated by up and dn signals with different duty cycles at the locking point. Different duty cycles are accomplished by a shift in the loop's locking position.

Fig. 13 shows situations where the gain in the UP branch of the pump decreases as Vtune increases. The ideal operating situation is depicted in Fig. 13(a). Situation (b) is still allowed from the point of view of spectral purity but has asymmetrical settling performance. Finally, (c) depicts a situation that should never happen: the locking position shifts so much that the high-current charge pump CP2 becomes active and degrades the in-lock spectral purity. Therefore, increasing the size of CP2's dead zone eases the design of charge pump CP1 and increases the robustness of the system.

On the other hand, the size of CP2's dead zone influences the settling performance of the adaptive loop. The influence of the dead zone on the transient response was simulated with behavioral models. The results are displayed in Fig. 14, together with the settling requirements that ensure inaudible background scanning functionality. Table II presents the settling time for different settling accuracies and different values of the dead zone. A dead-zone value of infinity corresponds to the situation where only CP1 is active
TABLE II
SIMULATED IN-LOCK SNR AND SETTLING TIME (ms) FOR A 20-MHz FREQUENCY JUMP FOR DIFFERENT VALUES OF THE DEAD ZONE AND DIFFERENT SETTLING ACCURACIES
Fig. 15. Micrograph of the tuner IC.

(nonadaptive loop). Table II shows that by using the adaptive loop architecture, it is possible to combine fast settling time with good SNR in-lock. Increasing the dead zone leaves more “residual” phase (and frequency) error to be corrected by the small bandwidth loop. The closer one comes to the locking point in high bandwidth mode, the shorter the total settling transient will be. A dead-zone value of 15 ns is a good compromise for the intended application.
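The interplay between the two charge pumps and the dead zone can be illustrated with a very small behavioral model. This is only a sketch and not the model behind Fig. 14 or Table II: it assumes ideal pumps, a hard dead-zone edge (the text describes a smooth transition), CP2 conducting only for the part of the time error that exceeds the dead zone, a 100-kHz reference, and hypothetical currents of 10 μA for CP1 and 1 mA for CP2.

```python
def average_pump_currents(dt_s, t_ref_s=10e-6, i_cp1_a=10e-6, i_cp2_a=1e-3,
                          dead_zone_s=15e-9):
    """Average CP1 and CP2 currents (A) per reference period for time error dt_s.

    CP1 is always active; CP2 contributes only outside the dead zone
    (hard edge, an assumption made for this illustration).
    """
    cp1 = i_cp1_a * dt_s / t_ref_s
    excess_s = max(abs(dt_s) - dead_zone_s, 0.0)
    sign = 1.0 if dt_s >= 0 else -1.0
    cp2 = sign * i_cp2_a * excess_s / t_ref_s
    return cp1, cp2


# Near lock (10 ns error): only the low-current pump acts on the loop filter.
print(average_pump_currents(10e-9))
# During a frequency jump (115 ns error): CP2 dominates the correction.
print(average_pump_currents(115e-9))
```

A wider dead zone in this model hands control to CP1 earlier, leaving a larger residual error for the slow loop, which matches the tradeoff described above.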
V. CIRCUIT IMPLEMENTATION
A die micrograph of the total tuner IC is displayed in Fig. 15. The adaptive PLL has been integrated with the other functional blocks of Fig. 1 in a 5-GHz, 2-μm bipolar technology [15].

Fig. 16. Architecture of the main programmable divider.

A. Programmable Dividers

The architecture of the main divider is depicted in Fig. 16. The high-frequency part of the programmable divider is based on the programmable prescaler concept described in [12] and consists of a chain of 2/3 divider cells. The modular architecture enables easy optimization of power dissipation and robustness for process variations. The division range of the basic prescaler configuration is extended by the low-frequency programmable counter. The logic functions of the PLL were implemented with current routing logic techniques (CRL) [12], [16]. The low-frequency parts of the main and reference dividers operate with low current levels to limit total power dissipation. To decrease the phase noise of the reference signal going to the phase detectors, this signal is reclocked in a high-current D-flip-flop (D-FF). The clean crystal signal is used to clock the D-FF. The total main divider current consumption is 5 mA. The first 2/3 cell consumes 2.1 mA.
B. Oscillators
The LC VCO uses an external tank circuit. It can be tuned
from 150 to 250 MHz, with a voltage tuning range from 0.5
to 8 V. The VCO phase noise is −100 dBc/Hz at 10 kHz, for
a carrier frequency of 237 MHz. The VCO core consumes
1.5 mA. The 20.5-MHz reference crystal oscillator operates
in linear mode, to avoid harmonics interfering in the FM
reception bands. Quadrature generation for the image rejection
FM mixers (see Fig. 1) is accomplished in a divide-by-two
circuit (FM DIV), with the exception of reception in the American
Weather Band (WX). In that case, I/Q signals are generated
with an RC-CR network directly from the VCO. This avoids the
need to have the VCO operating at 346 MHz, and a change in
the LC VCO tuned circuit during WX reception.
C. Charge Pumps
Fig. 17 shows the simplified circuit diagram of the low-current charge pump CP1. The up and dn signals from the phase detector drive the input differential pairs, which switch the currents in the PNP current switches Q1 and Q2 on and off. The collector outputs of Q1 and Q2 are kept at equal dc levels by the dc feedback arrangement provided by Q3 and Q4. This prevents asymmetry in the source and sink currents, ensuring good centering of the charge-pump characteristics for all tuning voltages. Q5 and Q6 provide means for stabilization of currents and for speeding up the switching of Q1 and Q2. The reset circuits monitor the currents in Q1 and Q2 and generate the reset signals RST Up and RST Dn. These signals are fed back to reset the phase detectors. The high-current charge pump CP2 is a scaled-up version of the CP1 circuit, without the reset circuits.

Fig. 19. Settling transient for a 20-MHz tuning step.
Fig. 20. Spectral purity measurements in FM mode: (a) reference spurious breakthrough and (b) close to the carrier.
Fig. 21. Evaluation of the FM channel: VCO purity determines SNR for V > 300 μV. Fin = 97.1 MHz, AF freq = 1 kHz. SNR meas.: FMdev = 22.5 kHz; 26 dB = 2.0 μV. THD meas.: FMdev = 75 kHz.

VI. MEASUREMENTS

The measured charge-pump currents as a function of the time difference between the phase detector inputs are shown in Fig. 18. Good centering of the two charge-pump outputs is observed, and there is enough margin for variations in the in-lock position of CP1. The measured settling transient response is displayed in Fig. 19. The settling performance complies with the settling requirements and enables inaudible background scanning in single-tuner RDS applications.

The frequency spectrum of the VCO in FM mode is presented in Fig. 20(a) and (b). Fig. 20(a) shows the spurious reference breakthrough at 100 kHz to be under −81 dBc. There is a further 6-dB improvement in noise and spurious breakthrough before the VCO signal reaches the FM mixers, due to the division by two in the FM DIV divider (see Fig. 1). Fig. 20(b) displays the phase noise spectrum close to the carrier. Spectrum measurements done in AM mode showed a reference spurious breakthrough of −57 dBc, at an offset of 20 kHz from the carrier. For AM, the improvement in phase noise and spurious performance amounts to 26 dB, due to the division by 20 in between the VCO and the AM mixers.

Finally, the SNR and THD of the total FM receiver chain are displayed in Fig. 21 as a function of the antenna input signal level. For low values of the input level, the noise is dominated by RF input noise and by the quality of the building blocks in the signal processing chain: low-noise amplifier, mixers, and demodulator. For high values of the input level (> 300 μV), the dominant noise source becomes the LO signal. The excellent measured FM sensitivity, 2.0 μV for 26-dB SNR, and the ultimate SNR of 65 dB verify the spectral purity of the tuning system and of the RF channel.

VII. CONCLUSION

This paper described an adaptive PLL architecture for high-performance tuning systems. The relationships of performance aspects to design variables were presented. It is demonstrated that design for spectral purity performance often leads to suboptimal settling performance, because of different requirements on the loop bandwidth and on the location of the zeros and poles of the closed-loop transfer function. The adaptive architecture described here resolves these contradictory requirements, without the necessity of switching circuit elements in the loop filter. The adaptation of loop bandwidth occurs continuously, as a function of the phase error in the loop, and without interaction from outside of the tuning system. During frequency jumps, high bandwidth and high phase margin are obtained by bypassing filter sections. When the loop is locked, the architecture allows heavy filtering of spurious signals. The implementation of the dead-zone block was presented, and the basic tradeoffs of the concept were discussed. The adaptive PLL was optimized for use in a multiband (global) car-radio tuner IC, which features inaudible background scanning. Design and architecture of the PLL building blocks were discussed, and measurement results were presented. The integrated adaptive PLL tuning system achieved state-of-the-art settling and spectral purity performance in its class (integer-N PLLs). It simultaneously fulfills the speed requirements for inaudible frequency hopping and the demanding SNR specification of 64 dB.

ACKNOWLEDGMENT

The author wishes to thank D. Kasperkovitz for technical support during the project, K. Kianush for his tireless disposition in bringing the car-radio project to a successful end, H. Vereijken for the optimization and layout of the synthesizer building blocks, B. Egelmeers for the implementation and evaluation of the concept in a bread-board functional model, and G. van Werven for the measurements.
REFERENCES

[1] B. Razavi, “A 900 MHz/1.8 GHz CMOS transmitter for dual-band applications,” IEEE J. Solid-State Circuits, vol. 34, pp. 573–579, May 1999.
[2] K. Kianush and C. S. Vaucher, “A global car radio IC with inaudible signal quality checks,” in IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers, 1998, pp. 130–131.
[3] W. P. Robins, Phase Noise in Signal Sources, 2nd ed., ser. 9. London, U.K.: Inst. Elect. Eng., 1996.
[4] H. Adachi, H. Kosugi, T. Awano, and K. Nakabe, “High-speed frequency-switching synthesizer using fractional N phase-locked loop,” IEICE Trans. Electron., pt. 2, vol. 77, no. 4, pp. 20–28, 1994.
[5] U. L. Rohde, RF and Microwave Digital Frequency Synthesizers. New York: Wiley, 1997.
[6] F. M. Gardner, “Charge-pump phase-lock loops,” IEEE Trans. Commun., vol. 28, no. 11, pp. 1849–1858, Nov. 1980.
[7] H. Meyr and G. Ascheid, Synchronization in Digital Communications. New York: Wiley, 1990.
[8] F. M. Gardner, Phase-Lock Techniques. New York: Wiley, 1979.
[9] B. Razavi, Ed., Monolithic Phase-Locked Loops and Clock Recovery Circuits. New York: IEEE Press, 1996.
[10] V. F. Kroupa, “Noise properties of PLL systems,” IEEE Trans. Commun., vol. C-30, pp. 2244–2252, Oct. 1982.
[11] C. S. Vaucher, “Synthesizer architectures,” in Analog Circuit Design, R. J. van de Plassche, Ed. Norwell, MA: Kluwer, 1997.
[12] C. Vaucher and D. Kasperkovitz, “A wide-band tuning system for fully integrated satellite receivers,” IEEE J. Solid-State Circuits, vol. 33, no. 7, pp. 987–998, July 1998.
[13] K. Nagaraj, “Adaptive charge pump for phase-locked loops,” U.S. Patent 5 208 546, 1993.
[14] B. Miller and B. Conley, “A multi-modulator fractional divider,” in Proc. IEEE 44th Annu. Symp. Frequency Control, 1990, pp. 559–567.
[15] Philips Semiconductors, TEA6840H global car-radio tuner datasheet, 1999.
[16] W. G. Kasperkovitz, “Digital shift register,” U.S. Patent 5 113 419, 1992.

Cicero S. Vaucher (M’98) was born in São Francisco de Assis, Brazil, in 1968. He graduated in electrical engineering from the Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil, in 1989. He joined the Integrated Transceivers group of Philips Research Laboratories, Eindhoven, The Netherlands, in 1990, where he works on implementations of low-power building blocks for frequency synthesizers, on synthesizer architectures for low-noise/high-tuning-speed applications, and on CAD modeling of PLL synthesizers.
IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 35, NO. 10, OCTOBER 2000 1445
Abstract: A phase-locked loop (PLL) with a fast-locked discriminator-aided phase detector (DAPD) is presented. Compared with the conventional phase detector (PD), the proposed fast-locked PD reduces the PLL pull-in time and enhances the switching speed, while maintaining better noise bandwidth. The synthesizer has been implemented in a 0.35-μm CMOS process, and the output phase noise is −99 dBc/Hz at 100-kHz offset. Under the supply voltage of 3.3 V, its power consumption is 120 mW.

Index Terms: Bandwidth adjusting, fast acquisition, fast locking, frequency synthesizers, phase detectors, phase-locked loops.

I. INTRODUCTION

PHASE-LOCKED loop (PLL) circuits have been found to be useful wherever there is a need to synchronize a local oscillator with an independent incoming signal, such as serial data links and RF wireless communications. In order to optimize the loop performance, some features should be taken care of [1], [2]. First, to minimize output phase jitter due to external noise, the loop bandwidth should be made as narrow as possible. Second, to minimize output jitter due to internal oscillator noise, or to obtain best tracking and acquisition properties, the loop bandwidth should be made as wide as possible. These principles obviously oppose each other, and therefore some compromise between them is always inevitable. The block diagram of a PLL with a discriminator-aided phase detector (DAPD) is shown in Fig. 1. One could leave the discriminator connected permanently and/or merely weight the relative contributions of the system so as to obtain the desired damping. The discriminator-aided path helps to lock the PLL quickly. Once the PLL is in lock, a better bandwidth can be maintained while the discriminator is disconnected.

In this paper, a novel DAPD is presented to reduce pull-in time and to enhance the switching speed of the PLL, while maintaining the same noise bandwidth and avoiding modulation damping. Section II describes the basic concept of the proposed structure. Sections III and IV present the realization and the measurement of the system, respectively, and Section V concludes the paper.

Manuscript received November 30, 1999; revised April 28, 2000. This work was sponsored by the National Science Council under Contract 88-2219-E-002-024.
C.-Y. Yang was with the Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan 10617, R.O.C. He is now with the Department of Electronic Engineering, HuaFan University, Taipei, Taiwan 223, R.O.C.
S.-I. Liu is with the Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan 10617, R.O.C.
Publisher Item Identifier S 0018-9200(00)08697-2.

II. BASIC IDEA AND MODEL

A simple charge-pump PLL consists of four major blocks: the phase detector (PD), the charge-pump circuit, the loop filter, and the voltage-controlled oscillator (VCO) [3]–[6]. Fig. 2 shows the linear model of a charge-pump PLL-based frequency synthesizer. The closed-loop transfer function can be represented as

(1)

The conventional PD is implemented in conjunction with a charge-pump loop filter in the PLL, as illustrated in Fig. 3. To determine the transfer function of the PD, assume that there is a time interval between the two input signals of the PD, that the output current of the charge-pump circuit is a pulse of the same duration, and that the pulse amplitude equals the charge-pump current. In the continuous-time approximation, the average value per input signal period can be given as

(2)

The transfer function curve of a linear PD is shown in Fig. 4(a), where the vertical axis represents the charge injected into the loop filter during one period of the input signal. The characteristic of a nonlinear PD, as shown in Fig. 4(b), can be divided into two regions [7]. It has the same characteristic within the lock-in region as that of the linear PD, but the acquisition time will be reduced with the steeper characteristic outside the lock-in region. When designing a PLL with the nonlinear PD, first the central slope is determined to fulfill the requirement of noise and modulation for the PLL with a standard PD. Then, the slope outside the central region is gradually increased to improve acquisition speed. The proposed nonlinear PD can be built with delay cells and standard PD circuits, as shown in Fig. 4(c). The standard PD is a digital circuit, triggered by the positive edges of the input reference signal and the output feedback signal. Considering the delay cells, the PDs decide the position of the phase difference among these regions. According to the value of the time difference, the charge pump will output the corresponding current controlled by the up signals or the down signals. The behavior model of the nonlinear PD can be explained by the waveforms of Fig. 4(d). According to the time difference between both input signals, the up signals are used to increase and the down signals are used to decrease the frequency of the feedback signal. The nonlinear PD always generates the right signal to equalize the frequency of both input signals, as the conventional PD does. The time interval is positive
0018–9200/00$10.00 © 2000 IEEE
(4)
(5)
Fig. 4. (a) Characteristic of the conventional linear phase detector. (b) Characteristic of the nonlinear phase detector. (c) Block diagram of the nonlinear phase detector. (d) Operation of the nonlinear phase detector.

the sake of stability, the charge-pump current becomes instead of outside the locked-in region, and the loop-filter resistor would become instead of while increases times, i.e., the loop bandwidth increases times. It may speed up the switching capability of the PLL. Once it is locked on the correct frequency, the PLL will then return to the low-noise operation.

III. CIRCUIT REALIZATION

A. Architecture

The designed frequency synthesizer integrates the proposed DAPD, the charge-pump circuit, a prescaler, and a VCO in a single CMOS chip. It is similar to the structure of a conventional integer-N frequency synthesizer, as shown in Fig. 5. By adding
the frequency-doubling block, the output frequency can be up are placed on a factor of four below and above ,
to 900 MHz from a 450-MHz VCO. respectively. In addition, a pump current of 560 A is applied
and the parameter is chosen. The values of the resistors
B. Phase Detector with DAPD and Charge-Pump Filter and and the capacitors and are 470 , 235 , 33 nF,
A schematic diagram of the DAPD is shown in Fig. 6. The phase and 2.2 nF, respectively. The open-loop gain response is depicted
frequency detectors are used to compare the phase difference of in Fig. 7. Curve (a) is the characteristic of the PLL with the DAPD
both input signals. The output signal ( ) of the DAPD de- while the bandwidth is 120 kHz. However, the PLL will return
pends on the phase difference of both input signals whether it is curve (b) with the bandwidth of 40 kHz when it is near in lock.
larger than or not. Considering the delay cells with delay These curves give the same phase margin of approximately 60 .
, which is very small but never negligible, the DAPD decides Thus the PLL would be usually stable.
the operating bandwidth of the loop filter. When leads , the Currently, most frequency synthesizers use phase-frequency
time difference islarger than , and is “low”and is detectors (PFDs) as their PDs. A PFD is a sequential circuit which
“high.” Otherwise, when lags is smaller than , and can not only detect the phase error but also provides a frequency-
is “high” and is “low.” In a word, if the absolute value sensitive signal to aid acquisition when the loop is out of lock.
of the time difference between input signals is larger than , The drawback of some conventional PFDs is a dead zone in the
may appear “high” level. Also, the charge-pump current be- phase characteristic, which generates the phase error in the output
comes andtheresistoroftheloopfilterbecomes ,i.e., signals. To solve this problem, a dynamic CMOS PFD is adopted
. Until the absolute value of is within and as shown in Fig. 8(b), which is similar to the one proposed in
are both “high,” thus is brought to “low” level, then [10]. The PFD consists of two half-transparent registers, shown
the charge-pump current and the resistor return to and , re- in Fig. 8(a), [9] and a NAND gate. It is triggered by the negative
spectively, with a narrower bandwidth for better noise rejection. edge of input signals. The timing diagram of the PFD is shown in
However, the delay cell is adopted according to the VCO’s noise. Fig. 8(c). Even though the input signals are in-phase, the glitches
Assuming that the phase characteristic of the signal is , caused by the reset path always exist. So, extra filters are added
should be larger than to make the DAPD work. in the DAPD to remove the effect of the glitches.
In our design, the loop bandwidth of the PLL equals about So far, the positive gain of the VCO is applied from the above
krad/s, and the loop gain zero and the loop pole discussion. However, since the gain of the VCO is negative as
YANG AND LIU: FAST-SWITCHING FREQUENCY SYNTHESIZER 1449
Fig. 8. Implementation of phase-frequency detector. (a) Half-transparent cell. (b) Phase-frequency detector (PFD). (c) Timing diagram of PFD.

described later, and of the PD connected to the charge pump should be interchanged. The charge pump, which is based on one described in [11], is adopted. It suppresses the charge sharing from the parasitic capacitance by a pair of switched-current sources.

C. Dual-Modulus Prescaler

The dual-modulus prescaler is the high-frequency building block in the frequency synthesizer. This circuit shown in Fig. 9 divides the frequency of the VCO output signal by a factor of 32 or 33 depending on the logic value of the controlled signal mode [12]–[15]. It consists of a synchronous divide-by-4/5 counter as the first stage and an asynchronous divide-by-8 counter as the second stage. The circuits in the first stage are fully differential, while the single-ended logic circuits are used in the second stage. To reduce the supply noise, an emitter-coupled logic (ECL)-like differential logic is used in the high-speed stage [16]. In the divide-by-4/5 circuit, the DFF is a differential flip-flop. Fig. 10 shows the schematic diagram of a NAND-gate logic flip-flop. Merging the logic gates to a flip-flop saves power and increases the operating speed. The toggle flip-flops are made by true single-phase clocking (TSPC) DFFs of [12] behind a differential-to-single buffer.
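The loop-filter component values quoted for this design (470 Ω, 33 nF, 2.2 nF) can be checked against the stated zero and pole placement and the roughly 60° phase margin. The sketch below is only an illustration: it assumes the common passive topology of a series R-C1 branch shunted by C2, assumes that the 470-Ω resistor belongs to the 40-kHz narrow-band mode, and uses the usual type-II loop phase-margin expression. None of these assumptions is confirmed by the surviving text, and the function names are hypothetical.

```python
import math


def loop_filter_corners(r_ohm, c1_farad, c2_farad):
    """Return (zero, pole) in Hz for a series R-C1 branch shunted by C2.

    The topology is an assumption; the source only lists component values.
    """
    f_zero = 1.0 / (2.0 * math.pi * r_ohm * c1_farad)
    # The extra pole sees R together with the series combination of C1 and C2.
    c_series = c1_farad * c2_farad / (c1_farad + c2_farad)
    f_pole = 1.0 / (2.0 * math.pi * r_ohm * c_series)
    return f_zero, f_pole


def phase_margin_deg(f_cross, f_zero, f_pole):
    """Phase margin of a type-II loop with one zero and one extra pole."""
    lead = math.atan(f_cross / f_zero)   # phase lead from the zero
    lag = math.atan(f_cross / f_pole)    # phase lag from the extra pole
    return math.degrees(lead - lag)


# Values taken from the text; the pairing with the 40-kHz mode is assumed.
f_zero, f_pole = loop_filter_corners(470.0, 33e-9, 2.2e-9)
pm = phase_margin_deg(40e3, f_zero, f_pole)
print(f"zero = {f_zero / 1e3:.1f} kHz, pole = {f_pole / 1e3:.1f} kHz, "
      f"phase margin = {pm:.1f} deg")
```

Under these assumptions the zero and pole land close to a factor of four below and above 40 kHz, which is consistent with the placement described in the text.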
D. VCO

The VCO is another high-frequency building block in a frequency synthesizer. Still, an ECL-like current-mode differential pair, as shown in Fig. 11, is used as a delay cell [17], [18] to achieve high common-mode rejection in a four-stage ring oscillator. The coarse tuning of the ring-oscillator’s center frequency is achieved by the bias Vbpo1 (or through the use of a digital-to-analog converter), and a fine tuning technique is needed for the PLL voltage-control path. The gain required for the oscillator is easily determined by the ratio of M1 and M2 as the current gain. The proposed delay cell has better noise performance because the circuit operates on a differential signal, which is immune to power-supply-injected and substrate-injected noise. The replica bias circuit adjusts the load over a wide range in response to a swept supply current. It ensures that the output swing of the delay cells stays fixed, and it accepts a variable bias current to cover a suitable range of output frequencies. Bypass capacitors are also an important consideration for the replica bias and voltage reference circuits. On-chip bypass capacitors can be used to help reduce their noise contribution to the ring-oscillator delay cells.

Fig. 13. Experimentally measured VCO transfer curve.

IV. MEASUREMENT RESULTS

The synthesizer is implemented in a 0.35-µm CMOS process. The microphotograph of the fabricated frequency synthesizer is shown in Fig. 12. The loop filter is off-chip, and the output signal of the VCO is connected to a source follower. The frequency synthesizer is measured at a supply voltage of 3.3 V. The frequency of the reference signal is 14 MHz. Fig. 13 shows the measured VCO transfer function obtained by varying the control voltage. The measured VCO has a monotonic frequency range of 435–485 MHz. The gain of the VCO is 32.4 MHz/V at the center frequency of 460 MHz. Fig. 14 shows the output signal spectrum (using HP8560A Spectrum Analyzer after locked) at 448 MHz, with a phase noise of −99 dBc/Hz at 100-kHz offset. With an external frequency doubler added, the phase noise is −91 dBc/Hz at 100-kHz offset from the 896-MHz carrier, as shown in Fig. 15. The measured waveform in the time domain is shown in Fig. 16, and its rms and peak-to-peak jitter measured by CSA803 (Communication Signal Analyzer)
Fig. 14. Measured output spectrum of the frequency synthesizer. (a) With span
50 MHz. (b) With span 1 MHz.
Fig. 16. Measured waveform. (a) In time domain. (b) Jitter performance.
Fig. 15. Measured output spectrum of the frequency synthesizer with added
frequency doubler.
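The measured numbers above can be cross-checked with simple arithmetic: the locked output relative to the 14-MHz reference, the carrier after doubling, and the phase-noise penalty an ideal doubler would add. The sketch below assumes the quoted phase-noise figures are negative dBc/Hz values and uses the standard 20·log10(2) rule for an ideal frequency doubler; the variable names are hypothetical and the code is not part of the original measurement procedure.

```python
import math

F_REF_MHZ = 14.0        # reference frequency from the text
F_LOCKED_MHZ = 448.0    # locked output frequency from the text
PN_LOCKED_DBC_HZ = -99.0    # at 100-kHz offset, sign assumed negative
PN_DOUBLED_DBC_HZ = -91.0   # at 100-kHz offset after the doubler, sign assumed

# Ratio between the locked output and the reference.
ratio = F_LOCKED_MHZ / F_REF_MHZ

# Carrier after the external doubler.
f_doubled_mhz = 2.0 * F_LOCKED_MHZ

# An ideal doubler doubles the phase deviation, i.e. adds 20*log10(2) dB.
ideal_penalty_db = 20.0 * math.log10(2.0)
pn_ideal_doubled = PN_LOCKED_DBC_HZ + ideal_penalty_db

# Extra degradation beyond the ideal doubling penalty.
excess_db = PN_DOUBLED_DBC_HZ - pn_ideal_doubled

print(f"ratio = {ratio:g}, doubled carrier = {f_doubled_mhz:g} MHz")
print(f"ideal doubler penalty = {ideal_penalty_db:.2f} dB, "
      f"excess over ideal = {excess_db:.2f} dB")
```

The excess over the ideal penalty would then be attributable to the external doubler and measurement setup, although the text does not state this explicitly.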
SHAHANI et al.: FREQUENCY SYNTHESIS USING APD 2237
TABLE I
MEASURED APD PLL PERFORMANCE
REFERENCES

Arvin R. Shahani, for a photograph and biography, see this issue, p. 2041.
Maria del Mar Hershenson (S’98), for a photograph and biography, see this issue, p. 2231.
Min Xu (S’97), for a photograph and biography, see this issue, p. 2231.
Thomas H. Lee (S’87–M’87), for a photograph and biography, see this issue, p. 2041.
IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 36, NO. 5, MAY 2001 777
(1)

where
PFD gain;
loop filter transfer function;
VCO gain;
dividing ratio;
output phase noise;
phase noise generated from the fractional divider.

For the edge-combining frequency synthesizer, is mainly caused by the delay mismatches between the delay cells, and is periodic because the fractional-N dividing is performed by periodic combining of the phase-shifted waveforms. Assuming that is a sinusoidal, the output voltage of the PLL is presented as

and the output spectrum of the PLL appears at . The relative power of the spurious tones is given by

dBc (3)

Fig. 6. Simulated output spectrum of the PLL with the maximum phase offset of 2.5°.

TABLE I
PARAMETERS FOR SPURIOUS NOISE SIMULATION OF THE PLL

Fig. 6 shows the simulated output spectrums of the frequency synthesizer. When the maximum phase offset is set to 2.5°, −30-dBc spurious tones are shown near the carrier. During the simulation, the design parameters of the PLL are selected as described in Table I.
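The link between a periodic phase error and the resulting spur level can be illustrated with the textbook narrowband phase-modulation approximation, in which a sinusoidal phase deviation of peak value β radians produces sidebands at about β/2 relative to the carrier. This is not the paper's equation (3), which did not survive extraction, and it ignores the filtering of the phase error by the loop, so its numbers should be read as order-of-magnitude estimates only. The function names are hypothetical.

```python
import math


def spur_dbc(peak_phase_deg):
    """Sideband level (dBc) for a small sinusoidal phase deviation.

    Narrowband PM approximation: each sideband is beta/2 of the carrier,
    with beta the peak phase deviation in radians.
    """
    beta = math.radians(peak_phase_deg)
    return 20.0 * math.log10(beta / 2.0)


def phase_deg_for_spur(spur_level_dbc):
    """Invert spur_dbc: peak phase deviation (degrees) for a spur level."""
    beta = 2.0 * 10.0 ** (spur_level_dbc / 20.0)
    return math.degrees(beta)


for offset_deg in (2.5, 0.2):
    print(f"{offset_deg} deg -> {spur_dbc(offset_deg):.1f} dBc")
```

A smaller phase offset maps directly to a lower spur, which is the motivation for calibrating out the delay mismatches.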
Fig. 7. Periodic phase error at the PFD input when PLL is locked without calibration.
the VCO. By adjusting the rising phases of the corresponding outputs, the phase errors due to the delay mismatches can be eliminated.

Since a PLL makes the average phase error zero in the locking mode, the sum of the individual phase error becomes zero when the PLL is locked. In other words, when the number of the multiphase clocks is eight

(4)

If the calibration circuit shifts the rising phase of the first output by , the phase errors are temporarily changed to

(5)

When the PLL is locked again

(6)

based on (4), and by assuming that the phase disturbance, , is equally distributed to all delay cells to satisfy (6), the phase errors become

(7)

If this operation is performed on each output of the VCO one by one, the phase errors are changed as shown at the bottom of the page, and this is the completion of one iteration of the calibration procedure. The resulted phase errors after first iteration are given by

(8)

where is the phase error due to the th output signal after cycles of calibration, and is the amount of the calibration for the th VCO output in the th iteration.

initially:
1st step:
2nd step:
...
8th step:

If the above iteration is performed continuously until is satisfied for all delay cells, the final value of the phase error due to the 1st VCO output becomes

(9)

Similarly

(10)

Consequently, all the phase errors become zero after finishing the completion of the calibration. Note that the calibration algorithm performs correctly even if the amount of the individual phase correction is different and even if the order of the calibration is changed. Fig. 8 shows the trend of phase error during the calibration, simulated by MATLAB. Here, maximum 10-mV mismatches are assumed.

V. CIRCUIT IMPLEMENTATION

A. Overall Structure

As shown in Fig. 2, a loop for the calibration is combined with the main fractional-N synthesizer. The calibration loop periodically measures the phase error due to delay mismatch at the PFD, and compensates for the mismatches by updating the offset control voltage of delay cells one after another. This update operation must be performed only when the PLL is locked,
PARK et al.: SELF-CALIBRATED PHASE-LOCKED LOOP WITH PRECISE I/Q MATCHING 781
Fig. 8. Simulated behavior of phase error during the compensation.

Fig. 9. Schematic of the delay cell having offset control capability.

Fig. 10. Detail structure of the self-calibration loop.

because the phase error due to the mismatches can be accurately measured only in the locking mode. If the calibration interval is shorter than the lock-in time of the main loop, the locking behavior of the main loop becomes disturbed and even unstable. Therefore, it is important to make not only the calibration interval long enough but also the amount of phase change, , generated by the individual calibrating operation, small enough to make sure that the main loop quickly responds. In this work, the loop gain of the calibration loop is chosen to be 1/10 of the main-loop loop gain. The updated offset control signals are maintained until the next update by a capacitor array connected to each delay cell. The delay cell used in this work is of the same type as the low-noise delay cell used in [13]. However, to control the rising phase of each output, four transistors are added, as shown in Fig. 9. For example, if is low, the rising of is pulled earlier.

B. Calibration Loop

Fig. 10 shows the circuit for the mismatch calibration. The calibration circuit consists of a PFD shared with the main loop, an additional charge pump, and a capacitor array. When is high, the PFD output signal is driven to the charge pump, and

VI. MEASUREMENTS

A self-calibrated fractional-N frequency synthesizer PLL has been fabricated in 0.35-µm CMOS technology. The microphotograph of the fabricated chip is shown in Fig. 11, and its active area is about mm². Both frequency synthesizers with and without a calibration loop have been integrated in the same chip to demonstrate the proper operation of the mismatch calibration scheme. In both cases, an external 25-MHz crystal oscillator is used as a reference clock. The bandwidth of the PLL is set to 1 MHz.

Fig. 12 shows the measured output spectrum of the fractional-N RF synthesizer. Fig. 12(a) is the output spectrum of the frequency synthesizer without a calibration loop. In this figure, the fractional dividing ratio is set to 1/8. Without calibration circuit, −30-dBc spurious noise appears at 3.125 (= 25/8) MHz offset from the carrier frequency. In this case, the maximum phase offset is estimated as about 2.5° by the equations in Section II. On the other hand, in the output spectrum of the self-calibrated frequency synthesizer, the power of the spurious tones is attenuated to −55 dB, as shown in Fig. 12(b), and the calculated maximum phase offset is less than 0.2°. Initial settling of the calibration loop takes about 5.0 ms. Fig. 13 shows the measured phase noise of the frequency synthesizer. The closed-loop phase noise at 100-kHz offset from the 1.8-GHz carrier is −105 dBc/Hz. Table II summarizes the measured characteristics of the PLL.
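The iterative calibration described earlier (shift one output's rising phase, let the PLL relock so that the eight phase errors again sum to zero, then move on to the next output) can be sketched numerically. The code below is a behavioral illustration only: the proportional correction `gain * error`, the initial error values, and all function names are assumptions, not the circuit's actual charge-pump update or the authors' MATLAB model.

```python
def relock(errors, shift, index):
    """Shift one output's phase, then share the disturbance equally.

    After relocking, the average phase error is zero again, so the applied
    shift is redistributed as shift / n over all n outputs.
    """
    n = len(errors)
    updated = list(errors)
    updated[index] -= shift
    share = shift / n
    return [e + share for e in updated]


def calibrate(errors, gain=0.5, sweeps=100):
    """Run calibration sweeps; return final errors and worst error per sweep."""
    history = [max(abs(e) for e in errors)]
    for _ in range(sweeps):
        # One iteration: correct each of the outputs one after another.
        for i in range(len(errors)):
            errors = relock(errors, gain * errors[i], i)
        history.append(max(abs(e) for e in errors))
    return errors, history


# Hypothetical initial mismatch-induced phase errors (degrees), zero-mean
# because the locked PLL forces the average error to zero.
raw = [2.5, -1.0, 0.7, -2.2, 1.4, 0.3, -0.9, -0.8]
mean = sum(raw) / len(raw)
initial = [e - mean for e in raw]

final, history = calibrate(initial)
print(f"worst error: start {history[0]:.3f} deg, end {history[-1]:.2e} deg")
```

In this simplified model the errors shrink toward zero regardless of the order in which the outputs are visited, in line with the remark that the algorithm works even when the individual corrections differ.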
Fig. 12. Output spectrum of the synthesizer PLL. (a) Without calibration. (b) With calibration.
VII. CONCLUSION
A self-calibrated 1.8-GHz PLL for fractional-N frequency synthesizing is fabricated in a 0.35-µm CMOS process. A ring-type oscillator is used to generate the multiphase signals, and a self-calibration loop reduces the output fractional spurs caused by delay mismatches between the delay cells. The phase offset of the I/Q signals from the ring oscillator is also relieved. With this calibration scheme, the fractional spur on the PLL is attenuated by 25 dB and the maximum phase offset is thereby reduced to less than 0.2°.
Fig. 13. Measured phase noise of the PLL.

TABLE II
PERFORMANCE SUMMARY OF THE SELF-CALIBRATED PLL

REFERENCES
[1] B. Razavi, RF Microelectronics. Englewood Cliffs, NJ: Prentice Hall, 1998.
[2] C. D. Hull, J. L. Tham, and R. R. Chu, “A direct-conversion receiver for 900-MHz (ISM band) spread-spectrum digital cordless telephone,” IEEE J. Solid-State Circuits, vol. 31, pp. 1955–1963, Dec. 1996.
[3] M. Steyaert, M. Borremans, J. Janssens, B. D. Muer, N. Itoh, J. Craninckx, J. Crols, E. Morijuji, H. S. Momose, W. Sansen, T. Yamaji, H. Tanimoto, and H. Kokatsu, “A single-chip CMOS transceiver for DCS-1800 wireless communications,” in ISSCC Dig. Tech. Papers, San Francisco, CA, Feb. 1998, pp. 48–49.
[4] A. Montalvo, A. Holden, W. Suter, C. Angell, S. White, N. Klemmer, and D. Homol, “A 22-mW NADC receiver IF Chip with integrated second IF channel filtering,” in ISSCC Dig. Tech. Papers, San Francisco, CA, Feb. 1999, pp. 48–49.
[5] J. L. Tham, M. A. Margarit, B. Pregardier, C. D. Hull, R. Magoon, and F. Carr, “A 2.7-V 900-MHz/1.9-GHz dual-band transceiver IC for digital wireless communication,” IEEE J. Solid-State Circuits, vol. 34, pp. 286–291, Mar. 1999.
[6] B. Razavi, “Design considerations for direct-conversion receivers,” IEEE Trans. Circuits Syst. II, vol. 44, pp. 428–435, June 1997.
[7] L. Yu and W. M. Snelgrove, “A novel adaptive mismatch cancellation system for quadrature IF radio receivers,” IEEE Trans. Circuits Syst. II, vol. 46, pp. 789–801, June 1999.
[8] Y. Sugimoto and T. Ueno, “The design of a 1-V 1-GHz CMOS VCO circuit with in-phase and quadrature-phase outputs,” in Proc. Int. Symp. Circuits and Systems, Hong Kong, June 1997, pp. 269–272.
[9] A. A. Abidi, “Direct-conversion radio transceivers for digital communications,” IEEE J. Solid-State Circuits, vol. 30, pp. 1399–1410, Dec. 1995.
[10] T. A. D. Riley, M. A. Copeland, and T. A. Kwasniewski, “Delta–sigma modulation in fractional-N frequency synthesis,” IEEE J. Solid-State Circuits, vol. 28, pp. 553–559, May 1993.
[11] M. H. Perrot, “Techniques for high data rate modulation and low power operations of fractional-N frequency synthesizers,” Ph.D. dissertation, Mass. Inst. of Technol., Cambridge, MA, 1997.
[12] U. L. Rohde, Digital PLL Frequency Synthesizers. Englewood Cliffs, NJ: Prentice Hall, 1983.
[13] C.-H. Park and B. Kim, “A low-noise 900-MHz VCO in 0.6-µm CMOS,” IEEE J. Solid-State Circuits, vol. 34, pp. 586–591, May 1999.

Chan-Hong Park (S’92) received B.S. and M.S. degrees in electrical engineering from Korea Advanced Institute of Science and Technology (KAIST), Taejon, Korea, in 1994 and 1996, respectively. He is currently working toward the Ph.D. degree in electrical engineering at KAIST.
From 1994, he has been with the Department of Electrical Engineering, KAIST, as a Graduate Researcher, where he has been involved in designing 100Base-T transceiver ICs, low-noise phase-locked loops, and RF front-ends for wireless communications. His research interests include CMOS RF circuits for wireless communication, high-frequency analog IC design, and mixed-mode signal-processing IC design.

Ook Kim (M’86) received the M.S. and Ph.D. degrees in electronics engineering from Seoul National University, Seoul, Korea, in 1988 and 1994, respectively.
He was with the Electronics and Telecommunications Research Institute, Taejon, Korea, from 1994 to 1998, and with SK Telecom, Seoul, Korea, from 1998 to 1999. Since 1999, he has been with Silicon Image Inc., Sunnyvale, CA. He was a Visiting Researcher at the Department of Electrical and Electronic Engineering, Adelaide University, Adelaide, Australia, during 1992, and a Visiting Scholar at the Department of Electrical Engineering, Stanford University, Stanford, CA, during 1999. His research interests are in CMOS mixed mode circuit design, high-speed data conversion, wireless circuit technology, and high-speed data communication.

Beomsup Kim (S’87–M’90–SM’95) received the B.S. and M.S. degrees in electronic engineering from Seoul National University, Seoul, Korea, in 1983 and 1985, respectively, and the Ph.D. degree in electrical engineering and computer sciences from the University of California, Berkeley, in 1990.
He worked as a Graduate Researcher and Graduate Instructor at the Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, from 1986 to 1990. From 1990 to 1991, he was with Chips and Technologies, Inc., San Jose, CA, where he was involved in designing high-speed signal processing ICs for disk drive read/write channel. From 1991 to 1993, he was with Philips Research, Palo Alto, CA, conducting research on digital signal processing for video, wireless communication, and disk drive applications. During 1994, he was a Consultant, developing the partial response maximum likelihood detection scheme of the disk drive read/write channel. In 1994, he became an Assistant Professor with the Department of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST), Taejon, Korea, and is currently an Associate Professor. During 1999, he took sabbatical leave at Stanford University, Stanford, CA, and at the same time, consulted for Marvell Semiconductor Inc., San Jose, on the Gigabit Ethernet and wireless LAN DSP architecture. His research interests include mixed-mode signal processing IC design for telecommunication, disk drive, and LAN, high-speed analog IC design, and VLSI system design.
Dr. Kim is a corecipient of the Best Paper Award for 1990–1991 from the IEEE JOURNAL OF SOLID-STATE CIRCUITS. He received the Philips Employee Reward in 1992. Between June 1993 and June 1995, he served as an Associate Editor for the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II.
788 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 35, NO. 5, MAY 2000
the prescaler modulus to and disabling the swallow counter. The division continues until the program counter is full and the RS latch is reset. The overall divide ratio is therefore equal to .

The pulse-swallow divider used in this work is shown in Fig. 5(b). Here, the RS latch is followed by a D flip-flop to allow pipelining of the prescaler modulus control signal. This modification is justified below. The overall divide ratio is now equal to 1. A critical decision in the design of the divider is the choice between low-swing current-steering logic and rail-to-rail CMOS logic. Simulations of the circuit with various values of , , and indicate that the minimum power dissipation occurs if the prescaler incorporates current steering, its output is converted to rail-to-rail swings, and the remainder of the circuit incorporates standard dynamic and static CMOS logic. The use of current steering in the prescaler also obviates the need for large oscillator swings, saving power in the VCO buffer.

The design of the ÷8/9 prescaler for 2.6-GHz operation presents a great challenge. Shown in Fig. 6, the prescaler consists of a synchronous ÷2/3 circuit and two asynchronous ÷2 circuits. In a conventional ÷2/3 realization [Fig. 7(a)], flip-flop FF is loaded by an OR gate, whereas FF is loaded by FF , an AND gate, and an output buffer. Since FF limits the speed, the fanout of three inherent to this topology translates to substantial power dissipation. Furthermore, if the divider is implemented by current-steering circuits, the AND gate requires stacked logic and hence level-shift source followers. Both of these issues intensify the power–speed tradeoff.

The ÷2/3 circuit used in this work is shown in Fig. 7(b). Here, FF is loaded by a NOR gate and FF by a NOR gate and a buffer. Simulations indicate that the reduction of the load capacitance of FF increases the maximum operating speed by approximately 40%.

The NOR/flip-flop combination is realized as depicted in Fig. 8. The resistors are made of n-well, and the bias voltage is generated to fall midway between the high and low levels of inputs and . The output of the prescaler drives a differential to single-ended converter, producing rail-to-rail swings for the remainder of the divider.

LAM AND RAZAVI: 2.6-GHz/5.2-GHz FREQUENCY SYNTHESIZER 791

Fig. 6. Prescaler.

Fig. 7. Divide-by-2/3 circuit: (a) conventional topology and (b) circuit used in this work.

The divider of Fig. 5 incorporates pipelining for the prescaler modulus control, thereby relaxing the minimum delay requirement in this path. Fig. 9 illustrates the issue. When the ÷9 operation of the prescaler is finished, the circuit would have at most seven cycles of to change the modulus to eight. In this particular prescaler, the timing budget is actually about five input cycles, or approximately 1.9 ns. Thus, with no pipelining, the last pulse generated by the prescaler in the ÷9 mode must propagate through the level converter, the first ÷2 stage in the swallow counter, the subsequent logic, the RS latch, and the three-input NOR gate in less than 1.9 ns. Such a delay constraint necessitates the use of current steering in this path, raising the power dissipation and complicating the design. With pipelining, on the other hand, the maximum tolerable delay increases to about eight input cycles, or approximately 3.1 ns.

Fig. 9. Pipelining in the prescaler modulus control path.
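Two pieces of arithmetic in this passage lend themselves to a quick check: the number of input cycles a conventional pulse-swallow divider consumes per output period, and the conversion of the five-cycle and eight-cycle timing budgets into nanoseconds at 2.6 GHz. The sketch below counts cycles for a generic divider with a ÷N/(N+1) prescaler, a program counter of length P, and a swallow counter of length S. These symbols and function names are assumptions, since the paper's own expressions did not survive extraction, and the extra D flip-flop of the pipelined version is not modeled.

```python
def pulse_swallow_ratio(n, p, s):
    """Count input cycles per output period of a pulse-swallow divider.

    The prescaler divides by n + 1 until the swallow counter has seen s
    prescaler pulses, then by n until the program counter reaches p.
    """
    if not 0 <= s <= p:
        raise ValueError("swallow count s must satisfy 0 <= s <= p")
    input_cycles = 0
    swallowed = 0
    for _ in range(p):              # one pass of the program counter
        if swallowed < s:
            input_cycles += n + 1   # prescaler in divide-by-(n+1) mode
            swallowed += 1
        else:
            input_cycles += n       # prescaler in divide-by-n mode
    return input_cycles


def cycles_to_ns(cycles, f_in_ghz):
    """Duration in ns of a number of input cycles at f_in_ghz."""
    return cycles / f_in_ghz


# Hypothetical counter settings with a divide-by-8/9 prescaler.
print(pulse_swallow_ratio(8, 10, 3))        # equals n * p + s
print(f"{cycles_to_ns(5, 2.6):.2f} ns without pipelining")
print(f"{cycles_to_ns(8, 2.6):.2f} ns with pipelining")
```

The counted total matches the familiar N·P + S expression for a conventional pulse-swallow divider, and the two durations agree with the approximate budgets quoted in the text.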
Fig. 11. (a) Addition of correction circuit to charge pump. (b) Simple folding circuit. (c) Folding circuit with one reference voltage.
75 µV.¹ This indicates that great attention must be paid to the design of the phase/frequency, the charge pump, and the loop filter so as to minimize the above errors.

Another source of ripple in the control voltage is the low output impedance of and in Fig. 10, especially as reaches within a few hundred millivolts of the rails. This effect creates additional mismatch between the up and down currents as a function of , potentially leading to larger reference sidebands near the ends of the tuning range. Transistors and degenerate and , respectively, alleviating this issue (another advantage of this topology over the standard charge-pump configuration).

The addition of in the circuit of Fig. 10 to suppress the ripple potentially degrades the stability of the loop. Simulations suggest that for , the settling time increases negligibly. In this design, pF, pF, and kΩ. The two capacitors can be realized by either MOSFET’s or poly-metal sandwiches, a choice determined by the control voltage range. To achieve the maximum tuning range, must approach the supply and ground rails, demanding a reasonable capacitor linearity across this range. MOS capacitors, however, exhibit substantial change as their gate-source voltage falls below the threshold. Even a parallel combination of an NMOS capacitor (connected to ground) and a PMOS capacitor (connected to ) suffers from a two-fold variation as goes from zero to . For this reason, and are formed as poly-metal sandwiches (albeit with much less density than MOS capacitors).

Another issue in the design of the loop filter of Fig. 10 relates to the thermal noise produced by . Low-pass filtered by and , this noise modulates the VCO, raising the output phase noise. The thermal noise on the control voltage per unit bandwidth is given by

(1)

where denotes the noise density of . From the narrow-band frequency modulation theory [8], we know that if a sinusoid with a peak amplitude and frequency modulates a VCO, the output sidebands fall at rad/s below and above the carrier frequency and exhibit a peak amplitude of . Approximating the noise per unit bandwidth in (1) by a sinusoid, we obtain the output relative phase noise per unit bandwidth at an offset frequency as

(2)

¹The ripple is approximated by a sinusoid here. In a more rigorous method, the ripple can be expressed as a Fourier series [7].
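The narrow-band FM argument used above can be made concrete with the standard result that a sinusoid of peak amplitude V_m and angular frequency ω_m on the control line produces sidebands of relative amplitude K_VCO·V_m/(2ω_m). The sketch below evaluates that expression. The VCO gain and the ripple frequency are hypothetical placeholders, the ripple amplitude is taken as 75 microvolts (an assumption about the unit), and the function name is not from the source.

```python
import math


def sideband_dbc(k_vco_rad_per_s_per_v, v_peak, f_mod_hz):
    """Relative sideband level (dBc) from narrow-band FM theory.

    A sinusoid v_peak * cos(2*pi*f_mod*t) on the control voltage yields
    sidebands of relative amplitude K_VCO * v_peak / (2 * omega_mod).
    """
    omega_mod = 2.0 * math.pi * f_mod_hz
    relative_amplitude = k_vco_rad_per_s_per_v * v_peak / (2.0 * omega_mod)
    return 20.0 * math.log10(relative_amplitude)


# Hypothetical numbers for illustration only.
K_VCO = 2.0 * math.pi * 500e6   # assumed gain: 500 MHz/V in rad/s/V
V_RIPPLE = 75e-6                # ripple amplitude, assumed to be 75 microvolts
F_RIPPLE = 10e6                 # assumed ripple (reference) frequency in Hz

level = sideband_dbc(K_VCO, V_RIPPLE, F_RIPPLE)
print(f"sideband level = {level:.1f} dBc")
```

Because the level scales with K_VCO and inversely with the modulation frequency, the same ripple is more harmful for a high-gain VCO or a low reference frequency.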
With the values chosen in this design, the output phase noise reaches −138 dBc/Hz at 10-MHz offset for GHz/V. While it is desirable to reduce the value of the loop-filter resistor, the required increase in the loop-filter capacitance leads to a severe area penalty because of the low density of the poly-metal capacitors. Note that since the stability factor is proportional to the resistance and to the square root of the capacitance, if the resistance is, say, halved, then the capacitance must be quadrupled to maintain a constant stability factor (for a given charge-pump current).
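The halve-the-resistor, quadruple-the-capacitor rule follows from the usual second-order charge-pump PLL expression for the damping (stability) factor, ζ = (R/2)·sqrt(I_p·K_VCO·C/(2πM)). The sketch below checks the scaling numerically. The formula is the common textbook form and is assumed here, since the paper's own expression is missing from the extracted text, and every component value is a hypothetical placeholder.

```python
import math


def damping_factor(r_ohm, c_farad, i_pump_a, k_vco_rad_per_s_per_v, divide_ratio):
    """Damping factor of a second-order charge-pump PLL (textbook form)."""
    loop_term = i_pump_a * k_vco_rad_per_s_per_v * c_farad
    return (r_ohm / 2.0) * math.sqrt(loop_term / (2.0 * math.pi * divide_ratio))


# Hypothetical values for illustration only.
R = 50e3                        # loop-filter resistor, ohms
C = 10e-12                      # loop-filter capacitor, farads
I_PUMP = 100e-6                 # charge-pump current, amperes
K_VCO = 2.0 * math.pi * 500e6   # VCO gain, rad/s/V
M = 128                         # feedback divide ratio

zeta_original = damping_factor(R, C, I_PUMP, K_VCO, M)
zeta_scaled = damping_factor(R / 2.0, 4.0 * C, I_PUMP, K_VCO, M)

print(f"original zeta = {zeta_original:.4f}, "
      f"after R/2 and 4C = {zeta_scaled:.4f}")
```

Halving R lowers the resistor's thermal noise, but keeping ζ unchanged then costs four times the capacitor area, which is the tradeoff the paragraph describes.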
D. Correction Circuit

The gain of the VCO varies substantially across the tuning range, resulting in considerable change in the settling behavior. As depicted in Fig. 11(a), it is desirable to vary the charge-pump current, , such that the product of and and hence remain relatively constant. Rather than use piecewise linearization [2], this work incorporates an analog folding technique. Fig. 11(b) shows a possible solution. Here and are off if is well below 1.1 V and hence . As approaches 1.1 V, turns on while is off. Thus, drops, reaching a low value as carries most of and a negligible current. As approaches and exceeds 1.3 V, turns on and eventually returns to . This design actually utilizes the topology shown in Fig. 11(c), where only one reference voltage is required and each differential pair provides a built-in offset by virtue of skewed device dimensions. The characteristic is similar to that shown for Fig. 11(b), with driving the current mirrors in the charge pump.

The reference voltage of 1.2 V in Fig. 11(c) assumes that the gain of the VCO reaches its maximum at V. This value is somewhat process- and temperature-dependent, limiting (according to simulations) the suppression of the VCO nonlinearity to about one order of magnitude.

V. EXPERIMENTAL RESULTS

The frequency synthesizer has been fabricated in a 0.4-µm digital CMOS technology. All of the inductors and capacitors are included on the chip. Fig. 12 is a photograph of the die, which measures 1.75 × 1.15 mm². The circuit has been tested with a 2.6-V supply.

Fig. 13. Measured spectra at 2.6 and 5.2 GHz in locked condition.

Fig. 14. Measured spectrum at 2.6 GHz.

Fig. 15. Setup for settling time measurement.

Figs. 13(a) and (b) depict the output spectra in the locked condition. The phase noise at 10-MHz offset is equal to −115 dBc/Hz at 2.5 GHz and −100 dBc/Hz at 5.2 GHz. A significant part of the phase noise at 5.2 GHz is attributed to the considerable loss of the output 50-Ω buffer. Fig. 14 shows the 2.6-GHz output along with the reference sidebands. The sidebands are approximately 53 dB below the carrier. For the 5.2-GHz output, the sidebands are buried under the noise floor.

The settling behavior of the synthesizer has also been studied. Fig. 15 illustrates the setup, where the modulus of the feedback
divider is switched periodically and the control voltage is monitored. The 0.8-pF capacitor results from the trace on the printed circuit board, and the active probe presents an input capacitance of 2 pF. Since pF and pF, the addition of these parasitics markedly degrades the stability. Therefore, a 100-kΩ resistor is placed in series with the active probe to mimic the role of and . The low-pass filter thus formed has a corner frequency comparable to the loop bandwidth, and the 0.8-pF capacitor still produces ringing in the time response. Fig. 16 shows the measured control voltage, indicating a settling time on the order of 40 µs.

Table I summarizes the measured performance of the synthesizer.

Fig. 16. Control voltage during loop settling.

TABLE I
SYNTHESIZER PERFORMANCE

VI. CONCLUSION

The speed and quality of the devices available in an IC technology directly affect the choice of transceiver architectures, synthesizer topologies, and circuit configurations. In order to optimize the overall system performance, the transceiver and the synthesizer must be designed concurrently, with particular attention to the frequency planning.

REFERENCES
[1] “Radio equipment and systems (RES); High performance radio local area network (HIPERLAN); Functional specification,” ETSI, Sophia Antipolis, France, ETSI TC-RES, July 1995.
[2] J. Craninckx and M. S. J. Steyaert, “A fully integrated CMOS DCS-1800 frequency synthesizer,” IEEE J. Solid-State Circuits, vol. 33, pp. 2054–2065, Dec. 1998.
[3] A. Rofougaran et al., “A 900-MHz CMOS LC oscillator with quadrature outputs,” in ISSCC Dig. Tech. Papers, Feb. 1996, pp. 392–393.
[4] B. Razavi, “A 1.8 GHz CMOS voltage-controlled oscillator,” in ISSCC Dig. Tech. Papers, Feb. 1997, pp. 388–389.
[5] R. B. Merril et al., “Optimization of high Q inductors for multi-level metal CMOS,” in Proc. IEDM, Dec. 1995, pp. 38.7.1–38.7.4.
[6] J. Alvarez, H. Sanchez, and G. Gerosa, “A wide-band low-voltage PLL for PowerPC microprocessors,” IEEE J. Solid-State Circuits, vol. 30, pp. 383–391, Apr. 1995.
[7] B. Razavi, RF Microelectronics. Upper Saddle River, NJ: Prentice-Hall, 1998.
[8] L. W. Couch, Digital and Analog Communication Systems, 4th ed. New York: Macmillan, 1993.

Behzad Razavi (S’87–M’90) received the B.Sc. degree from Sharif University of Technology, Tehran, Iran, in 1985 and the M.Sc. and Ph.D. degrees from Stanford University, Stanford, CA, in 1988 and 1992, respectively, all in electrical engineering.
He was with AT&T Bell Laboratories, Holmdel, NJ, and subsequently Hewlett-Packard Laboratories, Palo Alto, CA. Since September 1996, he has been an Associate Professor of electrical engineering at the University of California, Los Angeles. His current research includes wireless transceivers, frequency synthesizers, phase-locking and clock recovery for high-speed data communications, and data converters. He was an Adjunct Professor at Princeton University, Princeton, NJ, from 1992 to 1994, and at Stanford University in 1995. He is a member of the Technical Program Committees of the Symposium on VLSI Circuits and the International Solid-State Circuits Conference (ISSCC), in which he is Chair of the Analog Subcommittee. He is the author of Principles of Data Conversion System Design (New York: IEEE Press, 1995), RF Microelectronics (Englewood Cliffs, NJ: Prentice-Hall, 1998), and Design of Analog CMOS Integrated Circuits (New York: McGraw-Hill, 2000), and the editor of Monolithic Phase-Locked Loops and Clock Recovery Circuits (New York: IEEE Press, 1996).
Dr. Razavi received the Beatrice Winner Award for Editorial Excellence at the 1994 ISSCC, the Best Paper Award at the 1994 European Solid-State Circuits Conference, the Best Panel Award at the 1995 and 1997 ISSCC, the TRW Innovative Teaching Award in 1997, and the Best Paper Award at the IEEE Custom Integrated Circuits Conference in 1998. He has also served as Guest Editor and Associate Editor of the IEEE JOURNAL OF SOLID-STATE CIRCUITS and IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS.
IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 37, NO. 7, JULY 2002 835
tector (PFD) is identified as the main source of spectral pollution in fractional-N synthesizers. The design of the zero-dead zone PFD and the dual charge pump is optimized toward linearity and spurious suppression. The frequency synthesizer consumes 35 mA from a single 2-V power supply. The measured phase noise is as low as −120 dBc/Hz at 600 kHz and −139 dBc/Hz at 3 MHz. The measured fractional spur level is less than −100 dBc, even for fractional frequencies close to integer multiples of the reference frequency, thereby satisfying the DCS-1800 spectral purity constraints.
Index Terms—CMOS RF integrated circuits, ΔΣ modulator, fractional-N frequency synthesis, phase-locked loop, phase noise.
I. INTRODUCTION
Fig. 1. Principle of ΔΣ fractional-N synthesis.
digital noise coupling, the ΔΣ modulator is scheduled for integration on the digital baseband signal processing IC of the full transceiver system.
The paper describes the design of a monolithic 1.8-GHz ΔΣ-controlled fractional-N PLL frequency synthesizer. In Section II, the influence of ΔΣ noise on PLL bandwidth requirements is theoretically analyzed for multistage noise shaping (MASH) and multibit single-loop ΔΣ modulators.
Fig. 2. Third-order multibit single-loop ΔΣ modulator. The internal modulator accuracy is 16 bit. From the five output bits, only four are used for stability reasons.
B. The ΔΣ Modulators
The influence of both third-order MASH and multibit single-loop ΔΣ modulators on the spectral purity of the fractional-N synthesizer is investigated. Since the order of the integrated PLL loop filter is three, the order of the ΔΣ modulators must also be three or higher to ensure that ΔΣ noise has at least a 20-dB/dec rolloff at intermediate offset frequencies, causing no degradation of the output phase noise. Both modulators have an internal accuracy of 16 bit and 1 LSB dithering is applied to further randomize any spurious energy. The dithering sequence is third-order noise shaped to avoid an increased noise floor.
The MASH or cascade 1-1-1 modulator is chosen be-
Fig. 3. Maximum PLL bandwidth f versus the reference frequency and different ΔΣ modulator orders, for the type-II fourth-order PLL. The dashed curve is for the third-order single-loop modulator. The targeted phase-noise specification is −136 dBc/Hz at 3 MHz for DCS-1800.
Fig. 5. Simulation results. The phase error for (a) the MASH modulator and (b) the single-loop multibit modulator. The FFT of the current pulses CP [i] for (c) the MASH modulator and (d) the single-loop multibit modulator.
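As an illustration of the MASH (cascade 1-1-1) structure named above, the following Python sketch models it as three cascaded overflowing accumulators whose carry bits are combined through a digital noise-cancellation network. It is a minimal behavioral sketch under stated assumptions, not the authors' implementation: the function name `mash_111`, the default 16-bit word width, and the omission of the noise-shaped dither are all choices made here for brevity.

```python
def mash_111(frac_word, n_samples, bits=16):
    """Behavioral model of a third-order MASH 1-1-1 modulator.

    frac_word : constant input word, 0 <= frac_word < 2**bits (assumed).
    Returns a list of integer outputs whose long-run mean approaches
    frac_word / 2**bits. Dithering is intentionally left out.
    """
    modulus = 1 << bits
    acc1 = acc2 = acc3 = 0        # accumulator residues (quantization errors)
    c2_prev = 0                   # previous carry of stage 2
    c3_prev = c3_prev2 = 0        # two previous carries of stage 3
    out = []
    for _ in range(n_samples):
        # Stage 1 integrates the input; the carry is the coarse output.
        c1, acc1 = divmod(acc1 + frac_word, modulus)
        # Stage 2 integrates the residue (quantization error) of stage 1.
        c2, acc2 = divmod(acc2 + acc1, modulus)
        # Stage 3 integrates the residue of stage 2.
        c3, acc3 = divmod(acc3 + acc2, modulus)
        # Noise cancellation: c1 + (1 - z^-1) c2 + (1 - z^-1)^2 c3.
        y = c1 + (c2 - c2_prev) + (c3 - 2 * c3_prev + c3_prev2)
        out.append(y)
        c2_prev = c2
        c3_prev2, c3_prev = c3_prev, c3
    return out


if __name__ == "__main__":
    seq = mash_111(60293, 4096)   # 60293 / 65536 is roughly 0.92
    print(min(seq), max(seq), sum(seq) / len(seq))
```

The output is a multi-level integer sequence (between -3 and 4 for this structure) that would be added to the integer division number; only its average carries the fractional value, while the instantaneous deviation is the shaped quantization noise discussed in the text.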
is 26 MHz and the fractional division number is 67.92. The output frequency is 1.76592 GHz, i.e., 2.08 MHz offset from an integer multiple of the reference frequency. In Fig. 5(a) and (b), the time-domain phase error is plotted for both ΔΣ modulators. Note that the fractional-N PLL frequency synthesizer can hardly be called a phase-locked loop, since the loop is never in lock! Due to the shaping of the HF noise in the single-loop modulator, the instantaneous phase error is smaller than for a MASH modulator. This has two important consequences. First, the on-time of the charge pumps is smaller for the single-loop modulator, making it less sensitive to noise coupling from the substrate and the power supply. Second, the sensitivity to the nonlinear conversion in terms of noise leakage is reduced.
To be able to examine the effect of nonlinearities in the frequency domain, the FFTs of the charge-pump current pulses are plotted in Fig. 5(c) and (d). A noise floor appears in the output spectrum as well as spurious tones, although the output is perfectly randomized and dithered. Due to the nonlinear mixing in the PFD charge pump, noise at folds back to lower offset frequencies, similar to the effect of a nonlinear DAC in a multibit ADC. Since the noise at is much lower for the single-loop modulator, its noise leakage due to the nonlinear mixing in the PFD is also lower. In the time domain, this effect corresponds to the smaller phase excursions. The difference in phase error between MASH and single-loop modulators is reflected in a lower noise floor, i.e., a 10-dB difference. In addition, previously unnoticed spurious tones appear in the output spectrum at with .
Fig. 6 shows the ΔΣ noise of both modulators as it appears at the PLL output for an ideal (dotted) and a nonlinear conversion (solid). The results of the ideal case closely match the theoretical results of Section II-C (solid light gray). Due to nonlinearity, the simulated output spectrum of the integer-N PLL (the dash-dotted line) is seriously deteriorated by ΔΣ noise in the PLL noise bandwidth, increasing the . Especially, the MASH converter is critical in terms of in-band noise due to the higher phase error [see Fig. 5(a)], despite the inherently lower LF noise of the MASH modulator. Note that the simulations are performed without taking into account noise coupling through the substrate or power-supply lines. As a consequence, the actual spurious performance of the fractional-N PLL could be worse than simulated. The presented simulation results are for a division modulus 67.92, close to an integer multiple of the reference frequency. When analyzing division moduli in between integer multiples of the reference frequency, noise leakage is still observed, but the spurious tones are well below the phase noise.
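The numbers quoted for the simulated operating point (26-MHz reference, division number 67.92, 1.76592-GHz output, 2.08-MHz offset) can be checked with a few lines of exact arithmetic. The helper name `fractional_operating_point` is hypothetical and only serves this check.

```python
from fractions import Fraction


def fractional_operating_point(f_ref_hz, division):
    """Return (output frequency, offset from nearest integer multiple) in Hz.

    `division` is passed as a string so that Fraction keeps it exact.
    """
    n = Fraction(division)
    f_out = f_ref_hz * n
    nearest_integer = round(n)                 # nearest integer division number
    offset = abs(nearest_integer * f_ref_hz - f_out)
    return f_out, offset


if __name__ == "__main__":
    f_out, offset = fractional_operating_point(26_000_000, "67.92")
    print(f"f_out  = {float(f_out) / 1e9:.5f} GHz")
    print(f"offset = {float(offset) / 1e6:.2f} MHz from 68 x 26 MHz")
```

The nearest integer division number is 68, so the output sits 0.08 x 26 MHz = 2.08 MHz below 1.768 GHz, which is where the text reports the fractional spurs.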
DE MUER AND STEYAERT: CMOS MONOLITHIC FREQUENCY SYNTHESIZER FOR DCS-1800 839
Fig. 6. Simulation results. The ΔΣ noise at the output of the PLL for (a) the MASH modulator and (b) the single-loop multibit modulator. The results are plotted for an ideal PFD (dotted), which closely corresponds to the theoretical results (solid light gray) and for a nonlinear PFD (solid). They are compared to the simulated integer PLL phase noise (the dash-dotted line).
Fig. 7. Discrete time autocorrelation estimate of the modulator outputs for (a) the MASH modulator and (b) the single-loop multibit modulator.
The explanation for the re-emerging of spurious tones is that the modulator is unable to sufficiently decorrelate the successive output samples. To quantify the correlation in the modulator output, the discrete time autocorrelation estimate is calculated and plotted for both modulators for inputs close to an integer value (see Fig. 7). The autocorrelation calculations show correlation, although 1-LSB noise-shaped dithering is applied. The autocorrelation of the single-loop ΔΣ modulator shows large correlation peaks, explaining the higher spurious tones in the output phase-noise spectrum of the PLL. With the autocorrelation estimate, the necessary internal accuracy of the ΔΣ modulators is found to be at least 13 bits for MASH and 16 bit for single-loop ΔΣ modulators to sufficiently decorrelate the modulator output for inputs close to integers. A second possible source of tones is the downconversion of tones which are inherently present around [5], by the nonlinear mixing in the PFD. This effect can be worsened by substrate and power-supply coupling with signals at .
IV. PLL BUILDING-BLOCK CIRCUIT DESIGN
A. The Fourth-Order Type-II PLL
A fourth-order type-II PLL is integrated, including a 4-bit prescaler, a zero-dead-zone PFD, a dual charge pump, and a 3-step equalizer, together with an on-chip LC-tank VCO and a third-order dual-path 35-kHz low-pass loop filter (see Fig. 8). The equalizer performs a 3-step piecewise equalization of the loop gain, by keeping the product of the VCO gain and the charge-pump current constant. To prevent switching between different equalization states, the state transitions exhibit hysteresis.
B. The 4-Bit Prescaler
The first high-speed division of the prescaler is done with two differential single-transistor-clocked (DSTC) logic n-latches [10], forming a differential dynamic D-flip-flop. The flip-flop operates with rail-to-rail internal signals to minimize the residual prescaler phase noise [11] to levels insignificant to
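The discrete-time autocorrelation estimate that Section III uses to quantify correlation in the modulator output can be sketched as follows. This is only an illustration: the paper does not state which estimator was used, so the biased, mean-removed, normalized form and the name `autocorr_estimate` are assumptions.

```python
def autocorr_estimate(x, max_lag):
    """Biased, normalized autocorrelation estimate of a sequence.

    Returns r[0..max_lag] with r[0] == 1.0 for any non-constant input.
    Values near +/-1 at nonzero lags indicate strongly correlated
    (periodic) output, which shows up as spurious tones.
    """
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    energy = sum(v * v for v in centered)
    if energy == 0:
        raise ValueError("sequence is constant; autocorrelation undefined")
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / energy
        for lag in range(max_lag + 1)
    ]


if __name__ == "__main__":
    # A strictly periodic sequence is the worst case: full correlation
    # at every multiple of the period.
    periodic = [0, 1] * 50
    print([round(r, 2) for r in autocorr_estimate(periodic, 4)])
```

Feeding this function the output of a modulator model for an input close to an integer value would reproduce the kind of plot described for Fig. 7; a well-randomized output gives values close to zero at all nonzero lags beyond the span of the noise-shaping filter.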
Fig. 9. (a) Timing control circuit and signals to control the dummy and the output current branch of the charge pump. (b) Charge-pump circuit with (at the left) the dummy current branch, denoted by the suffix d, and the output branch.
seen in Section II-C, the loop bandwidth needs to be smaller than 62 kHz for ΔΣ noise suppression. However, to ensure sufficient suppression of the low-frequency fractional spurious tones for inputs close to the integers, the bandwidth is designed to 35 kHz. Despite the rather low loop bandwidth for a fractional-N synthesizer, a settling time of less than 293 µs for a 104-MHz step is simulated.
E. The Conversion
The nonlinear analysis of Section III identified nonlinearity of the conversion as the main cause of ΔΣ noise leakage and spurious tones. Therefore, the PFD and charge-pump circuits are carefully optimized toward spurious suppression as such and toward a highly linear phase-error detection for spurious suppression.
First, the reference spur generation by the PFD charge-pump circuit is carefully minimized. The integration in the first path of the loop filter is done actively to keep the charge-pump output at a fixed level (see Fig. 8). Additionally, the charge-pump current is designed to be at least a magnitude larger than the fixed parasitic charge injection of the switch transistors. The current switches are implemented with pMOS and nMOS transistors to compensate charge injection. Finally, a timing control scheme [Fig. 9(a)] is developed to control the charge-pump switches. The up and down control pulses of the PFD are converted to synchronized control signals to drive both the output current branch and the dummy current branch of the charge pump [Fig. 9(b)]. Fig. 9(a) shows the dummy and output control signals. The dummy control is delayed versus the output control by modifying the thresholds of the second inverter-string (indicated by high and low) such that the current always flows, preventing hard on/off switching of the current sources. To equalize rise and fall times and force a perfect rad relation between nMOS and pMOS control signals, latches at the outputs of both inverter strings are implemented. Capacitors at the control outputs lower the rise and fall times to prevent large charge injections by fast switching.
TABLE II
SUMMARY OF MEASURED SPECIFICATIONS COMPARED TO THE DCS-1800 SPECIFICATIONS
Fig. 13. Phase noise measurement with the MASH converter at 1.76592 GHz compared to the simulated ΔΣ noise at the output of the PLL (dashed), and with the simulated PLL output without ΔΣ control (dash-dotted).
single-loop multibit modulator is presented in Figs. 12 and 13. Small spurs are present at 2.08 MHz as predicted by the simulations in Fig. 6. The spur level is well below −100 dBc, due to careful PFD charge-pump design. The phase noise at 600 kHz is lower than −120 dBc/Hz.
In Fig. 12, the measured phase noise of the PLL with a multibit single-loop modulator (dark) is compared to the phase noise at integer division (light). Noise at lower offsets originates from the ΔΣ modulator due to noise folding in the PFD, as predicted by the simulations. As a result, the rms phase error is increased from 1.7° to 3°. Note that the phase noise of the PLL at integer divisions is as low as −124 dBc/Hz at 600 kHz, which is only 0.3 dB higher than predicted by the PLL simulations (see Table I). The measured results for fractional division are much noisier than predicted by simulation. The phase noise at offset frequencies close to 10 kHz is increased due to the limited memory of the data generator. The noise at higher offset frequencies is corrupted by noise coupling from the data generator. As can be seen in Fig. 10, the ΔΣ-control bonding wires, which conduct rail-to-rail, very noisy control pulses are close to the LC tank and the bonding wires of the VCO power supply. Without proper shielding, the VCO phase noise is seriously degraded by this noise coupling.
In Fig. 13, the measured ΔΣ noise and the ΔΣ noise as simulated in Section III (dashed) is compared. The dash-dotted line is the simulated phase noise of the PLL without ΔΣ control. The simulated ΔΣ noise leakage closely matches the measured results, except at very low offsets due to the limited memory. The phase noise at high offsets is increased versus the simulated PLL results due to noise coupling. Second-order tones are larger in measurements, since the models in the simulator do not include second-order effects and noise coupling. Tones at 520 kHz are believed to come from subharmonic tones present in the ΔΣ modulator output [5], which are amplified by mixing through noise coupling. When comparing the results for the MASH and the single-loop modulator, the measured results are less pronounced than the simulated results (see Fig. 6). The measured phase noise for the single-loop modulator is however a few decibels lower than for the MASH modulator. Note that all measurements are performed for frequencies close to integer multiples of the reference frequency.
The measured settling time of the PLL is 226 µs for a 104-MHz frequency step. The power consumption of the PLL is 70 mW from a 2-V power supply. The fully integrated low-phase-noise VCO is responsible for almost 66% of the total power consumption. The IC area is 2 × 2 mm², including bonding pads and bypass capacitors. Table II shows the measured specifications compared to the DCS-1800 specifications [1]. The specifications of the IC prototype comply with the DCS-1800, only the is degraded due to the limited resolution of the measurement setup.
VI. CONCLUSION
A monolithic 1.8-GHz ΔΣ-controlled fractional-N PLL frequency synthesizer is implemented in a standard 0.25-µm CMOS technology. The monolithic fourth-order type-II PLL
integrates the digital synthesizer part together with a fully integrated LC VCO, a high-speed prescaler, and a 35-kHz dual-path loop filter on a die of only 2 × 2 mm². To investigate the influence of the ΔΣ modulator on the synthesizer’s spectral purity, a fast nonlinear analysis method is developed, showing good correspondence with measurements, in contrast to the results of the theoretical analysis. Nonlinear mixing in the phase-frequency detector and the VCO is identified as the main source of spectral pollution in ΔΣ fractional-N synthesizers. MASH and single-loop multibit ΔΣ modulators are compared for use in fractional-N synthesis. Although the MASH is stable and easy to integrate, the single-loop modulator presents a better solution, showing less sensitivity to noise leakage and noise coupling and providing more flexibility. The measured phase noise is lower than −120 dBc/Hz at 600 kHz and −139 dBc/Hz at 3 MHz. The measured fractional spur level is lower than −100 dBc, satisfying the DCS-1800 spectral purity requirements. All measurements are performed for frequencies close to integer multiples of the reference frequency, where the synthesizer is most sensitive to spurious tones.
REFERENCES
[1] M. S. J. Steyaert, J. Janssens, B. De Muer, M. Borremans, and N. Itoh, “A 2-V CMOS cellular transceiver front-end,” IEEE J. Solid-State Circuits, vol. 35, pp. 1895–1907, Dec. 2000.
[2] T. Cho, E. Dukatz, M. Mack, D. Macnally, M. Marringa, S. Mehta, C. Nilson, L. Plouvier, and S. Rabii, “A single-chip CMOS direct-conversion transceiver for 900-MHz spread-spectrum digital cordless phones,” in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, San Francisco, CA, Feb. 1999, pp. 228–229.
[3] A. Rofougaran, G. Chang, J. J. Rael, J. Y.-C. Chang, M. Rofougaran, P. J. Chang, M. Djafari, J. Min, E. W. Roth, A. A. Abidi, and H. Samueli, “A single-chip 900-MHz spread-spectrum wireless transceiver in 1-µm CMOS—Part II: Receiver design,” IEEE J. Solid-State Circuits, vol. 33, pp. 547–555, Apr. 1998.
[4] M. Copeland, T. Riley, and T. Kwasniewski, “Delta–sigma modulation in fractional-N frequency synthesis,” IEEE J. Solid-State Circuits, vol. 28, pp. 553–559, May 1993.
[5] S. R. Norsworthy, R. Schreier, and G. C. Temes, Delta–Sigma Data Converters: Theory, Design and Simulation. New York: IEEE Press, 1997.
[6] B. Miller and R. Conley, “A multiple modulator fractional divider,” IEEE Trans. Instrum. Meas., vol. 40, pp. 578–583, June 1991.
[7] “Digital cellular communication system (Phase 2+); Radio transmission and reception,” Eur. Telecommun. Standards Inst., ETSI 300 190 (GSM 05.05 version 5.4.1), 1997.
[8] W. Rhee, B.-S. Song, and A. Ali, “A 1.1-GHz CMOS fractional-N frequency synthesizer with a 3-b third-order ΔΣ modulator,” IEEE J. Solid-State Circuits, vol. 35, pp. 1453–1460, Oct. 2000.
[9] The Mathworks Inc., Matlab User’s Guide, Version 5. Englewood Cliffs, NJ: Prentice Hall, 1997.
[10] J. Yuan and C. Svensson, “New single-clock CMOS latches and flip-flops with improved speed and power savings,” IEEE J. Solid-State Circuits, vol. 32, pp. 62–69, Jan. 1997.
[11] B. De Muer and M. S. J. Steyaert, “A single-ended 1.5-GHz 8/9 dual-modulus prescaler in 0.7-µm CMOS with low phase-noise and high input sensitivity,” in Proc. Eur. Solid-State Circuits Conf. (ESSCIRC), The Hague, Sept. 1998, pp. 256–259.
[12] J. Craninckx and M. S. J. Steyaert, “Low-phase-noise fully integrated CMOS frequency synthesizers,” Ph.D. dissertation, Katholieke Univ. Leuven, Belgium, 1997.
[13] B. De Muer, M. Borremans, N. Itoh, and M. S. J. Steyaert, “A 1.8-GHz highly tunable low-phase-noise CMOS VCO,” in Proc. IEEE Custom Integrated Circuits Conf. (CICC), Orlando, FL, May 2000, pp. 585–588.
[14] B. De Muer and M. S. J. Steyaert, “Fully integrated CMOS frequency synthesizers for wireless communications,” in Analog Circuit Design, W. Sansen, J. H. Huijsing, and R. J. van de Plassche, Eds. Norwell, MA: Kluwer, 2000, pp. 287–323.
[15] F. M. Gardner, Phaselock Techniques. New York: Wiley, 1979.
Bram De Muer (S’00) was born in Sint-Amandsberg, Belgium, in 1973. He received the M.Sc. degree in electrical engineering in 1996 from the Katholieke Universiteit Leuven, Belgium, where he is currently working toward the Ph.D. degree on high frequency low-noise integrated frequency synthesizers at the ESAT-MICAS laboratories.
He has been a Research Assistant with ESAT-MICAS laboratories since 1996. His research is focused on integrated low-phase-noise VCOs with on-chip planar inductors and high-speed prescaler design, leading to fully integrated ΔΣ fractional-N synthesizers in CMOS technology.
Michel S. J. Steyaert (S’85–A’89–SM’92) was born in Aalst, Belgium, in 1959. He received the M.S. degree in electrical-mechanical engineering and the Ph.D. degree in electronics from the Katholieke Universiteit Leuven (K.U. Leuven), Heverlee, Belgium, in 1983 and 1987, respectively.
From 1983 to 1986, he obtained an IWONL fellowship (Belgian National Foundation for Industrial Research) which allowed him to work as a Research Assistant at the Laboratory ESAT at K.U. Leuven. In 1987, he was responsible for several industrial projects in the field of analog micropower circuits at the Laboratory ESAT as an IWONL Project Researcher. In 1988, he was a Visiting Assistant Professor at the University of California, Los Angeles. In 1989, he was appointed by the National Fund of Scientific Research (Belgium) as a Research Associate, in 1992 as a Senior Research Associate, and in 1996 as a Research Director at the Laboratory ESAT, K.U. Leuven. Between 1989 and 1996, he was also a part-time Associate Professor and since 1997 an Associate Professor at the K.U. Leuven. His current research interests are in high-performance and high-frequency analog integrated circuits for telecommunication systems and analog signal processing.
Dr. Steyaert received the 1990 European Solid-State Circuits Conference Best Paper Award, the 1995 and 1997 ISSCC Evening Session Award, the 1999 IEEE Circuit and Systems Society Guillemin–Cauer Award, and the 1991 NFWO Alcatel-Bell-Telephone award for innovative work in integrated circuits for telecommunications.
IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 32, NO. 5, MAY 1997 691
Abstract—This paper describes an I/O scheme for use in a high-speed bus which eliminates setup and hold time requirements between clock and data by using an oversampling method. The I/O circuit uses a low jitter phase-locked loop (PLL) which suppresses the effect of supply noise. Measured results show peak-to-peak jitter of 150 ps and rms jitter of 15.7 ps on the clock line. Two experimental chips with 4-pin interface have been fabricated with a 0.6-µm CMOS technology, which exhibits the bandwidth of 960 Mb/s per pin.
Index Terms—Skew-tolerant, high speed bus, oversampling, phase locked loop, jitter, CMOS, phase frequency detector, voltage controlled oscillator.
I. INTRODUCTION
As the speed of high-speed digital systems tends to be limited by the bandwidth of pins, new I/O architectures are gaining momentum over conventional ones. The advent of 64 Mb and 256 Mb DRAM’s and faster logic chips also propels the need for high-speed I/O interface while reducing the number of pins and hence the system cost. Synchronous DRAM’s increased chip bandwidth up to 220 Mb/s/pin [1]. A revolutionary architecture using delay-locked loops (DLL’s) or phase-locked loops (PLL’s) was also successful in providing over 500 Mb/s/pin bandwidth [2], [3]. Such a narrow, high-speed bus provides large bandwidth in a small, low pin-count package, but such high-speed bus architectures inevitably require strict phase relationships between clock and data. A phase-tolerant I/O scheme was also developed previously for a point-to-point link [4]. This paper describes an I/O scheme for use in a high-speed bus which eliminates setup and hold time margins by using blind 3× oversampling and data recovery. In the new scheme, the clock line delivers only frequency information. The data receiving circuits extract phase information from the data itself. An 8-b data bus employing this skew insensitive scheme can deliver over 960 MB/s. Two experimental chips with 4-pin interface were fabricated.
In Section II, the chip architecture and the skew-tolerant I/O scheme will be presented. The circuit design techniques for low jitter PLL and other circuits are discussed in Section III. The chip layout and experimental results are presented in Section IV followed by a conclusion in the final section.
Manuscript received August 20, 1996; revised December 3, 1996.
S. Kim, K. Lee, Y. Moon, and D.-K. Jeong are with the Inter-University Semiconductor Research Center, Seoul National University, Seoul 151-742, Korea.
Y. Choi and H. Lim are with Samsung Electronics Co., Yongin-City, Kyungki-Do, Korea.
Publisher Item Identifier S 0018-9200(97)02850-3.
0018–9200/97$10.00 © 1997 IEEE
II. SYSTEM ARCHITECTURE
Two chips, bus master and bus slave, were designed. Bus masters in a system bus initiate bus transactions, and slaves respond to the tenured master. For example, a memory controller works as the master chip and a memory with a high-speed interface works as the slave chip. A simplified block diagram of the two chips is shown in Fig. 1. The bus signals are composed of 4-b wide data lines, a clock line, and a reference line. A charge pump PLL multiplies the external clock by two and generates two sets of multiphase clocks for both bit serialization and data oversampling. The relationship between the internal 12-phase clocks and the external clock is shown in Fig. 2. The first set consists of 12 multiphase clocks with 30° of phase separation. These 12 clocks are shown in Fig. 2(a) as PCK[0] to PCK[11]. These multiphase clocks were laid out to minimize the interference. Fig. 2(b) shows the multiphase clock distribution. Ground lines were inserted between each multiphase clock to minimize the interference. When one clock is switching, the adjacent clocks are guaranteed to be in stable state. This configuration minimizes coupling between clocks. The second set consists of four multiphase clocks with 90° of phase separation. This second set of multiphase clocks, TCK[0] to TCK[3], are in phase with PCK[0], PCK[3], PCK[6], PCK[9], respectively. We generate these two separate sets of clocks to equalize loading conditions.
An 8-b parallel data stream is first converted to a 4-b data stream by an internal clock and then serialized with a serialization circuit. The serializer circuit used is the same type of circuit reported in [4]. The only difference is that four phase clocks instead of ten phase clocks of the previous design are used in this design, thereby reducing area and parasitic capacitance at high-speed nodes. The serial stream is driven by a current controlled open-drain output driver. The second set of multiphase clocks, TCK[0] to TCK[3], are used by the transmitter to serialize 4 b of data. Each pin connected to a high-speed bus has 12 oversamplers and an output driver. In [6], 32 clock phases are generated to oversample the incoming data. The decision on the degree of oversampling is a tradeoff between input data phase jitter tolerance, power, and area. If too many clock phases are used per bit period, power consumption and chip area will increase. But low oversampling ratio may affect the tolerance of phase
Fig. 2. (a) External clock and 12 multiphase clocks relationship. (b) Multiphase clock layout.
jitter on the incoming data. If the phase jitter on the incoming data is low and the PLL has low jitter characteristics, the oversampling ratio can be as low as three [7]. The oversampler oversamples the bus data three times per bit using 12 phase clocks provided by a PLL. To extract correct phase information from the data stream, the high-to-low transition is inserted in each head of a packet on each pin for correct data sampling. The slaves of the bus keep oversampling the bus signals to catch the start of a bus transfer. This process is illustrated in Fig. 3. The serial input data is sampled at the rising edges of each multiphase clock. The receiver samples the serial data blindly without any constraint on setup and hold time margins. The sampled data is amplified again regeneratively to reduce possible metastability. Fig. 3 shows two high-speed bus signals, bus signal 0 and bus signal 1, with skew between them. When the signal receiver detects the first 1-to-0 transition, it selects the next bit as the first valid data. The third bit after the first valid bit is also selected as valid. It is assumed
KIM et al.: 960-Mb/s/pin INTERFACE FOR SKEW-TOLERANT BUS 693
that the next oversampled bit after the first 1-to-0 transition
was sampled near the center of data eye pattern. Each pin
of the data bus tracks the start phase of a data transfer
separately. After each pin catches the start of a data transfer,
the demultiplexed data of each pin is retimed into a single
internal clock domain. Since this process can be done in one
clock cycle, the masters can respond quickly as distance from
the signal source changes.
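The per-pin start detection and sample selection described above can be illustrated with a short Python sketch. It is one reading of the text, not the circuit itself: the function name `select_valid_samples` is hypothetical, the input is assumed to be an already-resolved list of 0/1 samples taken at three samples per bit, and the first selected sample is assumed to fall inside the low header bit that follows the 1-to-0 transition.

```python
def select_valid_samples(samples, oversampling=3):
    """Pick one sample per bit from a blindly oversampled stream.

    Finds the first 1-to-0 transition, takes the sample right after the
    first low sample as the sampling phase (assumed to be near the eye
    center), then keeps every `oversampling`-th sample from there on.
    Returns an empty list if no transition is found.
    """
    for i in range(1, len(samples)):
        if samples[i - 1] == 1 and samples[i] == 0:
            first_valid = i + 1
            return samples[first_valid::oversampling]
    return []


def make_stream(idle_samples, payload_bits, oversampling=3):
    """Build a test stream: idle high, one low header bit, then payload."""
    stream = [1] * idle_samples + [0] * oversampling
    for bit in payload_bits:
        stream += [bit] * oversampling
    return stream


if __name__ == "__main__":
    payload = [1, 0, 1, 1]
    # Two pins with different skew (different idle lengths) recover
    # the same bits because each pin tracks its own start phase.
    for idle in (4, 5, 6):
        picked = select_valid_samples(make_stream(idle, payload))
        print(idle, picked[1:])   # picked[0] is the low header bit
```

Because the phase is chosen independently for each pin from that pin's own header transition, skew between pins only shifts where the selection starts; aligning the recovered words across pins is a separate retiming step.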
Since this scheme allows skew not only in clock line but
also among data lines, there is a possibility that some of
the demultiplexed parallel data are one internal clock cycle earlier or later than the other demultiplexed data after retiming.
Fig. 6. Conventional phase frequency detector.
does not change for small changes in the input signals at the PFD. Any width of the dead-zone directly translates to jitter in the PLL and must be avoided.
To overcome the speed limitation and to reduce the dead zone, a new dynamic logic style PFD was designed. A similar dynamic comparator was reported before [9], but our implementation requires fewer transistors. Fig. 7 shows the circuit diagram of the PFD. Conventional static logic circuitry was replaced by dynamic logic gates. As a result, the number of transistors in the PFD core is reduced from 44 to 16. The critical path of this PFD, also shown in Fig. 7, consists of a three-gate feedback path. The shortened feedback path delay and dynamic operation allow high precision in the high-frequency operation.
Fig. 8 shows the relation between dead zone of PFD and the phase error of PLL. If the phase difference of EXT clock and VCO clock is smaller than the dead zone, the PFD cannot detect the phase difference. So the phase error signal of PFD will remain zero, resulting in unavoidable phase error between EXT clock and VCO clock. The minimum peak-to-peak phase error caused by this dead zone is
Minimum Peak-to-Peak Phase Error (1)
In order to avoid dead zone, the PFD asserts both UP and DOWN outputs as shown in Fig. 9. For in-phase inputs of EXT_CLK and VCO_CK, the charge pump will see both UP and DOWN pulse for the same short period of time. If there is a phase difference between EXT_CLK and VCO_CK, the width of UP and DOWN pulse will be proportional to the phase differences of the inputs. Fig. 10 shows the SPICE simulation result of the UP/DOWN pulse width differences as a function of the input phase differences. The deadzone of the PFD is significantly smaller than the measured maximum PLL jitter.
Several critical parameters of the PLL, such as speed, timing jitter, spectral purity, and power dissipation, strongly depend on the performance of the VCO. So the noise insensitivity of the VCO is very important. The VCO implemented in this design has a simple bias circuit to reject supply step noise. The processor or bus can have intervals when there is heavy circuit activity in switching large amounts of capacitance and intervals when there is very little circuit activity. This will show up as steps or impulses on the power supply of PLL [8]. The actual peak-to-peak jitter in this case becomes dominated by the peaks in the impulse transient noise response. The VCO used in the design is a six-stage differential-type ring oscillator with limited voltage swing and is shown in Fig. 11. Each stage is made up of a differential NMOS pair with variable resistance loads made of PMOS devices operating in the triode region. The bias voltage for the PMOS is generated by a replica bias circuit. The operation of this bias circuit is shown in Fig. 12. The voltage dynamically tracks the supply variations. The replica bias circuit which consists of replica delay cell and an op-amp sets the minimum voltage level of the internal VCO swing to . The signal is generated by two resistors and one capacitor. When the supply rail is quiet, the voltage swing of the internal VCO is - . Let us assume that there is a supply voltage step variation of at some point. After the
Fig. 12. VCO operation for step supply noise.
696 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 32, NO. 5, MAY 1997
step change at the supply, the level settles to
(2)
with a time constant of
(3)
At the instant of supply step change, the voltage difference between and remains the same due to the capacitor at the generator. If - is fixed, the delay cells run a little bit faster due to the supply voltage increase instead of keeping exact constant delay. Since - remains the same temporarily, the delay cells run a little bit faster due to the increased supply voltage for a short period of time. And the voltage swing at the VCO increases with a time constant determined by and OPAMP bandwidth and approaches to
(4)
which result in the increase of one stage delay. This gives an averaging effect on the VCO delay after the supply step change, making the delay change minimized with supply step change. If we select and values for a minimum average delay change, the effect of supply step change can be nullified. The values we chose for this particular process are k k and pF.
KIM et al.: 960-Mb/s/pin INTERFACE FOR SKEW-TOLERANT BUS 697
C. Oversampler
The oversampler used in the data receiver is shown in
Fig. 14. Each oversampler is a cascaded sense amplifier and
uses four clocks for correct, timely sampling. It is very
important to reduce the probability of metastability by careful
design and layout. The same size is used for both PMOS and
NMOS in the core synchronizing amplifier to maximize the
loop bandwidth.
output. The speed limit had several causes. CMOS driving capability limitation and the signal degradation through chip packaging and printed circuit board (PCB) were among the main factors.
The skew-insensitive receiving operation was also observed. There are four high-speed pins in the prototype chip. We made a PCB with four high-speed impedance-controlled bus lines. The length of the normal lines is 12 cm. One of the high-speed signal paths was made intentionally longer than the other signals by 10 cm. The 960-Mb/s high-speed serial data was sent into the receiver. The receiver recovers the serial data into 8-b 120-MHz parallel data. Fig. 19 shows the 120-MHz recovered parallel data. The upper waveform is from the pin with a longer trace. The lower waveform is from the normal length pin. Although the two pins have different trace lengths, the chips could receive data without errors. The power dissipation at 960 Mb/s was 0.7 W for the master chip. The chip characteristics are summarized in Table I.

TABLE I
MAIN FEATURES OF THE CHIP
Core Area        3.6 mm × 0.7 mm
Technology       0.6-µm double-metal CMOS
Supply Voltage   3.3 V
Data Rate        960 Mb/s
PLL jitter       15.8 ps rms @ 960 Mb/s
Power            0.7 W fully active
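As a rough cross-check of the measurement setup described above, the sketch below works out the bit time, the parallel word rate, and the sample spacing implied by the quoted 960-Mb/s rate, the 8-b output, and the three-times oversampling mentioned in the paper. The propagation delay per unit length (`PROP_DELAY_S_PER_M`) is an assumed, typical PCB value that does not come from the paper, so the skew estimate is only indicative.

```python
# Figures quoted in the text.
DATA_RATE_BPS = 960e6      # serial data rate per pin
PARALLEL_WIDTH = 8         # bits per recovered parallel word
OVERSAMPLING = 3           # samples taken per bit
EXTRA_TRACE_M = 0.10       # extra length of the long trace (10 cm)

# ASSUMPTION (not from the paper): about 7 ns/m, a typical PCB trace delay.
PROP_DELAY_S_PER_M = 7e-9


def link_timing(data_rate_bps=DATA_RATE_BPS,
                width=PARALLEL_WIDTH,
                oversampling=OVERSAMPLING,
                extra_trace_m=EXTRA_TRACE_M,
                prop_delay_s_per_m=PROP_DELAY_S_PER_M):
    """Return basic timing figures for the oversampled serial link."""
    bit_time_s = 1.0 / data_rate_bps
    skew_s = extra_trace_m * prop_delay_s_per_m
    return {
        "bit_time_s": bit_time_s,
        "parallel_clock_hz": data_rate_bps / width,
        "sample_spacing_s": bit_time_s / oversampling,
        "skew_s": skew_s,
        "skew_in_bit_times": skew_s / bit_time_s,
    }


if __name__ == "__main__":
    for name, value in link_timing().items():
        print(f"{name}: {value:.4g}")
```

Under the assumed trace delay, the 10-cm length difference amounts to a sizeable fraction of one bit time, which gives a feel for how much skew the receiver had to tolerate in this test.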
V. CONCLUSION
A new high-speed skew-insensitive I/O scheme has been described in this paper. Two chips that incorporated the new I/O scheme using the low jitter PLL technique have been fabricated in a 0.6-µm double-metal CMOS process. Three times oversampling technique relaxed the strict requirement of setup and hold margins of high-speed chip-to-chip interfaces. Newly designed fast phase frequency detector and a high noise immunity VCO circuit improved jitter performance of PLL. The measured PLL rms jitter was 15.7 ps. Accurate multiphase clock generation for oversampling the bus signal was made possible by utilizing the low jitter PLL. By using such techniques, skew-insensitive data transfer was tested. This skew-insensitive I/O scheme is useful for high-speed ASIC-to-memory and ASIC-to-ASIC interfaces. This scheme will become more important as the chip-to-chip data transfer speed goes up.

REFERENCES
[1] M. Horiguchi et al., “An experimental 220 MHz 1 Gb DRAM,” in ISSCC 1995 Dig. Tech. Papers, pp. 252–253.
[2] M. Horowitz et al., “PLL design for a 500 MB/s interface,” in ISSCC 1993 Dig. Tech. Papers, pp. 160–161.
[3] T. H. Lee et al., “A 2.5 V CMOS delay-locked loop for an 18 Mbit, 500 Megabytes/s DRAM,” IEEE J. Solid-State Circuits, vol. 29, pp. 1491–1496, Dec. 1994.
[4] E. Reese et al., “A phase tolerant 3.8 GB/s data-communication router for a multiprocessor supercomputer backplane,” in ISSCC 1994 Dig. Tech. Papers, Feb. 1994, pp. 296–297.
[5] S. Kim et al., “A pseudo-synchronous skew-insensitive I/O scheme for high bandwidth memories,” in Proc. Symp. VLSI Circuits, June 1994, pp. 41–42.
[6] M. Bazes and R. Ashuri, “A novel CMOS digital clock and data decoder,” IEEE J. Solid-State Circuits, vol. 27, pp. 1934–1940, Dec. 1992.
[7] S. Kim et al., “An 800 Mbps multi-channel CMOS serial link with 3× oversampling,” in Proc. IEEE Custom Integrated Circuit Conf., 1995, pp. 451–454.
[8] I. Young et al., “A PLL clock generator with 5 to 110 MHz lock range for microprocessors,” IEEE J. Solid-State Circuits, vol. 27, pp. 1599–1607, Nov. 1992.
[9] H. Notani et al., “A 622-MHz CMOS phase-locked loop with precharge-type phase frequency detector,” in Proc. Symp. VLSI Circuits, June 1994, pp. 129–130.

Sungjoon Kim (S’91) was born in Pusan, Korea, on June 2, 1970. He received the B.S. and M.S. degrees in electronics engineering from Seoul National University in 1992 and 1994, respectively. Since 1994 he has been working toward the Ph.D. degree in the same university.
He spent the summer of 1995 working on the limiting factors of CMOS Gb/s transmission at SUN Microsystems, CA. His research interests include clock and data recovery for high-speed communication and high-speed I/O interface circuits.

Kyeongho Lee (S’92) was born in Seoul, Korea, on August 5, 1969. He received the B.S. and M.S. degrees in electronics engineering from Seoul National University in 1993 and 1995, respectively. He is currently working toward the Ph.D. degree in electronics engineering of the same university.
He is working on various CMOS high-speed circuits for data communication. His research interests include high-speed CMOS interface circuits, high-speed video display system, and PLL systems for Gigabit communication.

Yongsam Moon (S’97) was born in Incheon, Korea, on March 1, 1971. He received the B.S. and M.S. degrees in electronics engineering from Seoul National University in 1994 and 1996, respectively, where he is currently working toward the Ph.D. degree.
He has been working on architectures and CMOS circuits for microprocessors. His current research interests are in clock and data recovery circuits for high-speed data communication.
Deog-Kyoon Jeong (S’87–M’89) received the B.S. and M.S. degrees in electronics engineering from Seoul National University, Seoul, Korea, in 1981 and 1984, respectively, and the Ph.D. degree in electrical engineering and computer sciences from the University of California, Berkeley, in 1989.
From 1989 to 1991, he was with Texas Instruments, Dallas, TX, where he was a member of the technical staff working on the single chip implementation of the SPARC architecture. Since 1991, he has been on the faculty of the School of Electrical Engineering and the Inter-University Semiconductor Research Center, Seoul National University. His main research interests include high-speed circuits, VLSI systems design, microprocessor architectures, and memory systems.

Yunho Choi was born in Incheon, Korea, on March 29, 1960. He received the B.S. degree in electrical engineering from Seoul National University, Seoul, Korea, in 1983.
He joined Samsung Semiconductor Inc., Santa Clara, CA, in 1983, where he was engaged in the design of the 256K DRAM. Since 1986, he has been working on the design of high-density dynamic memory including synchronous DRAM at the Semiconductor Research Center, Samsung Electronic Company, Ltd., Kiheung, Korea. Currently he is in charge of specialty memory design such as graphics memory and merged DRAM and logic product development.

Hyung Kyu Lim (S’82–M’84) was born February 4, 1953, in Kyung-Nam, Korea. He received the B.S. degree from the Seoul National University, Seoul, Korea, the M.S. degree from the Korea Advanced Institute Science and Technology, and the Ph.D. degree from the University of Florida, Gainesville, all in electrical engineering, in 1976, 1978, and 1984, respectively.
Since 1976, he has been with the Semiconductor Research and Development Center, Samsung Electronics Co., Kiheung, Korea. From 1978 to 1981, he was engaged in the development of bipolar linear integrated circuits and CMOS watch chips. After finishing his Ph.D. study, he worked mainly in the area of high-density MOS memory development. Starting from a 64 Kb EEPROM design in 1984, he led various memory device research and development projects that include 256 Kb EEPROM, 16 Mb mask ROM, 1 Mb high-speed static Ram, and 1/3 inch CCD image sensor. He is currently responsible for design engineering of all MOS memory research and development projects in which dynamic RAM and specialty memories are added. He has authored or coauthored over 20 technical journal and conference papers and holds 23 patents.
Dr. Lim is a member of the IEEE Electron Device Society.
784 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 36, NO. 5, MAY 2001
Fig. 2. Block diagram of the proposed dual-loop DLL.

deviates from the value at the simulation stage according to temperature and voltage variations [2]. When the variation of is excessive, the DLL loses the lock and falls into the lock-failure cases in Fig. 1(b).
A DLL relying on quadrature phase mixing [3] has been proposed to overcome the limited range problem of the conventional DLL. The phase mixing technique using quadrature clocks provides unlimited phase shift capability. However, phase mixing uses two small slew-rate clocks to obtain linear results. Therefore, this approach has the disadvantage of the increased dynamic noise sensitivity and jitter. In the semidigital DLL [4], a digitally controlled phase interpolator uses internally generated 30°-spaced clocks through the dual DLL architecture. Although noise sensitivity issues on the phase interpolation could be alleviated by smaller interpolation intervals, inherent digital nature causes dithering around zero phase error due to continuous control-bit updates. A digital DLL architecture with infinite phase capture ranges [5] is also not free from the same dithering problem and requires a large chip area for fine delay control.

B. Proposed Dual-Loop DLL
Fig. 2 shows a block diagram of the proposed dual-loop DLL architecture [6]. This architecture is based on two loops: the reference loop and the main loop. The reference loop is locked at 180° phase shift through the conventional DLL architecture. Since the reference loop VCDL is composed of four main delay cells, each delay cell generates a 45° phase shift at locked condition. All delay cells including delay buffers are differential elements commonly controlled by the output of the charge pump. The delay cell named “3” means three parallel-connected delay cells, so that the load balance between 0° and 180° clock is preserved. The reference loop provides two differential clocks spaced by 90° to the main loop. To cover the entire 360° phase range, clocks from the reference loop are partially inverted and inputted to four sets of VCDL in the main loop. Each main loop VCDL is composed of three delay cells and generates low swing internal clocks- , , , and . These clocks experience the analog delay time control by two kinds of four control voltages generated from two main loop charge pumps. The multiplexer selects one of four clocks as and this clock feeds the clock buffer whose function is to convert low swing to full CMOS-level as well as provide the chip-wide output clock, . The drives the phase detector which compares it to the reference clock. The output of the phase detector is used by two charge pumps and four loop filters to control the delay time of each main loop VCDL. Four-to-one clock switching is implemented by the window finder and the state decoder block. The window finder monitors the boundary where the selected is switched and forces the state decoder to update the two-bit selection code at the switching event. The selection code not only controls the clock selection at the multiplexer but changes the configuration of two charge pumps and four loop filters to accommodate the clock switching. Duty cycle correction (DCC) is employed to remove the duty cycle imperfections of the input clock and the output clock . Finally, although two input clocks, and , can be merged into one clock input, lower jitter clock source is preferred as the , if possible, since it determines the jitter characteristics of the whole DLL.
In this architecture, the clock selection scheme enables the output clock to cover the entire phase range (modulo ). Furthermore, seamless clock switching is possible by optimizing the main loop VCDL delay control scheme. Moreover, the phase locking is achieved by fully analog control in all loops, so that we can apply low-skew and low-jitter techniques, established in conventional DLLs.

C. Reference Loop Design
The objective of the reference loop is to provide quadrature clocks to the main loop. Since the main loop uses these multiphase clocks as references, the phase distribution in the output clocks should be preserved against a possible harmonic lock. The reference loop phase detector depicted in Fig. 3(a) has the capability to detect and escape up to the second harmonic lock. This design is made of two level-sensitive AND/NAND logic which requires 45° and 90° clocks as well as 0° and 180° clocks. At one period lock, clocks and UP/DN output waveforms are shown in Fig. 3(b). The phase detector asserts their UP and DN outputs for equal duration due to 45° clock in order to avoid a dead-zone problem, although the phase offset of the reference loop gives negligible effects on the offset of the main loop output clock. At the second harmonic lock as shown in Fig. 3(c), the phase detector detects that the loop is in the harmonic lock due to 90° clock and asserts only UP output to escape the harmonic lock. By limiting the delay range of a delay line, there is no possibility of harmonic lock over third since the reference loop is composed only of delay cells with no additional delay elements such as the clock buffer.

D. Main Loop Design
The main loop design is focused on the selection control and delay control of the main loop VCDL to achieve the infinite delay range by using four finite-length VCDLs. Fig. 4(a) shows the conceptual timing diagram of the main loop VCDL selection control. Assuming clock is selected as , the
Fig. 4. Selection control of the main loop VCDL. (a) Conceptual timing diagram. (b) Block diagram of the control logic.
Fig. 5. Delay control of the main loop VCDL. (a) Single control with other clocks fixed. (b) Differential control with same speed. (c) Differential control with ×3 speed difference.
(3)
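The phase bookkeeping behind the dual-loop architecture can be illustrated with a short sketch. It takes from the text only the figures stated there: a reference loop of four delay cells locked at 180°, two differential clocks 90° apart, and a two-bit selection code that cycles through “00,” “01,” “10,” and “11.” The function names, the reading that inverting the two differential clocks yields four references at 0°, 90°, 180°, and 270°, and the mapping of DN or UP activity to a rotation direction are assumptions made for illustration, not details confirmed by the paper.

```python
def reference_loop_taps(num_cells=4, lock_phase_deg=180.0):
    """Phase at each tap of the reference-loop VCDL once it is locked."""
    step = lock_phase_deg / num_cells
    return [i * step for i in range(num_cells + 1)]


def main_loop_phases(quadrature_deg=(0.0, 90.0)):
    """Reference phases available to the main loop.

    ASSUMPTION: each differential clock also supplies its inverse,
    i.e. the same clock shifted by 180 degrees.
    """
    phases = {(p + inv) % 360.0 for p in quadrature_deg for inv in (0.0, 180.0)}
    return sorted(phases)


def step_selection(code, direction=1):
    """Advance the two-bit selection code by one window, wrapping modulo 4."""
    return (code + direction) % 4


def rotate(code, steps, direction=1):
    """Return the selection codes visited over `steps` switching events."""
    visited = []
    for _ in range(steps):
        code = step_selection(code, direction)
        visited.append(format(code, "02b"))
    return visited


if __name__ == "__main__":
    print(reference_loop_taps())
    print(main_loop_phases())
    print(rotate(0b11, 8))
```

Because the code wraps modulo 4, repeated switching in one direction never runs out of range, which is the sense in which four finite-length delay lines can stand in for an unlimited one.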
JUNG et al.: DUAL-LOOP DELAY-LOCKED LOOP 789
Fig. 12. Selection code waveforms with the refCLK input grounded.
Fig. 10. Simulated mismatch sensitivity characteristics of the DCC with the proposed duty detection stage.

0.57 V (nMOS) and 0.55 V (pMOS). The gate-oxide thickness is 5.8 nm. Fig. 11 shows the layout of the prototype chip. The active area of the DLL occupies 0.13 mm².
The waveforms in Fig. 12 show the two-bit selection code with the reference clock input grounded, while running the input clock at its nominal frequency of 400 MHz. In this configuration, the main loop phase detector always asserts DN signals. Therefore, the selection code is continuously updated in accordance with sequences of “00,” “01,” “10,” and “11.” This corresponds to endless rotation of the output clock through the full 0°–360° range.
Fig. 13(a) and (b) show the jitter histograms of the DLL clock output at 400 MHz. Fig. 13(a) shows 6.7 ps RMS and 54 ps peak-to-peak jitter characteristics with a quiet power supply. With a 300-mV 2.5-MHz square-wave supply noise, the peak-to-peak jitter increases to 150 ps, as shown in Fig. 13(b). The ratio of the peak-to-peak jitter to the RMS jitter is well maintained in spite of supply-noise injection. Supply-noise sensitivity is measured to be 0.32 ps/mV, consistent with the 96-ps increase in peak-to-peak jitter over the 300-mV noise amplitude.
Table I summarizes the DLL performance characteristics. The DLL operates from 150- to 600-MHz frequency range with a 2.5-V supply. Static phase error between the reference clock and the output clock of the DLL is less than 20 ps. Operating at 400 MHz, the DLL dissipates 60 mW.

V. CONCLUSION
We have described a dual-loop DLL architecture that allows the unlimited delay range by using multiple VCDLs. The reference loop generates four evenly spaced clocks without a possible harmonic lock. Clock selection in the main loop enables the DLL to cover the entire phase range and seamless clock switching is achieved by optimizing the main loop VCDL delay range control. Thus, this architecture can emulate the infinite-length VCDL with multiple finite-length VCDLs. To obtain low supply-noise sensitivity, the low-jitter scheme generates a reduced swing voltage compared to supply noise for the delay compensation of a delay line. Finally, a duty cycle corrector presents a high immunity to process mismatches with the help of two stacked source-coupled pairs configuration. A prototype fabricated using 0.25-µm CMOS technology

REFERENCES
[1] M. Johnson and E. Hudson, “A variable delay line PLL for CPU-coprocessor synchronization,” IEEE J. Solid-State Circuits, vol. 23, pp. 1218–1223, Oct. 1988.
[2] T. Yoshimura, Y. Nakase, N. Watanabe, Y. Morooka, Y. Matsuda, M. Kumanoya, and H. Hamano, “A delay-locked loop and 90° phase shifter for 800-Mb/s double data rate memories,” in Symp. VLSI Circuits Dig. Tech. Papers, June 1998, pp. 66–67.
[3] T. H. Lee, K. S. Donnelly, J. T. C. Ho, J. Zerbe, M. G. Johnson, and T. Ishikawa, “A 2.5-V CMOS delay-locked loop for an 18-Mb 500-Mbyte/s DRAM,” IEEE J. Solid-State Circuits, vol. 29, pp. 1491–1496, Dec. 1994.
[4] S. Sidiropoulos and M. A. Horowitz, “A semidigital dual delay-locked loop,” IEEE J. Solid-State Circuits, vol. 32, pp. 1683–1692, Nov. 1997.
[5] K. Minami et al., “A 1-GHz portable digital delay-locked loop with infinite phase capture ranges,” in ISSCC Dig. Tech. Papers, Feb. 2000, pp. 350–351.
[6] Y.-J. Jung, S.-W. Lee, D. Shim, W. Kim, C.-H. Kim, and S.-I. Cho, “A low-jitter dual-loop DLL using multiple VCDLs with a duty cycle corrector,” in Symp. VLSI Circuits Dig. Tech. Papers, June 2000, pp. 50–51.
[7] M. Bazes, “Two novel fully complementary self-biased CMOS differential amplifiers,” IEEE J. Solid-State Circuits, vol. 26, pp. 165–168, Feb. 1991.
[8] M. J. M. Pelgrom, H. P. Tuinhout, and M. Vertregt, “Transistor matching in analog CMOS applications,” in IEDM Dig. Tech. Papers, Dec. 1998, pp. 915–918.

Yeon-Jae Jung was born in Korea in 1974. He received the B.S. and M.S. degrees from the School of Electrical Engineering, Seoul National University, Seoul, Korea, in 1997 and 1999, respectively, where he is currently working toward the Ph.D. degree.
He has worked on architectures and CMOS circuits for high-speed I/O interfaces. His current research interests include high-speed CMOS circuits and communication ICs.

Seung-Wook Lee was born in Seoul, Korea, in 1971. He received the B.S. and M.S. degrees in electronics engineering from Seoul National University, Seoul, Korea, in 1995 and 1997, respectively, where he is currently working toward the Ph.D. degree in the School of Electrical Engineering.
His research interests include CMOS RF circuit design and high-speed communication interfaces.
Mr. Lee is the winner of the Bronze Prize of the IC design contest held by the Federation of Korean Industries in 1995.

Daeyun Shim was born in Seoul, Korea, in 1962. He received the B.S., M.S., and Ph.D. degrees in electronics engineering from Seoul National University, Seoul, Korea, in 1985, 1987, and 2000, respectively. His Ph.D. dissertation was related to the design of high-speed locking clock generators.
Since 1987, he has been working on digital video signal processing and ASIC design at Samsung Electronics Corporation. His research interests are video signal processing and compression, high-speed digital circuit design, and high-speed locking systems. He is currently working on DVD-PRML system design.
Wonchan Kim was born in Seoul, Korea, in 1945. He received the B.S. degree in electronics engineering from Seoul National University, Seoul, Korea, in 1972. He received the Dip.-Ing. and Dr.-Ing. degrees in electrical engineering from the Technische Hochschule Aachen, Aachen, Germany, in 1976 and 1981, respectively.
In 1972, he was with Fairchild Semiconductor Korea as a Process Engineer. From 1976 to 1982, he was with the Institut für Theoretische Electrotecnik RWTH, Aachen. Since 1982, he has been with the School of Electrical Engineering, Seoul National University, where he is currently a Professor. His research interests include development of semiconductor devices and design of analog/digital circuits.

Soo-In Cho was born in Seoul, Korea, in 1957. He received the B.S. degree in electronics engineering from Seoul National University, Seoul, Korea, in 1979.
He joined the Semiconductor Research and Development Center, Samsung Electronics Company, Ltd., Kyungki-Do, Korea, in 1979, where he was engaged in the design of CMOS logic LSI. Since 1983, he has been working on MOS dynamic memory design.
Abstract— This paper describes a phase-locked loop (PLL)-based frequency synthesizer. The voltage-controlled oscillator (VCO) utilizing a ring of single-ended current-steering amplifiers (CSA) provides low noise, wide operating frequencies, and operation over a wide range of power supply voltage. A programmable charge pump circuit automatically configures the loop gain and optimizes it over the whole frequency range. The measured PLL frequency ranges are 0.3–165 MHz and 0.3–100 MHz at 5 V and 3 V supplies, respectively (the VCO frequency is twice PLL output). The peak-to-peak jitter is 81 ps (13 ps rms) at 100 MHz. The chip is fabricated with a standard 0.8-µm n-well CMOS process.
Index Terms—CMOS phase-locked loop, current-steering amplifier, current-steering logic, frequency synthesizer, low noise, low voltage VCO.

I. INTRODUCTION
In Section II of this paper, the PLL architecture is described with an analysis of the loop stability and loop optimization. The circuit design techniques for the PLL are considered in Section III. The measured results are discussed in Section IV. Finally, conclusions are made in Section V regarding this work.

II. PLL ARCHITECTURE
It is often difficult to design a PLL that can operate over a wide frequency range due to the practical limit of the capacitor size that can be integrated for the loop filter. One method to widen the frequency range is to vary the PLL bandwidth as a function of the desired output frequency. This principle is applied in our design by utilizing a current D/A converter which controls the charge pump current. With this technique,
Fig. 2. Programmable charge pump with D/A converter control.

PLL architecture uses a current D/A converter to control the charge pump current, . The product is optimized for given values of and such that the stability margin constraints are satisfied by (2) and (3). Using a decoding table, the optimization is performed by the logic block in Fig. 1 which sets the charge pump current upon examination of and . Typically, a loop bandwidth which is ten times less than for the entire frequency range can be easily achieved using this architecture.

III. CIRCUIT DESIGN TECHNIQUES
A. VCO Circuit
Two types of VCO based on the ring oscillator topology are commonly used in CMOS PLL design: the current-starved inverter based VCO [3]–[5] and the differential-pair based VCO [6]–[8]. In spite of a wide frequency range, the first type of VCO is sensitive to power supply noise. Although an on-chip voltage regulator can be used to reduce the effect of power supply noise [5], it is not effective for operation at high frequency since a voltage regulator inherently has poor ac rejection. Another drawback of using an on-chip regulator is that it reduces the useful power supply range, making it undesirable for low-supply applications. A local feedback loop on the VCO can also be used for reducing the PLL jitter; however, it is likely to cause glitches or overshoots whenever the frequency transition mode is activated due to complicated feedback loops [8]. Hence, it is not a suitable approach for microprocessor applications, wherein a smooth transition between frequencies is usually required. While the second type of VCO rejects power supply noise well, the frequency range of operation may not be sufficient for some applications. To widen the frequency range of differential-pair-based VCO’s, complex MOS resistors [7] can be used at the cost of higher supply and more complex design.
The VCO design in this work utilizes a simple CSA circuit [1]. Fig. 3(a) shows a CSA cell which consists of a current source, , and a pair of NMOS devices. is the input device and is the load. When is high, turns on, sinking the bias current , while shuts off. Under this condition, the on resistance of defines the output low voltage, . When is low, turns off and is steered to . Under this condition, the resistance of the diode-connected defines the output high voltage, . By varying the bias current, , a current-controlled CSA-based ring oscillator is formed with an output voltage swing of
(4)
Equation (4) indicates that (typically between 1 and 2 V) varies with . Thus, the voltage swing of the CSA cell increases correspondingly with frequency. This is a desirable feature because the signal level improves at high frequency when the power supply switching noise becomes worse. Since the voltage swing is limited by the diode-connected , the current source always operates in the saturation region; consequently, very small switching noise is generated. For an n-well process, the PMOS current source can be guarded by its own well and isolated from the noisy p-substrate. The current source also buffers the output from , thereby reducing the noise injected from to the output. Any ground noise (coupled from other circuitry within the chip) is rejected by the CSA as a common mode noise because both its output and input are referred to the same ground. By referring the charge pump, loop filter, V/I converter, VCO, and other analog
584 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 32, NO. 4, APRIL 1997
circuits in the PLL to the same ground, i.e., p-substrate, the ground noise can be substantially rejected [9].
Fig. 3(b) shows the VCO circuit using a three-stage CSA ring oscillator. The current sources in the CSA ring oscillator are cascoded with high-swing bias. In the V/I converter, a single degenerated stage provides a first-order linear relationship between the oscillation frequency and the control voltage. is forced in the linear region to provide the high-swing cascoded bias for the VCO. This VCO is suitable for low supply voltage operation since it only needs about 2.5 V

Fig. 4. Measured VCO performance.

B. Phase/Frequency Detector
A modified dead-zone free, phase/frequency detector (PFD) is used in this design [2]. In the frequency transition mode, a pulse width limiting circuit in the PFD limits the pulse width of the UP and DN signals. Such limited UP/DN pulses provide a finite amount of charge to the integration capacitor of the loop filter in order to slow down the frequency ramp of the VCO. The UP/DN pulses, however, cannot be made arbitrarily narrow. The noise level in the chip may dominate over the minute UP/DN correction pulses causing the PLL not to properly acquire in frequency. The maximum frequency transition rate of less than 0.1%, i.e., the difference in period between two consecutive clock cycles, is achieved by using this technique.

C. Charge Pump and Loop Filter
The charge pump shown in Fig. 2 is designed using cascoded current sources with CMOS switches. The amount of the current, typically 10–150 µA, is controlled by the current digital-to-analog converter (DAC). A bandgap current reference circuit is used to compensate the variation over temperature. The opamp in the charge pump circuit reduces the transients caused by the charge transfer [4] as is switched. The capacitors in the loop filter are formed by NMOS devices with the sources and drains connected to ground and the gate connected to the filter output node. The capacitor C2 is about 400 pF.

Fig. 6. Measured PLL output at 155 MHz after 5 ms delay, i.e., after 775 000 clock cycles. The rms jitter (standard deviation) is 28 ps as shown.

IV. MEASURED RESULTS
The measured results show that the PLL operates from 0.3 to 165 MHz and 0.3 to 100 MHz at 5 V and 3 V power supplies, respectively. Since the output frequency is the VCO frequency divided by two, a PLL output with 50% duty cycle is guaranteed. Fig. 5 shows the histogram of PLL period jitter (also referred to as short-term or clock-to-clock jitter) at 100 MHz after 42 321 hits using a standard 14.318 MHz crystal as the input reference frequency. The peak-to-peak jitter is 81 ps with an rms jitter of 13 ps. The long-term jitter of the PLL was also measured. An rms jitter of 28 ps is observed at 155 MHz with 5 ms delay, i.e., measured after a delay of 775 000 clock cycles from the triggered clock cycle, as shown in Fig. 6. In order to appreciate the noise rejection capability of this PLL, the chip was used to generate the clock for an HP laser printer. During the printing process, which is very noisy electrically and thermally, both the period jitter and the long-term jitter of the clock must be low for good printing quality. A comparison of test results between the use of this PLL and the original crystal clock showed no perceptible difference even under high magnification. Fig. 7 shows a smooth frequency transition of the PLL from 33 to 100 MHz. No frequency glitches and overshoots were observed during the transition time. Since there are two independent PLL’s on the same chip, crosstalk between the two PLL’s was also measured. The signal coupling between the two PLL’s is at least 50 dB down. The measured performance of the PLL is summarized in Table I.

TABLE I
SUMMARY OF MEASURED PLL PERFORMANCE
Frequency Range                         0.3–165 MHz
Period (Short-Term) Jitter at 100 MHz   13 ps rms, 81 ps peak-to-peak
Long-Term Jitter at 155 MHz             28 ps rms after 5 ms delay
Supply Voltage Range                    2.5 V to 7 V
Output Duty Cycle                       50%, VCO frequency is twice PLL output
VCO Linearity                           2% from 10–200 MHz
VCO Current Consumption                 500 µA at 200 MHz
Crosstalk between 2 PLL’s               50 dB down

V. CONCLUSION
In this paper, we demonstrated the design of a fully integrated CMOS PLL circuit that achieves wide operating frequency range and low jitters (both short-term and long-term) over a wide range of power supply voltage. The key element in the design that provides all these features is the CSA-based VCO circuit. A programmable current DAC is also used to optimize the loop gain of the PLL. Smooth frequency transition is realized by using a modified PFD with a pulse width limiting circuit. The chip is implemented in a standard 0.8-µm CMOS process.

REFERENCES
[1] D. J. Allstot, G. Liang, and H. C. Yang, “Current-mode logic techniques for CMOS mixed-mode ASIC’s,” in Proc. IEEE Custom Integrated Circuits Conf., 1991, pp. 25.2.1–25.2.4.
[2] F. M. Gardner, “Charge-pump phase-lock loops,” IEEE Trans. Commun., vol. 28, pp. 1849–1858, Nov. 1980.
[3] R. Shariatdoust, K. Nagaraj, M. Saniski, and J. Plany, “A low jitter 5 MHz to 180 MHz clock synthesizer for video graphics,” in Proc. IEEE Custom Integrated Circuits Conf., 1992, pp. 24.2.1–25.2.5.
[4] M. G. Johnson and E. L. Hudson, “A variable delay line PLL for CPU-coprocessor synchronization,” IEEE J. Solid-State Circuits, vol. 23, pp. 1218–1223, Oct. 1988.
[5] K. M. Ware, H.-S. Lee, and C. G. Sodini, “A 200-MHz CMOS phase-locked loop with dual phase detectors,” IEEE J. Solid-State Circuits, vol. 24, pp. 1560–1568, Dec. 1989.
[6] B. Kim, D. N. Helman, and P. R. Gray, “A 30-MHz hybrid analog/digital clock recovery circuit in 2-µm CMOS,” IEEE J. Solid-State Circuits, vol. 25, pp. 1385–1394, Oct. 1990.
[7] I. A. Young, J. K. Greason, and K. L. Wong, “A PLL clock generator with 5 to 110 MHz of lock range for microprocessors,” IEEE J. Solid-State Circuits, vol. 27, pp. 1599–1606, Nov. 1992.
[8] D. Mijuskovic et al., “Cell-based fully integrated CMOS frequency synthesizers,” IEEE J. Solid-State Circuits, vol. 29, pp. 271–279, Mar. 1994.
[9] D. J. Allstot and W. C. Black Jr., “A substrate-referenced data-conversion architecture,” IEEE Trans. Circuits Syst., vol. 38, pp. 1212–1217, Oct. 1991.
IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 34, NO. 4, APRIL 1999 513
sizes. Table I lists some of the relevant attributes of this process technology.
The PLL clock generator is shown in the microprocessor die photograph of Fig. 2(a). The dimensions of the entire PLL are 1040 × 640 μm². It is shown with the major features identified in Fig. 2(b).

IV. PLL CLOCK GENERATOR COMPONENTS

A. Phase/Frequency Detector
The digital PFD generates a signal that conveys relative phase and frequency error information about its inputs to the charge pump and filter. The PFD design is based on a three-state machine structure [8], as depicted in Fig. 3(a). From the initial reset state, a rising edge on one input will assert the UP output until the rising edge of the other input appears, which deasserts UP and forces a reset of both flip-flops [Fig. 3(b)].

B. Power-Supply Isolation
A separate analog power connection (AVDD) is used for the analog circuits [current reference, charge pump, common-mode rejection (CMR), filter initialization, and VCO circuits] to increase the isolation of the sensitive circuits from the logic-induced switching noise present on the main power supply. To allow the detection of potential defects using conventional testing, the AVDD pin is held low, disabling the analog devices that normally draw dc current. Both on-chip and on-module decoupling are used on AVDD.

C. Reference Circuit
A thermal voltage-referenced current source is used to provide temperature- and supply-independent biasing for the analog circuits in the PLL. The circuit contains an array of P diffusions in the N-well connected to form two forward-biased diodes with areas that differ by a factor of ten. When connected
BOERSTLER: LOW-JITTER PLL CLOCK GENERATOR 515
Fig. 3. (a) PFD state diagram. (b) Phase detector implementation.
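The three-state PFD behavior described in Section IV-A can be illustrated with a minimal behavioral sketch. It is not the circuit of Fig. 3(b): the input names `ref` and `fb`, the complementary DN output, and the zero-delay reset are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ThreeStatePFD:
    """Idealized three-state PFD: two edge-triggered flags with a shared reset."""
    up: bool = False
    dn: bool = False

    def ref_edge(self) -> None:
        # Rising edge on the (assumed) reference input asserts UP.
        self.up = True
        self._reset_if_both()

    def fb_edge(self) -> None:
        # Rising edge on the (assumed) feedback input asserts DN.
        self.dn = True
        self._reset_if_both()

    def _reset_if_both(self) -> None:
        # When both flags are set, both flip-flops are reset (zero delay assumed).
        if self.up and self.dn:
            self.up = False
            self.dn = False


def up_dn_widths(ref_edges, fb_edges):
    """Return the total time UP and DN are asserted for two lists of edge times."""
    pfd = ThreeStatePFD()
    events = sorted([(t, "ref") for t in ref_edges] + [(t, "fb") for t in fb_edges])
    up_time = 0.0
    dn_time = 0.0
    last_t = None
    for t, kind in events:
        if last_t is not None:
            # Accumulate the time spent in the state held since the last edge.
            if pfd.up:
                up_time += t - last_t
            if pfd.dn:
                dn_time += t - last_t
        if kind == "ref":
            pfd.ref_edge()
        else:
            pfd.fb_edge()
        last_t = t
    return up_time, dn_time


# Reference leads feedback by 2 time units on each of three cycles.
up_t, dn_t = up_dn_widths([0.0, 10.0, 20.0], [2.0, 12.0, 22.0])
print(up_t, dn_t)
```

With the reference leading on every cycle, only UP is asserted, and its total width equals the accumulated phase lead. A real PFD has a finite reset delay, which this sketch deliberately ignores.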
Fig. 5. (a) Process compensation circuit. (b) Temperature compensation circuit.

The two voltages are compared using a differential amplifier, which generates a current proportional to the NMOS offset from nominal. This current is mirrored to produce a current that is injected into a precision resistor used for combining various process monitors to generate a compensating reference voltage. The compensating reference voltage is connected to the active load elements of the VCO, which control the VCO’s voltage swing. A current generated from a similar PMOS circuit also is injected into the resistor.
Weighted combinations of standard bias circuits with differing voltage and temperature coefficients have been used previously to compensate reference circuits for VCO’s [9]. In this case, however, temperature was monitored directly by comparing the voltage of two series-connected devices biased by current below their 0-TC operating point to the voltage of two parallel devices biased by current significantly above their effective 0-TC point [Fig. 5(b)]. The devices and bias currents are sized so that both branches of the differential amplifier are balanced for nominal temperature conditions. The inset shows the I–V characteristics as a function of temperature for the series (subscript 2) and parallel (subscript 1) connected devices; the 0-TC points correspond to the crossing point where the current is invariant with temperature. The current in one leg of the differential amplifier varies proportionally with temperature and is mirrored and added to the summing junction of the resistor. A constant bias current is also added to the summing junction to establish the correct weighting of the various compensating currents and to correct for the TC of the summing junction resistor.
Using a statistical process model, the process compensation was designed to favor the stabilization of the “best case” side of the distribution over the “worst case” side in anticipation of future process trends. Given the limited range over which a circuit may be practically compensated, the performance for the “best case” devices was not sacrificed at the expense of extensive compensation of the poorest performing devices. For the unsorted population, this approach allowed a reduction in the sensitivity of the VCO to process variability by a factor of 3.6 (from 55.4% to 15.2%) over the uncompensated VCO; temperature sensitivity was reduced by a factor of 4.7 (from 38.6% to 8.2%).

E. Charge Pump
The reference circuit is used to generate the two currents for use within the charge pump. The peak charge-pump current may be adjusted in 30-μA increments from 30 to 240 μA by scaling the mirror currents as shown in Fig. 6. The error signals generated by the PFD are used to switch the peak current selected. Adjusting the charge pump allows for optimization of the loop characteristics for different divider and VCO settings. Differential outputs P and P̄ are included for high CMR in the subsequent analog circuits.

F. Loop Filter
The differential loop filter and initialization circuits are shown in Fig. 7. Currents to and from the charge-pump circuit enter the filter at nodes P and P̄. The input to the filter contains NMOS transmission-gate clamping devices to limit the maximum filter voltage to a value set by the NMOS threshold voltage for a large source-bulk voltage. For the CMOS6S process, the clamps prevent the filter voltage from exceeding approximately 1.8 V, eliminating concern for the VCO input stage’s shutting off. The filter capacitors are accumulation-mode gate-oxide devices, and are interleaved to improve the matching. Both loop-filter capacitors together occupy an area of approximately 865 × 280 μm² and are approximately 450 pF each. Precision resistors (1.2 kΩ each) are used to produce a zero in the filter transfer function.
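As a rough check of where the loop-filter zero lands, the sketch below assumes that each half of the differential filter behaves as a simple series R-C branch, so that the zero sits at 1/(2πRC). That topology is an assumption for illustration; the actual filter of Fig. 7 may differ. The component values are the ones quoted above (1.2 kΩ and 450 pF).

```python
import math


def rc_zero_frequency_hz(resistance_ohm: float, capacitance_f: float) -> float:
    """Zero frequency of a series R-C branch, f_z = 1 / (2*pi*R*C)."""
    return 1.0 / (2.0 * math.pi * resistance_ohm * capacitance_f)


R_OHM = 1.2e3    # precision resistor value quoted in the text
C_F = 450e-12    # approximate value of one loop-filter capacitor

f_zero = rc_zero_frequency_hz(R_OHM, C_F)
print(f"Estimated filter zero: {f_zero / 1e3:.0f} kHz")
```

Under this assumption the zero falls at a few hundred kilohertz, well below the 2-MHz loop bandwidth reported in the measurements, which is the usual arrangement for a stabilizing zero in a charge-pump PLL.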
Fig. 11. Cycle–cycle processor clock jitter: (a) quiet processor and (b) active processor.
limit of period composed of six delay units and one mixer unit. These frequency limits also affect the VCO gain (for a given mixer design) as well as the center frequency. The frequency limits may be independently controlled using the multiplexers shown in Fig. 9, allowing flexible control of the VCO operating range and greater than ten-to-one adjustment range for VCO gain.
The delay elements and mixer designs are based upon PMOS source-coupled pair differential amplifiers with NMOS load networks [Fig. 10(a) and (b)] which allow voltage-controlled swing adjustment through effective load-line translation by adjusting the load voltage. The high impedance provided by the current source improves the supply noise rejection for the source-coupled pair, and the N-well improves the isolation to the p-type bulk substrate noise. The variation of the threshold voltage due to bulk effect is eliminated using bulk-to-source biasing throughout the structure. Sensitivity of the VCO to low repetition rate, 100-mV steps on VDD and AVDD is 0.418 ps/mV. Center-frequency common-mode voltage sensitivity is 3.5% over the full input range dictated by the common-mode control circuit. Nominal VCO gain for the settings that produce the maximum VCO range is 185 MHz/V. The worst case VCO power dissipation is 30 mW.

I. Dividers and Receivers
The dividers (Fig. 1) may be individually programmed and support division by 2, 3, 4, 5, 6, 8, or 10. The dividers are placed in pairs within the layout to improve device matching within each pair. The receivers shown in Fig. 1 are also placed together and are located near the I/O pad for BUSCLK.

V. PLL MEASUREMENTS
The damping factor, loop gain, and natural frequency of the PLL may be adjusted over a wide range to match the application by changing the charge-pump and VCO gain as described above. System testing was conducted with 90-μA peak charge-pump current using the maximum frequency and range on the VCO with a variety of divider settings and BUSCLK
frequencies. The processor clock was accessed from the clock tree through a series of inverters. A time-interval measurement (TIM) system was used to measure cycle–cycle period jitter statistics for a number of packaged die representing various process skews. The processor was operated using an array initialization program loop with the fixed-point and floating-point processors active for the “active” processor tests, and was also operated in a “quiet” mode reset state. All tests were performed at room temperature with ambient forced-air cooling. Conventional first-cycle oscilloscope-based jitter measurements were performed periodically and provided P-P jitter results that were consistent with those measured on the TIM system. The external clock was provided by a high-frequency pulse generator, with 7.3 ps rms, 36 ps P-P jitter.
Fig. 11(a) shows a histogram of cycle–cycle period measurements taken with the processor in an inactive reset state but with the clock tree active. The frequencies of the reference clock, processor clock, and VCO are 85, 170, and 340 MHz, respectively, which corresponds to a 3-dB loop bandwidth of 2 MHz. The distribution of samples in the histogram follows a Gaussian distribution with period jitter of 8.4 ps rms, 62 ps P-P. The minimum period measured for this sample size was 26.2 ps less than the mean (3.1 sigma away). Assuming that cycle-time failures only occur on the minimum period side, the worst case clock jitter penalty for this system (i.e., a “quiet” processor) is 26.2 ps at 3.1 sigma confidence (or 25.2 ps penalty at 3.0 sigma). Since a peak-to-peak jitter approximately equal to the PFD dead zone can exist for the PLL, the 25 ps simulated value for the dead zone may be a significant component of the measured jitter.
Fig. 11(b) shows a clock-jitter histogram for the processor executing the array initialization routine for a large population. A Gaussian curve has been superimposed on the histogram for comparison purposes. The frequencies of the reference clock, processor clock, and VCO are 90, 180, and 360 MHz, respectively. For this system (i.e., an “active” processor), the period jitter has increased to 10.0 ps rms, 80 ps P-P, and the worst case clock-jitter penalty is 37.1 ps at 3.7 sigma confidence (or 30.1 ps at 3.0 sigma). The effective noise penalty for running the array initialization routine is 4.9 ps at 3.0 sigma.

clock divider implementation, R. Kodali for circuit simulation and specification, D. Woeste and J. Strom for the divider and lock detector circuits, and S. Dhong and M. Papermaster for their continuous support of this work.

REFERENCES
[1] I. Young, M. Mar, and B. Bhushan, “A 0.35 μm CMOS 3-880 MHz PLL N/2 clock multiplier and distribution network with low jitter for microprocessors,” in ISSCC Dig. Tech. Papers, Feb. 1997, pp. 330–331.
[2] J. Alvarez, H. Sanchez, G. Gerosa, and R. Countryman, “A wide-bandwidth low-voltage PLL for powerPC microprocessors,” IEEE J. Solid-State Circuits, vol. 30, pp. 383–391, Apr. 1995.
[3] J. Cho, “Digitally-controlled PLL with pulse width detection mechanism for error correction,” in ISSCC Dig. Tech. Papers, Feb. 1997, pp. 334–335.
[4] I. Young, J. Greason, and K. Wong, “A PLL clock generator with 5–110 MHz of lock range for microprocessors,” IEEE J. Solid-State Circuits, vol. 27, pp. 1599–1607, Nov. 1992.
[5] V. von Kaenel, D. Aebischer, C. Piguet, and E. Dijkstra, “A 320 MHz, 1.5 mW at 1.35 V CMOS PLL for microprocessor clock generation,” in ISSCC Dig. Tech. Papers, Feb. 1996, pp. 132–133.
[6] P. E. Gronowski, P. Bannon, M. Bertone, R. Blake-Campos, G. Bouchard, W. Bowhill, D. Carlson, R. Castelino, D. Donchin, R. Fromm, M. Gowan, A. Jain, B. Loughlin, S. Mehta, J. Meyer, R. Mueller, A. Olesin, T. Pham, R. Preston, and P. Rubinfeld, “A 433 MHz 64b quad-issue RISC microprocessor,” in ISSCC Dig. Tech. Papers and Slide Supplement, Feb. 1996, pp. 222–223.
[7] Z. Zhang, H. Du, and M. Lee, “A 360 MHz 3V CMOS PLL with 1 V peak-to-peak power supply noise tolerance,” in ISSCC Dig. Tech. Papers, Feb. 1996, pp. 134–135.
[8] D. H. Wolaver, Phase-Locked Loop Circuit Design. Englewood Cliffs, NJ: Prentice-Hall, 1991, pp. 59–61.
[9] J. F. Ewen, A. Widmer, M. Soyuer, K. Wrenner, B. Parker, and H. Ainspan, “Single-chip 1062 Mbaud CMOS transceiver for serial data communication,” in ISSCC Dig. Tech. Papers, Feb. 1995, pp. 32–33.
[10] B. Lai and R. Walker, “A monolithic 622 Mb/s clock extraction and data retiming circuit,” in ISSCC Dig. Tech. Papers, Feb. 1991, pp. 144–145.
[11] S. K. Enam and A. Abidi, “NMOS IC’s for clock and data regeneration in gigabit-per-second optical fiber receivers,” IEEE J. Solid-State Circuits, vol. 27, pp. 1763–1774, Dec. 1992.
[12] D. W. Boerstler and K. Jenkins, “A phase-locked loop clock generator for a 1 GHz microprocessor,” in Symp. VLSI Circuits Dig. Tech. Papers, June 1998, pp. 212–213.
[13] J. Silberman, N. Aoki, D. Boerstler, J. Burns, S. Dhong, A. Essbaum, U. Ghoshal, D. Heidel, P. Hofstee, K. Lee, D. Meltzer, H. Ngo, K. Nowka, S. Posluszny, O. Takahashi, I. Vo, and B. Zoric, “A 1.0 GHz single-issue 64b PowerPC integer processor,” in ISSCC Dig. Tech. Papers, Feb. 1998, pp. 230–231.
[14] V. von Kaenel, D. Aebischer, R. van Dongen, and C. Piguet, “A 600 MHz CMOS PLL microprocessor clock generator with a 1.2 GHz VCO,” in ISSCC Dig. Tech. Papers, Feb. 1998, pp. 396–397.
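The jitter penalties quoted in Section V for the quiet and active processor can be cross-checked with a few lines of arithmetic. This is only a sketch: the function and variable names are illustrative, and the numbers are the rounded values reported in the text.

```python
def sigma_multiple(deviation_ps: float, rms_ps: float) -> float:
    """Express a deviation from the mean period as a multiple of the rms jitter."""
    return deviation_ps / rms_ps


def penalty_at(n_sigma: float, rms_ps: float) -> float:
    """Jitter penalty at a given sigma confidence, assuming Gaussian jitter."""
    return n_sigma * rms_ps


# Values reported in the measurements (rounded as published).
QUIET_RMS_PS, QUIET_MIN_DEV_PS = 8.4, 26.2
ACTIVE_RMS_PS, ACTIVE_MIN_DEV_PS = 10.0, 37.1
QUIET_3SIGMA_QUOTED_PS, ACTIVE_3SIGMA_QUOTED_PS = 25.2, 30.1

quiet_sigma = sigma_multiple(QUIET_MIN_DEV_PS, QUIET_RMS_PS)
active_sigma = sigma_multiple(ACTIVE_MIN_DEV_PS, ACTIVE_RMS_PS)
quiet_3sigma = penalty_at(3.0, QUIET_RMS_PS)
active_3sigma = penalty_at(3.0, ACTIVE_RMS_PS)
noise_penalty = ACTIVE_3SIGMA_QUOTED_PS - QUIET_3SIGMA_QUOTED_PS

print(f"quiet:  {quiet_sigma:.2f} sigma, 3-sigma penalty {quiet_3sigma:.1f} ps")
print(f"active: {active_sigma:.2f} sigma, 3-sigma penalty {active_3sigma:.1f} ps")
print(f"noise penalty from quoted 3-sigma values: {noise_penalty:.1f} ps")
```

The quiet-processor figures reproduce directly (26.2 ps is about 3.1 sigma of 8.4 ps, and 3.0 sigma is 25.2 ps). For the active processor, 3.0 sigma of the rounded 10.0 ps rms gives 30.0 ps rather than the quoted 30.1 ps, presumably because the published rms value is itself rounded; the 4.9-ps noise penalty follows from the quoted 3-sigma values.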
Abstract—This paper describes a delay-locked loop (DLL) circuit having two advancements, a dual-loop operation for a wide lock range and programmable replica delays using antifuse circuitry and internal voltage generator for a post-package skew calibration. The dual-loop operation uses information from the initial time difference between reference clock and internal clock to select one of the differential internal loops. This increases the lock range of the DLL to the lower frequency. In addition, incorporation of the programmable replica delay using antifuse circuitry and the internal voltage generator allows for the elimination of skews between external clock and internal clock that occur from on-chip and off-chip variations after the package process. The proposed DLL, fabricated on 0.16-μm DRAM process, operates over the wide range of 42–400 MHz with 2.3-V power supply. The measured results show 43-ps peak-to-peak jitter and 4.71-ps rms jitter consuming 52 mW at 400 MHz.

Index Terms—Delay-locked loop, dual-loop operation, high-speed DRAM, programmable replica delay, skew calibration.

Manuscript received October 2, 2001; revised January 29, 2002.
S. J. Kim, S. H. Hong, J. H. Cho, P. S. Lee, J. H. Ahn, and J. Y. Chung are with the Advanced Design Team, Memory Research and Development, Hynix Semiconductor Inc., Ichon-si, Kyoungki-Do 467-701, Korea (e-mail: sejun.kim@hynix.com).
J.-K. Wee is with the Department of Electronics Engineering, Hallym University, Chunchun-si, Kangwon-Do 200-702, Korea.
Publisher Item Identifier S 0018-9200(02)04934-X.
0018-9200/02$17.00 © 2002 IEEE

I. INTRODUCTION
THE DELAY-LOCKED loop (DLL) has become an indispensable component in high-speed synchronous DRAMs such as DDR SDRAM. Since the DLL determines the operation range of the DRAM and has a large effect on the data valid window, a high-performance DLL that has a wider range and lower jitter is essential for increasing the speed of DRAM. A DLL can be categorized into either of two types, the digital and the analog type. Although the digital DLL has robustness, process portability, and design simplicity, it is difficult to use on a very high-bandwidth DRAM (over 600 Mb/s) due to poor jitter performance [1], [2]. Therefore, in spite of sensitivity on process variation, the analog DLL, which ensures lower jitter by the continuous characteristics of analog operation, is more suitable in the higher speed DRAM. In addition to the jitter performance, another important issue of the DLL is the lock range. Process variation makes the lock range of the analog DLL more limited and results in a narrower operation range of the DRAM. The limited range of the DLL limits the flexibility of implementation on memory applications and increases test costs in mass production. For solving the limited lock-range problems, various types of DLLs have been developed [3]–[6]. However, such DLLs resulted in complex architectures that faced such problems as increased area, added power consumption, and degradation of jitter performance.
For these issues, a novel dual-loop architecture, which increases the lock range having no degradation of jitter performance with a relatively small overhead in area and power, is proposed in this paper. Another enhancement in the proposed DLL is the post-package skew calibration. Process variations in on-chip and trivial mismatches in off-chip parameters can result in a large static skew in addition to the phase offset of the phase detector. In the proposed DLL, an improved scheme using antifuse circuitry is applied for reducing the skew. It enables a practical calibration of inevitable skews after the package process.
This paper is arranged as follows. The limited range problem of the conventional DLL is described in Section II. In Section III, the concept of the proposed dual loop for wide locking range is briefly explained, followed by presentation of the architecture and physical implementation based on the concept. The skew calibration method using antifuse circuitry is described in Section IV. Section V discusses the fabricated chip and shows the experimental results. Finally, the paper is concluded in Section VI.

II. LIMITED RANGE PROBLEM OF CONVENTIONAL DLL
Fig. 1(a) shows the architecture of the conventional analog DLL and the delay characteristic of the voltage-controlled delay line (VCDL). When (minimum delay of VCDL) (maximum delay of VCDL), the range of (operation frequency of DLL) is determined by (control voltage of loop filter) at the initial state. When (minimum control voltage of loop filter) at the initial state and (the cycle time of reference clock), the lock failure occurs because the phase detector produces a DN pulse which discharges the capacitor in the loop filter, as shown in Fig. 1(b). Therefore, in this case, it must be at the initial state for satisfying the condition without lock failure. Therefore, the range of is . In the other case, when (maximum control voltage of loop filter) at the initial state and , the lock failure occurs because of the UP pulse of the phase detector shown in Fig. 1(c). In this case, the range of is when the initial is . For utilizing the full range of
KIM et al.: SKEW-CALIBRATED DUAL-LOOP DLL 727
Fig. 1. (a) Block diagram and delay characteristic of conventional DLL. Cases of lock failure at initial control voltage: (b) initial control voltage at its minimum and (c) initial control voltage at its maximum.
without the lock failures as in Fig. 1(b) and (c), the initial must be set at a level such that the initial is approximately . In this condition, the range of is determined as . But this method can cause stuck/harmonic lock and makes the jitter performance worse. Therefore, if the range of is desired at the higher frequency range, the initial should be set to since it is stuck/harmonic lock free and the delay cell has a fast slew-rate that produces less phase noise [7]. But in reality, is very sensitive to process, voltage, and temperature (PVT) variation. As a result, designing to be in the target range becomes more careful and difficult work as the operation frequency becomes higher. Therefore, considering the PVT variation, the range of becomes more limited with the higher operating range.
728 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 37, NO. 6, JUNE 2002
Fig. 5. (a) Delay cell of the replica bias circuit. (b) Cross-sectional view of
bias line in the proposed DLL.
Fig. 6. (a) Schematic and (b), (c) timing diagrams of the loop selector.
Fig. 7. (a) Schematic and (b) timing diagram of the phase detector.
Fig. 8. Linear capacitor in the loop filter.
Fig. 10. Antifuse circuit for skew calibration and SEM photograph of the antifuse.
Fig. 11. Flow of skew calibration after package process.
Fig. 12. Microphotograph of the proposed DLL.

able delay circuit. The tunable delay circuit is connected to the antifuse circuitry and the antifuse is made of ONO (oxide–nitride–oxide) dielectrics, as shown in Fig. 10. The sequence of skew calibration is explained as follows. When the DLL is enabled by RESET for the test, nodes fd[1]–[8] and bd[1]–[8] in Fig. 9 are all fixed at the high state, because the initial program voltage is at ground level, and RESET initializes nodes A and B. In this state, no address code can have an effect on the fixed levels of fd[1]–[8] and bd[1]–[8]. First, the skew between the external clock and the data strobe signal is measured. From the measured skew, the optimal number of delay loads is estimated. After the program mode (PGM) is activated, the program code signifying the estimated number of delay loads is applied to the address pins and the skew is remeasured. This process is iterated to increase or decrease replica delay times by left-shift–right-shift (LSRS) for minimizing the skew. When the skew is almost eliminated, the inserted program address code is fixed and the on-chip negative voltage generator is enabled to produce a program voltage for rupturing the antifuses. The replica delay is tuned through the flow shown in Fig. 11. According to the simulation results, the programmable tuning range using the eight antifuses is from −350 to 350 ps and the minimum tuning resolution is approximately 10 ps.

V. EXPERIMENTAL RESULTS
The proposed DLL has been fabricated using 0.16-μm DRAM process. Fig. 12 shows a microphotograph of the fabricated chip. The active area of DLL occupies 0.27 mm². The loop filter consumes 50% of total area. For high-frequency measurements of the proposed DLL, a chip-on-board (COB) has been fabricated both to reduce parasitics and to match 50-Ω impedance of the measurement instrument. The proposed DLL operates from 42 to 400 MHz with a 2.3-V power supply. Fig. 13 shows the synchronized waveforms at
Fig. 13. Synchronized waveforms at (a) 42 MHz and (b) 400 MHz.
Fig. 14. Measured jitter characteristics at 400 MHz in (a) a quiet supply and (b) with injected 1-MHz ±300-mV square wave noise.
42 and 400 MHz. At 400 MHz, the peak-to-peak jitter is 43 ps and the rms jitter is 4.71 ps, as shown in Fig. 14(a). When a ±300-mV 1-MHz square wave is injected externally on the power supply, the peak-to-peak jitter and the rms jitter are measured to be 80 and 7.46 ps, respectively, at 400 MHz, as shown in Fig. 14(b). Fig. 15(a) shows a skew that is composed of the phase offset of the phase detector and replica mismatch by process variation before skew calibration and the tuned skew after skew calibration. Before the calibration, the measured skew is 55 ps. After the calibration, the remeasured skew is reduced to 9 ps with measured peak-to-peak jitter of 46 ps. In theory, error reduction resulting in negative phase shift will have increased jitter due to increased load in replica delay, but error reduction resulting in positive phase shift will have decreased jitter by decreased load in replica delay. However, from analyzing the measured results, the amount of increased jitter by negative phase reduction is insignificant compared to the reduced phase error. Fig. 15(b) shows the resolution and partial range of the skew calibration through antifuse programming. Minimum resolution is about 10 ps and total
Fig. 15. (a) Measured skew at 400 MHz before skew calibration and after skew calibration. (b) Range (full range not displayed for limitation of tester) and
resolution of skew calibration.
calibration range is from −350 to 350 ps, as expected from simulation. These results show that the skew by variation in on-chip or off-chip can be eliminated through programmable replica delays using the antifuse circuitry, and also verify that the improved skew calibration technique can effectively eliminate the skews after packaging without degradation of the jitter characteristic. The power dissipation of the proposed DLL is 52 mW at 400 MHz. Table I summarizes the measured characteristics of the proposed DLL.

TABLE I
PERFORMANCE CHARACTERISTICS OF THE PROPOSED DLL

VI. CONCLUSION
In this paper, the dual-loop architecture with the improved skew calibration method was presented. The dual-loop architecture enabled the wide range of the DLL by using the loop selection decided by an initial time difference between the reference clock and the internal clock of the DLL. Also, an improved skew calibration method demonstrated a practical post-package skew calibration using the antifuse circuitry and the internal negative voltage generator. The proposed DLL, fabricated on 0.16-μm DRAM process, achieves a wide range from 42 to 400 MHz, and 43 ps peak-to-peak jitter and 4.71 ps rms jitter at 400 MHz that is applicable to high-speed DRAMs.

ACKNOWLEDGMENT
The authors are grateful to H. Ryu and Dr. Y. Kim for helpful discussion about COB-type PCB design.

REFERENCES
[1] A. Hatakeyama et al., “A 256-Mb SDRAM using a register-controlled digital DLL,” IEEE J. Solid-State Circuits, vol. 32, pp. 1728–1734, Nov. 1997.
[2] Y. Okajima et al., “Digital delay-locked loop and design technique for high-speed synchronous interface,” IEICE Trans. Electron., vol. E79-C, pp. 798–807, June 1996.
[3] T. H. Lee et al., “A 2.5-V CMOS delay-locked loop for an 18-Mbit 500-Mbyte/s DRAM,” IEEE J. Solid-State Circuits, vol. 29, pp. 1491–1496, Dec. 1994.
[4] S. Tanoi et al., “A 250–622-MHz deskew and jitter-suppressed clock buffer using two-loop architecture,” IEEE J. Solid-State Circuits, vol. 31, pp. 487–493, Apr. 1996.
[5] S. Sidiropoulos et al., “A semi-digital dual delay-locked loop,” IEEE J. Solid-State Circuits, vol. 32, pp. 1683–1692, Nov. 1997.
[6] Y. Okuda et al., “A 66–400-MHz adaptive-lock-mode DLL circuit with duty-cycle error correction,” in Symp. VLSI Circuits Dig. Tech. Papers, June 2001, pp. 37–38.
[7] C. H. Park et al., “A low-noise 900-MHz VCO in 0.6-μm CMOS,” IEEE J. Solid-State Circuits, vol. 34, pp. 586–591, May 1999.
[8] T. Yoshimura et al., “A delay-locked loop and 90-degree phase shifter for 800-Mb/s double data rate memories,” in Symp. VLSI Circuits Dig. Tech. Papers, June 1998, pp. 66–67.
[9] J. G. Maneatis, “Low-jitter and process-independent DLL and PLL based on self-biased techniques,” IEEE J. Solid-State Circuits, vol. 31, pp. 1728–1732, Nov. 1998.
[10] I. A. Young et al., “A PLL clock generator with 5 to 110 MHz of lock range for microprocessors,” IEEE J. Solid-State Circuits, vol. 27, pp. 1599–1607, Nov. 1992.
[11] F. Herzel et al., “A study of oscillator jitter due to supply and substrate noise,” IEEE Trans. Circuits Syst. II, vol. 46, pp. 56–62, Jan. 1999.
[12] T. Hamamoto et al., “A skew and jitter suppress DLL architecture for high-frequency DDR SDRAMs,” in Symp. VLSI Circuits Dig. Tech. Papers, June 2000, pp. 76–77.
[13] S. Kuge et al., “A 0.18-μm 256-Mb DDR-SDRAM with low-cost post-mold-tuning method for DLL replica,” in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, Feb. 2000, pp. 402–403.
[14] K. S. Min et al., “A post-package bit-repair scheme using static latches with bipolar-voltage programmable antifuse circuit for high-density DRAMs,” in Symp. VLSI Circuits Dig. Tech. Papers, June 2001, pp. 67–68.

Se Jun Kim was born in Seoul, Korea, in 1974. He received the B.S. and M.S. degrees in electronics engineering from Hanyang University, Seoul, in 1998 and 2000, respectively.
In 2000, he joined the Memory Research and Development Division, Hynix Semiconductor Inc., Kyungki-Do, Korea, as a Research Engineer, where he has been working on CMOS circuit and architecture for high-speed digital/analog interface. His current interests include clock recovery circuits, data converters, clock distribution, and I/O circuits for high-speed digital/analog interface.

Sang Hoon Hong received the B.S. degree in electronic engineering from Yonsei University, Seoul, Korea, in 1993. He received the M.S. and Ph.D. degrees in engineering sciences from Harvard University, Cambridge, MA, in 1998 and 2001, respectively.
He is currently with the Memory Research and Development Division of Hynix Semiconductor Inc., Ichon-si, Kyongki-Do, Korea, working on high-speed dynamic memories with a particular interest in low-voltage/power circuits and architectures.

Jae-Kyung Wee was born in Seoul, Korea, in 1966. He received the B.S. degree in physics from Yonsei University, Seoul, in 1988 and the M.S. degree from Seoul National University in 1990. In August 1998, he received the Ph.D. degree in electronics engineering on modeling and characterization of interconnects for high-speed and high-density circuits from Seoul National University.
In 1990, he joined Hyundai Electronic Company working on the process integration of 16 MDRAM and LOGIC devices. In 1996, he was engaged in the development of the manufacturable 0.35-μm CMOS logic technology for high-performance logic products at Hyundai Electronics. In August 1998, he became a Project Leader of the Antifuse Repair Circuit Development Team. From August 1999 to June 2000, he was a Project Leader of 1-G DDR SDRAM using 0.13-μm technology. Beginning in July 2000, he also worked on next-generation DRAM and its related systems. He is currently with the faculty of Hallym University, Chunchun-si, Kangwon-Do, Korea. His research interest is in the area of future DRAM architecture including high-speed DRAM with 200–400 MHz clock, interconnect modeling, charge pump, DLL, I/O, and module designs for high-speed chips. He holds several patents and is an author or co-author of several papers.

Joo Hwan Cho was born in Seoul, Korea, in 1968. He received the B.S. degree in electronic materials engineering from Kwang-Woon University, Seoul, in 1992.
He joined the Semiconductor Research and Development Center, Hynix Semiconductor Inc., Ichon-si, Kyungki-Do, Korea, in 1992. Since then, he has been working on DRAM design and failure analysis.

Pil Soo Lee was born in Seoul, Korea, in 1963. He received the B.S. and M.S. degrees from Inchon University, Korea, in 1990 and 1992, respectively.
In 1993, he joined KEC, Kumi, Korea, where he worked on power device design and analysis. In 1997, he joined Hynix Semiconductor Inc., Ichon-si, Kyungki-Do, Korea, where he has been working on signal integrity analysis of high-frequency devices, circuits, and boards.

Jin Hong Ahn was born in Busan, Korea, in 1958. He received the B.S. and M.S. degrees in electronic engineering from Seoul National University, Seoul, Korea, in 1982 and 1984, respectively.
He joined Gold-Star Semiconductor Company, Gumi, Korea, in 1984. From 1986 to 1990, he was involved in designing SRAMs and mask ROMs. In 1991, he moved to the DRAM design group, Gold-Star Electron Company, Seoul. From 1991 to 1998, he managed several generations of advanced DRAM design projects, including 64-Mb, 256-Mb, MML, and intelligent RAM. His interests in DRAM design include new DRAM architectures, next-generation DRAM circuit technologies, and low-cost DRAM design techniques. In 1999, he joined the Memory Research and Development Group, Hynix Semiconductor Inc., Ichon-si, Korea, where he was engaged in the development of 0.15-μm 256-M DRAM. He is currently a Technical Director in DRAM Design technology.

Jin Yong Chung received the B.S.E.E. degree from Seoul National University, Seoul, Korea, in 1974 and the M.S.E.E. degree from Korea Advanced Institute of Science and Technology, Taejon, Korea, in 1976.
From 1976 to 1978, he worked for Korea Semiconductor Inc., which later became Semiconductor Business Unit of Samsung Electronics, where he was involved in the design of timepieces and custom CMOS chip designs. Since 1979, he was involved in memory design area and worked for various companies including National Semiconductor, Synertek, Vitelic, developing CMOS SRAMs, 4 K to 64 K and mask ROMs and CMOS DRAMs. In 1987, he joined LG Semiconductor, Korea, where he developed 256 K to 16 M DRAMs and other standard logic products. In 1992, he joined Mosel-Vitelic, where he developed high-speed DRAMs and the 256 K × 8 high-speed DRAM became the first semi-standard DRAM, which helped the company to go public. Since 1996, he has worked for Hynix Semiconductor Inc., Ichon-si, Kyoungki-Do, Korea, as a Senior Vice President and Chief Architect in the Memory Research and Development Division. His current research interest is in development of ultrahigh-speed, super low-voltage and low-power memory products, novel device research in ferroelectric and magnetic memories, and new-generation 3-D devices.
IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 35, NO. 8, AUGUST 2000 1137
Abstract: This paper presents a salient analog phase-locked loop (PLL) that adaptively controls the loop bandwidth according to the locking status and the phase error amount. When the phase error is large, such as in the locking mode, the PLL increases the loop bandwidth and achieves fast locking. On the other hand, when the phase error is small, this PLL decreases the loop bandwidth and minimizes output jitters. Based on an analog recursive bandwidth control algorithm, the PLL achieves the phase and frequency lock in less than 30 clock cycles without pre-training, and maintains the cycle-to-cycle jitter within 20 ps (peak-to-peak) in the tracking mode. A feed forward-type duty-cycle corrector is designed to keep the 50% duty cycle ratio over all operating frequency range.

Index Terms: Adaptive bandwidth PLL, analog implementation, clock recovery, fast locking time, frequency hopping, gear-shifting algorithm, low jitter, phase-locked loops, time-varying channel.

I. INTRODUCTION

respond properly to unpredictable phase fluctuation, instant frequency shift, and time-varying jitter because the sequence was calculated with preknown fixed noise statistics.

Discrete-time PLL’s, which are programmed on DSP processors, based on a recursive least squared (RLS) algorithm [5] or the Kalman filter algorithm [6] can respond to such unpredictable jitter variations, but require enormous amount of hardware. The outputs generated from the discrete-time PLL’s are in a digital domain, and therefore the discrete-time PLL’s require digital-to-analog converters (DAC) and an analog-to-digital converter (ADC) to sample input signals for detection. Slow signal-processing speed of the digital-to-analog conversion in the discrete-time PLL’s limits the operating frequency and confines the use of the PLL’s to the applications dealing with low-frequency signals like digital wireless base stations.

This paper presents a new analog adaptive PLL (AAPLL) architecture capable of varying the loop bandwidth according to an adaptively updated control sequence under a time-varying
Fig. 9. Feed forward-type duty-cycle corrector. (a) Duty-cycle corrector schematic. (b) Conceptual diagram of the correcting operation.

, and charges the output node of the duty-cycle corrector almost instantaneously, because the discharge path of the node is already off due to the signal . The signal , which is also selected from the multiphase signals, is the one whose rising edge is shifted by 180° in phase from that of . Similarly, the signal rapidly discharges the node and delivers the desired 50% duty-cycle signal. Since this duty-cycle correction circuit consists of only two transmission gates and two inverters, the silicon area is minimal and the power consumption is negligible. In HSPICE simulation, the proposed duty-cycle corrector keeps the output duty cycle almost perfectly at 50% with the input duty cycles varying from 10 to 90%.

IV. ANALYSIS AND SIMULATION

In this section, the stability of the AAPLL is analyzed for the adaptively generated loop sequence, and behavioral simulation results for fast lock and large jitter reduction are described.

A. Stability

Since the AAPLL automatically changes the loop bandwidth, a careful loop stability analysis is required. As mentioned in the previous section, an analog adaptive controller adjusts the phase-detector gain of the CP-PLL. Therefore, stability checking for the PLL for each different phase gain should be accomplished first. A complete stability analysis for the CP-PLL is cumbersome because a PLL operates in both a linear and a nonlinear region. A simplified stability analysis for a second-order CP-PLL [8] is used in this section. When the criterion is extended to include the logic delay effect, it can be expressed as

(4)

Here, , , and are the clock period, the logic delay, and the RC time constant of a loop filter, respectively. The stability limit for the loop gain of the AAPLL is derived and simulated using this criterion as shown in Fig. 10. The adaptively generated loop gain sequence by the recursive equation is also shown in the same figure to verify the AAPLL stability. The sequence converges to the minimum bandwidth and the amplitude of this bandwidth is almost similar to that derived from the MMSE criterion [4]. Equation (4) can be written to obtain the stability criterion for the bandwidth voltage by solving a MOS I-V
(5)
B. Output Jitter

Recently, it was reported that a CP-PLL has an optimum loop bandwidth that generates minimum jitter in the steady state [11]. A clean tone, that is assumed to have only noise floor and no random walking phase noise, is used as a reference signal for the jitter derivation. Because the AAPLL eventually achieves the steady state locking with a clean reference signal like other conventional PLL’s, the output cycle-to-cycle jitter of the AAPLL can be calculated by

(6)

Here, , , , and are the internal jitter from the VCO, the jitter of the input signal, the rms value of the charge-pump current variation, and the rms value of VCO control voltage noise in the steady state, respectively.

C. Behavioral Simulation of a Locking Feature

Closed-form analysis of locking behaviors for the AAPLL is difficult because of its nonlinear operation. In this paper, a simulation-based approach like the Monte Carlo Method is used instead. The AAPLL is modeled in a SPICE circuit simulator and extensively tested by the circuit simulator. Fig. 12 shows the simulation setup for the AAPLL. Fig. 13 compares the simulated locking behavior of the AAPLL with that of a conventional PLL. The bandwidths of the conventional PLL are selected to have two typical values. One is optimized for the initial locking, and the other for the steady-state tracking. The gray line in Fig. 13(a) indicates an incoming signal in the phase domain. The solid line in the figures shows the phase of the AAPLL output signal from initial locking to steady-state tracking. The phase variation of the conventional PLL optimized for steady-state tracking with a narrow bandwidth is shown in the same figure as a dashed line.

Fig. 13. Simulation results for the locking behavior of the AAPLL and conventional ones. (a) Fixed narrow-bandwidth PLL. (b) Fixed wide-bandwidth PLL.
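The trade-off compared in Fig. 13 can be illustrated with a toy phase-domain model. This is only a sketch under stated assumptions, not the authors' SPICE setup: it assumes a first-order loop, a noise-free phase step, and a recursive gain update of the form K[n+1] = forget * K[n] + prop * |e[n]|, clipped to [k_min, k_max], which mirrors the forgetting-factor and proportional-coefficient structure described in the Appendix. The function name, parameter names, and all numeric values are hypothetical.

```python
def simulate_lock(adaptive, k_min=0.05, k_max=0.8, forget=0.5, prop=1.0,
                  phase_step=1.0, tol=0.01, max_cycles=500):
    """Toy first-order phase loop; returns (cycles_to_lock, final_gain).

    adaptive=False models a fixed narrow-bandwidth loop (gain stays at k_min).
    adaptive=True raises the gain while the phase error is large and lets it
    decay back toward k_min once the error becomes small.
    """
    theta_out = 0.0          # output phase, arbitrary units
    gain = k_min             # loop gain, standing in for loop bandwidth
    lock_cycle = None
    for n in range(max_cycles):
        err = phase_step - theta_out            # phase-detector output
        if lock_cycle is None and abs(err) < tol:
            lock_cycle = n                      # first cycle inside tolerance
        theta_out += gain * err                 # first-order phase update
        if adaptive:
            # Assumed recursive update: forgetting term plus error-driven term.
            gain = min(k_max, max(k_min, forget * gain + prop * abs(err)))
    return lock_cycle, gain


if __name__ == "__main__":
    print("adaptive:", simulate_lock(adaptive=True))
    print("fixed narrow:", simulate_lock(adaptive=False))
```

In this toy model the adaptive loop reaches the tolerance band in far fewer cycles than the fixed narrow-bandwidth loop and then settles back to the same small gain, which is the qualitative behavior the text attributes to the AAPLL. The model says nothing about the jitter figures reported in the paper.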
LEE & KIM: LOW-NOISE FAST-LOCK PHASE-LOCKED LOOP 1143
Fig. 15. Control voltage change for a 0–250-MHz frequency input.
Fig. 16. Loop bandwidth voltage change for a 0–250-MHz frequency input.
Fig. 17. Experimental results of the locking for a 150–200-MHz input signal.
Fig. 18. Experimental results of the locking for a 180–220-MHz input signal by four steps.
Fig. 20. 50% duty-cycle correction operation over the entire frequency range.
Fig. 22. Comparison between recently reported PLL’s and DLL’s and this work.
TABLE I. AAPLL CHARACTERISTICS SUMMARY
using analog technique and hence the required die size and the power consumption are minimal.

APPENDIX

As shown in Fig. 2, the OR gate gives the control signal for the switch according to the phase error signal in the bandwidth controller. When the phase error of the signal is high, the controller signal from the OR gate feeds current to the bandwidth capacitor and the voltage across the capacitor increases at a constant rate . As a result, the bandwidth voltage increases proportional to the normalized phase error . After the charging process, the controller signal from the OR gate disconnects the path from the current source and connects to the resistor. So the capacitor discharges through the resistor . The switching action occurs every clock cycle period.

The voltage of the bandwidth capacitor at time can be written as

(7)

where is the voltage of the previous capacitor voltage at time . The voltage equation can be simplified to (8).

(8)

Here , . In the initial locking mode, the AAPLL does the locking operation based on (8). Once the AAPLL has finished the phase and frequency locking, the phase error is far less than . In this case, the forgetting factor and the proportional coefficient can be replaced by and .

REFERENCES

[1] J. Dunning et al., “An all-digital phase-locked loop with 50-cycle lock time suitable for high-performance microprocessors,” IEEE J. Solid-State Circuits, vol. 30, pp. 412–422, Apr. 1995.
[2] B. Kim, D. N. Helman, and P. R. Gray, “A 30-MHz hybrid analog/digital clock recovery circuit in 2-μm CMOS,” IEEE J. Solid-State Circuits, vol. 25, pp. 1385–1394, Dec. 1990.
[3] M. Mizuno et al., “A 0.18 μm CMOS hot-standby phase-locked loop using a noise immune adaptive-gain voltage-controlled oscillator,” ISSCC Dig. Tech. Papers, pp. 268–269, Feb. 1995.
[4] G. Roh, Y. Lee, and B. Kim, “An optimum phase-acquisition technique for charge-pump phase-locked loops,” IEEE Trans. Circuit Syst. II, vol. 44, pp. 729–740, Sept. 1997.
[5] B. Chun, Y. Lee, and B. Kim, “Design of variable loop gain of dual-loop DPLL,” IEEE Trans. Commun., vol. 45, pp. 1520–1522, Dec. 1997.
[6] P. F. Driessen, “DPLL bit synchronizer with rapid acquisition using adaptive Kalman filtering techniques,” IEEE Trans. Commun., vol. 42, pp. 2673–2675, Sept. 1994.
[7] B. Kim, “Dual-loop DPLL gear-shifting algorithm for fast synchronization,” IEEE Trans. Circuits Syst. II, vol. 44, pp. 577–586, July 1997.
[8] S. Haykin, Adaptive Filter Theory. Englewood Cliffs, NJ: Prentice Hall, 1995.
[9] T. C. Weigandt, B. Kim, and P. R. Gray, “Analysis of timing jitter in CMOS ring oscillators,” in Proc. Int. Symp. Circuit and Systems, vol. 4, London, U.K., June 1994, pp. 27–30.
[10] C. H. Park and B. Kim, “A low-noise 900-MHz VCO in 0.6-μm CMOS,” IEEE J. Solid-State Circuits, vol. 34, pp. 586–591, May 1999.
[11] K. Lim, C. H. Park, and B. Kim, “Low noise clock synthesizer design using optimal bandwidth,” in Proc. Int. Symp. Circuit and Systems, Monterey, CA, June 1998, pp. 163–166.
[12] J. Lee and B. Kim, “A 250 MHz low jitter adaptive bandwidth PLL,” ISSCC Dig. Tech. Papers, pp. 346–347, Feb. 1999.
[13] J. McNeil, “Jitter in ring oscillators,” IEEE J. Solid-State Circuits, vol. 32, pp. 870–879, June 1997.

Joonsuk Lee (S’99) received the B.S. and M.S. degrees in electrical engineering and computer sciences from Korea Advanced Institute of Science and Technology (KAIST), Taejon, Korea, in 1995 and 1997, respectively. Since 1997 he has been working toward the Ph.D. degree at the same university.
From 1999 to 2000, he was with IBM Microelectronics, Boston, MA, as an Analog and Mixed Signal Designer involved in a high performance sigma–delta ADC/DAC project with Motorola, Lowell, MA. His research interests include PLL/DLL, timing recovery algorithms, high-speed SDRAM interface, and LAN and mixed-mode signal processing technique for telecommunication IC’s.
Mr. Lee is the Gold Medal winner of the Human-Tech Thesis Prize from Samsung Electronics Co. Ltd. in 1997, the Gold Medal winner of the Chip Design Contest from LG Semicon Co. Ltd. in 1998, and the Gold Medal winner of the Integrated Design Center (IDEC) Award in 1998.

Beomsup Kim (S’87–M’90–SM’95) received the B.S. and M.S. degrees in electronic engineering from Seoul National University, Seoul, Korea, in 1983 and 1985, respectively, and the Ph.D. degree in electrical engineering and computer sciences from the University of California, Berkeley, in 1990.
From 1986 to 1990, he worked as a Graduate Researcher and Graduate Instructor at Department of Electrical Engineering and Computer Sciences, University of California, Berkeley. From 1990 to 1991, he was with Chips and Technologies, Inc., San Jose, CA, where he was involved in designing high speed-signal processing IC’s for disk drive read/write channels. From 1991 to 1993, he was with Philips Research, Palo Alto, CA, where he was conducting research on digital signal processing for video, wireless communication, and disk drive applications. During 1994, he was a Consultant, developing the partial-response maximum likelihood detection scheme of the disk drive read/write channel. In 1994, he became an Assistant Professor with the Department of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST), Taejon, Korea, and is currently an Associate Professor. During 1999, he took a sabbatical leave and stayed at Stanford University, Stanford, CA, and also consulted for Marvell Semiconductor Inc., San Jose, CA, on the Gigabit Ethernet and wireless LAN DSP architecture. His research interests include mixed-mode signal processing IC design for telecommunications, disk drive, local area network, high-speed analog IC design, and VLSI system design.
Dr. Kim is a corecipient of the Best Paper Award (1990–1991) for the IEEE JOURNAL OF SOLID-STATE CIRCUITS, and received the Philips Employee Reward in 1992. Between June 1993 and June 1995, he served as an Associate Editor for the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: ANALOG AND DIGITAL SIGNAL PROCESSING.
632 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 34, NO. 5, MAY 1999
Abstract: A digital delay-locked loop (DLL) that achieves infinite phase range and 40-ps worst case phase resolution at 400 MHz was developed in a 3.3-V, 0.4-μm standard CMOS process. The DLL uses dual delay lines with an end-of-cycle detector, phase blenders, and duty-cycle correcting multiplexers. This more easily process-portable DLL achieves jitter performance comparable to a more complex analog DLL when placed into identical high-speed interface circuits fabricated on the same test-chip die. At 400 MHz, the digital DLL provides <250 ps peak-to-peak long-term jitter at 3.3 V and operates down to 1.7 V, where it dissipates 60 mW. The DLL occupies 0.96 mm².

Index Terms: Delay circuits, delay-locked loops (DLL’s), digital control, digital DLL, phase blending, phase control, phase synchronization.

I. INTRODUCTION

IN RECENT years, there has been a great deal of interest in delay-locked loops (DLL’s) for clock alignment. Both analog and digital DLL’s have been developed [1]–[6], with analog loops generally providing better jitter performance at the expense of greater complexity. This paper describes a digital DLL that achieves jitter performance comparable to an analog DLL. Although the digital DLL uses more area and power than the analog DLL, its greater simplicity, easier portability, and lower minimum required supply voltage make it very attractive in many clock alignment applications. Additionally, the digital DLL not only operates at lower supply voltages than the analog DLL but it also demonstrates that digital DLL’s have the potential for good power-consumption scaling as supply voltage is decreased.

The motivation for the development of this digital DLL was the need for a clock alignment circuit for use in the CMOS interface cells [6] of a high-speed memory system as in [7].¹ The memory system operates at 400 MHz, with data transferred on both edges of the clock, producing an effective 800-Mb/s/pin transfer rate. This corresponds to a 1.25-ns bit time. With such tight timing requirements, it becomes imperative to include clock alignment circuits in the interface cells to provide internal on-chip clocks that are aligned in phase with an external system clock. The clock alignment circuits must provide a phase resolution better than 50 ps and produce a worst case long-term jitter of less than 250 ps peak-to-peak (p–p). To facilitate the use of many different application-specific integrated-circuit controllers with the memory system, the clock alignment circuit should be easily portable across multiple processes without compromising performance.

The clock alignment function can be provided using either phase-locked loops (PLL’s) or DLL’s. Because frequency synthesis is not needed in this application, DLL’s are preferred for their unconditional stability, lower phase-error accumulation, and faster locking time. In previous designs of the interface cells for this memory system, we have used an analog DLL with a two-step coarse/fine architecture. A high-level drawing of this approach is shown in Fig. 1. This analog DLL includes a quadrature generator, which produces four reference signals spaced 90° apart in phase to evenly cover the full 360° of phase space. A phase interpolator circuit in the analog DLL receives these reference signals and selects a phase adjacent pair that define a phase quadrant for interpolation to produce an output signal phase-aligned to a reference signal, RefClk.

Analog DLL’s constructed with this approach provide several significant benefits. Because most of the elements in the signal path can be made from differential analog blocks with good power-supply rejection ratio (PSRR), the analog DLL architecture of Fig. 1 can provide very good jitter performance. Additionally, it can be carefully designed to occupy relatively little area and consume relatively little current. Furthermore, the analog DLL can provide very small phase steps when locked (<50 ps). Finally, the architecture of Fig. 1 provides infinite phase range, and one set of quadrature reference signals can be fed to multiple phase interpolators, allowing phase alignment to multiple reference signals simultaneously. However, because of the relatively high analog complexity of this DLL and its individual elements, the analog DLL of Fig. 1 requires a detailed, process-specific implementation, making it relatively labor intensive to port across multiple processes.

Although we have traditionally used analog DLL’s to provide the clock alignment function in the CMOS interface cells of the memory system described above, we decided to consider using a digital DLL. Digital DLL’s are characterized by their use of a digital delay line and are typically made from

Manuscript received September 15, 1998; revised December 23, 1998.
B. W. Garlepp, K. S. Donnelly, J. Kim, P. S. Chau, J. L. Zerbe, C. Huang, C. V. Tran, C. L. Portmann, D. Stark, and Y.-F. Chan are with Rambus, Inc., Mountain View, CA 94040 USA.
T. H. Lee and M. A. Horowitz are with the Center for Integrated Systems, Stanford University, Stanford, CA 94305 USA.
Publisher Item Identifier S 0018-9200(99)03668-9.
¹ Documentation is available at http://www.rambus.com/html/direct_documentation.html.
0018–9200/99$10.00 © 1999 IEEE
GARLEPP et al.: PORTABLE DIGITAL DLL 633
simple, digital circuit elements. This facilitates their design and portability across multiple processes. Additionally, because phase information in a digital DLL is stored as a digital state, digital DLL’s can provide very fast timing recovery after being placed into a low power mode. However, conventional digital DLL’s provide only moderate phase resolution and jitter performance [8], [9].

Another benefit of digital DLL’s is their ability to readily operate at lower voltages than analog DLL’s. Because analog DLL’s require the use of saturated current sources, they experience voltage headroom problems as supply voltages decrease. Digital DLL’s, on the other hand, need only enough voltage to ensure the proper operation of their digital gate elements. For the same reason, digital DLL’s better utilize the power-saving benefits of digital CMOS voltage scaling than analog DLL’s. The power of an analog DLL is typically distributed between IV power (where I is current and V is voltage) from the constant current (differential) stages and CV²f power (where C is capacitance and f is frequency) from the CMOS (single-ended) stages (if any). The power of digital DLL’s, on the other hand, is determined primarily by CV²f power, which decreases quadratically with supply voltage.

This paper describes a digital DLL [10] used as the clock alignment circuit in the CMOS interface cells of a high-speed memory system. This work improves upon the performance of previous digital DLL’s by paralleling the two-step coarse/fine analog DLL architectures presented in [4], [5], [7], and [11], allowing the digital DLL to achieve jitter performance comparable to the analog DLL’s.

This paper is arranged as follows. Section II describes delay-generation techniques used in conventional digital DLL’s and describes the improved techniques implemented in the new DLL. This section also describes infinite phase generation with the new delay-line scheme. Section III describes several new circuit techniques used for enhancing the phase resolution and signal quality in the new digital DLL. Section IV describes the overall DLL architecture. Section V discusses our test chip and measured results, with special attention given to making a direct, side-by-side comparison of the new digital DLL with an analog DLL placed into identical CMOS interface cells on the same test-chip die. Section VI concludes this paper.

The terms phase and delay are used throughout this paper to describe the DLL’s operation. It is helpful to recall that at a given system frequency, the two quantities are related by the simple equation

$$\phi = 360^\circ \cdot t_d \cdot f \qquad (1)$$

where $\phi$ is phase in degrees, $t_d$ is delay in seconds, and $f$ is frequency in hertz.

II. DIGITAL DELAY CIRCUIT TECHNIQUES

A. Conventional Digital Delay Lines

As mentioned above, the purpose of a DLL in a clock alignment application is to provide an output clock signal that is aligned in phase with a reference clock signal of the same frequency. To do this, the DLL must include a mechanism for providing a variable delay to an input signal. The DLL then adjusts this variable delay such that the input signal passes through the delay mechanism and emerges at the output of the DLL aligned in phase with the reference signal.

Digital DLL’s generally incorporate a tapped digital delay line as the variable-delay mechanism. The delay line receives an input clock signal (e.g., a buffered version of the reference signal) and passes it through a series of delay elements. The outputs of the delay elements are tapped and buffered to provide a series of phase-adjacent signals. The DLL then selects the delay-line tap that provides the signal that produces an output with a phase that most closely matches the desired phase.

A conventional delay line suitable for a CMOS digital DLL is shown in Fig. 2. The delay elements could be implemented with almost any circuit block, but because the phase resolution of the delay line is determined by the delay through the delay elements, delay elements that provide minimal delay are generally preferred. Thus, the delay line of Fig. 2 uses inverters, since they provide the shortest delay of any CMOS digital gate. Because of the inverting characteristic of all standard CMOS gates, the delay line is tapped only at every other inverter
Fig. 3. Complementary delay line with inverter delay elements for improved phase resolution.
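As a rough numerical companion to Eq. (1) and to the resolution gain named in the caption of Fig. 3, the sketch below converts a delay into phase and compares the tap-to-tap phase step of a line tapped at every other inverter with that of a line tapped after every inverter. The function names and the 125-ps inverter delay are hypothetical illustration values, not figures from the paper; only the 400-MHz clock and the 1.25-ns bit time come from the text.

```python
def delay_to_phase_deg(delay_s, freq_hz):
    """Eq. (1): phase in degrees for a delay in seconds at a frequency in hertz."""
    return 360.0 * delay_s * freq_hz


def phase_step_deg(inverter_delay_s, freq_hz, complementary):
    """Phase step between adjacent taps of an inverter delay line.

    A conventional line is tapped at every other inverter, so one step spans
    two inverter delays. The complementary line is assumed here to be tapped
    after every inverter, so one step spans a single inverter delay.
    """
    inverters_per_step = 1 if complementary else 2
    return delay_to_phase_deg(inverters_per_step * inverter_delay_s, freq_hz)


if __name__ == "__main__":
    f_clk = 400e6        # system clock from the text
    t_inv = 125e-12      # hypothetical inverter delay
    print(delay_to_phase_deg(1.25e-9, f_clk))                 # one bit time
    print(phase_step_deg(t_inv, f_clk, complementary=False))  # conventional
    print(phase_step_deg(t_inv, f_clk, complementary=True))   # complementary
```

Because the step scales with both inverter delay and clock frequency, the same delay line gives a different phase resolution at each PVT corner and operating frequency.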
output to ensure that each successive tap provides a signal that is adjacent in phase to the signals at its adjacent taps.

Although conventional delay lines are attractive for their simplicity, DLL’s designed around such conventional delay lines suffer from several significant limitations. First, the delay line provides fairly coarse phase resolution. For example, the delay line in Fig. 2 provides a minimum phase step corresponding to two inverter delays. Such coarse phase resolution is not fine enough for our clock alignment application. Second, conventional delay lines deliver only a finite phase range. Typically, in order to cover at least one full cycle of phase, the delay-line length and element delays are adjusted to provide at least 360° of phase under the fastest process, voltage, and temperature (PVT) conditions and minimum operating frequency. More often, however, the delay line is designed with as much as 720° (i.e., two cycles) of phase under these conditions. This requires the use of a long delay line, occupying a large silicon area and dissipating additional power as the input signal propagates through the many delay elements. Additionally, because inverters offer poor PSRR, voltage supply noise-induced jitter can accumulate as the signal propagates down the delay line. This causes the signals available from the later taps in the delay line to be more jitter prone than the signals from the earlier taps. Last, even with an extended delay line, the DLL can nonetheless run out of phase range and lose lock in a system with slowly drifting phase (e.g., spread-spectrum clocking). These limitations prohibited the use of a conventional delay line in our DLL design.

B. Delay-Line Improvements

To overcome some of these limitations, we developed a complementary delay line as shown in Fig. 3 for our DLL. In this architecture, two parallel delay lines with weak cross coupling are driven by complementary input signals ClkIn and ClkInb. Because of the use of complementary inputs, the two delay lines are tapped after every inverter to provide phase-adjacent signals separated by only one inverter delay, thereby improving the phase resolution by a factor of two. An example of how this delay-line scheme provides single inverter delay resolution is shown by the shaded paths in Fig. 3. The signal that emerges from Tap 2 has passed through three inverter delays, while the signal that emerges from Tap 3 has passed through four inverter delays. However, ClkInb is exactly 180° out of phase with ClkIn, providing the additional inversion required to ensure that the signals emerging from Taps 2 and 3 are indeed separated in phase by exactly one inverter delay.

This complementary delay-line architecture also allows the delay lines to be made shorter. The true taps from the delay line can provide the first 180° of phase, while the complement taps can provide the second 180° of phase. Thus, each of the two delay lines can be tuned for only 180° of phase under the fastest PVT conditions and minimum operating frequency. Shorter delay lines provide the additional benefits of reduced maximum jitter accumulation, smaller silicon area, and lower power consumption. The problem that this design creates is a need to determine when to switch from the true taps to the complement taps and vice versa to ensure full and even coverage of the entire 360° phase plane. This is particularly important because the number of delay elements (and output taps) needed to cover 180° changes with PVT conditions and operating frequency.

C. Infinite Phase Generation

To solve the problem of determining when to switch between the true and complement taps of the complementary delay line, we developed an end-of-cycle (EOC) detector, as shown in Fig. 4, for use with the complementary delay line. An EOC detector is essentially a bank of data flip-flops arranged as a time-to-digital converter for measuring the delay through the delay line. The EOC detector produces a thermometer code
indicating the first 180° of delay in the delay lines. The first state transition in the EOC code indicates the first true tap from the delay line that provides a signal with phase that lags the phase of the signal from Tap 1 by more than 180°. With this information, the DLL logic knows when to switch between the true and complement taps of the delay line to ensure full coverage of all 360° of phase space, with phase steps of at most one inverter delay. Use of the EOC code also prevents negative phase steps in the phase-transfer function as taps are successively selected from the delay line. This allows the complementary delay lines to provide infinite, monotonic phase range for the DLL. The clocking signal for the EOC detector, SampClk, is synchronized to the signal from Tap 1 by a replica timing network (not shown).

Fig. 5. Phasor diagram with phasors of signals from the taps of a complementary delay line with one inverter delay = 50°.

To illustrate the principle of infinite phase generation using the EOC code with this delay-line scheme, refer to Fig. 5, which shows a phasor diagram of the signals from the first five true and complement taps of a complementary delay line like the one shown in Fig. 3. The figure assumes that the PVT conditions and operating frequency are such that the propagation delay of each inverter stage is equal to 50° of phase. In the figure, the solid lines correspond to signals from the true taps, while dashed lines correspond to signals from the complement taps. Because Tap 5 delivers a signal that is delayed by 200° from the signal at Tap 1, the EOC detector’s thermometer code would indicate that Tap 5 is the first true tap to provide a signal with phase beyond 180° relative to the signal from Tap 1. With this information, the DLL knows to switch between the true and complement taps after four stages.

A. Phase Blending

Although the delay-line improvements discussed above reduced the required power and area of the delay line, improved its jitter accumulation performance, enabled infinite phase range, and improved the available phase resolution by a factor of two, this phase resolution was still not good enough to meet the requirements of our memory system. In the 0.4-μm process we used, the propagation delay of one inverter over all anticipated PVT conditions varied from 100 to 300 ps. This is much larger than the worst case phase step specification of 50 ps. Therefore, to ensure compliance with this specification, the DLL’s phase resolution needed to be improved by at least six times over what the delay line provided.

To solve this problem, we used inverter phase blending. A simple, single-stage phase-blender circuit is shown in Fig. 6(a). This circuit receives two phase-adjacent input signals, and , which are separated in phase by one inverter delay. The phase blender directly passes these two signals with a simple delay to produce output signals and . However, it also uses a pair of phase-blending inverters to interpolate between these two input signals to produce a third output signal, , having a phase between that of and . This effectively doubles the available phase resolution.

However, it is not sufficient to use equal-sized inverters for the phase blending. Fig. 6(b) illustrates a simple model [12] used for determining the ideal relative sizes of the two phase-blending inverters to ensure that the phase of lies directly between that of and . The model approximates the two inverters with two simple switched current sources sharing a common resistance–capacitance (RC) load. For two rising edge input signals separated in time by the model yields the equation

(2)
636 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 34, NO. 5, MAY 1999
where is the total resistive load, is the output capacitance, is the total pulldown current of the two phase-blending inverters, is the unit step function, and is the phase-blending inverter relative size ratio [refer to Fig. 6(a), where is the ratio of the device widths in inverter to the total device widths in both inverters and ]. Equation (2) is the sum of two decaying exponential terms, and Fig. 6(c) shows a plot of the resulting waveform according to this equation for the case where . Because the second exponential term is delayed in time by relative to the first, it only begins to affect the slope of the decay after this delay has elapsed. Therefore, without explicitly solving the equation for each case of and , it is not obvious when will cross .
For input signals separated in phase by one inverter delay (i.e., ), the model specifies that in order to ensure that the phase of lies directly in between that of and , the phase-blending inverters must be sized in a ratio, such that the leading phase is coupled to an inverter that is bigger than the one that receives the lagging phase. This ratio was also confirmed empirically with simulations. The effect of the relative sizing of the phase-blending inverters is illustrated in Fig. 6(d) and (e), which shows the resulting output signal edges for and , respectively. Clearly, the phase of output signal is closer to that of than to that of when the phase-blending inverter size ratio is . Although asymmetrical inverter sizing ensures good, evenly spaced edge placement of the three output signals, it requires that lead . Reversing the phase of these two input signals would result in a severely misplaced since the effective sizing ratio would then be .
Another design constraint of the phase-blender circuit is that all paths through the circuit must provide precisely the same loading and delay to ensure that the phase relationship between and is maintained by and .
The phase-blender idea can be extended to multiple cascaded stages for further phase-resolution improvement, with each additional stage improving the resolution by a factor of two. Fig. 7 shows a two-stage cascaded phase-blender circuit that provides a 4x improvement in phase resolution from input to output. Although it is theoretically possible to increase phase resolution indefinitely by adding more and more phase-blender stages, there is a practical limit. The number of inverters in each signal path increases by two with each additional phase-blending stage, making the circuit increasingly susceptible to voltage supply noise-induced jitter due to the additional delay in the signal path. Therefore, it is prudent to increase the number of blending stages to improve phase resolution only until the output phase step size from the phase blender is approximately equivalent to the anticipated voltage supply noise-induced jitter.
There are several design limitations that must be considered when designing a cascaded phase blender. First, the importance of proper (asymmetrical) sizing of the phase-blending inverters grows with the number of cascaded blending stages because edge misplacement has a compounding effect as the signals travel through the multiple stages. Additionally, close attention must be paid to ensuring equal loading for equal delay through all paths, requiring the use of dummy devices on otherwise unbalanced paths. Finally, like a single-stage phase blender, a cascaded phase blender also requires the phase of to lead that of to ensure even output phase spacing.
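The switched-current RC model of Fig. 6(b) can be explored numerically. The sketch below assumes one plausible normalized form of that model, v(t) = 1 - w u(t)(1 - exp(-t/RC)) - (1 - w) u(t - dt)(1 - exp(-(t - dt)/RC)), with the output swing normalized to 1 and a 50% switching threshold. This form, the choice dt = RC, and the names `crossing_time` and `ideal_ratio` are illustrative assumptions and are not taken from the paper. The code searches for the size ratio w that places the blended edge midway between the edges produced by the leading and lagging inputs alone.

```python
import math


def crossing_time(w, dt, rc=1.0, threshold=0.5):
    """Time at which the normalized blended output falls to `threshold`.

    Assumed model: two switched current sources with weights w and 1 - w
    discharge a shared RC load; the second source turns on dt later.
    """
    def v(t):
        drop = w * (1.0 - math.exp(-t / rc))
        if t >= dt:
            drop += (1.0 - w) * (1.0 - math.exp(-(t - dt) / rc))
        return 1.0 - drop

    lo, hi = 0.0, dt + 50.0 * rc  # v(lo) > threshold > v(hi)
    for _ in range(200):          # bisection on the monotonic decay
        mid = 0.5 * (lo + hi)
        if v(mid) > threshold:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def ideal_ratio(dt, rc=1.0):
    """Size ratio w that centers the blended edge between the two inputs."""
    early = crossing_time(1.0, dt, rc)  # leading input alone
    late = crossing_time(0.0, dt, rc)   # lagging input alone
    target = 0.5 * (early + late)
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if crossing_time(mid, dt, rc) > target:
            lo = mid  # edge too late: weight the leading input more
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(f"centered-edge ratio for dt = RC: {ideal_ratio(1.0):.3f}")
```

Under these assumptions the search returns w ≈ 0.62 for dt = RC, so the leading input needs the larger inverter, and equal sizing (w = 0.5) places the blended edge closer to the lagging edge. The exact ratio depends on dt/RC and on the assumed threshold.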
GARLEPP et al.: PORTABLE DIGITAL DLL 637
To overcome these design limitations of the cascaded phase blender, we developed a symmetrical phase blender. A block diagram of a three-stage symmetrical phase blender is shown in Fig. 8. This circuit is essentially two parallel cascaded phase-blender circuits, sharing some common paths. When leads , the outputs provide equal output phase spacing. When leads , the outputs provide equal output phase spacing. Therefore, the circuit provides phase blending with an 8x improvement in phase resolution and equally spaced output signals regardless of which input signal leads in phase.
Additionally, the symmetrical blender allows for seamless input switching for continuous phase blending over multiple input delays. For example, assume that leads in phase. Beginning with output , outputs can be successively selected to evenly span the phase range between and . Once is selected, can be changed to another signal that lags . This switching is possible without affecting the signal because has no dependence on or coupling from . Then outputs can be successively selected to evenly span the phase range between and . Once is selected, can be changed to yet another signal that lags . Again, this is possible without any change in the signal because has no dependence on or coupling from . This process can continue indefinitely. Also, because all paths through the symmetrical phase blender are inherently balanced, no dummy devices are needed.
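The leapfrogging input selection described above can be sketched as a small schedule generator. The tap indices, the eight blend positions per interval, and the function name `blend_schedule` are hypothetical choices made for illustration; phase is expressed in units of one inverter delay, and the blender is idealized as a linear interpolator.

```python
def blend_schedule(step, n=8):
    """Map a monotonically increasing step count to blender settings.

    Returns (a_tap, b_tap, sel, phase): the delay-line taps feeding inputs
    A and B, the selected blender output (0 = pure A, n = pure B), and the
    resulting phase in units of one inverter delay.
    """
    seg, pos = divmod(step, n)
    if seg % 2 == 0:
        # A leads B: sweep the selection from A toward B.
        a_tap, b_tap, sel = seg, seg + 1, pos
    else:
        # B leads A: A has been moved past B, so sweep back from B toward A.
        a_tap, b_tap, sel = seg + 1, seg, n - pos
    # Idealized linear blend of the two input phases.
    phase = (a_tap * (n - sel) + b_tap * sel) / n
    return a_tap, b_tap, sel, phase


if __name__ == "__main__":
    for step in range(0, 20):
        print(step, blend_schedule(step))
```

In this sketch an input is only ever re-pointed to a new tap while the selected output depends entirely on the other input (sel = n when A changes, sel = 0 when B changes), which is the condition the text gives for glitch-free switching.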
Fig. 9. (a) A 16 : 1 duty-cycle correcting multiplexer circuit. (b) Duty-cycle correction control circuit.
B. Signal Selection and Duty-Cycle Correction
Since the digital DLL was to be placed into a memory system that exchanges data on both edges of the clock, good duty cycle (i.e., close to 50%) is required to ensure that the data exchanged on either edge of the clock have equal bit times. Duty-cycle distortion is usually addressed in PLL’s by simply running the PLL’s voltage-controlled oscillator (VCO) at twice the system frequency and using a postdivider triggered on one edge of the VCO output to produce the output clock from the PLL [13]–[15]. This ensures good, 50% duty cycle. In a DLL, however, no frequency multiplication is possible. The duty cycle of the output signal must be directly corrected to 50%, for example, by using a duty-cycle correcting amplifier in the signal path as in Fig. 1 and in [4].
Although duty-cycle correction can be addressed by placing a duty-cycle corrector at the output of the DLL, this approach has several limitations. First, since duty cycle is corrected only at the output of the DLL, internal DLL signals may have poor duty cycle. It is good practice, however, to maintain 50% duty cycle throughout the signal path to maximize signal propagation as frequency is increased. Second, performing all the duty-cycle correction in one stage at the output of the DLL places a great deal of strain on the duty-cycle correcting circuit; it must have a large duty-cycle correction range to compensate for all the duty-cycle distortion that can accumulate in the signal path. Finally, adding a duty-cycle corrector directly into the signal path increases signal path delay, and thus susceptibility to voltage supply noise-induced jitter.
To address the issue of duty cycle, we developed the idea of duty-cycle correcting multiplexers. Since multiplexers would be needed in our DLL regardless, by adding duty-cycle correcting functionality to the multiplexing circuitry, we implemented duty-cycle correction while requiring minimal additional power, area, and delay.
A 16 : 1 duty-cycle correcting multiplexer is shown in Fig. 9(a) with a corresponding control circuit in Fig. 9(b). To facilitate understanding of this circuit’s operation, consider an example. Assume that signal is selected and has duty-cycle distortion such that output signal has a high duty cycle. Assume also that is sensed by a duty-cycle error detector, which produces a differential output error signal proportional to the difference in duty cycle between and the ideal 50%. Thus, in our example, will be greater than , causing more current to be steered through the right branch of the control signal in Fig. 9(b) than through the left side. This in turn increases the strength of and compared to and in the duty-cycle correcting multiplexer of Fig. 9(a). These transistors alter the duty cycle of the signal as it passes from to , driving to the ideal 50% duty cycle. The use of both PMOS and NMOS devices to perform the duty-cycle correction ensures a symmetrical duty-cycle correction range. Furthermore, because duty-cycle correction has been distributed through two stages, the requirements on each individual duty-cycle correcting stage are reduced. By combining both necessary functions of signal selection and duty-cycle correction, this circuit minimizes signal path delay, jitter accumulation, circuit area, and power compared to performing both functions separately.
IV. DLL ARCHITECTURE
Fig. 10 is a block diagram of the entire digital DLL, with shading indicating the circuit blocks that were described in
Fig. 12. Measured transmit eye diagrams at 3.3 V and 400 MHz of the high-speed interface cells with (a) the analog DLL of [6] and (b) the new digital DLL.
V. MEASURED PERFORMANCE
A. Test Chip
Both the digital DLL presented here and an implementation of the analog DLL of Donnelly et al. [6] were integrated into identical high-speed CMOS interface cells on opposite sides of a single test chip. A micrograph of this test chip is shown in Fig. 11. The test chip I/O was laid out symmetrically so that either interface cell could be tested on the same hardware by simply removing the test chip from the test socket, rotating it 180° and reinserting it into the socket. This allowed a true side-by-side comparison of the two DLL’s operating in a system. The test-chip circuits were fabricated using a standard 0.4-µm, 3.3-V CMOS process with 0.65-V threshold voltages.
B. Test Results
Unless indicated otherwise, all test results described in this section were measured with the analog and digital DLL’s operating in their respective high-speed interface cells at 3.3 V and 400 MHz (800 Mb/s/pin) using the same test vectors. Additionally, the test chip included noise-generator circuits, which produced digital switching noise during the testing of both interfaces.
Fig. 12(a) and (b) shows eye diagrams of the two interfaces with the analog and digital DLL’s, respectively. The diagrams indicate the output timing performance of the interface cells in the test system. Although the interface with the analog DLL provided slightly better timing performance, 320 ps p–p versus 380 ps p–p for the interface with the digital DLL, the performances of both interfaces (and therefore, both DLL’s) were comparable. This is surprisingly good considering the extensive use of poor PSRR elements, such as inverters, in the signal path of the digital DLL. (Note: I/O circuit duty-cycle distortion produced the unequal eyes in both diagrams. This is unrelated to the DLL’s.)
Fig. 13(a) and (b) shows receive shmoo diagrams for the two interfaces with the analog and digital DLL’s, respectively. The diagrams indicate the CMOS interfaces’ valid timing windows for receiving data. On the diagrams, the -axis is supply voltage (2.5 V to 4.0 V) while the -axis indicates input data positioning along a bit period ( Mb/s ns). The normal data position is in the center of the bit period. A black dot in the diagram indicates incorrectly received data for that combination of bit position and . Ideally, the window should be entirely white, but realistically, it is limited by jitter from the DLL and other sources. Therefore, this test measures the amount of tolerable skew on the input timing over a range of supply voltages. Although the interface with the analog DLL delivers better timing performance than the interface with the digital DLL (1.02 versus 0.92 ns), both meet the component specification of 0.85 ns.
Fig. 14 is a circle plot of the measured phase of the DLL’s output signal ClkOut, illustrating the DLL’s ability to provide infinite phase range. The -axis indicates delay [or phase, as in (1)] of the ClkOut signal relative to a fixed 400-MHz signal. The -axis indicates cycle count. These data were measured by probing the on-chip DLL output signal (ClkOut) and forcing the DLL’s phase-detector output low. This caused the DLL’s output phase to continually advance over time. The term circle plot is used because this diagram is equivalent to sweeping a phasor that represents the phase of ClkOut around the phase plane, thereby drawing a circle in the phase plane. Because the phase of ClkOut is measured relative to a fixed 400-MHz signal, the plotted delay appears modulo 2.5 ns, where ns
Fig. 13. Measured shmoo diagrams showing the 400-MHz receive timing windows of the high-speed interface cells with (a) the analog DLL of [6]
and (b) the new digital DLL.
Fig. 14. Measured circle plot illustrating the infinite phase transfer characteristic of the digital DLL.
at 400 MHz. The absolute value of delay (i.e., from 3.4 to 5.9 ns) is irrelevant since it includes some test-system setup time. The data were measured and plotted using a time-interval analyzer.
The circle plot illustrates the DLL’s phase transfer function, showing its reasonably good linearity, monotonicity, and lack of discontinuities. The small bumps in the transfer function indicate a change in coarse reference phase selected from the delay line. The slope of the transfer function depends on PVT conditions and system frequency, since these conditions determine how many delay-line taps are required to provide 180° of phase. In this case, nine taps were required, resulting in an average phase step size of 20 ps or 2.9°.
Table I presents a summary of many of the measured and simulated results of the analog and digital DLL’s operating in their respective CMOS interfaces. Although the analog DLL
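As a quick check of the step-size figures quoted for the circle plot, the delay-to-phase conversion can be written out in a few lines. This is only a sketch: it assumes the usual relation phase = 360° × delay × f, and the function names `delay_to_phase_deg` and `wrapped_delay` are hypothetical helpers introduced here.

```python
def delay_to_phase_deg(delay_s, f_hz):
    """Phase in degrees equivalent to `delay_s` at clock frequency `f_hz`."""
    return 360.0 * delay_s * f_hz


def wrapped_delay(delay_s, f_hz):
    """Delay modulo one clock period, as seen against a fixed-frequency reference."""
    period = 1.0 / f_hz
    return delay_s % period


if __name__ == "__main__":
    f = 400e6  # 400-MHz reference, period 2.5 ns
    print(f"20 ps at 400 MHz = {delay_to_phase_deg(20e-12, f):.2f} degrees")
    print(f"5.9 ns wraps to {wrapped_delay(5.9e-9, f) * 1e9:.2f} ns")
```

With a 2.5-ns period, a 20-ps step corresponds to 2.88°, which rounds to the 2.9° quoted in the text.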
Fig. 15. Measured DLL power consumption (a) as a function of frequency for VDD = 3.3 V and (b) as a function of supply voltage for f = 400 MHz.
[2] J.-M. Han, J. Lee, S. Yoon, S. Jeong, C. Park, I. Cho, S. Lee, and D. Seo, “Skew minimization techniques for 256 Mb synchronous DRAM and beyond,” in VLSI Circuits Dig. Tech. Papers, June 1996, pp. 192–193.
[3] A. Hatakeyama, H. Mochizuki, T. Aikawa, M. Takita, Y. Ishii, H. Tsuboi, S. Fujioka, S. Yamaguchi, M. Koga, Y. Serizawa, K. Nishimura, K. Kawabata, Y. Okajima, M. Kawano, H. Kojima, K. Mizutani, T. Anezaki, M. Hasegawa, and M. Taguchi, “A 256 Mb SDRAM using register-controlled digital DLL,” in ISSCC 1997 Dig. Tech. Papers, Feb. 1997, pp. 72–73.
[4] T. Lee, K. Donnelly, J. Ho, J. Zerbe, M. Johnson, and T. Ishikawa, “A 2.5 V CMOS delay-locked loop for 18 Mbit, 500 megabyte/s DRAM,” IEEE J. Solid-State Circuits, vol. 29, pp. 1491–1496, Dec. 1994.
[5] S. Sidiropoulos and M. Horowitz, “A semidigital dual delay-locked loop,” IEEE J. Solid-State Circuits, vol. 32, pp. 1683–1692, Nov. 1997.
[6] K. Donnelly, Y. Chan, J. Ho, C. Tran, S. Patel, B. Lau, J. Kim, P. Chau, C. Huang, J. Wei, L. Yu, R. Tarver, R. Kulkarni, D. Stark, and M. Johnson, “A 660MB/s interface megacell portable circuit in 0.3 µm–0.7 µm CMOS ASIC,” IEEE J. Solid-State Circuits, vol. 31, pp. 1995–2003, Dec. 1996.
[7] N. Kushiyama, S. Ohshima, D. Stark, H. Noji, K. Sakurai, S. Takase, T. Furuyama, R. Barth, A. Chan, J. Dillon, J. Gasbarro, M. Griffin, M. Horowitz, T. Lee, and V. Lee, “A 500-Megabyte/s data-rate 4.5M DRAM,” IEEE J. Solid-State Circuits, vol. 28, pp. 490–508, Apr. 1993.
[8] M. Hasegawa, M. Nakamura, S. Narui, S. Ohkuma, Y. Kawase, H. Endoh, S. Miyatake, T. Akiba, K. Kawakita, M. Yoshida, S. Yamada, T. Sekigguchi, I. Asano, Y. Tadaki, R. Nagai, S. Miyaoka, K. Kajigaya, M. Horiguchi, and Y. Nakagome, “A 256 Mb SDRAM with subthreshold leakage current suppression,” in ISSCC 1998 Dig. Tech. Papers, Feb. 1998, pp. 80–81.
[9] T. Saeki, Y. Nakaoka, M. Fujita, A. Tanaka, K. Nagata, K. Sakakibara, T. Matano, Y. Hoshino, K. Miyano, S. Isa, E. Kakehashi, J. Drynan, M. Komuro, T. Fukase, H. Iwasaki, J. Sekine, M. Igeta, N. Nakanishi, T. Itani, K. Yoshida, H. Yoshino, S. Hashimoto, T. Yoshii, M. Ichinose, T. Imura, M. Uziie, K. Koyama, Y. Fukuzo, and T. Okuda, “A 2.5 ns clock access 250 MHz 256 Mb SDRAM with synchronous mirror delay,” ISSCC 1996 Dig. Tech. Papers, Feb. 1996, pp. 374–375.
[10] B. Garlepp, K. Donnelly, J. Kim, P. Chau, J. Zerbe, C. Huang, C. Tran, C. Portmann, D. Stark, Y. Chan, T. Lee, and M. Horowitz, “A portable digital DLL architecture for CMOS interface circuits,” in VLSI Circuits Dig. Tech. Papers, June 1998, pp. 214–215.
[11] M. Griffin, J. Zerbe, A. Chan, Y. Jun, Y. Tanaka, W. Richardson, G. Tsang, M. Ching, C. Portmann, Y. Li, B. Stonecypher, L. Lai, K. Lee, V. Lee, D. Stark, H. Modarres, P. Batra, J. Louis-Chandran, J. Privitera, T. Thrush, B. Nickell, J. Yang, V. Hennon, and R. Sauve, “A process independent 800 MB/s DRAM bytewide interface featuring command interleaving and concurrent memory operation,” in ISSCC 1998 Dig. Tech. Papers, Feb. 1998, pp. 156–157.
[12] S. Sidiropoulos, “High-performance interchip signalling,” Ph.D. dissertation, Computer Systems Laboratory, Stanford University, Stanford, CA, Apr. 1998. Available as Tech. Rep. CSL-TR-98-760 from http://elib.stanford.edu/.
[13] I. Young, M. Mar, and B. Bhushan, “A 0.35 µm CMOS 3-880 MHz PLL N/2 multiplier and distribution network with low jitter for microprocessors,” in ISSCC 1997 Dig. Tech. Papers, Feb. 1997, pp. 330–331.
[14] V. von Kaenel, D. Aebischer, C. Piguet, and E. Dijkstra, “A 320 MHz, 1.5 mW at 1.35 V CMOS PLL for microprocessor clock generation,” in ISSCC 1996 Dig. Tech. Papers, Feb. 1996, pp. 132–133.
[15] V. von Kaenel, D. Aebischer, R. van Dongen, and C. Piguet, “A 600 MHz CMOS PLL microprocessor clock generator with a 1.2 GHz VCO,” in ISSCC 1998 Dig. Tech. Papers, Feb. 1998, pp. 396–397.
Kevin S. Donnelly (A’93) was born in Los Angeles, CA, in 1961. He received the B.S. degree in electrical engineering and computer science from the University of California, Berkeley, in 1985 and the M.S. degree in electrical engineering from San Jose State University, San Jose, CA, in 1992.
He was with Memorex, Sipex, and National Semiconductor, specializing in bipolar and BiCMOS analog circuits for disk-drive read/write and servo channels. In 1992, he joined Rambus, Inc., Mountain View, CA, where he has designed high-speed CMOS PLL circuits for clock recovery and data synchronization, and high-speed I/O circuits. He currently manages a group developing I/O circuits and PLL’s. His interests include PLL’s and DLL’s, I/O circuits, and data converters. He is a Member of the ISSCC Digital Subcommittee. He has received several circuit design patents.
Mr. Donnelly is a coauthor of the paper that won the Best Paper Award at the 1994 ISSCC.
Jun Kim was born in Tokyo, Japan, on November 14, 1966. He received the B.S.E.E. degree from the University of California, Berkeley, in 1989.
From 1989 to 1991, he was with Vitelic, Inc., where he worked on SRAM and DRAM development. Between 1991 and 1994, he was with Sun Microsystems, where he was involved in microprocessor and digital circuit design. Since 1994, he has been with Rambus, Inc., Mountain View, CA, as a Designer of high-speed CMOS I/O and DLL circuits.
Pak S. Chau was born in Hong Kong in 1966. He received the B.S. degree in computer system engineering from the University of Massachusetts, Amherst, in 1989 and the M.S. degree in electrical engineering from the University of California, Davis, in 1991.
He was with National Semiconductor and Chrontel, Inc., where he worked as an Analog Circuit Designer. In 1994, he joined Rambus, Inc., Mountain View, CA, where he has engaged in designing high-speed I/O and DLL circuits.
Jared L. Zerbe was born in New York, NY, in 1965. He received the B.S. degree in electrical engineering from Stanford University, Stanford, CA, in 1987.
He joined VLSI Technology, Inc., in 1987, where he worked on semicustom ASIC design. In 1989, he joined MIPS Computer Systems, where he designed high-performance floating-point blocks. Since 1992, he has been with Rambus Inc., Mountain View, CA, where he has specialized in the design of high-speed I/O and PLL/DLL clock recovery and data synchronization circuits.
Chanh V. Tran was born in Vietnam in 1964. He received the B.S. degree in electrical engineering and computer science from the University of California, Berkeley, in 1989.
From 1989 to 1992, he was with National Semiconductor Corp., Santa Clara, CA, where he worked on CMOS mixed-signal IC design in the Data Acquisition Group. In 1992, he joined Rambus Inc., Mountain View, CA, where he has been involved in DLL and high-speed I/O design.
Clemenz L. Portmann (S’92–M’95) received the B.S.E.E. degree from the University of Washington, Seattle, in 1986, the M.S.E.E. degree from the University of Hawaii at Manoa, Honolulu, in 1988, and the Ph.D. degree in electrical engineering from Stanford University, Stanford, CA, in 1995.
From 1988 to 1989, he was a Visiting Researcher at Nagoya University, Nagoya, Japan, and the Toyohashi University of Technology, Toyohashi, Japan, under the Monbusho (Ministry of Education) scholarship program. From 1989 to 1990, he was a Design Engineer for VLSI Technology, Inc., San Jose, CA, where he designed standard cell libraries and SRAM’s for ASIC designs. In 1995, he joined Rambus, Inc., Mountain View, CA, where he is engaged in the design of high-speed I/O circuits and DLL’s for DRAM interfaces.
Thomas H. Lee (S’87–M’87), for a photograph and biography, see this issue, p. 585.
Yiu-Fai Chan (S’76–M’78) received the B.S. and M.S. degrees in electrical engineering and computer science (with highest honors) from the University of California (UC), Berkeley, in 1972 and 1973, respectively.
He joined Rambus, Inc., Mountain View, CA, in 1992, where he is Director of Engineering, responsible for the development, application engineering, and customer support of high-speed mixed-signal circuits, device packaging, signal integrity, and system engineering. Prior to that, he was with Tera Microsystems in charge of developing chips for workstations based on the Sparc architecture. He was with Altera Corp. from 1983 to 1990, where he led a team of engineers to develop the industry’s first CMOS programmable logic devices. From 1976 to 1983, he held various technical and management positions at Intersil, Inc. (later a division of General Electric), where he was engaged in the development of various CMOS memories, microprocessors, and peripheral devices. It was there that he developed the first EPROM devices in CMOS technology. From 1974 to 1976, he designed calculator and TV game integrated circuits at National Semiconductor. He has received several patents in circuits and systems technologies.
Mr. Chan is a member of Tau Beta Pi, Phi Beta Kappa, and Eta Kappa Nu. He received the University Science Fellowship from UC Berkeley and conducted research on solid-state devices and microwave acoustics. He has published in various IEEE technical publications and presented papers at IEEE technical conferences.
shift-left and shift-right signals. The power consumption will decrease when there are no shift-left or -right signals and the loop is locked. Another concern with the phase-detector design is the design of the flip-flops (FF’s). To minimize the static phase error, very fast FF’s should be used, ideally with zero setup time. Also, the metastability of the flip-flops becomes a concern as the loop becomes locked. This together with possible noise contributions and the need to wait, as discussed above, before implementing a shift-right or -left may increase the desirability of adding additional filtering in the phase detector. Some possibilities include increasing the divider ratio used in the phase detector or using a shift register in the phase detector to determine when a number (say, four) shift-rights or -lefts have occurred. For the present design, we were forced to use a divide by two in the phase detector because of lock time requirements.
IV. EXPERIMENTAL RESULTS
The RSDLL was fabricated in a 0.21-µm, four-poly, double-metal CMOS technology (a DRAM process). We used a 48-stage delay line with an operation frequency of 125–250 MHz. The maximum operating frequency was limited by delays external to the DLL such as the input buffer and interconnect. There was no noticeable static phase error on either rising or falling edges. Fig. 6 shows the resulting rms jitter versus input frequency. One sigma of jitter over the 125–250-MHz frequency range was below 50 ps. The peak-to-peak jitter over this frequency range was below 100 ps. The measured delay per stage versus VCC and temperature is shown in Fig. 7. Note that the 150-ps typical delay of a unit-delay element was very close to the rise and fall times on-chip of the clock signals and represents a practical minimum resolution of a DLL for use in a DDR DRAM fabricated in a 0.21-µm process. The power
568 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 34, NO. 4, APRIL 1999
consumption (current draw of the DLL when VCC V) of the prototype RSDLL is illustrated in Fig. 8. We found that the power consumption was mainly determined by the dynamic power dissipation of the symmetrical delay line. Our NAND delays in this test chip were implemented with 10/0.21-µm NMOS and 20/0.21-µm PMOS. By reducing the widths of both the NMOS and PMOS transistors, the power dissipation can be greatly reduced without a speed or resolution penalty (with the added benefit of reduced layout size).
V. CONCLUSIONS
The concept of a register-controlled symmetrical delay-locked loop has been presented. The modified symmetrical delay element makes the RSDLL useful in DDR DRAM’s. Experimental results verify that this RSDLL is stable against temperature, process, and power-supply variations.
Further development of the RSDLL will include investigations into reducing power consumption, implementing phase-locked loops where the symmetrical delay is used as part of a purely digital register-controlled oscillator, and developing two-loop architectures where coarse loops (resolutions on the order of 100 ps) are used with fine loops (resolutions on the order of 10 ps [2]) for wide tuning range and small static phase errors.
REFERENCES
[1] A. Hatakeyama, H. Mochizuki, T. Aikawa, M. Takita, Y. Ishii, H. Tsuboi, S.-Y. Fujioka, S. Yamaguchi, M. Koga, Y. Serizawa, K. Nishimura, K. Kawabata, Y. Okajima, M. Kawano, H. Kojima, K. Mizutani, T. Anezaki, M. Hasegawa, and M. Taguchi, “A 256-Mb SDRAM using a register-controlled digital DLL,” IEEE J. Solid-State Circuits, vol. 32, pp. 1728–1732, Nov. 1997.
[2] S. Eto, M. Matsumiya, M. Takita, Y. Ishii, T. Nakamura, K. Kawabata, H. Kano, A. Kitamoto, T. Ikeda, T. Koga, M. Higashiro, Y. Serizawa, K. Itabashi, O. Tsuboi, Y. Yokoyama, and M. Taguchi, “A 1Gb SDRAM with ground level precharged bitline and non-boosted 2.1V word line,” in ISSCC Dig. Tech. Papers, Feb. 1998, pp. 82–83.
[3] T. Saeki, Y. Nakaoka, M. Fujita, A. Tanaka, K. Nagata, K. Sakakibara, T. Matano, Y. Hoshino, K. Miyano, S. Isa, S. Nakazawa, E. Kakehashi, J. M. Drynan, M. Komuro, T. Fukase, H. Iwasaki, M. Takenaka, J. Sekine, M. Igeta, N. Nakanishi, T. Itani, K. Yoshida, H. Yoshino, S. Hashimoto, T. Yoshii, M. Ichinose, T. Imura, M. Uziie, S. Kikuchi, K. Koyama, Y. Fukuzo, and T. Okuda, “A 2.5-ns clock access 250-MHz, 256-Mb SDRAM with synchronous mirror delay,” IEEE J. Solid-State Circuits, vol. 31, pp. 1656–1665, Nov. 1996.
IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 32, NO. 11, NOVEMBER 1997 1683
I. INTRODUCTION
conventional approaches, Section II presents the dual inter-
the main reason for the increased adoption of DLL’s in applications that do not require clock synthesis.
The conventional DLL architecture of Fig. 1 suffers from two important disadvantages: clock jitter propagation and limited phase capture range. Since the VCDL simply delays the reference clock by a single clock cycle, the reference clock jitter directly propagates to the output clock. This all-pass filter behavior with respect to the frequency of the jitter of the reference clock results in reduced I/O timing margins, especially in “source-synchronous” interfaces where the reference clock emanates from another noisy digital chip. To overcome this problem, a separate low-jitter differential clock can be used as the input to the delay line. This way the on-chip common-mode noise and the reference clock jitter do not affect the I/O timing margins.
A more important problem is that a VCDL does not have the cycle slipping capability of a VCO. Therefore, at a given operating clock frequency, the DLL can delay its input clock by an amount bounded by a minimum and a maximum delay. As a consequence, extra care must be taken by the designer so that the loop will not enter in a state in which it tries to lock toward a delay which is outside these two limits. A compromising solution is to extend the VCDL range and use an FSM that controls the loop start-up. However, DLL’s relying on quadrature phase mixing [2], [3] completely eliminate this problem. This approach is based on the fact that quadrature clocks can be easily generated, given a clock of the correct frequency. The quadrature clocks are then fed to a phase mixer which can produce a clock whose phase can span the complete 0–360° phase interval. This approach eliminates the limited phase range problem of conventional DLL’s since it can essentially rotate the output clock phase infinite times providing seamless switching at the quadrant boundaries. The main disadvantage of quadrature mixing is that the output of the phase mixer is a clock with a slew rate inherently limited by , where is the output swing of the phase mixer and the period of the clock. This slow clock exhibits increased dynamic noise sensitivity, thus degrading the jitter performance of quadrature mixing DLL’s.
The approach presented here overcomes this limitation of quadrature mixing DLL’s since it generates the output clock by interpolating between smaller 30° phase intervals [5]. Simultaneously, by avoiding the use of a VCO it eliminates the phase error accumulation problem of similar approaches [4].
B. Dual Interpolating DLL
Fig. 2 shows a high-level block diagram of the proposed architecture. This architecture is based on cascading two loops. A conventional first-order core DLL is locked at 180° phase shift. Assuming that the delay line of the core DLL comprises six buffers, their outputs are six clocks which are evenly spaced by 30°. The peripheral digital loop selects a pair of clocks, and , to interpolate between. Clocks and can be potentially inverted in order to cover the full 0–360° phase range. The resulting clocks, and , drive a digitally controlled interpolator which generates the main clock . The phase of this clock can be any of the quantized phase steps between the phases of clocks and , where is the interpolation controlling word range.
The output clock of the interpolator drives the phase detector which compares it to the reference clock. The output of the phase detector is used by the FSM to control the phase selection, the selective phase inversion, and the interpolator phase mixing weight. The FSM moves the phase of the clock according to the phase detector output. In the more common case this means just changing the interpolation mixing weight by one. If, however, the interpolator controlling word has reached its minimum or maximum limit, the FSM must change
SIDIROPOULOS AND HOROWITZ: SEMIDIGITAL DUAL DELAY-LOCKED LOOP 1685
Fig. 12. (a) Simplified FSM algorithm and (b) resulting loop behavior.
Fig. 11. Simulated phase interpolator transfer function.
Fig. 13. Prototype chip microphotograph.
current sources. The type-I design exhibits a nominal step of approximately 2°. However, due to the gate-to-drain capacitive coupling effect, the maximum step of 3.8° occurs at the interpolation boundary when the input clock is switched to the next selection. In the lower power implementation where no buffering is used at the core delay line outputs (type-I-unbuf), the data-dependent loading on the previous stage results in a double phase step at the interpolation interval boundaries. Although the alternative design (type-II) does not exhibit a boundary phase step, it was not used since it occupies more layout area and exhibits more nonlinear characteristics due to data-dependent loading of the previous stage. So in the present implementation, worst-case dithering occurs at the interpolation interval boundaries and has an approximate magnitude of 3.8°.
D. Finite State Machine
A simplified version of the peripheral loop FSM algorithm is outlined in Fig. 12(a). The single state Early of the FSM indicates the relationship of the two interpolator input clocks. On every cycle of its operation, the FSM might undertake one of two actions.
• In the more frequent case of in-range interpolation (i.e., the interpolation weight has not reached the end of its range), the FSM simply increments or decrements the interpolation weight by shifting up or down the interpolator controlling shift register. The direction of the shift is decided based on the phase detector output and the current value of the state Early.
• If the peripheral loop has run out of range in the current interpolation interval, the FSM seamlessly slides the current interpolation interval by switching one of the two interpolator input phases to the next selection. The fact that the interpolation has run out of range in the current interval is simply indicated by a combination of the current value of the state Early, the most or least significant bit of the thermometer register, and the output of the phase detector. In case the current selection of the input phase is adjacent to the 0° or 180° interpolation interval boundary, switching to the next selection involves toggling the select of the second-stage phase inversion multiplexer.
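The two FSM actions can be illustrated with a small behavioral sketch. It is not the authors' model: the class name PeripheralFsm, the helper track, the choice of 12 core phases with 16 interpolation steps each, and the ideal binary phase detector are all assumptions made for illustration, and the second-stage phase inversion multiplexer and any loop delay are left out. The sketch only shows how a single Early-style flag plus an end-of-range check is enough to keep rotating the output phase in either direction.

```python
N_PHASES = 12   # assumed: core-loop phases spaced 30 degrees apart
N_STEPS = 16    # assumed: 4-bit interpolation between adjacent phases
TOTAL = N_PHASES * N_STEPS


class PeripheralFsm:
    """Toy model of the peripheral-loop FSM (hypothetical, for illustration)."""

    def __init__(self):
        self.phi = 0      # index of the selected even core phase
        self.psi = 1      # index of the selected odd core phase
        self.weight = 0   # 0 puts the output on phi, N_STEPS puts it on psi

    @property
    def early(self):
        """True when phi leads psi by one interpolation interval."""
        return (self.psi - self.phi) % N_PHASES == 1

    def output(self):
        """Output phase in interpolation steps, modulo one full cycle."""
        if self.early:
            steps = self.phi * N_STEPS + self.weight
        else:
            steps = self.psi * N_STEPS + (N_STEPS - self.weight)
        return steps % TOTAL

    def step(self, advance):
        """Apply one binary phase-detector decision (advance or retreat)."""
        # The shift direction depends on the detector output and on Early.
        toward_psi = (advance == self.early)
        jump = 2 if advance else -2
        if toward_psi:
            if self.weight < N_STEPS:
                self.weight += 1          # in-range: shift the weight
            else:
                # Out of range: the output sits on psi, so slide phi past it.
                self.phi = (self.phi + jump) % N_PHASES
        else:
            if self.weight > 0:
                self.weight -= 1          # in-range: shift the weight
            else:
                # Out of range: the output sits on phi, so slide psi past it.
                self.psi = (self.psi + jump) % N_PHASES


def track(target, cycles):
    """Drive the FSM with an ideal bang-bang detector toward `target` steps."""
    fsm = PeripheralFsm()
    history = []
    for _ in range(cycles):
        lag = (target - fsm.output()) % TOTAL
        fsm.step(advance=lag < TOTAL / 2)
        history.append(fsm.output())
    return history
```

In this model a slide changes the selected phase without moving the output, so the output phase is continuous across interval boundaries, and after acquisition the output alternates between the two steps on either side of the target.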
1690 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 32, NO. 11, NOVEMBER 1997
The loop phase capture behavior resulting from this simple algorithm is illustrated in Fig. 12(b). The phase error decreases at a linear rate until the system achieves lock. Subsequently, the loop dithers around the zero phase error point with a dither magnitude of one phase interpolation interval. This occurs because in this type of “bang–bang” system, the output of the phase detector is just a binary phase error without any indication of the magnitude of the phase error. The complementary interpolation weights slew linearly, changing direction at the interpolation interval boundaries. Once the system finds lock, they either dither by one or they stay constant if the dither point happens to lie on an interval boundary.
The magnitude of the peripheral loop phase dither is determined by the minimum interpolation step and the delay through the feedback loop. In conventional analog “bang–bang” DLL’s, the loop delay is largely determined by the delay through the delay line and the clock distribution network. However, this digital implementation has a larger minimum loop delay. The underlying reason is that driving the FSM directly from the phase detector output might lead to metastability problems, especially since the whole loop operation is driving the phase detector to its metastable point of operation. For this reason, in this implementation the output of the phase detector is delayed by three metastability-hardened flip-flops. This increases the mean time between failures (MTBF) of the system to a calculated worst case of approximately 100 years, but at the same time increases the peripheral loop delay by three cycles. To compensate for that delay and decrease the loop dither, the FSM logic implements a front-end filter which counts eight consecutive phase detector “up” or “down” results before propagating this signal to the core FSM. This causes the FSM to delay its next decision until the results of its previous action have been propagated to the phase detector output and reduces the inherent peripheral loop dither to one phase interpolation interval.
The digital nature of the peripheral loop control enabled the implementation of the FSM to be done through synthesis of a behavioral Verilog model followed by a simple standard cell place and route. The FSM behavioral model was verified by simulation in conjunction with a behavioral core loop model. The significance of this automated methodology is that other more complicated algorithms can be implemented requiring minimal effort from the designer. Faster phase acquisition can be obtained by disabling the front-end counter/filter and changing the interpolation step by a larger amount while the loop is not in lock. The loop can also implement a periodic phase calibration algorithm. In this case, the FSM is activated initially to drive the loop to zero phase error. Then it is shut down to save power and it is periodically turned on to compensate for slow phase drifts. Since the FSM can run at a frequency slower than that of the system clock, the implementation of different algorithms is not in the system critical path.
IV. EXPERIMENTAL RESULTS
To verify the dual DLL architecture, a chip has been fabricated through MOSIS in the HP CMOS26B process. This is a 1.0-µm drawn process with the channel lengths scaled to 0.8 µm. Although the gate oxide in this process is 170 Å, allowing 5-V operation, the loop design and testing were done with a 3.3-V power supply voltage.
Fig. 13 is a micrograph of the chip. The chip integrates the dual DLL, along with noise injection and monitoring circuits and current-mode differential output buffers. The dual DLL occupies 0.8 mm² of silicon area, the majority (about 60%) of which is devoted to the peripheral loop logic. This is mainly due to the relatively large standard cell size of the library used in this implementation.
The block labeled NOISE-GEN in Fig. 13 is used to inject and measure on-chip supply noise. Fig. 14 shows a schematic diagram of these circuits. The 1000-µm-wide transistor shorts the on-chip supply rails, creating a voltage drop across the off-chip 4-Ω resistor. In order to monitor the droop on the on-chip supply, a monitoring device and the external 5-Ω load resistor form a broadband attenuating buffer which drives the 50-Ω scope. The gain of the buffer is computed during an initial calibration step. The use of these circuits enables the injection and monitoring of fast (on the order of 1-ns rise time) steps on the on-chip supply.
The dither jitter of the loop with quiescent on-chip supply varies with the input phase. This occurs because the offsets of the interpolator and the phase selection multiplexers change according to the point of lock. Fig. 15 shows the worst-case jitter (68 ps) with quiescent supply. The jitter histogram consists of the superposition of two Gaussian distributions resulting from the switching of the peripheral loop between two adjacent interpolation intervals. The distance between the peaks of the two superimposed distributions is about 40 ps, which is in fair agreement with the simulation results. With the noise generation circuits injecting a 750-mV 1-MHz square wave on the chip supply, the peak-to-peak jitter increases to 400 ps (Fig. 16). It should be noted that simulation results indicate that approximately 50% of this jitter is not inherent to the loop, but is due to the supply sensitivity of the succeeding static CMOS clock buffer and off-chip driver.
Fig. 16. Jitter histogram with 1-MHz 750-mV square wave supply noise.
Fig. 17 illustrates the linearity of the interpolation process in the peripheral loop. The figure shows the histogram of the output clock with the peripheral loop FSM continuously rotating that clock. The histogram was generated by keeping the reference clock at a constant voltage while the input clock ran at its nominal frequency of 250 MHz. The histogram valleys correspond to the interpolation interval boundaries. The spacing of the valleys is within 10% of their nominal 333-ps distance, indicating good matching of the delays of the core loop buffers. The absence of one valley at the 180° interpolation boundary indicates a slight offset in the core loop. The fact that the magnitude of the highest peak of the histogram is smaller than the magnitude of the deepest valley indicates that the interpolator achieves the 4-b target linearity (the 4-b linearity of the interpolator was also confirmed by a similar histogram of a single interpolation interval). Thus the overall linearity of the DLL is limited by the steps at the interpolation interval boundaries.
Table I summarizes the loop performance characteristics. With a 3.3-V supply, the loop operates from 80 kHz to
400 MHz. The phase offset between the reference clock and the output clock of the loop is less than 40 ps. Operating at 250 MHz, the dual DLL draws 31 mA dc from a 3.3-V power supply.
V. SUMMARY
Although DLL’s are easier to design than PLL’s and offer better jitter performance, their main disadvantage is their limited phase capture range. This disadvantage limits their application to completely synchronous environments and complicates start-up circuitry. This paper presented a dual DLL architecture which removes this limitation by using a core DLL to generate coarsely spaced clocks which are then used by a peripheral DLL to generate the output clock by using phase interpolation. This architecture has unlimited (modulo 2π) phase shift capability, therefore removing boundary conditions and phase relationship constraints between the system clocks. The only requirement is that the DLL input and reference clocks are plesiochronous, making the dual DLL suitable for clock recovery applications. In addition, the digital nature of the peripheral loop control enables implementation of
ACKNOWLEDGMENT
The authors are grateful to M. Johnson, T. Lee, J. Maneatis, and K. Yang for helpful discussions.
REFERENCES
[1] M. Johnson and E. Hudson, “A variable delay line PLL for CPU-coprocessor synchronization,” IEEE J. Solid-State Circuits, vol. 23, Oct. 1988.
[2] T. Lee et al., “A 2.5 V CMOS delay-locked loop for an 18 Mbit, 500 MB/s DRAM,” IEEE J. Solid-State Circuits, vol. 29, pp. 1491–1496, Dec. 1994.
[3] M. Izzard et al., “Analog versus digital control of a clock synchronizer for a 3 Gb/s data with 3.0 V differential ECL,” in Dig. Tech. Papers 1994 Symp. VLSI Circuits, June 1994, pp. 39–40.
[4] M. Horowitz et al., “PLL design for a 500 MB/s interface,” in Dig. Tech. Papers Int. Solid State Circuits Conf., Feb. 1993, pp. 160–161.
[5] S. Sidiropoulos and M. Horowitz, “A semi-digital delay locked loop with unlimited phase shift capability and 0.08–400 MHz operating range,” in Dig. Tech. Papers Int. Solid State Circuits Conf., Feb. 1997, pp. 332–333.
[6] J. Maneatis and M. Horowitz, “Precise delay generation using coupled oscillators,” IEEE J. Solid-State Circuits, vol. 28, pp. 1273–1282, Dec. 1993.
[7] J. Maneatis, “Low-jitter process-independent DLL and PLL based on self-biased techniques,” IEEE J. Solid-State Circuits, vol. 31, pp. 1723–1732, Nov. 1996.
Stefanos Sidiropoulos (S’93), for a photograph and biography, see p. 690 of the May 1997 issue of this JOURNAL.
Mark A. Horowitz (S’77–M’78–SM’95), for a photograph and biography, see p. 690 of the May 1997 issue of this JOURNAL.
IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 37, NO. 8, AUGUST 2002 1021
Max
Min (1)
Equation (1) shows that the DLL is prone to the false locking problem when process variations are taken into account [7]. Therefore, some solutions [6]–[10] have been proposed to overcome this problem. They are described as follows.
First, the basic idea is to use a phase-frequency detector (PFD) [5], because it has a capture range of (−2π, 2π), wider than that of other phase detectors. So, the PFD is a better choice for wide-range operation. However, the PFD cannot be used in the DLL alone without any control circuit because the DLL will try to lock to a zero delay. A PFD combined with a control circuit is presented in [6]. Nevertheless, in some cases, especially for high-frequency operations, the initial delay between ref_clk and vcdl_clk, as shown in Fig. 1, may be larger than two clock cycles and harmonic locking will occur.
Second, a solution called an all-analog DLL using a replica delay line [7] has been developed to solve the narrow frequency range problem of a conventional DLL. If the delay range of the VCDL satisfies the relation required in [7], the DLL will have a maximum operation range of 7:1.
Third, a digitally controlled DLL called the self-correcting DLL is proposed in [8]. The problem of false locking is solved by the addition of a lock-detect circuit and a modified phase detector. Although this self-correcting DLL avoids false locking, the outputs of the VCDL are required to have an exact 50% duty cycle.
The DLL developed in [9] uses a stage selector for fast-locked and wide-range operations, but the DLL requires an additional VCDL, which increases the area. A similar DLL can automatically change its lock mode to extend the operation range, but the latency of the DLL will be larger than one clock cycle [10].
The approach presented in this work uses a phase selection circuit to automatically decide what number of delay cells should be used. This enables the DLL to operate over a wide frequency range. A new start-controlled circuit is also presented for the DLL to solve false locking problems and keep the latency of one clock cycle. An exact 50% duty cycle is not necessary.
Fig. 4. Small-signal model of the conventional analog DLL.
Fig. 5. Block diagram of the phase selection circuit.
III. ARCHITECTURE OF THE PROPOSED DLL
The architecture of the proposed DLL is shown in Fig. 3. It is composed of a conventional analog DLL, a phase selection circuit, and a start-controlled circuit. Before the DLL begins to lock, the phase selection circuit will choose an appropriate delay cell to be a feedback signal (vcdl_clk) according to the frequency of the input signal. In other words, the number of the delay cells may change at different input frequencies. The minimum delay of the delay line is determined by one unit-delay cell. The maximum delay is set by the number of unit-delay cells and the maximum delay of each cell. Thus, the operating frequency range of the DLL is bounded by these minimum and maximum delays.
CHANG et al.: WIDE-RANGE DELAY-LOCKED LOOP WITH FIXED LATENCY OF ONE CLOCK CYCLE 1023
Fig. 6. Schematic of edge detection circuit. (a) Edge detection circuits. (b) Clock edge generation. (c) Latch N.
Fig. 7. Timing diagram of edge detection circuit.
The linear model of the DLL is shown in Fig. 4, where the summer stands for a phase detector and the remaining parameters are the charge-pump current, the period of the input reference clock, the capacitor value in the loop filter, and the gain of the VCDL, which is proportional to the number of delay cells. In the steady-state locked condition, the transfer function from the input delay time to the output delay time can be expressed as [11]
(2)
The loop bandwidth can be expressed as [11]
(3)
Since the transfer function is inherently stable, a wider loop bandwidth can be used to achieve fast acquisition time, but the jitter performance will be degraded. Hence, the following tradeoff design guideline was suggested in [12]:
(4)
When the input frequency is higher, the phase selection circuit will select a smaller number of delay cells and the VCDL gain will become smaller. In order to have an adequate loop bandwidth for the DLL, the capacitances used in the loop filter must become smaller. In this work, the 3-bit control signals generated from the phase selection circuit will switch the number of capacitors in the loop filter depending on the selected phase.
After the vcdl_clk is decided, the DLL will start the locking process, which is controlled by the start-controlled circuit. First, the delay between input and output of the VCDL is initially set to the minimum value, and the down signal of the PFD output is then allowed to activate, assuming that the VCDL’s delay increases as the control voltage decreases. Therefore, the delay between input and output of the VCDL will increase until it reaches one clock period of the input signal. Thus, the DLL will not fall into false locking and the latency is fixed to one clock cycle no matter how long a delay the VCDL provides.
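The start-up behavior described above can be checked with a generic first-order, cycle-by-cycle model of a charge-pump DLL. This is a sketch under stated assumptions and not the transfer function of (2): the function name lock, all parameter names, and the numeric values in the usage note are hypothetical, the per-cycle loop gain is taken as charge-pump current times VCDL gain divided by loop capacitance, and the filter capacitance is assumed to be switched in proportion to the selected number of cells.

```python
def lock(t_ref, d_min, n_cells, k_cell, i_cp, c_unit, cycles=200):
    """Delay trajectory of a first-order charge-pump DLL model (illustrative).

    t_ref   reference clock period in seconds
    d_min   initial (minimum) VCDL delay in seconds
    n_cells number of delay cells selected by the phase selection circuit
    k_cell  assumed delay gain per cell in seconds per volt
    i_cp    charge-pump current in amperes
    c_unit  assumed loop-filter capacitance switched in per cell, in farads
    """
    k_vcdl = n_cells * k_cell        # VCDL gain grows with the cell count
    c_loop = n_cells * c_unit        # capacitance scaled with the cell count
    gain = i_cp * k_vcdl / c_loop    # dimensionless correction per cycle
    delay = d_min                    # start-up forces the minimum delay
    trace = [delay]
    for _ in range(cycles):
        error = t_ref - delay        # positive while the line is too short
        delay += gain * error        # the down pulse lengthens the delay
        trace.append(delay)
    return trace
```

For example, lock(t_ref=10e-9, d_min=4e-9, n_cells=8, k_cell=0.5e-9, i_cp=20e-6, c_unit=0.1e-12) gives a per-cycle gain of 0.1, so the delay rises monotonically from its minimum and settles at one reference period without overshooting toward a multiple of the period. Because the capacitance is scaled with the cell count, the gain and hence the settling behavior stay essentially the same when n_cells changes.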
Fig. 12. Schematic of the delay cell with replica bias [12].
Fig. 13. Simulated transfer curve of the VCDL.
two inverters. The timing diagram of this start-controlled circuit is shown in Fig. 9. Initially, startb is set low in order to clear the two DFF’s outputs. Therefore, setupb is low and pulls the control voltage high, as shown in Fig. 3 (i.e., sets the VCDL delay to its minimum value). In this way, the two inputs of the PFD are at a low level. When startb goes high, setupb also goes high. After two consecutive falling edges of vcdl_clk trigger the DFFs, the down signal of the PFD is activated and lets the delay of the VCDL increase. The delay of the VCDL will increase until it is equal to one clock period of the input signal due to the nature of the negative feedback architecture. Since the start-controlled circuit forces the delay of the VCDL to its minimum value and controls the delay of the VCDL to increase until it is equal to one clock period, the DLL will not fall into false locking. In order to get equal delays for path1 and path2, dummy loads should be added at point A. In comparison with [6], this start-controlled circuit has two advantages: the proposed circuit is simple, and the duty cycle of ref_clk and vcdl_clk is not required to be exactly 50%.
Fig. 16. DLL at initial state when operating frequency is 130 MHz.
Fig. 18. Measurement results of rms jitter over different frequencies.
TABLE I
PERFORMANCE SUMMARY
C. Other Circuits
In this work, the dynamic logic style PFD [13] is adopted to
avoid the dead-zone problem and improve the operating speed.
To mitigate charge injection errors induced by the parasitic
capacitors of the switches and current source transistors, the
charge-pump circuit developed in [11] is used here. The delay
cell circuit is similar to [11]. The schematics of these circuits
are shown in Figs. 10–12. The control voltage of the loop filter
is directly connected to nMOS rather than pMOS. Therefore,
the transfer curve of delay versus control voltage is monotonically
decreasing, as shown in Fig. 13.
V. EXPERIMENTAL RESULTS
The prototype chip is fabricated in a 0.35-µm single-poly triple-metal standard CMOS process. The microphotograph of the chip is shown in Fig. 14. The capacitors used in the loop filter are integrated in the chip and formed by metal-to-metal capacitors. The experimental results show that the DLL can operate in the frequency range of 6–130 MHz. Figs. 15 and 16 show the first four cycles of the DLL in the locking process when the operating frequency is 6 and 130 MHz, respectively. After the signal startb goes high, the phase selection circuit selects the output of the VCDL that is as close as possible to the next rising edge of the input clock, ref_clk. Figs. 15 and 16 also show that after the signal startb goes high, the first rising edge of the output clock of the VCDL, vcdl_clk, leads that of the input clock, ref_clk. Since the signal startb sets the control voltage in Fig. 3 to its initial high level, the proposed phase detector and the charge-pump circuit will discharge the loop filter to increase the delay of the VCDL. This aligns the phases of the input clock and the output clock of the VCDL. Fig. 17 shows the jitter histogram when the DLL operates at 130 MHz. Fig. 18 shows the measurement results of rms jitter over different frequencies. Table I gives the performance summary. The proposed DLL can be seen to have a wide operational range and a fixed latency of one clock cycle.
VI. CONCLUSION
A DLL with wide-range operation and fixed latency of one clock cycle is proposed. First, the multiphase outputs of the VCDL are all sent to the phase selection circuit. Then the phase selection circuit automatically selects one of the delayed outputs as the feedback signal. As a result, this DLL can operate over a wide range without suffering from harmonic locking problems. Ideally, the operating range is set by the achievable minimum and maximum delays of the delay line. The experimental results also demonstrate the functionality of the proposed DLL. Moreover, at different operating frequencies, the jitter performances are all in an acceptable range and the latency is just one clock cycle. Since the speed of the proposed circuits can be increased if a more advanced process is used, the performance of the DLL, such as the operating frequency range, can be improved with little hardware and design effort. The power consumption of the digital part in the DLL and the total die area will also be reduced.
[10] Y. Okuda, M. Horiguchi, and Y. Nakagome, “A 66–400 MHz adaptive-lock-mode DLL circuit with duty-cycle error correction,” in Symp. VLSI Circuits Dig. Tech. Papers, June 2001, pp. 37–38.
[11] J. G. Maneatis, “Low-jitter process-independent DLL and PLL based on self-biased techniques,” IEEE J. Solid-State Circuits, vol. 31, pp. 1723–1732, Nov. 1996.
[12] A. Chandrakasan, W. J. Bowhill, and F. Fox, Design of High-Performance Microprocessor Circuit. New York: IEEE Press, 2001, p. 240.
[13] S. Kim et al., “A 960-Mb/s/pin interface for skew-tolerant bus using low jitter PLL,” IEEE J. Solid-State Circuits, vol. 32, pp. 691–700, May 1997.
Hsiang-Hui Chang (S’01) was born in Taipei, Taiwan, R.O.C., on February 4, 1975. He received the B.S. and M.S. degrees in electrical engineering from National Taiwan University, Taipei, in 1999 and 2001, respectively. He is currently working toward the Ph.D. degree in electrical engineering at National Taiwan University. His research interests are PLL, DLL, and high-speed interfaces for gigabit transceivers.
Jyh-Woei Lin was born in Kaoshiung, Taiwan, R.O.C., in 1974. He received the B.S. degree in electrical engineering from National Taipei University of Technology in 1996, and the M.S. degree in electrical engineering from National Taiwan University in 2001. He joined Sunplus Corporation, Hsinchu, Taiwan, in 2001 as an Analog Circuit Designer. His research interests include PLL, DLL, and interface circuits for high-speed data links.
Abstract—A novel clock network composed of multiple synchronized phase-locked loops is analyzed, implemented, and tested. Undesirable large-signal stable (mode-locked) states dictate the transfer characteristic of the phase detectors; a matrix formulation of the linearized system allows direct calculation of system poles for any desired oscillator configuration. A 16-oscillator 1.3-GHz distributed clock network in 0.35-µm CMOS is presented here.
Index Terms—Clock network, multiple oscillator system, phase-locked loop.
I. INTRODUCTION
is that skew is only relevant between communicating latches, but the clock path is always the length of the chip. Clock speeds increase with gate delay, and processor architectures can exploit both locality of blocks and pipelining to avoid penalty due to long signal paths, but the error in a global clock scales with the total path delay, and is thus a growing fraction of a clock cycle.
In this paper, we consider the effects of static and dynamic mismatch on a few representative clock networks in Section II and propose a distributed generation scheme that needs only local synchronization to generate a global clock. Large and small-signal stability of the proposed network is analyzed in Section III. This clock was implemented on a test chip; circuit
Fig. 1. Simulated edge in a grid with skew to the drivers.
Fig. 2. Short circuit power in a grid vs. input tree skew.
clock source to the load is comparable to the size of the entire die. Because the worst-case skew occurs between two adjacent leaves for which the clock path was completely different, worst-case mismatch depends on the entire source-to-leaf delay. Worse, the problem grows with process scaling. Because RC delay does not scale, delay along an optimally buffered line scales only as ; hence the skew as a fraction of the clock period grows as with falling .
C. Grid
Modern grids are H-tree-grid hybrids: a short H-tree distributes clock to a few (4 or 16, for example) buffers around a chip, and those buffers drive a clock grid in parallel. Shorting the buffers together helps drive down some of the uncertainty at the cost of increased short-circuit power during switching and somewhat slower edge rates. However, rise time scales linearly with , so by the same reasoning as applied to the tree scaling arguments, skew as a fraction of rise time will increase with as gate delay falls. When the tree skew exceeds rise time, short-circuit power dissipation increases rapidly, and the clock edges begin to show an unacceptable kink. Fig. 1 shows simulated edge shapes with increasing input skew for a grid driven from a 4-level tree with skews from 0 to 200 ps, and Fig. 2 shows the corresponding short-circuit power dissipation, plotted as a fraction of -power for the clock grid.
D. Active Feedback
As is evident from the given examples, most of the skew comes from the initial long-distance distribution of a clock to relatively small loads. A delay-locked loop (DLL) could be adapted to measure and cancel out wire variations, as shown in Fig. 3. If the round-trip delay is tuned to an even number of clock cycles, the wire has nominally 0 delay.
Unfortunately, despite the apparent symmetry, the forward and reverse paths do not match well for two reasons. First, “matched” buffers are physically separated. In Fig. 3, each forward-path buffer should match its counterpart in the reverse path, although it is physically near a different buffer. A buffer is not as far away from its matched pair as it might be in a tree, but it will still typically be millimeters away. Second, there is no temporal correlation. The clock signal passes one buffer at a different time than it passes the matching buffer, so any time-dependent variations, including those due to power supply and signal coupling, do not match.
Another approach, proposed by Intel, is shown in Fig. 4 [7]. Here, a DLL matches delays to two half-trees; an obvious generalization, with four DLLs matching quarter-trees, is shown in Fig. 5. Static delay variations of some nearest neighbors are canceled out by the DLL to within the precision of the matching of the comparators. The drawback is that some neighboring nodes in Fig. 5 are related only through multiple DLLs.
A much better result can be obtained by using DLLs that take multiple reference inputs, and adjust output phase to be aligned exactly between the two inputs. The network can then be redrawn somewhat more symmetrically, as in Fig. 6. (For clarity, the local tree was not drawn, and the connections to the comparators are abstracted.)
Optimization of the number of tiles is straightforward. Internal skew scales with tile area, so as the number of tiles increases, internal skew falls. However, every boundary between tiles introduces some skew because of mismatch in the phase detector (PD). Hence, as the number of tiles increases, the number of boundaries increases. Fig. 7 shows the optimization curves calculated for this clock metric. As in other clock networks, faster clocks require a more finely grained architecture. Jitter in a DLL network will rise in exactly the same way as it increases in clock trees, and for the same reasons. Skew scales linearly with because it is comprised of comparator mismatches and delays across each leaf-patch. Note, however, that in a phase-locked loop (PLL) the noise can be expected to scale with ; a PLL network like the one in Fig. 6 would have total clock uncertainty that is a constant fraction of the clock period.
III. STABILITY
We propose a distributed clock network comprised of an array of synchronized PLLs. Independent oscillators generate the clock signal at multiple points (“nodes”) across a chip; each oscillator distributes the clock to only a small section of the chip (“tile”) (Fig. 8). PDs at the boundaries between tiles produce error signals that are summed by an amplifier in each tile and used to adjust the frequency of the node oscillator. In general, the network need not be square or regular.
With locally generated clocks, there are no chip-length clock lines to couple in jitter; skew is introduced only by asymmetries in PDs instead of mismatches in physically separated buffers,
GUTNIK AND CHANDRAKASAN: ACTIVE GHz CLOCK NETWORK USING DISTRIBUTED PLLs 1555
Fig. 6. Multi-input delay cell DLL architecture.
and the clock is regenerated at each node, so high-frequency jitter does not accumulate with distance from the clock source. Unlike earlier work on multiple clock domains which suggested
A. Small Signal
In a multiple-oscillator PLL, large-signal and small-signal behavior are interrelated. In normal operation, the oscillators are phase-locked, and jitter depends on the network response to noise. Because startup is expected to take a negligibly small fraction of time, the connection of the oscillators is optimized for small-signal behavior rather than to make initial acquisition more efficient. The linearized small-signal behavior, valid when the oscillators are nearly in phase, is analyzed first.
B. General Derivation
The block diagram (Fig. 9) of a multiple-oscillator PLL is essentially identical to the one for a conventional PLL, except that the connections between blocks are vectors instead of individual signals, and the gains and transfer functions are matrices instead of scalars. This means that the PD becomes matrix ,
1556 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 35, NO. 11, NOVEMBER 2000
of size , and the loop filter becomes , a corresponding matrix. is an intuitively meaningful matrix. The network of oscillators is similar to a lumped circuit with a node for each oscillator and a branch for each connection between pairs of oscillators. Node voltages in represent oscillator phase, and branch currents represent the error signals on the output of the PD. is the conductance matrix for with unity conductance branches. for a four-oscillator network is shown in (1). Each off-diagonal entry is 1 if there is a PD between node and node ; is the number of detectors attached to node .
(1)
DC gain in the loop can be lumped into .
Writing the transfer function in matrix form gives
(2)
where is the phase error input to each phase comparator. is the reference phase, and are the noise contributions from interconnect and PD mismatch.
C. Examples
Matrix is determined by the geometry of the tiles, and hence will be constrained by the placement of clock loads, which for this problem is fixed. Assuming the simplest possible PLL, . This leaves , , and as design variables. There are still far too many choices to find the general optimum, but a few examples may help guide the search.
1) One-Dimensional Array: A one-dimensional array of oscillators with PDs between neighbors is the simplest generalization of a single PLL. In a perfectly asymmetrical array (call this
Fig. 10. One-dimensional PLL array; symmetrical with the dotted-line connections.
(3)
This system has multiple poles at the same place where a single-oscillator PLL has single poles.
On the other hand, in a perfectly symmetrical array (call it ), the input to each oscillator is the phase of oscillators and (Fig. 10, with the dotted-line connections). The matrix is the same because the physical arrangement of nodes is identical, but changes:
(4)
To achieve the same phase margin in as in , it is necessary to lower the gain . This can be shown with a geometrical argument: in , when the phase of oscillator changes by , the change is measured at two PDs, so oscillator feels twice the feedback that it would have felt in , and at the same time, oscillators and both adjust in the opposite direction, giving four times the effective gain. Hence, the gain must be decreased by a factor of approximately four. Mathematically, the largest eigenvalue of is 1, but the largest eigenvalue of is 3.5. Poles of the symmetrical system, solved via (2), are plotted in Fig. 12(a). The key difference between and is the systems’ response to noise. In both cases, noise at frequencies higher than the unity gain frequency is attenuated. For frequencies much lower than , the response can be calculated via (2). Fig. 11 shows a Bode plot of noise at node in response to a noise source at node . Noise performance of is much worse for intermediate frequencies because there is no feedback so errors propagate forever. In , the feedback limits the influence of preceding stages, and this in turn attenuates noise. For this reason, networks with feedback are preferred, despite the more complicated stability calculation.
2) Two-Dimensional Array: A two-dimensional array is analyzed exactly the same as is a one-dimensional array, except that the gain has to decrease by another factor of two because the center oscillators see four neighbors rather than two. A 16-element array in a grid is implemented in this thesis. Its poles are shown in Fig. 12(b).
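The eigenvalue argument for the symmetrical arrays can be checked numerically. The following is a minimal sketch, not the authors' method: it assumes the conductance matrix is the standard graph Laplacian (node degree on the diagonal, minus one on each off-diagonal entry joined by a PD), and the helper names laplacian, chain_edges, grid_edges, and largest_eigenvalue are hypothetical. Under that assumption a four-oscillator chain gives a largest eigenvalue of about 3.41, close to the 3.5 quoted in the text, and a 4 x 4 grid gives exactly twice that, in line with the additional factor-of-two gain reduction described for the two-dimensional array.

```python
import math

def laplacian(n_nodes, edges):
    """Conductance matrix with unity branches: degree on the diagonal, -1 per PD."""
    g = [[0.0] * n_nodes for _ in range(n_nodes)]
    for i, j in edges:
        g[i][i] += 1.0
        g[j][j] += 1.0
        g[i][j] -= 1.0
        g[j][i] -= 1.0
    return g

def chain_edges(n):
    """PDs between nearest neighbors of a 1-D array."""
    return [(i, i + 1) for i in range(n - 1)]

def grid_edges(rows, cols):
    """PDs between horizontal and vertical neighbors of a 2-D array."""
    edges = []
    for r in range(rows):
        for c in range(cols):
            k = r * cols + c
            if c + 1 < cols:
                edges.append((k, k + 1))
            if r + 1 < rows:
                edges.append((k, k + cols))
    return edges

def mat_vec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def largest_eigenvalue(m, iterations=2000):
    """Power iteration; adequate here because m is symmetric and positive semidefinite."""
    v = [1.0] + [0.0] * (len(m) - 1)
    for _ in range(iterations):
        w = mat_vec(m, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the converged unit vector
    return sum(a * b for a, b in zip(v, mat_vec(m, v)))

lam_chain = largest_eigenvalue(laplacian(4, chain_edges(4)))
lam_grid = largest_eigenvalue(laplacian(16, grid_edges(4, 4)))
print("1-D chain of 4:", round(lam_chain, 4))
print("4 x 4 grid:    ", round(lam_grid, 4))
print("ratio:         ", round(lam_grid / lam_chain, 4))
```

Because the loop gain multiplies the eigenvalues of this matrix, the ratio of the two largest eigenvalues is a rough guide to how much the gain has to be scaled to keep a comparable phase margin.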
GUTNIK AND CHANDRAKASAN: ACTIVE GHz CLOCK NETWORK USING DISTRIBUTED PLLs 1557
Fig. 11. Comparison of noise responses for symmetrical and asymmetrical networks.
Fig. 12. Root locus for 1-D and 2-D PLL arrays. (a) 1-D array. (b) 2-D array.
D. Large Signal: Mode Locking
The analysis of the previous section indicates that fully connected networks should have a better noise response than asymmetrical networks. However, the feedback allows the possibility of undesirable large-signal modes. Consider the matrices for a PLL network:
Because phase is periodic with period , the phase measured at the PDs . For small , , so the nonlinearity is irrelevant. However, with
(6)
so is a stationary point. This is intuitively easy to see, in reference to Fig. 13: each oscillator leads one neighbor, and lags behind another neighbor by exactly the same amount. The net phase error is zero, so clearly there is no restoring force to drive the phases to 0. Because the nonlinearity does not change for small deviations from , dynamics about are the same as those about 0 and hence this state is stable. The locking of a distributed oscillator to nonzero relative phases has been called mode-locking [9]. At startup, each oscillator in a distributed PLL starts at a random phase, so there is a nonzero chance of converging to a mode-locked state. Simulations show that for a network like the one shown here, the system ends mode-locked from of random initial states. The probability goes up rapidly with the size of the system; a array ends up mode-locked well over 99% of the time.
Pratt and Nguyen proved several useful properties about systems in mode-lock [9]. The key result, generalized for non-Cartesian networks, is that for a system in mode-lock, there must be a phase difference between two oscillators such that where is the number of nodes in the largest minimal loop in the network and a minimal loop is a loop in the graph that cannot be decomposed into multiple loops.
This result suggests a way to distinguish between mode-locked states and the desired 0-phase state: in mode-lock, there must be at least one branch with a large phase error. If the gain of the PD is designed to be negative for a phase difference larger than , then all mode-locked states are made unstable without affecting the in-phase equilibrium. Pratt and Nguyen suggest that XOR PDs preclude mode-lock in a rectangular network of oscillators because the response decreases for phase errors larger than , [9]. This result follows directly from the result derived above: in a rectangular array, the largest minimal loop has four nodes, so . A PD described in the next section, with , would be useful in nonrectangular networks, and where more gain near 0 phase is desirable.
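The stationary-point argument can be illustrated with a small numerical sketch. The setup is an assumption made for illustration only: a single ring of n = 4 oscillators stands in for one minimal loop of a rectangular array, each PD is modeled as an ideal detector that reports the phase difference wrapped into [-pi, pi), and the names wrap and net_errors are hypothetical. With the phases stepped by 2*pi/n around the ring, every oscillator sees equal and opposite errors from its two neighbors, so the summed error is zero even though every branch carries an error of 2*pi/n (pi/2 for the four-node loop).

```python
import math

def wrap(phase):
    """Wrap a phase difference into [-pi, pi)."""
    return (phase + math.pi) % (2.0 * math.pi) - math.pi

def net_errors(phases):
    """Sum of the wrapped phase errors seen by each oscillator in a ring."""
    n = len(phases)
    totals = []
    for i in range(n):
        from_prev = wrap(phases[i - 1] - phases[i])
        from_next = wrap(phases[(i + 1) % n] - phases[i])
        totals.append(from_prev + from_next)
    return totals

n = 4  # nodes in the assumed minimal loop
mode_locked = [2.0 * math.pi * i / n for i in range(n)]

errors = net_errors(mode_locked)
largest_branch_error = max(
    abs(wrap(mode_locked[(i + 1) % n] - mode_locked[i])) for i in range(n)
)

print("net error per oscillator:", [round(e, 12) for e in errors])
print("largest branch error (rad):", largest_branch_error)
print("2*pi/n (rad):", 2.0 * math.pi / n)
```

In this idealized model, a PD whose gain turns negative before the branch error reaches 2*pi/n would leave the in-phase state untouched while making the stepped state unstable, which is the design rule the section describes.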
IV. IMPLEMENTATION
(5) The distributed clock network generates the clock signal with
PLLs at multiple points (“nodes”) across a chip, and distributes
each only to a small section of the chip (“tile”) (Fig. 8). PDs at the boundaries between tiles produce error signals that are summed by an amplifier in each tile and used to adjust the frequency of the node oscillator.
Because the proposed network has many nodes, the power and size constraints on each node are even more stringent than the constraints on a single global PLL. The oscillator, PD, and loop filter of a working demonstration chip, fabricated in a standard 0.35-µm single-poly triple metal process, are considered in turn below.
Fig. 17. Locking behavior of the PLL array
A. Oscillator
The demonstration chip used an nMOS-loaded differential ring oscillator as a voltage-controlled oscillator (VCO) (Fig. 14). Transistors comprise the differential inverter. The differential pair is , the tail current is driven by , and act as the nMOS load. The nMOS loads allow fast oscillation and shield the output signal from noise. is a low-pass version of generated by subthreshold leakage through PFET ; supply noise coupling in through of is bypassed by . The oscillation frequency is only dependent on the supply voltage through capacitor nonlinearity and the output conductance of , and feedback of the PLL compensates drift of and .
B. Phase Detector (PD)
The PD, shown in Fig. 15, has a sufficient nonlinearity, higher gain at small input phase difference and less high-frequency content than an XOR PD. The core ( ) is an nMOS-loaded arbiter which acts as a nonlinear PD. For no input phase difference, the output is balanced. As the phase difference increases from zero, one output will be asserted for the full duration of an input pulse, while the other output will be asserted for only the remainder of the input pulse duration after the first input pulse ends, which is equal to the input phase difference. Thus the detector has very high gain near zero phase error that drops off to zero as the input phase difference approaches the input pulse width (Fig. 16).
The pulse generators and enable this arbiter to give frequency-error feedback. If one input is at a higher frequency than the other, its output will be asserted for more input pulses than the other. Because the width of the pulses is independent of input frequency, the average output voltage corresponds to frequency. Unlike a typical phase-frequency detector, however, the strength of the error signal falls to zero as frequency difference goes to 0, so there can be no mode-lock problems, yet large signal frequency (and hence, phase) locking is enhanced. Fig. 17 shows the large signal correction and small signal behavior of the entire array of PLLs as the already internally locked array
REFERENCES
[1] D. W. Bailey and B. J. Benschneider, “Clocking design and analysis for a 600-MHz Alpha microprocessor,” J. Solid State Circuits, vol. 33, no. 11, pp. 1627–1633, Nov. 1998.
[2] C. F. Webb, “A 400-MHz S/390 microprocessor,” in ISSCC Dig. Tech. Papers, Feb. 1997, pp. 168–169.
[3] T. Yoshida, “A 2-V 250-MHz multimedia processor,” in ISSCC Dig. Tech. Papers, Feb. 1997, pp. 266–267.
[4] I. A. Young, M. F. Mar, and B. Bhushan, “A 0.35-µm CMOS 3-880-MHz PLL N/2 clock multiplier and distribution network with low jitter for microprocessors,” in ISSCC Dig. Tech. Papers, Feb. 1997, pp. 330–331.
[5] H. B. Bakoglu, J. T. Walker, and J. D. Meindl, “A symmetric clock-distribution tree and optimized high-speed interconnections for reduced clock skew in ULSI and WSI circuits,” in IEEE Int. Conf. Computer Design, NY, Oct. 1986, pp. 118–122.
[6] P. Zarkesh-Ha, T. Mule, and J. D. Meindl, “Characterization and modeling of clock skew with process variations,” in Proc. IEEE 1999 Custom Integrated Circuits Conf., pp. 441–444.
[7] G. Geannopoulos and X. Dai, “An adaptive digital deskewing circuit for clock distribution networks,” in ISSCC Dig. Tech. Papers, Feb. 1998, pp. 400–401.
[8] F. Ançeau, “A synchronous approach for clocking VLSI systems,” J. Solid State Circuits, vol. SC-17, no. 1, pp. 51–56, Feb. 1982.
[9] G. A. Pratt and J. Nguyen, “Distributed synchronous clocking,” IEEE Trans. Parallel and Distributed Systems, Mar. 1995.

Vadim Gutnik (M’00) received the B.S. degree in electrical engineering and materials science from the University of California, Berkeley, in 1994, and the S.M. and Ph.D. degrees in electrical engineering from the Massachusetts Institute of Technology, Cambridge, in 1996 and 2000, respectively.
Previous research interests have included micromechanical resonators, and variable-voltage power supplies. He is currently working as a Design Engineer at Silicon Laboratories, Austin, TX.
Dr. Gutnik received an NDSEG fellowship in 1994, and the Intel Foundation Fellowship in 1997.

Anantha P. Chandrakasan (M’95) received the B.S., M.S., and Ph.D. degrees in electrical engineering and computer sciences from the University of California, Berkeley, in 1989, 1990, and 1994, respectively.
Since September, 1994, he has been the Analog Devices Career Development Assistant Professor of electrical engineering at the Massachusetts Institute of Technology, Cambridge. His research interests include the ultra-low-power implementation of custom and programmable digital signal processors, wireless sensors and multimedia devices, emerging technologies, and CAD tools for VLSI. He is a co-author of the book titled Low Power Digital CMOS Design (Norwood, MA: Kluwer, 1995). He has served on the technical program committee of various conferences including ISSCC, VLSI Circuits Symposium, DAC, ISLPED, and ICCD. He is the Technical Program Co-Chair for the 1997 International Symposium on Low-Power Electronics and Design and for VLSI Design’98.
He received the National Science Foundation Career Development Award in 1995, the IBM Faculty Development Award in 1995, and the National Semiconductor Faculty Development Award in 1996. He received the IEEE Communications Society 1993 Best Tutorial Paper Award for the IEEE Communications Magazine paper titled, “A Portable Multimedia Terminal.”
ISSCC 2000 / SESSION 10 / CLOCK GENERATION AND DISTRIBUTION / PAPER TA 10.5
TA 10.5 Active GHz Clock Network using Distributed PLLs
Vadim Gutnik, Anantha Chandrakasan
MIT Microsystems Technology Lab, Cambridge, MA
difference increases from zero, one output is asserted for the full duration of an input pulse, while the other output is asserted for only the remainder of the input pulse duration after the first input pulse ends, which is equal to the input phase difference. Thus the detector has high gain near zero phase error that drops off to zero as the input phase difference approaches the input pulse width
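The behavior described above can be captured in a toy transfer function. This is only an idealized sketch built from the verbal description, not the simulated curve of the actual circuit: it assumes the useful signal is the difference between the two output on-times per input pulse (full pulse width for the leading input, the time offset for the lagging one), that the output is exactly balanced at zero offset, and that it is zero once the offset reaches the pulse width. The function name pd_output, the sign convention, and the 0.2-ns pulse width are hypothetical.

```python
def pd_output(dt, pulse_width):
    """Idealized differential on-time per pulse for input time offset dt.

    Positive dt means the first input leads. Units follow the arguments.
    """
    if dt == 0.0 or abs(dt) >= pulse_width:
        return 0.0  # balanced at zero offset; no overlap beyond one pulse width
    magnitude = pulse_width - abs(dt)  # full pulse minus the lagging remainder
    return magnitude if dt > 0 else -magnitude

PULSE_WIDTH_NS = 0.2  # hypothetical value chosen for illustration

for dt in (-0.2, -0.1, -0.01, 0.0, 0.01, 0.1, 0.2):
    print(f"dt = {dt:+.2f} ns -> output = {pd_output(dt, PULSE_WIDTH_NS):+.2f} ns")
```

The step at zero offset is what gives the high small-signal gain, and the linear fall toward zero at one pulse width is the region where the effective gain is negative.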
Figure 10.5.1: Distributed clocking network.
Figure 10.5.3: Phase detector.
Figure 10.5.4: Simulated PD transfer curve (horizontal axis: time difference, nanoseconds).
Figure 10.5.8: Distributed clock chip.
Figure 11.1.7: Range finder ASIC in 0.8 µm CMOS and packaged transmitter-receiver. Core area of chip is 1×1.67 mm2.
Figure 11.3.4: 130 channel prototype micrograph. Chip is 2×8 mm2.
[Table: channel count, power per channel, position or timing resolution, and 10-year radiation dose for the vertex detector, tracker, calorimeter, and muon detector; values not legible in the scan.]
Abstract—This paper describes an all-analog multiphase delay-locked loop (DLL) architecture that achieves both wide-range operation and low-jitter performance. A replica delay line is attached to a conventional DLL to fully utilize the frequency range of the voltage-controlled delay line. The proposed DLL keeps the same benefits of conventional DLL's such as good jitter performance and multiphase clock generation. The DLL incorporates dynamic phase detectors and triply controlled delay cells with cell-level duty-cycle correction capability to generate equally spaced eight-phase clocks. The chip has been fabricated using a 0.35-µm CMOS process. The peak-to-peak jitter is less than 30 ps over the operating frequency range of 62.5–250 MHz. At 250 MHz, its jitter supply sensitivity is 0.11 ps/mV. It occupies smaller area (0.2 mm2) and dissipates less power (42 mW) than other wide-range DLL's [2]–[7].
Index Terms—Delay-locked loop, duty-cycle correction, dynamic phase detector, multiphase clock generation, replica delay line, triply controlled delay cell.
I. INTRODUCTION
work over process, voltage, and temperature (PVT) variations. Since DLL's adjust only phase, not frequency, the operating frequency range is severely limited. We propose a new DLL architecture that operates in a wide frequency range while keeping the low-jitter performance.
Various wide-range DLL architectures [2]–[7], with similar motivations, have been developed, which can be classified into three categories: analog type [2], digital type [3], [4], and dual-loop type [5]–[7]. While a conventional analog DLL [1] uses a voltage-controlled delay line (VCDL), the wide-range analog DLL [2] uses phase mixers for wide-range operation. However, because of its relatively high analog complexity, the analog DLL requires a process-specific implementation, making it relatively difficult to port across multiple processes [4]. Thus, digital DLL's [3], [4] have been proposed for better process portability. However, skew error and jitter are increased due to continuous change of phase selections among quantized delay times with supply and temperature variations. To overcome these problems, dual-loop architectures have been
Fig. 2. Block diagrams of (a) a digital DLL and (b) a dual-loop DLL.
Fig. 1. Block diagram of (a) a conventional DLL and (b) a DLL locking
operation and operating frequency range limitation.
or equivalently in terms of
Max
(4)
If the delay range of the controlled delay cell satisfies the re-
lation , the DLL will have a fre-
quency range determined by the entire delay range of the delay
cell. However, even if we make the delay range wider and sat-
isfy in an effort to increase the
frequency range, the lock range is limited to only 7:1.
In some applications where the frequency range must be
larger than 7:1, changing the pump-current ratio of the CSPD
can make the frequency range wider. For example, with
, the frequency range of 9:1 can be obtained.
With , the frequency range of 11:1 can be
obtained.
In high-frequency operations, , especially , may be too short to drive the XNOR gate. So, a divide-by-two circuit and a pair of delay cells are used to slow down the frequency of Ref-CLK [11]. The new configuration shown in Fig. 6 is effectively the same as the one in Fig. 4 but offers a more robust operation in the high-frequency operations.
C. Core DLL
Fig. 7 shows a simplified block diagram of the core DLL. It consists of a VCDL, a dynamic phase detector, a charge pump, and a loop filter. The core DLL generates eight-phase clock outputs through eight delay cells (DC's) in the VCDL. The core DLL is the same as a conventional analog DLL except that it has another control voltage Vcr. Vcr from the replica delay line coarsely determines the delay time of the VCDL so that is equal to in the locked state. In the locked state, the eighth clock output, CLK7 in Fig. 7, is aligned with Ref-CLK.
In high-frequency operations, there may be some static phase mismatch between CLK7 and Ref-CLK due to the long rise/fall times of signal transition edges compared with the period of the clock. So the fine-tuning is required. The dynamic phase detector (PD) in the core DLL generates control signal Vcp, fine-tunes , and removes residual phase mismatch so that the rising edge of Ref-CLK is exactly aligned with that of CLK7.
Fig. 8. (a) Core DLL with cell-level duty-cycle correction and (b) rising and falling edge alignment.
D. Cell-Level Duty-Cycle Correction
Fig. 8 shows the core DLL with a cell-level duty-cycle correction mechanism. In high-frequency operations, clock outputs with a short cycle time can be severely distorted as the clock passes through many delay cells. Even if the duty cycle of Ref-CLK is 50% at the entrance, that of CLK7 may deviate significantly from 50%. It causes multiphase clock outputs to have phase error, which could be fatal, especially in high-speed communication applications. A conventional solution is to attach duty-cycle correction circuits to all clock output drivers with the price of added area, increased jitter, and further phase mismatch due to elongated path. So a cell-level duty cycle correction is proposed.
The second phase detector shown in Fig. 8 takes inverted Ref-CLK and inverted CLK7 as its inputs, generating a control signal Vduty as the output. It fine-tunes the cell current ratio, and thus aligns the falling edges of Ref-CLK and CLK7. In the steady state, therefore, both rising and falling edges of CLK7 and Ref-CLK are synchronized in phase, and both clocks have the same duty cycle. It should be noted that the duty-cycle correction circuit (DCC) used right at the input of Ref-CLK corrects
MOON et al.: ALL-ANALOG MULTIPHASE DLL 381
Fig. 10. (a) Dynamic phase detector and (b) its operations.
Fig. 9. Triply controlled DC. (a) Circuit diagram of a DCE and (b)
configuration of a triply controlled DC.
Fig. 11. Prototype chip microphotograph.
the duty cycle of Ref-CLK only. With cell-level duty cycle correction, not only CLK7 but also the other intermediate clock outputs maintain a 50% duty cycle without any additional circuits. Although two control voltages Vcp and Vduty are simultaneously adjusted in the coupled negative feedback loops, the stability is guaranteed by making one of its loops have a sufficiently low bandwidth.
IV. CIRCUIT DESIGN
A. Triply Controlled Delay Cell
According to the noise analyses of [12] and [13], a fast-slewing (short rise/fall time) delay cell with a fully switching capability offers less phase noise. Although offering a full swing output, a shunt-capacitor delay cell [14], with its capacitor, would increase the chip area and power. Therefore, we decided to use the current-starved inverter [15] as a basic controlled delay cell. Since the current-starved inverter does not require a level conversion circuit, which is required for a differential delay cell, it has less chip area and power, although substrate and supply noise might cause detrimental influence.
A triply controlled delay cell is used as the basic delay cell element (DCE). The circuit diagram of the DCE and the configuration of one unit of DC are shown in Fig. 9. Four DCE's and two inverters compose a DC and make its rising/falling delay times symmetric. The delay time of the triply controlled delay cell is determined by six control signals: Vcr, Vcr_b, Vcp, Vcp_b, Vduty, and Vduty_b. Of those signals, Vcr and Vcr_b come from the replica delay line. In the DCE, the sizes of MP1 and MN1 are made larger than the others' so that Vcr and Vcr_b can control and primarily. The other control signals, which are generated by the core DLL, make only small adjustments to and for the fine-tuning of . Vcp and Vcp_b are used to align the rising edges of Ref-CLK and CLK7. Vduty and Vduty_b are responsible for maintaining the correct duty cycle and, thus, aligning the falling edges.
Since the high and low levels alternate in an inverter chain, duty-cycle control signals must alternate between Vduty and Vduty_b as well. Therefore, Vduty_b controls DCE0 and DCE3 and Vduty controls DCE1 and DCE2, as shown in Fig. 9(b). In the delay circuit, either Vduty or Vduty_b changes the duty cycle of the clock outputs by adjusting the current ratio of to . With this mechanism, the multiphase clock outputs, CLK0–CLK7, will be duty-cycle corrected and equally spaced. There is no need to attach a DCC circuit in each clock output.
B. Dynamic Phase Detector
Since the tuning precision of the core DLL depends on the characteristics of the phase detector, we propose a new high-precision dynamic phase detector. Fig. 10(a) shows the circuit diagram of the proposed dynamic phase detector, which is improved from the published phase-frequency detector [8] by removing a feedback path and replacing the feedback input with an REF and DCLK signal. The phase detector can operate with less phase offset at high frequencies due to symmetry of circuit, shallow logic depth of only two gates, and fast operation with a dynamic logic circuit. While the widths of UP and DOWN pulses are proportional to the phase difference of the inputs as shown in Fig. 10(b), there remains a chain of short pulses in the locked state. These pulses in the locked state serve to reduce the dead zone of the phase detector [8]. However, the accuracy of the phase detector is improved when the pulse duration is shorter. Furthermore, smaller capacitor in the loop filter can be used since the amount of pumped charge is smaller compared
382 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 35, NO. 3, MARCH 2000
Fig. 12. Clock waveforms at 62.5 MHz. (a) CLK0, CLK2 and (b) CLK0, CLK4.
with a conventional “bang-bang” type of phase detector or a proportional phase detector with wider pulse width.
V. EXPERIMENTAL RESULTS
The test chip has been fabricated using a 0.35-µm, N-well, triple-metal CMOS process. The threshold voltages in this process are 0.42 V (NMOS) and −0.22 V (PMOS). The gate-oxide thickness is 75 nm. Fig. 11 shows a microphotograph of the fabricated chip. The chip integrates the DLL with an on-chip decoupling capacitance of 270 pF. The active area of the DLL occupies 0.08 mm2 and the decoupling capacitor 0.12 mm2. Since the pulse currents of the multiphase clock outputs are interspersed, the ac component of the supply current is present at the eighth harmonic frequencies of the clock. Therefore, the 270-pF on-chip capacitor is adequate to reduce the on-chip supply noise induced by switching of digital circuits.
The prototype chip operates from 62.5 to 250 MHz with a 3.3-V power supply. Fig. 12(a) shows the waveforms of CLK0 and CLK2 at 62.5 MHz. These clock outputs are the first and the third clocks, respectively, and have a 90° phase difference. Fig. 12(b) shows the waveforms of CLK0 and CLK4, which are an inversion of each other with a 180° phase difference. Fig. 13(a) and (b) shows the same waveforms at 250 MHz. In spite of some ringing due to capacitance and inductance of the board and measurement instrument, the measurement results show that the clock outputs are aligned with precise phase relationships of less than 1% error over an operating frequency range from 62.5 to 250 MHz. The delay range of the VCDL is estimated to be between 4 and 16 ns. With minor change of device sizes of the VCDL, the operating frequency range could be extended toward a higher frequency range.
Fig. 14(a) and (b) shows the jitter histograms in the clock output CLK7. The frequency of Ref-CLK is 250 MHz. Fig. 14(a) shows 4-ps rms and 29-ps peak-to-peak jitter characteristics in a quiet power supply, where only the DLL is activated in the chip. When other digital circuits are turned on, rms and peak-to-peak jitter are increased to 6.4 and 44 ps, respectively, and internal supply noise of about 200 mV is measured. If a 500-mV, 1.1-MHz square wave is injected externally on the power supply, the peak-to-peak jitter increases to 83 ps, as shown in Fig. 14(b). At 250 MHz, jitter supply sensitivity is measured to be only 0.11 ps/mV. Furthermore, from 62.5 to 250 MHz, the clock outputs show almost flat jitter performance. Since the delay range of the VCDL in the core DLL is primarily set by Vcr and Vcr_b, the gain of the VCDL is nearly flat over a wide range of operating frequency. The jitter performance of the proposed DLL is better than or at least comparable to other wide-range DLL's [2]–[7].
Table I summarizes the DLL performance characteristics. The power dissipation is proportional to the operating frequency. Operating at 250 MHz, the DLL draws 12.6-mA dc from a 3.3-V power supply.
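The reported numbers can be cross-checked with a few lines of arithmetic. The sketch below is an illustration under stated assumptions, not part of the original measurements: it assumes the eight delay cells span exactly one reference period in lock (so adjacent outputs are 45° apart), and it reads the supply sensitivity as the increase in peak-to-peak jitter divided by the injected noise amplitude, which is one plausible way to arrive at the quoted figure. The function and constant names are hypothetical.

```python
N_PHASES = 8  # eight delay cells, outputs CLK0 to CLK7

def phase_deg(k):
    """Phase of CLKk relative to CLK0, assuming equal spacing over one period."""
    return 360.0 * k / N_PHASES

def period_ns(f_mhz):
    """Reference period, taken here as the total VCDL delay in lock."""
    return 1e3 / f_mhz

def cell_delay_ns(f_mhz):
    """Delay of one cell when eight cells span one period."""
    return period_ns(f_mhz) / N_PHASES

# Phase relationships quoted for Fig. 12
print("CLK0 to CLK2:", phase_deg(2), "deg")
print("CLK0 to CLK4:", phase_deg(4), "deg")

# VCDL delay at the two ends of the operating range
for f in (62.5, 250.0):
    print(f"{f} MHz: period {period_ns(f)} ns, per-cell delay {cell_delay_ns(f)} ns")

# Power at 250 MHz from the measured supply current
power_mw = 12.6 * 3.3  # mA * V
print("power at 250 MHz:", round(power_mw, 2), "mW")

# Supply sensitivity from quiet-supply and noisy-supply peak-to-peak jitter
sensitivity_ps_per_mv = (83.0 - 29.0) / 500.0
print("jitter supply sensitivity:", round(sensitivity_ps_per_mv, 3), "ps/mV")
```

Under these assumptions the periods at 250 and 62.5 MHz match the 4 to 16 ns VCDL delay range, the power comes out near the 42 mW quoted in the abstract, and the sensitivity estimate lands close to 0.11 ps/mV.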
Fig. 13. Clock waveforms at 250 MHz. (a) CLK0, CLK2 and (b) CLK0, CLK4.
Fig. 14. Jitter histograms at 250 MHz in (a) a quiet supply and (b) with added 1.1-MHz, 500-mV square wave noise.
TABLE I
PERFORMANCE CHARACTERISTICS OF PROTOTYPE CHIP

VI. CONCLUSION
By including a replica delay line with a CSPD, the core DLL operates in a wide frequency range from 62.5 to 250 MHz. Since the replica delay line occupies a quarter of the area of the core DLL, the area cost and power consumption of the prototype chip are much smaller than those of other wide-range DLL's [2]–[7]. Both the analog-control scheme and the flat gain of the VCDL offer a low-jitter performance of 4-ps rms and 29-ps peak-to-peak, and a low supply sensitivity of 0.11 ps/mV. The DLL incorporates dynamic phase detectors and triply controlled delay cells with cell-level duty-cycle correction capability in order to generate equally spaced eight-phase clocks.
The DLL can be used not only as an internal clock buffer of microprocessors and memory IC's but also as a multiphase clock generator for gigabit serial interfaces. With a faster VCDL with minor change of device sizes, the DLL will operate at a higher and wider frequency range.

REFERENCES
[1] M. Johnson and E. Hudson, “A variable delay line PLL for CPU-coprocessor synchronization,” IEEE J. Solid-State Circuits, vol. 23, pp. 1218–1223, Oct. 1988.
[2] T. H. Lee, K. S. Donnelly, J. T. C. Ho, J. Zerbe, M. G. Johnson, and T. Ishikawa, “A 2.5 V CMOS delay-locked loop for an 18 Mbit, 500 Megabyte/s DRAM,” IEEE J. Solid-State Circuits, vol. 29, pp. 1491–1496, Dec. 1994.
[3] A. Efendovich, Y. Afek, C. Sella, and Z. Bikowsky, “Multifrequency zero-jitter delay-locked loop,” IEEE J. Solid-State Circuits, vol. 29, pp. 67–70, Jan. 1994.
[4] B. W. Garlepp, K. S. Donnelly, J. Kim, P. S. Chau, J. L. Zerbe, C. Huang, C. V. Tran, C. L. Portmann, D. Stark, Y.-F. Chan, T. H. Lee, and M. A. Horowitz, “A Portable Digital DLL for High-Speed CMOS Interface Circuits,” IEEE J. Solid-State Circuits, vol. 34, pp. 632–644, May 1999.
[5] S. Tanoi, T. Tanabe, K. Takahashi, S. Miyamoto, and M. Uesugi, “A 250–622 MHz deskew and jitter-suppressed clock buffer using two-loop architecture,” IEEE J. Solid-State Circuits, vol. 31, pp. 487–493, Apr. 1996.
[6] K. Lee, Y. Moon, and D.-K. Jeong, “Dual loop delay-locked loop,” U.S. patent pending.
[7] S. Sidiropoulos and M. A. Horowitz, “A semi-digital dual delay-locked loop,” IEEE J. Solid-State Circuits, vol. 32, pp. 1683–1692, Nov. 1997.
[8] S. Kim, K. Lee, Y. Moon, D.-K. Jeong, Y. Choi, and H. K. Lim, “A 960-Mb/s/pin interface for skew-tolerant bus using low jitter PLL,” IEEE J. Solid-State Circuits, vol. 32, pp. 691–700, May 1997.
[9] D.-L. Chen and M. O. Baker, “A 1.25 Gb/s, 460 mW CMOS transceiver for serial data communication,” in IEEE ISSCC Dig. Tech. Papers, Feb. 1997, pp. 242–243.
[10] Y. Moon, D.-K. Jeong, and G. Kim, “Clock dithering for electromagnetic compliance using spread spectrum phase modulation,” in IEEE ISSCC Dig. Tech. Papers, Feb. 1999, pp. 186–187.
[11] Y. Moon, J. Choi, K. Lee, D.-K. Jeong, and M.-K. Kim, “A 62.5–250 MHz multi-phase delay-locked loop using a replica delay line with triply controlled delay cells,” in Proc. IEEE Custom Integrated Circuits Conf., May 1999, pp. 299–302.
[12] B. Kim, “High speed clock recovery in VLSI using hybrid analog/digital techniques,” Ph.D. dissertation, Univ. of California, Berkeley, Memo. UCB/ERL M90/50, June 1990.
[13] A. Hajimiri, S. Limotyrakis, and T. H. Lee, “Jitter and phase noise in ring oscillators,” IEEE J. Solid-State Circuits, vol. 34, pp. 790–804, June 1999.
[14] M. Bazes, “A novel precision MOS synchronous delay line,” IEEE J. Solid-State Circuits, vol. SC-20, pp. 1265–1271, Dec. 1985.
[15] D.-K. Jeong, G. Borriello, D. A. Hodges, and R. H. Katz, “Design of PLL-based clock generation circuits,” IEEE J. Solid-State Circuits, vol. SC-22, pp. 255–261, Apr. 1987.

Yongsam Moon (S'96) was born in Incheon, Korea, on March 1, 1971. He received the B.S. and M.S. degrees in electronics engineering from Seoul National University, Seoul, Korea, in 1994 and 1996, respectively, where he is currently pursuing the Ph.D. degree.
He has been working on architectures and CMOS circuits for microprocessors. His current research interests include clock and data recovery for high-speed communication and high-speed I/O interface circuits.

Jongsang Choi was born in Korea on September 11, 1974. He received the B.S. and M.S. degrees in electronics engineering from Seoul National University, Seoul, Korea, in 1997 and 1999, respectively, where he is currently pursuing the Ph.D. degree.
He has been working on architectures and CMOS circuits for high-speed communication. His current research interests include high-speed CMOS circuits and gigabit network systems.

Kyeongho Lee (S'92–M’00) was born in Seoul, Korea, on August 5, 1969. He received the B.S., M.S., and Ph.D. degrees in electronics engineering from Seoul National University, Seoul, Korea, in 1993, 1995, and 2000, respectively.
Since 2000 he has been with Global Communication Technology, Inc., Los Altos, CA. He is working on various CMOS high-speed circuits for RF communication. His research interests include high-speed CMOS circuits and PLL systems.

Deog-Kyoon Jeong (S'87–M'89) received the B.S. and M.S. degrees in electronics engineering from Seoul National University, Seoul, Korea, in 1981 and 1984, respectively, and the Ph.D. degree in electrical engineering and computer sciences from the University of California at Berkeley, Berkeley, CA, in 1989.
From 1989 to 1991, he was with Texas Instruments Incorporated, Dallas, TX, where he was a Member of the Technical Staff. He worked on modeling and design of BiCMOS circuits and single-chip implementation of the SPARC architecture. Since 1991, he has been on the Faculty of the School of Electrical Engineering, Seoul National University, Seoul, Korea, as an Associate Professor. His research interests include high-speed circuits, microprocessor architectures, and memory systems.

Min-Kyu Kim was born in Seoul, Korea, in 1965. He received the B.S., M.S., and Ph.D. degrees in electronics engineering from Seoul National University, Seoul, Korea, in 1988, 1990, and 1998, respectively.
From 1995 to 1996, he was with the Electronics and Telecommunications Research Institute, Taejon, Korea, working on the development of high-speed communication IC's for ATM switches. Since 1998, he has been working on high-speed serial link technologies at Silicon Image, Inc., Cupertino, CA. His current interests include circuit design for high-speed communication systems and digital-interface display systems.
IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 36, NO. 3, MARCH 2001 417
Abstract—This paper describes a low-voltage low-jitter clock synthesizer and a temperature-compensated tunable oscillator. Both of these circuits employ a self-correcting delay-locked loop (DLL) which solves the problem of false locking associated with conventional DLLs. This DLL does not require the delay control voltage to be set on power-up; it can recover from missing reference clock pulses and, because the delay range is not restricted, it can accommodate a variable reference clock frequency. The DLL provides multiple clock phases that are combined to produce the desired output frequency for the synthesizer, and provides temperature-compensated biasing for the tunable oscillator. With a 2-V supply the measured rms jitter for the 1-GHz synthesizer output was 3.2 ps. With a 3.3-V supply, rms jitter of 3.1 ps was measured for a 1.6-GHz output. The tunable oscillator has a 1.8% frequency variation over an ambient temperature range from 0 °C to 85 °C. The circuits were fabricated on a generic 0.5-µm digital CMOS process.
Index Terms—CMOS analog integrated circuits, delay-locked loops, frequency synthesizers, tunable oscillators, voltage controlled oscillators.
I. INTRODUCTION
TRADITIONALLY, phase-locked loops (PLLs) have been used for clock synthesis. The synthesizer and tunable oscillator outlined in this paper employ a delay-locked loop (DLL). A DLL is more stable than higher order PLLs and requires only one capacitor in its first-order loop filter. On the other hand, a PLL generally requires a more complex second-order filter. This filter usually employs larger components which may need to be off chip. Additionally, a DLL offers better jitter performance than a PLL because phase errors induced by supply or substrate noise do not accumulate over many clock cycles [1].
The self-correcting DLL overcomes problems of false locking associated with conventional DLLs. A self-correcting circuit detects when the DLL is locked, or is attempting to lock, to an incorrect delay and then brings the DLL into a correct locked state. This DLL does not require the delay control voltage to be set on power-up; it can recover from missing reference clock pulses and, because the delay range is not restricted, it can accommodate a variable reference clock frequency. This paper describes how a small number of additional digital logic gates are required to convert a conventional DLL into a wider range self-correcting DLL. For comparison, in [2] a second DLL is added to achieve wider range operation.
The synthesizer outlined in this paper operates over a wide range of input reference clock frequencies and generates a low-jitter output clock running at nine times the reference frequency. Jitter measurements of 3.2 ps rms and 20 ps peak-to-peak, for a 2-V supply and 1-GHz output frequency, show that the core DLL compares well with recently reported DLLs [2], [3]. Multiple clock phases from the DLL are combined using digital logic to produce the synthesizer output [4]. An alternative approach requiring a pair of on-chip tuned LC-tanks is described in [5].
The tunable voltage-controlled oscillator (VCO) is intended for use in a transceiver where the receive and transmit clocks are plesiochronous. It is possible to tune the VCO around a center frequency while still maintaining good temperature independence. In some applications it may also act as a replacement for a fractional-N-type synthesizer. This circuit is similar to the oscillator described in [6] but it uses a lower jitter DLL in place of the PLL and can operate over a wider frequency range.
In Section II the DLL architecture is discussed, starting with a review of a conventional DLL and progressing to the new self-correcting architecture. Section III outlines the clock synthesizer architecture. This is followed in Section IV by an outline of the temperature-compensated tunable oscillator architecture. Section V discusses the circuit layout and Section VI introduces measured performance results for the two circuits. This paper then concludes in Section VII with a summary of the achievements of this work.
II. DLL ARCHITECTURE
A. Conventional DLL
A simplified block diagram of a conventional DLL is illustrated in Fig. 1. This circuit contains a voltage-controlled delay line (VCDL), a phase detector, a charge pump, and a first-order loop filter. The delay line, consisting of cascaded variable delay stages, is driven by the input reference clock, ckref. The output of the delay line’s final stage and the ckref falling edges are compared by the phase detector to determine the phase alignment error. The phase detector output is integrated by the charge pump and loop filter capacitor to generate the control voltage, vcntl, of the delay stages.
When correctly locked, the total delay of the delay line should equal one period of the reference clock. A conventional
Manuscript received July 19, 2000; revised October 24, 2000. This work was supported by Parthus Technologies.
D. J. Foley is with the Department of Microelectronics, National University of Ireland, Cork, Ireland.
M. P. Flynn is with Parthus Technologies, Cork, Ireland.
Publisher Item Identifier S 0018-9200(01)01483-4.
0018–9200/01$10.00 © 2001 IEEE
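The lock condition stated for the conventional DLL (total delay equal to one reference period) and the nine-times multiplication quoted for the synthesizer can be checked numerically. The Python sketch below is illustrative only and is not the authors' design procedure: the function names, the 2% tolerance, and the modelling of a false lock as the delay settling near an integer multiple (2, 3, ...) of the reference period are assumptions introduced here.

```python
def synthesizer_targets(f_ref_hz, multiply=9):
    """Reference period, correct locked delay, and output frequency.

    multiply=9 follows the nine-times output clock quoted in the text;
    everything else is a plain unit conversion.
    """
    t_ref = 1.0 / f_ref_hz
    return {
        "t_ref_s": t_ref,
        "locked_delay_s": t_ref,          # correct lock: delay == one period
        "f_out_hz": multiply * f_ref_hz,
    }


def classify_lock(total_delay_s, f_ref_hz, tol=0.02):
    """Classify a settled delay-line delay (hypothetical helper).

    Assumption: a delay within `tol` of exactly one period is a correct
    lock, a delay within `tol` of 2, 3, ... periods is a false lock, and
    anything else is treated as not locked.
    """
    ratio = total_delay_s * f_ref_hz      # delay expressed in reference periods
    nearest = round(ratio)
    if nearest >= 1 and abs(ratio - nearest) <= tol:
        return "correct" if nearest == 1 else "false"
    return "unlocked"


if __name__ == "__main__":
    f_ref = 1e9 / 9                       # reference for a 1-GHz output
    targets = synthesizer_targets(f_ref)
    print(f"reference period: {targets['t_ref_s'] * 1e9:.2f} ns")
    print(f"output frequency: {targets['f_out_hz'] / 1e9:.3f} GHz")
    for delay_ns in (9.0, 13.5, 18.0):
        print(delay_ns, "ns ->", classify_lock(delay_ns * 1e-9, f_ref))
```

For a 1-GHz output the reference is about 111 MHz, so the delay line has to settle at about 9 ns; a loop that settles at 18 ns instead is the kind of incorrect delay the self-correcting circuit is described as detecting.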
composed of the same delay stages as the VCDL and its temperature (and process) variations will therefore be the same (apart from some minor random mismatch effects and thermal gradients across the die). vcntl thus compensates for the VCDL and VCO temperature fluctuations. The last VCO stage has an additional tuning voltage, tune, which fine tunes the VCO frequency. By varying the tune voltage it is possible to tune the VCO center frequency to within 3%. A wider tuning range can be achieved by varying the frequency of the DLL reference clock, ckref.
The schematic of the last VCO stage is shown in Fig. 12. This stage is identical to the other VCO and VCDL stages except that the VCR contains a transistor which is connected to the external tune voltage. In all other stages this transistor is connected to ground. The extra charging current required in this VCO stage is provided by the controlled current source bias .
V. CIRCUIT LAYOUT
The synthesizer and temperature-compensated tunable oscillator were fabricated on a standard 0.5-µm triple-metal single-poly digital CMOS process. The die photomicrograph of the device, containing both the synthesizer and temperature-compensated tunable oscillator, is shown in Fig. 13. The synthesizer has an active area of 0.6 mm² and the temperature-compensated tunable oscillator has an active area of 0.7 mm².
VI. TEST RESULTS
Fig. 14 shows a histogram of the edge jitter on the 1.62-GHz synthesizer output clock for a supply of 3.3 V. Edge jitter of 3.1 ps rms and 20 ps peak-to-peak were measured. The jitter measurements of 3.2 ps rms and 20 ps peak-to-peak, for a 2-V supply and 1-GHz output frequency, show that the DLL core exhibits better jitter performance than that reported for the higher voltage DLLs (3.3-V supply, 0.35-µm CMOS, 4-ps rms jitter) in [2] and (5-V supply, 0.7-µm CMOS, 10-ps rms jitter) in [3]. The measured jitter (rms) variation versus synthesizer output frequency for a 3.3-V supply is shown in Fig. 15. With the supply reduced to 1.8 V, the rms jitter was measured at 4.9 ps for an output frequency of 720 MHz. Fig. 16 shows this 720-MHz synthesizer output. Mismatched propagation delays and interblock routing in the frequency multiplication block (Fig. 9) resulted in 100-ps interperiod jitter.
Fig. 17 shows the temperature-compensated tunable oscillator frequency variation with temperature. Varying the ambient temperature from 0 °C to 85 °C resulted in a total frequency variation of 1.8%. Fig. 18 shows the variation of frequency with the tune voltage. As can be seen from the plot, the relationship is close to linear. It is possible to tune the frequency around a center frequency in the range from 200 to 500 MHz by selecting an appropriate input reference frequency. This ensures that this scheme can be used for a wide variety of applications. The measured jitter on the 400-MHz output was 29 ps rms and 180 ps peak-to-peak. Table I shows the measured synthesizer characteristics. Table II summarizes the measured characteristics of the temperature-compensated tunable oscillator.
Fig. 17. VCO frequency variation with temperature.
Fig. 18. VCO frequency variation with tune voltage.
TABLE I
MEASURED SYNTHESIZER CHARACTERISTICS
TABLE II
MEASURED TUNABLE OSCILLATOR CHARACTERISTICS
VII. CONCLUSION
In this paper, a robust self-correcting low-jitter DLL was used as the basis for a low-voltage high-frequency synthesizer and a temperature-compensated tunable oscillator. The DLL does not require the VCDL control voltage to be set on power-up. The DLL can recover from missing reference clock pulses and it can track step changes in a variable reference clock frequency. The synthesizer has significantly lower edge jitter than the traditional PLL-type synthesizer [9] and other reported DLL circuits [10], [11]. The temperature-compensated tunable oscillator provides a temperature-stable tunable frequency that varies by just 1.8% over the 0 °C to 85 °C temperature range.
ACKNOWLEDGMENT
The authors wish to acknowledge contributions from the following Parthus Technologies employees: J. Ryan, J. Horan, C. Cahill, F. Fuster, J. Collins, B. Kinsella, M. Erett, and S. Murphy. The authors also wish to thank R. Fitzgerald from the NMRC for the die photomicrographs. The device was fabricated on the ESM (Newport) Wafer Fab through Europractice.
REFERENCES
[1] B. Kim, T. C. Weingandt, and P. R. Gray, “PLL/DLL system noise analysis for low-jitter clock synthesizer design,” in Proc. ISCAS, June 1994, pp. 151–154.
[2] Y. Moon, J. Choi, K. Lee, D. Jeong, and M. Kim, “An all-analog multiphase delay-locked loop using a replica delay line for wide-range operation and low jitter,” IEEE J. Solid-State Circuits, vol. 35, pp. 377–384, Mar. 2000.
[3] M. Mota and J. Christiansen, “A high-resolution time interpolator based on a delay-locked loop and an RC delay line,” IEEE J. Solid-State Circuits, vol. 34, pp. 1360–1366, Oct. 1999.
[4] D. Foley and M. Flynn, “CMOS DLL-based 2-V 3.2-ps jitter 1-GHz clock synthesizer and temperature compensated tunable oscillator,” in Proc. IEEE Custom Integrated Circuits Conf., May 2000, pp. 371–374.
[5] G. Chien and P. R. Gray, “A 900-MHz local oscillator using a DLL-based frequency multiplier technique for PCS applications,” in ISSCC Dig. Tech. Papers, Feb. 2000, pp. 202–203.
[6] H. Chen, E. Lee, and R. Geiger, “A 2-GHz VCO with process and temperature compensation,” in Proc. ISCAS, June 1999, pp. 11 569–11 572.
[7] A. Young, J. K. Greason, and K. L. Wong, “A PLL clock generator with 5 to 110 MHz of lock range for microprocessors,” IEEE J. Solid-State Circuits, vol. SC-27, pp. 1599–1607, Nov. 1992.
[8] M. Horowitz, C.-K. K. Yang, and S. Sidiropoulos, “High-speed electrical signaling: Overview and limitations,” IEEE Micro., vol. 18, pp. 12–24, Jan./Feb. 1998.
[9] H. C. Yang, L. K. Lee, and R. S. Co, “A low-jitter 0.3–165 MHz CMOS PLL synthesizer for 3-V/5-V operation,” IEEE J. Solid-State Circuits, vol. 32, pp. 582–586, Apr. 1997.
[10] J. G. Maneatis, “Low-jitter process-independent DLL and PLL based on self-biased techniques,” IEEE J. Solid-State Circuits, vol. 31, pp. 1723–1732, Nov. 1996.
[11] S. Sidiropoulos and M. A. Horowitz, “A semidigital dual delay-locked loop,” IEEE J. Solid-State Circuits, vol. 32, pp. 1683–1692, Nov. 1997.
FOLEY AND FLYNN: CMOS DLL-BASED CLOCK SYNTHESIZER AND TEMPERATURE-COMPENSATED TUNABLE OSCILLATOR 423
David J. Foley (S’00) received the B.Eng. degree Michael P. Flynn (S’92–M’95–SM’98) was born in
from the National University of Ireland, Limerick, in Cork, Ireland. He received the B.E. and M.Eng.Sc.
June 1988. In 1994 he received the M.Eng.Sc. degree degrees from the National University of Ireland,
from the National University of Ireland, Cork, where Cork, in 1988 and 1990, respectively. He received
he is currently working toward the Ph.D. degree. the Ph.D. degree in electrical engineering from
He has worked in IC design with NEC Corpora- Carnegie Mellon University, Pittsburgh, PA, in 1995.
tion, Tamagawa, Japan, from 1988 to 1990, AT&T From 1988 to 1991, he was with the National
Bell Labs, Tokyo, Japan, from 1990 to 1992, and Microelectronics Research Center, Cork. He was
Parthus Technologies, Dublin, Ireland, from 1994 to a Co-op Engineer with National Semiconductor in
1998. Santa Clara, CA, from 1993 to 1995. From 1995
to 1997, he was a Member of Technical Staff with
Texas Instruments DSPS R&D Lab, Dallas, TX. He is now a Technical
Director with Parthus Technologies, Cork. He is also a part-time Lecturer in the
Department of Microelectronics at the National University of Ireland, Cork.
Dr. Flynn received the 1992–1993 IEEE Solid-State Circuit Predoctoral Fel-
lowship. He is a member of Sigma Xi.
ISSCC 2002 / SESSION 4 / BACKPLANE INTERCONNECTED ICs / 4.1
4.1 A 1.5V 86mW/ch 8-Channel 622–3125Mb/s/ch CMOS SerDes Macrocell with Selectable Mux/Demux Ratio
Fuji Yang, Jay O’Neill, Patrik Larsson, Dave Inglis, Joe Othmer
Agere Systems, Holmdel, NJ
2.5-3.125Gb/s serial links are commonly used for chip-to-chip interconnects in high-speed network systems. In SONET OC-768 application, at least 16 on-chip SerDes transceivers are required to guarantee total full duplex I/O throughput of 40Gb/s. Published 2.5Gb/s SerDes transceivers consume between 150 and 200mW, not suitable for applications requiring hundreds of on-chip SerDes transceivers [1]. Developing a low-power SerDes transceiver is important for high throughput network ICs [2]. Another challenge is reduction of inter-channel noise coupling when integrating many transceivers on the same chip. This low-power 8-channel SerDes macrocell employs a shared-PLL architecture. As shown in Figure 4.1.1, on the transmitter side, the on-chip TxPLL provides a half-rate clock to all transmitters. On the receiver side, the RxPLL distributes I- and Q-phase clocks to 8 receivers. Each receiver has a phase interpolator to generate an output phase-aligned with the incoming data for clock and data recovery. Sharing a single PLL between a group of transmitters or receivers reduces the power and avoids the potential multi-VCO coupling problem found in a conventional one-PLL-per-channel configuration. The macrocell realized in a 0.16µm CMOS process consumes an average power of 86mW per channel at 1.5V power supply.
The transmitter 16:1 or 20:1 serialization starts with 4 shift-register based selectable 4:1 or 5:1 multiplexers. Their 4 outputs are sent to a tree-based 4:1 multiplexer (Figure 4.1.1). A pMOS CML output driver with on-chip 50Ω terminations is employed. The output signal referenced to the ground makes the interface independent of the power supply. The output amplitude is set to 1Vpp, diff.
The receiver employs an interleaved integrate-and-dump front-end (Figure 4.1.1) [3, 4]. The integrate-and-dump operation improves the SNR and eliminates the quadrature clock required in a conventional half-rate front-end [5]. The integrator outputs are de-multiplexed by the decision-latches controlled respectively by ck2i and ck2q, which are divide-by-2 clocks of the recovered clock. The decision-latch outputs d1-d4 are fed into 4 shift-registers to realize the 4:16 or 4:20 de-serialization. The integrator is implemented in a way similar to that proposed in Reference [4], but with a pMOS input stage. It has a gain of 2 allowing relaxed offset and noise requirements of the latches. The receiver achieves 30mVpp,diff sensitivity with BER < 10^-12 at 2.5Gb/s.
The clock recovery is performed by a DLL based on an analog phase interpolator [6]. In contrast to the implementation in Reference [6], a four-quadrant phase mixer is used here. Referring to Figure 4.1.2, the DLL consists of a bang-bang phase detector (PD), a PD polarity control circuit, an amplitude control circuit, I- and Q-charge-pumps and the four-quadrant mixer-based phase interpolator. The analog phase interpolation is done by mixing the I- and Q-phase clocks from the RxPLL with respective weights α (=Va-Vref) and β (=Vb-Vref): CLK=α*(I-IB)+β*(Q-QB). Va and Vb are independently generated by I- and Q-charge-pumps. The weights α and β, ranging from negative to positive, directly control the quadrant changes. This eliminates the potential phase discontinuity at quadrant crossings found in the circuit of Reference [6].
Figure 4.1.3 shows the schematic of one 4-quadrant mixer, where the nMOS differential pair converts Va to differential currents Ip and In, which are mirrored into the pMOS current sources to be steered by the high-speed differential clock (I-IB). A self-biased nMOS load is used with MP1 and MP2 to control the output common-mode voltage.
The phase interpolator exhibits an infinite phase shift range allowing the DLL to easily track the frequency offset between the local clock and the incoming data and enables shared-PLL architecture for multi-channel serial links with plesiochronous clocking.
Figure 4.1.4 illustrates the non-monotonic relation between the phase shift introduced by the interpolator and the two weights α and β. To have a 2π interpolation range, the bang-bang phase detector polarity must be updated to provide the correct up/down signals for different quadrants. This is done by a PD-polarity-control circuit in association with a Q-detect circuit. The Q-detect circuit detects the output vector quadrant by determining the sign of α and β. The Q-detect circuit uses the replica of the V-I converter in the phase mixer.
Although the phase mixer has control weights α and β, the phase interpolation is only a function of α/β, and is independent of the amplitude of α and β. The loop, sensitive only to the phase variation, thus controls α/β. As a consequence, α and β can grow or shrink arbitrarily. To prevent α and β from being too small, an offset current is intentionally introduced in the charge-pumps. It is controlled as follows: If α>0, Iup = I0 + Ioffset and if α<0, Idown = I0 + Ioffset (the same algorithm is applied for Q-charge-pump). As a result, α and β are always pulled away from zero to eliminate any shrinking possibility. To prevent overflow on α and β, the amplitude control circuit clips α and β by blocking UP or DOWN signal. As shown in Figure 4.1.4, Va or Vb will be kept within [Vmin, Vmax].
The test chip in a 0.16µm 5-level metal CMOS technology uses a 217-pin PBGA package. The chip micrograph is shown in Figure 4.1.5. Active area is about 2mm². Figure 4.1.6a shows the measured jitter tolerance of the receiver. The CDR works with VDD as low as 1V for 1Gb/s maximal input data rate. With 1.5V power supply, the receiver covers an input data rate range of 622 to 3125Mb/s. Measured recovered clock jitter is 87.1ps pp at 2.5Gb/s. Figure 4.1.6b shows the Tx output eye diagram measured at 3.2Gb/s with a 2^31-1 PRWS. The measured jitter is 57.8ps pp and static VDD sensitivity is 0.06ps/mV. Measured results are summarized in Figure 4.1.7.
References:
[1] R. Gu et al., “A 0.5-3.5Gb/s Low-Power Low-Jitter Serial Data CMOS Transceiver,” ISSCC Digest of Technical Papers, pp. 352-353, Feb. 1999.
[2] M-J. Lee et al., “An 84mW 4Gb/s clock and data recovery circuit for serial link applications,” VLSI Symposium 2001, pp. 149-152, 2001.
[3] S. Sidiropoulos et al., “A 700Mb/s/pin CMOS signaling interface using a current integrating receivers,” IEEE JSSC, vol. 32, no. 5, pp. 681-690, May 1997.
[4] J. Savoj et al., “A CMOS Interface Circuit for Detection of 1.2Gb/s RZ Data,” ISSCC Digest of Technical Papers, pp. 278-279, Feb. 1999.
[5] P. Larsson, “An Offset-Cancelled CMOS Clock Recovery/Demux with Half-Rate Linear Phase Detector for 2.5Gb/s Optical Communication,” ISSCC Digest of Technical Papers, pp. 74-75, Feb. 2001.
[6] T. Lee et al., “A 2.5V CMOS delay-locked loop for an 18Mb, 500Mb/s DRAM,” IEEE JSSC, vol. 9, no. 2, Dec. 1994.
[Figure residue from the block diagrams and schematics; recoverable labels: PD polarity control, amplitude control, phase detector, output vector (Va, Vb), quadrants I to IV, clipping at Vmin and Vmax, common load circuit.]
Figure 4.1.3: Four-quadrant mixer schematic. Figure 4.1.4: Relation between the phase shift and the weights Va and Vb.
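The weighting rule CLK = α*(I-IB) + β*(Q-QB) and the charge-pump offset rule can be illustrated with a few lines of Python. This is an idealized sketch, not the circuit described in the paper: it assumes the I and Q clocks are unit-amplitude sinusoids in exact quadrature, the function names are hypothetical, and the mapping of sign pairs to quadrant labels I to IV is an assumption. Under those assumptions the output phase is atan2(β, α), which depends only on the ratio and signs of the weights and not on their amplitude.

```python
import math


def mixed_clock(alpha, beta, wt):
    """Idealized mixer output: I modelled as cos(wt), Q as sin(wt)."""
    return alpha * math.cos(wt) + beta * math.sin(wt)


def interpolated_phase_deg(alpha, beta):
    """Phase of the mixed clock relative to the I clock, in [0, 360) degrees."""
    if alpha == 0 and beta == 0:
        raise ValueError("phase is undefined when both weights are zero")
    return math.degrees(math.atan2(beta, alpha)) % 360.0


def quadrant(alpha, beta):
    """Quadrant of the output vector from the signs of the weights
    (the role the text assigns to the Q-detect circuit)."""
    if alpha >= 0 and beta >= 0:
        return "I"
    if alpha < 0 and beta >= 0:
        return "II"
    if alpha < 0 and beta < 0:
        return "III"
    return "IV"


def i_charge_pump_currents(alpha, i0, i_offset):
    """Offset rule quoted in the text for the I charge pump:
    alpha > 0 strengthens Iup, alpha < 0 strengthens Idown, so alpha is
    pulled away from zero. Leaving both currents at I0 when alpha == 0
    is an assumption made here."""
    i_up = i0 + i_offset if alpha > 0 else i0
    i_down = i0 + i_offset if alpha < 0 else i0
    return i_up, i_down


if __name__ == "__main__":
    for a, b in [(1.0, 1.0), (0.2, 0.2), (-1.0, 1.0), (-1.0, -1.0), (1.0, -1.0)]:
        print(a, b, quadrant(a, b), round(interpolated_phase_deg(a, b), 1))
```

Scaling both weights by the same positive factor leaves the phase unchanged, which is why the loop alone cannot fix the amplitude of α and β and a separate offset and clipping mechanism is needed.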
Technology: 0.16µm CMOS with 5 metal levels
BER < 10^-12 (all measurements were done with BER < 10^-12)
Figure 4.1.6: Measured Rx jitter tolerance and Tx eye diagram at 3.2Gb/s.
I. INTRODUCTION
vertically coupled inductors, a quality factor of only four was achievable without changing technology parameters.
The simplest way to reduce phase noise is to increase the resonator energy by applying higher voltages to the resonator. In this design an emitter-coupled pair with cross feedback is used as a negative resistance, which is responsible for undamping the resonator. The limit of the maximum oscillation amplitude depends on the feedback. There are three ways of feeding the output-signal back to the input (see Fig. 2). The easiest way is direct coupling, where no biasing network is needed and very low power consumption can be achieved. Using direct coupling the voltage across the resonator is limited by the base-collector diode of the transistors. When forward biased, this diode inserts additional damping and current noise to the resonator causing increased phase noise. With capacitive coupling [6] this can be avoided. Here no resistive element is inserted into the feedback. With capacitive feedback a phase noise of −100 dB/Hz at 100 kHz was achieved in [6] with quality factors of eight. The disadvantage is the need for a high-impedance biasing network at the transistors' bases. This biasing network can be realized by noisy resistors or by large inductors that cost a lot of chip-space. If resistors are used, uncorrelated noise is introduced to both halfwaves of the signal, when the oscillator acts in its linear region. This noise is nearly negligible, when a low-Q resonator is used. In our case the impedance at the input of the feedback amplifier is about 500 Ω. This impedance consists of the feedback capacitor and the tank impedance at resonance. The bias resistor in parallel is about 4 kΩ, and so the effect of adding noise is not very dominant. With inductive coupling the bias current can be fed through the inductor. This allows connecting a low-impedance biasing network which can be made of a voltage source. The advantage of connecting a voltage source directly to the circuit is the absence of resistive elements that cause white noise, which would be converted to phase noise by the nonlinear elements. Every DC path can be blocked carefully against emissions from the supplies without any resistive element. The maximum voltage at the resonator can be adjusted by the biasing voltage so that the base-collector diode is not the limiting element. Now the amplitude of the swing is limited by the base emitter diode of the transistors and the limitation of the current source. The energy in the resonator can be increased and so the phase noise is reduced. Now the maximum voltage is not one diode-voltage, as in the case of direct coupling. In the presented design the resonator voltage reaches a value of 3 .
The limiting elements for the maximum voltage of the resonator are now the two serial-connected tuning diodes. To decrease their voltage without reducing the resonator-energy, a capacitor is added in series at the cost of tuning range. This capacitor is also responsible for getting a linear tuning characteristic. To provide the DC-path for the tuning diodes, resistors are connected in parallel to the coupling capacitances. These resistors have a negligible effect on the quality factor because they have a large value of 1 kΩ, which is much larger than the capacitance's impedance of 40 Ω (see Fig. 6). The quality factor of the capacitance is about 24, which is in the same range as that of the varactor. These quality factors are high enough that their losses are negligible relative to those of the inductor.
The inductors are produced as symmetrical quadratic spirals. In our standard bipolar process only two metal layers could be used to create vertically coupled inductors. The crosses are made in the gap between two metal lines (see Fig. 7). The cost of this technique is the wide gap between the lines, which increases the size and parasitic effects such as series resistance and substrate capacitance. The quality factor is as low as four. This is caused by the technology, where the metal layers have a poor conductivity and high capacitances to the medium-doped substrate. In this design an inductor of 2.7 nH was used. Its series resistance is 4.2 Ω. The coupling factor was estimated to be 0.85. The values of the equivalent circuit (see Fig. 8) were first calculated by algorithms from [7] and then fitted to measurement. The coupling capacitor was estimated from the plate capacitance of the two metal layers.
For tuning, the base-emitter diode of a transistor is used (see Fig. 3). This has the disadvantage of a high series resistance (base resistance) of 2.6 Ω and a relatively low capacitance variation by a factor of 1.75 when applying a voltage difference of 2.7 V (see Fig. 4). However, this represents the only way to create a tuning diode without changing the standard bipolar process, where no hyperabrupt pn-junctions are available. The Q of this varactor was simulated to be about 25 (see Fig. 5) when it is calculated from 1/(ωRC). The base-collector diode could not be used for tuning, because it does not have such a large capacitance variation.
Fig. 3. Equivalent circuit of the tuning diode.
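The varactor quality factor is stated to follow from the series-RC expression Q = 1/(ωRC). The short Python sketch below evaluates that expression; it is a worked illustration, not data from the paper. The 2.6 Ω base resistance and Q of about 25 are taken from the text, the 1.96 GHz evaluation frequency is borrowed from the center frequency quoted in the conclusion, and the resulting capacitance is only what those three numbers imply together (the text does not state it). The function names are hypothetical.

```python
import math


def series_rc_q(f_hz, r_ohm, c_farad):
    """Quality factor of a capacitance C with series resistance R at frequency f."""
    return 1.0 / (2.0 * math.pi * f_hz * r_ohm * c_farad)


def capacitance_for_q(f_hz, r_ohm, q):
    """Capacitance that gives quality factor q for a given series resistance."""
    return 1.0 / (2.0 * math.pi * f_hz * r_ohm * q)


if __name__ == "__main__":
    f0 = 1.96e9      # Hz, assumed evaluation frequency
    r_base = 2.6     # ohm, series (base) resistance from the text
    q_target = 25.0  # simulated varactor Q from the text

    c_implied = capacitance_for_q(f0, r_base, q_target)
    print(f"implied capacitance: {c_implied * 1e12:.2f} pF")
    print(f"check Q: {series_rc_q(f0, r_base, c_implied):.1f}")
    # Doubling the series resistance halves Q for the same capacitance.
    print(f"Q with 2x resistance: {series_rc_q(f0, 2 * r_base, c_implied):.1f}")
```

With these inputs the implied capacitance is on the order of 1.25 pF; the inverse dependence of Q on R is why the relatively high base resistance is named as the drawback of using the base-emitter diode as a varactor.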
ZANNOTH et al.: FULLY INTEGRATED VCO 1989
TABLE I
SUMMARY OF THE VCO CHARACTERISTICS
[5]
This occurs because of the reduction of the quality factor Markus Zannoth was born in Munich, Germany, in
and the introduction of additional current noise due to the 1971. He received the Dipl. Ing. degree in electrical
engineering in 1996 from the Technical University
forward-biased diodes. of Munich, Munich, Germany. Since 1996, he has
been working towards the Dr.Ing. degree at Siemens
AG and the Technical University of Munich.
His doctoral research is on integrated oscillators.
IV. CONCLUSION
A fully integrated bipolar VCO is realized (see Fig. 12) that
achieves a measured phase noise of −136 dB/Hz at 4.7 MHz.
The oscillator has a linear tuning characteristic with a tuning
range of 150 MHz at a center frequency of 1.96 GHz. Further
characteristics are given in Table I.
Bernd Kolb was born in 1972. He studied electrical engineering with
In this design two metal layers are used to build vertically an emphasis on telecommunication techniques at the Georg-Simon-Ohm-
coupled integrated inductors. These have quality factors of Polytechnic Nuremberg. There, he received the Dipl.Ing. (FH) degree in 1995.
about four. Integrated varactor diodes are implemented by He joined the Siemens High Frequency IC Department in 1995. Since
then, he has worked in the field of oscillators, frequency dividers, and vector
using base-emitter diodes of transistors. With this design the modulators. He has focused on designing highly integrated transmitter IC’s
noise requirements of the DECT-specification of −132 dB/Hz for mobile communication. He is now with Lucent Network Systems GmbH
at 4.7 MHz frequency offset are achieved with a margin of Nuremberg, Germany, where he designs high-frequency parts of base station
for mobile communication.
4 dB. The output power is 8 dBm at 50 Ω, with a center
frequency of 1.95 GHz. For the use of this oscillator in a DECT
product, the varactor-capacitance will be increased until the
required center frequency of 1.88 GHz is reached. The design Joseph Fenk received the diploma in electronics
from the Technical University of Munich, Munich,
has been realized in standard high-volume bipolar process with Germany, in 1968.
an fT of 25 GHz. He is responsible for product definition and
project management of communications RF-
integrated circuits at Siemens Components, Inc.,
REFERENCES Integrated Circuit Division. After joining Siemens
in 1968, he worked as a Development Engineer
[1] ETSI, Digital European Cordless Telecommunications (DECT) Common on high-frequency components in the Discrete
Interface, Part 2: Physical Layer, Oct. 1992. Components Group, developing transmitters, aerial
[2] L. L. Larson, RF and Microwave Circuit Design for Wireless Communi- and tuner transistors, FET’s, and Varactor and PIN
cations. Boston: Artech House, 1996. diodes. In 1976, he joined the Integrated Circuits Group as a Design Engineer
[3] B. D. Leeson, “A simple model of feedback oscillator noise spectrum,” for consumer products. He has been engaged in the development of integrated
Proc. Lett. IEEE, pp. 329–330, Feb. 1966. circuits for infrared preamplifiers, prescalers, IF-amplifiers/demodulators for
[4] G. Sauvage, “Phase noise in oscillators: A mathematical analysis of FM-radio and satellite-TV, mixer/oscillators FM radio, TV-and SAT-TV, and
Leeson’s model,” IEEE Trans. Instrum. Meas., vol. IM-26, pp. 408–410, TV UHF/VHF modulator IC’s, as well as circuits for narrowband FM mobile
Dec. 1977. radio. He holds more than 50 patents relating to IC and system design and
has presented technical papers at numerous industry conferences and forums.

[5] J. Craninckx and M. S. J. Steyaert, “A 1.8-GHz low-phase-noise CMOS VCO using optimized hollow spiral inductors,” IEEE J. Solid-State Circuits, vol. 32, pp. 736–744, May 1997.
[6] G. Palmisano, M. Paparo, F. Torrisi, and P. Vita, “Noise in fully integrated PLL’s,” in Proc. 6th Workshop Advances in Analog Circuit Design AACD’97, Como, Italy, pp. 1–19.
[7] J. Crols, P. Kinget, J. Craninckx, and M. Steyaert, “An analytical model of planar inductors on lowly doped silicon substrates for high frequency analog design up to 3 GHz,” in IEEE Symp. VLSI Circuit Dig. Tech. Papers, 1996, pp. 28–29.
[8] J. N. Burghartz, M. Soyuer, and K. A. Jenkins, “Microwave inductors and capacitors in standard multilevel interconnect silicon technology,” IEEE Trans. Microwave Theory Tech., vol. 44, pp. 100–104, Jan. 1996.
[9] L. Dauphinee, M. Copeland, and P. Schvan, “A balanced 1.5 GHz voltage controlled oscillator with an integrated LC resonator,” in Proc. ISSCC’97, Session 23, Analog Techniques, pp. 390–391.
[10] I. B. Jansen, K. Negus, and D. Lee, “Silicon bipolar VCO family for 1.1 to 2.2 GHz with fully-integrated tank and tuning circuits,” in Proc. ISSCC’97, Session 23, Analog Techniques, p. 392.
[11] B. Razavi, “A 1.8 GHz CMOS voltage-controlled oscillator,” in Proc. ISSCC’97, Session 23, Analog Techniques, pp. 388–389.
[12] A. Hajimiri and T. H. Lee, “A general theory of phase noise in electrical oscillators,” IEEE J. Solid-State Circuits, vol. 33, pp. 179–194, Feb. 1998.
[13] CADENCE, Oscillator Noise Analysis in SpectreRF, application note to SpectreRF, 1998.
[14] F. X. Kärtner, “Untersuchung des Rauschverhaltens von Oszillatoren,” Ph.D. dissertation, Tech. Univ. Munich, Munich, Germany, 1988.

Robert Weigel (S’88–M’89–SM’95) was born in Ebermannstadt, Germany, in 1956. In 1989, he received the Dr.Ing. degree, and in 1992 the Dr.Ing.habil degree, both in electrical engineering from the Technical University of Munich, Munich, Germany.
From 1982 to 1988, he was a Research Assistant, from 1988 to 1994, he was a Senior Research Engineer, and from 1988 to 1996, he was a Professor at the Technical University of Munich. In the winter of 1994–1995, he was a Guest Professor at the Technical University of Vienna, Vienna, Austria. Since 1996, he has been Head of the Institute for Communication and Information Engineering at the University of Linz, Austria. He has been engaged in research and development on microwave theory and techniques, integrated optics, high-temperature superconductivity, surface acoustic wave (SAW) technology, and digital and microwave communication systems. In these fields, he has published more than 120 papers and has given more than 90 international presentations. His work includes European research projects and international journals.
Dr. Weigel is a senior member of the IEEE Microwave Theory and Techniques and the Ultrasonics, Ferroelectrics, and Frequency Control Societies. He is also a member of the Institute for Systems and Components of the Electromagnetics Academy, the Informationstechnische Gesellschaft (ITG) in the Verband Deutscher Elektrotechniker (VDE), and the Society of Photo-Optical Instrumentation Engineers (SPIE). In 1993 he was a co-recipient of the MIOP-award.
IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 33, NO. 2, FEBRUARY 1998 295
I. INTRODUCTION
II. CIRCUIT
Fig. 3. (a) The ncPFD in zero degree phase offset version. (b) Modified version with rad phase offset.

The transistor schematic of the ncPFD is shown in Fig. 3(a). The detector has a 0-rad phase offset. The main part of the circuit is the nc stage [4]. Delays (two inverters) are inserted at the reference and slave inputs in order to remove the dead zone in the phase characteristics around rad phase error. In Fig. 4, waveforms for the circuit in Fig. 3(a) are shown when the slave input lags the reference input.
The detector can easily be modified to one with -rad phase offset, as shown in Fig. 3(b), where one, or in general an odd number, of inverter(s) are used for the delays.
If the phase detector is used only as a phase detector, i.e., not as a frequency detector, the circuit in Fig. 3(a) can be used as

Manuscript received March 11, 1997; revised August 21, 1997.
The author is with Electronic Devices, Department of Physics and Measurement Technology, Linköping University, S-58183 Linköping, Sweden.
Publisher Item Identifier S 0018-9200(98)00732-X.
0018–9200/98$10.00 1998 IEEE
Fig. 4. Waveforms for the case when slave lags after the reference signal.
The pulse width of the up signal is larger than for the down signal.
Fig. 5. Phase characteristics of the ncPFD (solid line), conPFD (dashed line),
and the ptPFD (dash-dot line) from SPICE level-2 simulations of extracted
layout, VDD = 3.0 V and f = 50 MHz.
Fig. 10. Waveforms for the case when the slave has a higher frequency than
the reference signal. The down signal has higher duty cycle than the up signal.
Fig. 8. The width of the dead zones of the ncPFD (solid), ptPFD (dashed),
and conventional PFD (dash-dot) as function of frequency. The frequency
resolution is 100 MHz and the supply voltage is 5.0 V. The plot is based on
SPICE simulations of extracted layout.
Fig. 11. Frequency sensitivity for the ncPFD (solid), ptPFD (dash-dot),
and conPFD (dashed). The plot is based on behavioral simulations with
20 different initial phases for each frequency and the mean-value for each
frequency is plotted. The reference frequency is 50 MHz.
Fig. 12. Frequency sensitivity for the ncPFD for a number of frequencies. The plot is based on behavioral simulations with 20 different initial phases for each frequency. The solid line is the mean value and the “+” symbols are the minimum and maximum values. The reference frequency is 50 MHz.
Fig. 14. Lock-in process of a third-order PLL with the ncPFD as phase frequency detector. The loop filter and PLL data are shown in the upper right corner.
Fig. 13. Frequency sensitivity for the ncPFD when the slave frequency is 4/5 of the reference frequency. For the initial phases of 0.0, 2.5, and 5.0 ns the sensitivity is zero.

this false locking will not be stable, since a small phase change results in a nonzero sensitivity and drives the loop back to lock. One way to add small phase changes to the simulation is to include phase noise, which is always present in an oscillator. When we add phase noise of approximately 300 ps peak-to-peak to the simulations, the normalized minimum sensitivity, which was zero, increases to approximately 0.01. The improvement is not significant, but the sensitivity will be nonzero and positive for all phases. Hence, false locking is avoided. To further enhance the phase noise during the lock-in process, one could use dithering techniques, i.e., add the signal from a noise/signal source to the control voltage of the VCO.

V. BEHAVIORAL MIXED-MODE SIMULATIONS
In order to understand the sensitivity to frequency errors and lock-in properties of the proposed detector, a complete third-order charge pump PLL system was simulated using a multilevel mixed-mode simulator, Lsim [5]. The PFD was represented by a schematic simulated in switch mode. The VCO, phase-noise generator, and charge pump are represented by behavioral models written in the hardware description

VI. EXPERIMENTS
The phase detection properties of the ncPFD have been verified experimentally with a test chip. The test chip is a line receiver for serial data that utilizes several parallel samplers to receive bit rates of 2.0 Gb/s [7]. The phase detector was used in a delay-locked loop (DLL) which generates control signals for the sampling switches used in the line receiver. The ncPFD, Fig. 3(a), was used as a -rad phase detector and the delay line was half a wavelength long.
The skew between the reference and slave signals cannot be measured directly. This quantity has been measured indirectly through measurement error compensation circuits to be about 125 ps at MHz. Unfortunately, there is no control of how large the measurement error is.
The circuit blocks used to measure the offset are shown in Fig. 15. The two clocks that we want to compare come from the beginning and the end of the delay line. They are fed into two matched inverter chains where the propagation delays for rising and falling edges are matched against process variations [8]. The delays from the multiplexer inputs to the oscilloscope screen for the two signal paths are not matched. Two measurements are made to compensate for this: one where the delay line input signal goes uninverted through Output buffer 1, and one where the same signal goes inverted through Output buffer 2. The measured skew including the measurement error for the measurements will be as follows:

skew inv mux Buf

Fig. 15. DLL, phase offset measurement circuitry, and NMOS transistor to access the control voltage.
Fig. 16. Oscilloscope screen dump of the drain voltage of an NMOS transistor with external pull-up resistor where the gate is connected to the control voltage. Four different lock-in procedures are shown. The initial control voltages are 0.0, 1.0, 2.0, and 3.0 V for the curves from top to bottom, respectively.
Abstract: Rotary traveling-wave oscillators (RTWOs) represent a new transmission-line approach to gigahertz-rate clock generation. Using the inherently stable LC characteristics of on-chip VLSI interconnect, the clock distribution network becomes a low-impedance distributed oscillator. The RTWO operates by creating a rotating traveling wave within a closed-loop differential transmission line. Distributed CMOS inverters serve as both transmission-line amplifiers and latches to power the oscillation and ensure rotational lock. Load capacitance is absorbed into the transmission-line constants whereby energy is recirculated giving an adiabatic quality. Unusually for an LC oscillator, multiphase (360°) square waves are produced directly. RTWO structures are compact and can be wired together to form rotary oscillator arrays (ROAs) to distribute a phase-locked clock over a large chip. The principle is scalable to very high clock frequencies. Issues related to interconnect and field coupling dominate the design process for RTWOs. Taking precautions to avoid unwanted signal couplings, the rise and fall times of 20 ps, suggested by simulation, may be realized at low power consumption. Experimental results of the 0.25-μm CMOS test chip with 950-MHz and 3.4-GHz rings are presented, indicating 5.5-ps jitter and 34-dB power supply rejection ratio (PSRR). Design errors in the test chip precluded meaningful rise and fall time measurements.

Index Terms: Clocks, MOSFET oscillators, phase-locked oscillators, phased arrays, synchronization, timing circuits, transmission line resonators, traveling-wave amplifiers.

I. INTRODUCTION
CLOCKING at gigahertz rates requires generators with low skew and low jitter to avoid synchronous timing failures. The notion of a “clocking surface” becomes untenable at gigahertz rates [1], frequently mandating that large VLSI chips are subdivided into multiple clock domains and/or utilize skew-tolerant multiphase circuit design techniques [2].
Techniques such as distributed phase-locked loops (PLLs) [3] and delay-locked loops (DLLs) [4] can control systematic skew to within 20 ps, but are complex, introduce random skew (i.e., jitter), and have area penalties. H-tree distribution systems, while simple, are difficult to balance and can use upwards of 30% of a chip’s total power budget [5]. All these systems are inherently single-phase, induce large amounts of simultaneous switching noise, and can be highly susceptible to this noise.
Researchers have therefore looked to alternative oscillator mechanisms for better phase stability and lower power consumption. Previous transmission-line systems such as salphasic distribution [6], distributed amplifiers [7], and adiabatic LC resonant clocks [8] provide only a sinusoidal or semisinusoidal clock, making fast edge rates difficult to achieve.
This paper introduces the rotary traveling-wave oscillator (RTWO), a differential LC transmission-line oscillator which produces gigahertz-rate multiphase (360°) square waves with low jitter. Extension of the RTWO to rotary oscillator arrays (ROAs) offers a scalable architecture with the potential for low-power low-skew clock generation over an arbitrary chip area without resorting to clock domains. Simulations predict rise and fall times of 20 ps on a 0.25-μm process and a maximum frequency limited only by the $f_T$ of the integrated circuit technology used.
Experiments show that although the RTWO operates differentially, careful attention is required to guard against magnetic field couplings between the clock conductors and other structures if the potential performance of these oscillators is to be realized.

II. CONCEPT OF THE ROTARY CLOCK OSCILLATOR
A. Fundamentals and Structures
The basic ROA architecture is shown in Fig. 1. A representative multigigahertz rotary clock layout has 25 interconnected RTWO rings placed onto a 7 × 7 array grid. Each ring consists of a differential line driven by shunt-connected antiparallel inverters distributed around the ring. This arrangement produces a single clock edge in each ring which sweeps around the ring at a frequency dependent on the electrical length of the ring. Pulses are synchronized between rings by hard wiring which forces phase lock.
Fig. 2 illustrates the theory behind the individual RTWO. Fig. 2(a) depicts an open loop of differential transmission line (exhibiting LC characteristics) connected to a battery through an ideal switch. When the switch is closed, a voltage wave begins to travel counterclockwise around the loop. Fig. 2(b) shows a similar loop, with the voltage source replaced by a cross-connection of the inner and outer conductors to cause a signal inversion. If there were no losses, a wave could travel on this ring indefinitely, providing a full clock cycle every other rotation of the ring (the Möbius effect).
In real applications, multiple antiparallel inverter pairs are added to the line to overcome losses and give rotation lock. Rings are simple closed loops and oscillation occurs spontaneously upon any noise event. Unbiased, startup can occur in

Manuscript received March 20, 2001; revised June 28, 2001. This work was supported by Multigig Ltd., and also supported in part by the National Science Foundation under Award EIA-31332.
J. Wood is with MultiGig, Ltd., Northampton NN8 1RF, U.K. (e-mail: john.wood@multigig.com).
T. C. Edwards is with Engalco, Huntington, YO32 9NY, U.K. (e-mail: enquiries@engalco.com).
S. Lipa is with the Microelectronics Systems Laboratory, North Carolina State University, Raleigh, NC 27695 USA.
Publisher Item Identifier S 0018-9200(01)08220-8.
0018–9200/01$10.00 © 2001 IEEE
WOOD et al.: ROTARY TRAVELING-WAVE OSCILLATOR ARRAYS 1655
Fig. 1. Basic rotary clock architecture. The = signs denote points with the same phase.
Fig. 2. Idealized theory underlying the RTWO. (a) Open loop of differential conductors to a battery via a switch. (b) Similar loop but with the voltage source replaced by the inner and outer conductors cross-connected.
Fig. 3. Waveforms of line voltage and line current for the 3.4-GHz clock simulation example.

B. Waveforms
Fig. 3 shows simulated waveforms of a 3.4-GHz RTWO taken at an arbitrary position on the ring. The design has the following characteristics for reference:
• Conductors: Width μm
• Pitch μm
• Ring Length μm
• Metallization: 1.75 μm copper
• Loop inductance total nH
• Process: 0.25-μm CMOS
• Nch total width: 2000 μm
• Pch total width: 5000 μm
• Number of inverters: 24 pairs.
Very large distributed transistor widths give substantial capacitive loading to the lines, thus lowering velocity to give a reasonably low clock rate from a compact oscillator structure. In application, up to 75% of this capacitance can come from load capacitance, reducing the size of the drive transistors accordingly.
The upper traces of Fig. 3 show the simulated voltage waveforms on the differential line at points labeled A0, B0. The lower traces show the current in the conductors to be 200 mA, while the supply current is simulated at 84 mA with 4.5 mA of ripple. This clearly illustrates that energy is recycled by the basic operation of the RTWO. Just driving the 34 pF of capacitance present would require 275 mA at this frequency (from ).
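The energy-recycling argument above can be checked with a rough calculation. The expression the text refers to did not survive extraction, so the sketch below assumes the usual charge-per-cycle estimate I = C·V·f and an assumed 2.5-V swing; both are assumptions for illustration, not the source's stated formula. Under those assumptions the result is of the same order as the 275 mA quoted, and well above the 84 mA simulated supply current.

```python
# Rough check of the energy-recycling claim.
# ASSUMPTIONS (not from the source): the formula I = C * V * f and the
# 2.5 V swing. Capacitance, frequency, and supply current are from the text.

def cv_f_current(capacitance_f, swing_v, freq_hz):
    """Average current needed to charge a capacitance once per cycle."""
    return capacitance_f * swing_v * freq_hz

C_LINE = 34e-12     # 34 pF of capacitance (from the text)
F_CLK = 3.4e9       # 3.4-GHz clock (from the text)
V_SWING = 2.5       # assumed voltage swing in volts (hypothetical)
I_SUPPLY = 84e-3    # simulated supply current (from the text)

i_cvf = cv_f_current(C_LINE, V_SWING, F_CLK)
ratio = I_SUPPLY / i_cvf

print(f"C*V*f current estimate: {i_cvf * 1e3:.0f} mA")
print(f"supply current / C*V*f estimate: {ratio:.2f}")
```

With these inputs the simulated supply current is roughly a third of the conventional drive-current estimate, which is the sense in which the ring recirculates energy.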
Fig. 6. Expanded view of short sections of the transmission line, including three sets of back-to-back inverters as a wavefront passes.

D. Network Rules
Although the square-ring shape is convenient to show diagrammatically, it is only one example of a more general network solution which requires ROAs to conform closely to the following rules.
1) Signal inversion must occur on all (or most) closed paths.
2) Impedance should match at all junctions.
3) Signals should arrive simultaneously at junctions.
From 1) above, any odd number of crossovers are allowed on the differential path and regular crossovers forming a braided or “twisted pair” effect can dramatically reduce the unwanted coupling to wires running alongside the differential line.
The differential lines would typically be fabricated on the top metal layer of a CMOS chip where the reverse-scaling trend of VLSI interconnect offers increasingly high performance [10].

E. Fields and Currents
Fig. 5 illustrates a three-dimensional section of the ring structure connected to a pair of CMOS inverters expanded to show the four individual transistors. The main current flow in the differential conductors is shown by solid arrows, the magnetic field surrounding these conductors by dashed loops, and the capacitance charge/signal-boost current flowing through the transistors by dashed lines.
An important feature of differential lines is the existence of a well-defined “go” and “return” path which gives predictable inductance characteristics in contrast to the uncertain return-current path for single-ended clock distribution [11].
Capacitance arises mainly from the transistor gate and depletion capacitance and interconnect capacitance does not dominate.
indicates intrinsic gate resistance, i.e., the ohmic path through which the gate charge flows. The term implies a parasitic gate term, but in reality, most of this resistance is in the series circuit of the channel under the gate electrode. This is shared by the D-S channel, as illustrated by the triangular region (shown with transistors operating in the pinchoff region).

F. Coherent Amplification, Rotation Locking
Fig. 6 is an expanded view of a short section of transmission line with three sets of back-to-back inverters shown. It is assumed that startup is complete and the rotating wave is sweeping left to right. For this analysis, we view the inverter pairs as discrete latch elements.
Each latch switches in turn as the incident signal, traveling on the low impedance transmission line, overrides the ON resistance of the latch and its previous state. This “clash” of states occurs only at the rotating wavefront and therefore only one region is in this cross-conduction condition at any one time. The transmission-line impedance is of the order of 10 Ω and the differential on-resistance of the inverters is in the 100-Ω to 1-kΩ range, depending on how finely they are distributed throughout the structure.
Once switched, each latch contributes for the remainder of the half cycle, adding to the forward-going signal. Coherent buildup of switching events occurs in this forward direction only. An equal amount of energy is launched in the reverse direction, but the latches in that direction cannot be switched further into the state to which they have already switched. The reverse-traveling components simply reduce the amount of drive required from those latches.
Importantly, it is the nonlinear latching action which is responsible for the self-locking of direction (a highly linear amplifier has no such directionality).
To clarify the above statements, Fig. 7 demonstrates how a large CMOS latch responds to an imposed differential signal. The curve trace shows a central differential-amplification region bounded by two absorptive ohmic regions (shaded) corre-
(1)
where
interconnect capacitance for the line AB;
gate overlap and Miller-effect feedback capacitance;
total channel capacitance;
drain depletion capacitance to bulk (substrate);
load capacitance added to a line.
(Note that the is used to convert the in-parallel “to ground” values into in-series differential values of capacitance.)
is usually a small part of total capacitance and accurate formulas are available [12] if needed.
To calculate the per-unit-length differential inductance, i.e., accounting for mutual coupling, we use [13], expressed below.
(2)

RingLen (4)
(The factor of 2 arises from the pulse requiring two complete laps for a single cycle.)
Differential characteristic impedance is given by
(5)
Transmission line characteristics dominate over RC characteristics when [14]
(6)

H. Bandwidth and Power Consumption
Seen from an RF perspective, Fig. 8(a) shows the RTWO to be two push–pull distributed amplifiers folded on top of each other. Distributed amplifiers exhibit very wide bandwidth because parasitic capacitances are “neutralized” by becoming part
1658 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 36, NO. 11, NOVEMBER 2001
of the transmission-line impedance [15]. Performance is limited by the carrier transit time of the MOSFETs [16], not by the traditional digital inverter propagation time, which is not applicable where gates and drains are driven cooperatively by an imposed low-impedance signal, and where the load capacitance is hidden in the transmission line.
Operation of the RTWO is largely adiabatic when the voltage drop required to charge the capacitances is developed mainly across the inductance:
(7)
and when the intrinsic gate resistance is low relative to the reactance of the gate capacitance.
(8)
RTWO rise and fall times are controllable by setting the cutoff frequency of the transmission lines.
(9)
Edges become faster and cross-conduction losses are reduced when the structure is more distributed.
Table I lists characteristic changes with N, where with , and held constant.

TABLE I
CHANGES OF CHARACTERISTICS WITH N

The most significant power loss mechanism for the RTWO is power dissipated in the interconnect, given by
(10)
Most of the remaining losses in Table I are attributed to cross-conduction and parasitic losses. is a real loss mechanism for gigahertz signals, and RTWO rise/fall times can be doubled by this phenomenon. In newer CMOS processes, improves with shorter channel length.

III. MORE DETAILED CONSIDERATIONS
A. Skew Control
Interconnected RTWO loops offer the potential to control skew in spite of relatively large open-loop time-of-flight mismatches. Functionally, phase averaging occurs by pulse combination at the junction of multiple transmission lines. For a four-port junction, the normal operating mode will see two pulses arriving at the junction simultaneously. These two sources will feed two output ports and signal flow will be unimpeded by reflections if impedance is matched. This amounts to a situation similar to that described in [17], [18], although for ROAs, the mechanism is LC transmission-line energy combination, not ohmic combination of CMOS inverter outputs.
Where there exists a time-of-flight mismatch, one pulse arrives at the junction before the other. Fig. 9(a) depicts the operation of a four-port junction between two interwired but velocity-mismatched RTWO loops. Each of these rings has been divided into segments numbered (each as Fig. 8). Four rings are wired together (similar to Fig. 16, shown later). Only the junction of the rings and are considered here; the latter having a higher open-loop operating frequency.
Fig. 11. Segment of chip layout showing 90° routing beneath clock lines and a tap to clock (CLK: CLK) loads.

(11)
When the above condition is met, the capacitance can be taken as being effectively lumped on the main RTWO ring at the tap point for the purposes of predicting oscillator frequency and ring impedance.
Although not immediately apparent, this condition is achievable in practice due to three factors. The first factor is that the tap line velocity is relatively fast for SiO2 dielectric. It is approximately , while the main RTWO oscillator ring might be operating at perhaps . The second factor is that the tap length only has to be long enough to reach within a single RTWO ring. The third factor is that it requires two signal rotations on the RTWO to complete a clock cycle. These three factors work together to make the RTWO rings physically small compared to the expected speed-of-light dimensions. The distances to be spanned by the fast tap wires are therefore short enough that transmission-line effects on these lines are unimportant, certainly at the clock fundamental frequency and even at higher harmonics.
This can be illustrated by reference to a specific 3.4-GHz RTWO, 3200 μm long with 20-ps rise/fall times. Within one of these rise or fall periods, a stub transmission line with velocity $0.5c$ is able to communicate a signal over a distance of 3 mm. For a stub length of 400 μm (to reach the center of the ring), this equates to 3.75 round-trip times along the stub.
Fig. 12 shows simulated waveforms with 2 pF of total to-ground capacitance at the end of one such stub. Reflected energy gives rise to the ringing which is evident with this level of capacitance. The line resistance of the stubs must be low to maintain reflective energy conservation.
The ratiometric factors outlined above between ring length, frequency, rise/fall time, and stub lengths are expected to hold as ROAs are scaled to higher frequencies and smaller ring lengths without requiring special stub tuning measures.
Capacitive Loading Limits: Substantial total-chip capacitive loading can be tolerated by the RTWO relative to conventionally resonant systems [8], [22], [23]. However, the loading effects of interconnect, active, and stub capacitances cannot be increased without limit. The consequential lowering of line impedance increases circulating currents until losses become a concern. Eventually, the impedance becomes so low relative to the loop resistance that the relation (6) cannot be maintained, whereupon oscillation ceases altogether.

D. Frequency/Impedance Adjustment
Rewriting (4) in the form below shows that frequency is set only by the total inductance and capacitance of the RTWO loop.
(12)
Total loop inductance is proportional to RingLen and varies strongly as a function of the width and pitch of the top metal differential conductors. This allows a coarse frequency selection through the top-metal mask definition. Unit-to-unit inductance variation is expected to be small because of the good lithographic reproduction of the relatively large clock conductors and the weak sensitivity of inductance to metal thickness variations.
Total capacitance for the RTWO is the sum of all lumped capacitances connected to the loop (1). tends to be dominated by gate-oxide capacitance from the drive FETs and the clock load FETs. is inversely proportional to gate-oxide thickness , which on a modern CMOS SiO2 is controlled to approximately 5% variation over extended wafer lots [24]. Drain depletion capacitances exist on bulk CMOS where the active transistors connect to the ring.
During the VLSI layout phase, a CAD tool (expected release: Q1 2002) can target a fixed operating frequency. The tool will be able to correct impedance discontinuities caused by lumped load capacitance by the addition of dummy “padding” capacitance elsewhere around the loop, and postcompensate an overly capacitive-loaded clock network by reducing the differential inductances through pitch reduction, hence restoring velocity and thus frequency. Alternatively, at the expense of using more metallization, a new layout with more numerous, shorter length rings could be used. The tool will need to simultaneously solve impedance matching issues [refer to Section II-A, (5)]. By manipulation of both and simultaneously, it is possible to control and independently, as shown diagrammatically in Fig. 13. For example, velocity can be reduced by increasing both and by the same factor to cancel the effect on .
These adjustments can support arbitrary branch-and-combine networks (at least in theory).
Post fabrication, adding together the sources of variation and given that frequency is related to and , a 5% initial tolerance of operating frequency between parts is expected.
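The frequency and impedance adjustment described in this subsection can be sketched numerically. The equations themselves are missing from this copy, so the sketch assumes the standard lossless-line forms: f = 1/(2·sqrt(L·C)) for the ring, with the factor of 2 reflecting the two laps per clock cycle, and Z = sqrt(L/C). The function names and the component values are hypothetical and chosen only for illustration.

```python
import math

# ASSUMED forms (the source equations are not legible in this copy):
#   f = 1 / (2 * sqrt(L_total * C_total))   two laps per clock cycle
#   Z = sqrt(L_total / C_total)             lossless-line impedance

def ring_frequency(l_total, c_total):
    """Oscillation frequency of a ring with total loop L and C."""
    return 1.0 / (2.0 * math.sqrt(l_total * c_total))

def line_impedance(l_total, c_total):
    """Characteristic impedance from total loop L and C."""
    return math.sqrt(l_total / c_total)

# Hypothetical illustration values, not taken from the paper.
L0 = 1.0e-9    # 1 nH total loop inductance
C0 = 20e-12    # 20 pF total loop capacitance

f0 = ring_frequency(L0, C0)
z0 = line_impedance(L0, C0)

# A 5% increase in capacitance lowers frequency by about 2.4%.
f_c = ring_frequency(L0, 1.05 * C0)

# Scaling L and C by the same factor k slows the ring by k
# while leaving the impedance unchanged.
k = 2.0
f_k = ring_frequency(k * L0, k * C0)
z_k = line_impedance(k * L0, k * C0)

print(f"f0 = {f0 / 1e9:.2f} GHz, Z0 = {z0:.2f} ohm")
print(f"frequency ratio after +5% C: {f_c / f0:.4f}")
print(f"after scaling L and C by {k}: f = {f_k / 1e9:.2f} GHz, Z = {z_k:.2f} ohm")
```

Because frequency depends on the square root of the L·C product, a given fractional change in capacitance produces about half that fractional change in frequency under this model.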
TABLE II
VARIATIONS WITH TEMPERATURE
TABLE III
VARIATIONS WITH DC SUPPLY VOLTAGE V
TABLE IV
INDUCED NOISE AS A FUNCTION OF VICTIM DISTANCE AND LENGTH
Fig. 14. Crossover traces, a visualization output from the Rotary Explorer tool.
Fig. 15. Example of notably strong coupled signal waveform.

trace and one end connected to ground. Note the more sensitive noise scale.
The absolute maximum coupling occurs if victim distance is allowed to go to zero. In this case, mutual coupling between aggressor and victim is 100% with no cancellation effects from the other differential trace. As a numerical example, it follows that a 2.5-V signal with a rise time of 20 ps on a transmission line with a velocity of has the 2.5-V gradient over 430 μm of length (Fig. 4 illustrates the concept). Over the 60-μm length discussed above, this equates to 348 mV. Slower edge rates, faster transmission lines, and lower supply voltages reduce this figure proportionally.
Long-range inductive noise coupling from the differential transmission line is expected to be small, since (from a distance) the ‘go’ and ‘return’ currents are equal and opposite.
Potential problems exist in short-range magnetic coupling to wiring in the vicinity of the clock lines. Inductance is lowered by coupling to any highly conductive structure in which eddy currents can flow to decrease and distort the inducing field. Couplings to less conductive circuits such as the substrate give a loss mechanism which can be modeled as a shunt term in the transmission-line equations. LC resonance in the small-scale coupled structures is unlikely because of the high resonant frequencies. All of the coupling mechanisms mentioned are edge-rate dependent, and this can limit the achievable rise and fall times of the RTWO by attenuating the high-frequency signal components.
Full RLC layout extraction is essential in the neighborhood of the clock lines if routing is allowed in these areas. An alternative proposal under investigation is to predefine a VLSI structure combining clock and power distribution into the same grid to give consistent characteristics and shielding.
Fig. 18. Clock frequency versus V for the large ring and I versus V
for the entire chip with all five rings.
[4] S. Tam, S. Rusu, U. N. Desai, R. Kim, J. Zhang, and I. Young, “Clock generation and distribution for the first IA-64 microprocessor,” IEEE J. Solid-State Circuits, vol. 35, pp. 1545–1552, Nov. 2000.
[5] C. J. Anderson et al., “Physical design of a fourth-generation power GHz microprocessor,” in ISSCC 2001 Dig. Tech. Papers, Feb. 2001, pp. 232–233.
[6] V. L. Chi, “Salphasic distribution of clock signals for synchronous systems,” IEEE Trans. Comput., vol. 43, pp. 597–602, May 1994.
[7] B. Kleveland et al., “Monolithic CMOS distributed amplifier and oscillator,” in ISSCC Dig. Tech. Papers, Feb. 1999, pp. 70–71.
[8] W. Athas, N. Tzartzanis, L. J. Svensson, L. Peterson, H. Li, X. Jiang, P. Wang, and W.-C. Liu, “AC-1: A clock-powered microprocessor,” in Proc. Int. Symp. Low-Power Electronics and Design, Aug. 1997, [Online] Available: http://www.isi.edu/acmos/people/nestoras/papers/97-08.MontereyAC1.ps.
[9] J. Wood. PCT/GB00/00175. MultiGig Ltd. [Online]. Available: http://www.delphion.com/cgi-bin/viewpat.cmd/WO00044093A1
[10] B. Kleveland, T. H. Lee, and S. S. Wong, “50-GHz interconnect design in standard silicon technology,” presented at the IEEE MTT-S Int. Microwave Symp., Baltimore, MD, June 1998, [Online] Available: http://smirc.stanford.edu/papers/mtts98p-bendik.pdf.
[11] B. Kleveland, X. Qi, L. Madden, R. W. Dutton, and S. S. Wong, “Line inductance extraction and modeling in a real chip with power grid,” presented at the IEEE IEDM Conf., Washington, D.C., Dec. 1999, [Online] Available: http://gloworm.stanford.edu/tcad/pubs/device/iedm.pdf.
[12] N. Delorme et al., “Inductance and capacitance analytic formulas for VLSI interconnect,” Electron. Lett., vol. 32, no. 11, May 23, 1996.
[13] C. S. Walker, Capacitance, Inductance and Crosstalk Analysis. Norwood, MA: Artech, 1990, p. 95.
[14] A. Deutsch et al., “Modeling and characterization of long on-chip interconnections for high-performance microprocessors,” IBM J. Res. Develop., vol. 39, no. 5, pp. 547–567, Sept. 1995, p. 549.
[15] J. B. Beyer et al., “MESFET distributed amplifier design guidelines,” IEEE Trans. Microwave Theory Tech., vol. MTT-32, pp. 268–275, Mar. 1984.
[16] Y. Tsividis, Operation and Modeling of the MOS Transistor, 2nd ed. New York: McGraw-Hill, 1999, pp. 339–340.
[17] H. Larsson, “Distributed synchronous clocking using connected ring oscillators,” Master’s thesis, Computer Systems Engineering Centre for Computer System Architecture, Halmstad Univ., Halmstad, Sweden, Jan. 1997. [Online] Available: http://www.hh.se/ide/ccaweb/publications/97/distclock/9705.ps.
[18] L. Hall, M. Clements, W. Liu, and G. Bilbro, “Clock distribution using cooperative ring oscillators,” in Proc. IEEE 17th Conf. Advanced Research in VLSI (ARVLSI’97), 1997, [Online] Available: http://www.computer.org/proceedings/arvlsi/7913/79130062abs.htm.

analysis was performed including these power traces was it apparent that induced current loops (circulating through the decoupling capacitors) were strongly attenuating the rotary signal. In this condition, the latching action (Fig. 7) does not fully develop and the rings support linear amplification of noise signals, hence the problematic multimode action. (This effect was much less severe on the large 965-MHz ring because the lines were much closer to the magnetically neutral center line of the transmission line.) The problem can be mitigated by use of braided transmission lines (as detailed in Section IV-C).
Analysis of the test chip showed that 90° coupling between M5 and the orthogonal thin M4 lines is not a significant problem, making it possible to route power and signals between regions bounded by the rotary clock structures.

VI. CONCLUSION AND FURTHER WORK PLANNED
This paper has described the rotary traveling-wave oscillator (RTWO) and its potential application to gigahertz-rate VLSI clocking. The oscillator is unique for a resonant-style LC-based oscillator in that it produces square waves directly and can be hardwired to form rotary oscillator arrays (ROAs). Being LC-based, the oscillator is stable and jitter is low.
The formulas presented here give practical adiabatic oscillator designs suitable for VLSI fabrication. The structure and operation of the RTWO are fundamentally simple and amenable to analysis. We find that agreement between simulation and measurement is good.
We need to demonstrate skew control (believed to be inherent) to fully establish that the simulated performance of multiring ROAs is realizable, and to measure susceptibility to induced high-frequency noise. Further work is planned to establish firm mathematical/analytical foundations for the prediction of both
jitter and skew and to determine exact stability criteria for ar- [19] T. C. Edwards and M. B. Steer, Foundations of Interconnect and Mi-
crostrip Design, Chichester, U.K.: Wiley, 2000, ch. 6. sec. 6.11.
rayed oscillators. Currently, a test chip using braided transmis- [20] C. P. Yue and S. S. Wong, “On-chip spiral inductors with patterned
sion line design to minimize coupling and incorporating varac- ground shields for Si-based RF ICs,” IEEE J. Solid-State Circuits, vol.
tors to control frequency is awaiting packaging and test. 33, pp. 743–752, May 1998.
[21] C. P. Yue and S. S. Wong, “A study on substrate effects of silicon-based
Looking to the future, our simulations predict that the oscil- RF passive components,” in MTT-S Int. Microwave Symp. Dig., June
lator scales well. On a more modern 0.18- m copper process, 1999, pp. 1625–1628.
10.5-GHz square-wave oscillator/distributors should be realiz- [22] M. E. Becker and T. F. Knight Jr. Transmission line clock driver. pre-
sented at 1999 IEEE Int. Conf. Computer Design. [Online]. Available:
able consuming less than 32 mA per ring using slimmer 10- m http://www.computer.org/proceedings/iccd/0406/04060489abs.htm
conductors. From simulation, the RTWO also appears to be vi- [23] P. Zarkesh-Ha and J. D. Meindl, “Asymptotically zero power dissipation
Gigahertz clock distribution networks,” IEEE Electrical Performance
able on SOI processes.
and Electronic Packaging, pp. 57–60, Oct. 1999.
[24] K. Bernstein, K. Carrig, C. M. Durham, and P. A. Hansen, High Speed
CMOS Design Styles. Norwood, MA: Kluwer, 1998, p. 22.
ACKNOWLEDGMENT [25] T. Soorapanth, C. P. Yue, D. Shaeffer, T. H. Lee, and S. S. Wong, “Anal-
ysis and optimization of accumulation-mode varactor for RF ICs,” pre-
The authors would like to thank P. Franzon and M. Steer, both sented at the Symp. VLSI Circuits, Honolulu, HI, June 11–13, 1998,
of North Carolina State University, for their assistance, and the [Online] Available: http://smirc.stanford.edu/papers/VLSI98p-chet.pdf.
Raunds and British public library service. [26] H. B. Bakoglu, J. T. Walker, and J. D. Meindl, “A symmetric clock-
distribution tree and optimized high speed interconnections for reduced
clock skew in ULSI and WSI circuits,” in IEEE Int. Conf. Computer
REFERENCES Design, Oct. 1986, pp. 118–122.
[27] M. Bußmann and U. Langmann, “Active compensation of interconnect
[1] E. G. Friedman, High Performance Clock Distribution Net- losses for multi-GHz clock distribution networks,” IEEE Trans. Circuits
works. Boston, MA: Kluwer, 1997. and Syst. II, vol. 39, pp. 790–798, Nov. 1992.
[2] D. Harris, Skew Tolerant Circuit Design. San Mateo, CA: Morgan [28] M. C. Papaefthymiou and K. H. Randall, “Edge-triggering vs.
Kaufmann, 2000. two-phase level-clocking,” presented at the 1993 Symp. Re-
[3] G. A. Pratt and J. Nguyen, “Distributed synchronous clocking,” IEEE search on Integrated Systems, Mar. 1993, [Online] Available:
Trans. Parallel Distributed Syst., vol. 6, pp. 314–328, Mar. 1995. http://www.eecs.umich.edu/~marios/papers/sis93.ps.
WOOD et al.: ROTARY TRAVELING-WAVE OSCILLATOR ARRAYS 1665
[29] L. Benni et al., “Clock skew optimization for peak current reduction,” J. VLSI Signal Processing, vol. 16, pp. 117–130, 1997.
[30] International Semiconductor Roadmap for Semiconductors (1999). [Online]. Available: http://public.itrs.net/files/1999_SIA_Roadmap/Design.pdf
[31] I. S. Kourtev and E. G. Friedman, Timing Optimization Through Clock Skew Scheduling. Boston, MA: Kluwer, 2000.
[32] MultiGig, Ltd. Rotary Explorer. [Online]. Available: http://www.multigig.com/software.htm
[33] M. Kamon, M. J. Tsuk, and J. K. White, “FASTHENRY: A multipole-accelerated 3-D inductance extraction program,” IEEE Trans. Microwave Theory Tech., vol. 42, no. 9, pp. 1750–1758, Sept. 1994.
[34] BSIM Research Group. (2000–2001) The BSIM4 Short-Channel Transistor Model. Univ. of California at Berkeley. [Online]. Available: http://www-device.eecs.berkeley.edu/~bsim3/bsim4.html

Terence C. Edwards (M’89) received the M.Phil. degree in microwaves.
He is the Executive Director of Engalco, a consultancy firm based in the U.K., mainly specializing in signal transmission technologies and the global RF and microwave industry. He researches and takes responsibility for regular releases of Microwaves North America, published 1995, 1998, and 2001. He has authored several publications (including papers published in the IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES), has led management seminars on fiber optics, presented a paper on mobile technologies at the IMAPS Microelectronics Symposium, Philadelphia, PA, October 1997, and has written several articles and books. These include (jointly with Prof. Michael Steer) one recently on MICs (New York: Wiley) and on gigahertz and terahertz technologies (Norwood, MA: Artech, 2000). He is on the editorial advisory board for the International Journal of Communication Systems. He regularly consults for both national and overseas companies and is on the prestigious IEE (London) President’s List of Consultants.
Mr. Edwards is a Fellow of the Institution of Electrical Engineers (IEE), U.K.
Abstract: The implementation of a dual-modulus prescaler (divide by 128/129) using an extension of the true-single-phase-clock (TSPC) technique, the extended TSPC (E-TSPC), is presented. The E-TSPC [1], [2] consists of a set of composition rules for single-phase-clock circuits employing static, dynamic, latch, data-precharged, and NMOS-like CMOS blocks. The composition rules, as well as the CMOS blocks, are described and discussed. The experimental results of the complete dual-modulus prescaler, implemented in a 0.8 μm CMOS process, show a maximum 1.59 GHz operation rate at 5 V with 12.8 mW power consumption. They are compared with the results from other recent implementations showing that the proposed E-TSPC circuit can reach high speed with both smaller area and lower power consumption.
Index Terms: CMOS digital, high-speed circuits, prescalers, single-phase-clock design.

I. INTRODUCTION
effective channel length) is presented. The purpose of the prescaler implementation is to evaluate the potential of the E-TSPC technique.
This paper is organized as follows. In Section II, the principal features of the E-TSPC technique, blocks and design rules, are presented. In Section III, some different dual-modulus implementations are analyzed. Experimental results and comparisons are reported in Section IV, and the principal conclusions are drawn in Section V.

II. E-TSPC CIRCUIT BLOCKS AND COMPOSITION RULES
A. Basic CMOS Blocks
An E-TSPC circuit should use any of the blocks: CMOS static block, n-dynamic block [Fig. 1(a)], p-dynamic block
Fig. 2. Transformation from (a) a static block into data-precharged blocks: (b) PH blocks and (c) PL blocks.
3) going through static, n-dynamic, n-Dp, or n-latch blocks;
4) regardless of the number and ordering of the blocks defined above;
5) finishing in a circuit external output, or in the input of the first p-latch, p-dynamic, or p-Dp block.
For the p-data chains, an equivalent definition applies, replacing n with p and vice versa.
When clock is high, n-data chains are in evaluation phase; otherwise, they are in holding phase. P-data chains evaluate when clock is low.
IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 34, NO. 1, JANUARY 1999 99
Fig. 3. Example of n-data chains. The blocks mentioned in the text are named and hatched in the figure.
TABLE I
CONDITIONS FOR CORRECT OPERATION OF THE NMOS-LIKE BLOCKS
Fig. 6. Transistor schematic of the divide-by-4/5 counter DG4. The transistor width or, when the length is different from 0.8 μm, the transistor width/length, in μm, is also indicated in the figure.
TABLE II
MAXIMUM SPEED AND POWER-CONSUMPTION RESULTS FOR THE FOUR DESIGNED DIVIDE-BY-4/5 COUNTERS (SPICE SIMULATIONS, SLOW PARAMETERS, AND VDD = 5 V)
Design    Speed (GHz)    Power (μW/MHz)
DG1       0.98           3.27
DG2       1.28           4.45
DG3       1.39           4.85
DG4       1.67           5.62
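The power column of Table II is normalized to clock frequency, so the absolute power at each counter's maximum speed follows by multiplication. The sketch below does that arithmetic; it assumes the power unit is μW/MHz (the prefix is not legible in this copy), and the names `TABLE_II` and `power_at_max_speed_mw` are illustrative only.

```python
# Table II values: design -> (maximum speed in GHz, power figure in uW/MHz).
# ASSUMPTION: the unit is uW/MHz; the prefix is not legible in the source table.
TABLE_II = {
    "DG1": (0.98, 3.27),
    "DG2": (1.28, 4.45),
    "DG3": (1.39, 4.85),
    "DG4": (1.67, 5.62),
}


def power_at_max_speed_mw(speed_ghz, power_uw_per_mhz):
    """Return the power in mW when the counter runs at its maximum speed."""
    speed_mhz = speed_ghz * 1e3
    power_uw = power_uw_per_mhz * speed_mhz  # uW/MHz * MHz = uW
    return power_uw / 1e3                    # uW -> mW


if __name__ == "__main__":
    for name, (speed, power) in TABLE_II.items():
        print(f"{name}: {power_at_max_speed_mw(speed, power):.2f} mW at {speed:.2f} GHz")
```

Under that unit assumption the fastest counter, DG4, would dissipate roughly 9.4 mW at 1.67 GHz, and DG1 roughly 3.2 mW at 0.98 GHz.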
TABLE III
AREA, SPEED, AND POWER-CONSUMPTION
RESULTS FOR FOUR DIFFERENT PRESCALERS
tector (PFD) is identified as the main source of spectral pollution in ΔΣ fractional-N synthesizers. The design of the zero-dead zone PFD and the dual charge pump is optimized toward linearity and spurious suppression. The frequency synthesizer consumes 35 mA from a single 2-V power supply. The measured phase noise is as low as -120 dBc/Hz at 600 kHz and -139 dBc/Hz at 3 MHz. The measured fractional spur level is less than -100 dBc, even for fractional frequencies close to integer multiples of the reference frequency, thereby satisfying the DCS-1800 spectral purity constraints.
Index Terms: CMOS RF integrated circuits, ΔΣ modulator, fractional-N frequency synthesis, phase-locked loop, phase noise.

Fig. 1. Principle of ΔΣ fractional-N synthesis.

I. INTRODUCTION
digital noise coupling, the ΔΣ modulator is scheduled for integration on the digital baseband signal processing IC of the full transceiver system.
The paper describes the design of a monolithic 1.8-GHz ΔΣ-controlled fractional-N PLL frequency synthesizer. In Section II, the influence of ΔΣ noise on PLL bandwidth requirements is theoretically analyzed for multistage noise shaping (MASH) and multibit single-loop ΔΣ modulators.
Fig. 2. Third-order multibit single-loop ΔΣ modulator. The internal modulator accuracy is 16 bit. From the five output bits, only four are used for stability reasons.

Fig. 3. Maximum PLL bandwidth f versus the reference frequency and different ΔΣ modulator orders, for the type-II fourth-order PLL. The dashed curve is for the third-order single-loop modulator. The targeted phase-noise specification is -136 dBc/Hz at 3 MHz for DCS-1800.

B. The ΔΣ Modulators
The influence of both third-order MASH and multibit single-loop ΔΣ modulators on the spectral purity of the fractional-N synthesizer is investigated. Since the order of the integrated PLL loop filter is three, the order of the ΔΣ modulators must also be three or higher to ensure that ΔΣ noise has at least a 20-dB/dec rolloff at intermediate offset frequencies, causing no degradation of the output phase noise. Both modulators have an internal accuracy of 16 bit and 1 LSB dithering is applied to further randomize any spurious energy. The dithering sequence is third-order noise shaped to avoid an increased noise floor.
The MASH or cascade 1-1-1 modulator is chosen be-
Fig. 5. Simulation results. The phase error for (a) the MASH modulator and (b) the single-loop multibit modulator. The FFT of the current pulses CP [i] for
(c) the MASH modulator and (d) the single-loop multibit modulator.
is 26 MHz and the fractional division number is 67.92. The output frequency is 1.76592 GHz, i.e., 2.08 MHz offset from an integer multiple of the reference frequency. In Fig. 5(a) and (b), the time-domain phase error is plotted for both modulators. Note that the fractional-N PLL frequency synthesizer can hardly be called a phase-locked loop, since the loop is never in lock! Due to the shaping of the HF noise in the single-loop modulator, the instantaneous phase error is smaller than for a MASH modulator. This has two important consequences. First, the on-time of the charge pumps is smaller for the single-loop modulator, making it less sensitive to noise coupling from the substrate and the power supply. Second, the sensitivity to the nonlinear conversion in terms of noise leakage is reduced.
To be able to examine the effect of nonlinearities in the frequency domain, the FFTs of the charge-pump current pulses are plotted in Fig. 5(c) and (d). A noise floor appears in the output spectrum as well as spurious tones, although the output is perfectly randomized and dithered. Due to the nonlinear mixing in the PFD charge pump, noise at folds back to lower offset frequencies, similar to the effect of a nonlinear DAC in a multibit ADC. Since the noise at is much lower for the single-loop modulator, its noise leakage due to the nonlinear mixing in the PFD is also lower. In the time domain, this effect corresponds to the smaller phase excursions. The difference in phase error between MASH and single-loop modulators is reflected in a lower noise floor, i.e., a 10-dB difference. In addition, previously unnoticed spurious tones appear in the output spectrum at with .
Fig. 6 shows the ΔΣ noise of both modulators as it appears at the PLL output for an ideal (dotted) and a nonlinear conversion (solid). The results of the ideal case closely match the theoretical results of Section II-C (solid light gray). Due to nonlinearity, the simulated output spectrum of the integer-N PLL (the dash-dotted line) is seriously deteriorated by ΔΣ noise in the PLL noise bandwidth, increasing the . Especially, the MASH converter is critical in terms of in-band noise due to the higher phase error [see Fig. 5(a)], despite the inherently lower LF noise of the MASH modulator. Note that the simulations are performed without taking into account noise coupling through the substrate or power-supply lines. As a consequence, the actual spurious performance of the fractional-N PLL could be worse than simulated. The presented simulation results are for a division modulus 67.92, close to an integer multiple of the reference frequency. When analyzing division moduli in between integer multiples of the reference frequency, noise leakage is still observed, but the spurious tones are well below the phase noise.
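The numbers quoted for the simulated operating point can be checked with a few lines of exact arithmetic. The sketch below assumes that the 26-MHz figure is the PLL reference frequency (the subject of that sentence is cut off in this copy); the variable names are illustrative.

```python
from fractions import Fraction

# ASSUMPTION: 26 MHz is the reference frequency of the synthesizer.
f_ref_mhz = Fraction(26)
division = Fraction("67.92")          # fractional division number from the text

f_out_mhz = f_ref_mhz * division      # synthesized output frequency
nearest_integer = round(division)     # closest integer division ratio (68)
offset_mhz = abs(nearest_integer * f_ref_mhz - f_out_mhz)

if __name__ == "__main__":
    print(f"output frequency : {float(f_out_mhz):.2f} MHz")
    print(f"nearest multiple : {nearest_integer} x {float(f_ref_mhz):.0f} MHz")
    print(f"offset           : {float(offset_mhz):.2f} MHz")
```

The result reproduces the 1.76592-GHz output frequency and the 2.08-MHz offset from the 68th multiple of the reference, which is where the text reports the small fractional spurs.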
DE MUER AND STEYAERT: CMOS MONOLITHIC FREQUENCY SYNTHESIZER FOR DCS-1800 839
Fig. 6. Simulation results. The ΔΣ noise at the output of the PLL for (a) the MASH modulator and (b) the single-loop multibit modulator. The results are plotted for an ideal PFD (dotted), which closely corresponds to the theoretical results (solid light gray) and for a nonlinear PFD (solid). They are compared to the simulated integer PLL phase noise (the dash-dotted line).

Fig. 7. Discrete time autocorrelation estimate of the modulator outputs for (a) the MASH modulator and (b) the single-loop multibit modulator.

The explanation for the re-emerging of spurious tones is that the modulator is unable to sufficiently decorrelate the successive output samples. To quantify the correlation in the modulator output, the discrete time autocorrelation estimate is calculated and plotted for both modulators for inputs close to an integer value (see Fig. 7). The autocorrelation calculations show correlation, although 1-LSB noise-shaped dithering is applied. The autocorrelation of the single-loop ΔΣ modulator shows large correlation peaks, explaining the higher spurious tones in the output phase-noise spectrum of the PLL. With the autocorrelation estimate, the necessary internal accuracy of the ΔΣ modulators is found to be at least 13 bits for MASH and 16 bit for single-loop modulators to sufficiently decorrelate the modulator output for inputs close to integers. A second possible source of tones is the downconversion of tones which are inherently present around [5], by the nonlinear mixing in the PFD. This effect can be worsened by substrate and power-supply coupling with signals at .

IV. PLL BUILDING-BLOCK CIRCUIT DESIGN
A. The Fourth-Order Type-II PLL
A fourth-order type-II PLL is integrated, including a 4-bit prescaler, a zero-dead-zone PFD, a dual charge pump, and a 3-step equalizer, together with an on-chip LC-tank VCO and a third-order dual-path 35-kHz low-pass loop filter (see Fig. 8). The equalizer performs a 3-step piecewise equalization of the loop gain, by keeping the product of the VCO gain and the charge-pump current constant. To prevent switching between different equalization states, the state transitions exhibit hysteresis.

B. The 4-Bit Prescaler
The first high-speed division of the prescaler is done with two differential single-transistor-clocked (DSTC) logic n-latches [10], forming a differential dynamic D-flip-flop. The flip-flop operates with rail-to-rail internal signals to minimize the residual prescaler phase noise [11] to levels insignificant to
840 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 37, NO. 7, JULY 2002
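The autocorrelation estimate used on the previous page to quantify residual correlation in the modulator output can be illustrated with a small behavioral model. The sketch below is not the authors' simulator: it assumes a textbook digital MASH 1-1-1 built from three overflowing accumulators, applies no dithering, and uses a biased, normalized autocorrelation estimator. The function names `mash111` and `autocorrelation` are hypothetical.

```python
def mash111(k, bits=16, n=4096):
    """Behavioral MASH 1-1-1 (assumed textbook form) for a constant input k / 2**bits."""
    mask = (1 << bits) - 1
    acc = [0, 0, 0]            # residues of the three accumulators
    c2_d1 = 0                  # c2[n-1]
    c3_d1 = c3_d2 = 0          # c3[n-1], c3[n-2]
    out = []
    for _ in range(n):
        s = acc[0] + k
        c1, acc[0] = s >> bits, s & mask
        s = acc[1] + acc[0]
        c2, acc[1] = s >> bits, s & mask
        s = acc[2] + acc[1]
        c3, acc[2] = s >> bits, s & mask
        # Noise-cancellation logic: y = c1 + (1 - z^-1) c2 + (1 - z^-1)^2 c3
        out.append(c1 + (c2 - c2_d1) + (c3 - 2 * c3_d1 + c3_d2))
        c2_d1 = c2
        c3_d2, c3_d1 = c3_d1, c3
    return out


def autocorrelation(x, max_lag):
    """Biased autocorrelation estimate of x, normalized so that lag 0 equals 1."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    r0 = sum(v * v for v in d)
    return [sum(d[i] * d[i + lag] for i in range(n - lag)) / r0
            for lag in range(max_lag + 1)]


if __name__ == "__main__":
    k = round(0.92 * 2**16)    # fractional part of the 67.92 division number
    y = mash111(k)
    for lag, r in enumerate(autocorrelation(y, 8)):
        print(f"lag {lag}: {r:+.3f}")
```

Shrinking `bits` in this model shortens the sequence period of the accumulators, which is one way to explore why a minimum internal accuracy is needed to decorrelate the output for inputs close to integers.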
Fig. 9. (a) Timing control circuit and signals to control the dummy and the output current branch of the charge pump. (b) Charge-pump circuit with (at the left)
the dummy current branch, denoted by the suffix d, and the output branch.
seen in Section II-C, the loop bandwidth needs to be smaller than 62 kHz for ΔΣ noise suppression. However, to ensure sufficient suppression of the low-frequency fractional spurious tones for inputs close to the integers, the bandwidth is designed to 35 kHz. Despite the rather low loop bandwidth for a fractional-N synthesizer, a settling time of less than 293 μs for a 104-MHz step is simulated.

E. The Conversion
The nonlinear analysis of Section III identified nonlinearity of the conversion as the main cause of ΔΣ noise leakage and spurious tones. Therefore, the PFD and charge-pump circuits are carefully optimized toward spurious suppression as such and toward a highly linear phase-error detection for ΔΣ spurious suppression.
First, the reference spur generation by the PFD charge-pump circuit is carefully minimized. The integration in the first path of the loop filter is done actively to keep the charge-pump output at a fixed level (see Fig. 8). Additionally, the charge-pump current is designed to be at least an order of magnitude larger than the fixed parasitic charge injection of the switch transistors. The current switches are implemented with pMOS and nMOS transistors to compensate charge injection. Finally, a timing control scheme [Fig. 9(a)] is developed to control the charge-pump switches. The up and down control pulses of the PFD are converted to synchronized control signals to drive both the output current branch and the dummy current branch of the charge pump [Fig. 9(b)]. Fig. 9(a) shows the dummy and output control signals. The dummy control is delayed versus the output control by modifying the thresholds of the second inverter-string (indicated by high and low) such that the current always flows, preventing hard on/off switching of the current sources. To equalize rise and fall times and force a perfect π rad relation between nMOS and pMOS control signals, latches at the outputs of both inverter strings are implemented. Capacitors at the control outputs lower the rise and fall times to prevent large charge injections by fast switching.
TABLE II
SUMMARY OF MEASURED SPECIFICATIONS COMPARED TO THE
DCS-1800 SPECIFICATIONS
single-loop multibit modulator is presented in Figs. 12 and 13. Small spurs are present at 2.08 MHz as predicted by the simulations in Fig. 6. The spur level is well below -100 dBc, due to careful PFD charge-pump design. The phase noise at 600 kHz is lower than -120 dBc/Hz.
In Fig. 12, the measured phase noise of the PLL with a multibit single-loop modulator (dark) is compared to the phase noise at integer division (light). Noise at lower offsets originates from the ΔΣ modulator due to noise folding in the PFD, as predicted by the simulations. As a result, the rms phase error is increased from 1.7° to 3°. Note that the phase noise of the PLL at integer divisions is as low as -124 dBc/Hz at 600 kHz, which is only 0.3 dB higher than predicted by the PLL simulations (see Table I). The measured results for fractional division are much noisier than predicted by simulation. The phase noise at offset frequencies close to 10 kHz is increased due to the limited memory of the data generator. The noise at higher offset frequencies is corrupted by noise coupling from the data generator. As can be seen in Fig. 10, the ΔΣ-control bonding wires, which conduct rail-to-rail, very noisy control pulses are close to the LC tank and the bonding wires of the VCO power supply. Without proper shielding, the VCO phase noise is seriously degraded by this noise coupling.
In Fig. 13, the measured ΔΣ noise and the ΔΣ noise as simulated in Section III (dashed) is compared. The dash-dotted line is the simulated phase noise of the PLL without ΔΣ control. The simulated ΔΣ noise leakage closely matches the measured results, except at very low offsets due to the limited memory. The phase noise at high offsets is increased versus the simulated PLL results due to noise coupling. Second-order tones are larger in measurements, since the models in the simulator do not include second-order effects and noise coupling. Tones at 520 kHz are believed to come from subharmonic tones present in the ΔΣ modulator output [5], which are amplified by mixing through noise coupling. When comparing the results for the MASH and the single-loop modulator, the measured results are less pronounced than the simulated results (see Fig. 6). The measured phase noise for the single-loop modulator is however a few decibels lower than for the MASH modulator. Note that all measurements are performed for frequencies close to integer multiples of the reference frequency.

Fig. 13. Phase noise measurement with the MASH converter at 1.76592 GHz compared to the simulated ΔΣ noise at the output of the PLL (dashed), and with the simulated PLL output without ΔΣ control (dash-dotted).

The measured settling time of the PLL is 226 μs for a 104-MHz frequency step. The power consumption of the PLL is 70 mW from a 2-V power supply. The fully integrated low-phase-noise VCO is responsible for almost 66% of the total power consumption. The IC area is 2 × 2 mm², including bonding pads and bypass capacitors. Table II shows the measured specifications compared to the DCS-1800 specifications [1]. The specifications of the IC prototype comply with the DCS-1800, only the is degraded due to the limited resolution of the measurement setup.

VI. CONCLUSION
A monolithic 1.8-GHz ΔΣ-controlled fractional-N PLL frequency synthesizer is implemented in a standard 0.25-μm CMOS technology. The monolithic fourth-order type-II PLL
integrates the digital synthesizer part together with a fully integrated LC VCO, a high-speed prescaler, and a 35-kHz dual-path loop filter on a die of only 2 × 2 mm². To investigate the influence of the ΔΣ modulator on the synthesizer’s spectral purity, a fast nonlinear analysis method is developed, showing good correspondence with measurements, in contrast to the results of the theoretical analysis. Nonlinear mixing in the phase-frequency detector and the VCO is identified as the main source of spectral pollution in fractional-N synthesizers. MASH and single-loop multibit ΔΣ modulators are compared for use in fractional-N synthesis. Although the MASH is stable and easy to integrate, the single-loop modulator presents a better solution, showing less sensitivity to noise leakage and noise coupling and providing more flexibility. The measured phase noise is lower than -120 dBc/Hz at 600 kHz and -139 dBc/Hz at 3 MHz. The measured fractional spur level is lower than -100 dBc, satisfying the DCS-1800 spectral purity requirements. All measurements are performed for frequencies close to integer multiples of the reference frequency, where the synthesizer is most sensitive to spurious tones.

REFERENCES
[1] M. S. J. Steyaert, J. Janssens, B. De Muer, M. Borremans, and N. Itoh, “A 2-V CMOS cellular transceiver front-end,” IEEE J. Solid-State Circuits, vol. 35, pp. 1895–1907, Dec. 2000.
[2] T. Cho, E. Dukatz, M. Mack, D. Macnally, M. Marringa, S. Mehta, C. Nilson, L. Plouvier, and S. Rabii, “A single-chip CMOS direct-conversion transceiver for 900-MHz spread-spectrum digital cordless phones,” in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, San Francisco, CA, Feb. 1999, pp. 228–229.
[3] A. Rofougaran, G. Chang, J. J. Rael, J. Y.-C. Chang, M. Rofougaran, P. J. Chang, M. Djafari, J. Min, E. W. Roth, A. A. Abidi, and H. Samueli, “A single-chip 900-MHz spread-spectrum wireless transceiver in 1-μm CMOS - Part II: Receiver design,” IEEE J. Solid-State Circuits, vol. 33, pp. 547–555, Apr. 1998.
[4] M. Copeland, T. Riley, and T. Kwasniewski, “Delta–sigma modulation in fractional-N frequency synthesis,” IEEE J. Solid-State Circuits, vol. 28, pp. 553–559, May 1993.
[5] S. R. Norsworthy, R. Schreier, and G. C. Temes, Delta–Sigma Data Converters: Theory, Design and Simulation. New York: IEEE Press, 1997.
[6] B. Miller and R. Conley, “A multiple modulator fractional divider,” IEEE Trans. Instrum. Meas., vol. 40, pp. 578–583, June 1991.
[7] “Digital cellular communication system (Phase 2+); Radio transmission and reception,” Eur. Telecommun. Standards Inst., ETSI 300 190 (GSM 05.05 version 5.4.1), 1997.
[8] W. Rhee, B.-S. Song, and A. Ali, “A 1.1-GHz CMOS fractional-N frequency synthesizer with a 3-b third-order ΔΣ modulator,” IEEE J. Solid-State Circuits, vol. 35, pp. 1453–1460, Oct. 2000.
[9] The Mathworks Inc., Matlab User’s Guide, Version 5. Englewood Cliffs, NJ: Prentice Hall, 1997.
[10] J. Yuan and C. Svensson, “New single-clock CMOS latches and flip-flops with improved speed and power savings,” IEEE J. Solid-State Circuits, vol. 32, pp. 62–69, Jan. 1997.
[11] B. De Muer and M. S. J. Steyaert, “A single-ended 1.5-GHz 8/9 dual-modulus prescaler in 0.7-μm CMOS with low phase-noise and high input sensitivity,” in Proc. Eur. Solid-State Circuits Conf. (ESSCIRC), The Hague, Sept. 1998, pp. 256–259.
[12] J. Craninckx and M. S. J. Steyaert, “Low-phase-noise fully integrated CMOS frequency synthesizers,” Ph.D. dissertation, Katholieke Univ. Leuven, Belgium, 1997.
[13] B. De Muer, M. Borremans, N. Itoh, and M. S. J. Steyaert, “A 1.8-GHz highly tunable low-phase-noise CMOS VCO,” in Proc. IEEE Custom Integrated Circuits Conf. (CICC), Orlando, FL, May 2000, pp. 585–588.
[14] B. De Muer and M. S. J. Steyaert, “Fully integrated CMOS frequency synthesizers for wireless communications,” in Analog Circuit Design, W. Sansen, J. H. Huijsing, and R. J. van de Plassche, Eds. Norwell, MA: Kluwer, 2000, pp. 287–323.
[15] F. M. Gardner, Phaselock Techniques. New York: Wiley, 1979.

Bram De Muer (S’00) was born in Sint-Amandsberg, Belgium, in 1973. He received the M.Sc. degree in electrical engineering in 1996 from the Katholieke Universiteit Leuven, Belgium, where he is currently working toward the Ph.D. degree on high frequency low-noise integrated frequency synthesizers at the ESAT-MICAS laboratories.
He has been a Research Assistant with ESAT-MICAS laboratories since 1996. His research is focused on integrated low-phase-noise VCOs with on-chip planar inductors and high-speed prescaler design, leading to fully integrated ΔΣ fractional-N synthesizers in CMOS technology.

Michel S. J. Steyaert (S’85–A’89–SM’92) was born in Aalst, Belgium, in 1959. He received the M.S. degree in electrical-mechanical engineering and the Ph.D. degree in electronics from the Katholieke Universiteit Leuven (K.U. Leuven), Heverlee, Belgium, in 1983 and 1987, respectively.
From 1983 to 1986, he obtained an IWONL fellowship (Belgian National Foundation for Industrial Research) which allowed him to work as a Research Assistant at the Laboratory ESAT at K.U. Leuven. In 1987, he was responsible for several industrial projects in the field of analog micropower circuits at the Laboratory ESAT as an IWONL Project Researcher. In 1988, he was a Visiting Assistant Professor at the University of California, Los Angeles. In 1989, he was appointed by the National Fund of Scientific Research (Belgium) as a Research Associate, in 1992 as a Senior Research Associate, and in 1996 as a Research Director at the Laboratory ESAT, K.U. Leuven. Between 1989 and 1996, he was also a part-time Associate Professor and since 1997 an Associate Professor at the K.U. Leuven. His current research interests are in high-performance and high-frequency analog integrated circuits for telecommunication systems and analog signal processing.
Dr. Steyaert received the 1990 European Solid-State Circuits Conference Best Paper Award, the 1995 and 1997 ISSCC Evening Session Award, the 1999 IEEE Circuit and Systems Society Guillemin–Cauer Award, and the 1991 NFWO Alcatel-Bell-Telephone award for innovative work in integrated circuits for telecommunications.
IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 35, NO. 7, JULY 2000 1039
Fig. 2. Programmable prescaler. (a) Basic architecture. (b) With extended division range.
B. Programmable Prescaler Architectures

The “basic” programmable prescaler architecture is depicted in Fig. 2(a). The modular structure consists of a chain of 2/3 divider cells connected like a ripple counter [11]. The structure of Fig. 2(a) is characterized by the absence of long delay loops, as feedback lines are only present between adjacent cells. This “local feedback” enables simple optimization of power dissipation. Another advantage is that the topology of the different cells in the prescaler is the same, which facilitates layout work. The architecture of Fig. 2(a) resembles the one presented in [12], which is also based on 2/3 divider cells. Yet there are two fundamental differences. First, in [12] all cells operate at the same (high) current level. Second, the architecture of [12] relies on a common strobe signal shared by all cells. This leads to high power dissipation, because of high requirements on the slope of the strobe signal, in combination with the high load presented by all cells in parallel.

The programmable prescaler operates as follows. Once in a division period, the last cell on the chain generates the mod signal. This signal then propagates “up” the chain, being reclocked by each cell along the way. An active mod signal enables a cell to divide by 3 (once in a division cycle), provided that its programming input is set to 1. Division by 3 adds one extra period of each cell’s input signal to the period of the output signal. Hence, a chain of 2/3 cells provides an output signal with a period of [...]

[...] 1) can be realized. The division range is thus rather limited, amounting to roughly a factor of two between maximum and minimum division ratios.¹ The division range can be extended by combining the prescaler with a set-reset counter [13]. In that case, however, the resulting architecture is no longer modular.

The divider implementation presented in Fig. 2(b) extends the division range of the basic prescaler, whilst maintaining the modularity of the basic architecture [14]. The operation of the new architecture is based on the direct relation between the performed division ratio and the bus-programmed division word [...]. Let us introduce the concept of effective length of the chain. It is the number of divider cells that are effectively influencing the division cycle. Deliberately setting the mod input of a certain 2/3 cell to the active level overrules the influence of all cells to the right of that cell. The divider chain behaves as if it has been shortened. The required effective length corresponds to the index of the most significant (and active) bit of the programmed division word. Only a few extra OR gates are required to adapt the effective length to the programmed division word, as depicted on the right side of Fig. 2.

With the additional logic the division range becomes:
• minimum division ratio: [...];
• maximum division ratio: [...].
We see that the minimum and maximum division ratios can be set independently, by choice of [...] and [...] respectively. Subsequent changes in an optimized design can be realized with low risk. A somewhat similar technique, applied to an asynchronous programmable counter, is described in [9].
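The division range discussed above can be illustrated with a small numerical sketch. It assumes, since the period equation itself is not legible here, that a chain of n cells with programming bits p_0 ... p_(n-1) divides by 2^n plus the binary value of those bits, which is what "division by 3 adds one extra period of each cell's input signal" implies. The function names `division_ratio`, `effective_length`, and `extended_ratio` are hypothetical and not taken from the paper.

```python
def division_ratio(p_bits):
    """Division ratio of a chain of 2/3 cells (assumed binary weighting).

    p_bits[i] is the programming input of cell i, with cell 0 the first
    (fastest) cell. Each cell contributes a factor of 2, and a cell with
    p = 1 adds one extra period of its own input once per division cycle.
    """
    n = len(p_bits)
    return 2 ** n + sum(p << i for i, p in enumerate(p_bits))


def effective_length(word):
    """Index of the most significant active bit of the division word."""
    if word < 2:
        raise ValueError("division word must be at least 2")
    return word.bit_length() - 1


def extended_ratio(word):
    """Ratio realized when the chain is shortened to its effective length.

    Assumption: the bits below the most significant active bit act as the
    programming inputs of the cells that remain in the shortened chain.
    """
    n_eff = effective_length(word)
    p_bits = [(word >> i) & 1 for i in range(n_eff)]
    return division_ratio(p_bits)


if __name__ == "__main__":
    n = 3
    ratios = sorted(
        division_ratio([(k >> i) & 1 for i in range(n)]) for k in range(2 ** n)
    )
    print("basic 3-cell chain:", ratios)
    print("extended chain, word 511:", extended_ratio(511))
```

Under these assumptions a fixed-length chain only covers ratios from 2^n to 2^(n+1) - 1, roughly a factor of two, while shortening the chain to its effective length makes the realized ratio equal to the programmed word.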
Fig. 3. Family of truly modular programmable dividers, and corresponding division range of the different implementations.
Fig. 4. Functional blocks and logical implementation of a 2/3 divider cell.
Fig. 5. SCL implementation of an AND gate combined with a latch function.
Three circuits were implemented: an 18-bit L-band divider, a 17-bit UHF divider, and a 12-bit reference divider. The architecture and the division range of the dividers are presented in Fig. 3. The L-band divider was used as the basis for the UHF and for the reference divider. The UHF divider consists of the same circuitry as the L-band divider, except for the first 2/3 cell, which was removed. The reference divider is simply the L-band divider stripped of its six high-frequency cells.

B. Logic Implementation of the 2/3 Divider Cells

A 2/3 divider cell comprises two functional blocks, as depicted in Fig. 4. The prescaler logic block divides, upon control by the end-of-cycle logic, the frequency of the input signal either by 2 or by 3, and outputs the divided clock signal to the next cell in the chain. The end-of-cycle logic determines the instantaneous division ratio of the cell, based on the state of the mod and programming inputs. The mod input becomes active once in a division cycle. At that moment, the state of the programming input is checked, and if it is 1, the end-of-cycle logic forces the prescaler to swallow one extra period of the input signal. In other words, the cell divides by 3. If the programming input is 0, the cell stays in division-by-2 mode. Regardless of the state of the programming input, the end-of-cycle logic reclocks the mod signal and outputs it to the preceding cell in the chain as its mod output.

C. Circuit Implementation of the 2/3 Divider Cells

The use of standard rail-to-rail CMOS logic techniques makes the integration of digital functions with sensitive RF signal processing blocks difficult, due to the generation of large supply and substrate disturbances during logic transitions. Source coupled logic (SCL), often referred to as MOS current mode logic (MCML), has better EMC properties, because of the constant supply current and differential voltage switching operation [8]. Besides, SCL has lower power dissipation than rail-to-rail logic for (very) high input frequencies [15].

The logic functions of the 2/3 cells are implemented with the SCL structure presented in Fig. 5. The logic tree combines an AND gate with a latch function. Three AND_latch circuits are used to implement Dlatch1, Dlatch3, Dlatch4 and the AND gates of the 2/3 cells (see Fig. 4). Therefore, six logic functions are [...]

TABLE I
SCALING OF CURRENTS IN THE 2/3 DIVIDER CELLS
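The end-of-cycle behavior of the 2/3 cell can be sketched at a purely behavioral level. The model below is an assumption-laden illustration, not the latch-level logic of Fig. 4: each cell normally divides by 2 and divides by 3 exactly once per division cycle when its mod input is active and its programming input is 1. The names `cell_division` and `input_periods_per_cycle` are hypothetical.

```python
def cell_division(mod_in, p):
    """Division ratio of one 2/3 cell for the current output period."""
    return 3 if (mod_in and p) else 2


def input_periods_per_cycle(p_bits):
    """Count input periods of the first cell per period of the last cell.

    p_bits[i] is the programming input of cell i (cell 0 is the first cell).
    The chain is walked from the last cell back to the first one.
    """
    periods = 1  # one output period of the last cell per division cycle
    for p in reversed(p_bits):
        # All but one of this cell's output periods take 2 input periods;
        # the single period during which mod is active takes 2 or 3.
        periods = (periods - 1) * cell_division(False, p) + cell_division(True, p)
    return periods


if __name__ == "__main__":
    p_bits = [1, 0, 1]  # p0 = 1, p1 = 0, p2 = 1
    print(input_periods_per_cycle(p_bits))
```

Walking the chain this way shows why each programming bit carries a binary weight: the extra period swallowed by cell i is one period of that cell's input, which is itself 2^i periods of the chain input.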
Fig. 7. Sensitivity of the UHF divider, for different divider current settings. Division ratio = 511, nominal current is I = 10 µA.
Fig. 8. Sensitivity curves of the L-band divider, for a few divider and amplifier current settings. Division ratio = 1023.
Fig. 9. Maximum operation frequency of the UHF and L-band dividers, as a function of divider current consumption (excluding input amplifiers).
Fig. 10. Minimum input level for correct division of the frequency dividers, as a function of the input amplifiers' consumption.
Fig. 11. Phase noise of the reference and UHF divider, measured at 10 MHz. Dashed line: reference divider (I = 10 µA, F = 20 MHz, −10 dBm). Solid line: reference divider (I = 10 µA, F = 20 MHz, −20 dBm).
MCML circuits have been demonstrated to operate with supply voltages as low as 1.2 V [15], without significant loss of speed.

B. Phase Noise

The phase noise of the UHF and reference dividers was measured with a dedicated phase noise measurement system. We used coherent demodulation techniques (phase-locked loop configuration), and employed a low-noise 10-MHz signal source during the evaluation of the circuits. To facilitate the measurements, we implemented signal taps on the output of certain cells on the divider chain. The UHF divider was provided with a tap on the output of its sixth cell; the reference divider had the output of the first cell tapped.

Fig. 11 presents the phase noise of the UHF divider, with nominal settings for the supply current. The straight lines represent measured phase noise of the reference divider. We see a dependency of the noise floor of the reference divider on the level of the 20-MHz input signal. For the UHF divider, however, no dependency of the noise floor on the level of the input signal at 640 MHz was observed. An increase of 25% in current led to a change in noise floor from −122 dBc/Hz (nominal bias, I = 10 µA) to −124 dBc/Hz (with I = 12.5 µA). The noise floor of the reference divider went from −127.5 dBc/Hz down to −130 dBc/Hz, with increased bias. Fig. 11 shows that the high-frequency cells of the UHF divider (see Fig. 3) contribute significantly to the phase noise, especially in the “[...] region.” An increase of noise of about 15 dB is observed, compared to the noise of the single reference divider’s cell.

C. Power Efficiency

Fig. 12 presents the power efficiency of the UHF and L-band dividers, in comparison to recently published data on low-power dividers and tuning systems. Power efficiency is defined here as the ratio of the divider’s maximum operation frequency to its [...]
Abstract: A 10-Gb/s phase-locked clock and data recovery circuit incorporates an interpolating voltage-controlled oscillator and a half-rate phase detector. The phase detector provides a linear characteristic while retiming and demultiplexing the data with no systematic phase offset. Fabricated in a 0.18-µm CMOS technology in an area of 1.1 × 0.9 mm², the circuit exhibits an RMS jitter of 1 ps, a peak-to-peak jitter of 14.5 ps in the recovered clock, and a bit-error rate of $1.28 \times 10^{-6}$, with random data input of length $2^{23}-1$. The power dissipation is 72 mW from a 2.5-V supply.

Index Terms: Clock recovery, half-rate CDR, optical communication, oscillators, phase detectors, PLLs.

I. INTRODUCTION

[...] The next section of the paper presents the CDR architecture and design issues. Section III deals with the design of the building blocks. Section IV describes the experimental results.

II. ARCHITECTURE

The choice of the CDR architecture is primarily determined by the speed and supply voltage limitations of the technology as well as the power dissipation and jitter requirements of the system.

In a generic CDR circuit, shown in Fig. 1, the phase detector compares the phase of the incoming data to the phase of the clock generated by the voltage-controlled oscillator (VCO), [...]
Fig. 4. (a) Three-stage ring oscillator. (b) Implementation of each stage. (c)
Transistor-level schematic.
Fig. 6. VCO gain partitioning. (a) Fine control. (b) Coarse control.

764 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 36, NO. 5, MAY 2001

B. Phase Detector

Phase detectors generally appear in two different forms. Nonlinear PDs coarsely quantize the phase error, producing only a positive or negative value at their output. Linear PDs, on the other hand, generate a linearly proportional output that drops to zero when the loop is locked.

Compared to nonlinear PDs, linear PDs result in less charge pump activity, smaller ripple on the oscillator control line, and hence lower jitter. In a linear PD, such as that described in [6], the phase error is obtained by taking the difference between the width of two pulses, both of which are generated whenever a data transition occurs. The width of one of the pulses is linearly proportional to the phase difference between the clock and the data, whereas the width of the other is constant. By using a differential error signal, pattern dependency of phase error is cancelled because both pulses are present only when a data transition occurs.

For linear phase comparison between data and a half-rate clock, each transition of the data must produce an “error” pulse whose width is equal to the phase difference. Furthermore, to avoid a dead zone in the characteristics, a “reference” pulse must be generated whose area is subtracted from that of the error pulse, thus creating a net value that falls to zero in lock.

The above observations lead to the PD topology shown in Fig. 7(a). The circuit consists of four latches and two XOR gates. The data is applied to the inputs of two sets of cascaded latches, each cascade constituting a flipflop that retimes the data. Since the flipflops are driven by a half-rate clock, the two output sequences [...] and [...] are the demultiplexed waveforms of the original input sequence if the clock samples the data in the middle of the bit period.

Fig. 7. (a) Phase detector. (b) Operation of the circuit.

The operation of the PD can be described using the waveforms depicted in Fig. 7(b). The basic unit employed in the circuit is a latch whose output carries information about the zero crossings of both the data and the clock. The output of each latch tracks its input for half a clock period and holds the value for the other half, yielding the waveforms shown in Fig. 7(b) for points [...] and [...]. The two waveforms differ because their corresponding latches operate on opposite clock edges. Produced as the XOR of these two waveforms, the Error signal is equal to ZERO for the portion of time that identical bits of the two waveforms overlap, and equal to the XOR of two consecutive bits for the rest. In other words, Error is equal to ONE only if a data transition has occurred.

It may seem that the Error signal uniquely represents the phase difference, but that would be true only if the data were periodic. The random nature of the data and the periodic behavior of the clock in fact make the average value of Error pattern dependent. For this reason, a reference signal must also be generated whose average conveys this dependence. The two waveforms [...] and [...] contain the samples of the data at the rising and falling edges of the clock. Thus, their XOR contains pulses as wide as half the clock period for every data transition, serving as the reference signal.

While the two XOR operations provide both the Error and the Reference pulses for every data transition, the pulses in Error are only half as wide as those in Reference. This means that the amplitude of Error must be scaled up by a factor of two with respect to Reference so that the difference between their averages drops to zero when clock transitions are in the middle of the data eye. The phase error with respect to this point is then linearly proportional to the difference between the two averages.

In order to generate a full-rate output, the demultiplexed sequences are combined by a multiplexer that operates on the half-rate clock as well. This output can also be used for testing purposes in order to obtain the overall bit-error rate (BER) of the receiver.

It is important to note that the XOR gates in Fig. 7 must be symmetric with respect to their two differential inputs. Otherwise, differences in propagation delays result in systematic phase offsets. Each of the XOR gates is implemented as shown in Fig. 8 [7]. The circuit avoids stacking stages while providing perfect symmetry between the two inputs. The output is single-ended, but the single-ended Error and Reference signals produced by the two XOR gates in the phase detector are sensed with respect to each other, thus acting as a differential drive for the charge pump. The operation of the XOR circuit is as follows. If the two logical inputs are not equal, then one of the input transistors on the left and one of the input transistors on the right turn on, thus turning [...] off. If the two inputs are identical, one of the tail currents flows through [...]. Since the average current produced by the Error XOR gate is half of that generated by the Reference XOR gate, transistor [...] is scaled differently, making the average output voltages equal for zero phase difference. Channel length modulation of transistor [...] reduces the precision of current scaling between the two XOR gates. This effect can be avoided by increasing the length of the device.

The gain of the PD is determined by the value of the resistor [...] and the tail current sources ([...]). The voltage [...] is generated on chip in order to track the variations over temperature and [...]
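The Error and Reference pulse behavior described for the half-rate phase detector can be checked with a discrete-time sketch. The model below is an idealized assumption: zero-delay latches, a square half-rate clock, and hypothetical node names `x1`, `x2` (first-stage latch outputs) and `y1`, `y2` (second-stage latch outputs). It is not the transistor-level circuit of Fig. 7 or Fig. 8.

```python
import random


def simulate_half_rate_pd(bits, offset=0, steps_per_bit=100):
    """Return (error_time, reference_time) in simulation steps.

    bits          : data sequence, one entry per bit period
    offset        : clock edge displacement from mid-bit, in steps
                    (positive means the clock is late); keep it smaller
                    in magnitude than half a bit period
    steps_per_bit : time resolution of the simulation
    """
    half = steps_per_bit // 2
    x1 = x2 = y1 = y2 = 0
    error_time = reference_time = 0
    for t in range(len(bits) * steps_per_bit):
        d = bits[t // steps_per_bit]
        # Half-rate clock: period of two bits, edges nominally at mid-bit.
        ck = ((t - half - offset) % (2 * steps_per_bit)) < steps_per_bit
        if ck:
            x1 = d    # first latch tracks the data while the clock is high
            y2 = x2   # second-stage latch passes the held x2
        else:
            x2 = d    # the other first-stage latch tracks while clock is low
            y1 = x1   # second-stage latch passes the held x1
        error_time += x1 ^ x2
        reference_time += y1 ^ y2
    return error_time, reference_time


if __name__ == "__main__":
    rng = random.Random(1)
    # Zero padding lets every pulse complete inside the simulated window.
    data = [0] * 4 + [rng.randint(0, 1) for _ in range(64)] + [0] * 4
    transitions = sum(a != b for a, b in zip(data, data[1:]))
    for offset in (-20, 0, 20):
        err, ref = simulate_half_rate_pd(data, offset)
        print(offset, transitions, 2 * err - ref)
```

In this model each data transition yields a Reference pulse of one bit period (half the clock period) and an Error pulse of half a bit period plus the clock offset, so twice the Error time minus the Reference time is proportional to the offset and to the number of transitions, and vanishes when the clock edges sit in the middle of the data eye.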
SAVOJ AND RAZAVI: CLOCK AND DATA RECOVERY CIRCUIT 765
Fig. 13. (a) Spectrum of the recovered clock. (b) Recovered clock in the time domain.

[...] a value relatively close to its value in phase lock. The loop goes through a transition of 350 ns before it locks. The ripple on the control line in phase lock is approximately 1 mV.

[...] the SONET specifications, but the jitter analyzer must then generate large jitter, which drives the loop out of lock. The loop bandwidth can be reduced to the SONET specifications if a means of frequency detection is added to the loop [9]. The circuit is then much less susceptible to loss of lock due to the jitter generated by the analyzer.

Fig. 15. (a) Recovered demultiplexed data. (b) Recovered full-rate data.

Fig. 15 depicts the retimed data. The demultiplexed data outputs are shown in Fig. 15(a). The difference between the waveforms results from systematic differences between the bond wires and traces on the test board. Fig. 15(b) depicts the full-rate output. Using this output, the BER of the system can be measured. With a random sequence of [...], the BER is smaller than [...]. However, a random sequence of [...] results in a BER of [...]. This BER can be reduced if the bandwidth of the output buffer driving the 10-Gb/s data is increased. Furthermore, if the value of the linear resistors is adjusted to their nominal value, the increased operating speed of the back-end multiplexer results in an improved BER [9].

The CDR circuit exhibits a capture range of 6 MHz and a tracking range of 177 MHz. The total power consumed by the circuit excluding the output buffers is 72 mW from a 2.5-V supply. The VCO, the PD, and the clock and data buffers consume 20.7, 33.2, and 18.1 mW, respectively.

V. CONCLUSION

CMOS technology holds great promise for optical communication circuits. The raw speed resulting from aggressive scaling along with high levels of integration provides high performance at low cost. A 10-Gb/s clock and data recovery circuit designed in 0.18-µm CMOS technology performs phase locking, data regeneration, and demultiplexing with 1 ps of RMS jitter.

REFERENCES

[1] Y. M. Greshishchev and P. Schvan, “SiGe clock and data recovery IC with linear type PLL for 10-Gb/s SONET application,” in Proc. 1999 Bipolar/BiCMOS Circuits and Technology Meeting, Sept. 1999, pp. 169–172.
[2] M. Wurzer et al., “40-Gb/s integrated clock and data recovery circuit in a silicon bipolar technology,” in Proc. 1998 Bipolar/BiCMOS Circuits and Technology Meeting, Sept. 1998, pp. 136–139.
[3] M. Rau et al., “Clock/data recovery PLL using half-frequency clock,” IEEE J. Solid-State Circuits, vol. 32, pp. 1156–1159, July 1997.
[4] K. Nakamura et al., “A 6 Gb/s CMOS phase detecting DEMUX module using half-frequency clock,” in Dig. Symp. VLSI Circuits, June 1998, pp. 196–197.
[5] E. Mullner, “A 20-Gb/s parallel phase detector and demultiplexer circuit in a production silicon bipolar technology with f_T = 25 GHz,” in Proc. 1996 Bipolar/BiCMOS Circuits and Technology Meeting, Sept. 1996, pp. 43–45.
[6] C. Hogge, “A self-correcting clock recovery circuit,” J. Lightwave Technol., vol. LT-3, pp. 1312–1314, Dec. 1985.
[7] B. Razavi, Y. Ota, and R. G. Swartz, “Design techniques for low-voltage high-speed digital bipolar circuits,” IEEE J. Solid-State Circuits, vol. 29, pp. 332–339, Mar. 1994.
[8] L. M. De Vito, “A versatile clock recovery architecture and monolithic implementation,” in Monolithic Phase-Locked Loops and Clock Recovery Circuits, Theory and Design, B. Razavi, Ed. New York: IEEE Press, 1996.
[9] J. Savoj and B. Razavi, “A 10-Gb/s CMOS clock and data recovery circuit with frequency detection,” in Int. Solid-State Circuits Conf. Dig. Tech. Papers, Feb. 2001, pp. 78–79.

Jafar Savoj (S’98) was born in Tehran, Iran, in 1974. He received the B.S.E.E. degree from Sharif University of Technology, Tehran, in 1996 and the M.S.E.E. degree from the University of California, Los Angeles (UCLA), in 1998. He is currently working toward the Ph.D. degree at UCLA.
He spent the summer of 1998 with Integrated Sensor Solutions, San Jose, CA, working on the design of high-precision interfaces for sensor applications. During the summer of 1999, he was with NewPort Communications, Irvine, CA, developing CMOS clock and data recovery circuits for the SONET OC-192 standard.
Mr. Savoj received the IEEE Solid-State Circuits Society Predoctoral Fellowship for 2000–2001, and the Beatrice Winner Award for Editorial Excellence at the 2001 ISSCC. He is also a recipient of the Design Contest Award of the 2001 Design Automation Conference.
ISSCC 2001 / SESSION 5 / GIGABIT OPTICAL COMMUNICATIONS I / 5.3

5.3 A 10Gb/s CMOS Clock and Data Recovery Circuit with Frequency Detection

Jafar Savoj, Behzad Razavi

Electrical Engineering Department, University of California, Los Angeles, CA

Clock and data recovery (CDR) circuits operating in the 10Gb/s range have become attractive for the optical fiber backbone of the Internet. While CDR circuits operating at 10Gb/s have been designed in bipolar technologies, cost and integration issues make it desirable to implement these circuits in standard CMOS processes. This 10Gb/s CDR circuit is realized in 0.18µm CMOS technology. Architecture and circuit techniques circumvent the speed limitations of the devices. In contrast to previous work [1], this design incorporates an LC oscillator to reduce the jitter as well as a phase/frequency detector to achieve a wide capture range.

Shown in Figure 5.3.1, the CDR consists of a phase/frequency detector (PFD), a voltage-controlled oscillator (VCO), a charge pump, and a low-pass filter (LPF). The PFD compares the phase and frequency of the input data to that of a half-rate clock, providing two binary error signals for phase and frequency. The PFD is designed so that, in addition to providing information about the phase error, it retimes the data as well. Consequently, the CDR exhibits no systematic offset, i.e., inherent skews between clock and data edges due to their nonidentical paths through the loop do not degrade the quality of detection. The VCO provides four differential half-quadrature phases over the full tuning range. All building blocks are fully differential.

Since the half-rate frequency detector requires clock phases that are integer multiples of 45°, the 5GHz VCO is designed as a ring structure consisting of four LC-tuned stages [Figure 5.3.2a]. If the dc feedback around the ring is positive, all stages operate in-phase at the resonance frequency defined by the LC tanks. On the other hand, if the dc feedback is negative, the frequency shifts by a small amount so as to allow each stage to contribute 45° of phase.

The oscillator topology has two advantages over resistive-load ring oscillators. First, owing to the phase slope (Q) provided by the resonant loads, it exhibits less phase noise. Second, its frequency of oscillation is only a weak function of the number of stages, generating multiple phases with no speed penalty. By comparison, a four-stage resistive-load ring operates at a lower frequency.

Figure 5.3.2b shows the implementation of each stage. The loads are formed using on-chip spiral inductors and MOS varactors. Resistor R1 provides a shift in the output common-mode level, allowing both positive and negative voltages across the varactors and thus maximizing the tuning range. Modeling each tank by a parallel network, the required 45° phase shift slightly detunes the circuit. The oscillation frequency is given by $\omega_0 = (LC)^{-0.5}(1 - 1/Q_0)^{0.5}$, where $Q_0$ denotes the Q of each stage at resonance.

The phase detector (PD) is derived from the data transition tracking loop described in Reference 2. In this PD, in-phase and quadrature phases of a half-rate clock signal sample the data in two double-edge-triggered flipflops (DETFFs). Figure 5.3.3 shows the implementation of the PD. Two latches operating on opposite clock phases and a multiplexer form a DETFF that samples the data using both the positive and negative transitions of a half-rate clock. The two signals V1 and V2 are therefore the in-phase and quadrature samples of data, respectively, and one is used to route the other or its complement.

The phase detector operates at high speeds because it uses a half-rate clock. Since in the locked condition the rising and falling edges of the quadrature clock coincide with data transitions, the in-phase clock transitions sample the data at its optimum point with no systematic offset, generating a full-rate output stream. Also, since the phase-error signal is reevaluated only at data transitions, it incurs little ripple. Note that the output is independent of the data transition density, resulting in reduction of pattern-dependent jitter.

With the small CDR loop bandwidths specified by optical standards, circuits employing only phase detection suffer from an extremely narrow capture range, e.g., about 1% of the center frequency. For this reason, a means of frequency detection is necessary to guarantee lock to random data. As with other phase detectors, the half-rate PD of Figure 5.3.3 generates a beat frequency equal to the difference between the data rate and twice the VCO frequency. However, it does not provide knowledge of the polarity of this difference. Figure 5.3.4 depicts the half-rate phase and frequency detector introduced in this work. A second PD is added and driven by phases that are 45° away from those in the first PD. The circuit operates as follows. (1) If the clock is slow, VPD1 leads VPD2; therefore, if VPD2 is sampled by the rising and falling edges of VPD1, the results are negative and positive, respectively. (2) If the clock is fast, VPD1 lags VPD2. Therefore, if VPD2 is sampled by the rising and falling edges of VPD1, the results are the reverse of the previous case.

The output buffer delivering the 10Gb/s retimed data with high current levels requires a bandwidth of more than 7 GHz. As shown in Figure 5.3.5, the buffer stage employs inductive peaking [3]. The value of the spiral inductors is chosen so as to avoid ripple in the passband. Since the quality factor of the inductors is not critical here, the spiral structures have a linewidth of only 4µm to achieve a high self-resonance frequency.

The CDR circuit is fabricated in a 0.18µm CMOS technology. The circuit is tested in a chip-on-board assembly while operating with a 1.8V supply. The phase noise of the clock in response to a 9.95328Gb/s data sequence of length $2^{23}-1$ at 1MHz offset is approximately equal to -107dBc/Hz. Figure 5.3.6a depicts the recovered clock and data. A pseudo-random sequence of length $2^{23}-1$ produces 9.9ps of peak-to-peak and 0.8ps rms jitter on the clock signal. The jitter characteristics are measured by the Anritsu MP1777 jitter analyzer. The measured jitter transfer characteristic of the CDR is shown in Figure 5.3.6b. The jitter peaking is 0.04dB and the 3dB bandwidth is 5.2MHz. Despite the small loop bandwidth, the frequency detector provides a capture range of 1.43GHz, obviating the need for external references. The total power consumed by the circuit excluding the output buffers is 91mW from a 1.8V supply. Figure 5.3.7 shows a micrograph of the chip, which occupies 1.75 × 1.55 mm².

Acknowledgments:
The authors thank NewPort Communications for fabrication and test support. This work was supported by SRC and Cypress Semiconductor.

References:
[1] J. Savoj and B. Razavi, “A 10-Gb/s CMOS Clock and Data Recovery Circuit,” Dig. of Symposium on VLSI Circuits, pp. 136-139, June 2000.
[2] A. W. Buchwald, Design of Integrated Fiber-Optic Receivers Using Heterojunction Bipolar Transistors, Ph.D. Thesis, University of California, Los Angeles, Jan. 1993.
[3] J. Savoj and B. Razavi, “A CMOS Interface Circuit for Detection of 1.2-Gb/s RZ Data,” ISSCC Digest of Technical Papers, pp. 278-279, Feb. 1999.
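The detuning of the LC-tuned ring oscillator can be illustrated numerically. The sketch below assumes a simple parallel RLC tank with $Q_0 = R_p/(\omega_{res} L)$ and hypothetical component values (a 1 nH inductor, a 5 GHz tank resonance, $Q_0 = 10$); it compares the closed-form expression quoted in the text with the frequency at which the tank phase is exactly 45°.

```python
import cmath
import math


def tank_impedance(omega, inductance, capacitance, r_parallel):
    """Impedance of a parallel RLC tank at angular frequency omega."""
    admittance = (
        1.0 / r_parallel
        + 1.0 / (1j * omega * inductance)
        + 1j * omega * capacitance
    )
    return 1.0 / admittance


def detuned_ratio_exact(q0):
    """omega/omega_res at which the tank phase is +45 degrees.

    Solves 1 - x**2 = x / q0, which follows from setting the imaginary
    part of the tank admittance equal in magnitude to its real part.
    """
    return (-1.0 / q0 + math.sqrt(1.0 / q0 ** 2 + 4.0)) / 2.0


def detuned_ratio_approx(q0):
    """omega/omega_res from the expression (1 - 1/Q0)**0.5 in the text."""
    return math.sqrt(1.0 - 1.0 / q0)


if __name__ == "__main__":
    q0 = 10.0                        # assumed tank Q, illustrative only
    inductance = 1e-9                # assumed 1 nH spiral inductor
    omega_res = 2 * math.pi * 5e9    # assumed 5 GHz tank resonance
    capacitance = 1.0 / (omega_res ** 2 * inductance)
    r_parallel = q0 * omega_res * inductance

    x = detuned_ratio_exact(q0)
    z = tank_impedance(x * omega_res, inductance, capacitance, r_parallel)
    print("exact ratio :", x)
    print("approx ratio:", detuned_ratio_approx(q0))
    print("tank phase (deg):", math.degrees(cmath.phase(z)))
```

For moderate Q the two ratios differ only slightly, which is consistent with the statement that the 45° requirement detunes the circuit by a small amount.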
Figure 5.3.3: Phase detector.
Figure 5.3.4: Phase and frequency detector.
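The polarity rule of the frequency detector (sampling VPD2 on the rising and falling edges of VPD1) can be sketched with idealized beat waveforms. The model below assumes sinusoidal beat signals and a 90° lead or lag between them; both are illustrative assumptions, and any lead between 0° and 180° gives the same signs. The function names are hypothetical.

```python
import math


def sample_vpd2(clock_is_slow, lead_deg=90.0):
    """Sample VPD2 at the rising and falling zero crossings of VPD1.

    VPD1 is modeled as sin(theta) and VPD2 as sin(theta - phase), so a
    positive phase means VPD1 leads VPD2 (the slow-clock case in the text).
    """
    lead = math.radians(lead_deg)
    phase = lead if clock_is_slow else -lead
    at_rising = math.sin(0.0 - phase)        # VPD1 rises through zero at theta = 0
    at_falling = math.sin(math.pi - phase)   # VPD1 falls through zero at theta = pi
    return at_rising, at_falling


def classify(at_rising, at_falling):
    """Map the two samples to a frequency-error polarity."""
    if at_rising < 0 < at_falling:
        return "slow"
    if at_falling < 0 < at_rising:
        return "fast"
    return "undetermined"


if __name__ == "__main__":
    for slow in (True, False):
        samples = sample_vpd2(slow)
        print(slow, samples, classify(*samples))
```

The two samples always have opposite signs in this model, and their order encodes whether the clock is slow or fast, which is the information a single phase detector cannot provide.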
(a) (b)
Fig. 2. CDR circuit: (a) block diagram and (b) timing diagram.
[...] with DFF2 and the XOR gate. All these functions are integrated in a single chip. The fixed 90° phase shifter, voltage-controlled oscillator (VCO), and loop filter have been realized externally with commercially available components.

Fig. 2(b) shows the timing diagram. The incoming 40-Gb/s data signal is applied to flip-flops DFF1, DFF2, and DFF3. DFF1 is toggled by CLK, DFF2 by the complement of CLK, and DFF3 by the 90° delayed clock signal. This results in the sampling of the input in the vicinity of midbit and each following potential transition. If a transition is present, the phase relationship of the data and the clock can be deduced to be early or late. If the midbit clock CLK is too early, DFF3 samples the same bit; if it is too late, DFF3 samples the following bit. Under locked conditions, DFF3 samples at the edge of the data eye. The XOR compares the output samples of DFF2 and DFF3. The result is fed to the loop filter. The output signal of the loop filter serves as the control signal of the VCO.

The advantages of this concept are that all components operate at half the data rate and that the input is demultiplexed at the same time. The disadvantage is that the input signal has to drive three DFFs in parallel.

IV. CIRCUIT AND DESIGN PRINCIPLES

The circuit is designed for a single supply voltage of 5 V. The circuit principles used are seen in the circuit blocks of a master–slave D-flip-flop (MS-DFF), shown in Fig. 3. For details, see [9]. The well-proven E²CL (emitter–emitter coupled logic) is used with emitter followers at the inputs and current switches at the outputs. The series gating between clock and data signals enables differential operation with low voltage swings ([...] mV p-p), resulting in an increase in speed and a reduction of power consumption. Furthermore, differential operation reduces time jitter and crosstalk and offers good common-mode suppression compared to single-mode operation [10]. Cascaded emitter followers are used for level shifting and impedance transformation between the various current switches. Multiple emitter followers improve the decoupling capability and increase the collector-base voltage of the current-switch transistors, allowing for smaller transistors and thus lower collector-base capacitances [10]. On-chip matching resistors (50 Ω) at all data inputs are used in order to reduce jitter introduced by reflections [2], [11].
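The early/late decision described for the timing diagram can be sketched with an idealized sampling model. It assumes a midbit sample and a second sample taken half a bit period later (the 90° delayed half-rate clock), ignores metastability and setup times, and examines every bit boundary rather than every other one; the function names are hypothetical.

```python
def sample(bits, t, bit_period):
    """Ideal sampler: value of the bit that is present at integer time t."""
    return bits[t // bit_period]


def xor_outputs(bits, clock_error, bit_period=100):
    """XOR of the midbit sample and the delayed sample for each bit.

    clock_error is the displacement of the midbit sampling instant from
    the true bit center, in the same integer units as bit_period
    (negative means the clock is early, positive means it is late).
    """
    outputs = []
    for k in range(len(bits) - 1):
        t_mid = k * bit_period + bit_period // 2 + clock_error
        t_edge = t_mid + bit_period // 2   # quarter of the half-rate clock period
        mid = sample(bits, t_mid, bit_period)
        edge = sample(bits, t_edge, bit_period)
        outputs.append(mid ^ edge)
    return outputs


if __name__ == "__main__":
    data = [0, 1, 1, 0, 1, 0, 0, 1]
    print("early:", xor_outputs(data, clock_error=-10))
    print("late :", xor_outputs(data, clock_error=+10))
```

With an early clock the delayed sample still falls in the same bit, so the XOR stays at zero; with a late clock it falls in the following bit, so the XOR goes high whenever a data transition is present. The loop filter averages this binary decision to steer the VCO.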
1322 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 34, NO. 9, SEPTEMBER 1999
Fig. 4. Chip micrograph (chip size: 0.9 × 0.9 mm²).

TABLE I
DEVICE PARAMETERS (JUNCTION CAPACITANCES ARE ZERO-BIAS VALUES)

V. CHIP TECHNOLOGY

The circuit has been fabricated in a self-aligned double-polysilicon bipolar technology [12]. The fabrication starts with buried layer formation. A 1-µm epitaxial layer is grown as a compromise between high transit frequency and low external collector-base capacitance. The isolation consists of a channel stopper implantation combined with LOCOS field oxide. The active base is formed by 5-keV BF2 implantation. This low implantation energy in combination with optimized annealing conditions allows for very steep base profiles. This results in a narrow base width of about 50 nm and thus enables a high transit frequency. A selectively implanted collector improves the current-carrying capability of the transistors to the high optimum collector current density of about 2 mA/µm². To minimize narrow emitter effects, an in situ doped emitter-polysilicon layer is used [13]. This prevents a reduction of cutoff frequency even for 0.5-µm design rules. A three-level metallization completes the process.

Fig. 5 shows a schematic cross section of a transistor. Except for epitaxy, only process steps of a 0.5-µm CMOS production environment are necessary. The maximum transit frequency of the transistors is [...] GHz at [...] V and [...] mA/µm². Table I summarizes typical parameters for transistors with effective emitter size of [...] µm². The minimum gate delay for an ECL differentially operating ring oscillator with output voltage swing of 800 mV p-p is measured to be 15.4 ps. This value is achieved for a current per gate of 1.6 mA (see Fig. 6).

Fig. 6. Measured ECL gate delay versus current per gate with an 800-mV differential voltage swing.

VI. MOUNTING AND MEASUREMENT SETUP

For measurements, the clock and data recovery IC has been mounted on a 15-mil ceramic substrate using conventional bonding techniques. Special care has been taken to minimize the length of the bond wires by positioning the surface of the chip on the same level as the signal, ground, and supply lines of the mounting substrate. Due to differential operation, a pair of lines for each clock and data signal is needed to connect the chip with the environment. Therefore, a corresponding number of connectors are necessary. The minimum distance between them determines the minimum size of the test fixture. To avoid additional delay lines, the length of the lines for the signals [...]
WURZER et al.: 40-Gb/s INTEGRATED CLOCK AND DATA RECOVERY CIRCUIT 1323
Fig. 9. Eye diagram of the 20-Gb/s data signal at the output D2 of the 1 : 2
demultiplexer.
(b)
[...] digital functions necessary for a 40-Gb/s transmission system are feasible with silicon bipolar production technologies.

Josef Böck was born in Straubing, Germany, in 1968. He received the diploma degree in physics and the Ph.D. degree from University of Regensburg, Germany, in 1994 and 1997, respectively.
He joined Corporate Research and Development, Siemens AG, Munich, Germany, in 1993, where he first investigated narrow emitter effects in deep submicrometer silicon bipolar devices. His work on technology development and process integration for high-speed silicon bipolar transistors resulted in the SIEGET 45 microwave-transistor family. Currently, he is working on process development for Si and SiGe bipolar technologies.

Herbert Knapp was born in Salzburg, Austria, in 1964. He received the Diplomingenieur degree in electrical engineering from Technical University Vienna, Austria, in 1997.
He joined Corporate Research and Development, Siemens AG, Munich, Germany, in 1993, where he has been involved in the design of integrated circuits for wireless communications. His current research interests include the design of high-speed and low-power microwave circuits.

REFERENCES

[1] K. Hagimoto, Y. Miyamoto, T. Kataoka, H. Ichino, and O. Nakajima, “Twenty-Gbit/s signal transmission using simple high-sensitivity optical receiver,” in OFC’92 Tech. Dig., Feb. 1992, p. 48.
[2] A. Felder, M. Möller, J. Popp, J. Böck, and H.-M. Rein, “46 Gb/s DEMUX, 50 Gb/s MUX, and 30 GHz static frequency divider in silicon bipolar technology,” IEEE J. Solid-State Circuits, vol. 31, pp. 481–486, Apr. 1996.
[3] W. Bogner, U. Fischer, E. Gottwald, and E. Müllner, “20 Gbit/s TDM nonrepeatered transmission over 198 km DSF using Si-bipolar IC for demultiplexing and clock recovery,” in Proc. ECOC, Sept. 1996, paper TuD.3.4.
[4] W. Bogner, E. Gottwald, A. Schöpflin, and C.-J. Weiske, “40 Gbit/s unrepeatered optical transmission over 148 km by electrical time division multiplexing and demultiplexing,” Electron. Lett., vol. 33, no. 25, pp. 2136–2137, Dec. 1997.
[5] R. Yu, R. Pierson, P. Zampardi, K. Runge, A. Campana, D. Meeker, K. C. Wang, A. Petersen, and J. Bowers, “Packaged clock recovery integrated circuits for 40 GBit/s optical communication links,” in GaAs IC Symp. Tech. Dig., Nov. 1996, pp. 129–132.
[6] M. Mokhtari, T. Swahn, R. H. Walden, W. E. Stanchina, M. Kardos, T. Juhola, G. Schuppener, H. Tenhunen, and T. Lewin, “InP-HBT chip-set for 40-Gb/s fiber optical communication systems operational at 3 V,” IEEE J. Solid-State Circuits, vol. 32, pp. 1371–1383, Sept. 1997.
[7] M. Lang, Z.-G. Wang, Z. Lao, M. Schlechtweg, A. Thiede, M. Rieger-
Motzer, M. Sedler, W. Bronner, G. Kaufel, K. Köhler, A. Hülsmann,
and B. Raynor, “20–40 Gb/s 0.2-m GaAs HEMT chip set for optical Wolfgang Zirwas received the Diplomingenieur
data receiver,” IEEE J. Solid-State Circuits, vol. 32, pp. 1384–1393, degree in electrical engineering from Technical Uni-
Sept. 1997. versity Munich, Germany.
[8] A. Felder, M. Möller, M. Wurzer, M. Rest, T. F. Meister, and H.-M. He joined Siemens AG, Munich, in 1987. First,
Rein, “60 Gbit/s regenerating demultiplexer in SiGe bipolar technology,” he worked in the field of high-bit-rate fiber-optic
Electron. Lett., vol. 33, no. 23, pp. 1984–1986, Nov. 1997. communication systems. Later, he focused his work
[9] J. Hauenschild, A. Felder, M. Kerber, H.-M. Rein, and L. Schmidt, on broad-band access technologies (xDSL, HFC)
“A 22 Gb/s decision circuit and a 32 Gb/s regenerating demultiplexer for both residential and business users. He is now
IC fabricated in silicon bipolar technology,” in Proc. IEEE BCTM’92, working in the field of broad-band wireless systems.
Sept. 1992, pp. 151–154.
[10] H.-M. Rein and M. Möller, “Design considerations for very-high-speed
Si-bipolar IC’s operating up to 50 Gb/s,” IEEE J. Solid-State Circuits,
vol. 31, pp. 1076–1090, Aug. 1996.
[11] J. Hauenschild and H.-M. Rein, “Influence of transmission-line inter-
connections between Gbit/s IC’s on time jitter and instabilities,” IEEE Fritz Schumann received the Diplomingenieur de-
J. Solid-State Circuits, vol. 25, pp. 763–766, June 1990. gree in electrical engineering from Technical Uni-
[12] J. Böck, A. Felder, T. F. Meister, M. Franosch, K. Aufinger, M. Wurzer, versity Berlin, Germany, in 1981.
R. Schreiter, S. Boguth, and L. Treitinger “A 50 GHz implanted base Subsequently, he worked in the field of RF and
silicon bipolar technology with 35 GHz static frequency divider,” in microwave hybrid circuit and system design for
Symp. VLSI Technology Tech. Dig., June 1996, pp. 108–109. telecommunication and radar applications. In 1992,
[13] J. Böck, M. Franosch, H. Schäfer, H. v. Philipsborn, and J. Popp, “In- he joined the silicon bipolar IC design group, Cor-
situ doped emitter-polysilicon for 0.5 m silicon bipolar technology,” in porate Research and Development, Siemens AG,
Proc. ESSDERC’95, The Hague, the Netherlands, Sept. 1995, pp. 421– Munich, Germany. Since then, he has realized IC’s
424. for wireless and fiber-optic communication systems
[14] M. Möller, H.-M. Rein, A. Felder, and T. F. Meister, “60 Gbit/s time-
up to 60 Gb/s.
division multiplexer in SiGe-bipolar technology with special regard to
mounting and measuring technique,” Electron. Lett., vol. 33, no. 8, pp.
679–680, Apr. 1997.
Alfred Felder was born in Bruneck, South Tyrol, Italy, in 1963. He received
the Diplomingenieur and Ph.D. degrees in electrical engineering from the
Martin Wurzer was born in Innsbruck, Austria, in Technical University Vienna, Austria, in 1989 and 1993, respectively.
1966. He received the Diplomingenieur degree in He joined Corporate Research and Development, Siemens AG, Munich,
electrical engineering from the Technical University Germany, in 1989, where he has been engaged in the development of analog
Vienna, Austria, in 1994, where he is currently and digital high-speed silicon bipolar IC’s for future optical communication
pursuing the Ph.D. degree. systems in the gigabit-per-second range. From 1996 to 1998, he was Manager
He joined Corporate Research and Development, of the Technology Department of Siemens K.K. The department is the liaison
Siemens AG, Munich, Germany, in 1994, where office of the Corporate Technology of Siemens AG in Japan, responsible
he has been engaged in the development of digital for the cooperation with Japanese companies in research. Since 1998, he
high-speed silicon bipolar IC’s for future optical has been heading the business operation Signal Processing & Control within
communication systems in the gigabit-per-second the Siemens Semiconductor Group in Japan and has been responsible for
range. marketing of microcontrollers and digital signal processors.
IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 36, NO. 12, DECEMBER 2001 1937
Abstract: In this paper, a fully integrated 40-Gb/s clock and data recovery (CDR) IC with additional 1:4 demultiplexer (DEMUX) functionality is presented. The IC is implemented in a state-of-the-art production SiGe process. Its phase-locked-loop-based architecture with bang-bang-type phase detector (PD) provides maximum robustness. To the authors’ best knowledge, it is the first 40-Gb/s CDR IC fabricated in a SiGe heterojunction bipolar technology (HBT). The measurement results demonstrate an input sensitivity of 42-mV single-ended data input swing at a bit-error rate (BER) of 10^-10. As demonstrated in optical transmission experiments with the IC embedded in a 40-Gb/s link, the CDR/DEMUX shows complete functionality as a single-chip-receiver IC. A BER of 10^-10 requires an optical signal-to-noise ratio of 23.3 dB.
Index Terms: Bang-bang, BER, CDR, clock and data recovery, demultiplexer, DEMUX, dynamic frequency divider, jitter generation, jitter tolerance, limiting amplifier, OSNR, phase detector, phase-locked loop, PLL, SiGe, VCO.
I. INTRODUCTION
[...] expected to be insufficient to meet the rapidly increasing demands for higher bandwidth in the foreseeable future.
The economically achievable transmission capacity of these wavelength-division multiplexing (WDM) systems is currently limited to 1.6 Tb/s, assuming 160 parallel 10-Gb/s TDM channels in the C- and L-band at a channel spacing of 50 GHz. This corresponds to a spectral efficiency of 0.2 (b/s)/Hz. By increasing the channel bit rate to 40 Gb/s per TDM channel, the fiber capacity can be better utilized. With the spectral efficiency increased to 0.4 (b/s)/Hz, the total transmission capability is 3.2 Tb/s, assuming 80 parallel 40-Gb/s channels and 100-GHz channel spacing.
Additionally, 40-Gb/s TDM will become more cost effective, as the number of optical ports is reduced by a factor of 4 compared to 10-Gb/s TDM, resulting in fewer price-determining optical components, smaller system footprint, and reduced maintenance costs.
Regarding next-generation 40-Gb/s TDM links, the clock and data recovery (CDR) IC is a key electronic component, which strongly determines the overall transmission performance. 40-Gb/s TDM designs must be architecturally robust and manufacturable to compete with 10-Gb/s TDM systems. Accordingly, a fully integrated phase-locked loop (PLL)-based approach with self-aligning bang-bang phase detector (PD) is employed in this work. The IC is fabricated in a production state-of-the-art SiGe heterojunction bipolar technology (HBT) which provides advantages with respect to the achievable level of integration, yield, cost-effectiveness, and process stability compared to III-V process technologies.
The 40-Gb/s TDM optical link employs a 4:1 multiplexing scheme, as shown in the block diagram in Fig. 1. At the receiver, the incoming optical signal is first amplified by an optical preamplifier (OA), converted into electrical pulses by the photo diode, and then directly feeds the CDR/DEMUX. Data recovery is accomplished by the first 1:2 DEMUX. In the PLL-based clock recovery approach presented here, the PD output forces the receive-side voltage-controlled oscillator (VCO) to track the phase of the incoming data signal. The combination of the PD and 1:2 DEMUX function allows the use of a 20-GHz half-bit-rate clock. This half-bit-rate architecture is explained in more detail in Section IV.
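The capacity and spectral-efficiency figures quoted in the introduction follow from simple arithmetic, which the short sketch below reproduces. The function name and its parameters are illustrative only and do not come from the paper; the inputs are the channel counts, bit rates, and channel spacings stated in the text.

```python
def wdm_capacity(channels, bit_rate_gbps, spacing_ghz):
    """Return (total capacity in Tb/s, spectral efficiency in (b/s)/Hz).

    Hypothetical helper: capacity is channels x bit rate, and spectral
    efficiency is bit rate per channel divided by channel spacing
    (Gb/s per GHz is numerically the same as (b/s)/Hz).
    """
    capacity_tbps = channels * bit_rate_gbps / 1000.0
    efficiency = bit_rate_gbps / spacing_ghz
    return capacity_tbps, efficiency


# 160 channels of 10 Gb/s on a 50-GHz grid (figures from the text)
cap_10g, eff_10g = wdm_capacity(channels=160, bit_rate_gbps=10, spacing_ghz=50)

# 80 channels of 40 Gb/s on a 100-GHz grid (figures from the text)
cap_40g, eff_40g = wdm_capacity(channels=80, bit_rate_gbps=40, spacing_ghz=100)

print(f"10-Gb/s TDM: {cap_10g:.1f} Tb/s at {eff_10g:.1f} (b/s)/Hz")
print(f"40-Gb/s TDM: {cap_40g:.1f} Tb/s at {eff_40g:.1f} (b/s)/Hz")
```

Both configurations occupy the same optical bandwidth (160 x 50 GHz = 80 x 100 GHz), so the doubling of capacity comes entirely from the doubled spectral efficiency.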
This paper focuses on the CDR/DEMUX IC. However, the remaining basic functions such as the 4:1 multiplexer (MUX) and the driver IC have also been realized in this work program in SiGe HBT and GaAs high-electron-mobility transistor (HEMT) technology, respectively.
III. PROCESS TECHNOLOGY
The CDR/DEMUX IC presented in this work was designed in a state-of-the-art SiGe HBT with 72-GHz f_T and 74-GHz f_max [1]. SiGe HBT technology is superior to III-V technologies for a high level of integration. The process features four metal layers in total including a thick metal layer on top.
Manuscript received March 26, 2001; revised July 15, 2001.
M. Reinhold, C. Dorschky, E. Rose, and F. Kunz were with Lucent Technologies, Optical Networking Group, D-90411 Nürnberg, Germany. They are now with CoreOptics GmbH, D-90411 Nürnberg, Germany (e-mail: mario@coreoptics.com or mario_reinhold@gmx.de).
R. Pullela was with Lucent Technologies, Bell Labs, Murray Hill, NJ. He is now with Gtran Inc., Westlake Village, CA 91362 USA.
P. Mayer and T. Link are with Lucent Technologies, Optical Networking Group, D-90411 Nürnberg, Germany.
Y. Baeyens is with Lucent Technologies, Bell Labs, Murray Hill, NJ 07974 USA.
J.-P. Mattia was with Lucent Technologies, Bell Labs, Murray Hill, NJ 07974 USA. He is now with Big Bear Networks, Sunnyvale, CA 94086 USA.
Publisher Item Identifier S 0018-9200(01)09325-8.
0018–9200/01$10.00 © 2001 IEEE
Small-scale integrated analog and digital building blocks implemented in this process have been demonstrated for 40-Gb/s operation [2], [3].
IV. CDR/DEMUX ARCHITECTURE
The half-bit-rate architecture of the CDR/DEMUX IC (Fig. 2) is based on the concept reported in [4] and has already demonstrated its functionality for 10-Gb/s applications.
The nonlinear bang-bang PD described in [5] was modified for interlaced operation. The combination of the PD and the 1:2 DEMUX functions allows the use of a half-bit-rate clock running at 20 GHz.
The three upper eye diagrams in Fig. 3 illustrate the basic principle of a common bang-bang PD. For the basic realization, three samples of the incoming data signal are necessary. In locked condition, two of the samples capture two consecutive bits while the third samples the data transition between them, as indicated in the first eye diagram.
In this implementation, two modifications to the common bang-bang PD are made. First, the 1:2 DEMUX and phase-
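The three-sample principle of a common bang-bang PD described above can be sketched in a few lines. This is a minimal behavioral illustration and not the circuit of this work: the sample names a, t, and b (previous bit, transition sample, next bit), the function names, and the sign convention (-1 for an early clock, +1 for a late clock) are assumptions introduced here.

```python
def bang_bang_pd(a, t, b):
    """Early/late decision of a generic three-sample bang-bang PD.

    a, b: two consecutive bit samples (0 or 1)
    t:    sample taken at the nominal transition instant between them
    Returns -1 (clock early), +1 (clock late), or 0 (no transition,
    hence no phase information). The polarity is an assumed convention.
    """
    if a == b:
        return 0  # no data transition between the two bits
    # With a transition present, t still showing the old bit means the
    # transition sample was taken before the data changed (clock early).
    return -1 if t == a else 1


def pd_average(samples):
    """Average PD output over (a, t, b) triples that contain a transition."""
    votes = [bang_bang_pd(a, t, b) for a, t, b in samples]
    active = [v for v in votes if v != 0]
    return sum(active) / len(active) if active else 0.0


# Example: two "early" votes, one "late" vote, one triple without transition
example = [(0, 0, 1), (1, 1, 0), (0, 1, 1), (1, 1, 1)]
mean_vote = pd_average(example)
```

Because the output carries only the sign of the phase error, the loop behaves nonlinearly; in lock the transition sample falls on the data edge and the early and late votes balance on average.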
REINHOLD et al.: FULLY INTEGRATED 40-Gb/s CLOCK AND DATA RECOVERY IC 1939
V. BUILDING BLOCKS
The most challenging building blocks of the CDR/DEMUX are the 40-GHz 2:1 frequency divider, the 40-GHz VCO, the limiting data input amplifier and the transition sampling latches, including the four-phase 20-GHz clock tree.
Fig. 3. PD principle.
Fig. 5. Circuit diagram of the 40-GHz on-chip VCO.
respect to loop gain, the VCO phase noise is suppressed by its open loop gain when the PLL is in lock.
C. Limiting Data Amplifier
The limiting data input amplifier (Fig. 6) employs cascaded chains of emitter followers (EF), transimpedance stages (TIS), and transadmittance stages (TAS) in accordance with the concept of impedance mismatch [9].
Layout aspects strongly influence the circuit performance; in particular, the performance should not be degraded by signal interconnects. Since signal lines can be distinguished into critical and uncritical lines [9], long transmission lines are arranged between current interfaces consisting of a TAS and its load, which is either represented by an active TIS or by passive resistors. For this reason, the limiting data amplifier is implemented, both in schematic and layout, in the form of three separate amplifier blocks with a TIS–TAS interface.
As the data signal has to be split into four latch chains (Fig. 2) and longer lines cannot be avoided, four TIS2 stages are driven in parallel by one TAS2.
Fig. 7 shows the circuit diagram of the latch. The latch structure can be subdivided into the latch core and the local clock input stage.
The steepness of the PD curve is strongly determined by the metastability and the clock phase margin (CPM) of the latch, since the transition sampling latches sample the bit transition when the PLL is in lock. Two high current-biased emitter followers in the data path provide a high CPM and a small metastability region.
For optimal clock distribution, a local clock input stage consisting of termination resistors and two emitter followers is included in each latch cell, since a total clock line length of several millimeters cannot be avoided. For this reason, current interfaces between each clock buffer and the latches are employed. The concept of the clock distribution is illustrated in Fig. 8. The two clock buffers are based on open-collector transadmittance stages. As stated before, a local clock input stage consisting of termination resistors is included in every latch cell. Each clock buffer is loaded with ten latches.
Impedance matching can be easily achieved by designing the line impedance according to the number of loading latches. The value of can be locally adapted with respect to signal splits, as it is shown in Fig. 8. If the line impedances are chosen (with being the number of loading latches), the clock amplitude can be increased. This is due to the inductive peaking effect of the transmission line, as indicated in Fig. 9.
The layout of the latch, which is shown as part of Fig. 10, employs orthogonal data and clock inputs. This implementation minimizes line length on the high-speed data path by running a data channel directly through the cascaded latch cells (Fig. 10). In addition, clock channels run beside the cells to simplify the clock-tree routing.
Fig. 10. Layout of the entire clock tree and enlargement of the latch.
E. Electrical Sensitivity (BER)
The bit-error rate (BER) curve as a measure of the overall electrical CDR performance is given in Fig. 17. For the CDR/DEMUX with external VCO, a very high sensitivity of 28-mV single-ended voltage swing at a BER of 10^-10 is measured. Due to minor modifications of the limiting amplifier resulting in lower bandwidth, the CDR/DEMUX with internal VCO provides slightly less sensitivity. For the same BER, a 42-mV single-ended voltage swing is necessary. A contribution of the internal VCO to the performance degradation can be ruled out, since a variant with external VCO and modified limiting amplifier showed similar performance degradation.
Such high input sensitivity allows a direct connection of the photodiode to the CDR/DEMUX as the optical measurement results demonstrate.
F. Performance in System Application
The performance of the CDR/DEMUX embedded in an optical fiber link (refer to Fig. 1) can be characterized by the optical signal-to-noise ratio (OSNR) measurement, as shown in Fig. 18. This is due to the fact that in the given configuration the sensitivity is limited by the noise of the OA.
A 50-Ω terminated photodiode is directly connected to the CDR/DEMUX without any electrical amplifier in between, so
Yves Baeyens (S’89–M’96) received the M.S. and Ph.D. degrees in electrical engineering from the Catholic University, Leuven, Belgium, in 1991 and 1997, respectively. His Ph.D. research was performed in cooperation with IMEC, Leuven, and treated the design and optimization of coplanar InP-based dual-gate HEMT amplifiers, operating up to W-band.
After a year and a half stay as a Visiting Scientist at the Fraunhofer Institute for Applied Physics, Freiburg, Germany, he is currently a Technical Manager in the High-Speed Electronics Research Department of Lucent Technologies, Bell Laboratories, Murray Hill, NJ. His research interests include the design of mixed analog–digital circuits for ultrahigh-speed lightwave and millimeter-wave applications.
John-Paul Mattia received the B.S., M.S., E.E., and Ph.D. degrees in electrical engineering and computer science from the Massachusetts Institute of Technology, Cambridge.
He began working in high-speed electronics at MIT Lincoln Laboratory in 1989. In 1996, he joined Texas Instruments Inc. in the DSP R&D organization. From 1997 to 2000, he worked in the High-Speed Electronics Group of Lucent Bell Labs, designing and testing circuits for lightwave communication systems. Since July 2000, he has been at Big Bear Networks, Sunnyvale, CA, where he is Chief Technical Officer of Electronics.
I. INTRODUCTION
A TYPICAL fiber-optic SONET receiver contains pin-diode with transimpedance (TZ) amplifier, wide dynamic range automatic gain control amplifier (AGC), and a clock and data recovery circuit (CDR) with a demultiplexer. Introduction of dense wave-division-multiplexed (DWDM) systems has put a high demand on the receiver production. A high level of 10-Gb/s component integration, as opposed to using a filter-based CDR architecture [1], is required along with self-testing capabilities to reduce receiver cost, module size, and power dissipation. One of the major difficulties in the integration of 10-Gb/s receiver is to achieve jitter characteristics compliant to the SONET requirements, such as Bellcore recommendations for the OC192 system [2]. To the authors’ knowledge, none of the previously reported [3]–[5] 10-Gb/s receiver ICs with the integrated clock and data recovery circuit (CDR) demonstrated all of the SONET compliant jitter characteristics. While sub-picosecond jitter generation was previously confirmed in the SONET CDR [5], another important question is if all of the receiver components can be integrated on a die without sensitivity and jitter performance degradation.
A fully integrated SiGe receiver IC, presented in the paper, combines CDR, AGC, 1 : 8 demultiplexer and pseudorandom bit sequence (PRBS) generator for self-testing, as shown in the dotted-line box in Fig. 1 [6]. Receiver performance mounted into test fixture was verified in a data-recovery mode up to 12.5 Gb/s and in a CDR mode at 9.1 Gb/s (only limited by the VCO maximum oscillation frequency after packaging). The OC192 10-Gb/s SONET-compliant jitter characteristics of the CDR were verified on-wafer with a membrane probe card and with a jitter analyzer from Anritsu. Phase-noise characteristics have also been measured to confirm the CDR’s sub-picosecond rms jitter performance. Measured 10-Gb/s maximum receiver sensitivity de-embedded after losses in the test fixture is 4.5 mV at a bit-error rate (BER) of at the demultiplexer (DEMUX) output.
In Section II, the binary CDR architecture used in the receiver is briefly analyzed as compared to a linear-type CDR and design method to meet SONET jitter requirements is presented. Then in Section III, the full receiver architecture and the building blocks implementation details are discussed. In Section IV, the IC die fabrication features are described. Finally, in Section V, measured results are presented.
Manuscript received April 17, 2000; revised June 29, 2000.
Y. M. Greshishchev and P. Schvan are with Nortel Networks, Ottawa, ON K1Y 4H7, Canada (e-mail: greshy@nortelnetworks.com).
J. L. Showell was with Nortel Networks and is currently with Quake Technologies Inc., Ottawa, ON, Canada.
M.-L. Xu was with Nortel Networks, Ottawa, ON K1Y 4H7, Canada. He is now with Conexant Systems, San Diego, CA.
J. J. Ojha was with Nortel Networks, Ottawa, ON K1Y 4H7, Canada. He is now with Caspian Networks, Palo Alto, CA (e-mail: jojha@caspiannetworks.com).
J. E. Rogers is with The University of Toronto, Toronto, ON M5S 3G4, Canada.
Publisher Item Identifier S 0018-9200(00)09475-0.
II. BINARY CDR IN SONET RECEIVER
A. Binary CDR Versus Linear CDR
The CDR published in [5] uses a linear-type PLL approach [Fig. 2(a)], while the CDR presented here is based on a binary PLL [Fig. 2(b)]. In the binary PLL, a binary Alexander-type [7] phase detector (PD) is used as compared to the Hogge-type PD [8] in a linear-type PLL. Examples of using binary architecture in optical receiver ICs can be found in [9], [10]. Binary PD produces two digital outputs, UP and DOWN, to signal if the data is early or late with respect to the VCO clock. To control the VCO, the binary information is split into two loops as suggested in [11]. The phase-control loop is formed with the UP and DOWN
0018-9200/00$10.00 © 2000 IEEE
1950 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 35, NO. 12, DECEMBER 2000
Fig. 3. CDR analytical jitter tolerance as compared to SONET mask.
TABLE I
COMPARISON OF LINEAR AND BINARY TYPE CDR ARCHITECTURES
shown in Fig. 4. The minimum value for the frequency step is determined by Jitter Tolerance minimum bandwidth requirements (4 MHz); the maximum value is limited by jitter generation (10 ps is recommended [2]). In the design presented here, ps was assumed. To reduce jitter generation, the delay should be minimized. This makes the ring-type VCO preferable in the binary CDR as compared to LC-tank based VCO where tuning delay is larger due to the usually higher Q-factor of the LC-tank.
III. RECEIVER ARCHITECTURE
A. Architecture
The receiver architecture is shown in Fig. 5. It combines an AGC and a binary CDR with a 1 : 8 demultiplexer and a PRBS generator for self-testing. The receiver recovers 10-Gb/s data and a 10-GHz clock, and produces eight demultiplexed 1.25-Gb/s CML data outputs with a 1.25-GHz clock. The PRBS generator allows functional testing of the CDR and subsequent circuits. A PRBS clock (CLK) is required for testing. In the test mode, the PRBS output is enabled to drive the CDR data bus. In the receiver mode, the AGC output is enabled. The recovered 10-Gb/s data and 10-GHz clock appear at the recovered data (DATA REC BUS) and clock (CLK REC BUS) buses, driving a 1 : 8 DEMUX circuit. A clock signal can also be supplied externally (CLKx) for data recovery operation only.
B. AGC
The block diagram of the AGC is shown in Fig. 6. The AGC has total linear gain range from 3 to 20 dB with a maximum input of 1.7 V. The AGC has two variable gain stages implemented with output current steering in a differential pair [13]. Stage has a fixed gain and also provides open collector transmission line interface to drive the CDR data bus. To alleviate conflicting requirements for large data swing and low noise figure at low input amplitudes, the AGC has two gain ranges: −7 to 7 dB (low gain range) and 7 to 20 dB (high gain range). Two differential pairs with “large” and “small” emitter degeneration resistors are used to switch the gain ranges. AGC-measured S11 is better than 15 dB in a frequency range up to 10 GHz, noise figure 13.5 dB. The ac bandwidth is adjustable in a range of 8–10 GHz.
C. CDR
As compared to original version of binary CDR [7], [11], in the CDR presented here, the data decision and clock recovery processes are split. This allows for independent optimization of data decision threshold (slicing) without affecting clock recovery process. There are four decision channels in the CDR, all driven by the CDR data bus. Channels 1 and 2 are identical decision circuits, as shown in the block diagram of Fig. 7(a). The additional decision channel allows operation with two different input slicing levels. The data decision threshold is set by a differential slicer circuit based on an emitter follower [Fig. 7(b)]. In a long-haul receiver application, a high-performance limiting amplifier [2 in Fig. 7(a)] is required. Note that the AGC stabilizes only a long-time averaged amplitude measured at AGC output with a peak detector (not shown in Figs. 5 and 6). The limiting amplifier stage was designed for 40-dB gain with bandwidth of more than 16 GHz and input AM to output PM conversion less than 1 ps in 20-dB input dynamic range. Two 20-dB gain-limiting amplifier stages similar to [14] were employed. The digital sampler is based on a master–slave–master (MSM) flip-flop
TABLE II
TRUTH TABLE OF BINARY PHASE DETECTOR
the PLL controls the recovered clock phase via the input of
the VCO. The frequency loop includes the charge pump and an
external integration capacitor (pins C1 and C2).
The VCO is a ring oscillator type with an architecture shown
in Fig. 9. A mixer-type delay cell is used to control the oscil-
lation frequency. The mixer cell is split into a fine-tune (for
the internal frequency loop) and a coarse-tune (to compensate
for process variation). Care was taken to provide symmetrical
bang-bang frequency steps with respect to the tri-state. All of the
VCO control inputs were implemented with the high-impedance pMOS buffers. A pMOS-based charge pump was employed [5].
Fig. 10. 1 : 8 demultiplexer block diagram.
D. 1 : 8 Demultiplexer
The 1 : 8 DEMUX (Fig. 10) is similar in architecture to
the design presented in [15]. Seven 1 : 2 demultiplexer cir-
cuits are cascaded, with each stage optimized for the clock
frequency required. Each 1 : 2 demultiplexer consists of a
master–slave–master flip-flop to capture the lead bit on the pos-
itive edge of the clock and a master–slave flip-flop to capture
the second bit using the negative edge of the clock. Utilizing
an extra latch in the MSM flip-flop ensures that the 1 : 2 data
outputs are aligned for further processing. The frequency of the
incoming CDR clock is divided by two at each demultiplexer
stage with a delay equal to the data delay in the 1 : 2 block.
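The cascade of 1 : 2 stages can be illustrated with a small behavioral sketch. It models only the bit steering and the clock division, not the flip-flop timing or the output alignment provided by the MSM latch; the function names are hypothetical, and the 10-GHz input clock is the figure given for the recovered clock in the text.

```python
def demux_1to2(stream):
    """Split a serial stream into its even-indexed and odd-indexed bits."""
    return stream[0::2], stream[1::2]


def demux_1to8(stream):
    """Build a 1:8 demultiplexer as a three-level tree of 1:2 stages.

    Returns the eight output streams and the number of 1:2 stages used.
    """
    level = [stream]
    stages = 0
    for _ in range(3):  # 1 -> 2 -> 4 -> 8 streams
        next_level = []
        for s in level:
            first, second = demux_1to2(s)
            next_level += [first, second]
            stages += 1
        level = next_level
    return level, stages


# Label each incoming bit with its position to see where it ends up
outputs, stage_count = demux_1to8(list(range(16)))
first_bits = [o[0] for o in outputs]

# Clock frequency after each tree level, starting from the 10-GHz CDR clock
clock_ghz = [10.0 / 2 ** k for k in range(4)]
```

The tree uses 1 + 2 + 4 = 7 stages, and three successive divisions by two take the 10-GHz clock to 1.25 GHz at the outputs. In this simple model the outputs appear in bit-reversed order (0, 4, 2, 6, 1, 5, 3, 7); how the actual IC orders its outputs is not stated in the text.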
Fig. 11. 2 0 1 PRBS generator block diagram.
E. Built-in PRBS Generator
The PRBS generator (Fig. 11) was implemented using the standard polynomial equation in a parallel form. The parallel form avoids the necessity of distributing a 10-GHz clock, as would be required if using a shift-register-type PRBS.
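The difference between a shift-register-type PRBS and a parallel form can be sketched as follows. The polynomial used in the IC is not recoverable from the text, so the short PRBS7 polynomial x^7 + x^6 + 1 is assumed here purely for illustration, as are the function names and the word width of eight bits. The serial version produces one bit per clock; the parallel version produces eight bits per clock from the same recurrence, so it can run at one eighth of the bit rate.

```python
def prbs7_serial(state, n):
    """Shift-register (serial) PRBS7, x^7 + x^6 + 1: one bit per clock.

    state holds the last seven bits, newest in bit 0. Assumed polynomial,
    chosen only to keep the example short.
    """
    out = []
    for _ in range(n):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | new_bit) & 0x7F
        out.append(new_bit)
    return out, state


def prbs7_parallel8(state):
    """Parallel PRBS7: eight consecutive bits per clock (look-ahead XORs)."""
    s = [(state >> k) & 1 for k in range(7)]  # s[0] newest, s[6] oldest
    w = [0] * 8
    w[0] = s[6] ^ s[5]
    w[1] = s[5] ^ s[4]
    w[2] = s[4] ^ s[3]
    w[3] = s[3] ^ s[2]
    w[4] = s[2] ^ s[1]
    w[5] = s[1] ^ s[0]
    w[6] = s[0] ^ w[0]  # depends on a bit generated in the same word
    w[7] = w[0] ^ w[1]
    new_state = 0
    for bit in w[1:]:  # keep the last seven bits, newest in bit 0
        new_state = (new_state << 1) | bit
    return w, new_state


# Compare 64 serial bits with eight parallel words from the same seed
SEED = 0x02
serial_bits, _ = prbs7_serial(SEED, 64)
parallel_bits, st = [], SEED
for _ in range(8):
    word, st = prbs7_parallel8(st)
    parallel_bits += word
```

Both generators yield the same bit sequence; in hardware the parallel word would then be serialized by a multiplexer, which is why no full-rate clock needs to be distributed inside the generator itself.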
F. CDR Simulation
Because of the nonlinear jitter response of a binary CDR, hierarchical numerical analysis was an important part of the receiver IC design. Four levels of PLL analysis were carried out: analytical, behavioral, schematic level, and post-layout extracted circuit with distributed parasitics. The last three levels are HSPICE-based. A behavioral library of linear and digital components was developed. Analytical models of jitter transfer and jitter tolerance are based on simplified binary-PLL theory, as described above.
IV. FABRICATION
The receiver IC was implemented in IBM’s SiGe technology ( GHz, GHz). The microphotograph of the die is shown in Fig. 12. The die size is mm . The major circuit building blocks were not only integrated into the receiver, but were implemented as individual IC components and tested. In the receiver IC, the building blocks were physically partitioned with a transmission line circuit and layout isolation interface similar to that presented in [5], [14]. Separate power supply systems with digital and analog grounds were routed.
V. EXPERIMENTAL RESULTS
The IC worked at first implementation with the VCO oscillation frequency 10% lower than simulated. The receiver IC was mounted into a microwave test fixture (Fig. 13) and was tested at 9.1 Gb/s (VCO oscillation frequency limit, see footnote 2) in a CDR mode and up to 12.5 Gb/s in a data-recovery mode or in internal PRBS test mode with an external clock. The carrier substrate size of cm in the test fixture was defined by the perimeter required for mounting I/O connectors in the housing metal box. A large number of required I/O were used for testing purposes. The receiver IC does not require external components, except decoupling and integration capacitors mounted beside the die. The recovered clock and data eye diagrams at 9.1 Gb/s are shown in Fig. 14. The IC input sensitivity is less than 4.5 mV at measured at the 1 : 8 demultiplexer output with AGC gain set to 20 dB (data eye closure in the test fixture was de-embedded). DATA : 8 transition distortion apparent in Fig. 14 is due to long ribbon cable attached to the test fixture demultiplexed outputs. The receiver consumes in a mission mode 4.5 W from 5 V.
Fig. 14. SiGe receiver eye diagrams measured in CDR mode at 9.1 Gb/s. Input data: 80 mV p PRBS 2 0 1.
2 Maximum oscillation frequency can be easily corrected by removing one delay stage in the VCO design of Fig. 9.
The CDR IC performance was fully characterized at 10 Gb/s on-wafer with a probe card. The die micrograph of the CDR is shown in Fig. 15. It consists of an exact copy of the receiver CDR layout plus the output buffers located in the DEMUX partition. In all of the measurements, input data were supplied single-ended while unused differential input was terminated with 50 Ω. The CDR typical eye diagrams measured with 20 mV PRBS data are shown in Fig. 16. The input sensitivity was measured to be 14 mV at as compared to 13.4 mV simulated considering thermal and shot noise in the decision channel.
Phase noise of the recovered clock was measured with an HP 4352B as a power spectral density (Fig. 17). 10-Gb/s input data were supplied with amplitude of 100 mV and
GRESHISHCHEV et al.: Fully Integrated SiGe Receiver IC 1955
Fig. 15. SiGe CDR IC die micrograph.
Fig. 17. Phase-noise comparison of the CDR recovered clock, free running VCO and data pattern generator (BERT) clock.
Fig. 18. CDR jitter transfer.
Fig. 19. CDR jitter tolerance. Performance measured with the clock reference level modulation test marked with symbol .
PRBS pattern. For comparison, phase noise of the free-running VCO and the data-pattern generator was also measured and shown on the same plot. As expected in a high performance CDR, the recovered clock-phase noise follows, with no error, the data-reference clock noise down to the CDR jitter noise floor at −110 dBc/Hz. Similar recovered phase noise was achieved in the CDR design with a linear PLL and LC-type VCO [5]. Numerically integrated phase noise of the recovered clock gives jitter RMS value of 0.78 ps. Phase noise was found to be PRBS pattern independent up to a pattern of .
The OC192 jitter compliant performance (at 9.95328 Gb/s) was verified with a jitter analyzer MX177 701 from Anritsu. Jitter generation (in 80-MHz bandwidth) was measured to be 5.4 ps and 0.8 ps RMS as compared to 10 ps or 1 ps RMS recommended by Bellcore [2]. The RMS jitter is very close to the 0.78-ps value obtained in the phase-noise measurement.
Jitter transfer measurement (Fig. 18) showed, as predicted by modeling, single-pole-like characteristics with no jitter peaking. Jitter tolerance (Fig. 19) has a very wide safety margin for SONET mask with a minimum of 40 ps (15 ps is recommended). The shape of measured CDR jitter tolerance response differs from the modeled response in Fig. 3 because of test setup limitations. This is seen from the measured jitter tolerance of the test setup (BERT) with no CDR in the data path (shown in the same plot of Fig. 19). Only in the frequency range of 40 kHz–2 MHz is the measured jitter tolerance determined by CDR performance. In this frequency range, measured and modeled jitter tolerance coincide. The upper frequency range of the jitter tolerance response was also remeasured with a different method, based on the reference voltage (see Fig. 5)
1956 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 35, NO. 12, DECEMBER 2000
modulation with a sine-wave signal. Minimum jitter tolerance of 40 ps was measured. Both jitter transfer and jitter tolerance response were found to be PRBS pattern independent. The IC demonstrated a 60-MHz frequency range of robust PLL locking and operation even at the input signals well below the sensitivity level.
ACKNOWLEDGMENT
The authors thank their colleagues S. Szilagyi for the microwave test fixture design, and D. Marchesan and Dr. S. Voinigescu for useful discussions and distributed components modeling. Special thanks to R. Hadaway for his directions and to IBM Corporation for fabrication.
REFERENCES
[1] B. Beggs, “GaAs HBT 10-Gb/s Product,” in 1999 IEEE MTT-S Int. Microwave Symp. Workshop, Anaheim, CA, June 13–19, 1999.
[2] SONET OC-192, “Transport system generic criteria,” Bellcore, GR-1377-CORE, no. 4, Mar. 1998.
[3] R. C. Walker et al., “A 10-Gb/s Si-bipolar Tx/Rx chipset for computer data transmission,” in ISSCC Dig. Tech. Papers, Feb. 1998, pp. 302–303.
[4] T. Morikawa et al., “A SiGe single-chip 3.3-V receiver IC for 10-Gb/s optical communication systems,” in ISSCC Dig. Tech. Papers, Feb. 1999, pp. 380–381.
[5] Y. Greshishchev and P. Schvan, “SiGe clock and data recovery IC with linear-type PLL for 10-Gb/s SONET application,” in Proc. 1999 Bipolar/BiCMOS Circuits and Technology Meeting, Sept. 1999, pp. 169–172.
[6] Y. M. Greshishchev, P. Schvan, J. L. Showell, M.-L. Xu, J. J. Ojha, and J. E. Rogers, “A fully integrated SiGe receiver IC for 10-Gb/s data rate,” in ISSCC Dig. Tech. Papers, Feb. 2000, pp. 52–53.
[7] J. D. H. Alexander, “Clock recovery from random binary signals,” Electron. Lett., vol. 11, pp. 541–542, Oct. 1975.
[8] C. R. Hogge, “A self-correcting clock recovery circuit,” J. Lightwave Technology, vol. 3, pp. 1312–1314, Dec. 1985.
[9] J. Hauenschild et al., “A two-chip receiver for short-haul links up to 3.5-Gb/s with PIN-preamp module and CDR-MUX,” in ISSCC Dig. Tech. Papers, Feb. 1998, pp. 308–309.
[10] J. Hauenschild et al., “A plastic packaged 10-Gb/s biCMOS clock and data recovering 1 : 4-demultiplexer with external VCO,” IEEE J. Solid-State Circuits, vol. 31, pp. 2056–2059, Dec. 1996.
[11] R. C. Walker et al., “A two-chip 1.5-GBd serial link interface,” IEEE J. Solid-State Circuits, vol. 27, pp. 1805–1811, Dec. 1992.
[12] R. Steele, Delta Modulation Systems. New York/Toronto: Wiley, 1975.
[13] M. Soda, T. Suzaki, and T. Morikawa et al., “A Si bipolar chip set for 10-Gb/s optical receiver,” in ISSCC Dig. Tech. Papers, Feb. 1992, pp. 100–101.
[14] Y. Greshishchev and P. Schvan, “60-dB gain 55-dB dynamic range 10-Gb/s SiGe HBT limiting amplifier,” IEEE J. Solid-State Circuits, vol. 34, pp. 1914–1920, Dec. 1999.
[15] L. I. Anderson et al., “Silicon bipolar chipset for SONET/SDH 10-Gb/s fiber-optic communication links,” IEEE J. Solid-State Circuits, vol. 30, pp. 210–218, Mar. 1995.
Peter Schvan (M’89) was born in Budapest, Hungary, in 1952. He received the M.S. degree in physics from Eötvös Loránd University, Budapest, in 1975 and the Ph.D. degree in electrical engineering from Carleton University, Ottawa, Ontario, Canada, in 1985. In 1985, he joined Nortel Networks, Ottawa, where he started working in the area of BiCMOS and bipolar technology development, yield prediction, device characterization, and modeling. Recently, his work has been extended to the design of multi-gigabit circuits and systems. He is currently Senior Manager of a group responsible for evaluating various high-performance technologies and demonstrating advanced circuit concepts required for fiber optic communication systems. He has authored and co-authored numerous publications.
Jonathan L. Showell (S’90–M’95) received the B.Eng and M.Eng degrees in engineering physics from McMaster University, Hamilton, ON, Canada, in 1990 and 1994, respectively. He joined Nortel Networks, Ottawa, Canada, in 1994, working on hot carrier injection reliability of CMOS devices. Later he became a member of the Technology Access and Applications Group where his responsibilities included accurate high-frequency analog (up to 110 GHz) and digital (40 Gb/s) measurements and the design of high-speed 10- to 40-Gb/s, multiplexer/demultiplexer circuits in SiGe HBT and InP HBT technologies, respectively. Recently, he joined Quake Technologies, Ottawa, Canada, working on the design of chip sets for high-speed datacom applications. His interests include high-speed technologies, circuit design for high-speed communications, and accurate high-frequency measurements.
Mu-Liang Xu (M’00), biography not available at time of publication.
GRESHISHCHEV et al.: Fully Integrated SiGe Receiver IC 1957
Jugnu J. Ojha (M’00) received the B.Eng. degree from Dalhousie University and the Technical University of Nova Scotia in 1987. He received the M.Sc. and Ph.D. degrees from McMaster University, Hamilton, Ontario, Canada, in 1990 and 1994, respectively. His graduate work involved research in electronic and optoelectronic devices, as well as optoelectronic properties of semiconductors. He was with Nortel Networks, Ottawa, Ontario, Canada, from 1994 to 2000, where he worked on a wide range of technologies, including design of circuits for 10 and 40 Gb/s optical transmission systems using SiGe and InP HBTs. He also led a program in MEMS technology, with a focus on optical applications, including optical crossconnects. His other activities included next-generation optical network development, as well as research on optical properties of SiGe materials and devices. He recently joined Caspian Networks in Palo Alto, CA, as Senior Advisor in Optical Networking.
Jonathan E. Rogers (S’00) received the B.A.Sc degree in engineering sciences (electrical option) from the University of Toronto, Ontario, Canada, in 1999. He is currently working toward the M.A.Sc in electronics at the University of Toronto. His area of research is clock and data recovery systems in deep sub-micron CMOS. In May, 1997, he joined Nortel, Ottawa, Ontario, for a 16-month internship, where he performed clock and data recovery system characterization, VCO design, and high-speed measurements on SiGe MMICs under the guidance of Dr. Y. Greshishchev.
1120 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 37, NO. 9, SEPTEMBER 2002
data time slot, while the chain samples the data zero crossings. The combinatory logic block, with inputs from the , , and latch chains, determines if the clock is early or late with respect to the incoming data transition. This logic generates the UP–DOWN control signal for the VCO.
and are generated by decision circuits (respectively, after two and four latches). The phase difference between and is 180° (one bit). is generated after three latches with an inverted clock. If the clock phase is correct, is always in the middle of and . EXOR ( ) and NAND ( ) gate combinatory logic converts , , and to the UP–DOWN pulses for controlling the VCO. The logic equations are as follows.
UP
DOWN
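A minimal sketch of this kind of early/late decision, assuming the textbook Alexander (bang-bang) formulation with three consecutive samples. The names `a`, `t`, `b`, the helper `sample`, and the EARLY/LATE/HOLD labels are illustrative assumptions and are not taken from the circuit described here, whose gate-level equations may differ.

```python
def alexander_pd(a, t, b):
    """Early/late decision from three consecutive samples (assumed naming).

    a: data sample taken in the middle of the previous bit
    t: sample taken at the nominal data zero crossing
    b: data sample taken in the middle of the current bit
    """
    if a == b:
        return "HOLD"   # no data transition, so no phase information
    if t == a:
        return "EARLY"  # crossing sample still shows the old bit
    return "LATE"       # crossing sample already shows the new bit


def sample(bits, t_ui):
    """Ideal NRZ waveform: bit k occupies the interval [k, k+1) in unit intervals."""
    return bits[int(t_ui)]


def run_detector(bits, clock_offset_ui):
    """Apply the detector to every bit boundary; requires |offset| < 0.5 UI."""
    decisions = []
    for n in range(len(bits) - 1):
        a = sample(bits, n + 0.5 + clock_offset_ui)
        t = sample(bits, n + 1.0 + clock_offset_ui)
        b = sample(bits, n + 1.5 + clock_offset_ui)
        decisions.append(alexander_pd(a, t, b))
    return decisions


if __name__ == "__main__":
    pattern = [0, 1, 1, 0, 1, 0, 0, 1]
    print(run_detector(pattern, +0.2))  # clock lags the data
    print(run_detector(pattern, -0.2))  # clock leads the data
```

How EARLY and LATE map onto the UP and DOWN pulses depends on the VCO tuning polarity, so that mapping is deliberately left out of the sketch.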
Fig. 5. Buffer amplifier cascaded TAS–TIS architecture and transistor level schematic.
electrical 10-Gb/s 2 PRBS NRZ signal in an optical link experiment. An optical sensitivity of −29.5 dBm is measured at 10 BER.
REFERENCES
[1] R. Yu, R. Pierson, P. Zampardi, K. Runga, A. Campana, D. Meeker, K. C. Wang, A. Peterson, and J. Bowers, “Packaged clock recovery integrated circuits for 40-Gb/s optical communication links,” in GaAs IC Symp. Tech. Dig., 1996, pp. 129–132.
[2] M. Wurzer, J. Bock, H. Knapp, W. Zirwas, F. Schumann, and A. Felder, “A 40-Gb/s integrated clock and data recovery circuit in a 50-GHz $f_T$ silicon bipolar technology,” IEEE J. Solid-State Circuits, vol. 34, pp. 1320–1324, Sept. 1999.
[3] J. Hauenschild, C. Dorschky, T. W. von Mohrenfels, and R. Seitz, “A plastic packaged 10-Gb/s BiCMOS clock and data recovering 1 : 4 demultiplexer with external VCO,” IEEE J. Solid-State Circuits, vol. 31, pp. 2056–2059, Dec. 1996.
[4] M. Reinhold, C. Dorschky, R. Pullela, E. Rose, P. Mayer, P. Paschke, Y. Baeyens, J. P. Mattia, and F. Kunz, “A fully integrated 40-Gb/s clock and data recovery/1 : 4 DEMUX IC in SiGe technology,” IEEE J. Solid-State Circuits, vol. 36, pp. 1937–1945, Dec. 2001.
[5] J. D. H. Alexander, “Clock recovery from random binary signals,” Electron. Lett., vol. 11, pp. 541–542, 1975.
[6] H.-M. Rein, “Design considerations for very high speed Si-bipolar ICs operating up to 50 Gb/s,” IEEE J. Solid-State Circuits, vol. 31, pp. 1076–1090, Aug. 1996.
[7] M. Sokolich, D. Doctor, Y. Brown, A. Kramer, J. Jensen, W. Stanchina, S. Thomas, C. Fields, D. Ahmari, M. Liu, R. Martinez, and J. Duvall, “A low power 52.9-GHz static divider implemented in a manufacturable 180-GHz InAlAs/InGaAs HBT IC technology,” in GaAs IC Symp. Tech. Dig., 1998, pp. 117–120.
[8] G. Georgiou, P. Paschke, R. Kopf, R. Hamm, R. Ryan, A. Tate, J. Burm, C. Schullien, and Y.-K. Chen, “High gain limiting amplifier for 10-Gb/s lightwave receivers,” in Proc. 11th Int. Conf. InP and Related Materials, 1999, pp. 71–74.
Young-Kai Chen (S’78–M’86–SM’94–F’98) received the B.S.E.E. degree from National Chiao Tung University, Hsinchu, Taiwan, R.O.C., the M.S.E.E. degree from Syracuse University, Syracuse, NY, and the Ph.D. degree from Cornell University, Ithaca, NY, in 1988. From 1980 to 1985, he was a Member of Technical Staff in the Electronics Laboratory of the General Electric Company, Syracuse, responsible for the design of silicon and GaAs MMICs for phase array applications. Since 1988, he has been with Lucent Technologies, Bell Laboratories, Murray Hill, NJ, as a Member of Technical Staff. Since 1994, he has been the Director of the High Speed Electronics Research Department. He is also an Adjunct Associate Professor at Columbia University, New York, NY. His research interest is in high-speed semiconductor devices and circuits for wireless and fiber-optic communications. He has authored more than 90 technical papers and holds nine patents in the field of high-frequency electronic and semiconductor lasers. Dr. Chen is a member of the American Physical Society and the Optical Society of America.
Alan H. Gnauck (M’98–SM’00) received the B.S. degree in physics and the M.S. degree in electrical engineering from Rutgers University, New Brunswick, NJ, in 1975 and 1986, respectively. In 1982, he joined AT&T (now Lucent Technologies) Bell Laboratories. He has designed and built multigigabit amplifiers, multiplexers, demultiplexers, and optical receivers, and performed record-breaking optical transmission experiments at single-channel rates of from 2 to 40 Gb/s. He has investigated coherent detection, chromatic-dispersion compensation techniques, CATV hybrid fiber-coax architectures, wavelength-division-multiplexed (WDM) systems, and system impacts of fiber nonlinearities. His WDM transmission experiments include the first demonstration of terabit transmission. He is a Technical Committee Member of the Optical Fiber Communications Conference (OFC) 2003. He holds twelve patents in optical fiber communications. His current research interests include the study of WDM systems with single-channel rates of 40 Gb/s. Dr. Gnauck is an Associate Editor for IEEE PHOTONICS TECHNOLOGY LETTERS.
Mario Reinhold was born in Mülheim/Ruhr, Germany, in 1972. He received the Dipl.-Ing. degree in electrical engineering from the Ruhr University, Bochum, Germany, in 1998. He joined the Optical Networking Group, Lucent Technologies, Nürnberg, Germany, in 1998, where his activities focused on the development of various analog and digital high-speed bipolar ICs for 40-Gb/s and advanced 10-Gb/s fiber-optic communication systems. Since 2001, he has been with CoreOptics Inc., Nürnberg, working on a next-generation 40-Gb/s chipset.
John-Paul Mattia received the B.S., M.S.E.E., and Ph.D. degrees in electrical engineering and computer science from the Massachusetts Institute of Technology (MIT), Cambridge. He began working in high-speed electronics at MIT Lincoln Laboratory in 1989. In 1996, he joined Texas Instruments Incorporated in the DSP R&D organization. From 1997 to 2000, he worked in the High-Speed Electronics Group, Lucent Technologies, Bell Labs, designing and testing circuits for lightwave communication systems. Since July 2000, he has been with Big Bear Networks, Sunnyvale, CA, where he is Chief Technical Officer of Electronics.
Fig. 5. DLL to generate all 90° phase shifted sampling clocks with high accuracy.
Fig. 7. VCO schematic.
Fig. 10. Maximum data rate versus pseudorandom sequence length for error-free receiving during time of measurement (complies with error rate smaller than $1 \times 10^{-11}$).
Fig. 11. VCO clock output and data output eye pattern at 1 Gb/s with a ($2^{15}-1$)-bit length pseudorandom input sequence.
IX. CONCLUSION
Complete on-chip clock and data recovery at 1 Gb/s is feasible with a standard 0.5-µm CMOS technology. On-chip clock is only 500 MHz in this case. Data are directly demultiplexed one to two in the retiming flip-flops. A multiplexer to regenerate the original data stream was included for measurement purposes only. In applications, serial-to-parallel conversion will normally follow the PLL. In that case, the halved clock frequency is an advantage, because the following blocks can be designed more easily.
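The one-to-two demultiplexing in the retiming flip-flops can be pictured with a small behavioral sketch. It assumes that even-numbered bits are captured on one edge of the half-rate clock and odd-numbered bits on the other; the function names are hypothetical and nothing here models the actual circuit timing.

```python
def demux_1_to_2(bits):
    """Split a serial stream into two half-rate streams (even and odd bits)."""
    even = bits[0::2]  # assumed: captured on one edge of the half-rate clock
    odd = bits[1::2]   # assumed: captured on the opposite edge
    return even, odd


def mux_2_to_1(even, odd):
    """Re-interleave the two half-rate streams into the original bit order."""
    out = []
    for i, bit in enumerate(even):
        out.append(bit)
        if i < len(odd):
            out.append(odd[i])
    return out


def half_rate_clock_hz(bit_rate_bps):
    """Clock frequency needed when both clock edges sample the data."""
    return bit_rate_bps / 2.0


if __name__ == "__main__":
    stream = [1, 0, 1, 1, 0]
    even, odd = demux_1_to_2(stream)
    print(even, odd, mux_2_to_1(even, odd))
    print(half_rate_clock_hz(1e9))
```

In a system that continues with serial-to-parallel conversion, the two half-rate streams would be used directly and the re-multiplexing step would be dropped.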
Abstract: An integrated 10 Gb/s clock and data recovery (CDR) circuit is fabricated using SiGe technology. It consists of a linear-type phase-locked loop (PLL) based on a single-edge version of the Hogge phase detector, a LC-tank voltage-controlled oscillator (VCO) and a tri-state charge pump. A PLL equivalent model and design method to meet SONET jitter requirements are presented. The CDR was tested at 9.529 Gb/s in full operation and up to 13.25 Gb/s in data recovery mode. Sensitivity is 14 mVpp at a bit error rate (BER) = $10^{-9}$. The measured recovered clock jitter is less than 1 ps rms. The IC dissipates 1.5 W with a 5-V power supply.
Index Terms: Charge pump, clock and data recovery (CDR), jitter generation, jitter tolerance, jitter transfer, phase detector, phase-locked loop (PLL), SONET, VCO.
I. INTRODUCTION
(VCO) frequency (that causes jitter) during a long run of data 0's or 1's. Second, charge-pump and VCO control circuits were designed to provide a high degree of PLL filter isolation, or low charge-pump offset current, in a tri-state. In addition, the charge pump has a high output impedance necessary for high loop gain in a PLL with passive filter. Third, the original Hogge PD was modified to provide a single-edge operation and to extend linear phase range. Fourth, circuit and layout cross-talk isolation techniques similar to those presented in [6] are employed to prevent jitter generation and sensitivity degradation due to a cross-talk. The CDR IC was implemented in IBM’s SiGe bipolar process which includes pMOS devices.
Jitter characteristics of a LPLL depend to a large degree on the PLL filter parameters. An LPLL equivalent model and design method to satisfy SONET requirements are presented in
(3)
where is 3-dB bandwidth of the PLL jitter transfer function in Hz, is VCO sensitivity in Hz/V, is the loop natural frequency, and is the average data transition density factor (maximum for 0101 pattern, for PRBS pattern). In (3) the charge pump current is doubled, compared to the Gardner's formula [9], since in a Hogge PD a current variation of corresponds to -radians of the data phase. Both the bandwidth and the damping factor are
(5)
For practical filter parameters and expected maximum time interval with no data transitions, the voltage variation across capacitor is negligible compared to the voltage step . The phase jitter is proportional to the number of consecutive 0's or 1's in the data, as follows:
[ps] [MHz] (6)
2 The amount of errors is defined with 1-dB receiver input power penalty.
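To illustrate how the loop natural frequency and the damping factor together set the 3-dB bandwidth and the jitter peaking, here is a minimal sketch. It assumes the textbook second-order closed-loop response $H(s) = (2\zeta\omega_n s + \omega_n^2)/(s^2 + 2\zeta\omega_n s + \omega_n^2)$ commonly used for charge-pump PLLs with a series RC filter; the function names and the example values are illustrative and are not taken from this paper's equations.

```python
import math


def closed_loop_gain(f_hz, fn_hz, zeta):
    """|H(j*2*pi*f)| for the assumed textbook second-order PLL response."""
    x = f_hz / fn_hz
    num = 1.0 + (2.0 * zeta * x) ** 2
    den = (1.0 - x * x) ** 2 + (2.0 * zeta * x) ** 2
    return math.sqrt(num / den)


def bandwidth_3db(fn_hz, zeta):
    """Closed-form 3-dB bandwidth of the same response."""
    a = 1.0 + 2.0 * zeta ** 2
    return fn_hz * math.sqrt(a + math.sqrt(a * a + 1.0))


def peaking_db(fn_hz, zeta, points=4000):
    """Jitter peaking in dB, found by a logarithmic sweep around fn."""
    best = 1.0
    for i in range(points + 1):
        f = fn_hz * 10.0 ** (-3.0 + 6.0 * i / points)
        best = max(best, closed_loop_gain(f, fn_hz, zeta))
    return 20.0 * math.log10(best)


if __name__ == "__main__":
    fn = 0.4e6  # hypothetical natural frequency in Hz
    for zeta in (1.0, 4.0, 5.0, 6.0):
        print(zeta, bandwidth_3db(fn, zeta), peaking_db(fn, zeta))
```

Under this model a large damping factor pushes the 3-dB bandwidth toward roughly $2\zeta$ times the natural frequency while the peaking shrinks, which is the usual reason for designing SONET-type loops to be heavily overdamped.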
GRESHISHCHEV AND SCHVAN: CLOCK AND DATA RECOVERY IC FOR SONET APPLICATION 1355
The charge pump (Fig. 8) employs a well known current-switching technique with the addition of a common-mode feedback amplifier . Care was taken to achieve unconditional stability in the feedback with sufficient gain and with a small value of capacitance in Fig. 8. Small is necessary for low jitter peaking in the PLL. The charge-pump output differential current , as accounted for by the model in Fig. 2.
The charge pump is in a tri-state when both differential inputs and are switched into a low (or high) state. A mismatch between charge-pump current sources, , their finite output resistances, and the VCO control input current cause an offset current in the tri-state. Fig. 9 shows a plot of the PLL jitter due to relative offset current in the tri-state as calculated from (7) for . Single-edge phase detection, employed in the CDR, requires half the value compared to the double-edge Hogge-type PD. The top current sources and the feedback amplifier were designed with pMOS transistors. Appropriate matching was achieved by sizing the critical components and using symmetrical layout. To increase the charge-pump output impedance, cascode current sources were employed. The measured offset was less than 0.2%.
E. Cross-Talk Isolation
Two differential output buffers ( in Fig. 1) provide adjustable differential voltage swing up to 1 V. The buffers are physically separated from the VCO and PD with transmission line interfaces to prevent jitter generation due to cross-talk via substrate and common grounds. The VCO is also separated from the PD with similar transmission line interface. All of the blocks have separate power-supply systems routed according to the isolation and analog–digital ground splitting techniques described in [6]. All of the CDR circuits are fully differential. The 10 Gb/s inputs and outputs are terminated on-chip with 50-Ω resistors.
IV. SIMULATION
Five levels of hierarchical PLL analysis were carried out: analytical, behavioral linear, behavioral mixed-mode, circuit schematic level, and post layout with distributed parasitics. The last four levels are HSPICE-based. A mixed-mode behavioral library of linear and digital components was developed. All levels of simulation give consistent results, with increasing
Fig. 12. CDR 9.529-Gb/s eye diagrams and the recovered clock. Input data 30 mV, 2 0 1 PRBS pattern.
Fig. 13. CDR data recovery eye diagrams at 13.25 Gb/s. Input data 14 mV, 2 0 1 PRBS pattern. The RECCLK waveform is the PRBS generator reference clock translated by CDR.
Fig. 14. Phase-noise comparison of the CDR recovered clock, free-running VCO, and data pattern generator reference clock.
reference clock. Therefore jitter generated by the CDR is estimated to be 0.78 ps rms. Phase noise was measured more accurately with a HP4352B phase-noise meter (Fig. 14). Recovered clock phase noise follows, with no error, the data reference clock noise down to the CDR jitter noise floor at −110 dBc/Hz. The noise floor is reached within the bandwidth of the loop (designed to be above 4 MHz). Numerically integrated phase noise of the recovered clock in 80 MHz bandwidth gives a jitter value of 0.77 ps rms. Jitter was found to be independent of the PRBS word length up to . The IC dissipates 1.5 W with a 5-V power supply.
VII. CONCLUSION
In this paper, a low-jitter integrated CDR with a linear-type PLL has been demonstrated. The PLL equivalent model and design method to meet SONET jitter requirements were presented. The IC was implemented in SiGe technology. Sub-picosecond rms jitter with no jitter dependence on data PRBS pattern is achieved. Jitter generation factors in CDR were considered. A single-edge version of the Hogge-type PD and a tri-state charge pump were designed to satisfy jitter requirements. PMOS transistor circuits and cross-talk isolation technique were used to improve CDR jitter performance. In a second-order LPLL a bandwidth of more than 4 MHz and a damping factor of 4–6 at minimum expected data transition density are recommended to satisfy OC192 jitter tolerance and jitter transfer peaking requirements. To satisfy jitter transfer bandwidth ( 120 kHz), additional low-pass filtering of the recovered clock must be performed, for instance, in the PLL of a transmitter circuit.
APPENDIX A
The CDR jitter transfer function is similar by definition to the PLL phase transfer function . For the second-order charge-pump PLL of Fig. 2, the phase transfer function is [9]
(A.1)
(A.2)
(A.3)
(A.4)
APPENDIX B
The following PLL filter components are found by solving (1)–(3):
(B.1)
(B.2)
ACKNOWLEDGMENT
The authors thank C. Kelly and P. Popescu for discussions, J. E. Rogers for his contributions to layout design and simulations, M.-L. Xu for help with the output buffer layout, J. Showell for assistance with the measurements, Dr. S. Voinigescu and D. Marchesan for their expertise in SiGe components modeling, and Dr. M. Copeland for advice on VCO phase noise analyses. Special thanks to R. Hadaway for his support and to IBM corporation for fabrication.
REFERENCES
[1] T. Morikawa et al., “A SiGe single-chip 3.3 V receiver IC for 10Gb/s optical communication systems,” in ISSCC Dig. Tech. Papers, Feb. 1999, pp. 380–381.
[2] R. C. Walker et al., “A 10Gb/s Si-bipolar Tx/Rx chipset for computer data transmission,” in ISSCC Dig. Tech. Papers, Feb. 1998, pp. 302–303.
[3] Y. Greshishchev and P. Schvan, “SiGe clock and data recovery IC with linear-type PLL for 10 Gb/s SONET application,” in Proc. 1999 Bipolar/BiCMOS circuits and Technology Meeting, Sept. 1999, pp. 169–172.
[4] B. Razavi, “Design of monolithic phase-locked loops and clock recovery circuits - A tutorial,” in Monolithic Phase-Locked Loops and Clock Recovery Circuits: Theory and Design, B. Razavi, Ed. New York, NY: IEEE Press, 1996, pp. 405–420.
[5] K. Kishine, N. Ishihara, K. Takiguchi, and H. Ichino, “A 2.5-Gb/s clock and data recovery IC with tunable jitter characteristics for use in LANs and WANs,” IEEE J. Solid-State Circuits, vol. 34, pp. 805–812, June 1999.
[6] Y. Greshishchev and P. Schvan, “60 dB gain 55 dB dynamic range 10Gb/s SiGe HBT limiting amplifier,” IEEE J. Solid-State Circuits, vol. 34, pp. 1914–1920, Dec. 1999.
[7] L. De Vito, “A versatile clock recovery architecture and monolithic implementation,” in Monolithic Phase-Locked Loops and Clock Recovery Circuits: Theory and Design, B. Razavi, Ed. New York, NY: IEEE Press, 1996, pp. 405–42.
[8] “SONET OC-192 Transport System Generic Criteria,” Bellcore, GR-1377-CORE, Mar. 1998.
[9] F. M. Gardner, “Charge-pump phase-lock loops,” IEEE Trans. Commun., vol. COM-28, pp. 1849–1858, Nov. 1980.
[10] C. R. Hogge, “A self-correcting clock recovery circuit,” IEEE J. Lightwave Technol., vol. 3, pp. 1312–1314, Dec. 1985.
[11] B. Jansen, K. Negus, and D. Lee, “Silicon bipolar VCO family for 1.1 to 2.2 GHz with fully-integrated tank and tuning circuits,” in ISSCC Dig. Tech. Papers, Feb. 1997, pp. 392–393.
[12] J. D. Cressler, “SiGe HBT technology: A new contender for Si-based RF and microwave circuit applications,” IEEE Trans. Microwave Theory Tech., vol. 46, pp. 572–589, May 1998.
Peter Schvan (M’89) was born in Budapest, Hungary, in 1952. He received the M.S. degree in physics from Eötvös Loránd University, Budapest, in 1975 and the Ph.D. degree in electrical engineering from Carleton University, Ottawa, ON, Canada, in 1985. In 1985, he joined Nortel Networks, Ottawa, ON, Canada, where he worked in the area of BiCMOS and bipolar technology development, yield prediction, device characterization, and modeling. Recently, his work has been extended to the design of multigigabit circuits and systems. He is currently Senior Manager of a group responsible for evaluating various high-performance technologies and demonstrating advanced circuit concepts required for fiberoptic communication systems. He is the author or coauthor of numerous publications.
Abstract: A general-purpose phase-locked loop (PLL) with programmable bit rates is presented demonstrating that large frequency tuning range, large power supply range, and low jitter can be achieved simultaneously. The clock recovery architecture uses phase selection for automatic initial frequency capture. The large period jitter of conventional phase selection is eliminated through feedback phase selection. Digital control sequencing of the feedback enables accurate phase interpolation without the traditional need of analog circuitry. Circuit techniques enabling low-$V_{dd}$ operation of a PLL with differential delay stages are presented. Measurements show a PLL frequency range of 1–200 MHz at $V_{dd}$ = 1.2 V linearly increasing to 2–1600 MHz at $V_{dd}$ = 2.5 V, achieved in a standard process technology without low threshold voltage devices. Correct operation has been verified down to $V_{dd}$ = 0.9 V, but the lower limit of differential operation with improved supply-noise rejection is estimated to be 1.1 V.
Index Terms: Frequency locked loops, frequency synthesizers, phase comparators, phase jitter, phase locked loops, phase noise, synchronization.
I. INTRODUCTION
design is a large digital circuit incorporating a PLL-based clock generator with low-jitter requirements, which is the most common mixed-mode design today. Digital style PLL’s have been suggested, e.g., [3], but these cannot compete with the supply-noise rejection of differential analog circuitry.
A clock recovery PLL architecture suitable for programmable bit rates is developed in Sections II and III with emphasis on jitter reduction. Sections IV–VI present PLL circuit techniques that use the noise resistant differential pair but avoid other “expensive” (in terms of headroom) analog circuitry, such that low-$V_{dd}$ operation is enabled in a standard digital CMOS process without the need of low-threshold devices.
II. LOW-JITTER PHASE-SELECTING CLOCK RECOVERY
A basic PLL for clock recovery is shown in Fig. 1(a). In most CMOS implementations, the VCO must have a tuning range covering more than 50% of the target frequency to guarantee high yield over large process variations. This
Fig. 1. Clock recovery PLL’s. (a) Standard, (b) phase selection, and (c) feedback phase selection. ChP+F denotes charge pump + loop filter.
Loop A to suppress VCO jitter [4], for example, jitter induced by power-supply noise. At the same time, a low bandwidth can be used in Loop B to reduce jitter transfer. This cannot be achieved by the PLL in Fig. 1(a), which has a single loop with conflicting design goals regarding loop bandwidth.
A disadvantage of a phase-selecting PLL is the phase step that is generated when the Ctrl signal in Fig. 1(b) switches to a new clock phase. This phase switching leads to large cycle-to-cycle jitter (greater than or equal to the phase spacing) that can actually dominate the peak-to-peak jitter. By increasing the number of phases, the phase spacing will be smaller with less jitter. More phases can be generated by having more delay stages in the VCO, but this limits the speed. An alternative is phase interpolation that enables a large number of phases without degrading the VCO speed [8], [9]. However, interpolators add analog circuitry to the design and are prone to mismatch, which in the worst case can lead to nonmonotonic phase spacing.
A proposed remedy for the jitter due to phase steps is shown in Fig. 1(c). Instead of selecting a clock phase feeding the sampling flip-flop and the phase detector, the feedback clock in
III. AVERAGING PHASE INTERPOLATION
The smoothing effect of the loop filter can also be used for phase interpolation. If the Ctrl signal in Fig. 1(c) alternates between two different clock phases every second cycle of the reference clock, the result will be a VCO clock phase corresponding to the average of the two selected phases. In the test chip, four levels of averaging phase interpolation were implemented by circulating through four clock cycles and in each clock cycle selecting phase or as the feedback clock. A quarter phase interpolation generating is then achieved by selecting for three consecutive clock cycles, then selecting for the fourth cycle and repeating this sequence.
The architecture in Fig. 1(c) lends itself naturally to combining both averaging phase interpolation and standard current-mode interpolation. A test chip was built in a 0.25-µm, 2.5-V digital CMOS process to evaluate the jitter performance of the phase selection architecture. A block diagram of the implemented VCO and phase control circuitry is shown in Fig. 2. The phase select control code at the input consists of seven bits, of which two are directly fed to a finite state machine (FSM) that generates control signals for realizing the averaging interpolation. The remaining five bits of the control code represent from which the code for is generated by adding one. The FSM controls Mux1 to select one of the codes representing and in a four clock period repetitive cycle, as described above. The five bits at the output of Mux1 are split into three bits coarse select and two bits fine select. The three coarse bits select two neighboring phases from a four-stage differential VCO having eight evenly spaced output phases and send these two phases to a current-mode interpolator. Mux2/Mux3 in Fig. 2 receive one coarse bit each, and the third coarse bit is used to conditionally invert the output signals. The interpolator is similar to the Type-I circuit in [9] and is controlled by a four-bit temperature code derived from the two fine select bits. Both the current-mode interpolation and the averaging phase interpolation are programmable in the test chip and can be disabled. The two complementary multiplexers at the output
LARSSON: 2–1600-MHz CMOS CLOCK RECOVERY PLL 1953
Fig. 5. Bias generation and one VCO delay stage of (a) replica bias scheme and (b) diode clamping.
Good 1/f noise performance has been shown, but their power-supply noise rejection is inferior to that of the standard analog differential pair since they lack a high-impedance source, making the delay depend on $V_{dd}$. Therefore, the analog style differential pair is preferable in applications where power-supply noise is the main source of oscillator jitter. When a differential pair with resistor loads is used as a delay cell in a VCO, the frequency is regulated by changing the tail current as implemented by the control voltage in Fig. 5(a). To achieve a large frequency tuning range, it is desirable that the output swing and common mode do not change significantly with frequency. Often the replica-bias scheme in Fig. 5(a) is employed, which relies on good matching between a replica of the delay stage (devices Mr1–Mr3) and the VCO delay stages to set the VCO output swing from to , giving a known common mode and swing independent of the speed-regulating current. A disadvantage of this technique is that the PMOS load (Mr3) will operate as a current source at low frequencies, introducing high gain in the replica feedback loop. To prevent instability, a large compensation capacitor is required, which introduces another pole in the PLL, leading to more intricate design. Furthermore, the amplifier in the replica bias loop requires additional headroom, thereby prohibiting low-$V_{dd}$ operation.
Fig. 5(b) shows a structure that achieves the good power-supply noise rejection of the analog differential pair, at the same time enabling low-$V_{dd}$ operation. The PMOS diodes are used for clamping the output voltage to a minimum level of , giving a fixed common mode and swing without the need for a replica bias circuit. This makes the VCO suitable for a wide range of operating frequencies and supply voltages. To guarantee clamping action, the NMOS tail current must be larger than the current through the controlled PMOS load . Furthermore, a proposed design goal for low oscillator noise suggests that the rise and fall times of the output nodes should be made equal [16]. This is achieved by reflecting half of to each of the controlled PMOS loads by the current mirror formed by devices Md1–Md3. Assuming that Md4 recently turned on, “node a” will be discharged by a current of . At the same time, the complementary output node is pulled to by a current equal to , indicating equal rise and fall times. A disadvantage of this oscillator is the additional parasitic capacitance of the diodes, which makes the maximum operating frequency lower than that of the replica bias structure. The additional gate capacitance of the diode loads can be eliminated by using NMOS diodes [17].
The minimum supply voltage for the VCO is , which has been verified by measurements down to $V_{dd}$ = 0.9 V. However, at this value of $V_{dd}$ the VCO is no longer differential. An estimate of the minimum $V_{dd}$ for differential operation can be derived from the simulated VCO waveforms in Fig. 6. The VCO output swings from down to approximately . For differential operation, it is required that both NMOS devices in the differential pair (Md4, Md5) are turned ON at the crossover point of the waveforms. Assuming a drop of over the current source device Md6 leads to a minimum input voltage of . At the lowest limit of $V_{dd}$, this input voltage is generated by the previous stage in the oscillator, indicating a minimum $V_{dd}$ of . Measurements determined and to be 0.53 and 0.85 V, respectively, indicating a minimum $V_{dd}$ of about 1.1 V assuming a of 0.1 V. Note that this is a theoretical number, since the differential operation of the VCO has zero tuning range at this value of $V_{dd}$. Good power-supply rejection can also be achieved by the regulated-supply structure in [18]. However, the requirement of a large decoupling capacitor generates contradicting design goals on PLL bandwidth.
V. CHARGE PUMP
A. Bandwidth and Peaking Compensation
To reduce peak-to-peak jitter due to VCO noise, it is advantageous to keep as high a PLL bandwidth as possible. Traditional worst case design would keep the PLL bandwidth and damping factor sufficiently far away from stability limits under all variations of the input reference frequency, the
LARSSON: 2–1600-MHz CMOS CLOCK RECOVERY PLL 1955
Fig. 9. (a) Charge-pump suffering from charge sharing (Type A). (b) Charge removal transistors eliminate charge sharing (Type B).

(1)

where is a fixed reference current. This is realized by the current multiplier in Fig. 7, which generates the charge-pump current by letting the individual bits of control binary-weighted current sources.

The simulated jitter transfer function of a standard PLL in Fig. 8(a) demonstrates the change of loop parameters as is altered. The damping factor is intentionally set low to show its dependence on The measured jitter transfer function of Loop A in Fig. 8(b) shows the desired independence of The slight deviation of the curves is caused by transistor mismatch in the current multiplier.

B. Charge Sharing

A common problem of many charge pumps is charge sharing. For the charge pump in Fig. 9(a) (Type A), charge sharing is caused by the parasitic capacitance in nodes pcs and ncs [21]. When is active, node pcs is charged to When deactivating some of the charge stored in node pcs will leak through the current source device. Since the parasitics of nodes ncs and pcs can never be matched, this will lead to a static phase offset, as shown in Fig. 10(a). This is the transfer function of a phase-frequency detector followed by a Type A charge pump. The two transistors Mp and Mn in the Type B charge pump in Fig. 9(b) will remove the charge from the nodes pcs and ncs when Up and Down are deactivated [22]. This leads to a large reduction in the phase offset, as shown in Fig. 10(a).

For this application, static phase offset in Loop A is not critical. However, when analyzing the cause of phase offset, a source of increased jitter is revealed. Fig. 10(b) indicates that the leakage from node pcs is larger than that from ncs. When the PLL is locked, the leakage mismatch is compensated for by activating earlier than , giving a phase offset. Since the compensation charge is applied in the early portion of the charge-pump activation time, it will cause voltage ripple on
1956 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 34, NO. 12, DECEMBER 1999
Fig. 10. Characteristics of the Type A and B charge pumps. (a) Transfer function of PFD followed by charge pump. (b) Simulated I_Up and I_Down when net output charge is zero.

Fig. 11. Evolution of loop filter. (a) Ideal model, (b) MOS-only implementation, (c) improved resistor linearity for low-V_DD operation, (d) improved capacitor linearity, and (e) final model where C3 models the well-to-substrate capacitance.
shown in Fig. 11(e), where C3 is the parasitic well-to-substrate capacitance of the MOS capacitor. This filter has an impedance of

(3)

which is a close approximation to the impedance of the original filter in Fig. 11(a), given as

(4)

when as is common design practice [20].

VII. PHASE-FREQUENCY DETECTOR

Phase detectors may exhibit a dead zone, resulting in enlarged jitter. A common design technique to avoid a dead zone is to make sure that both Up and Down output signals are fully activated before shutting them both off. This is implemented by generating a reset signal with an AND operation of Up and Down output and introducing a delay before feeding back this signal to reset the phase detector. It is this reset delay that causes the simultaneous I_Up and I_Down in Fig. 10(b). If the charge sharing in the charge pump is not perfectly cancelled or if there is a mismatch of I_Up and I_Down, there will always be some current compensation, leading to phase offset and loop filter ripple, as discussed in Section V. A longer reset delay results in a longer period during which the VCO is running at a different frequency due to the compensation current. Therefore, the reset delay should be minimized under the constraint that it has to be longer than the response time of the PFD with some additional design margin to avoid a dead zone.

Fig. 13. Phase-frequency detector used in Loop A with details of Up section.

A PFD with low logic depth is shown in Fig. 13, including details of the Up section. Its operation is easiest to analyze by assuming an initial state of This implies that and that is precharged high. A rising edge on discharges and sets without changing the state of the RS flip-flop. The internal weak feedback in the path will assure that is kept active even if falls. At the next rising edge on V, is activated, which sets This triggers the RS flip-flop to precharge high, which shuts off ; and, at the same time, is deactivated in a similar way.

In summary, a positive edge on sets which is reset by the next positive edge on This behavior is identical to the two classical PFD’s implemented by either four RS flip-flops or two resettable D flip-flops. The precharged gate and the shorter logic depth of this implementation make the delay shorter than for the standard PFD’s. This allows a smaller delay in the reset path for eliminating the dead zone, such that loop filter ripple will be reduced and generate less noise. An additional benefit of low logic depth is a reduction in phase detector jitter caused by power-supply-dependent delays and device noise.

The reset delay of this PFD can be further reduced by letting the signal directly reset the precharged gate simultaneously as the RS flip-flop is reset. This technique was not adopted in order to keep a conservative design, guaranteeing operation with no dead zone. Similar precharged gates have previously been used in PFD designs [27]–[29].

VIII. FREQUENCY DIVIDER

To enable high flexibility, the frequency divider in Fig. 2 is a fully programmable ( ) divider. The structure in [30] based on a clock-gated dual-modulus prescaler followed by a counter was chosen to achieve high speed at low supply voltages. The divider was realized in standard static CMOS logic, reaching a maximum operating frequency of 800 MHz in simulations of worst case slow process variation at V and C This exceeded the simulated speed limit of the VCO. The potential startup deadlock in [30] was eliminated by logic that prohibits two consecutive clock pulse removals.

IX. PLL OPERATING RANGE AND JITTER

The maximum operating frequency of the PLL measured at room temperature is plotted in Fig. 14 as a function of V_DD. Simulations indicate that the speed is limited by the VCO. A minimum V_DD of 0.9 V agrees well with the measured V. At low power-supply voltages, the speed cannot compare with high-end circuits using standard However, the operating frequency range exceeds that of low-voltage circuit implementations [2], [3], [25], [26]. The maximum speed also compares favorably with another low-voltage PLL based on a low-threshold process [18].

With a PLL bandwidth of 2 MHz, the tracking jitter is 5.2 ps rms at 1200 MHz, as shown in Fig. 15(a). This measurement represents the standard deviation of the delay between a
(5)
TABLE I. PLL JITTER FOR V_DD = 2.5 V MEASURED WITH A PLL BANDWIDTH OF 2 MHz

TABLE II. PLL CHARACTERISTICS
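The dead-zone avoidance described in Section VII (both outputs fully activated, then cleared by a delayed AND-gate reset) can be illustrated with a small behavioral model. This is only a sketch of a generic three-state PFD, not the precharged-gate circuit of Fig. 13. The function names `pfd_pulses`, `net_charge`, and `static_offset`, the argument names, and all numeric values are assumptions chosen for illustration.

```python
def pfd_pulses(t_ref, t_vco, reset_delay):
    """Return (up_width, down_width) in seconds for one pair of rising edges."""
    # The AND of Up and Down goes high when the later edge arrives; the
    # delayed reset clears both outputs reset_delay seconds afterwards.
    t_reset = max(t_ref, t_vco) + reset_delay
    return t_reset - t_ref, t_reset - t_vco


def net_charge(t_ref, t_vco, reset_delay, i_up, i_down):
    """Charge delivered to the loop filter in one comparison cycle (coulombs)."""
    up_width, down_width = pfd_pulses(t_ref, t_vco, reset_delay)
    return i_up * up_width - i_down * down_width


def static_offset(reset_delay, i_up, i_down):
    """Edge offset t_vco - t_ref (seconds) at which the net charge is zero."""
    if i_up > i_down:
        # Down has to start early and stay on longer to cancel the surplus Up current.
        return -reset_delay * (i_up - i_down) / i_down
    return reset_delay * (i_down - i_up) / i_up


if __name__ == "__main__":
    # Hypothetical 5% current mismatch, swept over three reset delays.
    for delay in (100e-12, 200e-12, 400e-12):
        offset = static_offset(delay, 105e-6, 100e-6)
        print(f"reset delay {delay * 1e12:.0f} ps -> offset {offset * 1e12:.1f} ps")
```

In this model both pulses are never shorter than the reset delay, so there is no dead zone, and the static offset needed to cancel a current mismatch grows in proportion to the reset delay. That is consistent with the argument above for keeping the reset delay as short as the PFD response time allows.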
Abstract— A 2.5-Gb/s monolithic clock and data recovery (CDR) IC using the phase-locked loop (PLL) technique is fabricated using Si bipolar technology. The output jitter characteristics of the CDR can be controlled through the loop-gain design and by using the switched-filter PLL technique. The CDR IC can be used in local-area networks (LAN’s) and in long-haul backbone networks or wide-area networks (WAN’s). Its power consumption is only 0.4 W. For LAN’s, the jitter generation of the CDR when the loop gain is optimized is 1.2 ps (0.003 UI). The jitter characteristics of the CDR optimized for WAN’s meet all three types of STM-16 jitter specifications given in ITU-T Recommendation G.958. This is the first report on a CDR that can be used for both LAN’s and WAN’s. This paper also describes the design method of the jitter characteristics of the CDR for LAN’s and WAN’s.

Index Terms—Clock and data recovery (CDR), IC, jitter suppression, local-area network (LAN), low jitter, phase-locked loop (PLL), transmission receiver, wide-area network (WAN), 2.5 Gb/s.

Manuscript received August 19, 1998; revised February 8, 1999.
K. Kishine and H. Ichino are with NTT Network Innovation Laboratories, Yokosuka, Kanagawa 239-0847 Japan.
N. Ishihara is with NTT Opto-electronics Laboratories, Atsugi-shi, Kanagawa 243-01 Japan.
K. Takiguchi is with NTT Electronics Corp., Atsugi-shi, Kanagawa Pref. 243-0032 Japan.
Publisher Item Identifier S 0018-9200(99)04198-0.
0018–9200/99$10.00 © 1999 IEEE

I. INTRODUCTION

OPTICAL communication systems, which are used in local-area networks (LAN’s) and wide-area networks (WAN’s), are expected to play an important role in realizing the future multimedia society. These systems must be compact, economical to produce, and efficient in terms of power consumption. Given these requirements, researchers have been developing low-power and small-size optical receiver/sender (OR/OS) modules. A clock and data recovery (CDR) circuit is one of the key components of the OR, which must have retiming, reshape, regeneration (3R) operation. To ensure that the receivers have low power consumption and are cost-effective and compact, it is essential to employ a single-chip, adjustment-free CDR IC using the phase-locked loop (PLL) technique without any high- components.

A number of approaches have been proposed for developing a CDR IC using the PLL technique [1], [2]. Generally, the jitter specifications for the CDR differ depending on what it is being used for, and jitter suppression is an especially serious problem for the CDR-IC design. There are different jitter specifications for the following two applications.

Case 1) LAN’s such as gigabit/second ethernets, fiber channels, and other optical interconnections. They use a single span of a transmission medium.

Case 2) Backbone networks or WAN’s such as synchronous digital hierarchy (SDH) or synchronous optical network. They use line regenerators to transport information over long distances.

For case 1), the CDR must suppress mainly the jitter generated due to noise in the CDR, so-called jitter generation. In case 1), there is no jitter accumulation due to cascaded regenerators. For case 2), however, the ITU-T G.958 recommendation for SDH stipulates other specifications [3]: a) jitter transfer specification, which is the criterion of the suppression of the noise in input signals to line regenerators, and b) jitter tolerance specification.

This paper describes a 2.5-Gb/s CDR that can be used in both cases, which eliminates the need to fabricate two chips with different characteristics. The key design techniques are based on the switched-filter (SF) PLL technique and loop gain adjustment using a gain control amplifier (GCA) circuit. The CDR IC is fabricated using 0.5-µm Si bipolar technology. The loop gain and loop bandwidth can be adjusted using a control signal from outside the chip. For case 1), the rms jitter generation of the CDR can be reduced to only 1.2 ps, and the capture range is 150 MHz. For case 2), the jitter of the CDR meets the jitter specifications of the ITU-T G.958 recommendation. The rms jitter generation is 3.6 ps, and the capture range is 50 MHz. The power consumption of the CDR for both cases is only 0.4 W.

In Section II, the concept of the suppression of jitter in each case is discussed. It is explained that the SF PLL technique can be used in the CDR for both cases. Design details and the configurations of the circuits of the CDR are given in Section III. Section IV discusses the experimental results, which show that the CDR has very good jitter characteristics, and discusses the feasibility of using the CDR for various transmission systems.

II. CONCEPT FOR JITTER-SUPPRESSION DESIGN

A. Jitter Characteristics of CDR Using the PLL Technique

Generally, output jitter of a CDR based on the PLL technique can be caused by two kinds of sources: 1) additive noise that accompanies the input signal [Fig. 1(a)] and 2)
806 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 34, NO. 6, JUNE 1999
Fig. 1. Noise source (a) in input signal and (b) in PLL.

Fig. 2. Loop-gain dependence of jitter.

(1)

where the quantities involved are the power spectral density of the noise, the input signal amplitude, the natural angular frequency, the damping factor, and the loop gain. When the loop gain becomes larger, the jitter becomes larger, as shown in the Appendix. This means that smaller loop gain causes narrow noise bandwidth, thereby suppressing jitter. It should be noted that smaller loop gain leads to a smaller cutoff frequency of the jitter transfer function of a PLL. On the other hand, in order to suppress the jitter caused by noise generated in the CDR circuit, the operation of the CDR circuit needs to be made stable. This stability can be obtained by reducing the signal fluctuation in the CDR circuit caused by the input of consecutive data bits, device noise, and so on. This is the so-called suppression of jitter generation, which is specified for SDH in ITU-T Recommendation G.958. The output jitter (in rad) in this case (jitter generation) is expressed as [4]

(2)

where the noise is again described by its power spectral density. This equation is derived assuming that the instantaneous frequency deviation of the VCO output is caused by disturbance due to random phase noise. In this equation, when the loop gain becomes larger, the jitter becomes smaller, as shown in the Appendix. In other words, the jitter increases as the loop gain decreases. Larger loop gain can reduce the jitter caused by the noise in

B. Design of the CDR

Given the previous discussion, it is clear that there should be two types of CDR design, one for LAN’s and another for WAN’s.

1) CDR for LAN’s: In the case for LAN’s, the jitter from input signals is small because there is no jitter accumulation through the short and single laser-fiber-receiver span. We can therefore concentrate on reducing the jitter generation, which is caused by the input-signal-pattern dependence of the circuit, the fluctuation of the supply voltage, and device noise in the CDR. As described in Section II-A, this design should not utilize smaller loop gain to lower the cutoff frequency, but instead should utilize larger loop gain to achieve smaller output jitter.

2) CDR for WAN’s: In the case for backbone networks or WAN’s, regenerators may be cascaded in order to transport information over long distances, causing the jitter to accumulate. Therefore, not only the jitter generation of the CDR has to be taken into consideration but also the jitter transfer characteristics, which is the criterion of suppression of noise in input data signals as given in ITU-T Recommendation G.958. There is a tradeoff between reducing the jitter generation and reducing the cutoff frequency of the jitter transfer function.

a) Jitter transfer: The loop gain of the CDR IC using the PLL technique, on an IC whose jitter transfer specifications meet those of ITU-T Recommendation G.958, must be designed to be lower. The jitter transfer function of the 2.5-Gbit/s PLL using a lag-lead filter can be expressed by substituting the
KISHINE et al.: CDR IC FOR LAN’S AND WAN’S 807
B. GCA

A current-bypass GCA circuit is used in the CDR (see Fig. 10). The gain of the GCA, which can be controlled from outside the chip, can be varied from 40 to 0 dB. To lower the jitter generation of the CDR, the gain should be higher. On the other hand, to achieve the lower cutoff frequency of the jitter transfer curve, it must be lower. Therefore, gain should be adjusted according to the jitter specification in each case.

C. Delay Circuit

To reduce jitter generation, the edge-inclined circuit in the 90°-delay block shown in Fig. 4, which includes a capacitor for delay control [5], is replaced by a chain of emitter-coupled logic (ECL) buffer circuits without capacitors, the delay of which can be adjusted from about 100 to 300 ps from outside the chip. The edge-inclined circuit was employed to make the 156-Mbit/s CDR smaller. But the delay needed for the 2.5-Gbit/s CDR is only 200 ps, which is much smaller than that needed for the 156-Mbit/s CDR. Therefore, only a small number of ECL circuits are needed for the 200-ps delay, and the delay circuit itself is relatively small. In addition, in the ECL circuit, there is no input-data-pattern dependence of the response in the edge-inclined circuit capacitor. When the ECL delay circuit is used, the simulated jitter due to the input data pattern is about 80% of that when an edge-inclined-delay circuit is used. As a result of using the ECL circuit, the jitter due to the input pattern effect is more suppressed than in the circuit reported in our previous work [5].

D. Loop Filter

The lag-lead filter is used as the loop filter, and the RC time constant is adjusted for each use. An additive capacitor outside the chip is not needed when the CDR is used for short-distance transmission systems. It is, however, needed for long-distance use.

E. Other Considerations

Furthermore, in order to guarantee jitter tolerance, it is important to maintain an optimum timing adjustment between the extracted clock and the input data. This timing adjustment is attained by allowing the clock to trigger the center of the data period by means of the phase-comparator (PC) output

IV. EXPERIMENTAL RESULTS

A new chip was fabricated using the 0.5-µm super self-aligned process technology Si bipolar process [7]. It was
Fig. 12. Output waveforms of the CDR for LAN. (a) Data output. (b) Clock output.

Fig. 13. Output waveforms of the CDR for WAN. (a) Data output. (b) Clock output.
mounted in a 7 × 7 mm square ceramic package. The CDR IC was evaluated at both high and low loop gain, that is, with the gain adjusted for short-distance and for long-distance transmission systems. Jitter was measured with a commercial jitter analyzer. An rms jitter-generation value from which the jitter value of input data is subtracted can be obtained with the analyzer.

A. SF CDR for LAN

The internal capacitor in the loop filter is 10 pF, and an external capacitor is not needed in this case. The GCA gain dependence of the jitter generation is shown in Fig. 11. The jitter generation decreases as the gain increases, and the lowest point is when the gain is larger than about 8 dB. The loop gain at this point is 1.2 10 (1/s). The output waveforms when the loop gain is set to that point and input data is 2.48832 Gb/s are shown in Fig. 12. The eye opening of the output data was sufficiently wide, and clock extraction was very precise. The rms jitter generation is 1.2 ps and the capture range is over 150 MHz.

B. SF CDR for WAN

To lower the cutoff frequency of the jitter transfer curve, the external capacitor for the loop filter of 0.1 µF is added. The loop gain is set to about 2 10 (1/s), which is the loop gain when the jitter transfer curve meets the ITU-T jitter transfer specification in Fig. 3.

The output waveforms when the loop gain is set to the point above are shown in Fig. 13. Again, the eye opening of the output data was sufficiently wide, and clock extraction was very precise. The rms jitter generation is 3.6 ps, which is larger than that of the CDR when its loop gain is adjusted for short-distance transmission systems, but is smaller than the specification of the jitter generation of 4.0 ps (for STM-16; ITU-T Recommendation G.958). The capture range is over 50 MHz. Fig. 14 shows the measured jitter transfer function of the CDR in this case. The curve meets the ITU-T G.958 specification. Fig. 15 shows the jitter tolerance curve when the input jitter magnification is 120% of the ITU-T specification. The squares indicate error-free operation (where the error rate is lower than 10 ). The rms jitter generation, jitter tolerance, and jitter transfer function all meet the jitter specifications in ITU-T G.958.

Fig. 15. Jitter tolerance curve.

The relationship between the measured jitter generation (or cutoff frequency) and the loop gain in this experiment is shown in Fig. 16. The darker shaded area is for the CDR, whose jitter characteristics meet the specifications of ITU-T Recommendation G.958. In the area of larger loop gain, the jitter generation becomes small. Fig. 16 shows clearly that, when its loop gain is optimized, the CDR IC is suitable for both LAN’s and WAN’s. The capture range of both types of CDR’s is wide enough to cover the deviation in the free-running frequency due to changes in temperature (ranging from 5 to 90 °C). In addition, the power consumption (including that of the I/O circuit) in both cases is less than 35% of that in the 2.5-Gb/s PLL’s reported previously [1], [2].

V. CONCLUSION

The design method of the CDR for both LAN’s and WAN’s is presented. A new 2.5-Gb/s SF monolithic CDR IC using the 0.5-µm Si bipolar process has been developed. The CDR IC can be used in the transmission receivers for both LAN’s and WAN’s. The rms jitter generation of the CDR adjusted for LAN’s is 1.2 ps. Furthermore, the jitter characteristics of the CDR for backbone networks or WAN’s meet the specifications for STM-16 given in ITU-T Recommendation G.958. In addition, the power consumption of the CDR is only 0.4 W.

APPENDIX

A. Loop-Gain Dependence of the Jitter Shown in Fig. 1

The jitter due to the noise in the input signal to the PLL is expressed as (1). When the loop filter is a lag-lead type (with a series resistor, a shunt resistor, and a shunt capacitance), the natural angular frequency and the damping factor are expressed as

[7] C. Yamaguchi, Y. Kobayashi, M. Miyake, K. Ishii, and H. Ichino, “A 0.5-µm bipolar technology using a new base formation method,” in Proc. BCTM, 1993, pp. 63–66.
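The Appendix relates the loop gain to the natural angular frequency and damping factor of a PLL with a lag-lead loop filter. As a hedged illustration, the sketch below uses the standard textbook expressions for a second-order PLL with a passive lag-lead filter; these are an assumption and are not taken from the paper's own equations. The function name `lag_lead_loop_params`, the symbol names `K`, `R1`, `R2`, `C`, and the component values are hypothetical.

```python
import math


def lag_lead_loop_params(K, R1, R2, C):
    """Natural angular frequency (rad/s) and damping factor of a second-order PLL.

    Assumes a passive lag-lead filter F(s) = (1 + s*tau2) / (1 + s*(tau1 + tau2))
    with tau1 = R1*C (series resistor) and tau2 = R2*C (shunt resistor),
    and a loop gain K in 1/s.
    """
    tau1 = R1 * C
    tau2 = R2 * C
    omega_n = math.sqrt(K / (tau1 + tau2))
    zeta = 0.5 * omega_n * (tau2 + 1.0 / K)
    return omega_n, zeta


if __name__ == "__main__":
    # Hypothetical component values; only the trend with loop gain matters here.
    for K in (1e5, 1e6, 1e7):
        omega_n, zeta = lag_lead_loop_params(K, R1=10e3, R2=1e3, C=1e-9)
        print(f"K = {K:.0e} 1/s: omega_n = {omega_n:.3e} rad/s, zeta = {zeta:.3f}")
```

Under these assumptions the natural frequency grows with the square root of the loop gain. A lower loop gain therefore narrows the loop bandwidth and lowers the cutoff frequency of the jitter transfer function, which is the tradeoff against jitter generation discussed in Section II.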
I. INTRODUCTION
the received data by the clock/data recovery block (CDR), and
available from 0.18-µm CMOS transistors. In the quadrature sampling paths, where metastability and hysteresis are a concern, extra buffering is used. Also, the input latch of the first flip-flop is increased in size in order to improve performance. All samples enter the logic on the same clock edge simplifying the early/late logic. The phase detector pulses are retimed at the output of the phase detector in order to remove asymmetries in both amplitude and duration from the output pulses. Note that a drawback to these modifications is unequal loading of the in-phase and quadrature clock lines. Thus, care must be taken to avoid a static offset in the phase detector as this could cause a reduction in the residual jitter tolerance.

Note that 1 : 2 demultiplexed data could be tapped off directly from early/late logic inputs A and B of Fig. 4. However, a separate 1 : 2 demux was used for the testchip to minimize loading of the phase detector latches, at the expense of possible phase alignment errors at the demux and a slight increase (10 mW) in power consumption.

III. CIRCUIT DESCRIPTION

The transistor and block level design of the 10 Gb/s CDR circuits are described in the following sections. The implementation of the phase detector is examined first, followed by an in-depth description of the LC delay line VCO.

A. Phase Detector

The phase detector logic is implemented in resistively-loaded MOS current mode logic (MCML). This offers several advantages over conventional CMOS and other all-NMOS implementations. First, it has a relatively low output impedance making it suitable for high-speed operation. MCML logic also benefits from reduced logic voltage swing as well as from the elimination of lower mobility PMOS transistors compared to CMOS logic. The MOS equivalent of a bipolar ECL gate is not practical, especially from a 1.8-V supply due to attenuation of the signal by source followers and lack of headroom.

Another benefit of using MCML is reduced switching-related supply noise, due to the relatively constant current drawn from the power supply. For improved supply rejection, the gain stages and output buffers of the ring oscillator are implemented as MCML inverter/buffers. An additional benefit of this simple design is that the clocked phase detector elements can interface with the ring oscillator without level shifting or swing adjustment.

The first goal of this design is to create a buffer which has the widest possible bandwidth, while still having enough gain. A minimum value of approximately 2 for the small-signal gain was chosen, otherwise the gate noise margin becomes unacceptable. Biasing of the circuit so that the large-signal switching speed approaches maximum performance is now considered.

First, an appropriate voltage swing ( ) is selected. The voltage swing is made as large as possible, without forcing the switching transistors into the triode region at any time during the cycle. Larger drain-to-gate capacitance of the MOSFET in the triode region limits the switching speed. Note that gain is (first-order) dependent on the voltage swing, but
1784 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 37, NO. 12, DECEMBER 2002
Fig. 9. NMOS varactor. (a) Varactor structure. (b) Varactor C–V curve.

TABLE II. MEASURED VCO PERFORMANCE

TABLE III. CDR SUMMARY
developed so that a design iteration is avoided, thereby allowing the prototype to be fully characterized. The method is shown schematically in Fig. 11, where a metal plate is placed in close proximity to the delay line using a micro-manipulator. Current induced in the metal plate reduces the self-inductance of the delay lines. Since delay between oscillator stages is proportional to line inductance, the frequency increases when the inductance is lowered. However, the plate must be placed within approximately 10 µm of the IC surface ( m from Fig. 11) to be effective. Conductivity of the metal plate is important (i.e., gold or a similar metal is used), as resistive losses actually increase the signal delay and slow down the oscillator. An unwanted secondary effect is additional interwinding capacitance that results from placing another conductor in close proximity to the delay line, which acts to reduce the oscillation frequency. The inductive effect dominates, however, with the net result that the center frequency is adjustable from 4.45 to 5.5 GHz with negligible effect on phase noise.

The oscilloscope eye pattern of Fig. 12 is measured in response to 10-Gb/s PRBS input data (2 –1). Error-free data recovery at 10 Gb/s was measured in BER tests with a recovered clock jitter of 1.2 ps rms, or 8 ps p-p. The 5-Gb/s output data eye has larger jitter than the clock due to pattern dependencies that are likely introduced by bandwidth limitations of the 50-Ω output buffers used for testing. Note that a 1 : 8 or 1 : 16 demultiplexer would be used in a typical application, which relaxes the bandwidth requirements for off-chip buffering of the recovered data.

Measured jitter transfer, generation, and tolerance all meet the SONET OC-192 requirements (measured jitter of 8 ps p-p), with the exception of the small residual jitter tolerance. The jitter tolerance exceeds specifications at low jitter frequencies but is very close to the SONET mask at higher frequencies (see Fig. 13). Poor electrical contact from probes to the chip, phase error between the quadrature clocks, and mismatch between the demux and PD latches are likely sources of degradation in the jitter performance at higher frequencies. Poor electrical contact is partly due to wear caused by mechanical scrubbing of the pad by the probe tip. Repeated contacts were needed to trim the oscillation frequency before measuring the jitter tolerance, which caused significant wear of the pad metal and inconsistent electrical contacts. CDR performance is summarized in Table III.

A photomicrograph of the 1.9 × 1.5 mm² IC is shown in Fig. 14. The input and output data lines are implemented in 50-Ω microstrip. The pad configuration used was dictated by the RF on-wafer probes used for test. In order to increase the isolation between the oscillator and the data path circuits, power supplies are kept separate. The layout also includes an extensive bottom metal ground plane which provides the reference plane for the microstrips as well as increasing the capacitance from substrate to ground. The IC consumes 285 mW from a 1.8-V supply (not including 50-Ω test output drivers).

ACKNOWLEDGMENT

Circuit fabrication was facilitated by the Canadian Microelectronics Corporation. The authors thank Dr. Y. Greshishchev and Dr. P. Schvan for providing access to test facilities at Nortel Networks’ Ottawa Laboratories.
ROGERS AND LONG: 10-Gb/s CDR/DEMUX WITH LC DELAY LINE VCO IN 0.18-µm CMOS 1789
I. INTRODUCTION
less than , half the bit time, even if the peak-to-peak jitter can be much larger than a bit time. Changes greater than are indistinguishable from a phase shift in the opposite direction, .

Fig. 9. Phase-picking algorithm block diagram.

Choosing between the two clock recovery systems depends on the system requirements and noise behavior. We chose a phase-picking architecture to explore the usefulness of the higher phase-tracking capability. In such VLSI implementations, supply noise can be significant enough for the peak-to-peak jitter to occupy a large fraction of the bit time, especially since a PLL accumulates jitter. For the 4-Gbit/s link, we chose a low oversampling ratio of 3 to maintain high input bandwidth and to keep the number of clock phases manageable (1 : 8 demultiplexing and 3× oversampling yields 24 phases). With a bit time of 250 ps, the phase-picking scheme can track the noise of the on-chip multiphase generator (PLL) from both the transmit and receive sides to keep the total “effective jitter” below the 83-ps quantization spacing. One limitation of the phase-picker tracking is that the maximum rate of the tracking depends on the data transition density. Since the PRBS signal guarantees one transition per byte, the maximum tracking rate of one sample spacing every transition is fast (83 ps/2 ns).
Although the tracking rate is high, the maximum static phase
error from the quantization is 41 ps (2% of the clock period,
8 bit time), causing an SNR penalty (Fig. 5). Whether or
not a 3 oversampled phase-picking approach with higher
tracking bandwidth than a PLL can achieve better performance Fig. 10. Example of the phase-picking algorithm.
with the larger static phase error depends on the amount of
jitter induced by on-chip noise sources. If the lower SNR penalty from the
lower jitter compensates the higher SNR penalty of larger static phase error,
phase picking would be the better choice.

IV. PHASE-PICKING ALGORITHM AND IMPLEMENTATION

The details of the phase-picking algorithm are illustrated in Fig. 9. Picking
the center sample requires finding and tracking the bit boundaries. The
decision logic first detects transitions by an XOR of adjacent samples,
indicating the bit boundary to be in one of three possible positions. Fig. 10
shows an example of the boundary detection with a portion of a sampled stream.
To find which of the three transition positions is the most likely bit
boundary, transitions corresponding to the same bit boundary position are
tallied. The position with the largest total determines the bit boundaries.

The decision logic makes a new decision per byte of data. In contrast to a
higher order oversampling phase picker, the 3× oversampling limits the change
of the selected sample position to one sample position per byte. To guarantee
sufficient transitions for averaging any bit-to-bit variations of
high-frequency noise (near the bit rate), the tally is across a sliding window
of 3 bytes. The transitions are accumulated from the current byte, the
previous byte, and the next byte (delaying the data allows the noncausal
information) so that the decision is applied to the byte at the middle of the
window. As a result of the 3-byte sliding accumulation, the rate of phase
change that the algorithm can track is slower than the maximum of 83 ps/2 ns.
The algorithm picks the correct sample if the majority of the transition
information within the 3-byte window (6 ns) indicates the correct phase. For
example, if the input phase has a constant rate of change of <1 sample spacing
per 3 ns (corresponding to a frequency difference of 4%), the transition
information from >1.5 bytes of the 3-byte window would fall in the same phase
quantization. Then the tally and compare would select the correct sample to
track the phase change. This indicates a maximum phase-tracking rate of
83 ps/3 ns. The criterion of tracking both Tx- and Rx-PLLs’ accumulation is
met because the VCO elements’ supply noise sensitivity is %/% (percent of
frequency change per percent of supply noise [1], [3]),⁵ corresponding to
30 ps/3 ns for a 10% supply step, which is less than the tracking rate. If the
phase change is slower than 83 ps/3 ns, the 3-byte accumulation offers some
robustness by averaging any uncertainty in the transition detection due to
high-frequency bit-to-bit noise. A smaller window of one byte can track phase
faster, but has poorer performance without sufficient transitions within that
byte to average the bit-to-bit variation. A larger window of 5 bytes
(<83 ps/6 ns) would be too slow to track the Tx- and Rx-PLLs’ phase
accumulation under reasonable supply noise.

Once the transition position is determined, the middle sample within the bit
boundaries is selected as the data.
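The decision logic described in this section can be sketched in software. The following is a minimal behavioral model, not the authors' hardware: it assumes a stream of 0/1 samples taken at 3× the bit rate, tallies XOR-detected transitions by boundary position over a sliding window of the previous, current, and next byte, and picks the sample in the middle of the bit. All function names are hypothetical. The sketch does not limit the selection to one sample position of change per byte, does not handle the bit that is gained or lost when the selected position changes, and resolves ties toward the lowest position.

```python
OVERSAMPLING = 3                      # samples per bit (from the text)
BITS_PER_BYTE = 8                     # 1:8 demultiplexing (from the text)
SAMPLES_PER_BYTE = OVERSAMPLING * BITS_PER_BYTE


def tally_transitions(stream):
    """Per byte, count transitions at each of the three boundary positions."""
    n_bytes = len(stream) // SAMPLES_PER_BYTE
    tallies = [[0] * OVERSAMPLING for _ in range(n_bytes)]
    for i in range(1, n_bytes * SAMPLES_PER_BYTE):
        if stream[i - 1] ^ stream[i]:             # XOR of adjacent samples
            tallies[i // SAMPLES_PER_BYTE][i % OVERSAMPLING] += 1
    return tallies


def pick_sample_offsets(stream):
    """For each byte, return the sample offset (0..2) judged to be mid-bit."""
    tallies = tally_transitions(stream)
    offsets = []
    for j in range(len(tallies)):
        window = tallies[max(j - 1, 0):j + 2]     # previous, current, next byte
        totals = [sum(t[p] for t in window) for p in range(OVERSAMPLING)]
        boundary = totals.index(max(totals))      # most likely bit boundary
        offsets.append((boundary + 1) % OVERSAMPLING)  # middle sample of the bit
    return offsets


def recover_bits(stream):
    """Select one sample per bit using the per-byte offsets."""
    bits = []
    for j, offset in enumerate(pick_sample_offsets(stream)):
        base = j * SAMPLES_PER_BYTE
        bits.extend(stream[base + OVERSAMPLING * k + offset]
                    for k in range(BITS_PER_BYTE))
    return bits


if __name__ == "__main__":
    data = [1, 0, 1, 1, 0, 0, 1, 0] * 3
    samples = [b for b in data for _ in range(OVERSAMPLING)]
    print(pick_sample_offsets(samples), recover_bits(samples) == data)
```

Delaying or advancing the sampled stream by one sample moves the tallied boundary position, and the selected offset follows it, which is the tracking behavior the text describes.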
⁴ In our system, the oscillator is at 250 MHz so the PLL bandwidth is
restricted to <25 MHz. This yields a 10× tracking rate difference between
the two systems.
⁵ Although the maximum phase error accumulation rate is based on the
supply sensitivity of the VCO, the peak phase error depends on the loop
bandwidth. The Tx-PLL and Rx-PLL generating the multiple clock phases
have bandwidths of 15 and 5 MHz, respectively.
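The timing figures quoted in Sections III and IV follow from the bit rate, the oversampling ratio, and the demultiplexing ratio. The sketch below reproduces that arithmetic as a check. The variable names are illustrative, the majority of the 3-byte window is taken as 1.5 bytes as stated in the text, and the 30 ps/3 ns supply-induced drift is copied from the text rather than derived.

```python
BIT_RATE_GBPS = 4.0        # link rate (from the text)
OVERSAMPLING = 3           # samples per bit
DEMUX_RATIO = 8            # 1:8 demultiplexing, one byte per clock period
WINDOW_BYTES = 3           # sliding tally window

bit_time_ps = 1e3 / BIT_RATE_GBPS                      # 250 ps
clock_phases = DEMUX_RATIO * OVERSAMPLING              # 24 phases
sample_spacing_ps = bit_time_ps / OVERSAMPLING         # about 83 ps
max_static_error_ps = sample_spacing_ps / 2            # about 41 ps
byte_time_ns = DEMUX_RATIO * bit_time_ps / 1e3         # 2 ns clock period
static_error_pct = 100 * max_static_error_ps / (byte_time_ns * 1e3)

# Tracking rates in ps of phase per ns of elapsed time.
per_transition_rate = sample_spacing_ps / byte_time_ns      # 83 ps / 2 ns
majority_time_ns = WINDOW_BYTES * byte_time_ns / 2          # 1.5 bytes = 3 ns
window_rate = sample_spacing_ps / majority_time_ns          # 83 ps / 3 ns
supply_drift_rate = 30.0 / 3.0                              # quoted: 30 ps / 3 ns

if __name__ == "__main__":
    print(f"phases={clock_phases}, spacing={sample_spacing_ps:.1f} ps, "
          f"static error={max_static_error_ps:.1f} ps ({static_error_pct:.1f}%)")
    print(f"tracking: {per_transition_rate:.1f} ps/ns per transition, "
          f"{window_rate:.1f} ps/ns with the 3-byte window, "
          f"supply drift {supply_drift_rate:.1f} ps/ns")
```

With these numbers the quoted supply-induced drift stays below the 3-byte-window tracking rate, which is the criterion the text applies.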
718 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 33, NO. 5, MAY 1998
Fig. 18. Measured BER at various sampling phase.
TABLE I
TEST-CHIP PERFORMANCE

mV with an internal eye height of 65 mV. The 24 mV of amplitude noise is
primarily due to ringing from the package inductance and on-chip output
capacitance at the transmitter.

VI. CONCLUSION

Very high data rates are achievable in CMOS technologies by making extensive
use of parallelism. Using an 8:1 demultiplexing at the input and an 8:1
multiplexing output transmitter, we achieved a 4-Gbit/s transceiver while
keeping all internal signals <500 MHz in a 0.5-µm process technology. The
fundamental limitations of this approach are the I/O capacitance (increased
due to the parallelism), the sampler uncertainty, and the phase position
accuracy of the multiple clock phases.

Provisions were made in this design to handle very large jitter accumulation
of 83 ps/3 ns by a fast phase-picking algorithm. The effectiveness of this
architecture critically depends on the jitter characteristics. Although a CMOS
PLL can potentially exhibit this large jitter due to supply noise, the
measured jitter while operating this transceiver is only 50 ps. This jitter is
measured in a realistic noise environment because of the presence of
significant digital switching noise from the large digital phase picker that
can couple onto the VCO elements. Since the jitter is less than the
quantization error, the advantage of the phase picking is only apparent

ACKNOWLEDGMENT

The authors would like to thank S. Sidiropoulos, B. Amrutur, K. Falakshahi,
Vitesse Semiconductor, Prof. T. Lee, Prof. L. Kazovsky, and their research
groups for invaluable discussions and assistance.

REFERENCES

[1] C.-K. Yang and M. Horowitz, “A 0.8-µm CMOS 2.5 Gbps oversampling receiver
and transmitter for serial links,” IEEE J. Solid-State Circuits, vol. 31,
Dec. 1996.
[2] C. Gray et al., “A sampling technique and its CMOS implementation with
1-Gb/s bandwidth and 25 ps resolution,” IEEE J. Solid-State Circuits,
vol. 29, Mar. 1994.
[3] J. Maneatis and M. Horowitz, “Precise delay generation using coupled
oscillators,” IEEE J. Solid-State Circuits, vol. 28, pp. 1273–1282, Dec. 1993.
[4] K. Lee et al., “A CMOS serial link for fully duplex data communications,”
IEEE J. Solid-State Circuits, vol. 30, pp. 353–364, Apr. 1995.
[5] A. Fiedler et al., “A 1.0625Gb/s transceiver with 2×-oversampling and
transmit signal pre-emphasis,” in ISSCC’97 Dig. Tech. Papers, Feb. 1997,
pp. 238–239.
[6] A. Widmer et al., “Single-chip 4 × 500 Mbaud CMOS transceiver,”
IEEE J. Solid-State Circuits, vol. 31, pp. 2004–2014, Dec. 1996.
[7] F. M. Gardner, Phaselock Techniques, 2nd ed. New York: Wiley, 1979.
[8] W. Dally and J. Poulton, “A tracking clock recovery receiver for 4-Gb/s
signaling,” in Hot Interconnect97 Proc., Aug. 1997, p. 157.
[9] S. Sidiropoulos and M. Horowitz, “A semi-digital DLL with unlimited phase
shift capability and 0.08–400MHz operating range,” in ISSCC’95 Dig. Tech.
Papers, Feb. 1995, pp. 332–333.
[10] J. E. McNamara, Technical Aspects of Data Communication, 2nd ed.
Bedford, MA: Digital, 1982.
[11] S. Kim et al., “An 800Mbps multi-channel CMOS serial link with 3×
oversampling,” in IEEE 1995 CICC Proc., Feb. 1995, p. 451.
[12] M. J. Pelgrom, “Matching properties of MOS transistors,” IEEE J.
Solid-State Circuits, vol. 24, p. 1433, Dec. 1989.
[13] J. A. Crawford, Frequency Synthesizer Design Handbook. Boston, MA:
Artech House, 1994.
[14] J. Proakis, Communication Systems Engineering. Englewood Cliffs, NJ:
Prentice-Hall, 1994.

Chih-Kong Ken Yang (S’93) received the B.S. and M.S. degrees in electrical
engineering from Stanford University, Stanford, CA, in 1992.
He is currently pursuing the Ph.D. degree at Stanford University in the area
of circuit design for high-speed interfaces.
Mr. Yang is a member of Tau Beta Pi and Phi Beta Kappa.
Ramin Farjad-Rad (S’95) was born in Tehran, Iran, in 1971. He received the
B.Sc. degree in electrical engineering from Sharif University of Technology,
Tehran, in 1993 and the M.Sc. degree in electrical engineering from Stanford
University, Stanford, CA, in 1995, where he is currently a Ph.D. candidate in
electrical engineering.
He worked at SUN Microsystems Laboratories, Mountain View, CA, on a
1.25-Gbit/s serial transceiver for the fiber channel standard during the
summer of 1995. Over the summer of 1996, he worked at LSI Logic, Milpitas, CA,
where he examined different multi-Gbit/s serial transceiver architectures.
Mr. Farjad-Rad holds one U.S. patent, and is also the Bronze Medal Winner of
the 20th International Physics Olympiad, Warsaw, Poland.

Mark A. Horowitz (S’77–M’78–SM’95) received the B.S. and M.S. degrees in
electrical engineering from MIT in 1978, and the Ph.D. degree from Stanford
University, Stanford, CA, in 1984.
He is the Yahoo Founders Professor of Electrical Engineering and Computer
Science at Stanford. His research area is in digital system design, and he has
led a number of processor designs including MIPS-X, one of the first
processors to include an on-chip instruction cache, TORCH, a statically
scheduled, superscalar processor, and FLASH, a flexible DSM machine. He has
also worked on a number of other chip design areas including high-speed memory
design, high-bandwidth interfaces, and fast floating point. In 1990, he took a
leave from Stanford to help start Rambus Inc., a company designing
high-bandwidth memory interface technology. His current research includes
multiprocessor design, low-power circuits, memory design, and high-speed links.
Dr. Horowitz is the recipient of a 1985 Presidential Young Investigator Award
and an IBM Faculty Development Award, as well as the 1993 Best Paper Award
from the International Solid-State Circuits Conference.