Dynamic Element Matching

1 Dynamic Element Matching
1.1 The mismatch issue
A DAC converts a digital number d[k] into an analog voltage v[k] or current i[k]. It usually
also holds this voltage or current until the next sample, meaning that it converts the signal
to continuous time through a zero-order hold.
[Diagram: d[k] → D=>A → v[k] or i[k] → hold → v̂(t) or î(t)]
Figure 1: DAC basic principle
As is well known, the zero-order hold has a sinc-shaped frequency response. In the following,
we will however focus on the first D=>A block.
The DAC is typically implemented using DAC elements, these being current sources (in the
case of the analog output value being a current value) or capacitors (in the case of the analog
output value being a voltage value). The simplest way to implement a binary DAC is to use
binary weighted elements as shown in fig. 2.
[Diagram: each bit b_i (i = 0..N-1) drives an element K_i with weight 2^i; the element outputs
are summed to a(k)]
Figure 2: Binary weighted element DAC
Here, b_i is the i'th bit of the digital number d(k). The LSB b_0 is scaled by 1 and the MSB
b_{N-1} by 2^{N-1}. As mentioned, the DAC elements can be capacitors (in which case one is used
for b_0, two for b_1 and so forth) or scaled current sources.
The nominal gain of the DAC, normalized to 1, will be given by:
$$\hat{K} = \frac{\sum_{i=0}^{N-1} 2^i K_i}{\sum_{i=0}^{N-1} 2^i} \qquad (1)$$
The INL for any digital value $d(k) = b_0 + 2 b_1 + \dots + 2^{N-1} b_{N-1}$ will then be given by:
$$INL(k) = \sum_{i=0}^{N-1} 2^i b_i K_i - \sum_{i=0}^{N-1} 2^i b_i \hat{K} = \sum_{i=0}^{N-1} 2^i b_i \left(K_i - \hat{K}\right) \qquad (2)$$
As can be seen, the MSB or close to MSB elements will be very critical and because of this,
the binary weighted DAC is not very useful for high resolution applications. An alternative
way of looking at it is by viewing the scale factor 2i as the size ratio between element 0 and i.
Obviously it’s difficult to match two elements whose size ratio is large.
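To make the sensitivity to the large elements concrete, equations (1) and (2) can be evaluated numerically. The following is only a sketch under assumed values: a 4-bit DAC in which only the MSB element deviates by 1%. The function name `binary_dac_inl` and the gain list are illustrative and not taken from any reference.

```python
def binary_dac_inl(K):
    """INL per eq. (2) for every code of a binary-weighted DAC.

    K[i] is the actual gain of the element driven by bit i (ideal value 1.0).
    """
    n_bits = len(K)
    weights = [2 ** i for i in range(n_bits)]
    # Nominal gain, eq. (1): weighted mean of the element gains.
    k_hat = sum(w * k for w, k in zip(weights, K)) / sum(weights)
    inl = []
    for code in range(2 ** n_bits):
        bits = [(code >> i) & 1 for i in range(n_bits)]
        inl.append(sum(w * b * (k - k_hat) for w, b, k in zip(weights, bits, K)))
    return inl

# Hypothetical 4-bit DAC where only the MSB element is 1 % too large.
inl = binary_dac_inl([1.0, 1.0, 1.0, 1.01])
worst = max(abs(v) for v in inl)   # largest INL magnitude, in LSB
```

In this example the INL has equal magnitude and opposite sign on either side of the MSB transition (codes 7 and 8), which is where the mismatch of the largest element shows up most clearly.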
For high resolution converters, the converter is often implemented as a unit-element DAC.
This is shown in figure 3. The binary number is decoded to thermometer code representation
and fed to $2^N-1$ equal unit elements.
[Diagram: bits b_0..b_{N-1} → binary-to-thermometer decoder → thermometer bits t_i, each driving
a unit element K_i; the element outputs are summed to a(k)]
Figure 3: Unit element DAC.
In the unit-element DAC all elements are of equal (LSB) size. The nominal gain of the DAC
will be given by:
$$\hat{K} = \frac{1}{2^N}\sum_{i=0}^{2^N-1} K_i \qquad (3)$$
The INL for any value d(k) will consequently be given by:
$$INL(k) = \sum_{i=0}^{d(k)} K_i - d(k)\cdot\hat{K} = \sum_{i=0}^{2^N-1} t_i\left(K_i - \hat{K}\right) \qquad (4)$$
Now the relative mismatch is much less critical. If it is 1% for any given element, the DNL
will only be 0.01 LSB.
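The DNL claim can be checked with a few lines of code. This is a minimal sketch, assuming a hypothetical 8-element DAC with a single element that is 1% too large; `unit_dac_dnl` is an illustrative name.

```python
def unit_dac_dnl(K):
    """Step-size error (DNL) of a thermometer-coded unit-element DAC.

    Raising the code by one switches in exactly one more element, so the
    step error is that element's deviation from the mean gain.
    """
    k_hat = sum(K) / len(K)
    return [k - k_hat for k in K]

# Hypothetical 8-element DAC with one element 1 % too large.
K = [1.0] * 7 + [1.01]
dnl = unit_dac_dnl(K)
worst_dnl = max(abs(v) for v in dnl)   # in LSB
```

Because every code step adds a single LSB-sized element, the worst step error stays below the 1% element mismatch, in contrast to the binary weighted case where the MSB element is $2^{N-1}$ LSBs large.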
1.2 Dynamic element matching in oversampled unit element DACs
In an audio DAC, the number of bits is reduced from 20+ to much fewer, usually 3-6, by means
of an oversampled delta-sigma modulator. The additional quantization noise can be shaped
arbitrarily by the modulator, and very high resolution can still be maintained in the
baseband. However, errors in the D/A-converter are not shaped by the loop and its static
performance must still be at the 20+ bit level. In a usual CMOS process, matching of
individual capacitors or current sources is usually limited to about 0.1%, resulting in a static
error far above this requirement. However, these errors can also be shaped digitally, an
idea first proposed by Van De Plassche back in 1976 [1].
1.2.1 Randomisation
The simplest method of dynamic element matching is randomisation of the element selection,
first shown in implementation in 1989 [2]. As such, the element usage will be input
independent and the INL will be randomised and input independent (assuming a normal or
Gaussian mismatch distribution with zero mean). In other words, it is converted from
distortion to noise. A random selector can be implemented through a butterfly-network and a
pseudorandom number generator as shown in fig.4.
[Diagram: the unit element DAC of fig. 3 with a PRNG controlling the element selection]
Figure 4: Element randomizer
Since the elements are selected randomly, the output will be a white noise source given by:
$$\begin{aligned}
E\{\varepsilon^2\} &= E\left\{\left(\sum_{i=0}^{d} K_i - d\cdot\hat{K}\right)^2\right\} \\
&= E\left\{\left(\sum_{i=1}^{d} K_i - d\cdot\frac{\sum_{i=0}^{2^N-1} K_i}{2^N}\right)^2\right\} \\
&= 2^N\cdot\left(\frac{d}{2^N}\right)\cdot\left(1-\frac{d}{2^N}\right)\sigma_K^2 \\
&= d\cdot\left(1-\frac{d}{2^N}\right)\sigma_K^2
\end{aligned} \qquad (5)$$
If the noise is white and $1/(2\cdot OSR)$ of the noise power falls within the baseband, a maximum
signal swing of $2^N-1$ peak-to-peak will result in a maximal Signal-to-Mismatch-Noise-Ratio
of:
$$SMNR = \log_2\left(\frac{2^N\cdot\sqrt{\dfrac{OSR}{3}}}{\sigma_\varepsilon}\right)\ \text{[bits]} \qquad (6)$$
It can easily be calculated that with a typical process mismatch of 0.1%, and a reasonable
number of bits and a reasonable oversampling ratio (tens to hundreds), the resolution is still
limited to the sub-16-bit range.
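The last line of (5) can be verified by brute force for a small DAC. The sketch below assumes that "random selection" means every set of d distinct elements is equally likely, and it takes $\sigma_K^2$ as the sample variance of the element gains. The 8-element DAC, the seed and the helper name `random_selection_error_power` are illustrative assumptions.

```python
from itertools import combinations
from statistics import variance
import random

def random_selection_error_power(K, d):
    """Mean squared error over all equally likely choices of d elements."""
    k_hat = sum(K) / len(K)
    errors = [sum(subset) - d * k_hat for subset in combinations(K, d)]
    return sum(e * e for e in errors) / len(errors)

rng = random.Random(1)                                # fixed seed, illustrative only
K = [1.0 + rng.gauss(0.0, 1e-3) for _ in range(8)]    # 8 elements, 0.1 % mismatch
sigma_k2 = variance(K)                                # sample variance of the gains

# (d, predicted by eq. (5), measured by enumeration) for every input value d
results = [(d,
            d * (1 - d / len(K)) * sigma_k2,
            random_selection_error_power(K, d))
           for d in range(len(K) + 1)]
```

Under these assumptions the enumerated error power matches $d(1-d/2^N)\sigma_K^2$ for every d, including the end points d = 0 and d = 8 where no choice is left and the error vanishes.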
1.2.2 Mismatch shaping and Data Weighted Averaging
An improvement on element randomization came with the introduction of mismatch-shaping
techniques in the early 90s, with the Individual Level Averaging (ILA) algorithm from 1992 [3]
and the Data Weighted Averaging (DWA) algorithm from 1995 [4] representing major
breakthroughs. They were both based on the idea that if all elements contribute equally, the
INL would be cancelled, since the total error would then always have the same expectation
value. So, if all elements contribute equally over time, the error will grow smaller for larger
time windows, that is, lower frequencies. The error will be high-pass shaped, which is very
desirable in oversampled systems. The DWA algorithm in particular became very popular and is
widely used, and its success was instrumental in the resurrection of multi-bit DACs
in audio applications. Following its introduction, more thorough analyses of DWA were
published [5]-[6] that proved its first-order high-pass mismatch shaping property as well as its
susceptibility to idle-tone behaviour. As a consequence of the latter, many randomized DWA
algorithms have been published [7]-[9]. Extension of the DWA principle to include arbitrary
noise-shaping functions has also been shown [6], although none can be practically
implemented in a reasonable way. However, a practically implementable restricted
second-order DWA has been published, showing good results [10].
1.2.2.1 First order DWA
The principle of DWA is shown in fig. 5. If the first value d(0)=2, elements t_{0..1} are selected;
if d(1)=3, elements t_{2..4} are selected next. As can be seen from the 3-bit example to the right in
the figure, each element is used twice over the period. Of course, it will normally converge
much slower.
Figure 5: Data Weighted Averaging mismatch shaping
The shifting is controlled by a pointer that is incremented by the input value d(k) at each
sample instant k. To be rotational, it operates modulo the number of elements, which is $2^N$ here:
$$ptr(k) = \left(ptr(k-1) + d(k)\right) \bmod 2^N \qquad (7)$$
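The pointer update can be sketched in a few lines. The example assumes a 3-bit DAC with 8 unit elements and a pointer that wraps at the number of elements; `dwa_select` and the input sequence are hypothetical and chosen so that the inputs sum to two full rotations.

```python
def dwa_select(ptr, d, n_elements):
    """Return the indices of the d elements used this sample and the new pointer."""
    selected = [(ptr + i) % n_elements for i in range(d)]
    return selected, (ptr + d) % n_elements

n_elements = 8                  # 3-bit example
ptr = 0
usage = [0] * n_elements        # how often each element has been used
history = []                    # element indices selected at each sample
for d in [2, 3, 4, 3, 4]:       # hypothetical input codes summing to 16
    selected, ptr = dwa_select(ptr, d, n_elements)
    history.append(selected)
    for i in selected:
        usage[i] += 1
```

The first two samples select elements 0..1 and 2..4, and after the five samples every element has been used exactly twice and the pointer is back at 0, so the accumulated mismatch error over this window is zero.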
The integral mismatch error as a function of the pointer is of course given by:
$$IM(ptr) = \sum_{i=0}^{ptr} K_i - ptr\cdot\hat{K} \qquad (8)$$
It is then given from (7) that the mismatch as a function of the time instant k must be:
$$\varepsilon(k) = IM[ptr(k)] - IM[ptr(k-1)] \;\Rightarrow\; E(z) = \left(1 - z^{-1}\right) IM[PTR(z)] \qquad (9)$$
The resulting noise is in other words a first order high-pass filtered version of the integral
mismatch error generated as a function of the pointer. This is again dependent on the input
signal as seen by (7). The integral mismatch error IM(ptr) is often considered white, but this is
a simplification that could lead to erroneous performance estimation. Especially in the
presence of certain DC-level input signals, IM(ptr) will be periodic (due to the modulo term)
and produce significant tonal behaviour. This tonal behaviour is the biggest drawback with
first order error shaping just like tones are also a major problem in first order quantization
noise shaping (delta-sigma modulation). If the white noise assumption holds, the baseband
resolution can be calculated very similarly to randomization. For maximum-amplitude white
input signals it is found to be:
$$SMNR = \log_2\left(\frac{2^N\cdot\sqrt{OSR^3}}{\pi\,\sigma_\varepsilon\left(1-\dfrac{1}{2^N}\right)}\right)\ \text{[bits]} \qquad (10)$$
The tones can be estimated for DC inputs. If the input signal is a DC value X, the errors will
be rotated at a fixed speed, with one rotation being completed every $2^N/r$ samples, where r is
the greatest common divisor of X and $2^N$. The output error will then be a periodic sequence:
$$\sum_{i=0}^{X-1}\varepsilon_i,\ \sum_{i=X}^{2X-1}\varepsilon_i,\ \dots,\ \sum_{i=2^N-X}^{2^N-1}\varepsilon_i,\ \sum_{i=0}^{X-1}\varepsilon_i,\ \dots \qquad (11)$$
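The repetition period for a DC input can be found by simply stepping the pointer until it returns to its starting value. The sketch below assumes 8 elements and compares the measured period with the number of elements divided by the greatest common divisor of the input and the number of elements; `dwa_dc_period` is an illustrative helper.

```python
from math import gcd

def dwa_dc_period(x, n_elements):
    """Samples until the DWA pointer first returns to its start for DC input x."""
    ptr, k = 0, 0
    while True:
        ptr = (ptr + x) % n_elements
        k += 1
        if ptr == 0:
            return k

n_elements = 8
periods = {x: dwa_dc_period(x, n_elements) for x in range(1, n_elements)}
expected = {x: n_elements // gcd(x, n_elements) for x in range(1, n_elements)}
```

For example, X = 4 repeats every 2 samples and X = 2 every 4 samples, while any odd X needs all 8 samples. Short periods concentrate the error energy in a few strong tones, which is why some DC levels are much worse than others.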
The tonal behaviour is addressed similarly to tones in a delta-sigma modulator, by
dithering the selection process to break up the periodic pattern [7]-[9].
1.2.2.2 Second order DWA
The approach in section 1.2.2 can be generalized to encompass second order or arbitrary noise
shaping. We want to achieve:
$$E(z) = H(z)\cdot IM[PTR(z)] \qquad (12)$$
where H(z) is a general noise shaping function:
$$H(z) = \sum_{j=0}^{m} a_j z^{-j} \qquad (13)$$
In the time domain, (12) simply becomes:
$$\varepsilon(k) = \sum_{j=0}^{m} a_j\cdot IM[ptr(k-j)] \qquad (14)$$
For $H(z) = 1 - z^{-1}$, we have already shown that this can be achieved through the single
pointer update step in (7). For $H(z) = \left(1 - z^{-1}\right)^2$ we get:
$$\varepsilon(k) = IM[ptr(k)] - 2\cdot IM[ptr(k-1)] + IM[ptr(k-2)] \qquad (15)$$
To achieve this, the elements up to the current pointer value must contribute with weight “1”,
the elements up to the previous pointer value must contribute with weight “-2” and the
elements up to the pointer value two instances ago must contribute with weight “1”. Since
these are the same elements, each element's weight at any time instance is given by the sum
of the three terms given by the pointer recursion. In the first order case it will always
be “-1” or “1” (one and zero binary), and it can be updated through a single assignment of each
element. In the second order case the sum can grow to larger integers. Hence, to enable an
element to contribute proportionally, each element must be assigned several times in each
sample instant, making practical implementation impossible unless the DAC is run at several
times its input sampling speed.
However, a modified version of the second order DWA has been implemented successfully,
called the Restricted Second-order DWA or R2DWA. R2DWA updates the pointer like second
order DWA, but restricts the element selection so that no element has to be assigned more
than once per sample. This principle is shown in fig. 6.
Figure 6: Left: Second order DWA. Right: Restricted second order DWA.
R2DWA can be implemented by updating each row in a swapper array with a second-order
delta-sigma modulator. The implementation is detailed in [10].
1.2.3 The Galton Tree Structure Dynamic Element Matching
In addition to the very intuitive DWA approach, several other algorithms for dynamic element
matching have been introduced during recent years. These algorithms are often more difficult
to understand and appear more complex, but some have proved to be much simpler to
implement in hardware.
1.2.3.1 The tree structure and mode of operation
A very hardware-efficient realization of dynamic element matching is the tree structure
proposed by Galton [11]. Here, the decoder is partitioned into layers, until ending with the
1-bit two-level data necessary to control a 1-bit DAC. This is shown in fig. 7. The control units
denoted S are called switching blocks.
Figure 7: The Galton tree structure
To preserve the signal integrity, the structure must be number conserving, i.e. that the sum of
the two outputs from a switching block must always equal its input. This is called the number
conservation rule. In addition, the input to any switching block must be equal to or smaller
than the total number of DAC-elements it connects a path to. E.g. the input to the blocks in
layer two can never be larger than four. As such, the number of bits can be reduced from layer
to layer, as shown in fig. 7. Since an N-bit binary number has the range $[0, 2^N-1]$, the inputs are
encoded with an extra LSB to have the range $[0, 2^N]$. This is called extra-LSB encoding.
The implementation of the switching block, in conformance with these rules, is
shown in fig. 8.
[Diagram: the input X and the switching sequence S are added to form Y1 and subtracted to form
Y2, each scaled by 0.5]
Figure 8: The basic switching block.
From fig. 8, it is clear that:
$$Y_1 = \tfrac{1}{2}(X + S), \qquad Y_2 = \tfrac{1}{2}(X - S) \qquad (16)$$
It is easy to see that $X = Y_1 + Y_2$, so the number conservation rule holds, and also that
$Y_1 - Y_2 = S$. The signal S is called the switching sequence and controls the partition of the
input into the two output signals; hence the dynamic element matching is encoded in S. It can
be shown using recursion for the whole tree (see [11]) that:
$$A_{out}[n] = \beta + \alpha\cdot D_{in}[n] + e[n] \qquad (17)$$
where β and α are constant offset and gain errors, respectively. The term e[n] is signal
dependent and denotes the nonlinear error resulting from mismatch. Furthermore, it can be
shown [11] that:
$$e[n] = \sum_{k=1}^{b}\sum_{r=1}^{2^{b-k}} \Delta_{k,r}\cdot s_{k,r}[n] \qquad (18)$$
where k denotes the layer (b layers in total), r denotes the position of the switching element in
the layer, $\Delta_{k,r}$ is the total mismatch error associated with switching block r in layer k and
$s_{k,r}$ is the switching sequence produced by switching block r in layer k. This means that if
each switching block produces an L'th order shaped sequence $s_{k,r}$, uncorrelated with the
other switching sequences, then the total error e will be shaped the same way.
Consequently, one should be able to produce the switching sequence using any arbitrary
free-running delta-sigma modulator and the error will be shaped in the same way. However, there
are some other requirements on the switching sequence that complicate this. For the outputs to
be integers, it is an obvious requirement that both $(X + S)$ and $(X - S)$ must always
be even numbers. This puts a restriction on S, namely that:
$$S[n] = \begin{cases} \text{even} & \text{if } x[n] \text{ is even} \\ \text{odd} & \text{if } x[n] \text{ is odd} \end{cases} \qquad (19)$$
Furthermore, to ensure Y1 and Y2 are both non-negative and no larger than the total number of
elements they are connected to ($Y_{1,2} \le 2^{k-1}$ for any switching block in layer k), it is another
requirement that:
$$|S[n]| \le \min\{x[n],\ 2^k - x[n]\} \qquad (20)$$
The challenge in the tree structure is consequently to generate a sequence that is both shaped
and at the same time satisfies these requirements. As will be shown, this is indeed feasible,
although higher order shaped sequences will saturate significantly.
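The switching block of (16) and the two requirements (19) and (20) can be written down directly. This is a minimal sketch: the function name `switching_block` and the choice to raise an exception on an invalid S are assumptions made for illustration.

```python
def switching_block(x, s, layer):
    """Split x into (y1, y2) per eq. (16), enforcing constraints (19) and (20)."""
    if (x - s) % 2 != 0:
        raise ValueError("S must have the same parity as X, eq. (19)")
    if abs(s) > min(x, 2 ** layer - x):
        raise ValueError("|S| exceeds the bound of eq. (20)")
    return (x + s) // 2, (x - s) // 2

# Layer 3 block: input range [0, 8], each output must stay within [0, 4].
y1, y2 = switching_block(5, 1, layer=3)

# Every (x, s) pair that passes the checks yields valid, number-conserving outputs.
valid_pairs = []
for x in range(0, 9):
    for s in range(-8, 9):
        try:
            valid_pairs.append((x, s, switching_block(x, s, layer=3)))
        except ValueError:
            pass
```

With X = 5 and S = 1 the block outputs (3, 2). An exhaustive sweep over a layer-3 block shows that any S passing the two checks gives integer outputs in [0, 4] whose sum equals the input.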
1.2.3.2 The switching sequence generator
To efficiently generate a shaped sequence that satisfies the aforementioned requirements,
the sequence generator is implemented as a free-running, ternary-output delta-sigma
modulator. The fact that the switching sequence is ternary means that the
requirement in (20) is automatically satisfied. To satisfy (19), the ternary output is generated
by a binary quantizer (±1 output) and a multiplier, as shown in fig. 9. Since the LSB of X is 0b
when X is even:
$$S[n] = \begin{cases} 0 & \text{if } x[n] \text{ is even} \\ \pm 1 & \text{if } x[n] \text{ is odd} \end{cases} \qquad (21)$$
As a consequence, both (19) and (20) are satisfied. The filter H(z) decides the shaping of the
output sequence and is in the first order case a simple discrete-time integrator, $1/(1 - z^{-1})$.
Figure 9: Conceptual schematic of the first order switching sequence generator.
In this first order structure, it can easily be verified that the integrator output will always be 0
or ±1, so the quantizer will never overload. This means it will always work as an ideal first-
order modulator and provide ideal first-order shaping. It can also be implemented in a very
simple manner. For higher-order switching generators, implementation is somewhat more
complex, which will be explained shortly.
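A behavioural model helps to see how the first-order generator and the tree fit together. The sketch below is an assumption-laden illustration, not the circuit of [11]: the integrator is modelled as the running sum of S, the sign convention is chosen so that S always drives this sum back towards zero, and the direction taken from the zero state is picked at random. The class names and the 3-layer, 8-element tree are hypothetical.

```python
import random

class FirstOrderGenerator:
    """Ternary switching-sequence generator (sign convention is an assumption)."""

    def __init__(self, rng):
        self.state = 0          # running sum of past S values (integrator)
        self.rng = rng

    def step(self, x):
        if x % 2 == 0:          # eq. (21): even input forces S = 0
            return 0
        if self.state == 0:     # no preference: pick the direction at random
            s = self.rng.choice((-1, 1))
        else:                   # push the running sum back towards zero
            s = -self.state
        self.state += s
        return s

class TreeDem:
    """b-layer tree driving 2**b unit elements (generators stored per layer)."""

    def __init__(self, b, seed=0):
        rng = random.Random(seed)
        self.b = b
        # Layer k (k = b at the input, k = 1 at the elements) has 2**(b-k) blocks.
        self.gens = {k: [FirstOrderGenerator(rng) for _ in range(2 ** (b - k))]
                     for k in range(1, b + 1)}

    def encode(self, d):
        values = [d]                                  # input in [0, 2**b]
        for k in range(self.b, 0, -1):
            nxt = []
            for x, gen in zip(values, self.gens[k]):
                s = gen.step(x)
                nxt += [(x + s) // 2, (x - s) // 2]   # eq. (16)
            values = nxt
        return values                                 # one 0/1 value per element

dem = TreeDem(b=3)
src = random.Random(1)
codes = [src.randint(0, 8) for _ in range(1000)]      # hypothetical input codes
outputs = [dem.encode(d) for d in codes]
```

In this model every output vector contains only zeros and ones and sums to the input code, and no integrator state ever leaves the set {-1, 0, 1}, which is the bounded behaviour described above.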
1.2.3.3 Hardware efficient 1st order switching network
A simple implementation of the described switching block structure is achieved through
“extra-LSB-encoding” of the data sequence. This means the binary code is modified to have
two LSBs, both with unity weight. If the number is odd, the two LSBs are necessarily
different, if the number is even, they are equal. This simplifies the logic as the switching
sequence can be added without any adder.
If the number is even, the switching sequence output must be zero and the switching block
should only perform a right shift division of X for both Y1 and Y2. One of the LSBs from X
must be added as the extra LSB of Y1,2, since the two LSBs being 11b means they constitute a
value of 2 and 1 should be added to each output (see (16)).
If the number is odd, the switching sequence is non-zero and must be added to and subtracted
from Y1 and Y2 respectively. When using the extra-LSB-encoding, S can be added to or
subtracted from the outputs (see fig.8) without an adder. By discarding the two LSBs of X,
which are now 01b or 10b, the output value is reduced by 0.5. This means that setting
the extra LSB of the output to 0b gives $Y = \lfloor X/2 \rfloor$ and setting the extra LSB to 1b gives
$Y = \lceil X/2 \rceil$. Thus, setting the extra LSB of Y1 to 1b if S = 1 and 0b if S = −1 is equivalent to
combining (16) and (21). The inversion of S for Y2 (see fig. 8) is simply done through a binary
inverter.
Figure 10: Switching network logic with extra LSB encoding.
It’s easy to confirm fig.10 is equivalent to fig.8 with number conservation and a ternary
switching sequence and the arithmetic has been realized using very simple digital logic. It will
also be shown that the sequence generator for low order can be realized very elegantly.
We first consider the first-order modulator in fig. 9. The integrator can be seen as a ternary
state machine. If X is even, the modulator output S is zero (see (21)) and the integrator state
will not change (see fig. 9). This means the state machine can be stopped. If X is odd, the
integrator will follow the state sequence [1,0,-1,0,1,0,-1...]. Whenever the integrator state is -1,
the output must be 0b; when the integrator state is 1, the output must be 1b. The integrator must
change state from 1 to -1 in two cycles. This leads to the simple implementation shown in
fig. 11.
Figure 11: Simple implementation of sequence generator.
If the flip-flop outputs Q1Q2 are 11b, the integrator state is 1. If they are 00b, the state is -1 and
if they are either 01b or 10b, the state is 0. This way, the binary quantizer from fig. 9 has no
offset (whether it rounds zero up or down depends on the last value). To avoid tonal behaviour,
which would translate to tones in the DAC error (18), the generator can be dithered. If the
integrator output is 0, one can randomize which way the binary quantizer flips (1 or -1) by using
a binary random input as shown in fig. 12. Here, if the integrator state is 0, the output is
determined from the random sequence.
Figure 12: Dithered first order sequence generator.

1.2.3.4 The second order switching generator.
The second order switching generator is described in [11] and detailed in [12]-[13]. Since the
integrator output is no longer limited to 0 or ±1 state values, implementation will be
significantly more complex. The basic structure is a two-integrator expansion of the first order
structure shown previously in fig. 9. However, in the second order structure the output from
the second integrator can grow to any value, and if it saturates the circuit can become
unstable. To avoid the danger of instability even with a limited-size second integrator, a
feed-forward coefficient α<1 is introduced. The conceptual block schematic is shown in fig. 13.
[Diagram: two cascaded integrators $1/(1 - z^{-1})$ with feed-forward coefficient α, a quantizer
multiplied by the LSB of X to form S, and feedback with gain -1]
Figure 13: Conceptual schematic, second order switching generator.
The design can be simplified by recognising that, if the second integrator is designed to
saturate at some value M, a gain of α=1/(M+1) can be achieved by allowing the first integrator
to override the second one whenever its output is nonzero. In digital logic, the dithered
version of the second order generator in fig.13 can then be implemented as shown in fig.14.
Figure 14: Efficient hardware implementation of second order switching generator
To explain the way this circuit operates let’s first look at the two integrators: The state of the
first one will never grow beyond ±1, from the same reasoning as the first order integrator, and
it can be implemented as a 2-bit state machine. The second integrator output can grow to, in
theory, any integer value. As is seen, if the first integrator is not zero (the >0 output is 1b),
then this integrator overrides the second one and Sb is equal to its sign. It then also disables
the second integrator. In this case the circuit works exactly like the first order structure in
fig.11 or fig.12. If the first integrator is zero, the second integrator takes over and its sign is
connected to the output Sb. If this is also at the zero state, the output is determined from a
dither sequence as for the first order version.
Since the second integrator saturates at some value M, and the first order integrator in this
case takes over operation, the gain α and the resulting noise shaping of the switching sequence
S are determined by the size of the second integrator. If its saturating range is infinite, the gain
α is zero and the circuit provides the optimal second order $\left(1 - z^{-1}\right)^2$ spectral shaping of S.
If the saturating range is 0, the gain α is infinite and the circuit falls back to first-order
$\left(1 - z^{-1}\right)$ spectral shaping. Any value in between gives a reduced, or more conservative,
second order shaping.
It has been shown in [12]-[13] that if the second integrator is only 2-bit (M=1), the baseband
noise floor is about 10dB lower than in the first-order dithered case. If the second integrator is
6-bit (M=32) or more, the shaping is close to second order. Of course a larger up/down-counter
means more hardware, and there are $2^N-1$ switching generators for an N-bit tree-structure
DEM, so the trade-off between hardware complexity and required performance
should be carefully evaluated. A 2-bit second integrator gives a quite cheap performance
boost, while more than 6-bit is rarely needed unless perfect $\left(1 - z^{-1}\right)^2$ shaping of mismatch
errors is necessary to provide sufficient baseband SNDR.
Suggestions for improvements to the Galton tree structure to increase stability for higher
order shaping have also been shown in recent publications [14]-[15].
1.2.4 Other structures.
In addition to the DWA-structure and the Tree-structure there are two other fundamental
approaches of Dynamic Element Matching to the author’s knowledge that are published. One
is the vector feedback approach [16]-[18], which has very good performance for even second
order mismatch shaping, but suffers from high hardware complexity and is not feasible to
implement for a large number of quantization levels. The butterfly shuffler method [19]-[20]
on the other hand is computationally simple, but limited to first order mismatch shaping.
These methods will not be treated in detail in this document and interested readers are
suggested to look up the references.
1.3 Dynamic element matching for large quantizers
As we have seen from the previous sections, the number of elements increases exponentially
with the number of bits in the output signal. This means that both routing and DEM
complexity also increase exponentially with the number of bits. Generally, most DEM
DACs are inefficient at more than 4-5 bits. Of course, DEM cannot be applied to the binary
weighted DAC, as it has only one unique element selection per output code. However, the
complexity can be reduced through segmentation. This is shown in its basic form in fig. 15.
Figure 15: Segmented DEM scrambler
Here, the lowest M bits are subjected to normal dynamic element matching (e.g. DWA) and
run through a $2^M$-element unit DAC with weight 1. The remaining N-M bits are also run
through a DEM scrambler, before being converted by a separate equal-element DAC. The
MSB DAC is of course weighted by $2^M$ to make the digital sum correspond to the modulator
output code.
If N=8, this would normally require 256 elements in a unit element DAC. By setting M=4, the
LSB-DAC can be made with 16 elements and the MSB-DAC also with 16-elements, each of
these being 16 times bigger than the elements for the LSB-DAC. The algorithm complexity
has thus been reduced from 256 elements to 32.
However, the weighting is still realized in analog, i.e. by matching of the LSB and MSB
DACs. This will result in a gain error between the DACs that, in the configuration shown
above, is not cancelled in any way. This will again result in significant distortion.
Another way to view this is to look at the segmentation in the signal domain. Picking out the
MSBs is equivalent to truncation and it is the truncation error that is fed to the lower DAC. If
the coarse and fine DACs are ideally matched, the output will sum perfectly and the
truncation error will not leak through, so no additional error is made.
Figure 16: Segmented DEM scrambler with segmentation in the signal domain.
If the matching is not ideal however, the output will not sum correctly and the truncation error
e will leak through. We can assume the fine DAC being weighted with a non-ideal factor α,
instead of the ideal factor 1, in which case:
Aout = X + (1 − α ) ⋅ e (22)
If there is 1% gain-error, e.g. α=0.99, then 1% of the truncation noise from an N-M bit
truncation leaks to the output. If, in the above figure, N=8 and M=4, then 1% of a 4-bit
truncation error will be visible at the output, which will clearly lead to much too high
distortion for 20 or so ENOBs.
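Equation (22) can be checked numerically for the N=8, M=4 example. In this sketch the truncation error is defined as e = coarse − X, so that the fine DAC receives −e, and the fine DAC weight α = 0.99 is the hypothetical 1% error from the text.

```python
N, M = 8, 4
alpha = 0.99                      # hypothetical 1 % weighting error of the fine DAC

worst_leak = 0.0                  # largest output error, in LSB
max_model_error = 0.0             # deviation between simulation and eq. (22)
for x in range(2 ** N):
    coarse = (x >> M) << M        # keep the N-M MSBs (truncation)
    e = coarse - x                # truncation error, in [-(2**M - 1), 0]
    fine = -e                     # the M LSBs, sent to the fine DAC
    a_out = coarse + alpha * fine
    model = x + (1 - alpha) * e   # eq. (22)
    max_model_error = max(max_model_error, abs(a_out - model))
    worst_leak = max(worst_leak, abs(a_out - x))
```

Under these assumptions the simulated output agrees with (22) for every code, and the worst-case leakage is 1% of the largest truncation error, i.e. 0.15 LSB. This error is fully signal dependent and unshaped.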
This problem can be minimized however, by introducing an additional delta-sigma modulator
as shown in fig. 17. This method was first proposed by Adams [20].
[Diagram: a delta-sigma modulator splits the N-bit input into N-M MSBs (x+eDSM), fed through
an N-M bit DEM to a $2^{N-M}$-element DAC with weight W=$2^M$, and M+K LSBs (-eDSM), fed through an
M+K bit DEM to a $2^{M+K}$-element DAC with weight W=1; the two DAC outputs are summed to Aout]
Figure 17: Segmented DEM scrambler with shaping of gain error
In the case seen in fig.17, a fine DAC weighting of α instead of 1, will now lead to a
combined output given by:
Aout = X + (1 − α ) ⋅ eDSM (23)
Now, the error that leaks through is the shaped quantization error from a delta-sigma
modulator, which contains very little energy in the baseband. The gain-error distortion has
been shaped out of the band of interest. The form of the shaping is of course determined
completely by the second modulator which divides the output from the main delta-sigma
quantizer into a coarse and fine part. If the mismatch is 1%, 1% of the additional error
generated by the second modulator will leak through. Thus, the leaked error is 100 times
(40 dB) below the error of the second modulator itself.
If the error is signal-dependent, for instance if the modulator is first-order and undithered, the
modulator will produce some baseband distortion and idle tones that leak through, and a
higher-order modulator appears more desirable. However, in a higher order modulator, the
peak DSM error is significantly larger than the truncation error and the fine DAC would
need many additional bits (in fig. 17, K would be large). Then the analog overhead will be
greater and the digital complexity reduction will be lost.
[Diagram: as fig. 17, but with a first-order delta-sigma modulator, N-M MSBs (x+eDSM) to an
N-M bit DEM and a $2^{N-M}$-element DAC with weight W=$2^M$, and M+1 LSBs (-eDSM) to an M+1 bit
DEM and a $2^{M+1}$-element DAC with weight W=1]
Figure 18: Segmented DEM scrambler with first order shaping of gain error
If the STF is one and the NTF equals $\left(1 - z^{-1}\right)$, the peak DSM error is on the other hand only
twice the peak quantization error and the lower DAC needs only one extra bit, as seen in
fig. 18. Using a 1st order modulator is thus the preferred choice in published literature
[20]-[21], but matching must be so good that the leakage does not significantly compromise
performance. With 1% scale error, the output error caused by the undithered first order
modulator will be 100 times lower than its quantization error, so the expected output
distortion can easily be estimated.
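A first-order splitting modulator with unity STF can be sketched in error-feedback form. This is an assumed structure for illustration, not necessarily the one used in [20]-[21]: the residue left after coarse quantization is carried over to the next sample, which gives a DSM error of the form $(1 - z^{-1})$ applied to the residue. The function name and the random 8-bit test input are hypothetical.

```python
import random

def first_order_split(samples, M):
    """Split each sample into a coarse part (multiple of 2**M) and a fine part.

    Error-feedback form with STF = 1 and NTF = 1 - z^-1 (an assumed structure).
    """
    step = 2 ** M
    r = 0                                   # residue carried to the next sample
    coarse_out, fine_out = [], []
    for x in samples:
        u = x + r
        coarse = (u // step) * step         # quantise to the coarse DAC grid
        r = u - coarse                      # new residue, 0 <= r < 2**M
        coarse_out.append(coarse)
        fine_out.append(x - coarse)         # equals -e_DSM
    return coarse_out, fine_out

rng = random.Random(0)
x = [rng.randrange(2 ** 8) for _ in range(2000)]   # hypothetical 8-bit codes
coarse, fine = first_order_split(x, M=4)
```

In this sketch the coarse and fine parts always sum to the input, the fine signal stays strictly within ±$2^M$ (twice the range of a plain truncation error, hence one extra bit), and its running sum remains bounded, which is the time-domain signature of first-order shaping.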
Now, as the observant reader might have figured out, there is no reason this cannot be done
again with the error signal. Figure 19 shows a two-step segmentation using three separate
first-order modulators. Here a 13-bit word is reduced to four 4-bit terms and the DEM
complexity is reduced from $2^{13}$ to $4\cdot 2^4$.
Figure 19: Extension to two-step DAC segmentation.
As shown in the thesis of Steensgaard [21], this can be extended until you have a row of
three-level DACs, each with a very simple DEM cell and a weight twice that of the next one.
This is conceptually very similar to the pipeline ADC with 1.5-bit shaping, but with
delta-sigma shaping of the gain error between stages. The DEM complexity now increases
proportionally to N instead of $2^N$ for the regular DEM, and this is very suitable for large
quantizers at the cost of some analog overhead. The latter can be seen from noting that in
fig. 19 the analog area in terms of unit elements has increased from 8192 to 9360.
Not surprisingly, since the switching sequence generator in the tree-structure DEM is very
similar to a delta-sigma modulator, the same approach can be applied to the Galton tree
structure which can be transformed to a segmenting encoder by a slight change in the
switching block networks [22]-[23]. The general structure is shown in fig.20. Of course, the
degree of segmentation is adjustable like in the Steensgaard approach. In fig.20, two bits are
not segmented, while the rest are.
Figure 20: The general segmented Galton Tree structure.
1.4 Element mismatch cancellation for low sampling rate material
As an alternative to the mismatch-shaping approach, one could imagine a scheme where all the
elements contribute equally regardless of the input value. Then the mismatch, as a
combination of each element’s individual mismatch, would only result in a constant gain-
error. The mismatch induced noise would not be shaped, but rather completely cancelled from
sample to sample. Of course, it would be a necessity for such a concept that each element is
modulated, so that the combined output is still proportional to the input signal. Such a scheme
has indeed been suggested based on PWM (Pulse Width Modulation) of the unit elements
[25], with extremely good results for audio bandwidths. The elements are PWM modulated in
a special way for a mismatch cancelled, ISI and jitter insensitive, non-PWM output stream.
The main drawback is, like we will soon see, that this is not possible to implement for high
sampling rate material. A more traditional PWM approach with predistortion and DEM is also
shown in [26] with good results, though its implementation specifics will not be covered in
this paper.
1.4.1 Shifted and rotated PWM modulation of unit elements.
The algorithm suggested by Reefman [25] is a very elegant and simple-to-implement solution
based on a rotational PWM scheme. First, we will look at a straightforward PWM modulation
of all elements, shown for an example 3-bit continuous-time DAC in fig. 21.
Figure 21: Straightforward PWM modulation of each unit element (output A = ΣD).
In the case of straightforward PWM modulation of all unit elements, the total combined output
is also PWM modulated. This is fully equivalent to direct Uniform PWM (UPWM) modulation
of the input signal, which is well known to introduce distortion [24]. Also, although being
ISI-free due to equal switching in each sample regardless of input value (the PWM can be seen
as a sort of RTZ-code), the output transitions are very large and the output is very susceptible
to dynamic errors from clock jitter.
These shortcomings can however be overcome by skewing and rotating each unit element, as
shown in fig. 22.
[Diagram: PWM waveforms of the unit elements D0..D7 and their sum A = ΣD (levels 0 to 8) for
x[n]=3, x[n+1]=4, x[n+2]=5]
Figure 22: Shifted and rotated PWM scheme [22]

Now, we see that the combined output is no longer PWM modulated; it is recombined to a
linear PCM representation of the digital input amplitude. This means that the UPWM
distortion is eliminated, and the mismatch is completely cancelled because every element
contributes equally regardless of input value. In addition, each element switches on and off
exactly once per sample, so none of them has any ISI. This means that the recombined output
is also completely ISI free. Nor is the output more susceptible to jitter distortion than that of
any straightforward mismatch-shaping encoder.
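The recombination can be checked numerically with a minimal Python sketch. It assumes a constant input code x, a frame of N time slots for N unit elements, and a skew of one slot per element. The function names and the mismatched weights are illustrative assumptions, and the model does not reproduce the rotation across samples used in [25].

```python
def straightforward_pwm(x, n_elements):
    """Every element carries the same pulse: on for the first x slots."""
    return [[1 if slot < x else 0 for slot in range(n_elements)]
            for _ in range(n_elements)]


def shifted_pwm(x, n_elements):
    """Element i carries the same x-slot pulse, cyclically delayed by i slots.

    Assumption: one slot of skew per element, constant input over the frame.
    """
    return [[1 if (slot - i) % n_elements < x else 0
             for slot in range(n_elements)]
            for i in range(n_elements)]


def combined_output(waveforms, weights=None):
    """Weighted sum of all element waveforms, slot by slot (A = sum of D)."""
    if weights is None:
        weights = [1] * len(waveforms)  # ideal, perfectly matched elements
    n_slots = len(waveforms[0])
    return [sum(w * wf[slot] for w, wf in zip(weights, waveforms))
            for slot in range(n_slots)]


def frame_mean(output):
    """Average of the combined output over one sample frame."""
    return sum(output) / len(output)


if __name__ == "__main__":
    n, x = 8, 3
    print(combined_output(straightforward_pwm(x, n)))  # one large PWM pulse
    print(combined_output(shifted_pwm(x, n)))          # flat PCM level

    # Hypothetical mismatched unit weights (their sum is 8.00).
    w = [1.02, 0.97, 1.01, 0.99, 1.03, 0.98, 1.00, 1.00]
    print(frame_mean(combined_output(shifted_pwm(x, n), w)))
```

With ideal elements the shifted scheme gives a flat level equal to x in every slot, whereas the straightforward scheme gives a single pulse of height N. With mismatched weights the frame average is still x/N times the sum of the weights, because every element is on for exactly x of the N slots; only a small ripple within the frame remains.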
To summarize, the benefits of this algorithm include:
• Complete elimination of distortion due to static element mismatch (see footnote 1).
• Complete elimination of ISI without RTZ or similar dedicated schemes.
• No increase in jitter sensitivity.
These are major advantages, and indeed the converter reported in [23] has exceptional
performance, achieving 115dB SNDR in a 0.18µm process with very low power
consumption. However, there are some disadvantages.
The first is obviously the usable range. Due to the time resolution of the PWM modulation,
the maximum clock frequency used to align the element PWM pulses will be given by:
$$f_{PWM} = f_s \cdot OSR \cdot 2^{N_{bits}} \qquad (24)$$
This means that for high-resolution devices, which rely on either high oversampling or a high
number of bits, the scheme is not feasible for bandwidths in the MHz range. However, for
audio sample rates (typically 48kHz or 96kHz), high inband SQNR can still be achieved.
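As a quick check of (24), the following Python sketch evaluates the PWM clock for the audio figures quoted in this section (48kHz input rate, OSR of 128, 5-bit and 6-bit quantization). The function name `pwm_clock_hz` is only an illustrative choice.

```python
def pwm_clock_hz(fs_hz, osr, n_bits):
    """PWM alignment clock from equation (24): f_s * OSR * 2**N_bits."""
    return fs_hz * osr * 2 ** n_bits


if __name__ == "__main__":
    print(pwm_clock_hz(48_000, 128, 6))  # 393216000, i.e. about 393 MHz
    print(pwm_clock_hz(48_000, 128, 5))  # 196608000, i.e. about 197 MHz
```

Each extra quantizer bit doubles the required clock, which is why the scheme runs out of headroom quickly as resolution or bandwidth increases.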
The second major disadvantage is illustrated in fig.23. We can see that if the input jumps
more than 1 LSB from one sample to the next, the elements it skips (in this case D4) are
switched on and off more often than the others. This means the ISI-eliminating property is
lost.
Footnote 1: Actually, mismatch in an element will produce a very small UPWM-type distortion
contribution as shown in [22]. However, this is a very low order effect that is not significant
for any reasonable mismatch levels.
Figure 23: Illustration of ISI elimination being lost if |x[n]-x[n-1]|>1 (combined output A = ΣD).
Since ISI is a dynamic error source that can create severe distortion, the converter published
in [25] is designed so that the output of the delta-sigma modulator is not allowed to change by
more than 1 LSB per sample. This mandates a very conservative NTF, which reduces inband
SQNR.
To guarantee the property $x[n] = x[n-1] \pm 1$, the converter in [25] additionally uses a hard
slew-limiter inside the delta-sigma loop. Simulations suggest that for the delta-sigma loop to
remain stable with this slew-limiter, $\|NTF\|_\infty$ must be well below 1.5, perhaps as low as 1.2.
With an oversampling ratio of 128, a 48kHz input sampling rate and 6-bit quantization, this
gives around 130dB inband SQNR, which is sufficient. However, the maximum clock
frequency dictated by (24) is then almost 400MHz. In [25], an OSR of 128 and a 5-bit ∆Σ
modulator are used, and around 127dB SQNR is reported. To achieve SQNR in the 140dB
range, a clock speed of several GHz would be necessary, so the achievable performance and
bandwidth are limited by this requirement. For audio applications, however, the scheme has
provided the best performance reported to date [25].
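The idea of a slew-limiter placed inside the loop can be illustrated with a minimal Python sketch. It is not the modulator of [25]: it assumes a first-order error-feedback loop with NTF = 1 - z^-1, a quantizer with levels 0 to `n_levels`, and a limiter that clips each output step to at most one LSB. All names are hypothetical.

```python
import math


def slew_limited_dsm(u, n_levels, start=0):
    """First-order error-feedback modulator with a hard slew-limiter.

    Assumed model: v[n] = u[n] - e[n-1], y[n] = Q(v[n]), e[n] = y[n] - v[n].
    The limiter is part of Q, so the error it introduces is fed back and
    shaped together with the ordinary quantization error.
    """
    out = []
    prev = start
    err = 0.0
    for sample in u:
        v = sample - err                      # input minus previous error
        y = round(v)                          # multibit quantizer
        y = max(prev - 1, min(prev + 1, y))   # hard slew limit: one LSB step
        y = max(0, min(n_levels, y))          # keep within the DAC range
        err = y - v                           # total error, fed back next step
        out.append(y)
        prev = y
    return out


if __name__ == "__main__":
    # Slow sine around mid-scale of a 9-level (8-element) DAC.
    u = [4 + 3 * math.sin(2 * math.pi * k / 256) for k in range(1024)]
    y = slew_limited_dsm(u, n_levels=8, start=4)
    print(max(abs(a - b) for a, b in zip(y[1:], y[:-1])))  # never exceeds 1
```

Because the limiter acts before the error is computed, the loop sees the limiting as additional quantization error. This is the reason a conservative NTF is needed: an aggressive NTF would ask for output steps larger than the limiter allows.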
[1]: R.J. van de Plassche, "Dynamic element matching for high accuracy monolithic DA
converters", IEEE J. Solid-State Circuits, vol. SC-11, pp. 795-800, Dec. 1976.
[2]: L.R. Carley, "A noise shaping coder topology for 15+ bit converters", IEEE J. Solid-
State Circuits, vol. SC-24, pp. 267-273, Apr. 1989.
[3]: B.H. Leung and S. Sutarja: "Multibit Sigma-Delta A/D Converter Incorporating A
Novel Class of Dynamic Element Matching Techniques," IEEE Trans. Circuits and
Systems-II: Analog and Digital Signal Processing, vol. 39, No. 1 (1992), pp. 35-51.
[4]: R.T.Baird, T.S.Fiez, "Linearity enhancement of multi-bit ∆-Σ A/D and D/A converters
using data weighted averaging", IEEE Trans. Circuits & Systems II, CASII-42, pp.
753-762. 12/95.
[5]: O.J.A.P. Nys, R.K. Henderson: “An analysis of dynamic element matching techniques
in sigma-delta modulation”, Proc. IEEE Int. Symp. Circuits and Systems, ISCAS '96,
'Connecting the World', vol. 1, 12-15 May 1996.
[6]: R. K. Henderson and O. Nys, "Dynamic element matching techniques with arbitrary
noise shaping function”, Proc. IEEE Int. Symp. Circuits and Systems, ISCAS’96, May
1996, pp. 293--296.
[7]: K.D. Chen and T.H. Kuo: “An Improved Technique for Reducing Baseband Tones in
Sigma-Delta Employing Data Weighted Averaging Algorithms Without Adding
Dither”, IEEE Trans. Circuits and Systems II, vol.46, no.1, pp 53-68, 1999.
[8]: M. Vadipour: “Techniques for Preventing Tonal Behaviour of Data Weighted
Averaging Algorithm in ∆Σ-Modulators”, IEEE Trans. Circuits and Systems II, vol.47,
no.11, pp 1137-1144, Nov.2000.
[9]: A.A. Hamoui and K. Martin, "Linearity enhancement of multibit ∆Σ modulators using
pseudo data-weighted averaging," Proc. IEEE International Symp. Circuits and
Systems ISCAS’02, May 2002, pp. III 285-288.
[10]: Xue-Mei Gong; Gaalaas, E.; Alexander, M.; Hester, D.; Walburger, E.; Bian, J.: “A
120 dB multi-bit SC audio DAC with second-order noise shaping”, ISSCC Digest of
Technical Papers. ISSCC. 2000, 7-9 Feb. 2000, Page(s):344 - 345, 469
[11]: I. Galton, "Spectral shaping of circuit errors in digital-to-analog converters,"
IEEE Trans. Circuits and Systems II: Analog and Digital Signal Processing, vol. 44,
no. 10, pp. 808-817, Nov., 1997.
[12]: J. Welz, I. Galton, E. Fogleman, "Simplified logic for first-order and second-order
mismatch-shaping digital-to-analog converters," IEEE Trans. Circuits and Systems II:
Analog and Digital Signal Processing, vol. 48, no. 11, pp. 1014-1028, Nov. 2001.
[13]: E. Fogleman, J. Welz, I. Galton, "An audio ADC Delta-Sigma modulator with 100-dB
peak SINAD and 102-dB DR using a second-order mismatch-shaping DAC," IEEE J.
Solid-State Circuits, vol. 36, no. 3, p.339-348, March 2001.
[14]: E.N. Aghdam, P. Benabes: “A New Mixed Stable DEM Algorithm for Bandpass
Multibit Delta-Sigma ADC”, Proc. ICECS2003, pp. 962-5.
[15]: E.N. Aghdam, P. Benabes: “Higher Order Dynamic Element Matching by Shortened
Tree-Structure in Delta-Sigma Modulators”, Circuit Theory and Design, 2005.
Proceedings of the 2005 European Conference, Vol. 1, pp: I/201- I/204, Sept. 2005
[16]: R. Schreier, B. Zhang: “Noise-Shaped Multibit D/A Converter Employing Unit
Elements”, Electronic Letters, vol.31, no.20, pp 1712-1713, Sept.28, 1995.
[17]: T. Shui et al.: “Mismatch-Shaping DAC for Lowpass and Bandpass Multi-bit Delta-
Sigma Modulators”, Proc. IEEE Int. Symp. Circuits and Systems, ISCAS’98, May 1998.
[18]: A. Yasuda, H. Tanimoto, T. Iida: “A 100kHz, 9.6mW Multi-bit ∆Σ DAC and ADC
Using Noise Shaping Dynamic Element Matching With Tree Structure”, IEEE ISSCC
Dig. of Tech. Papers, vol.41, pp.64-65, Feb.1998.
[19]: T.W. Kwan, R.W. Adams, R.Libert: “A Stereo Multibit Sigma Delta DAC with
Asynchronous Master-Clock Interface”, IEEE J. Solid State Circuits, vol.31, no.12,
pp. 1881-1887, Dec.1996.
[20]: R.Adams, K.Nguyen, K.Sweetland: “A 113dB SNR Oversampling DAC with
Segmented Noise-Shaped Scrambling”, IEEE ISSCC Dig. of Tech Papers, vol.41,
pp.62-63, Feb.1998.
[21]: J. Steensgaard-Madsen: “High Performance Data Converters”, Ph.D. thesis, Technical
University of Denmark, Department of Information Technology, March 1999.
[22]: A. Fishov, E. Siragusa, J. Welz, E. Fogleman, I. Galton, "Segmented mismatch-
shaping D/A conversion," Proc. of the IEEE International Symp. Circuits and Systems,
May 2002.
[23]: K.L. Chan, and I. Galton, “A 14b 100MS/s DAC with fully segmented dynamic
element matching,” ISSCC Dig. Tech. Papers, pp.258-259, Feb.2006.
[24]: K. Nielsen: “Audio Power Amplifier Techniques With Energy Efficient Power
Conversion”, Ph.D. Thesis, Technical University of Denmark, Department of Applied
Electronics, April 1998.
[25]: D. Reefman, J. vd Homberg, E. v Tuijl, C. Bastiaansen, L. vd Dussen: “A New Digital-
to-Analogue Converter Design Technique for HiFi Applications”, Presented at the
AES 114th Convention, Convention Paper 5846, March 2003.
[26]: T. Rueger et.al.: “A 110dB Ternary PWM Current-Mode Audio DAC with
Monolithic 2Vrms Driver,” ISSCC Dig. Tech. Papers, Feb.2006.
