1. Introduction
2. What is Hierarchical DFT?
   2.1 Modular vs. Hierarchical
   2.2 Motivation for Hierarchical DFT
3. Hierarchical DFT Background Information
   Why register I/Os?
   Testing inside the block
   How do top level logic and block interconnects get tested?
   What if all my I/Os are NOT registered?
   Are there any special considerations for At-Speed test?
   Handling the core chains
   How many chains should there be?
   Block Level Pattern Generation
4. SOC Level Implementations
   Top Level Logic Insertion
   Building Phase Level Netlists
   SOC Level Pattern Generation
Rev 1.1
April 5, 2006
Page 1 of 21
1. Introduction This app note describes one way in which Hierarchical DFT can be
implemented in a block-based design flow. Section 2, What is Hierarchical DFT?, defines
hierarchical DFT, explains why it is being implemented, and outlines some of the costs and
benefits. Section 3, Hierarchical DFT Background Information, details how testing
hierarchically differs from more traditional approaches and what DFT features must be
implemented at the block level. Section 4, SOC Level Implementations, covers the features
that need to be implemented at the SOC level. For those who wish to skip the background
information contained in the first four sections, Section 5, Block Level Scan Chain Insertion,
provides sample dofiles and test procedure files for use in DFTAdvisor. These examples
demonstrate how the scan chains should be inserted at the block level. Section 6, TestKompress
Insertion, covers the specific TestKompress commands needed in a hierarchical flow. Section 7,
SOC Level Integration, addresses the integration of blocks at the SOC level. Pattern generation
(both stuck-at and transition) is described with example scripts in Section 8, SOC Level Pattern
Generation. Pattern verification is briefly addressed in Section 9. The Conclusions section
summarizes the benefits of a hierarchical approach to DFT and the additional work required over
a more traditional approach.
2. What is Hierarchical DFT? Hierarchical DFT implies many things, but fundamentally
it refers to inserting DFT at the block level in a manner that makes chip level integration of DFT
easier.
2.1 Modular vs. Hierarchical Modular applies only to TestKompress implementations
and is slightly different from hierarchical. The modular capability of TestKompress
allows the user to put a complete TestKompress engine inside a block in the IP
generation phase. An SOC design would contain multiple blocks with multiple, different
TestKompress configurations. In the pattern generation phase, modular TestKompress
generates patterns for all the blocks simultaneously. This implies that all the
TestKompress channels for each block are wired more or less directly to the SOC pins.
Hierarchical is differentiated from modular in that additional block level work is done
to isolate blocks from one another. This allows blocks to share SOC level pins for scan
chain access. Blocks can be tested individually (in a serial fashion) or in groups of
blocks. When implementing TestKompress in a hierarchical SOC design, modular
TestKompress is employed in order to be able to test multiple blocks at a time but it is not
necessary to test ALL the blocks simultaneously. Hierarchical DFT techniques can also
be used in a non-TestKompress (i.e. traditional) scan/ATPG flow.
2.2 Motivation for Hierarchical DFT - The motivation for doing this work at the block
level is to stay consistent with the rest of the steps in the design flow (RTL, synthesis and
layout) which are also done at the block level. Hierarchical DFT also provides a
framework which allows cores to be re-used in other designs. Here is a more complete
list of the benefits of this DFT flow:
• Flow compatibility with other design steps makes scheduling the completion of
  DFT tasks more predictable
• Allows for core re-use across multiple SOC design teams
• Reduced tool capacity requirements
• Faster tool runtime
• ECOs (Engineering Change Orders) on individual blocks only require new test
  patterns for those blocks. Pattern sets for other blocks are not affected.
• Lower power consumption on the tester by testing individual blocks or small groups
  of blocks
• Block isolation results in quicker identification of problems on the tester
• Masking a problem block on the tester (e.g. to address scan chain shift timing
  problems) only affects that block. Other isolated blocks are still fully tested. This
  is particularly valuable when screening initial prototype parts.
• Faster production test time
Some of the costs associated with this approach include:
• More rigidly enforced design practices
• Additional up-front planning for DFT
• Additional top level routing overhead
3. Hierarchical DFT Background Information
[Figure 1: SOC design containing Blocks 1 through 12, a Test Control block, and a Clock Gen. block]
Figure 2 illustrates what the scan chains would look like in a block that has followed these
hierarchical design guidelines.
[Figure 2: block with registered functional I/O, partition scan chains, and core scan chains]
top level logic and other blocks) and output Partition cells launch data (to be captured by top
level logic or the input Partition cells of other blocks).
[Figure: Output Partition Chain, Top Level Glue Logic, Input Partition Chain]
[Figure: Output Partition Chain, Outlier Logic, Top Level Glue Logic, Input Partition Chain]
[Figure: launch flop and capture flop (FF1, FF2, FF3) with SI, SE, and QB pins. States after loading scan chain and before two at-speed clocks.]
behind it to provide the transition value. The result is that all the logic between the input
partition cell and the next point of registration cannot be tested for at-speed defects.
[Figure: launch and capture flops (FF1, FF2) shown with the input port, core logic, and output port]
When testing inside a block, the previous input partition scan cell serves as the transition origin
so long as SE is held active during capture. SE for core and output partition cells must be
inactive in order to capture.
[Figure: transition origin, launch flop, and capture flop shown with core logic, top level glue logic, input port, and output port]
for the core chains separate from the two scan_enables used for the input and output partition
chains. The behavior of this signal though is the same as for normal scan chains for both stuck-at
and at-speed patterns.
How many chains should there be? The number of partition and core chains is driven by the
same planning process used for non-hierarchical implementations. In a normal ATPG flow, you
target whatever number of chip level pins are available for scan. The difference is that in a
hierarchical implementation you may have more chip level pins to use because the
resources can be shared across all the blocks. In a TestKompress flow, the number of chains is
driven by the compression goals and the ratio of channels to chains being targeted. In all cases,
partition and core chains should be as balanced in length as possible.
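The balancing goal can be illustrated with a short Python sketch. It is not part of the DFT tools: the function names and the example cell counts are hypothetical, chosen only to show how scan cells can be spread so that chain lengths differ by at most one, and how a channel count and a target chains-per-channel ratio translate into a chain count in a compression flow.

```python
def balanced_chain_lengths(num_cells: int, num_chains: int) -> list[int]:
    """Spread num_cells scan cells over num_chains chains so lengths differ by at most 1."""
    if num_chains <= 0:
        raise ValueError("num_chains must be positive")
    base, extra = divmod(num_cells, num_chains)
    # The first `extra` chains each take one additional cell.
    return [base + 1 if i < extra else base for i in range(num_chains)]


def chains_for_compression(num_channels: int, chains_per_channel: int) -> int:
    """Chain count implied by a channel count and a target chains-per-channel ratio."""
    return num_channels * chains_per_channel


if __name__ == "__main__":
    # Hypothetical block: 1070 scan cells (core plus partition) over 8 chains.
    print(balanced_chain_lengths(1070, 8))
    # Hypothetical compression target: 4 channels, 10 chains per channel.
    print(chains_for_compression(4, 10))
```

In practice the scan insertion tool does the balancing; the sketch only shows the arithmetic behind choosing a chain count up front.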
Figure 2 illustrated a very simple partition scan chain implementation. Figure 10 is a more
accurate representation of what the block level chains would look like after addressing the
at-speed requirements for partition scan chain insertion. For convenience, the core chains
shown are not balanced with the partition chains, but the assumption is that you would balance
them.
[Figure 10: block level scan chains after addressing the at-speed requirements; signals shown include Scan_in1 through Scan_in7, Scan_out1 through Scan_out7, Scan_en, Inputreg_scan_en, and Outputreg_scan_en]
4. SOC Level Implementations Once all the block level work is done to provide
isolation, there are a number of chip level implementations that can be employed to take
advantage of that effort. What is described here is a method by which blocks can be tested either
individually or in groups by sharing chip level scan chains through some muxing logic. This
approach will be referred to as serial test scheduling.
Top Level Logic Insertion To implement serial test scheduling, it is necessary for the SOC
level designer to create and insert muxing logic to control access from scan chain/channel outputs
to chip level outputs and a decoder block to control the mux selection. Chip level inputs need to
be wired in parallel to the scan chain/channel inputs of each block as well as the muxing and
decode logic. At this time, this is a manual, user-defined process. Since this logic is not
complex, some customers just treat this as another top level function and implement it in the top
level RTL. Figure 11 represents how the SOC design from Figure 1 would look when the blocks
are scanned hierarchically and top level muxing logic is added to share chip level scan chains.
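The behavior of the decoder and output muxing can be modeled in a few lines of Python. This is a behavioral sketch under stated assumptions, not RTL and not a tool feature: the function names are hypothetical, and the phase map is illustrative apart from phase 000 selecting blocks 1 and 2, which follows the example used in this note.

```python
def decode_phase(scan_phase_sel: int, phase_map: dict[int, list[str]]) -> list[str]:
    """Decoder: return the blocks whose chain outputs are selected in this phase."""
    if scan_phase_sel not in phase_map:
        raise ValueError(f"unused phase select value: {scan_phase_sel:03b}")
    return phase_map[scan_phase_sel]


def mux_scan_outputs(
    scan_phase_sel: int,
    phase_map: dict[int, list[str]],
    block_outputs: dict[str, list[int]],
) -> list[int]:
    """Mux: forward the chain outputs of the selected blocks to the chip level outputs.

    block_outputs maps a block name to the bits currently on its chain outputs.
    """
    chip_outputs: list[int] = []
    for block in decode_phase(scan_phase_sel, phase_map):
        chip_outputs.extend(block_outputs[block])
    return chip_outputs


if __name__ == "__main__":
    # Phase 000 -> blocks 1 and 2 (from this note); phase 001 is a hypothetical entry.
    phase_map = {0b000: ["block1", "block2"], 0b001: ["block3", "block4"]}
    # Assumption: each block exposes four chain outputs, filling eight chip level chains.
    block_outputs = {
        "block1": [1, 0, 1, 1],
        "block2": [0, 0, 1, 0],
        "block3": [1, 1, 1, 1],
        "block4": [0, 0, 0, 0],
    }
    print(mux_scan_outputs(0b000, phase_map, block_outputs))
```

Rejecting unused select values in the model mirrors a design choice worth making in the RTL as well: an undefined phase should not silently connect an arbitrary block to the chip pins.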
[Figure 11: SOC design with Blocks 1 through 12, Test Control, and Clock Gen.; Chain1 through Chain8 are shared at the chip level and selected by Scan_phase_sel(2:0)]
[Figure 12: Scan_phase_sel(2:0) = 000, with Block 1 and Block 2 connected to Chain1 through Chain8]
[Figure: Scan_phase_sel(2:0) = 110, with Blocks 1 through 12 shown together with Chain1 through Chain8]
5. Block Level Scan Chain Insertion This section provides details on how to insert
partition chains and core chains for hierarchical DFT as described in the previous section. The
important DFTAdvisor commands are highlighted and a sample dofile is given.
DFTAdvisor has the ability to identify and insert partition chains. Recent enhancements have
been made to support the specific structure required for hierarchical testing. The specific
command used to do this is as follows:
SETup PArtition SCan
[-EXClude <pin_name .>]
[-INPUT_NUMber <integer> | -INPUT_MAX_length <integer>]
[-OUTPUT_NUMber <integer> | -OUTPUT_MAX_length <integer>]
[-INPUT_SEN <name>]
[-OUTPUT_SEN <name>]
Defined clocks and constrained pins are automatically excluded from the Partition chains. This
command can exclude any other I/Os using the -EXClude switch. In order to achieve properly
balanced scan chains it may be necessary to have multiple input and output Partition chains. This
is achieved by using -INPUT_NUMber or -INPUT_MAX_length for the input Partition
chains and -OUTPUT_NUMber or -OUTPUT_MAX_length for the output Partition chains.
Partition Chain Insertion - It is necessary to perform scan insertion in two passes. The first
pass will insert the Partition Scan Chains and then a subsequent pass will target the rest of the
scan cells in the core of the block. When running this second pass of scan insertion it is
necessary to specify the Partition Chains inserted in the first pass so that the scan cells on these
chains are ignored during the second pass of scan insertion and that a final ATPG Setup can be
properly generated. What follows is an example dofile to insert the partition scan chains:
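// Define the clocks (off state 0) and hold test_enable at 1
add clock 0 clk1
add clock 0 clk2
add pin constraint test_enable c1
// Library cells available for lockup latches and inverters
add cell model LATX -type dlat G D
add cell model INVX -type inv
set lockup latch on
// Four input and four output partition chains, each group with its own
// scan enable; in1 and in2 are excluded from the partition chains
setup partition scan -exclude in1 in2 -input_num 4 \
-output_num 4 \
-input_sen input_partition_scan_en \
-output_sen output_partition_scan_en
set system mode dft
run
insert test logic -balance_subchain -clock merge \
-edge merge -output new
report scan chains
report test logic
// Save the ATPG setup and netlist for the core chain insertion pass
write atpg setup partition_chains -replace
write netlist block_partition_chains.v -verilog -replace
exit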
Core Chain Insertion - The next step is to re-invoke DFTAdvisor on the block_partition_chains.v
netlist to insert the core chains. Use the setup dofile generated during the partition chain
insertion run in order to identify the existing partition chains. Here is what the core scan chain
insertion dofile would look like:
dofile partition_chains.dofile
7. SOC Level Integration The insertion and integration of the SOC level logic to select
the block level scan chains in phases is a manual process. All block level input scan chains
(channels in TestKompress flow) are driven in parallel from the SOC input pins. The block level
output chains/channels go through muxing logic before connecting to the SOC output pins.
Control of the mux selection can be done via direct SOC input pins or by loading internal
registers. Other features to consider would be tying the scan inputs of non-targeted blocks to a
constant value in order to reduce power consumption during shifting.
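The parallel input fan-out with tie-off for non-targeted blocks can be sketched as follows. This is a behavioral illustration only: the function name, the block names, and the choice of 0 as the constant are assumptions, not something the tools generate.

```python
def drive_block_scan_inputs(
    chip_scan_in: list[int],
    blocks: list[str],
    targeted: set[str],
    tie_value: int = 0,
) -> dict[str, list[int]]:
    """Fan the chip level scan inputs out to every block.

    Targeted blocks see the chip level values; every other block sees a
    constant, so its chains shift a steady value instead of toggling data.
    """
    return {
        block: list(chip_scan_in) if block in targeted
        else [tie_value] * len(chip_scan_in)
        for block in blocks
    }


if __name__ == "__main__":
    # Hypothetical shift cycle with blocks 1 and 2 targeted.
    inputs = drive_block_scan_inputs(
        chip_scan_in=[1, 0, 1, 1],
        blocks=["block1", "block2", "block3"],
        targeted={"block1", "block2"},
    )
    print(inputs)
```

The gating term would typically come from the same decoder that drives the output mux select, so no additional chip level pins are needed for it.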
8. SOC Level Pattern Generation This section describes how to build phase level
netlists when invoking FastScan or TestKompress. Example dofiles and test procedure files are
shown for stuck-at pattern generation and transition pattern generation. Since pattern re-use is
not automated by the tools at this time, it is necessary to regenerate all patterns from the SOC
level. One advantage to the hierarchical division of the design is that the pattern generation for
multiple phases can be run simultaneously. Even though the patterns may be delivered on the
tester in a serial fashion, the pattern generation for each phase can be run in parallel.
8.1 Building Phase Level Netlists A significant reduction in machine capacity and ATPG
tool runtime can be realized by building a different netlist for each phase that contains
only the descriptions for blocks being targeted in that phase. It is not necessary to
actually create separate netlists; this can be achieved by selectively loading block level
netlists when invoking the tool. The assumption here is that the SOC level netlist
contains the complete definition of all top level logic but only instantiations of the blocks.
Block level descriptions exist as separate, stand-alone netlists. The following example
shows how FastScan should be invoked for the first phase of pattern generation, which
targets blocks 1 and 2 as illustrated in Figure 12:
$MGC_HOME/bin/fastscan \
./netlists/soc_level.v \
./netlists/block1.v \
./netlists/block2.v \
-verilog \
-top soc_chip \
-lib ./libs/tsmc13.mdt \
-dofile ./dofiles/phase0_stuck_at.dofile \
-log ./logs/phase0_stuck_at.log \
-replace \
-nogui
The key points are that the top level description must be included (soc_level.v) as well as
the targeted blocks (block1.v and block2.v). What is not shown is that the verilog
descriptions for blocks 3-12 are excluded and therefore must be black-boxed once in the
tool. Be sure to specify the correct top module as well. Subsequent phases would be
invoked in the same manner. The soc_level.v must always be included but different
block level netlists will be substituted. For the top level testing phase, all netlists must be
included.
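Since every phase is invoked the same way with a different set of block netlists, the command line can be assembled by a small script. The sketch below is one possible way to do that in Python; the function name is hypothetical, the paths and switches are copied from the invocation shown above, and the file naming for phases other than phase 0 is an assumption.

```python
def fastscan_command(phase: int, blocks: list[int], fault_model: str = "stuck_at") -> list[str]:
    """Build the FastScan argument list for one phase, loading only the targeted blocks."""
    # The top level netlist is always included; block netlists vary per phase.
    netlists = ["./netlists/soc_level.v"]
    netlists += [f"./netlists/block{n}.v" for n in blocks]
    run_name = f"phase{phase}_{fault_model}"
    return [
        "$MGC_HOME/bin/fastscan",  # expanded by the shell when the command is run
        *netlists,
        "-verilog",
        "-top", "soc_chip",
        "-lib", "./libs/tsmc13.mdt",
        "-dofile", f"./dofiles/{run_name}.dofile",
        "-log", f"./logs/{run_name}.log",
        "-replace",
        "-nogui",
    ]


if __name__ == "__main__":
    # Phase 0 targets blocks 1 and 2, as in the example above.
    print(" \\\n    ".join(fastscan_command(0, [1, 2])))
```

Because the phases are independent, a wrapper like this also makes it straightforward to launch the pattern generation runs for several phases at the same time.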
8.2 Stuck-at Pattern Generation The dofile required for pattern generation (either
FastScan or TestKompress) has a couple of extra requirements associated with the
hierarchical implementation. The following dofile carries forward the same example
shown for building the phase level netlists. This phase0_stuck_at.dofile generates
stuck-at patterns for phase0, which targets blocks 1 and 2. The hierarchical specific commands
are highlighted in green:
    end;
end;

procedure shift =
    scan_group grp1 ;
    timeplate gen_tp2 ;
    cycle =
        force_sci ;
        force edt_update 0;
        measure_sco ;
        pulse clka ;
        pulse clkb ;
    end;
end;

procedure load_unload =
    scan_group grp1 ;
    timeplate gen_tp2 ;
    cycle =
        force clka 0 ;
        force clkb 0 ;
        //all scan_enable signals must be active
        //in order to shift
        force scan_en 1 ;
        force input_partition_scan_en 1 ;
        force output_partition_scan_en 1 ;
    end ;
    apply shift 200;
end;
Each phase requires a separate dofile and test procedure file. The only things that change
for each phase are the pin constraints for the phase_select signals. A corresponding
change must be made in the test_setup portion of the test procedure file. Note that if
Engineering Change Orders (ECOs) are performed on a particular block, it is only
necessary to regenerate patterns for the phases that contain that block. You do not have
to regenerate all the patterns.
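The ECO rule above amounts to a lookup from changed blocks to the phases that contain them. A minimal Python sketch, assuming a phase-to-block mapping is kept alongside the dofiles (the function name and every mapping entry other than phase 0 are hypothetical):

```python
def phases_to_regenerate(
    phase_blocks: dict[int, set[str]],
    changed_blocks: set[str],
) -> list[int]:
    """Return the phases whose patterns must be regenerated after an ECO."""
    # A phase is affected if it targets at least one changed block.
    return sorted(
        phase for phase, blocks in phase_blocks.items()
        if blocks & changed_blocks
    )


if __name__ == "__main__":
    # Phase 0 targets blocks 1 and 2 (from this note); phases 1 and 2 are illustrative.
    phase_blocks = {
        0: {"block1", "block2"},
        1: {"block3", "block4"},
        2: {"block5", "block6"},
    }
    print(phases_to_regenerate(phase_blocks, {"block4"}))
```

Note that a top level testing phase loads all netlists, so if such a phase is included in the mapping with every block listed, it will correctly show up as affected by any block ECO.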
8.3 Transition Pattern Generation For transition patterns there are additional
requirements: constrain input_partition_scan_en active when testing at the block
level, and constrain output_partition_scan_en active when testing the top level logic and
interconnects. The CT constraint should be used for these signals because SE is not
expected to run at functional frequencies and should therefore be removed from the fault
list. When automatically black-boxing blocks, use the -auto Z option. Since the tool
cannot always distinguish inputs from outputs, using the Z option avoids a situation in
which a black box may try to drive what it thinks is a bus with an X value, possibly
resulting in tracing rule violations. The following dofile example is for Phase0 transition
patterns:
add scan chains chain0 grp1 scan_in[0] scan_out[0]
add scan chains chain1 grp1 scan_in[1] scan_out[1]
add scan chains chain2 grp1 scan_in[2] scan_out[2]
add scan chains chain3 grp1 scan_in[3] scan_out[3]
add scan chains chain4 grp1 scan_in[4] scan_out[4]
add scan chains chain5 grp1 scan_in[5] scan_out[5]
add scan chains chain6 grp1 scan_in[6] scan_out[6]
add scan chains chain7 grp1 scan_in[7] scan_out[7]
add scan chains chain8 grp1 scan_in[8] scan_out[8]
just loads in the various fault lists generated for each phase. The key is to use the -protect
switch so that detected faults do not get overwritten as AU faults in a subsequent
phase. The tool takes care of summing up the test coverage for the entire design. Here is
an example dofile for running the top level fault grade for stuck-at faults:
add scan groups grp1 ./scripts/phase0.testproc
add scan chains chain0 grp1 scan_in[0] scan_out[0]
add scan chains chain1 grp1 scan_in[1] scan_out[1]
add scan chains chain2 grp1 scan_in[2] scan_out[2]
add scan chains chain3 grp1 scan_in[3] scan_out[3]
add scan chains chain4 grp1 scan_in[4] scan_out[4]
add scan chains chain5 grp1 scan_in[5] scan_out[5]
add scan chains chain6 grp1 scan_in[6] scan_out[6]
add scan chains chain7 grp1 scan_in[7] scan_out[7]
add scan chains chain8 grp1 scan_in[8] scan_out[8]
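The effect of protecting detected faults while loading per-phase fault lists can be pictured with a small Python model. This is only a conceptual sketch, not the tool's algorithm or file format: the function names, the fault names, and the "DS" label for a detected fault are assumptions (only the AU class is named in this note), and the coverage figure here is a plain detected fraction rather than the tool's test coverage calculation.

```python
DETECTED = "DS"    # assumed label for a detected fault
UNTESTABLE = "AU"  # fault class named in this note


def merge_fault_lists(phase_lists: list[dict[str, str]]) -> dict[str, str]:
    """Merge per-phase fault lists without downgrading already detected faults."""
    merged: dict[str, str] = {}
    for faults in phase_lists:
        for fault, status in faults.items():
            if merged.get(fault) == DETECTED:
                continue  # protected: a later phase cannot overwrite a detection
            merged[fault] = status
    return merged


def detected_fraction(merged: dict[str, str]) -> float:
    """Fraction of listed faults that ended up detected (simplified metric)."""
    if not merged:
        return 0.0
    return sum(1 for s in merged.values() if s == DETECTED) / len(merged)


if __name__ == "__main__":
    # Hypothetical fault lists: block1 is black-boxed in phase 1, so its faults read AU there.
    phase0 = {"block1/u1/A sa0": DETECTED, "block3/u7/Z sa1": UNTESTABLE}
    phase1 = {"block1/u1/A sa0": UNTESTABLE, "block3/u7/Z sa1": DETECTED}
    merged = merge_fault_lists([phase0, phase1])
    print(merged, detected_fraction(merged))
```

Without the protection step, the order in which the phase fault lists are loaded would change the final result, which is exactly the situation the switch is meant to avoid.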
9. Pattern Verification Gate level simulation with timing back-annotated can benefit from
the same flow used for pattern generation. By black boxing non-targeted blocks in the simulator,
the simulation times are significantly reduced.
10. Conclusions There are many variations on how a hierarchical DFT methodology can be
implemented. This application note is intended to show one of those variations and how the DFT
tools support this particular flow. For block level tasks, there is a high level of tool automation in
DFTAdvisor to properly insert the scan chains and in TestKompress as well if that is part of the
flow. The top level logic required for muxing the scan chain/channel access is simple in concept
but there is currently no tool automation to generate or integrate that logic. Pattern generation is
broken up into several phases and has some additional pin constraints to be considered
particularly with regard to at-speed testing.
The primary benefit of this methodology is the ability to finalize DFT tasks earlier in the
schedule of a hierarchically designed SOC. Core re-use is also simplified because it is no longer
necessary to balance scan chains between blocks if they are tested individually. Test quality
goals, and even functional requirements, are often sacrificed in order to meet schedule.
Completing DFT before it gets in the critical schedule path means test quality goals need not be
compromised. Additional benefits of this hierarchical approach include reduced tool capacity
requirements, faster tool runtime, less impact from late ECOs, quicker verification process and
easier debug on the tester.