
Cell-Based IC Physical Design and Verification

- SOC Encounter
Cell-Based Design Flow

[Figure: cell-based design flow, from Verilog/VHDL through synthesis (gate-level netlist), place and route (routed design, GDSII), layout replacement, adding LVS/Nanosim text, DRC/LVS, and post-layout simulation to tape out.]

SOC Encounter P&R flow

Inputs: netlist (Verilog), timing constraints (SDC), IO constraints

Steps shown in the flow chart:
- IO, P/G placement
- Specify floorplan
- Amoeba placement
- Timing analysis
- Pre-CTS optimization
- Power planning
- Power analysis
- Clock tree synthesis
- Timing analysis
- Post-CTS optimization
- Power route
- SI-driven route
- Timing/SI analysis

Outputs: GDS, netlist, SPEF, DEF

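IO, P/G Placement

[Figure: pad ring around the core, with corner cells Corner1 to Corner4, input pads I1 to I4, output pads O1 to O4, core power pads VDD and VSS, and IO power pads IOVDD and IOVSS.]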


Specify Floorplan

[Figure: core area with its Height and Width marked.]

Floorplan

[Figure: floorplan inside the pad ring (I1 to I4, O1 to O4, VDD, VSS, IOVDD, IOVSS), with blocks M1, M2, and M3 placed in the core.]

Amoeba Placement
Power Planning
Clock Tree Synthesis

[Figure: clock tree distributing CLK to the D flip-flops in the design.]

Power Analysis
Power Route
Add IO Filler
Routing
SDC constraint
-- Create Clock
create_clock [-name clock_name]
             [-period period_value]
             [-waveform edge_list]
             [-add]
             [sources]

[Figure: clock waveform at port I_CLK of CHIP, with a period of 20 and a falling edge at 10.]

create_clock -name CLK1 -period 20 -waveform {0 10} [get_ports I_CLK]


SDC constraint
-- create_generated_clock
create_generated_clock [-add]
                       [-master_clock]
                       [-name clock_name]
                       [-source source_pin]
                       [-multiply_by mult]
                       [-divide_by div]
                       [-duty_cycle percent]
                       [-neg]
                       [-edges edge_list]
                       [-edge_shift edge_shift_list]
                       clock_root_list

[Figure: block Top with input clock I_CLK driving a flip-flop (pins D and QN) that produces div_clk.]

create_generated_clock -name CLK2 -source [get_ports I_CLK] -divide_by 2 [get_pins DF/QN]


SDC constraint
-- set_clock_latency

set_clock_latency [-source]
[-early | -late]
[-min | -max]
latency
pin_or_clock_list

set_clock_latency 2 [get_clocks {CLK1}]


SDC constraint
-- set_clock_uncertainty
set_clock_uncertainty
[-setup | -hold]
[-from clksig_from_list]
[-to clksig_to_list]
[-rise | -fall]
float
pin_or_clock_list

set_clock_uncertainty 0.5 [get_clocks {CLK1}]


SDC constraint
--set_input_delay
set_input_delay delay_value
                [-min] [-max]
                [-rise] [-fall]
                [-clock clock_name]
                [-clock_fall]
                [-add_delay]
                [-network_latency_included]
                [-source_latency_included]
                port_pin_list

[Figure: Design block with inputs In1, In2, ... and clock I_CLK; the waveforms show the delay of In1 .. In7 relative to CLK1.]

set_input_delay 1 -clock [get_clocks {CLK1}] [get_ports {In1}]


SDC constraint
--set_output_delay
set_output_delay delay_value
                 [-min] [-max]
                 [-rise] [-fall]
                 [-clock clock_name]
                 [-add_delay]
                 [-network_latency_included]
                 [-source_latency_included]
                 port_pin_list

[Figure: Design block with output Out1 and clock CLK1; the waveforms show the delay of Out1 relative to CLK1.]

set_output_delay 1 -clock [get_clocks {CLK1}] [get_ports {Out1}]


SDC constraint
--set_drive
set_drive [-min] [-max]
          [-rise] [-fall]
          drive_strength
          port_list

[Figure: In1 driven through 5KΩ; In2 with drive values 3,2,4,3 (rise_min, rise_max, fall_min, fall_max).]

set_drive 1 [get_ports {In1}]


SDC constraint
--set_load
set_load [-min] [-max]
         [-pin_load]
         [-wire_load]
         load_value
         port_list

[Figure: Out1 loaded with 5pf; Out2 loaded with 4~5pf.]

set_load 1 [get_ports {Out1}]


SDC constraint
--set_false_path
set_false_path [{-from | -rise_from | -fall_from} pin_list]
               [{-through | -rise_through | -fall_through} pin_list]
               [{-to | -rise_to | -fall_to} pin_list]
               [-reset_path]
               [-hold | -setup]

set_false_path -from {A}


SDC constraint
--set_multicycle_path
set_multicycle_path {-hold | -setup}
                    {-start | -end}
                    [-reset_path]
                    [{-from | -rise_from | -fall_from} pin_list]
                    [{-through | -rise_through | -fall_through} pin_list]
                    [{-to | -rise_to | -fall_to} pin_list]

set_multicycle_path 2 -from {A} -to {B}


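Static Timing Analysis

- Main steps of STA
  - Break the design into sets of timing paths
  - Calculate the delay of each path
  - Check all path delays to see if the given timing constraints are met
- Four types of paths
  - Primary input (PI) to register
  - Register to register
  - Register to primary output (PO)
  - PI to PO

[Figure: a timing path runs from a start point, through combinational logic, to an end point.]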
Static Timing Analysis

[Figure: example gate network annotated with gate delays, arrival times (AT = 2 and AT = 5 at the inputs), and a required arrival time (RAT = 10) at the output.]

Path-based: the delay of every path is summed and compared with RAT = 10.
  2+2+3 = 7 (OK)
  2+3+1+3 = 9 (OK)
  2+3+3+2 = 10 (OK)
  5+1+1+3 = 10 (OK)
  5+1+3+2 = 11 (Fail)
  5+1+2 = 8 (OK)

Block-based: AT and RAT are propagated to every node. The critical path is determined as the collection of gates with the same negative slack. In this case, there is one critical path with slack = -1.
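
The path-based check above can be sketched in a few lines of Python. This is only an illustration: the delay tuples are copied from the example, while the names `paths`, `RAT`, and `path_slacks` are made up for this sketch and are not part of any STA tool.

```python
# Each tuple lists the delays along one path of the example
# (input arrival time first, then the gate delays).
paths = [
    (2, 2, 3),
    (2, 3, 1, 3),
    (2, 3, 3, 2),
    (5, 1, 1, 3),
    (5, 1, 3, 2),
    (5, 1, 2),
]
RAT = 10  # required arrival time at the output


def path_slacks(paths, rat):
    """Return the slack (RAT minus total path delay) of every path."""
    return [rat - sum(delays) for delays in paths]


slacks = path_slacks(paths, RAT)
worst_slack = min(slacks)
failing = [p for p, s in zip(paths, slacks) if s < 0]

for p, s in zip(paths, slacks):
    status = "OK" if s >= 0 else "Fail"
    print("+".join(map(str, p)), "=", sum(p), f"({status}, slack {s})")
```

The worst slack found this way matches the block-based result, which reports the same critical path without enumerating every path.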
Static Timing Analysis
Cell Delay

Cell delay:       Dcell(I2) = f(Dtransition(I1), Ceq)
Transition delay: Dtransition(I2) = g(Dtransition(I1), Ceq)

Lookup table (index1: input transition, index2: output capacitance):

Output capacitance    Input transition
                      0        0.5      1
0.1                   0.123    0.234    0.456
0.2                   0.222    0.432    0.801

[Figure: inverters I1, I2, and I3 in series. The Vin and Vout waveforms show Dtransition(I1), Dcell(I2), and Dtransition(I2); the load on I2 is modeled as Req and Ceq.]
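
A table like this only gives values at the index points, so a value in between has to be interpolated. The sketch below assumes plain bilinear interpolation, which the slides do not specify; the function and variable names are hypothetical, and only the table values come from the slide.

```python
from bisect import bisect_right

TRANSITIONS = [0.0, 0.5, 1.0]   # index1: input transition
CAPS = [0.1, 0.2]               # index2: output capacitance
TABLE = [                       # one row per capacitance value
    [0.123, 0.234, 0.456],
    [0.222, 0.432, 0.801],
]


def _bracket(axis, x):
    """Return the lower index of the segment containing x and the
    fractional position of x inside that segment."""
    i = min(max(bisect_right(axis, x) - 1, 0), len(axis) - 2)
    frac = (x - axis[i]) / (axis[i + 1] - axis[i])
    return i, frac


def lookup(transition, cap):
    """Bilinear interpolation in TABLE (assumed scheme).
    Points outside the table are extrapolated linearly."""
    i, u = _bracket(TRANSITIONS, transition)
    j, v = _bracket(CAPS, cap)
    low = TABLE[j][i] * (1 - u) + TABLE[j][i + 1] * u
    high = TABLE[j + 1][i] * (1 - u) + TABLE[j + 1][i + 1] * u
    return low * (1 - v) + high * v


print(lookup(0.5, 0.1))             # exact table entry
print(round(lookup(0.25, 0.15), 5)) # value between four table entries
```

At the index points the function returns the table entries unchanged; between them it blends the four surrounding entries.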
Static Timing Analysis
Setup time

- To meet the setup time requirement:
    Trequire >= Tarrival
- Reg to Reg
  - Tarrival = Tclk1 + TDFF1(clk->Q) + TPATH
  - Trequire = Tclk2 - TDFF2(setup)
  - Tslack = Trequire - Tarrival

[Figure: timing diagram referenced to Clk_source. From clk1, TDFF1 + Tpath gives Tarrival; from clk2, Tsetup is subtracted to give Trequire; Tslack is the gap between Tarrival and Trequire.]

Static Timing Analysis
Setup time

- PI to Reg
  - Tarrival = TPI(delay) + TPATH
  - Trequire = Tclk1 - TDFF1(setup)
  - Tslack = Trequire - Tarrival

Static Timing Analysis
Setup time

- Reg to PO
  - Tarrival = Tclk1 + TDFF1(clk->Q) + TPATH
  - Trequire = Tcycle - TPO(output delay)
  - Tslack = Trequire - Tarrival

Static Timing Analysis
Setup time

- PI to PO
  - Tarrival = TPI(delay) + TPATH
  - Trequire = Tcycle - TPO(output delay)
  - Tslack = Trequire - Tarrival

[Figure: timing diagram referenced to Clk_source. TPI + Tpath gives Tarrival; TPO(output delay) is subtracted from the cycle to give Trequire; Tslack is the gap between them.]

Use set_max_delay or set_min_delay to override the STA constraint.
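
As a worked example of the Reg to Reg setup formulas, the sketch below plugs in numbers. All values are invented for illustration (a 20-unit clock with 2 units of latency on both flip-flops), and `setup_slack` is a hypothetical helper, not a tool command.

```python
def setup_slack(t_clk1, t_clk_to_q, t_path, t_clk2, t_setup):
    """Reg to Reg setup slack: Trequire - Tarrival."""
    t_arrival = t_clk1 + t_clk_to_q + t_path   # Tclk1 + TDFF1(clk->Q) + TPATH
    t_require = t_clk2 - t_setup               # Tclk2 - TDFF2(setup)
    return t_require - t_arrival


# Launch edge reaches DFF1 at 2; the capturing edge reaches DFF2 one
# period (20) later, at 22. These numbers are assumptions.
met = setup_slack(t_clk1=2.0, t_clk_to_q=0.5, t_path=12.0,
                  t_clk2=22.0, t_setup=0.3)
violated = setup_slack(t_clk1=2.0, t_clk_to_q=0.5, t_path=20.0,
                       t_clk2=22.0, t_setup=0.3)

print(round(met, 3), round(violated, 3))
```

A non-negative result means Trequire >= Tarrival and the setup check is met; the second call shows a path that is too long for the same clock.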


Static Timing Analysis
Hold time

- To meet the hold time requirement:
    Trequire <= Tarrival
- Reg to Reg
  - Tarrival = Tclk1 + TDFF1(clk->Q) + TPATH
  - Trequire = Tclk2 + TDFF2(hold)
  - Tslack = Tarrival - Trequire

[Figure: timing diagram referenced to Clk_source. From clk1, TDFF1 + Tpath gives Tarrival; from clk2, Thold is added to give Trequire; Tslack is the gap between Trequire and Tarrival.]

Static Timing Analysis
Hold time

- PI to Reg
  - Tarrival = TPI(delay) + TPATH
  - Trequire = Tclk + TDFF(hold)
  - Tslack = Tarrival - Trequire
- Reg to PO
  - Tarrival = Tclk + TDFF(clk->Q) + TPATH
  - Trequire = -TPO(output delay)
  - Tslack = Tarrival - Trequire
- PI to PO
  - Tarrival = TPI(delay) + TPATH
  - Trequire = -TPO(output delay)
  - Tslack = Tarrival - Trequire
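
For hold checks the sign of the comparison flips, and a violation is usually repaired by adding delay to the data path. The sketch below applies the Reg to Reg hold formulas and reports how much extra path delay would be needed. The numbers and the helper names `hold_slack` and `extra_delay_needed` are assumptions made for illustration.

```python
def hold_slack(t_clk1, t_clk_to_q, t_path, t_clk2, t_hold):
    """Reg to Reg hold slack: Tarrival - Trequire."""
    t_arrival = t_clk1 + t_clk_to_q + t_path   # Tclk1 + TDFF1(clk->Q) + TPATH
    t_require = t_clk2 + t_hold                # Tclk2 + TDFF2(hold)
    return t_arrival - t_require


def extra_delay_needed(slack):
    """Minimum delay to add to the data path so that the slack is >= 0."""
    return max(0.0, -slack)


# Same clock edge at both flip-flops, but it reaches DFF2 0.6 later
# than DFF1 (clock skew). The values are invented.
slack = hold_slack(t_clk1=2.0, t_clk_to_q=0.5, t_path=0.2,
                   t_clk2=2.6, t_hold=0.3)
fix = extra_delay_needed(slack)

print(round(slack, 3), round(fix, 3))
```

Here the short data path arrives before the hold requirement of the capturing flip-flop has passed, so the slack is negative and the reported amount of delay would have to be inserted.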
Timing exception: False path

- Why are there false path constraints in a design?
  - A path may exist in the circuit but never be used in its normal functional operation
  - A functional path may exist but the timing is very slow or irrelevant
  - A block may be reused and certain signal functions are no longer required
  - A path may exist in the circuit but no combination of input vectors may ever exercise it
  - A combinational loop exists in the design that needs to be broken

Timing exception: multi-cycle path

- Multicycle paths occur because the designer knows that the particular logic function will not be used until a later cycle
IO constraint

- Create an I/O assignment file manually using the following template:

Version: 1
MicronPerUserUnit: value
Pin: pinName side | corner
Pad: padInstanceName side | corner [cellName]
Offset: length
Skip: length
Spacing: length
Keepclear: side offset1 offset2
IO constraint cont.

Version: 1
Pad: CORNER0 NW PCORNERDGZ
Pad: PAD_CLK N
Pad: PAD_HALT N
Pad: CORNER1 NE PCORNERDGZ
Pad: PAD_X1 W
Pad: PAD_X2 W
Pad: CORNER2 SW PCORNERDGZ
Pad: PAD_IOVDD1 S PVDD2DGZ
Pad: PAD_IOVSS1 S PVSS2DGZ
Pad: CORNER3 SE PCORNERDGZ
Pad: PAD_VDD1 E PVDD1DGZ
Pad: PAD_VSS1 E PVSS2DGZ

[Figure: resulting pad ring, with PAD_CLK and PAD_HALT on the north side and PAD_IOVDD1 and PAD_IOVSS1 on the south side.]
SSO Consideration
- SSO
  - Simultaneously Switching Outputs
- SSN
  - The noise produced by SSO buffers
- DI
  - The maximum number of copies of one specific kind of I/O pad that can switch from high to low simultaneously without raising the ground voltage level above 0.8 V, for one ground pad
- DF
  - Drive Factor, DF = 1/DI
- SDF
  - Sum of Drive Factors
SSO Consideration cont.
- Parameters of DF
  - operating condition
  - package inductance
  - slew-rate control IO
  - IO type with different drive strength
- In SSO case
  - Required number of ground pads = SDF
  - Required number of power pads = SDF/1.1
- Non-SSO case (suggested)
  - Required number of ground pads = SDF/1.5
  - Required number of power pads = SDF/1.6
SDF Example

IO Type     2mA    4mA    8mA    12mA   16mA   24mA
DF Value    0.02   0.03   0.09   0.18   0.3    0.56

- If a design has 20 PDB02DGZ (2mA) and 10 PDD16DGZ (16mA) pads, then
- SDF = 20 x 0.02 + 10 x 0.3 = 3.4
- In SSO case,
  - number of VSS pads = 3.4 => 4
  - number of VDD pads = 3.4/1.1 = 3.09 => 4
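
The same arithmetic can be scripted for a larger pad list. The sketch below uses the DF table and the SSO / non-SSO divisors from the slides; the function name `pad_counts` and the dictionary layout are assumptions for illustration.

```python
import math

# Drive factor per IO drive strength, copied from the table above.
DF = {"2mA": 0.02, "4mA": 0.03, "8mA": 0.09,
      "12mA": 0.18, "16mA": 0.3, "24mA": 0.56}


def pad_counts(io_counts, sso=True):
    """Return (SDF, number of VSS pads, number of VDD pads).

    io_counts maps a drive strength to the number of pads of that type.
    Pad counts are rounded up to the next integer.
    """
    sdf = sum(DF[drive] * count for drive, count in io_counts.items())
    ground_div, power_div = (1.0, 1.1) if sso else (1.5, 1.6)
    return sdf, math.ceil(sdf / ground_div), math.ceil(sdf / power_div)


sdf, n_vss, n_vdd = pad_counts({"2mA": 20, "16mA": 10})
relaxed = pad_counts({"2mA": 20, "16mA": 10}, sso=False)

print(round(sdf, 2), n_vss, n_vdd)
```

With 20 pads of 2mA and 10 pads of 16mA this reproduces the numbers of the example; passing `sso=False` applies the suggested non-SSO divisors instead.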
Getting Started

- Source the Encounter environment:
    unix% source /usr/cadence/cic_setup/soc.csh
- Invoke SOC Encounter:
    unix% encounter
- Do not run it in background mode, because the terminal becomes the command input interface while SOC Encounter is running.
- Encounter reads the following initialization files:
  - $ENCOUNTER/etc/enc.tcl
  - ./enc.tcl
  - ./enc.pref.tcl
- Log files:
  - encounter.log*
  - encounter.cmd*
GUI

[Figure: Encounter main window, with callouts for the menus, tool widgets, design views, switch bar, design display area, display control, name of selected object, auto query, and cursor coordinates.]

Tool Widgets

[Figure: tool widget icons: Design Import, Zoom In/Out, Zoom Select, Fit, Zoom Previous, Redraw, Hierarchy Down/Up, Undo/Redo, Calculate Density, Fence, Attribute Editor, Design Browser, Xwindow dump/undump, Summary Report.]
Design Views

- Floorplan View
  - displays the hierarchical module and block guides, connection flight lines, and floorplan objects
- Amoeba View
  - displays the outline of modules after placement
- Placement View
  - displays the detailed placement of cells and blocks
Display Control

Select Bar
Commonly Used Bindkeys

Key       Action
q         Edit attribute
f         Fit display
z         Zoom in
Z         Zoom out
Arrows    Pan design area in the direction of the arrow
x         Clear DRC
Escape    Cancel
K         Remove all rulers
space     Select next
e         Popup edit
T         editTrim
0-9       Toggle layer[0-9] visibility
h/H       Hierarchy up/down

For more bindkeys, see Design->Preference, Binding Key.
Import Design

Design->Design Import…

- Max Timing Libraries
  - contain worst-case conditions for setup-time analysis
- Min Timing Libraries
  - contain best-case conditions for hold-time analysis
- Common Timing Libraries
  - used in both setup and hold analysis
- IO Assignment File:
  - get an IO assignment template: Design->Save->I/O File…
Import Design -- Timing

- Default Delay Pin Limit:
  - Nets with terminal counts greater than the specified value are assigned the default net delay and net load entries.
- Default Net Delay:
  - Sets the delay value for a net that meets the pin limit default.
- Default Net Load:
  - Sets the load for a net that meets the pin limit default.
- Input Transition Delay:
  - Sets the input transition delay for the primary inputs and clock nets.
Import Design -- IPO/CTS

- Buffer Name/Footprint:
  - specifies the buffer cell family to be inserted or swapped.
  - required to run IPO and TD placement.
- Delay Name/Footprint:
  - required to fix hold time violations.
- Inverter Name/Footprint:
  - required to run IPO and TD placement.
- Get the footprint of library cells by:
  - Timing->Report->Cell Footprint

Footprint example:
  For cells: BUFXL, BUFX1, BUFX2, BUFX3, BUFX4, BUFX8, BUFX12, BUFX16, BUFX20
  Footprint: buf
Import Design -- Power

Global Net Connection

Floorplan->Global Net Connections…

Specify Floorplan

Floorplan->Specify Floorplan…

Specify Floorplan -- Double-back rows

Double-back rows:
Row Spacing > 0

Row Spacing = 0
Core Limit, I/O Limit
Power Planning: Add Rings

Floorplan->Custom Power Planning->Add Rings

Power Planning: Add Rings

Use wire group to avoid slot DRC errors.

Power Planning: Wire Group

[Figure: two ring examples, both with "Use wire group" checked and number of bits = 2: one with no interleaving, one with interleaving.]

Power Planning: Block Ring

[Figure: block rings around Block A, Block B, and Block C, shown without shared ring edges and with shared ring edges.]

Power Planning: Add Stripes

[Figure: stripe crossover with via array.]
Placement
Place->Place…
- Prototyping: runs quickly, but components may not be placed at legal locations.
- Timing Driven:
  - Builds the timing graph before placement.
  - Tries to meet setup timing constraints while keeping the design routable.
  - Limited IPO by upsizing/downsizing instances.
- Reorder Scan Connection
  - Nets connected to either the scan-in or scan-out are ignored.
- Check placement after placing
  - Place->Check Placement
Floorplan Purposes

- Develop an early physical layout to ensure the design objectives can be achieved
  - Minimum area for low cost
  - Minimum congestion so the design is routable
  - Estimate parasitics for delay calculation
  - Analyze power for reliability
- Gain early visibility into implementation issues
Different Floorplan, Different Performance
Wire Load After Placement

Logical wire load after placement


Module Constraint

- Soft Guide
- Guide
- Region
- Fence

[Figure: examples of a Soft Guide, a Guide, a Region, and a Fence.]

Guide, Region, Fence

- Placement constraints
- Create guides for timing issues
- A critical path should not pass through two different modules
- The more regions, the more complicated the floorplanning
Add Tiehi/Tielo cell

- A Tiehi/Tielo cell connects the tiehi/tielo net to the supply voltage or ground through a resistor.
- Tiehi/Tielo cells are added for ESD protection.
- Set the add tiehi/tielo cell mode:
    encounter> setTieHiLoMode -maxFanOut #num -maxDistance #num
- Place->TieHiLo->Add TieHiLo
Clock Problem

- Clock problems
  - Heavy clock net loading
  - Long clock insertion delay
  - Clock skew
  - Skew across clocks
  - Clock to signal coupling effect
  - Clock is power hungry
  - Electromigration on clock net
- The clock is one of the most important resources in a chip; do not use it for other purposes.
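Clock Tree Topology
Synthesize Clock Tree

[Figure: clock tree synthesis flow. Create Clock Tree Spec produces the clock spec, which can be modified before Specify Clock Tree; Synthesize Clock Tree then produces the netlist, synthesis report, clock nets, and routing guide; Display Clock Tree follows.]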
Create Clock Tree Spec.

Clock->Create Clock Tree Spec

CTS

- CTS traces the clock starting from a root pin, and stops at:
  - A clock pin
  - A D-input pin
  - An instance without a timing arc
  - A user-specified leaf pin or excluded pin
- Write a CTS spec. template:
  - specifyClockTree -template
CTS spec.

- A CTS spec. contains the following information:
  - Timing constraint file (optional)
  - Naming attributes (optional)
  - Macro model data (optional)
  - Clock grouping data (optional)
  - Attributes used by NanoRoute routing solution (optional)
  - Requirement for manual CTS or automatic CTS
Mapping from sdc to clock tree spec

Timing Constraints        Clock Tree Specs
create_clock              AutoCTSRootPin / ClkGroup
create_generated_clock    ThroughPin
set_clock_latency         Maxdelay
set_clock_uncertainty     Maxskew
set_clock_transition      BufMaxTran / SinkMaxTran
Synthesize Clock Tree

Clock->Synthesize Clock Tree

[Figure: examples of a reconvergence clock and a crossover clock.]

Clock Synthesis report

- Summary report and detail report
  - number of sub trees
  - rise/fall insertion delay
  - trigger edge skew
  - rise/fall skew
  - buffer and clock pin transition time
  - detailed delay ranges for all buffers added to clocks
- Clock nets
  - Saves the generated clock nets
  - used to guide clock net routing
- Clock routing guide
  - Saves the clock tree routing data
  - used as a preroute guide while running Trial Route
Display Clock Tree
Clock->Display->Display Clock Tree…
Display Clock Tree
--by phase delay
Clock Tree Browser
Clock->Clock Tree Browser

- Displays the trigger edge, rise/fall delay, rise/fall skew, input delay, and input transition of each cell.
- Resize/Delete leaf cell or clock buffer
- Reconnect clock tree
Optimization

Timing->Optimization…

- IPO
  - setup time
  - hold time
  - SI
  - DRV (Design Rule Violation)
Trial Route

- Performs quick routing for congestion and parasitics estimation
- Prototyping:
  - Quickly gauges the feasibility of the netlist.
  - Components in the design might not be routed at legal locations.
Trial Route Congestion Marker

- Visually check the congestion statistics.
- Dump the congestion area:
  - dumpCongesArea -all file_name

[Figure: congestion marker next to a BLOCK, labeled V=25/20 H=16/18.]

The vertical (V) overflow is 25/20 (25 tracks are required, but only 20 tracks are available).
The horizontal (H) overflow is 16/18 (16 tracks are required, and 18 tracks are available).
Trial Route Congestion Marker cont.

Level           Color            Overflow Value
1               Blue             One more track required
2               Green            Two more tracks required
3               Yellow           Three more tracks required
4               Red              Four more tracks required
5               Magenta          Five more tracks required
6 and higher    Grey to White    Six or more tracks required
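
To make the marker notation concrete, the sketch below parses a label such as V=25/20 H=16/18 and maps the number of missing tracks to the color levels listed above. It is an illustration only: `parse_marker` and `overflow_color` are hypothetical helpers, and the label format is taken from the slide rather than from tool documentation.

```python
import re

# Color per overflow level, copied from the table above.
COLORS = {1: "Blue", 2: "Green", 3: "Yellow", 4: "Red", 5: "Magenta"}


def parse_marker(text):
    """Return the number of missing tracks per direction (V and H)."""
    missing = {}
    for direction, required, available in re.findall(r"([VH])=(\d+)/(\d+)", text):
        missing[direction] = max(int(required) - int(available), 0)
    return missing


def overflow_color(missing_tracks):
    """Return the marker color, or None when enough tracks are available."""
    if missing_tracks <= 0:
        return None
    return COLORS.get(missing_tracks, "Grey to White")


marker = parse_marker("V=25/20 H=16/18")
for direction, missing in marker.items():
    print(direction, missing, overflow_color(missing))
```

For the label on the slide, the vertical direction is five tracks short and the horizontal direction has no overflow.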
Timing Analysis

Timing->Specify Analysis Condition->Specify RC Extraction Mode…
Timing->Extract RC…
Timing->Timing Analysis…

- No Async/Async:
  - recovery, removal check
- No Skew/Skew:
  - check with/without clock skew constraint
Slack Browser
Timing->Debug Timing
Power Analysis

Timing->Extract RC…
Power->Edit Pad Location…
Power->Edit Net Toggle Probability…
SRoute

- Route Special Net (power/ground net)
  - Block pins
  - Pad pins
  - Pad rings
  - Standard cell pins
  - Stripes (unconnected)
Add IO filler

addIoFiller -cell PFILL -prefix IOFILLER

addIoFiller -cell PFILL_9 -prefix IOFILLER
addIoFiller -cell PFILL_1 -prefix IOFILLER
addIoFiller -cell PFILL_01 -prefix IOFILLER -fillAnyGap
- Connect the IO pad power bus by inserting IO fillers.
- Add from wider filler to narrower filler.

Add IO filler cont.

- In order to avoid DRC errors:
  - The sequence of placing fillers must be from wider fillers to narrower ones.
  - Only the smallest filler can use the -fillAnyGap option.
NanoRoute
Route->NanoRoute
NanoRoute Attributes

Route->NanoRoute/Attributes
Antenna Effect

- In a chip manufacturing process, metal is initially deposited so that it covers the entire chip.
- Then, the unneeded portions of the metal are removed by etching, typically in plasma (charged particles).
- The exposed metal collects charge from the plasma and builds up a voltage potential.
- If the voltage potential across the gate oxide becomes large enough, the current can damage the gate oxide.
Antenna Ratio

[Figure: cross-section showing metal2, via2, metal1, via1, poly, and gate oxide; the exposed metal collects positive charge from the plasma.]

Antenna Ratio = (Area of process antennas on a node) / (Area of gates to the node)
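
The ratio can be checked against a process limit in a few lines. In the sketch below the areas and the limit of 400 are invented numbers, not values from any design rule manual, and both function names are hypothetical.

```python
def antenna_ratio(antenna_area, gate_area):
    """Area of process antennas on a node divided by the gate area on it."""
    if gate_area <= 0:
        raise ValueError("gate_area must be positive")
    return antenna_area / gate_area


def violates(antenna_area, gate_area, max_ratio):
    """True when the node exceeds the allowed antenna ratio."""
    return antenna_ratio(antenna_area, gate_area) > max_ratio


MAX_RATIO = 400.0  # assumed limit, for illustration only

before = violates(antenna_area=500.0, gate_area=1.0, max_ratio=MAX_RATIO)
# Shortening the metal connected to the gate lowers the antenna area.
after = violates(antenna_area=150.0, gate_area=1.0, max_ratio=MAX_RATIO)

print(before, after)
```

Reducing the metal area attached to the gate, or increasing the area that shares the charge, brings the ratio back under the limit.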
Antenna Problem Repair
- Add jumper
- Add antenna cell (diode)
- Add buffer

[Figure: cross-section showing metal2, via1, metal1, poly, and gate oxide.]
Add Core Filler

Place->Filler->Add Filler…

- Connect the NWELL/PWELL layer in core rows.
- Insert well contacts.
- Add from wider filler to narrower filler.
Add bonding pads (stagger IO pads only)

[Figure: linear IO pad versus stagger IO pad. The abutted stagger IO pads show the PIN, logic and driver, PR boundary, bonding metal, inner bonding, and outer bonding.]

Add bonding pads (stagger IO pads only)

- Because of the limitations of the bonding wire technique, stagger IO pads are used in order to reduce the IO pad width.
- If stagger IO pads are used, the bonding pads have to be added after APR is finished. SE does not provide a built-in function for adding bonding pads, so CIC does this by importing DEF.
- CIC provides a Perl script to calculate the bonding pad locations. The full flow is described on the next page.
Output Data

Design->Save->GDS…
Design->Save->Netlist…
Design->Save->DEF
- Export GDS for DRC, LVS, LPE, and tape out.
- Export Netlist for LVS and simulation.
- Export DEF for reordered scan chain.
Stream Out map

Layer/object name    Layer/object type    Layer number    Data type
METAL1               ALL                  16              0
NAME                 METAL1/NET           16              0
NAME                 METAL1/SPNET         40              0
NAME                 METAL1/PIN           40              0
NAME                 METAL1/LEFPIN        16              0
VIA12                ALL                  17              0
METAL2               ALL                  18              0
