03-SP3 07/31/2007
Predictable Success
Contents
Overview of Useful Skew
Useful Skew in IC Compiler
Running Useful Skew
Analyzing and Debugging Useful Skew Results
Case Study
Most timing violations are fixed by data path optimization.
With useful skew, you fix timing violations by adjusting clock arrival times at the registers or latches.
[Figure: paths with negative slack. Labels between IN, CLK, and OUT: (-1), (-1), (0), (+2)]
[Figure: the same paths after useful skew. Labels between IN and OUT: (+2), (0), (0), (0)]
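The effect of moving a clock arrival time can be worked through with a little arithmetic. The following is a minimal sketch, assuming ideal clocks, a zero setup time, and a hypothetical three-register chain (FF1, FF2, FF3) with made-up delays; it is not taken from the figures and does not call any IC Compiler command.

```python
def setup_slack(period, launch_latency, capture_latency, data_delay, setup_time=0.0):
    """Setup slack of a register-to-register path with ideal clocks."""
    required = period + capture_latency - setup_time
    arrival = launch_latency + data_delay
    return required - arrival


PERIOD = 10.0  # hypothetical clock period in ns


def chain_slacks(latency):
    """Return (FF1->FF2 slack, FF2->FF3 slack) for a given latency per register."""
    return (
        setup_slack(PERIOD, latency["FF1"], latency["FF2"], data_delay=11.0),
        setup_slack(PERIOD, latency["FF2"], latency["FF3"], data_delay=8.0),
    )


zero_skew = {"FF1": 0.0, "FF2": 0.0, "FF3": 0.0}
before = chain_slacks(zero_skew)                  # (-1.0, 2.0)
after = chain_slacks({**zero_skew, "FF2": 1.0})   # (0.0, 1.0)
print(before, after)
```

Delaying the clock at FF2 by 1 ns gives the path into FF2 one more nanosecond and takes one nanosecond from the path leaving FF2, so the violation is removed only because the downstream path had positive slack to spare.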
Clock tree synthesis can achieve larger latency adjustment targets; the design can have more useful skew.
At the pre clock tree synthesis stage, parasitics are estimated based on virtual routing, with more scope for miscorrelation.
Factors such as timing derate on the clock path are not considered, since the clock is ideal.
Apply useful skew incrementally to fix timing violations in the post clock tree synthesis or postroute stage
+ After detail routing, timing should be most accurate; therefore applying useful skew should be effective.
Clock tree optimization can only make small latency adjustments.
Preroute clock tree optimization allows sizing, relocation, and delay insertion. However, the ability to use these techniques to meet the latency adjustment is limited.
Postroute clock tree optimization only allows sizing.
[Figure: feedthrough path from IN2 to OUT2; (+2) at CLK]
Useful Skew in IC Compiler:
Overview of Useful Skew in IC Compiler
Analyzing the Timing of the Design
Determining the Pins to Be Optimized
Understanding the Solution File Generated by skew_opt
Sourcing the Solution File
Prerequisites for Running Useful Skew
How skew_opt Works With Clock Gating, I/O Paths, Hold Fixing, and Scan Chains
Known Issues With the Current Useful Skew Implementation
2007 Synopsys, Inc.
Fragile: nonoptimized pins that are written into the solution file
Unconstrained pins
Clock pins inside interface logic models (ILMs)
Level-sensitive latches:
Clock pins of level-sensitive latches
Clock pins of registers on paths to or from level-sensitive latches
When skew_opt_optimize_to_clock_gates is false (default is true), registers generating enable signals for clock gates are not optimized. See slide 49 for more details.
skew_opt sets float pin exceptions (set_clock_tree_exceptions) on the clock pins whose latency needs to be adjusted.
Clock tree synthesis stops traversal when it sees this exception on a clock pin.
The portion of the clock structure beyond these pins is not optimized for skew, causing incorrect results.
[Figure: clock CLK, integrated clock gate (ICG) with gated clock ECLK, registers U1 and U2]
If a float pin exception is set on this pin, the registers U1 and U2 are not considered part of the clock tree
Clock tree synthesis can only adjust latency to ILM clock pins
Clock tree synthesis at top level cannot adjust latencies to pins inside ILMs; skew_opt therefore considers them as fragile pins and sets the same float pin exception on these pins
Clock latency set by skew_opt: If clock tree synthesis is able to implement the clock exceptions defined by skew_opt, you should expect to see the propagated clock latency on this pin very close to this value (__scl is aliased to set_clock_latency in the Tcl file).
Clock exception equivalent to the clock latency set by skew_opt. See next slide on how skew_opt determines the clock exception value from the clock latency values (__scte is aliased to set_clock_tree_exceptions in the Tcl file).
Interclock delay balancing options set by skew_opt based on the timing relationship between the clock domains (__sicdo is aliased to set_inter_clock_delay_options in the Tcl file).
set_clock_latency
Understood by the timer; can be used to measure skew_opt QoR
Clock tree synthesis does not honor clock latencies set at clock pins
set_clock_tree_exceptions
Not understood by the timer
Clock tree synthesis honors these constraints
set_inter_clock_delay_options
Interclock delay constraints based on skew_opt analysis
By default, all three sets of Tcl commands are sourced:
set_clock_latency
set_clock_tree_exceptions
set_inter_clock_delay_options
Use the following variable settings to control which Tcl commands are sourced from the solution file:
skew_opt_skip_ideal_clocks
skew_opt_skip_propagated_clocks
skew_opt_skip_clock_balancing
[Figure: register U1 annotated with (+2)]
Calculating the clock exception value:
1. Find min (all clock latency values)
2. Float pin value for pin = (min latency - latency specified for pin)
skew_opt clock exceptions:
set_clock_tree_exceptions -2.0 -float_pin U1/CK
set_clock_tree_exceptions 0.0 -float_pin U2/CK
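The two-step calculation above can be checked with a short sketch. It assumes a latency of 2.0 ns at U1/CK and 0.0 ns at U2/CK; the U2 value is inferred from its 0.0 exception rather than stated on the slide. The function name is hypothetical, and the code only reproduces the arithmetic, not the tool's behavior.

```python
def float_pin_values(latencies):
    """Map each clock pin to (minimum latency - latency specified for the pin)."""
    min_latency = min(latencies.values())
    return {pin: min_latency - latency for pin, latency in latencies.items()}


# Assumed latencies in ns; U2/CK = 0.0 is inferred, not given in the source.
values = float_pin_values({"U1/CK": 2.0, "U2/CK": 0.0})
for pin, value in values.items():
    print(pin, value)
```

The pin with the smallest latency always gets a float pin value of 0.0, and every other pin gets a negative value equal to how much later its clock should arrive.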
d. Remove any ideal latencies set on the clock network (remove_ideal_latency)
e. Ensure that the clocks are ideal before running the pre clock tree synthesis skew_opt flow (remove_propagated_clock)
f. Ensure that high-fanout nets are marked as ideal, or run high-fanout net synthesis on these nets
g. Remove any pin_load constraints set on clock ports. For example, set_load -min -pin_load 0.0 Clk
#All clocks are ideal before CTS
open_mw_cel placed_cel
remove_propagated_clock [all_fanout -clock]
remove_propagated_clock {*}
remove_ideal_latency -all
remove_ideal_network -all
#Run clock_opt to get updated latencies
set_inter_clock_delay_balance -balance_groups {clk1 clk2}
set_latency_adjustment_options -from_clock clk1 -to_clock vclk
clock_opt -inter_clock_balance -update_clock_latency
write_sdc updated.sdc
sh grep set_clock_latency updated.sdc > updated.sdc.1
sh grep get_clock updated.sdc.1 > updated.tcl
close_mw_cel
#Load updated constraints into placed CEL and optimize the design
#before running skew_opt
open_mw_cel placed_cel
source updated.tcl
extract_rc -estimate
remove_propagated_clock [all_fanout -clock]
remove_propagated_clock {*}
remove_ideal_latency -all
remove_ideal_network -all
place_opt
[Figure: clock CLK, integrated clock gate (ICG) with gated clock ECLK, registers U1 and U2]
The ideal latency specified for CLK is considered as the clock arrival time for all the pins on that clock domain, such as the clock pins of U1, U2, and the integrated clock gating (ICG) cell.
The enable timing seen by skew_opt is therefore optimistic.
After clock tree synthesis, the clock arrival time at the ICG clock pin will be less than that at the clock pin of U2 (the clock pin of the ICG is a nonstop pin for clock tree synthesis).
To avoid this, you can explicitly set the clock latency at the clock pins of the clock gates, taking into consideration the delay from the clock gate to the endpoints.
Use a quick run of clock tree synthesis to determine these latencies
By default, skew_opt adjusts the clock arrival time at the clock pins of U1 and U2, and leaves the clock arrival time at the integrated clock gating unchanged
[Figure: clock CLK, integrated clock gate (ICG) with gated clock ECLK]
In the above scenario, the register generating the enable signal for the clock gate has a data path to registers that are gated by the same clock gate.
By default, skew_opt adjusts the latency to the clock pins of both U1 and U2.
During clock tree synthesis, the float pin exception on these pins causes the clock arrival time at the clock gate to change (as compared to the initial quick run of clock tree synthesis to estimate the clock arrival time at the clock gate), thus invalidating the skew_opt solution.
When skew_opt_optimize_to_clock_gates is set to false, skew_opt does not optimize the latency on the clock pin of the register generating the enable signal
When both the setup and hold options are specified, skew_opt tracks WNS for both setup and hold for each startpoint and endpoint; the worst WNS governs the solution
In a skew_opt flow, there will be larger real skew between clock pins after clock tree synthesis.
optimize_dft currently assumes zero skew between clock pins and can lead to larger hold violations.
Running Useful Skew:
Known Issues With the Current Useful Skew Implementation
Useful Skew User Interface
Useful Skew Flows: Pre Clock Tree Synthesis and Post Clock Tree Synthesis
IC Compiler Placed CEL View (Prepared for Useful Skew)
This setting is required to avoid losing the propagated attribute that is annotated on the clocks by compile_clock_tree
clock_opt -inter_clock_balance
route_opt
Variables that control which Tcl commands are sourced from the solution file
skew_opt_skip_ideal_clocks
skew_opt_skip_propagated_clocks
skew_opt_skip_clock_balancing
These variables do not affect the solution generated by skew_opt. They control only which Tcl commands are sourced in from the solution file.
Description of Variables
skew_opt_skip_ideal_clocks
Default: false. When set to true, skew_opt does not set ideal clock latencies on the clock pins.
skew_opt_skip_propagated_clocks
Default: false. When set to true, skew_opt does not set clock exceptions on the clock pins.
skew_opt_skip_clock_balancing
Default: false. When set to true, skew_opt does not set interclock balancing options.
-hold
Optimize WNS for hold constraints; off by default. When both setup and hold are specified, the setup solution is constrained by the hold slack. It is possible that setup improvement achieved is not as much as when only the setup option is used.
-pins pin_list
Specifies a list of pins to optimize; by default, all adjustable clock pins are considered for optimization
-fix_boundary_pins
Do not optimize clock arrival time for registers on boundary paths
-path_groups path_groups
Specifies the path groups considered for optimization; by default, all path groups are considered
-output file_name
Specifies a file name for the solution file; by default, the solution file name is skew_opt.tcl
-no_auto_source
Do not source the solution file at the end of skew_opt; by default, the solution file is sourced at the end of skew_opt
-hold_margin hold_margin_value
The margin is subtracted from the hold slack to allow you to influence skew_opt to improve paths with positive slack. Default is 0 ns. Unit is ns.
-adjustment_limit adjustment_limit_value
Sets a limit on the latency adjustment that can be set on any pin. Default is no limit. Unit is ns.
-decrease_factor decrease_factor_value
Sets a fractional limit on latency decreases by using a value between zero and one. Default is 0.5. For designs with many clock tree levels, a larger decrease factor (e.g. 0.75) might yield more slack improvement.
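The effect of -hold_margin can be illustrated with simple arithmetic. This is a minimal sketch assuming only what the option description states, namely that the margin is subtracted from the hold slack; the function name and the sample slack values are hypothetical, and how skew_opt uses the result internally is not modeled.

```python
def hold_slack_with_margin(hold_slack, hold_margin=0.0):
    """Hold slack after subtracting the margin (all values in ns)."""
    return hold_slack - hold_margin


# Hypothetical hold slacks per endpoint, in ns.
hold_slacks = {"U1/D": 0.25, "U2/D": 1.0, "U3/D": -0.5}
margin = 0.5

adjusted = {pin: hold_slack_with_margin(s, margin) for pin, s in hold_slacks.items()}
violating = sorted(pin for pin, s in adjusted.items() if s < 0.0)
print(adjusted)
print(violating)
```

With a 0.5 ns margin, the endpoint that had +0.25 ns of hold slack now appears to violate by 0.25 ns, which is how a margin draws attention to paths whose real slack is still positive.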
-resolution resolution_value
Snaps the clock tree exception value to a multiple of this value. Default is 0.001 ns. Unit is ns. The minimum allowed value is 0.0001 ns.
-no_optimization
Use the clock latencies set at the clock pins. For example, the tool takes the set_clock_latency commands you specified and converts them into clock exceptions; by default, this is disabled
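Snapping a value to a multiple of the -resolution setting can be sketched as follows. The rounding mode is an assumption (round half up to the nearest multiple); the source does not say whether the tool rounds to nearest, up, or down. The function name is hypothetical, and decimal arithmetic is used to avoid binary floating-point artifacts.

```python
from decimal import Decimal, ROUND_HALF_UP

MIN_RESOLUTION = Decimal("0.0001")  # minimum allowed value stated for -resolution, in ns


def snap_to_resolution(value, resolution="0.001"):
    """Snap value (ns) to the nearest multiple of resolution (ns).

    Rounding to nearest is an assumption, not documented tool behavior.
    """
    step = Decimal(str(resolution))
    if step < MIN_RESOLUTION:
        raise ValueError("resolution must be at least 0.0001 ns")
    steps = (Decimal(str(value)) / step).to_integral_value(rounding=ROUND_HALF_UP)
    return steps * step


print(snap_to_resolution("-1.98765"))        # -1.988 with the default 0.001 ns
print(snap_to_resolution("0.0614", "0.01"))  # 0.06
```

A coarser resolution yields fewer distinct exception values, at the cost of a larger difference between the requested and the snapped latency.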
Analyzing and Debugging Useful Skew Results:
Measuring Useful Skew QoR Without Running Clock Tree Synthesis
Understanding the Log File
Debugging QoR Degradation in a Useful Skew Flow
Measuring Useful Skew QoR Without Running Clock Tree Synthesis (Pre Clock Tree Synthesis Only)
1. Start from the IC Compiler placed CEL view (prepared for useful skew), pre clock tree synthesis flow
2. skew_opt -no_auto_source
3. Set the variable to disable loading of clock exceptions in the solution file (skew_opt_skip_propagated_clocks), then source the solution file
4. report_timing
5. Timing acceptable?
Y: If timing improvement is acceptable, continue with the skew_opt flow
N: Run skew_opt with a different set of options or go with the default flow
Based on the constraints, skew_opt determines the pins to work with.
Pins on clock-gating cells, I/O ports, ILMs, and level-sensitive latches are excluded.
Indicates the starting QoR and the improvement after each iteration.
Cumulative negative slack (CNS): the sum of negative slacks for all the constraints skew_opt is considering.
Final QoR achieved by skew_opt.
Minimizing latency adjustments:
Setup WNS    Setup CNS    Hold WNS     Hold CNS
---------    ---------    --------     --------
-2.310e-01   -1.791e+01   +0.000e+00   +0.000e+00
-2.350e-01   -1.870e+01   +0.000e+00   +0.000e+00
+1.075 --> +0.061
-0.500 --> -0.089
Latency adjustments are minimized; this might have a small impact on the setup WNS.
The arrows indicate that latency increases have been reduced from 1.075 ns to 0.061 ns, and latency decreases from 0.5 ns to 0.089 ns. The smaller the latency adjustment, the easier it is for clock tree synthesis to meet this target.
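The WNS and CNS columns can be reproduced from a list of path slacks. This is a minimal sketch using hypothetical slack values; it follows the definition of CNS given above and assumes that WNS is reported as zero when no path violates, which matches the +0.000e+00 hold entries in the log but is not stated explicitly in the source.

```python
def wns_and_cns(slacks):
    """Return (worst negative slack, cumulative negative slack) in ns.

    Both values are 0.0 when no slack is negative (assumed reporting convention).
    """
    wns = min(0.0, min(slacks, default=0.0))
    cns = sum(s for s in slacks if s < 0.0)
    return wns, float(cns)


# Hypothetical setup and hold slacks, in ns.
setup = wns_and_cns([-0.25, -0.5, 0.125])
hold = wns_and_cns([0.25, 0.5])
print(setup)  # (-0.5, -0.75)
print(hold)   # (0.0, 0.0)
```

CNS can improve while WNS stays flat, or the reverse, so it is worth reading both columns when judging an iteration.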
I/O constraints have been specified with respect to a real clock (instead of a virtual clock)
Resources used for optimization:
1.22e-04 cpu hours
0.00e+00 gigabytes
This design could not be further optimized.
When QoR improvement is less than the threshold, skew_opt does not generate a solution file
Debugging QoR Degradation in a Useful Skew (Pre Clock Tree Synthesis) Flow
Postroute CEL (baseline flow)
skew_opt flow timing worse than baseline?
skew_opt flow timing after clock tree synthesis worse than baseline?
Indicates a correlation issue between post clock tree synthesis and postroute timing.
Check if the interclock balancing tool issued any messages about clocks it could not balance.
N
Does post skew_opt timing correlate with post clock tree synthesis?*
Y
Indicates clock tree synthesis is not able to implement the skew_opt solution
*See next slide for details on checking if the clock tree synthesis implementation correlates with the skew_opt solution
Comparing the Clock Tree Synthesis Implementation to the skew_opt Solution in a Pre Clock Tree Synthesis Useful Skew Flow
skew_opt solution file (skew_opt.tcl)
Post clock tree synthesis clock timing report (report_clock_timing -nosplit -type latency -nworst 1000000)
Compare clock latency in the skew_opt solution file to the clock timing after clock tree synthesis*
Possible convergence issues
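Once the latencies from the solution file and from the clock timing report have been collected per pin, the comparison itself is a simple difference check. The sketch below assumes both sets of values have already been extracted into dictionaries keyed by pin name; the pin names, values, tolerance, and function name are all hypothetical, and parsing of skew_opt.tcl or of the report_clock_timing output is deliberately left out because their exact formats are not shown in the source.

```python
def latency_mismatches(target, implemented, tolerance=0.05):
    """List pins whose implemented latency differs from the target by more than tolerance.

    Returns (pin, target_latency, implemented_latency, error) tuples, all in ns.
    Pins missing from the implemented data are reported with None values.
    """
    mismatches = []
    for pin in sorted(target):
        want = target[pin]
        got = implemented.get(pin)
        if got is None:
            mismatches.append((pin, want, None, None))
        elif abs(got - want) > tolerance:
            mismatches.append((pin, want, got, got - want))
    return mismatches


# Hypothetical data: latencies from the solution file vs. after clock tree synthesis.
solution = {"U1/CK": 2.0, "U2/CK": 0.0, "U3/CK": 1.0}
post_cts = {"U1/CK": 2.0, "U2/CK": 0.5}
report = latency_mismatches(solution, post_cts, tolerance=0.25)
for row in report:
    print(row)
```

A long list of mismatches suggests that clock tree synthesis could not implement the requested latencies, which is the situation the debugging flow is trying to isolate.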
Case Study: Design Information, Initial Timing, Timing with Baseline Flow
Implicit ignore pins connecting to clock pins of gates with unconnected output; should not impact skew_opt flow
Updating the latencies on clock objects.(*psynopt*)
Information: Latency computed from clock Clk1 will be applied on clock Clk1. (CTS-530)
Information: Updating the latency of clock Clk1 to 1.804981 (max) 0.735533 (min). (CTS-531)
Information: Latency computed from clock Clk2 will be applied on clock Clk2. (CTS-530)
Information: Updating the latency of Clk2 to 2.000936 (max) 0.848920 (min). (CTS-531)
remove_propagated_clock [all_fanout -clock]
remove_propagated_clock {*}
remove_ideal_latency -all
remove_ideal_network -all
extract_rc -estimate
report_qor
skew_opt
report_qor
[report_qor output; values not recoverable: Critical Path Clk Period, Total Negative Slack, No. of Violating Paths, No. of Hold Violations]
Should correlate
Minimizing latency adjustments:
Setup WNS    Setup CNS    Hold WNS     Hold CNS
---------    ---------    --------     --------
-2.090e-01   -3.749e+01   +0.000e+00   +0.000e+00
-2.090e-01   -3.749e+01   +0.000e+00   +0.000e+00
-----------------------------------
Timing Path Group 'Clk2'
-----------------------------------
Levels of Logic:              9.00
Critical Path Length:         2.95
Critical Path Slack:          0.81
Critical Path Clk Period:     8.00
Total Negative Slack:         0.00
No. of Violating Paths:       0.00
No. of Hold Violations:     274.00
-----------------------------------
Baseline flow
Timing Path Group 'Clk1'
-----------------------------------
Levels of Logic:             11.00
Critical Path Length:         0.78
Critical Path Slack:         -0.76
Critical Path Clk Period:     4.00
Total Negative Slack:      -720.42
No. of Violating Paths:    5113.00
No. of Hold Violations:   41851.00
-----------------------------------
Timing Path Group 'Clk2'
-----------------------------------
Levels of Logic:              9.00
Critical Path Length:         2.92
Critical Path Slack:          1.00
Critical Path Clk Period:     8.00
Total Negative Slack:         0.00
No. of Violating Paths:       0.00
No. of Hold Violations:     281.00
-----------------------------------
Timing Path Group 'Clk2'
-----------------------------------
Levels of Logic:              9.00
Critical Path Length:         2.93
Critical Path Slack:          0.86
Critical Path Clk Period:     8.00
Total Negative Slack:         0.00
No. of Violating Paths:       0.00
No. of Hold Violations:     282.00
-----------------------------------