
Verification Strategy for PCI-Express

Presenter: Pradip Thaker


July 4th, 2008

Outline

PCI-Express Protocol Overview

Verification Paradigm

Design-for-Verification (well-aligned implementation and verification architectures)

A key ingredient for a timely verification closure

PCI to PCI Express

Limitations of PCI

Not enough bandwidth

Shared bus bandwidth


No support for Isochronous applications (TDM or Synchronous Traffic application)
Cost of hardware for parallel busses

Evolution Path

32-bit/33 MHz (132 MB/s)


64-bit/66 MHz (528 MB/s)
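The two bandwidth figures above follow from bus width times clock rate. A minimal sketch of that arithmetic, assuming one bus-wide transfer per clock and the nominal 33/66 MHz values (the function name is illustrative, not from the slides):

```python
def parallel_bus_peak_mb_per_s(width_bits: int, clock_mhz: int) -> int:
    """Peak rate in MB/s, assuming one bus-wide word is moved per clock."""
    return width_bits * clock_mhz // 8  # bits per clock * MHz / 8 bits per byte


for width, clock in [(32, 33), (64, 66)]:
    rate = parallel_bus_peak_mb_per_s(width, clock)
    print(f"{width}-bit/{clock} MHz: {rate} MB/s")
```

These are peak numbers; on a shared bus they are divided among all devices, which is the limitation the slide points at.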

Growing faster is the only possibility (not wider)


Point-to-point communication (shared bus connectivity impossible above 100/150 MHz)
CDR architecture (speed limitation of a synchronous bus above a few hundred MHz)
Backward compatibility a must

Fast forward to the future: PCI Express (PCIe)

Packet-level data-units over high-speed SERDES based connectivity


Layered architecture much like networking protocols

Mechanical, Physical, Data-link, Transaction, Software and System Layers

Compatible with existing PCI software infrastructure


Weird wedding of two distinct architectural and business practices: networking and computing
Creates a nightmarish scenario for chip verification (details on later slides)

PCI-Express Protocol Overview - Terminology

Dual Simplex: a related set of two differential pairs (Tx and Rx)
Lane: a Dual Simplex when PCI-Express compliant
Port: a group of Txs and Rxs within a single device that represent a single connection to the PCI-Express fabric
Link: two ports and the collection of lanes that interconnect them
x1, x4, x8, xN: number of lanes within a port or a link
Upstream: flow of traffic towards the CPU, or a port that establishes a link in that direction within the hierarchy
Downstream: flow of traffic away from the CPU, or a port that establishes a link in that direction within the hierarchy
Ingress Port: the portion of a PCIe port that receives the incoming traffic
Egress Port: the portion of a PCIe port that transmits outgoing traffic
Root Complex: the combination of a PCIe host bridge and one or more downstream ports
Endpoint: a device that terminates a path within the hierarchy
Bridge: a device that physically and electrically connects PCIe to another protocol
Switch: a device that provides a physical connection between two or more PCIe ports

PCI-Express Hierarchy

PCI-Express Protocol Overview : Physical

Logical Functions

8B/10B Encoding and Decoding


Scrambling
Reset, initialization, multi-lane de-skew
Lane mapping
Adjustments of bit-transmission order for various throughput options (x1 through x32)
Logical idle behavior and transition to active state as per protocol
TLP and DLLP transmission and reception: Insertion and Processing of Special Symbols per protocol conditions
Link initialization (recovery from link errors, transition from low power states)
Link negotiations

Link synchronization

Width
Data-rate
Lane reversal
Polarity inversion
Bit-wise per lane
Symbol-wise per lane
Lane-to-lane de-skew

Ordered (TS and Skip) set handling and processing


Fast training sequence
Link power management
Delay insertions as per protocol
... and more that could not fit here
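The lane mapping and transmission-order items above can be pictured as byte striping across lanes. A minimal sketch, assuming plain round-robin striping of data bytes only; framing symbols, padding rules and ordered sets are deliberately left out, and the function names are illustrative:

```python
def stripe_bytes(data: bytes, lanes: int) -> list:
    """Distribute consecutive bytes round-robin over the lanes (lane 0 first)."""
    return [data[lane::lanes] for lane in range(lanes)]


def unstripe_bytes(per_lane: list) -> bytes:
    """Reassemble the original byte stream from the per-lane streams."""
    lanes = len(per_lane)
    out = bytearray(sum(len(chunk) for chunk in per_lane))
    for lane, chunk in enumerate(per_lane):
        out[lane::lanes] = chunk  # every lanes-th slot belongs to this lane
    return bytes(out)


def reverse_lanes(per_lane: list) -> list:
    """Model lane reversal: physical lane N-1 carries logical lane 0."""
    return list(reversed(per_lane))


packet = bytes(range(8))
x4 = stripe_bytes(packet, 4)
print(x4)
print(unstripe_bytes(x4) == packet)
```

A receiver that sees reversed lanes has to undo the reversal before reassembly, which is why the same testbench component is usually exercised at every supported width.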

Electrical Functions

Link within 600 ppm at all times


Spread spectrum clocking
AC coupling
Interconnect parasitic capacitance adherence
Receiver DC common mode voltage of 0 V
Transmitter DC common mode established during Detect
Receiver Detect under various scenarios
Total jitter
Maximum loss budget
De-emphasis
Maximum BER
Beacon
... and more that could not fit here
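The 600 ppm figure above translates directly into how quickly two link partners drift apart. A worked sketch of that arithmetic, assuming the worst case where the full 600 ppm difference applies between transmitter and receiver clocks (the function name is illustrative):

```python
def symbols_per_one_symbol_slip(ppm_difference: float) -> float:
    """Number of symbols after which the two clocks differ by one full symbol."""
    return 1_000_000 / ppm_difference


slip_interval = symbols_per_one_symbol_slip(600)
print(f"One symbol of drift every {slip_interval:.0f} symbols")
```

This is why Skip ordered sets have to be scheduled more often than that interval: the receiver's elastic buffer adds or drops Skip symbols to absorb the drift. The exact scheduling window is defined by the specification and is not modeled here.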

PCI-Express Protocol Overview : Data-link Layer

Link management

Point-to-point reliable data exchange


Error detection, re-try as well as Error Logging and Reporting
Power Management message decoding, state transitions for activation and de-activation
TLP sequence number generation and tracking
LCRC computation and decoding
DLLP integrity encoding and decoding
ACK/NAK generation and processing
ACK time-out notification and handling
Flow control computation, tracking and processing: credit-based flow control
Data poisoning
Completion Time-out
Re-transmission of packets
Packet storage for re-try/replay
DLLP generation, processing and actuation based on current status

DL_UP, DL_Down, DL_Inactive, DL_Active, DL_Init state transitions


Slot power limit handling
Propagation of link-reset downstream

ACK DLLP
NAK DLLP
InitFC1
InitFC2
UpdateFC
Power Management
Vendor specific

Cut-through routing
TLP/DLLP ordering permutations per protocol
TLP integrity check insertion and processing
ACK/NAK latency timer rules: processing a limit-triggered response
... and more that could not fit here
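The sequence numbering, ACK/NAK processing and replay storage listed above fit together roughly as follows. A minimal transmitter-side sketch, assuming a 12-bit sequence number and that both ACK and NAK carry the sequence number of the last TLP received correctly; timers, replay counting and link retrain are omitted, and the class and method names are hypothetical:

```python
from collections import deque

SEQ_MOD = 4096  # 12-bit TLP sequence number space


class ReplayBuffer:
    """Holds transmitted TLPs until the link partner acknowledges them."""

    def __init__(self):
        self.next_seq = 0
        self.pending = deque()  # (sequence number, TLP) pairs awaiting ACK

    def send(self, tlp):
        """Assign the next sequence number and keep a copy for replay."""
        seq = self.next_seq
        self.pending.append((seq, tlp))
        self.next_seq = (seq + 1) % SEQ_MOD
        return seq

    def _purge_through(self, acked_seq):
        """Drop every stored TLP up to and including acked_seq."""
        if all(seq != acked_seq for seq, _ in self.pending):
            return  # stale or duplicate acknowledgement: nothing to purge
        while self.pending:
            seq, _ = self.pending.popleft()
            if seq == acked_seq:
                break

    def on_ack(self, acked_seq):
        self._purge_through(acked_seq)

    def on_nak(self, acked_seq):
        """Purge what was received, then return the TLPs to retransmit in order."""
        self._purge_through(acked_seq)
        return [tlp for _, tlp in self.pending]


rb = ReplayBuffer()
for name in ["A", "B", "C", "D"]:
    rb.send(name)
rb.on_ack(1)            # A and B are acknowledged
print(rb.on_nak(2))     # C was received, D must be replayed
```

A reference model built this way can be compared against the DUT's replay behavior for the "Replay TLP Order" and "Duplicate TLP" style checks without modeling any cycle timing.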

PCI-Express Protocol Overview : Transaction Layer

Flow control management

TL manages, DL executes
Point-to-point, not end-to-end
Independent for each VC ID
Mechanism presumes Ideal conditions
Credit types: PH, PD, NPH, NPD, CPLH, CPLD
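The credit types above gate transmission per virtual channel. A simplified sketch of the transmitter-side check, assuming one header credit per TLP and one data credit per 16 bytes (4 DW) of payload; real hardware uses wrapping counters and supports infinite credits, both omitted here, and the class and method names are hypothetical:

```python
from dataclasses import dataclass, field

DATA_CREDIT_BYTES = 16  # assumed size covered by one data credit (4 DW)
CREDIT_TYPES = ("PH", "PD", "NPH", "NPD", "CPLH", "CPLD")


@dataclass
class VcCredits:
    """Credits currently available on one virtual channel, per credit type."""
    available: dict = field(default_factory=lambda: dict.fromkeys(CREDIT_TYPES, 0))

    def advertise(self, credit_type: str, credits: int) -> None:
        """Receiver grants more credits (as InitFC/UpdateFC would)."""
        self.available[credit_type] += credits

    def try_send(self, kind: str, payload_bytes: int) -> bool:
        """kind is 'P', 'NP' or 'CPL'. Consume credits only if the TLP fits."""
        hdr, dat = kind + "H", kind + "D"
        need_data = -(-payload_bytes // DATA_CREDIT_BYTES)  # ceiling division
        if self.available[hdr] < 1 or self.available[dat] < need_data:
            return False  # TLP must wait; the link is not allowed to drop it
        self.available[hdr] -= 1
        self.available[dat] -= need_data
        return True


vc0 = VcCredits()
vc0.advertise("PH", 2)
vc0.advertise("PD", 8)
print(vc0.try_send("P", 128))  # uses 1 header credit and 8 data credits
print(vc0.try_send("P", 16))   # blocked: no posted data credits left
```

Because the scheme is point-to-point and independent per VC, a testbench would typically keep one such object per VC per link direction.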

Data transactions
TLP storage and processing for transmission or consumption
TLP generation: Header, Payload and Digest
TLP generation and handling of various lengths (4 Bytes to 4096 Bytes)
Transaction types

Memory (32-bit and 64-bit addressing)


I/O
Configuration
Message

Transaction Completion

Reads and non-posted writes


Completion routing is by ID
Provide completion status

Transaction Ordering
Routing rules
Arbitration

INTx
PME
ERR
Unlock
Slot Power
Hot Plug
Vendor-defined

Port arbitration
VC arbitration

Virtual channels
Traffic classes
Locked transactions support
Isochronous support
Advanced error processing and reporting
... and more that could not fit here

PCI-Express Protocol Overview: Summary

Open standard containing over 500 pages


Many more pages of supporting literature
Each line of each page in the standards document is a cryptic edict dictating a specific behavior for each condition, not a detailed explanation of behavior or implementation

Much room for misinterpretation of protocol details, resulting in malfunction or non-compliance

Hundreds of configuration bits, each controlling a complex behavior within the chip, with strict adherence to the standard required to guarantee backward software compatibility

No wiggle room to claim a bug as a feature!



Verification Paradigm

Chips based on Open Standards: Pressure Points

Technology/feature differentiator: marginal or non-existent

Time-to-market: very critical

Quality of first silicon: critical

Addresses Two Key Aspects: TTM and Quality of Silicon

Verification Execution: Focal Points

First product: to establish a credible presence

Subsequent products with various flavors: to capture market share
Bridges: PCI-to-PCIe, SATA-to-PCIe, 1394-to-PCIe, USB-to-PCIe etc.
Switches: 4-port x1 throughput, 4-port x4 throughput, 8-port x4 throughput, etc.
Root Complex: x1 throughput, x4 throughput, etc.

Verification Plays A Major Role in Success of Chips based on Open-Standard

Commodity product: Power, Performance and Price

Functionality
Performance
Interoperability (Compliance and Compatibility)

Verification Platform Architecture and Methodology: Focal Points

Re-usability
Scalability (Modularity)
Comprehensiveness (with leveraging of automation)

Verification Strategy: A Broader Definition

Verification: a vehicle to deliver chips with zero bugs (!), compliance and superior performance

Performance Modeling (C/C++/SystemC)

RTL Verification
FPGA-based Emulation

Architecture and Micro-architecture of Key Data and Control Paths

Compliance and Compatibility testing


PCI-SIG certification to be on Integrators List
Performance verification

3rd party Compliance Checkers and Vectors


Mixed-signal Simulations


Functional Verification: Four Pillars

Coverage-driven constrained-random testing with reference models (HVLs)

Reference Model (RFM)


Temporal Checkers
Protocol Monitors
Sequence Generators
Constraints
Functional Coverage
Test-plan
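The ingredients listed above (sequence generation, constraints, functional coverage) can be shown in miniature. A toy sketch, assuming a hypothetical coverage goal that crosses four transaction kinds with three payload-size bins; the bin boundaries, names and stop criterion are illustrative and not taken from the slides:

```python
import random

KINDS = ("MemRd", "MemWr", "Cfg", "Msg")
SIZE_BINS = {
    "small": range(4, 65),
    "medium": range(65, 1025),
    "large": range(1025, 4097),
}


def random_tlp(rng: random.Random) -> dict:
    """Constrained-random stimulus: length is a DW multiple from 4 to 4096 bytes."""
    return {"kind": rng.choice(KINDS), "length": 4 * rng.randint(1, 1024)}


def size_bin(length: int) -> str:
    """Map a payload length onto its functional-coverage bin."""
    for name, span in SIZE_BINS.items():
        if length in span:
            return name
    raise ValueError(f"length {length} violates the generator constraints")


def run_until_covered(seed: int, max_tests: int = 10_000):
    """Generate stimulus until every kind x size cross bin has been hit."""
    rng = random.Random(seed)
    goal = {(kind, name) for kind in KINDS for name in SIZE_BINS}
    hit = set()
    tests = 0
    while hit != goal and tests < max_tests:
        tlp = random_tlp(rng)
        hit.add((tlp["kind"], size_bin(tlp["length"])))
        tests += 1
    return tests, len(hit) / len(goal)


tests, coverage = run_until_covered(seed=1)
print(f"{tests} random TLPs for {coverage:.0%} cross coverage")
```

The small-size bins are hit rarely under a uniform length distribution, which is the usual argument for biasing constraints from coverage feedback rather than running longer.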

Assertion-based verification for key building blocks

Detects design errors at the source: increases observability and decreases debug time
Can identify subtle bugs that may be hard to reach with SBV
Black-box assertions: protocol-oriented
Effective for size/complexity to an extent (memory-size and run-time limitations)

Suitable for block-level deployment rather than as a stand-alone, end-to-end chip-level verification method

Complex properties are verified through bounded proof (neither proven nor falsified)
Effective for verification of control-path oriented logic (state-space exploration) rather than data-path logic
Assertions written by an engineer other than the designer can help detect specification (interpretation) errors

Asynchronous clock-domain simulations

Power-domain simulations: Power Management compliance check-list

Improper Buffer Insertion, Missing Level Shifters, Missing Power Good, Power Sequencing Tests

Functional Verification: CDV (Re-usability and Scalability)


Functional Verification: Golden Rules for RFM

Reference Model shall be independent of the DUT implementation

Reference Model to be created by an engineer other than the designer of the block

Reference Model is created in a high-level language and hence has no low-level mechanics analogous to the RTL implementation to realize functionality

Reference Model shall support co-simulation with the DUT in order to predict and verify run-time behavior

Reference Model for each block shall be created such that it can be integrated into the chip-level verification environment seamlessly

Hybrid Modeling

Control paths: Cycle-accurate modeling


Data paths: Packet-accurate or Data-unit-accurate modeling
A fully cycle-accurate model is a maintenance nightmare as well as a cumbersome task, without significant value-add to verification quality
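Packet-accurate checking of a data path, as opposed to cycle-accurate checking, can be sketched with a simple scoreboard. This is a minimal illustration, assuming in-order delivery and a reference model exposed as a plain function from one input packet to a list of expected output packets; the class and the pass-through model are hypothetical:

```python
from collections import deque


class Scoreboard:
    """Compares DUT output packets with reference-model predictions, ignoring timing."""

    def __init__(self, reference_model):
        self.reference_model = reference_model
        self.expected = deque()
        self.mismatches = []

    def on_dut_input(self, packet):
        """Called by the input monitor: ask the model what should come out."""
        self.expected.extend(self.reference_model(packet))

    def on_dut_output(self, packet):
        """Called by the output monitor: check order and content, not the cycle."""
        if not self.expected:
            self.mismatches.append(("unexpected", packet))
            return
        want = self.expected.popleft()
        if want != packet:
            self.mismatches.append((want, packet))

    def passed(self):
        """End-of-test check: nothing wrong and nothing still outstanding."""
        return not self.mismatches and not self.expected


def pass_through_model(packet):
    """Toy reference model: the block forwards every packet unchanged."""
    return [packet]


sb = Scoreboard(pass_through_model)
for pkt in ["tlp0", "tlp1"]:
    sb.on_dut_input(pkt)
sb.on_dut_output("tlp0")
sb.on_dut_output("tlp1")
print(sb.passed())
```

Because the comparison does not depend on which cycle a packet appears in, the same scoreboard survives pipeline-depth changes in the RTL, which is the maintenance argument made above.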

Comprehensiveness (with leveraging of automation)

CDV is only as powerful as the comprehensiveness of the automated checking features of the reference model and monitors
Can run millions of RTG cycles with a comprehensive reference model and monitors without much manual overhead


Performance Verification

Performance Parameters (to be supported with variable-sized packets across mixed traffic types, all traffic patterns, mixed VCs and mixed packet sizes)
Aggregate Throughput
Latency (to be balanced against power dissipation)
Jitter in Latency
Availability/Blocking: internal back-pressure
N+1 Performance limitation (small TLPs back-to-back)
Flow-control credits
Load distribution and balancing (peer-to-peer as well as vertical traffic flows with a mix of traffic types, VCs and packet sizes)
Link utilization: no bubbles within or between TLPs (really challenging for cut-through mode)
Zero tolerance for packet loss
Zero tolerance for wrong packet routing
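The first three parameters above can be extracted from timestamped packet records. A minimal sketch, assuming each record is a tuple of ingress time, egress time (both in ns) and payload bytes, and taking jitter as the spread between the largest and smallest latency; the record layout and function name are assumptions for illustration:

```python
from statistics import mean


def summarize(records):
    """records: iterable of (t_in_ns, t_out_ns, payload_bytes) per packet."""
    records = list(records)
    latencies = [t_out - t_in for t_in, t_out, _ in records]
    first_in = min(t_in for t_in, _, _ in records)
    last_out = max(t_out for _, t_out, _ in records)
    total_bytes = sum(size for _, _, size in records)
    return {
        "throughput_bytes_per_ns": total_bytes / (last_out - first_in),
        "mean_latency_ns": mean(latencies),
        "latency_jitter_ns": max(latencies) - min(latencies),
    }


trace = [(0, 100, 64), (50, 170, 64), (100, 200, 128)]
print(summarize(trace))
```

The same summary can be computed on a performance model and on emulation traces, so that results from the different platforms are compared on identical definitions.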

20% overhead lost in 8B/10B coding


Small TLPs with header as well as DL layer overhead impacting transaction layer efficiency even with 100% link utilization
Traffic-aware flow-control credit updates (large and small TLPs)
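The 8B/10B loss and the per-TLP overhead mentioned above can be put into numbers. A worked sketch, assuming a 2.5 Gbit/s raw lane rate, a 12-byte header, 2 framing symbols, a 2-byte sequence number and a 4-byte LCRC, with no ECRC; DLLP traffic and Skip ordered sets are ignored, and all parameter names are illustrative:

```python
ENCODING_EFFICIENCY = 8 / 10  # 8 data bits carried per 10 line bits


def lane_data_rate_mb_per_s(raw_mbit_per_s: float = 2500.0) -> float:
    """Data bytes per second on one lane after 8B/10B coding, in MB/s."""
    return raw_mbit_per_s * ENCODING_EFFICIENCY / 8


def tlp_efficiency(payload_bytes: int, header_bytes: int = 12,
                   framing_bytes: int = 2, seq_bytes: int = 2,
                   lcrc_bytes: int = 4) -> float:
    """Fraction of transmitted TLP bytes that is payload."""
    overhead = header_bytes + framing_bytes + seq_bytes + lcrc_bytes
    return payload_bytes / (payload_bytes + overhead)


for payload in (4, 64, 256, 4096):
    rate = lane_data_rate_mb_per_s() * tlp_efficiency(payload)
    print(f"{payload:>4} B payload: {tlp_efficiency(payload):.1%} efficient, "
          f"about {rate:.0f} MB/s per lane")
```

Under these assumptions a fully utilized lane delivers well under half of its post-encoding bandwidth as payload when the traffic consists of very small TLPs, which is the effect the bullet describes.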

Performance Modeling (C/C++/SystemC)

Architecture and Micro-architecture of Key Data and Control Paths

FPGA-based Emulation
RTL Verification: not an adequate method for performance testing in PCIe development

Compliance Verification

Electrical Compliance Check-list

Signal Quality Analysis

Eye pattern, jitter and BER analysis

Signaling for upstream and downstream

Jitter Analysis DLL

Clock recovery

Interpolation

Transition/non-transition eye points

Data-Link Layer Compliance Check-list

Reserved Fields testing

NAK Response

Replay Timer

Replay Count

Link Retrain

Replay TLP Order

Bad CRC

Undefined Packet

Bad Sequence Number

Duplicate TLP

Transaction Layer Compliance Check-list

Completion request, completion time-out, read-data

Messaging: legacy interrupts, native power management, hot-plug, error signaling

Flow Control Initialization, Transmit and Receive States, Negotiated Link Width

Virtual Channel

System Architecture/Platform-configuration Check-list

Capability registers testing

Default values

Stress test

Slot reporting

Hot plug event reporting


Compliance Verification

Separate compliance check-lists, with some overlap, for RC, Endpoints and Switches

Integrated PHY in the silicon

FPGA-based emulation (Native or 3rd Party)

FPGA platforms with discrete PHY and digital logic

Compliance testing with Agilent PTC and PCI-SIG Golden Suite


Compatibility testing with over 80% of the systems during PlugFest
PCI-SIG certification to be on Integrators List

Native protocol checkers: static and temporal


3rd party Compliance Checkers and Vectors

Synopsys, Denali, nSys and others



Design-for-Verification

Cafeteria Architecture: Modular and Scalable

For rapid deployment of various flavors of bridges and switches based on a flagship platform part
Speed of capturing market share is as critical as first product deployment to establish a credible presence

Modular architecture to enable thorough block-level or sub-system level simulations

Functional partitioning to reduce scope of chip-level verification effort and complexity

Standardized block interface

Push v/s Pull Inter-block Data-threads


Distributed v/s Centralized Control Processing

Reduce scope of Error of Specification and Error of Omission


Promote verification component re-use (BFMs, Sequences, etc.)
Minimum number as well as flavors of physical interconnects between blocks (may use in-band signaling where applicable)

Emphasis on correct-by-construction practices during design-creation phase

Otherwise the TTM window will be missed due to prolonged verification or multiple re-spins (PCIe is unforgiving of bugs that hamper compliance or compatibility)

Thank You!

