Ing. Buero Gardiner

PCI Express Training Overview


Summary
A collection of nearly 1000 slides constitutes the base for tailoring a
one to three day PCI Express training specially crafted to meet the
customer's requirements. From a basic introduction through to an
advanced course including hands-on practical sessions, the scope
for configuration is as wide as the spectrum of tasks and
challenges likely to be encountered by any project team.

The training goal is to give the participants all the information they
need to embrace this exciting and pervasive technology with
confidence.

The overview below is based on trainings which have already been
successfully presented to customers. Please contact us to discuss
how we can adapt or extend our material to serve you as well as
we have served others in the past.

Sample slide: Comparison PCI Express / PCI 32-Bit
[Images labelled: PCI Express Card, PCI Card]
Page 54, Oct-11. Copyright © Ing. Buero Gardiner 2011. All Rights Reserved.
Introductory Training

▪ One day
▪ Topics covered:
• Migration from PCI to PCI Express
• Detailed description of PCI Express Transaction Layer
Protocol (TLP)
• Overview of Data-Link (DLL) and Physical (PHY) layers
• Description of Link Training Sequence State Machine
(LTSSM) and its use in system debugging
• Overview of Configuration Space
• Overview of PCI Express interrupt concepts
• Overview of board layout strategies
• Device Driver Overview

Sample slide: Introduction to PCI Express - Transfer Rates
PCI Express Transfer Rates (PCI Express Gen. 1.x)
PCI 33: 132 MB/s, PCI Express x1: 250 MB/s per direction
[Bar chart: PCI Express transfer rates in GBytes/s (axis 0 to 18), per
direction and full duplex, by technology: PCI 33 / 32 Bit, PCI 66 / 64 Bit,
AGP 8x, PCI-X 2.0 / QDR, PCIe x1, x2, x4, x8, x12, x16, x32]

Sample slide: PCI Express Commands
First header DWord: R | Fmt | Type | R | TC | R | TD | EP | Attr | R | Length

Fmt  Type[4:0]  TLP Type  Description
00   0 0000     MRd       Memory Read Request, 32 Bit Address
01   0 0000     MRd       Memory Read Request, 64 Bit Address
00   0 0001     MRdLk     Locked Mem. Rd. Request, 32 Bit Addr.
01   0 0001     MRdLk     Locked Mem. Rd. Request, 64 Bit Addr.
10   0 0000     MWr       Memory Write Request, 32 Bit Addr.
11   0 0000     MWr       Memory Write Request, 64 Bit Addr.
00   0 0010     IORd      I/O Read Request
10   0 0010     IOWr      I/O Write Request
00   0 0100     CfgRd0    Configuration Read, Type 0
10   0 0100     CfgWr0    Configuration Write, Type 0
00   0 0101     CfgRd1    Configuration Read, Type 1
10   0 0101     CfgWr1    Configuration Write, Type 1
01   1 0rrr     Msg       Message Request, no payload
11   1 0rrr     MsgD      Message Request, with payload
00   0 1010     Cpl       Completion, no payload
10   0 1010     CplD      Completion, with payload
00   0 1011     CplLk     Completion for Locked Mem. Rd, error
10   0 1011     CplDLk    Completion for Locked Mem. Rd, with payload

Advanced Training

▪ Three days
▪ Topics covered:
• Migration from PCI to PCI Express

• Overview of PCI Express architectures. Differences
between Gen 1, Gen 2 and Gen 3
• PCI SIG Standards overview
• Detailed description of PCI Express Transaction Layer
Protocol (TLP). Packet routing in PCI Express. PCI
Express ordering rules. Detailed discussion of
transaction model (posted, non-posted)
• Detailed description of Data-Link Layer (DLL). Link-layer
bring-up, flow control, credit handling, inherent error
recovery strategies, DLL transactions for power
management
• Detailed description of Physical Layer (PHY). 8B/10B
coding philosophy, physical layer ordered sets, physical-link
bring-up, link-width recognition, lane polarity
reversal, PHY strategies for power saving
• The Link Training Sequence State Machine (LTSSM). Role
in physical-layer bring-up and system debugging.
• PIPE (Physical Interface for PCI Express) description.
Interface between digital and analogue logic.
• Detailed description of PCI Configuration Space and
extensions for PCI Express. Bus enumeration, Type 0
and Type 1 configuration headers (endpoints/bridges),
BAR principles, detailed description of capability
structures, class codes, role of configuration space in
power management
• PCI Express power concepts. Summary from TLP, DLL
and PHY layers. Detailed description of layer interaction
• PCI Express Interrupt Concepts. Legacy Interrupts, MSI,
MSI-X. Overview of typical latency values, description of
typical pitfalls.
• PCI Express Error model. Fatal, non-Fatal, Correctable.
Reasons and interpretation. Role-based messaging
• Discussion of board layout strategies. Reset concepts.
Standard form factors.
• Overview of PCI Express hot-plug concepts
• Tool description for debugging PCI Express. Serial data
analysers, protocol analysers. Presentation of protocol
analysis session with LeCroy protocol analyser
(practical).
• Device Drivers. Architecture description for Linux and
Windows. Skeleton driver code walk-through (Linux and
Windows).
• Architecture discussion for PCI Express applications.
DMA. DMA integration in typical SoC bus systems such
as AXI, Avalon, OCP and Wishbone
• Discussion of simulation strategies. Optionally with
practical session using Aldec Active-HDL or Riviera
simulators

Sample slide: PCI Express Memory Request
Header layout (Byte 0 to Byte 3, bits 7 to 0 within each byte):
1. DWord:       R | Fmt | Type | R | TC | R | TD | EP | Attr | R | Length
2. DWord:       Requestor ID (Bus No[7:0], Dev No[4:0], Fn[2:0]) | Tag | Last DW BE | 1st DW BE
Opt. 3. DWord:  Address [63:32] (only present if 64-bit addressing used. See Fmt field)
3. / 4. DWord:  Address [31:2] | R
➢ Length does not include ECRC DWord.
   ▪ Presence of ECRC DWord is indicated by TD field
➢ Header size always 3 DWords if address is within first 4 GBytes of memory
address space
➢ Header size of 4 DWords only if address above 4 GBytes
   ▪ 4 DWord header with third DWord set to zero is illegal

Sample slide: Completion Scenario, CplD - Multiple Completions
[Diagram labels: Requestor, Completer, Request TLP, Completion TLPs]
➢ Successful Memory Read Completion
   ▪ Only a Memory Read Request may receive multiple Completions
   ▪ A Completer may respond to a Memory Read Request with a single
     Completion Packet or with multiple Completion Packets
   ▪ In practice often generated by root complex (e.g. Intel chip sets)
➢ May be required from end-point (read request length > Max. Payload Size)

Sample slide: Byte Count Calculation Overview
[Flow chart. Recoverable labels: Request Length, x4, Missing BEs in First BE,
Missing BEs in Last BE, Byte Count for 1. CplD, Max Payload Size, Rem., min,
CplD Length, Byte Count, CplD Sent?]

Sample slide: PCI Express Flow Control
➢ PCI Express
   ▪ Packet transfer is atomic (no wait states, retries etc.)
   ▪ Handshake mechanism is per packet
   ▪ Flow control is based on advertised capacity of receiver to accept a
     packet (Credits)
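
The Memory Request slide in this section states that a 3 DWord header is used for addresses within the first 4 GBytes and a 4 DWord header only above that, with the choice visible in the Fmt field. The following is a minimal sketch of that rule, not code from the training material. The function names are hypothetical, and the bit positions assume the first header DWord is handled as one big-endian 32-bit value (Fmt in bits 30:29, Type in bits 28:24, Length in bits 9:0); TC, TD, EP and Attr are simply left at zero.

```c
#include <assert.h>
#include <stdint.h>

/* Header size in DWords for a memory request to the given address:
 * 3 DWords within the first 4 GBytes, 4 DWords above. */
unsigned mem_req_header_dwords(uint64_t address)
{
    return (address >> 32) != 0 ? 4u : 3u;
}

/* First header DWord of a memory read (is_write == 0) or memory write.
 * Fmt bit 0 selects the 4 DWord header, Fmt bit 1 marks a TLP with payload.
 * length_dw is the length in DWords (1 to 1024); the 10-bit Length field
 * encodes 1024 as 0. Hypothetical helper for illustration only. */
uint32_t mem_req_dw0(int is_write, uint64_t address, unsigned length_dw)
{
    uint32_t fmt = 0;
    const uint32_t type = 0x00; /* Type[4:0] = 0 0000 for MRd and MWr */

    assert(length_dw >= 1 && length_dw <= 1024);
    if (mem_req_header_dwords(address) == 4)
        fmt |= 0x1u; /* 64-bit address, 4 DWord header */
    if (is_write)
        fmt |= 0x2u; /* request carries payload */
    return (fmt << 29) | (type << 24) | (uint32_t)(length_dw & 0x3FFu);
}
```

Because the header size is derived from the address itself, this sketch can never produce the illegal combination named on the slide (a 4 DWord header whose upper address DWord is zero).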

Sample slide: Credit Examples
1: 1 Header Credit / 1 Payload Credit
2: 1 Header Credit / 2 Payload Credits
3: 1 Header Credit
➢ Scenario 1 - As posted write (1x PH, 1x PD), as non-posted write (1x NPH,
1x NPD), as completion (1x CPLH, 1x CPLD)
➢ Scenario 2 - As posted write (1x PH, 2x PD), as completion (1x CPLH, 2x
CPLD)
➢ Scenario 3 - As memory read, I/O read, config. read (1x NPH), as
completion (1x CPLH)
➢ Messages handled same as posted write (memory write)
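
The three credit scenarios above can be reproduced with a small calculation. This is a sketch under two assumptions that the slide does not spell out: one payload credit covers 4 DWords (16 bytes), and every TLP consumes exactly one header credit. The type and function names are made up for illustration.

```c
/* Credits one TLP consumes in its flow-control class. */
typedef struct {
    unsigned header;  /* header credits (PH, NPH or CPLH) */
    unsigned payload; /* payload credits (PD, NPD or CPLD) */
} tlp_credits;

/* Credits needed for a TLP carrying payload_dwords DWords of data.
 * Assumption: one payload credit = 4 DWords, rounded up. */
tlp_credits tlp_credits_needed(unsigned payload_dwords)
{
    tlp_credits c;
    c.header = 1;
    c.payload = (payload_dwords + 3u) / 4u;
    return c;
}

/* Non-zero if the credits advertised by the receiver cover the TLP. */
int tlp_may_be_sent(tlp_credits available, tlp_credits needed)
{
    return available.header >= needed.header &&
           available.payload >= needed.payload;
}
```

Under these assumptions a TLP with up to 4 DWords of payload matches scenario 1, one with 5 to 8 DWords matches scenario 2, and a TLP without payload (a read request, for example) matches scenario 3.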


PCI Express Power Management

▪ One day
▪ Topics covered:
• Summary, migration PCI to PCI Express
• Summary, PCI Express transaction layers (TL, DLL, PHY)
• Summary PCI/PCI Express configuration space
• Summary PCI Express hot-plug concepts
• Detailed overview of PHY layer and PIPE interface in PCI
Express Gen 2 and Gen 3. Signal swing configuration
• Inherent (hardware) power management and how it is
configured
• Software controlled power management. Aux power and
wake-up from external source
• CLKREQ# concept for PCI Express in mobile solutions.
L1/L2 power states

Sample slide: Link Initialisation: LTSSM (Link Training Sequence State Machine)
[State diagram. Recoverable labels: Init, Detect, Polling, Configuration, L0,
L0s, L1, L2, Recovery, Loopback, Hot Reset, Disabled, Compliance]

Sample slide: PCI Express - Transition to L1 State
[Sequence diagram between Upstream Component and Downstream Component.
Recoverable labels: Software writes to Power cap. Structure; Device sends
PM_Enter_L1 DLLP; Upstream Device sends PM_Request_Ack DLLP; Repeatedly sent
until Electrical Idle recognised]
➢ Initiated by System Software / Device Driver
   ▪ Writes power state < D0 to Power Management Capability Structure in
     PCI extended Configuration Space
   ▪ Downstream Component sends PM_Enter_L1 DLLP once outstanding
     non-posted requests have completed and minimum credits accumulated

PCI Express Advanced Concepts

▪ One day
▪ Topics covered
• Summary, migration PCI to PCI Express

• Summary, PCI Express transaction layers (TL, DLL, PHY)
• Summary, PCI Express transaction model (posted, non-posted,
ordering rules)
• Summary PCI/PCI Express configuration space
• Summary PCI Express hot-plug concepts
• Detailed description of L1/L2 power states and CLKREQ#
signalling
• Detailed discussion of Virtual Channels (VC)
• Detailed discussion of Address Translation Services
(ATS)
• Detailed discussion of Alternate Routing ID (ARI)
• Detailed discussion of I/O virtualisation concepts. Single-root,
multi-root virtualisation (SR-IOV, MR-IOV)
• Detailed discussion of TLP prefixes and TLP processing
hints.
• PCI Express atomic operations and multi-cast
Transactions
• PCI Express ID-based ordering

Sample slide: Allocating Virtual Function Device ID
Licensed to Siemens AG for internal use only.
➢ Locates a VF for Transactions routed by ID
   ▪ Completions
Device ID:        bits 15..8 Bus No. | bits 7..3 Dev. No. | bits 2..0 Fct. No
Device ID (ARI):  bits 15..8 Bus No. | bits 7..0 Function No.
[Diagram. Recoverable labels: PCIe Root Complex, PCI Bus 0, PCIe Switch,
Bus 2, Bus 3, Bus 4, Bus 6, Bus 7, Bus 8, PCIe EP, VF, PF; Type 0 Header with
BAR0 to BAR5; SR-IOV Cap. with NumVFs, VF Stride, 1st VF Offset and VF BAR0 to
VF BAR5 (xN); Assigned implicitly on Config Space Req.]
VF_ID = DevID + 1st VF_Offset + (N x VF_Stride)

Sample slide: System Virtualisation - A more complete Picture
[Diagram. Recoverable labels: OS, OS, RTOS, System Image (SI), Virtualisation
Intermediary (VI), SR PCI Manager (SR_PCIM), Processor with Core0 to Core3,
Translation Agent (TA), PCIe Root Complex, two PCIe Endpoints with PFs and VFs]
➢ PF
   ▪ Physical Function
➢ VF
   ▪ Virtual Function
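
The formula on the Virtual Function slide, VF_ID = DevID + 1st VF_Offset + (N x VF_Stride), together with the two Device ID layouts, can be sketched in a few lines. The function names are hypothetical, N is assumed to count from 0 for the first VF, and the result is assumed to be truncated to 16 bits; neither detail is stated on the slide.

```c
#include <stdint.h>

/* Device ID of the N-th VF, following the slide's formula.
 * Assumptions: n starts at 0, result wraps to 16 bits. */
uint16_t vf_device_id(uint16_t pf_id, uint16_t first_vf_offset,
                      uint16_t vf_stride, unsigned n)
{
    return (uint16_t)(pf_id + first_vf_offset + n * vf_stride);
}

/* Conventional layout: Bus No. [15:8], Dev. No. [7:3], Fct. No [2:0]. */
unsigned id_bus(uint16_t id)      { return (unsigned)(id >> 8); }
unsigned id_device(uint16_t id)   { return (unsigned)(id >> 3) & 0x1Fu; }
unsigned id_function(uint16_t id) { return (unsigned)id & 0x07u; }

/* ARI layout: Bus No. [15:8], Function No. [7:0]. */
unsigned id_ari_function(uint16_t id) { return (unsigned)id & 0xFFu; }
```

With illustrative values (a PF at Device ID 0x0600, offset 0x80, stride 2), the fourth VF lands at 0x0686. Read conventionally that is bus 6, device 16, function 6; read with ARI it is bus 6, function 0x86, which shows how ARI makes room for more than eight functions per device.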


PCI Express for Lattice FPGA Developers

▪ Two days
▪ Topics covered
• Migration from PCI to PCI Express
• PCI SIG Standards overview
• Detailed description of PCI Express Transaction Layer
Protocol (TLP). Packet routing in PCI Express. PCI
Express ordering rules. Detailed discussion of
transaction model (posted, non-posted)
• Overview of Data-Link Layer (DLL). Link-layer bring-up,
flow control, credit handling, inherent error recovery
strategies
• Overview of Physical Layer (PHY). 8B/10B coding
philosophy, physical layer ordered sets, physical-link
bring-up, link-width recognition, lane polarity reversal
• Overview of PIPE Interface
• Detailed description of PCI Configuration Space and
extensions for PCI Express. Bus enumeration,
description of capability structures, class codes, role of
configuration space in power management
• PCI Express Interrupt Concepts. Legacy Interrupts, MSI.
Description of typical pitfalls.
• Discussion of board layout strategies. Reset concepts.
Standard form factors.
• Tool description for debugging PCI Express. Serial data
analysers, protocol analysers. Presentation of protocol
analysis session with LeCroy protocol analyser
(practical).
• Device Driver Overview for Linux and Windows.
Development tools.
• Practical. Configuration of Lattice PCI Express IP-Core
• Practical. Design and synthesis of simple completer-only
FPGA solution. Simulation of FPGA design with Aldec
Active-HDL simulator and bus-functional model (BFM).
Participants get BFM and FPGA source code as take-away

Sample slide: PCI Express Requirements: AC-Coupling Capacitors
➢ PCI Express requires AC coupling capacitors in TX path
   ▪ TX and Ref. Clock on side A of add-in card (Components on Side B)
   ▪ 75 nF to 200 nF. 0603 package acceptable but 0402 preferable
   ▪ Locate close to connector or close to component, never in the middle
   ▪ Reflections are highest if caps placed midway between plug and FPGA
   ▪ Intel recommendation: < 1/3 distance between connector and
     component
   ▪ Match the differential pair lengths on a segment-to-segment basis

Sample slide: Message Signalled Interrupts (MSI) Example
➢ MSI Descriptor in Extended PCI Configuration Space
   ▪ System writes Address and Message to Descriptor Registers
➢ Interrupt Sources 0 to 31 write Index Number into lower five Bits
➢ TLP
   ▪ Command is Memory Write
   ▪ Traffic Class (TC) can be chosen freely
   ▪ Address copied from Capability Structure
   ▪ Message and Source Index merged to form Payload
Transaction-Layer Packet (fields as labelled on the slide):
   r | 11 00000 | r | TC | r | Attr | r | 0000000001
   Req. ID | Tag | 0000 0011
   Addr High
   Addr Low | r
   Payload labels: Off., Message, don't care, Source Index
Cap. Structure in Extended Config. Space: Control / Status, Addr High,
Addr Low, Message 00000
From Interrupt Source: Source Index

Sample slide: Bus Functional Model Overview
[Block diagram. Recoverable labels: Synthesis, Simulation, FPGA, Application
Logic, PCIe IP-Core, Protocol Block, PHY, SERDES, 8B + K, Lattice, System
Memory, Arbiter, Compl. List / Scoreboard, Verilog TLP Gen., VHDL TLP Gen.,
Verilog Test Scenarios, VHDL Test Scenarios, Trace-In, Trace-Out]

Sample slide: Wishbone Burst Accesses
[Two diagrams: Wishbone Burst Write, Wishbone Burst Read]
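
The MSI example slide in this section describes how the interrupt write is assembled: the address is copied from the capability structure, and the message value is merged with the source index in its lower five bits. A minimal sketch of that merge follows. The struct layout and names are hypothetical and only mirror the fields named on the slide (Addr High, Addr Low, Message); they are not taken from any IP core or driver.

```c
#include <assert.h>
#include <stdint.h>

/* Fields the system writes into the MSI capability structure. */
typedef struct {
    uint32_t addr_low;
    uint32_t addr_high;
    uint16_t message; /* lower five bits are replaced by the source index */
} msi_capability;

/* Address and payload of the resulting memory write. */
typedef struct {
    uint64_t address;
    uint16_t data;
} msi_write;

/* Build the memory write for interrupt source 0 to 31. */
msi_write msi_build_write(const msi_capability *cap, unsigned source_index)
{
    msi_write w;

    assert(source_index < 32);
    w.address = ((uint64_t)cap->addr_high << 32) | cap->addr_low;
    w.data = (uint16_t)((cap->message & ~0x1Fu) | source_index);
    return w;
}
```

For example, a message value of 0x4160 combined with source index 5 yields a payload of 0x4165; the address passes through unchanged.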


Introductory Device Drivers for PCI Express

▪ Two or three days (Linux and/or Windows). Hands-on Linux
and/or Windows day
▪ Topics covered
• Summary, migration PCI to PCI Express
• Summary, PCI Express transaction layers (TL, DLL, PHY)
• Summary, PCI Express transaction model (posted, non-posted,
ordering rules)
• Detailed description of PCI/PCI Express configuration
space
• Overview of PCI Express Interrupt Concepts
• Summary of PCI Express power concepts
• Discussion of LTSSM and its role in system bring-up
• Tools and frameworks for device driver development
• Tools for system analysis. PCI Express protocol analysis
• Presentation of FPGA design for practicals. Lattice Versa
board. FPGA with source code and simple DMA
controller. DMA sources as obfuscated VHDL.
• Dedicated Linux / Windows day
➢ Basic driver concepts. How is the hardware found,
udev, ini-files, registry, GUIDs
➢ Stages of loading a driver. Claiming, configuring and
releasing resources
➢ The hardware device as a file. File-I/O, IOCTLs
➢ Interrupt concepts
➢ Introduction to scatter/gather DMA
➢ Talking to the driver from application space
➢ Tracing driver operations with a protocol analyser

Sample slide: Application Software: Accessing Device Driver

    #include <windows.h>
    #include <setupapi.h>
    #include <stdlib.h>

    /* BUF_SIZE and the interface GUID LSCC_PCIE_DEMO_007 are not
       defined on the slide. */
    HANDLE hDevFile;
    HDEVINFO hDevInfo;
    DWORD Buffer[BUF_SIZE];
    SP_DEVICE_INTERFACE_DATA DeviceInterfaceData;
    SP_DEVINFO_DATA DeviceInfoData;
    PSP_DEVICE_INTERFACE_DETAIL_DATA pDeviceInterfaceDetail;
    DWORD nBytes;
    ULONG size;

    // Get the set of present devices exposing the interface
    hDevInfo = SetupDiGetClassDevs(&LSCC_PCIE_DEMO_007,
                                   NULL, NULL,
                                   DIGCF_DEVICEINTERFACE | DIGCF_PRESENT);

    DeviceInterfaceData.cbSize = sizeof(SP_DEVICE_INTERFACE_DATA);
    DeviceInfoData.cbSize = sizeof(SP_DEVINFO_DATA);
    SetupDiEnumDeviceInterfaces(hDevInfo, NULL,
                                (LPGUID)&LSCC_PCIE_DEMO_007,
                                0, &DeviceInterfaceData);

    // Determine the size of the device interface detail
    SetupDiGetDeviceInterfaceDetail(hDevInfo, &DeviceInterfaceData,
                                    NULL, 0, &size, NULL);

    // Reserve enough memory for the interface detail and call again
    pDeviceInterfaceDetail = (PSP_DEVICE_INTERFACE_DETAIL_DATA) malloc(size);
    // cbSize must be set before the second call
    pDeviceInterfaceDetail->cbSize = sizeof(SP_DEVICE_INTERFACE_DETAIL_DATA);
    SetupDiGetDeviceInterfaceDetail(hDevInfo, &DeviceInterfaceData,
                                    pDeviceInterfaceDetail, size, NULL, &DeviceInfoData);

    // Get handle to device
    hDevFile = CreateFile(pDeviceInterfaceDetail->DevicePath,
                          GENERIC_READ | GENERIC_WRITE,
                          FILE_SHARE_READ | FILE_SHARE_WRITE, NULL, OPEN_EXISTING, 0, NULL);
    . . . . . . .
    WriteFile(hDevFile, Buffer, sizeof(Buffer), &nBytes, NULL);
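
On the Linux side, the "hardware device as a file" topic from the list above comes down to ordinary POSIX file I/O on a device node. The sketch below is an illustration only and not part of the training material: the function name is made up, and any device node path passed to it (for example something like /dev/pcie_demo0) is hypothetical, since the node name depends on the driver.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Open a device node, write one buffer to it and close it again.
 * Returns the number of bytes written, or -1 if open() or write() failed. */
ssize_t device_write(const char *path, const void *buffer, size_t length)
{
    ssize_t written;
    int fd = open(path, O_RDWR);

    if (fd < 0)
        return -1;
    written = write(fd, buffer, length);
    close(fd);
    return written;
}
```

The same pattern extends to read() and ioctl() on the open descriptor. The sketch can be tried without hardware by pointing it at /dev/null, which accepts every write.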


Copyright © Ing. Buero Gardiner 2015. All Rights Reserved.


Feb. 2017

Ing. Buero Gardiner.


Heuglinstr. 29a, 81249 Muenchen, Deutschland
www.ib-gardiner.eu
