► Related Work
► CCATB Modeling Abstraction
► Exploration Studies
► Conclusion
SoC Communication
► AHB
Pipelined
Burst modes
Split transactions
Multiple masters
► APB
Low power
Simple interface
Single master
AMBA 3.0
► Introduces AXI high performance protocol
Out of order completion
Fixed mode bursts
Advanced system cache support
►Specify if transaction is cacheable/bufferable
►Specify attributes such as write-back/write-through
Enhanced protection support
►Secure/non-secure transaction specification
Exclusive access (for semaphore operations)
Issues
► Selecting and configuring these architectures for optimal performance is a critical activity in SoC design
bus architecture (e.g. AMBA 2.0, AMBA 3.0, CoreConnect)
architecture parameters (e.g. bus width, burst size)
bus topologies (e.g. shared, hierarchical)
protocol choices (e.g. arbitration strategies)
[Figure labels: PE, Interface]
SoC Simulation Speed
Cycle Rate    Technology
1             Silicon Reference Design
10^-2         HW Emulator
10^-3         Transaction Model
10^-4         Cycle Accurate Model
10^-6         RTL Model
10^-7         Gate Level Model
Communication Modeling Approaches
► Cycle Accurate (CA) Models
► Bus Cycle Accurate (BCA) Models
► Transaction Level Modeling (TLM)
► Hybrid Modeling Approaches
Cycle Accurate Models
master:
    var1 = a + b;
    wait();
    REG = d << var1;
    wait();
    HREQ.set(1);
    wait();
slave:
    case CTR_WR:
        CTR_WR = in;
        wait();
        CTR_WR |= 0xf;
        e = REG4 | 0xff;
        wait();
        ST_RG = in | 0x1;
        wait();
[Figure labels: Algorithm, TLM, bus, arb, pin interface]
BCA
• Detailed system debug and analysis
• High level system exploration
• Fast to model
- /10 to /50 RTL
CA
• Fast simulation speed, but the model is not detailed enough for exploring SoC designs
- >>1000x RTL
Register Transfer Level
Hybrid Approaches
master:
    …
    var1 = a + b;
    d = d << var1;
    request(port1);
    wait(3, SC_NS);
    wait(3, SC_NS);
    HSEL.set(1);
slave:
    …
    case CTR_WR:
        CTR_WR = in;
        CTR_WR |= 0xf;
        e = REG4 | 0xff;
        ST_RG = in | 0x1;
    …
[Figure labels: Algorithm, TLM, bus, arb]
CCATB Modeling Abstraction
► Variant of Hybrid Modeling Approach
No pins at interface
read(), write() transaction interface
► Cycle Count Accurate at Transaction Boundaries
maintains overall cycle accuracy, essential for system
exploration
► Trades off intra transaction visibility for
simulation speed
more than 1.5x faster than fastest BCA models
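The trade-off above can be sketched in plain C++ (no SystemC dependency). This is a minimal illustration, not the authors' implementation: the `BusTiming` fields, `transaction_cycles`, and `CcatbClock` are hypothetical names, and the cost formula (request/arbitration/decode delay plus slave delay plus one fixed cost per burst beat) is an assumption. The point it shows is that simulated time advances once per transaction with the full cycle count, instead of once per clock cycle.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical timing parameters for a simple shared bus (assumed, not from the slides).
struct BusTiming {
    std::uint32_t arb_cycles;    // request + arbitration + decode delay
    std::uint32_t slave_cycles;  // slave response delay per transaction
    std::uint32_t beat_cycles;   // cycles per data beat of a burst
};

// Total cycle count of one transaction, charged at the transaction boundary.
inline std::uint64_t transaction_cycles(const BusTiming& t, std::uint32_t burst_len) {
    assert(burst_len > 0);
    return static_cast<std::uint64_t>(t.arb_cycles) + t.slave_cycles +
           static_cast<std::uint64_t>(t.beat_cycles) * burst_len;
}

// Simulated clock that is updated once per transaction, not once per cycle.
class CcatbClock {
public:
    void complete(const BusTiming& t, std::uint32_t burst_len) {
        now_ += transaction_cycles(t, burst_len);
        ++events_;
    }
    std::uint64_t now() const { return now_; }        // elapsed cycles
    std::uint64_t events() const { return events_; }  // simulator updates performed
private:
    std::uint64_t now_ = 0;
    std::uint64_t events_ = 0;
};
```

With these assumed parameters, a 4-beat burst followed by a single transfer costs two clock updates rather than one update per elapsed cycle, which is where the speedup over cycle-by-cycle models would come from.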
Timing Analysis
CCATB
► Model Abstraction
IPs modeled at behavioral level
Bus model extends generic TLM channel, adding
►Timing
►Bus protocol details
► Communication Interface
extension of read(), write() transactions from TLM
Protocol details (e.g. burst size, cache hints) need to be passed
► Modeling Language - SystemC
fast (C/C++ native execution)
provides constructs (concurrency, timing) for hardware modeling
extensive commercial tool support (debugging, waveform
viewing)
Exploration with CCATB Models
► Bus Architecture
e.g. AMBA 2.0 or 3.0 or CoreConnect
► Bus widths
e.g. 16/32/64 bits
► Burst Sizes
for DMA and other bus masters
► Bus Hierarchy/Topology
e.g. Single or Multi layer
► Arbitration Strategy
e.g. static priority, TDMA, RR
► Buffer Sizes
e.g. for queued out of order request completion
► Advanced Modes
e.g. OO completion, CACHE/BUFFER hints
► IP Cores
processor/peripherals
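To make the arbitration strategies in the list above concrete, here is a hedged C++ sketch of static priority, round robin (RR), and TDMA arbiters. The function names, the `Requests` type, and the `kNoGrant` sentinel are hypothetical; treating the lowest master index as highest priority and leaving an unused TDMA slot idle are assumptions, since real arbiters often reassign idle slots.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// requests[i] is true when master i is requesting the bus.
typedef std::vector<bool> Requests;

const int kNoGrant = -1;  // hypothetical sentinel: no master granted

// Static priority: the lowest-index requesting master wins (assumed priority order).
inline int arbitrate_static(const Requests& r) {
    for (std::size_t i = 0; i < r.size(); ++i)
        if (r[i]) return static_cast<int>(i);
    return kNoGrant;
}

// Round robin: the search starts just after the previous winner `last`.
inline int arbitrate_rr(const Requests& r, std::size_t last) {
    const std::size_t n = r.size();
    for (std::size_t k = 1; k <= n; ++k) {
        const std::size_t i = (last + k) % n;
        if (r[i]) return static_cast<int>(i);
    }
    return kNoGrant;
}

// TDMA: the owner of the current time slot wins if it is requesting;
// otherwise the slot is left idle in this simplified version.
inline int arbitrate_tdma(const Requests& r, std::size_t slot) {
    assert(!r.empty());
    const std::size_t owner = slot % r.size();
    if (r[owner]) return static_cast<int>(owner);
    return kNoGrant;
}
```

In an exploration run, swapping one of these functions for another changes only the grant order, so the same master and slave models can be reused across strategies.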
Master:
    msg.length = 1;
    addr = TIMER_REG2;
    write(bus->port1, addr, msg);
    wait();
    …
Bus:
    get_requests(r);
    sl_req = arbitrate(r);
    a = decode(sl_req);
    if (a.read)
        st = read(a, sl_req);
    else
        st = write(a, sl_req);
Slave:
    status read(a, msg)
    {
        switch (addr)
        {
        case TIMER_REG2:
            msg.data = t_reg2;
            x.stat = SLV_OK;
            return x;
[Figure labels: read/write (addr, data_control_token); request + arbitration + decode cycle delay; Slave delay; Simulation; Slave response Time]
CCATB Transaction Token Fields
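The field list for this slide did not survive extraction. As an illustration only, the sketch below shows what a data/control token might carry, assuming fields drawn from features named elsewhere in the deck (`msg.length`, `msg.data`, cache hints, secure and exclusive access, a slave status). The `Token` layout, the `SLV_ERROR` value, the `TIMER_REG2` address, and `TimerSlave` are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

enum class Status { SLV_OK, SLV_ERROR };  // SLV_ERROR is an assumed extra value

// Hypothetical data/control token passed with each read()/write() call.
struct Token {
    std::uint32_t length = 1;         // number of data words (msg.length)
    std::vector<std::uint32_t> data;  // payload (msg.data)
    bool cacheable = false;           // cache hint
    bool bufferable = false;          // buffer hint
    bool secure = false;              // secure / non-secure transaction
    bool exclusive = false;           // exclusive access (semaphore operations)
    Status stat = Status::SLV_OK;     // slave response status
};

const std::uint32_t TIMER_REG2 = 0x08;  // hypothetical register address

// Behavioral slave in the style of the slide's read(a, msg) fragment.
struct TimerSlave {
    std::uint32_t t_reg2 = 0;

    Status read(std::uint32_t addr, Token& msg) const {
        assert(msg.length >= 1);
        switch (addr) {
        case TIMER_REG2:
            msg.data.assign(1, t_reg2);  // return the register value in the token
            msg.stat = Status::SLV_OK;
            break;
        default:
            msg.stat = Status::SLV_ERROR;  // unmapped address
            break;
        }
        return msg.stat;
    }
};
```

Carrying the protocol attributes in the token keeps the `read()`/`write()` signature stable while still letting the bus model see burst and cache information.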
Exploration Study
[Figure: bar chart; y-axis 0 to 2000; series COMPLY, USBDRV, SWITRN]
Topology Configuration
[Figure: bar chart; y-axis 0 to 45; x-axis Original, config A, config B; series COMPLY, USBDRV, SWITRN]
Effect of Buffer Size on Performance
[Figure: y-axis Transactions (read/write) / sec, 1000 to 1800; x-axis 1 to 7; series COMPLY, USBDRV, SWITRN]
[Figure: CCATB vs. BCA; y-axis 0 to 1800; x-axis orig_c, orig_u, orig_s, A_c, A_u, A_s, B_c, B_u, B_s]
Conclusion
► CCATB models
1.55x to 2.20x faster than fastest BCA models
Less modeling effort compared to BCA models
►Since intra-transaction visibility is not a concern
Accurate exploration of communication space
►Performance figures comparable in accuracy to detailed
pin accurate BCA models
Conveniently fit into SoC Design Flow
►Easy to extend TLM level models to get CCATB models
►Easy to refine down to pin accurate BCA level
Thank You!
sudeep@cecs.uci.edu
CCATB
► Plug and play IP models from library
Master (DMAs, processor ISS etc)
Slave (Timers, Interrupt Controllers, Memory etc)
Bus (AMBA 2.0 AHB, AMBA 3.0 AHB etc)
► Performance statistics include
Arbitration Conflicts
IP Throughput
Bandwidth Utilization
Cycles spent waiting for bus (for all master IPs)
Instructions/transactions executed
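A few of the statistics listed above can be gathered with very little bookkeeping. The sketch below is a hypothetical collector, not the tool's actual API: `BusStats`, `record`, and `idle` are assumed names, a conflict is counted whenever more than one master requests in the same arbitration round, and utilization is taken as busy cycles divided by total cycles.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-bus statistics collector.
struct BusStats {
    std::uint64_t total_cycles = 0;
    std::uint64_t busy_cycles = 0;
    std::uint64_t arbitration_conflicts = 0;
    std::vector<std::uint64_t> wait_cycles;  // cycles spent waiting, per master

    explicit BusStats(std::size_t masters) : wait_cycles(masters, 0) {}

    // One arbitration round: `requesters` lists the requesting masters and
    // `winner` holds the bus for `cycles` cycles.
    void record(const std::vector<std::size_t>& requesters,
                std::size_t winner, std::uint64_t cycles) {
        assert(winner < wait_cycles.size());
        if (requesters.size() > 1) ++arbitration_conflicts;
        for (std::size_t m : requesters)
            if (m != winner) wait_cycles[m] += cycles;  // losers wait out the transfer
        busy_cycles += cycles;
        total_cycles += cycles;
    }

    // Cycles in which no master uses the bus.
    void idle(std::uint64_t cycles) { total_cycles += cycles; }

    // Bandwidth utilization as the fraction of cycles the bus was busy.
    double utilization() const {
        if (total_cycles == 0) return 0.0;
        return static_cast<double>(busy_cycles) / static_cast<double>(total_cycles);
    }
};
```

Because a transaction-boundary model already knows each transaction's cycle count, these counters can be updated once per transaction.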
Transaction Level Models (TLM)
► A transaction is defined as the exchange of data or an
event between two components
data can be single word, a series of words (burst)
or a complex data structure that is transferred
over a bus
► TLM captures reads/writes of register values and
interrupts between various system components
not concerned with micro architecture (pin details,
cycle accuracy, clock, protocols like handshaking)
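As a rough illustration of the TLM view described above, the following C++ sketch models a channel that only moves register values: no pins, clock, or handshaking. `TlmChannel` is a hypothetical class, and the 4-byte word stride and the choice to read unwritten registers as zero are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Untimed transaction-level channel: a transaction is a single word or a burst.
class TlmChannel {
public:
    // Write one or more words starting at `addr` (assumed 4-byte word stride).
    void write(std::uint32_t addr, const std::vector<std::uint32_t>& words) {
        assert(!words.empty());
        for (std::size_t i = 0; i < words.size(); ++i)
            regs_[addr + 4u * static_cast<std::uint32_t>(i)] = words[i];
    }

    // Read `count` words starting at `addr`; unwritten registers read as 0 (assumed).
    std::vector<std::uint32_t> read(std::uint32_t addr, std::size_t count) const {
        std::vector<std::uint32_t> out;
        for (std::size_t i = 0; i < count; ++i) {
            std::map<std::uint32_t, std::uint32_t>::const_iterator it =
                regs_.find(addr + 4u * static_cast<std::uint32_t>(i));
            out.push_back(it == regs_.end() ? 0u : it->second);
        }
        return out;
    }

private:
    std::map<std::uint32_t, std::uint32_t> regs_;  // register values by address
};
```

Nothing here models time, which is why a model at this level runs fast but cannot answer cycle-level performance questions.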
COMMEX Features
► Fast communication space exploration at CCATB level
► Seamless interface refinement
from TLM level down to CCATB level
from CCATB down to BCA level
► Plug-and-play different IPs effortlessly
communication architectures (e.g. AMBA2, AMBA3,
CoreConnect)
masters (e.g. ARM926ej-s, ARM920, ARM940)
slaves (e.g. simple ITC, vectored ITC)
► Integrate preexisting IPs using SystemC wrapper code
e.g. ARM CCM models
IBM CoreConnect