
Hardware-assisted verification, from its dawn to SystemVerilog, UVM, and transactors
Lauro Rizzatti, Hans van der Schoot & John Stickley - March 14, 2017

My [LR's] first exposure to hardware emulation happened circa 1995 upon visiting a major
processor firm in Austin, Texas. Its lab was jam-packed from floor to ceiling with monstrous
hardware emulators of different generations from Quickturn, the leader at the time.

What shocked me was the sight of a huge and messy bundle of cables that connected the design-under-test (DUT), the processor design mapped inside the emulator, to a socket on a PC motherboard in place of the yet-to-be-taped-out processor. In the industry, this setup was called in-circuit emulation (ICE). Behind closed doors, the engineers called it "spaghetti cables." It had a mean time between failures (MTBF) of a few hours. Signs all over the lab warned personnel not to step on the cables.

Back then, ICE was the only deployment mode of an emulator, and the very reason for its
existence. It allowed testing of the DUT with real-world traffic. The alternative was to exercise
the DUT with test vectors generated by a software-based testbench executed on a gate-level or
register transfer level (RTL) simulator.

Needless to say, testing the DUT in the context of a real-world testbed had an allure that no
testbench could ever match. The advantage forced emulation users to endure pain, frustration,
and discouragement, and caused design managers to blow through their tool budgets. Early
emulators were non-shareable resources, only used on-site, with price tags that put them in the
capital equipment purchase category.

Testbench simulation and ICE were based on two separate test environments that did not share any commonality in those days of ASIC designs. Simulation was used from the early stages of the design cycle all the way to full ASIC-level verification. ICE was supposed to be the icing on the cake for final system-level validation. But often the icing was not ready before serving the cake: the DUT was ready for emulation only after first silicon samples came back from the foundry, defeating the purpose.

In the mid-1990s, the industry began a long journey to bridge the gap between ICE and simulation. Unknown back then, the solution would come in the form of ICE virtualization: the creation of a testbed functionally equivalent to ICE. It took many years of improvements in simulation, emulation, and testbench technology to reach a point where co-emulation (a.k.a. co-modeling or transaction-based acceleration) became the primary choice over ICE. Ironically, spaghetti cables have been replaced by a soup of acronyms, a welcome tradeoff given the many advantages of virtualization. A short list includes:

PLI: Programming language interface
API: Application programming interface
DPI: Direct programming interface
SCE-MI: Standard Co-Emulation Modeling Interface
BFM: Bus Functional Model
TLM: Transaction-Level Modeling
UVM: Universal Verification Methodology
and many more.

But we're getting ahead of ourselves.

Early co-simulation acceleration via the Verilog PLI

The first integration between an emulator and a simulator was devised in the mid-1990s. At the
time, testbenches were written in the Verilog hardware description language (HDL), and the
integration was based on the IEEE Verilog PLI standard. The PLI provided a mechanism for
Verilog code to call functions written in the C programming language. The Verilog PLI standard
suffered from several drawbacks. It was:

rather awkward and difficult to use,
signal-oriented, dragging down simulation execution and, even more dramatically, emulation speed, and
not user-friendly, due to a callback-based mechanism for value-change detection.

All emulation vendors at the time offered a capability promoted as co-simulation or (cycle-based) simulation acceleration. Calling it acceleration was a misnomer by a mile. Indeed, it was like driving a Ferrari pulling a big trailer filled with 10 tons of gravel. It hobbled the performance of the emulator, dropping it by three or four orders of magnitude. To be specific, in ICE mode, the emulator effectively ran at a few megahertz. In co-simulation, it achieved at most 1 kHz.

In a typical verification environment, the DUT I/O interface includes thousands of signals, many of them switching state within each clock cycle. In simulation, the testbench and DUT communicate via a cycle-accurate, bit-level or signal-level interface, and each I/O signal transition is transferred between the two as it occurs.

In co-simulation, the testbench, processed by the simulator, and the DUT, now in the emulator,
communicate via the same cycle-accurate, signal-level interface, and again, each I/O signal
transition is exchanged between testbench and DUT as it occurs. This was certainly an advantage
from an implementation point of view, as it required no modeling changes, but was a severe
disadvantage for performance. In fact, even though the emulator could run orders of magnitude
faster, it had to wait for these transfers to complete. Further, the emulator often was stalled by the
testbench since it had to wait for the slow testbench to react to incoming signals and produce the
next set of stimuli.
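To make the bottleneck concrete, the sketch below shows what a signal-level co-simulation hookup typically looked like on the Verilog side. The system tasks $drive_dut_pins and $sample_dut_pins are hypothetical stand-ins for user-written PLI C routines, not calls from any real library.

// Hypothetical signal-level co-simulation hookup (Verilog side).
// Every pin value crosses the simulator-emulator link on every clock edge,
// so the emulator stalls while the workstation catches up.
always @(posedge clk)
  $drive_dut_pins(addr_bus, data_bus, wr_en, rd_en);  // testbench -> DUT pins

always @(negedge clk)
  $sample_dut_pins(resp_bus, ack, irq);               // DUT pins -> testbench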

The large communication overhead essentially killed overall performance. The actual
verification performance was limited by the performance of the host PC, the size and complexity
of the testbench, and/or the signal-level interface between the testbench and the DUT:
Figure 1 In this example, simulation profiling shows that the DUT consumes 80% of CPU time,
and the remaining 20% is used by the Verilog testbench and PLI, yielding about a 5× theoretical maximum speed-up in HDL co-simulation.

Today's testbenches consume more than 50% of the CPU time in simulation (sometimes more than 90%), limiting co-simulation acceleration to less than a factor of two. It's no surprise that co-simulation never took off, leaving the ICE mode in the position of prominence that it enjoyed from the beginning.

Co-simulation & simulation acceleration with C/C++ testbenches

The EDA industry never sits idle. Time and again, new ideas and engineering feats enhance the
design verification landscape. In fact, new verification languages were devised to create ever
more advanced testbenches.

A case in point is the use of C/C++ to implement testbenches. The aim is to elevate the abstraction level of the testbench and reduce the impact of the Verilog simulator, the slow link in the chain, on the testbench setup.

The replacement of Verilog with C/C++ testbenches dispensed with the PLI-based
communication between testbench and DUT. Vendors resorted to using custom-made APIs
based on macros to implement the pin-level or signal-level communication interface.

This testbench approach is advantageous to the deployment of an emulator in charge of the DUT. Now, acceleration factors in the double digits became possible:
Figure 2 In this example, simulation profiling shows that the DUT consumes 99% of CPU time, with the remaining 1% used by the C/C++ testbench and API, thus yielding about a 100× theoretical maximum speed-up in HDL co-simulation.

Co-Emulation

Simulation acceleration with C/C++ testbenches at the signal level improved execution speed by close to two orders of magnitude versus PLI-based co-simulation. Still, the emulator (the strong link in the setup) was held back by the testbench (the weak link) and was prevented from using all its underlying processing power.

The breakthrough came by splitting the testbench itself into two parts. A front-end, written at a
higher level of abstraction than RTL and executed on the workstation, would implement
whatever verification capability was expected from the testbench. A back-end, written in RTL
code and synthesized onto the emulator, would implement the testbench I/O protocols, namely the state machines that control the countless DUT I/O pin transitions, a compute-intensive task performed much more efficiently in hardware.

Furthermore, the communication between the front and back ends would come to be multi-cycle transactions instead of signal-level transitions. Today, this is known as a dual-domain environment with transaction-based inter-domain communication: a hardware-based domain, or HDL domain, runs in the emulator, while a software-based domain, or hardware verification language (HVL) domain, executes on the host computer.

Function call-based communication implements the transactions connecting the two parts, both inbound and outbound. The implementation can take several forms, but all should stem from an Accellera standard called SCE-MI, now at version 2.1. SCE-MI is a set of modeling APIs between behavioral models running on a workstation and synthesizable HDL models running on an emulator. The foundation of today's standard is the SystemVerilog DPI (SV-DPI). The communication between emulator and workstation can be implemented using SCE-MI-based DPI import and export functions and tasks, as well as SCE-MI pipe semantics.
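As an illustration only, with module, task, and signal names invented for this article rather than taken from any standard or vendor library, a DPI-based transaction interface on the synthesizable HDL side might look roughly like this:

// Sketch of a synthesizable BFM using SystemVerilog DPI in the spirit of SCE-MI 2.
// The workstation side calls send_byte() once per transaction; the BFM expands
// it into pin-level activity over several clock cycles inside the emulator.
module packet_bfm(input bit clk, output bit [7:0] tx_data, output bit tx_valid);

  // Implemented on the workstation side; called by the BFM to return status
  import "DPI-C" context function void put_response(input int status);

  // Exported task: one call per transaction instead of one event per pin toggle
  task send_byte(input byte unsigned payload);
    @(posedge clk);
    tx_data  <= payload;
    tx_valid <= 1'b1;
    @(posedge clk);
    tx_valid <= 1'b0;
    put_response(0);  // signal completion back to the behavioral side
  endtask
  export "DPI-C" task send_byte;

endmodule

The key point is that the call boundary carries a whole byte transfer at once; how many cycles the BFM spends wiggling pins is invisible to the workstation side.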

The DPI is not afflicted by the drawbacks of the aforementioned PLI standard. Instead, it
presents several advantages:

Much simpler and more intuitive to use.
API-less (i.e., a user-defined function on one side is called from the other side).
Transaction-oriented instead of signal-oriented, leading to much higher speed.

The two domains require two sets of tools, are generally fed different files, and have different
requirements. This scenario leads to increased performance, though the acceleration factor is
dependent on the size and frequency of the transactions and function calls, and other factors:
Figure 3 A split transactor converts transactions coming from the testbench into signal-level,
protocol-specific sequences required by the DUT, and vice versa. (source: Mentor Graphics).

The overall architecture fits well with an emulator and is dominated by the emulator, which can now run at speed. Appropriately, it is called co-emulation.

Three benefits were anticipated from the co-emulation approach. First, writing a testbench at a
higher level of abstraction with fewer lines of code would be easier and less error-prone. Second,
the workstation would process such lightweight behavioral code significantly faster. Third, the
communication between the simulation front-end and the emulation back-end would move from
cycle-based, pin-level synchronization to function-based, transaction level synchronization,
further reducing stalling of the emulator. And, the bigger the transaction, the fewer
synchronization interruptions, resulting in faster execution of the overall setup.

Writing Co-Emulation Testbenches and Transactors

A review of a transactor's characteristics, and of what is required to implement a co-emulation testbench, is in order here.

Table 1 compares the characteristics of the two sides.

HVL/TB Side:
1. Untimed
2. Behavioral
3. Class-based
4. Dynamic
5. Communication with the HDL side only through transactors
6. Programming optimization techniques dictate performance
7. Changes don't cause emulation recompile
8. Standards like UVM apply
9. Verification engineer's comfort zone
10. Emulation-friendly: separated TB-HDL domains + untimed TB

HDL Side:
1. Timed
2. Synthesizable
3. Module/interface-based
4. Static
5. Communication with the HVL side only through transactors
6. Synthesis skills and transactor design dictate performance
7. Changes may require emulation recompile
8. XRTL and synthesis standards apply
9. ASIC designer's comfort zone
10. Emulation-ready: emulation-friendly + synthesizable HDL domain

Table 1 Characteristics of dual-domain co-modeling

The transaction-based testbench (a.k.a. HVL) side is behavioral and untimed. It can be time-aware but should not have explicit time-advancement statements like clock or unit delays. Time advancement is executed on the HDL side, though the testbench can control timing indirectly via remote function and task calls. The testbench may be class-based, like a UVM testbench, but doesn't need to be; either way, it sits well within a verification engineer's comfort zone.
The HDL side is synthesizable and must bear the limitations of modern synthesis technology: behavioral constructs are not generally supported, for example.
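A minimal sketch of this split, with all names invented for illustration, might pair a synthesizable BFM packaged in a SystemVerilog interface on the HDL side with an untimed proxy class on the HVL side:

// HDL side: synthesizable, timed BFM packaged in an interface
interface spi_bfm(input bit clk, output bit mosi, output bit cs_n);
  task automatic send_word(input bit [7:0] word);
    cs_n <= 1'b0;
    for (int i = 7; i >= 0; i--) begin
      @(posedge clk);
      mosi <= word[i];             // serialize one bit per clock
    end
    @(posedge clk);
    cs_n <= 1'b1;
  endtask
endinterface

// HVL side: untimed, class-based proxy; its only contact with timed code
// is the remote task call into the BFM
class spi_driver_proxy;
  virtual spi_bfm bfm;             // handle to the BFM living in the emulator

  function new(virtual spi_bfm bfm);
    this.bfm = bfm;
  endfunction

  task drive(input bit [7:0] word);
    bfm.send_word(word);           // one transaction-level call, many DUT clock cycles
  endtask
endclass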

Mentor Graphics enhanced the capability to write BFMs by developing XRTL (for eXtended RTL), a superset of SystemVerilog RTL. It includes various behavioral constructs, such as implicit state machines, behavioral clock and reset generation, and DPI functions and tasks, that can be synthesized onto an emulator. The HDL domain is statically elaborated, a familiar notion for most ASIC designers. Mentor calls this scenario TBX (TestBench Xpress), which uses such accelerated transactors to enable emulation with modern testbenches.
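For instance, behavioral clock and reset generation of the kind an emulation synthesis front-end such as XRTL can accept might look like the following sketch; it is illustrative only, and the exact constructs supported vary by tool and version:

// Behavioral clock and reset generation inside the HDL domain
module clk_rst_gen(output bit clk, output bit rst_n);
  always #5 clk = ~clk;            // free-running clock from a behavioral delay

  initial begin
    rst_n = 1'b0;
    repeat (10) @(posedge clk);    // hold reset for ten clock cycles
    rst_n = 1'b1;
  end
endmodule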

Benefits of Transactors

Transactors allow the emulator to process data continuously with minimal stalling, dramatically
increasing overall performance over PLI-based acceleration, and approaching the performance of
ICE.

Co-emulation offers several advantages over ICE. It eliminates the need for speed-rate adapters
and physical interfaces. With co-emulation, each physical interface is replaced with a
virtual/logical transaction-level interface. Likewise, speed-rate adapters, required for ICE, are
replaced with protocol-specific transactors.

Unlike speed-rate adapters, transactor models for the latest protocols are readily available off-
the-shelf and easily upgraded to accommodate protocol revisions. Vendors and users provide
libraries of transactors for standard interface protocols as well as tools to enable the development
of custom, proprietary transactors.

Its also possible to create an emulation-like environment by using transactors to connect the
DUT to virtual devices. A virtual device is a software model of a peripheral device that runs on
the workstation.

An additional merit of co-emulation is remote accessibility. As there are no physical interfaces connected to the emulator, a user can fully use and manage it from anywhere in the world.

Transaction-based acceleration led to speed-ups of three to four orders of magnitude over simulation. It finally gave design teams access to the full performance of the emulator without sacrificing much, if any, of the flexibility/visibility of simulation. Namely, it achieved the best of both worlds.

Co-Emulation with UVM

The dual-domain partitioning is required for co-emulation, but works perfectly well for
simulation. The architecture is verification methodology-neutral. It readily fits a methodology
like UVM since UVM has largely the same layering principles. The transactor layer is affected
here, but the BFM proxies make this largely transparent to the UVM or modern testbench
domain.
In terms of verification productivity, the combination of UVM and co-emulation provides
horizontal and vertical reuse benefits from UVM, and reuse across simulation, emulation, FPGA,
and other platforms.
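As a rough sketch, reusing the hypothetical spi_bfm interface from the earlier example and with all other names invented, a UVM driver in such an environment stays untimed and simply delegates the pin wiggling to the BFM:

import uvm_pkg::*;
`include "uvm_macros.svh"

// Minimal sequence item carrying one data word (illustrative)
class spi_item extends uvm_sequence_item;
  rand bit [7:0] data;
  `uvm_object_utils(spi_item)
  function new(string name = "spi_item");
    super.new(name);
  endfunction
endclass

// Untimed UVM driver: all timing lives in the BFM on the emulator,
// so the same driver code runs in simulation and in co-emulation
class spi_driver extends uvm_driver #(spi_item);
  `uvm_component_utils(spi_driver)

  virtual spi_bfm bfm;             // set from the configuration database elsewhere

  function new(string name, uvm_component parent);
    super.new(name, parent);
  endfunction

  virtual task run_phase(uvm_phase phase);
    forever begin
      seq_item_port.get_next_item(req);
      bfm.send_word(req.data);     // transaction-level call into the BFM
      seq_item_port.item_done();
    end
  endtask
endclass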

Conclusions

Platform-portable, emulation-compatible transactors offer a unique combination of performance, accessibility, flexibility, and scalability. Transactors support the development of a realistic system-level test environment for the DUT. They also enable rapid creation of a high-speed, system-level virtual platform by enveloping the emulated DUT with virtual components interacting with its multitude of interfaces.

The use of transactors delivers all the benefits of ICE without the challenges of rate-adapter
availability and physical accessibility. No more spaghetti cables!

By adopting co-emulation for testbench acceleration, design teams can move their verification
strategy up a level of abstraction, and achieve the verification performance and productivity
necessary to fully debug and develop the most complex electronic hardware and software-based
systems.

Also see:

A standard whose time must come: SCE-MI
EDA Viewpoint: Affordable SoC hardware emulation?
SystemVerilog Reference Verification Methodology: VMM Adoption
Hardware emulator performance
Emulation, acceleration, prototyping and simulation. Confused?

Dr. Lauro Rizzatti is a verification consultant and industry expert on hardware emulation
(www.rizzatti.com).

Dr. Hans van der Schoot is a recognized specialist in verification and emulation technology,
and currently engaged in the role of verification architect and methodologist at Mentor
Graphics.

John Stickley is a verification technologist at Mentor Graphics Emulation Division. His research interests are in virtual platforms for system-level modeling and design verification.
