Nick September 18, 2017
It is worth spending a few minutes revisiting cache and its place
across the hierarchy of Arm IP. With the advent of DynamIQ, Arm’s new
cluster microarchitecture, there are a multitude of places where cache
can live: within each core, usually called L1 cache, which is typically
the smallest and fastest cache in the system; shared between cores of
like type, usually called L2; shared across the cluster, called L3; and
shared across clusters, which may be called Last Level Cache (LLC) or
System Cache and is typically the slowest but largest cache in the system.
Many architectural options are available when constructing such systems,
so some or all of these caches may be present in your target system.
Interestingly, with the announcement of the new CCIX protocol, we will
soon see Arm-based SoCs that also share cache from chip to chip.
Given the number of options, and the need to integrate these complex
compute subsystems into bigger SoCs that may also use I/O coherency to
optimize system performance for high-speed I/O such as PCI Express, it
is essential that the caching is fully exercised before committing to
silicon, as a bug in the integration of the SoC could prove disastrous.
False Sharing
I will now explain the “False Sharing” scenario in a little more detail.
Look out for my next blog, coming soon, which will detail the “True
Sharing” scenario.
False Sharing is a situation where cache lines are being used by a number
of cores, and hence the system considers them shared data, but in fact
the cores are using exclusively different parts of the cache line and
therefore do not actually share data with each other.
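To make the definition concrete, here is a minimal Python sketch (not taken from Perspec or any Arm tooling) that classifies a single cache line from the byte offsets each core touches. The core names, the `classify_line` helper and the example layout are all hypothetical and chosen only for illustration; the 64-byte line size matches the one used in this post.

```python
from itertools import combinations

LINE_SIZE = 64  # bytes per cache line, as in the 64-byte line discussed here


def line_index(address: int) -> int:
    """Return the index of the cache line that contains a byte address."""
    return address // LINE_SIZE


def classify_line(accesses: dict[str, set[int]]) -> str:
    """Classify one cache line from the byte offsets (0..63) each core uses.

    `accesses` maps a hypothetical core name to the offsets it touches.
    """
    # Ignore cores that do not touch this line at all.
    active = {core: offs for core, offs in accesses.items() if offs}
    if len(active) < 2:
        return "private"
    # Any byte used by two cores means the data really is shared.
    overlap = any(a & b for a, b in combinations(active.values(), 2))
    return "true sharing" if overlap else "false sharing"


# Three cores using irregular, non-overlapping regions of the same line.
example = {
    "core0": set(range(0, 24)),
    "core1": set(range(24, 40)),
    "core2": set(range(40, 64)),
}
print(classify_line(example))  # false sharing
```

The coherency hardware tracks ownership per line, not per byte, so it cannot tell these two cases apart; that is why a falsely shared line is still treated as shared data.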
The figure below shows by colour which core is using which bytes of the
64-byte cache line. We can immediately see that within each cache line,
regions of data are exclusively used by one core only (one colour). This is
what we mean by False Sharing.
Also notice that the regions are not of regular size, though each is
obviously a whole number of bytes. The number of possible False Sharing
situations is enormous, especially when considering the permutations of
the hierarchical cache architecture. Creating bare-metal software
scenarios to cover a good number of them using hand-written code would
be a significant challenge.
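A rough count gives a feel for the scale. The sketch below assumes a deliberately simplified model that is not from the original post: every core owns exactly one contiguous, non-empty region, and together the regions tile a single 64-byte line. Real scenarios (several lines, non-contiguous regions, different cache hierarchies) only make the number larger. The function name `contiguous_layouts` is hypothetical.

```python
from math import comb, factorial

LINE_SIZE = 64  # bytes per cache line


def contiguous_layouts(num_cores: int, line_size: int = LINE_SIZE) -> int:
    """Count layouts where each core owns one contiguous, non-empty region.

    Simplifying assumption: the regions tile the whole line with no gaps.
    """
    if not 1 <= num_cores <= line_size:
        raise ValueError("need between 1 and line_size cores")
    # Choose where the region boundaries fall between adjacent bytes...
    cuts = comb(line_size - 1, num_cores - 1)
    # ...then decide which core owns which region.
    return cuts * factorial(num_cores)


for n in (2, 4, 8):
    print(n, contiguous_layouts(n))
```

Even under these restrictions, a single line gives 126 layouts for two cores and 953,064 for four, before any variation in cache topology is considered.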
Perspec is able to generate a huge number of specific test cases (the
diagram above is one specific solution) through powerful constraint
solver technology and the PSS model, which abstractly defines data
dependency independent of action ordering. This brings huge
productivity to the test writer, as one test can create hundreds of
possible solutions; the user can pick one and then run it on the SoC they
are working on.
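The idea of one abstract description yielding many concrete solutions can be imitated, very loosely, in a few lines of Python. This is not Perspec, PSS, or a constraint solver: it is plain seeded random sampling under the assumed constraint that each core gets one contiguous, non-empty region of a 64-byte line. The function `random_layout` and the core names are hypothetical.

```python
import random

LINE_SIZE = 64  # bytes per cache line


def random_layout(
    cores: list[str], seed: int, line_size: int = LINE_SIZE
) -> list[tuple[str, int, int]]:
    """Return (core, first_byte, last_byte) regions that tile one cache line.

    Each seed selects one concrete solution; the same seed reproduces it.
    """
    if not 1 <= len(cores) <= line_size:
        raise ValueError("need between 1 and line_size cores")
    rng = random.Random(seed)
    # Pick distinct boundaries so that regions are irregular but non-empty.
    cuts = sorted(rng.sample(range(1, line_size), len(cores) - 1))
    bounds = [0, *cuts, line_size]
    # Shuffle ownership so region order is independent of core order.
    owners = cores[:]
    rng.shuffle(owners)
    return [(core, lo, hi - 1) for core, lo, hi in zip(owners, bounds, bounds[1:])]


# Print three different solutions from the same description.
for seed in range(3):
    print(seed, random_layout(["core0", "core1", "core2", "core3"], seed))
```

Because each layout is reproducible from its seed, a user could pick one they like and rerun exactly that case, which mirrors the "pick one solution and run it" workflow described above.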
In the next blog I will dig a little deeper into how tests are created and
how users can use coverage to decide which test or tests they want to
run.