
PDMS 2 Hour Tutorial

- Multicore computing revolution
  - The need for change
- Proposed Open Unified Technical Framework (OpenUTF) architecture standards
  - OpenMSA, OSAMS, and OpenCAF as future standards
- Introduction to parallel computing
  - Programming models
  - High Speed Communications (HSC) through shared memory
- Synchronization and Parallel Discrete Event Simulation (PDES)
  - Event Management
  - Time Management
- Open discussion

The future of computing is MULTICORE

"I skate to where the puck is going to be, not where it has been." - Wayne Gretzky


The performance wall:
- Clock speed and power consumption
- Memory access bottlenecks
- Limits of instruction-level parallelism

Multiple processors (cores) on a single chip are the future:
- No foreseeable limit to the number of cores per chip
- Requires software to be written differently

Supercomputing community consensus: low-level parallel programming is too hard
- Threads, shared memory, locks/semaphores, race conditions, repeatability, etc., are too hard and expensive to develop and debug (fine-grained HPC is not for the average programmer)
- Message passing is much easier but can be less efficient
- High-level approaches, tools, and frameworks are needed (OpenUTF, new compilers, languages, math libraries, memory management, etc.)

[Figure: hardware hierarchy - a computer/blade/cluster contains boards, each board contains chips, and each chip contains multiple processing nodes (cores); the whole stack feeds cloud computing, net-centric/GIG, and systems-of-systems environments.]

The world of computing is rapidly changing and will soon demand new parallel and distributed service-oriented programming methodologies and technical frameworks.

Experts say that parallel and distributed programming is too hard for typical development teams. The Open Unified Technical Framework abstracts away the low-level programming details.

Microsoft
- Sponsor of the by-invitation-only 2007 Manycore Computing Workshop, which brought together the who's who of supercomputing
- Unanimous consensus on the need for multicore computing software tools and frameworks for developers (e.g., OpenUTF)

Apple
- Snow Leopard will have no new features (the focus is multicore computing)
- "The next version of Apple's OS X operating system will include breakthroughs in programming parallel processors, Apple CEO Steve Jobs told The New York Times in an interview after this week's Worldwide Developers Conference. 'The way the processor industry is going is to add more and more cores, but nobody knows how to program those things,' Jobs said. 'I mean, two, yeah; four, not really; eight, forget it.'"
  http://bits.blogs.nytimes.com/2008/06/10/apple-in-parallel-turningthe-pc-world-upside-down/

Next generation chips
- "Intel has disclosed details on a chip that will compete directly with Nvidia and ATI and may take it into uncharted technological and market-segment waters. Larrabee will be a stand-alone chip, meaning it will be very different from the low-end (but widely used) integrated graphics that Intel now offers as part of the silicon that accompanies its processors. And Larrabee will be based on the universal Intel x86 architecture."
- "The number of cores in each Larrabee chip may vary according to market segment. Intel showed a slide with core counts ranging from 8 to 48, claiming performance scales almost linearly as more cores are added: that is, 16 cores will offer twice the performance of eight cores."
  http://i4you.wordpress.com/2008/08/05/intel-details-future-larrabeegraphics-chip


Next generation chips
- Intel touts 8-core Xeon monster Nehalem-EX
- "Intel gave a demo yesterday of its eight-core, 2.3 billion-transistor Nehalem-EX, which is set to launch later this year ... Nehalem-EX has up to 8 cores, which gives a total of 16 threads per socket."
  By Jon Stokes | Last updated May 28, 2009 8:25 AM CT
  http://arstechnica.com/hardware/news/2009/05/intel-touts-8-core-xeon-monster.ars


Open Unified Technical Framework (OpenUTF)

COMPOSABLE SYSTEMS

[Figure: the OpenUTF Kernel hosting plug-and-play Service Components and Model Components.]

- Simulation is not as cost effective as it should be; we need to do things differently (revolutionary, not evolutionary, change!)
- The multicore computing revolution demands a change in software development methodology; we need a standardized framework
- New architecture standards: we should be building models, not simulations
- Model and service components developed in a common framework automate integration for Test and Evaluation
- Verification and Validation need a common test framework with standard processes
- Open source overcomes the technology/cost barrier and supports widespread community involvement

[Figure: a logarithmic time scale from 10 ms down to 1 ns (10 ms, 1 ms, 100 µs, 10 µs, 1 µs, 100 ns, 10 ns, 1 ns).]

Requires assessment of the current state
- Existing tools, technologies, methodologies, data models, existing interfaces, policies, requirements, business models, contract language, lessons learned, impediments to progress, etc.

Requires the right vision for the future
- Lowered costs, better quality, faster end-to-end execution, easier to use and maintain, feasible technology, optimal use of workforce skill sets, multi-use concepts, composability, modern computational architectures, multi-platform, net-centric, etc.

Requires an executable transition strategy
- Incremental evolution, risk reduction, phased capability, accurately assessed transition costs, available funding, prioritization, community buy-in and participation, formation of new standards


1. Engine and Model Separation
2. Optimized Communications
3. Abstract Time
4. Scheduling Constructs
5. Time Management
6. Encapsulated Components
7. Hierarchical Composition
8. Distributable Composites
9. Abstract Interfaces
10. Interaction Constructs
11. Publish/Subscribe
12. Data Translation Services
13. Multiple Applications
14. Platform Independence
15. Scalability
16. LVC Interoperability Standards
17. Web Services
18. Cognitive Behavior
19. Stochastic Modeling
20. Geospatial Representations
21. Software Utilities
22. External Modeling Framework
23. Output Data Formats
24. Test Framework
25. Community-wide Participation

OpenMSA - Layered Technology
- Focuses on parallel and distributed computing technologies
- Modularizes technologies through a layered architecture
- Contains OSAMS and OpenCAF
- Proven technologies based on experience with large programs
- Cost-effective strategy for developing scalable computing technology
- Provides interoperability without sacrificing performance
- Facilitates sequential, parallel, and distributed computing paradigms

OSAMS - Model/Service Composability
- Focuses on interfaces and a software development methodology that support highly interoperable plug-and-play model/service components
- Provided by OpenMSA but could be supported by other architectures

OpenCAF - Cognitive Intelligent Behavior
- Thoughts and stimulus, goal-oriented behaviors, decision branch exploration, five-dimensional excursions
- Provided as an extension to OSAMS

[Figure: OpenUTF architecture standards, net-centricity, and data models decomposed into three components:
- OpenMSA: open source technology, HPC/multicore performance, synchronization, network and scheduling services, modeling framework, composites, pub/sub services, LVC interoperability, web-based SOA
- OSAMS: modularity, composability, interoperability, flexibility, programming constructs, VV&A, services, models, goals and state machines, decision support
- OpenCAF: behaviors, cognitive thought processes, 5D simulation, goal-oriented optimization, behavior representation, cognitive rule triggering, Bayesian branching]

[Figure: OpenUTF in an LVC federation and enterprise context - direct federates, abstract federates built with CASE tools, and HLA federates connect through an HPC-RTI bridge, gateway interfaces (HLA, DIS, TENA, web-based SOA), and the External Modeling Framework (EMF) and Distributed Blackboard to external systems and visualization/analysis tools.]

OpenMSA layered architecture (top to bottom):
- Entity Composite Repository
- Model & Service Component Repository
- SOM/FOM Data Translation Services
- Distributed Simulation Management Services (OSAMS Pub/Sub Data Distribution)
- Standard Modeling Framework (OSAMS, OpenCAF)
- Time Management
- Event Management Services
- Standard Template Library (OSAMS)
- Persistence (OSAMS)
- Rollback Utilities (OSAMS)
- Rollback Framework
- Internal High Speed Communications
- External Distributed Communications
- ORB Network Services
- General Software Utilities (OSAMS)
- Threads
- Operating System Services

[Figure: OpenCAF reasoning architecture - a Reasoning Engine works from prioritized goals; Thoughts 1..N perform state, action, and task management, spawning tasks (with 5D branching); stimulus/perception (short-term memory) feeds data processing of behaviors, tasks, notifications, abstract methods, and uncertainty; data is received as federation objects and/or interactions.]

Based on the OpenUTF Kernel sensitivity list:
- Sensitive variables (stimulus) are registered with sensitive methods (thoughts)
- Thoughts are automatically triggered whenever a registered stimulus is modified
- Thoughts can modify other stimulus to trigger additional thoughts
- Reasoning terminates when the solution converges or when the maximum number of thoughts is reached
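The triggering pattern described above can be illustrated with a minimal sketch. Stimulus, RegisterThought, and Set are illustrative names, not the OpenUTF Kernel API; a real engine would also stop the chain on convergence or at a maximum thought count, as noted above.

#include <functional>
#include <vector>

template <typename T>
class Stimulus {
public:
    // Register a "thought" (callback) that is sensitive to this variable.
    void RegisterThought(std::function<void(const T&)> thought) {
        thoughts_.push_back(std::move(thought));
    }
    // Modifying the stimulus automatically triggers every registered thought.
    void Set(const T& value) {
        value_ = value;
        for (auto& thought : thoughts_) thought(value_);
    }
    const T& Get() const { return value_; }
private:
    T value_{};
    std::vector<std::function<void(const T&)>> thoughts_;
};

A thought registered on one Stimulus may call Set on another, which is how one thought triggers additional thoughts in this sketch.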

[Figure: a reasoning node A with multiple inputs and outputs.]

Left brain reasoning (Rule Based Reasoning, RBR)
- Inputs are ints, doubles, or Booleans
- Inputs are prioritized when they are associated with RBRs
- Inputs can be fed into multiple reasoning nodes
- Outputs can be inputs to other reasoning nodes
- Feedback loops are permitted

[Figure: a trained reasoning node A with multiple inputs and outputs.]

Learned reasoning (Training Based Reasoning, TBR)
- Inputs are ints, doubles, or Booleans
- The TBR is trained and then used to produce outputs (it can be continually trained during execution)
- Inputs can be fed into multiple reasoning nodes
- Outputs can be inputs to other reasoning nodes
- Feedback loops are permitted

Right brain reasoning (neural-net style node)

1 = w_W + w_X + w_Y + w_Z
A = (w_W·W + w_X·X + w_Y·Y + w_Z·Z) · T_W2 · T_X1 · T_Y1 · T_Z3

- Inputs are normalized, weighted, and summed (the weights sum to 1)
- The sum is multiplied by the product of thresholds to produce the output
- The output is normalized
- Inputs can be fed into multiple reasoning nodes
- Outputs can be inputs to other reasoning nodes
- Feedback loops are permitted

[Figure: output A computed from inputs W, X, Y, Z through weighted links and threshold terms TW1-TW3, TX1-TX2, TY1, TZ1-TZ3.]
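As a sanity check on the reconstructed formula above, here is a small numeric sketch: a weighted sum of normalized inputs multiplied by the product of the active threshold terms. The function name and the final normalization step are assumptions for illustration, not part of the original material.

#include <cstddef>
#include <vector>

double RightBrainNode(const std::vector<double>& weights,     // assumed to sum to 1
                      const std::vector<double>& inputs,      // assumed normalized to [0, 1]
                      const std::vector<double>& thresholds)  // e.g., T_W2, T_X1, T_Y1, T_Z3
{
    double sum = 0.0;
    for (std::size_t i = 0; i < inputs.size(); ++i)
        sum += weights[i] * inputs[i];           // weighted sum of the inputs
    double gate = 1.0;
    for (double t : thresholds)
        gate *= t;                               // product of the threshold terms
    return sum * gate;                           // output; would be normalized before reuse
}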


- Arbitrary graphs can be constructed from Rules, Neural Nets, and Emotions
- Outputs of graphs can trigger changes to behaviors by reprioritizing goals
- Behaviors are only triggered once reasoning is completed

Node types: Emotion Based Reasoning, Training Based Reasoning, Rule Based Reasoning

[Figure: evolution from monolithic applications and simulations built as collections of hardwired services and models toward composable plug-and-play systems, in which the OpenUTF Kernel hosts Service Components and Model Components through Abstract Interfaces and a V&V Test Framework, all within a net-centric enterprise framework connecting composable systems to LVC, Web, GCCS, data, and visualization assets.]

Reusable Software Components
- Plug-and-play composability
- Conceptual model interoperability
- Pub/sub data distribution and abstract interfaces
- V&V test framework
- Performance benchmarks

Parallel and Distributed Operation
- Scalable run-time performance
- Platform/OS independence
- OpenMSA: technology
- OSAMS: modeling constructs
- OpenCAF: behavior representation

Composable Systems
- LVC (HLA, DIS, TENA)
- Web services (SOA)
- Data model
- C4I/GCCS
- Visualization and analysis

Net-centric operation:
- Enterprise frameworks
- Command and control
- Standard data models

Standalone operation:
- Laptops, desktops, clusters, HPC

Legacy interoperability:
- Pub/sub data distribution
- Distributed federation
- Training, analysis, test
- FOM/SOM

[Figure: a composable system built from the OpenUTF Kernel and plug-and-play model/service components supports all three modes of operation.]

Net-centric SOA/LVC on networks of single-processor and multicore computers
- Composites are distributed across processors to achieve parallel performance
- Transparently hosts hierarchical services using the same interfaces as model components
- A SOAP interface connects services to external applications
- Collections of related services are dynamically configured and distributed across processors on multicore systems
- Services internally communicate through pub/sub services and decoupled abstract interfaces
- Seamlessly supports LVC integration

[Figure: a dynamically configured composite inside a net-centric system on a multicore computer, with web services and an LVC interface; services communicate through pub/sub data exchanges and abstract interfaces (subscribed data received, published data provided, abstract services provided, abstract services invoked).]

General concept
- Government-maintained software configuration management
- Automatic platform-independent installation and make system
- Test framework (verification, validation, and benchmarks)
- Will seamlessly support mainstream interoperability standards
- Designed for secure community-wide software distribution

[Figure: the OpenUTF global installation and make system wraps the OpenUTF Kernel (roughly 320,000 lines of code) and a component repository with its own installation and make system. The repository holds source for models (DAS, ETS, T&D) and services (Weather, CCSI, ATP-45); include files for interfaces, polymorphic methods, interactions, federation objects, XML interfaces, and web services; built libraries; and verification, validation, and benchmark tests for each component.]

[Figure: the OpenUTF Kernel at the center of an ecosystem of services, models, data and interfaces, a V&V test framework, development tools, composability tools, visualization tools, analysis tools, web standards, and LVC interoperability standards.]

Introduction to

PARALLEL COMPUTING

[Figure: interconnect topologies - a 2D mesh has (m+n) worst-case hops, a 3D mesh has (l+m+n) worst-case hops, and a 16-node hypercube has log2(N) worst-case hops.]

[Figure: typical parallel program structure - at startup, nodes 0 through N-1 each initialize, then repeatedly compute and communicate in a process cycle, and finally store their results to file.]

Parallel computing vs. distributed computing
- Parallel computing maps computations, data, and/or object instances within an application onto multiple processors to obtain scalable speedup
  - Normally occurs on a single multicore computer, but can operate across multiple machines
  - The entire application crashes if one node or thread crashes
- Distributed computing interconnects loosely coupled applications within a network environment to support interoperable execution
  - Normally occurs on multiple networked machines, but can operate on a single multicore computer
  - Dynamic connectivity supports fault tolerance but loses scalability

Speedup(N) = T1 / TN
Efficiency(N) = Speedup(N) / N
RelativeEfficiency(M,N) = (M / N) * [Speedup(N) / Speedup(M)]
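For example, under these definitions, if a run takes T1 = 100 s on one core and T8 = 20 s on eight cores, then Speedup(8) = 100 / 20 = 5 and Efficiency(8) = 5 / 8 = 62.5% (the timing numbers here are illustrative, not measurements from the tutorial).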

Time driven (or time stepping) is the simplest approach:

for (double time = 0.0; time < END_TIME; time += STEP) {
    UpdateSystem(time);
    Communicate();
}

The discrete event approach (or event stepping) manages activities within the system more efficiently:
- Events occur at a point in time and have no duration
- Events do not have to correspond to physical activities (pseudo-events)
- Events occur for individual object instances, not for the entire system
- Events, when processed, can modify state variables and/or schedule new events
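For contrast with the time-stepped loop above, here is a minimal sequential event-stepped loop. The Event type, queue, and function names are illustrative, not OpenUTF classes.

#include <functional>
#include <queue>
#include <vector>

struct Event {
    double time;                       // point in simulated time (no duration)
    std::function<void()> action;      // may modify state and/or schedule new events
    bool operator>(const Event& rhs) const { return time > rhs.time; }
};

// Min-heap ordered by event time: always process the earliest pending event.
std::priority_queue<Event, std::vector<Event>, std::greater<Event>> eventQueue;
double simTime = 0.0;

void RunUntil(double endTime) {
    while (!eventQueue.empty() && eventQueue.top().time <= endTime) {
        Event e = eventQueue.top();
        eventQueue.pop();
        simTime = e.time;              // jump directly to the next event time
        e.action();
    }
}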

Parallel discrete event simulation introduces unique synchronization challenges.

Distributed net-centric computing
- Programs communicate through a network interface
- TCP/IP, HTTPS, SOA and web services, client/server, CORBA, federations, enterprises, grid computing, NCES, etc.

Parallel multicore computing
- Processors communicate directly through high-speed mechanisms
- Threads, shared memory, message passing

[Figure: progression of programming models - sequential program, multithreaded, shared memory, message passing.]

[Figure: a cluster server composed of multiple shared memory servers, each hosting part of a parallel application.]

High Speed Communications (HSC) services:

Startup and terminate
- Forks processes
- Cleans up shared memory

Asynchronous message passing
- Unicast, destination-based multicast, broadcast
- Automatic or user-defined memory allocation
- Up to 256 message types

Synchronization
- Hard and fuzzy barriers

Coordinated message passing
- Patterned after the Crystal Router
- Synchronized operation guarantees all messages are received by all nodes
- Unicast, destination-based multicast, broadcast
- Synchronized data distribution: broadcast, scatter, gather, form matrix

Global reductions
- Min, Max, Sum, Product, etc.
- Can support user-defined operations

Performance statistics

Miscellaneous services
- Node info, shared memory tuning parameters, etc.

ORB services
- Remote asynchronous method invocation with user-specified interfaces

Example of a global synchronization on five processing nodes

[Figure: a multi-stage (stage 0 through stage 3) combining operation across nodes 0-4; each node waits until the operation completes before the final result is available.]

Shared memory message passing design
- One shared memory block per node
- Slots (a circular buffer) manage incoming messages for each node
- A circular buffer of output messages manages outgoing messages

Steps in sending a message:
1. Write the header and message at the head of the sender's output message buffer.
2. Write the index of the message header into the receiving node's shared memory slot for the sender's node.

Steps in receiving a message:
1. Iterate over the slot managers to find messages.
2. Read the message using the index stored in the slot.
3. Mark the header as read.

Potential technical issues:
- Cache coherency
- Instruction synchronization

[Figure: nodes 0-3 each own incoming slot rings; head and tail pointers track the sender's circular output message buffer.]

[Figure: the two circular-buffer states - the tail chasing the head and the head chasing the tail.]

Message format: a sequence of headers (Header 1, Header 2, ..., Header n) in the output buffer.

Header format:
- int NumBytes
- int Index
- unsigned short Packet
- unsigned short NumPackets
- char DummyChar0
- char DummyChar1
- char DummyChar2
- char ReadFlag
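One possible C++ rendering of the field list above; the field meanings in the comments are inferred from context rather than stated in the slides, and the 16-byte size assumes a typical platform where int is 4 bytes.

struct MsgHeader {
    int            NumBytes;     // total size of the message body in bytes
    int            Index;        // index of the message in the sender's output buffer
    unsigned short Packet;       // packet number (for messages split across packets)
    unsigned short NumPackets;   // total number of packets in the message
    char           DummyChar0;   // padding bytes
    char           DummyChar1;
    char           DummyChar2;
    char           ReadFlag;     // set once the message has been read
};
static_assert(sizeof(MsgHeader) == 16, "expected the header to pack to 16 bytes");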


Parallel Discrete Event Simulation (PDES)

SYNCHRONIZATION

Standardized processing cycle interfaces support any time management algorithm
- Uses virtual functions on the scheduler to specialize processing steps
- Supports reentrant applications (e.g., HPC-RTI, graphical interfaces, etc.)

Highly optimized internal algorithms for managing events
- Optimized and flexible event queue infrastructure
- Native support for sequential, conservative, and optimistic processing
- Internal use of free lists to reduce memory allocation overheads
- Optimized memory management with high speed communications

Statistics gathering and debug support
- Rollback and rollforward application testing
- Automatic statistics gathering (live critical path analysis, message statistics, event processing and rollbacks, memory usage, etc.)
- Merged trace file generation for debugging parallel simulations, which can be tailored to include rollback information, performance data, and user output

Time management modes are generically implemented through class inheritance from WpScheduler
- OpenMSA provides a generic framework to support basic parallel and distributed event processing operations, which makes it easy to implement new time management algorithms
- OpenMSA creates the object implementing the requested time management algorithm at run time
- The base class WpScheduler provides generic event management services for sequential, conservative, and optimistic processing
- The WpWarpSpeed, WpSonicSpeed, WpLightSpeed, and WpHyperWarpSpeed time management objects inherit from WpScheduler to implement their specific event processing and synchronization algorithms
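A minimal sketch of that inheritance pattern follows. The class names WpScheduler and WpWarpSpeed come from the slides, but the virtual method names and their roles are assumptions; the actual interface is not shown in this tutorial.

class WpScheduler {                       // base class: generic event management
public:
    virtual ~WpScheduler() = default;
    virtual void ProcessEvents() = 0;     // specialized event processing per algorithm
    virtual void UpdateGvt()     = 0;     // specialized synchronization per algorithm
};

class WpWarpSpeed : public WpScheduler {  // optimistic parallel event processing
public:
    void ProcessEvents() override { /* process events optimistically; allow rollbacks */ }
    void UpdateGvt()     override { /* compute GVT and commit events up to it */ }
};
// WpSonicSpeed, WpLightSpeed, and WpHyperWarpSpeed would specialize WpScheduler the same way.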


main {
    Plug in User SimObjs
    Plug in User Components
    Plug in User Events
    Execute
}

Initialize {
    Launch Processes
    Establish Communications
    Construct/Initialize SimObjs
    Schedule Initial Events
}

Execute {
    Initialize
    Process Up To (End Time)
    Terminate
}

Process Up To (Time) {
    while (GVT < Time) {
        Process GVT Cycle
    }
}

Process GVT Cycle {
    Process Events & User Functions
    Update GVT
    Commit Events
    Print GVT Statistics
}

Terminate {
    Terminate All SimObjs
    Print Final Statistics
    Shut Down Communications
}



Scheduler: a priority queue of Logical Processes (i.e., Simulation Objects) ordered by next event time

[Figure: each Simulation Object keeps its processed events in a doubly linked list, its future pending events in a priority queue, and its rollback queue of event messages, all ordered by simulation time.]

The priority queue uses a new self-correcting tree data structure that employs a heuristic to keep the tree roughly balanced
- The tree efficiently supports three critical operations:
  - Element insertion in O(log2(n)) time
  - Element retraction in O(log2(n)) time
  - Element removal in O(1) time
- Does not require storing additional information in tree nodes to keep the tree balanced
- Tracks depth on insert and find operations and adjusts the tree organization through specially combined multi-rotation operations
- The goal is to minimize long left/left and/or right/right chains of elements in the tree
- Competes with the STL red-black tree
  - Beats STL when compiled unoptimized
  - Slightly worse than STL when compiled optimized


The rotation heuristic decreases depth to keep the tree roughly balanced:

OptimalDepth = log2(NumElements)
NumRotations = ActualDepth - OptimalDepth
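For example, with 1,024 elements the optimal depth is log2(1024) = 10; if an insert or find reaches depth 14, the heuristic applies 14 - 10 = 4 combined rotations along that path (the element count here is illustrative).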


Rollback Manager
- Manages the list of rollbackable items that are created as rollbackable operations are performed
- Each event provides a rollback manager
- A global pointer is set before the event is processed
- Rollbacks are performed in reverse order to undo operations

Rollback Items
- Each rollbackable operation generates a Rollback Item that is managed by the Rollback Manager
- Rollback utilities include (1) native data types, (2) memory operations, (3) container classes, (4) strings, and (5) various miscellaneous operations
- Rollback Items inherit from the base class and provide four virtual functions: Rollback, Rollforward, Commit, Uncommit
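A minimal sketch of this pattern, assuming C++ signatures that are not given in the slides: the roles of RollbackItem and RollbackManager follow the bullets above, while the constructors and method bodies are illustrative only.

#include <memory>
#include <vector>

class RollbackItem {                       // base class with the four virtual functions
public:
    virtual ~RollbackItem() = default;
    virtual void Rollback()    = 0;        // undo the operation
    virtual void Rollforward() = 0;        // redo it after a rollforward
    virtual void Commit()      = 0;        // the operation is now safely behind GVT
    virtual void Uncommit()    = 0;
};

class RollbackInt : public RollbackItem {  // rollbackable native data type
public:
    RollbackInt(int& var, int newValue)
        : var_(var), oldValue_(var), newValue_(newValue) { var_ = newValue_; }
    void Rollback()    override { var_ = oldValue_; }
    void Rollforward() override { var_ = newValue_; }
    void Commit()      override {}
    void Uncommit()    override {}
private:
    int& var_;
    int  oldValue_;
    int  newValue_;
};

class RollbackManager {                    // one per event
public:
    void Record(std::unique_ptr<RollbackItem> item) { items_.push_back(std::move(item)); }
    void RollbackAll() {                   // undo operations in reverse order
        for (auto it = items_.rbegin(); it != items_.rend(); ++it) (*it)->Rollback();
    }
private:
    std::vector<std::unique_ptr<RollbackItem>> items_;
};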


Distributed Synchronization
- Conservative vs. Optimistic Algorithms
- Rollbacks in the Time Warp Algorithm
- The Event Horizon
- Breathing Time Buckets
- Breathing Time Warp
- WarpSpeed
- Four Flow Control Techniques


Conservative algorithms impose one or more constraints
- Object interactions are limited to just neighbors (e.g., Chandy-Misra)
- Object interactions have non-zero time scales (e.g., lookahead)
- Object interactions follow a FIFO constraint

Optimistic algorithms impose no constraints but require a more sophisticated engine
- Support for rollbacks (and advanced features for rollforward)
- Require flow control to provide stability
- Optimistic approaches can sometimes support real-time applications better

The most important thing is for applications to develop their models to maximize parallelism
- Simulations will generally not execute in parallel faster than their critical path

[Figure: a graph of logical processes (A through G) exchanging events.]

[Figure: conservative event flow at one logical process - it receives self-scheduled events and time from D and scheduled input events and time from C and E on FIFO input queues, and sends scheduled output events and time to B and F.]

GVT (Global Virtual Time) is defined as the minimum time tag of:
- Any unprocessed event
- Any unsent message
- Any message or antimessage in transit

Theoretically, GVT changes as events are processed
- In practice, GVT is updated periodically by a GVT update algorithm

To correctly provide time management services to the outside world, GVT must be updated synchronously between internal nodes
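The definition above reduces to a global minimum, as in the toy sketch below: each node reports its local minimum over unprocessed events, unsent messages, and messages it knows to be in transit, and GVT is the minimum over all nodes. A real GVT algorithm must also account for messages in flight while the reduction runs; this sketch assumes a synchronous cut, and the function name is illustrative.

#include <algorithm>
#include <limits>
#include <vector>

double ComputeGvt(const std::vector<double>& localMinima) {
    double gvt = std::numeric_limits<double>::infinity();
    for (double t : localMinima)
        gvt = std::min(gvt, t);        // global minimum over all nodes
    return gvt;                        // events with time tags below GVT can be committed
}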


[Figure: CPU time vs. simulation time for proximity detection on 32 nodes (259 ground sensors, 1,099 aircraft), comparing Time Warp with Breathing Time Buckets.]

[Figure: processed events and rollbacks vs. simulation time for the same proximity detection scenario (32 nodes, 259 ground sensors, 1,099 aircraft), comparing Time Warp rollbacks with Breathing Time Buckets rollbacks.]

[Figure: messages generated during event processing.]

Opposite problems arise when comparing Breathing Time Buckets and Time Warp
- Imagine mapping events into a global event queue
- Events processed by runaway nodes have a good chance of being rolled back
- Messages from runaway nodes should therefore be held back


Example with four nodes (phases along wall-clock time):
- Time Warp phase: messages are released as events are processed
- Breathing Time Buckets phase: messages are held back
- GVT phase: flushes messages out of the network while processing events
- Commit phase: releases event horizon messages and commits events


The abstract representation of logical time uses five tie-breaking fields to guarantee unique time tags:
- double Time      - simulated physical time of the event
- int Priority1    - first user-settable priority field
- int Priority2    - second user-settable priority field
- int Counter      - event counter of the scheduling SimObj
- int UniqueId     - globally unique Id of the scheduling SimObj

Guaranteed logical times
- The OpenUTF automatically increments the SimObj event Counter to guarantee that each SimObj schedules its events with unique time tags
  - Note: the Counter may jump to ensure that events have increasing time tags
  - SimObj Counter = max(SimObj Counter, Event Counter) + 1
- The OpenUTF automatically stores the UniqueId of the SimObj in event time tags to guarantee that events scheduled by different SimObjs are unique
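The field names and their order come from the slide above; the struct name and the lexicographic comparison below are an illustrative assumption about how the tie-breaking would be expressed in code.

#include <tuple>

struct LogicalTime {
    double Time;        // simulated physical time of the event
    int    Priority1;   // first user-settable priority field
    int    Priority2;   // second user-settable priority field
    int    Counter;     // event counter of the scheduling SimObj
    int    UniqueId;    // globally unique Id of the scheduling SimObj

    // Compare field by field, in slide order, so no two event time tags ever tie.
    bool operator<(const LogicalTime& rhs) const {
        return std::tie(Time, Priority1, Priority2, Counter, UniqueId)
             < std::tie(rhs.Time, rhs.Priority1, rhs.Priority2, rhs.Counter, rhs.UniqueId);
    }
};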

Four algorithms, selectable at run time, are currently supported in the OpenUTF reference implementation:
- LightSpeed for fast sequential processing
  - Optimistic processing overheads are removed
  - Parallel processing overheads are removed
- SonicSpeed for ultra-fast sequential and conservative parallel event processing
  - Highly optimized event management (no bells and whistles)
- WarpSpeed for optimistic parallel event processing, with four new flow control techniques to ensure stability
  - Cascading antimessages can be eliminated
  - Individual event lookahead evaluation for message-sending risk
  - Message-sending risk based on uncommitted event CPU time
  - Run-time adaptable flow control for risk and optimistic processing
- HyperWarpSpeed for supporting five-dimensional simulation
  - Branch excursions, event splitting/merging, parallel universes

[Figure: message-sending risk relative to GVT - in one case (Case 1) messages are held back, in the other (Case 2) it is OK to send messages.]

[Figure: risk and lookahead along the time axis determine whether an event's messages are sent or held back.]

[Figure: uncommitted event CPU times (Tcpu0 through Tcpu6) accumulate along the time axis; once the processing threshold is exceeded, further messages are held back.]

Run-time adaptable flow control:
- NRollbacks: when unstable, decrease Nopt; when stable, slightly increase Nopt
- NAntimessages: when unstable, decrease Nrisk; when stable, slightly increase Nrisk
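A toy version of that adaptive rule is sketched below: back off quickly when the run looks unstable and grow the window slowly when it is stable. The function name, threshold, and scaling choices are illustrative assumptions, not the WarpSpeed implementation.

#include <algorithm>

void AdaptOptimisticWindow(int rollbacksThisCycle, int rollbackThreshold, int& nOpt) {
    if (rollbacksThisCycle > rollbackThreshold)
        nOpt = std::max(1, nOpt / 2);   // unstable: decrease Nopt
    else
        nOpt += 1;                      // stable: slightly increase Nopt
}
// The same rule, driven by the antimessage count per cycle, would adjust Nrisk.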

Final thoughts

OPEN DISCUSSION

Participate in the PDMS Standing Study Group (PDMS-SSG)
- Simulation users
- Model developers
- Technologists
- Sponsors
- Program managers
- Policy makers

Receive OpenUTF hands-on training for the open source reference implementation
- One-week hands-on training events can be arranged for groups if there is enough participation

Begin considering the OpenUTF architecture standards
- OpenMSA layered technology
- OSAMS plug-and-play components
- OpenCAF representation of intelligent behavior
