Inversion problem:
Development:
Database
Requirements:
For each config's worth of data, will pay a one-time insertion cost
Config data may be inserted out of order
Need to insert or delete
Solution:
Requirements basically imply a balanced tree
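As a minimal in-memory sketch of why these requirements point to a balanced tree: `std::map` is an ordered container with logarithmic insert, erase and lookup (commonly implemented as a red-black tree). The `CorrelatorStore` alias, the integer config number used as the key, and the helper names are hypothetical and not part of the original design.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Hypothetical store: config number -> data for that config.
using CorrelatorStore = std::map<int, std::vector<double>>;

// Insert one config's worth of data; the one-time cost is O(log n).
// Assumption: config numbers are non-negative.
inline void insert_config(CorrelatorStore& db, int config, std::vector<double> data) {
    assert(config >= 0);
    db[config] = std::move(data);
}

// Delete a config; returns true if it was present.
inline bool delete_config(CorrelatorStore& db, int config) {
    return db.erase(config) > 0;
}

// Collect config numbers in the tree's sorted order,
// regardless of the order in which they were inserted.
inline std::vector<int> config_order(const CorrelatorStore& db) {
    std::vector<int> order;
    for (const auto& entry : db) order.push_back(entry.first);
    return order;
}
```

Inserting configs 30, 10, 20 in that order still iterates as 10, 20, 30, which is the out-of-order property the requirements ask for.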
Try DB using Berkeley DB (Sleepycat):
Preliminary Tests:
Database key:
String = source_sink_pfx_pfy_pfz_qx_qy_qz_Gamma_linkpath
Not intending (at the moment) any relational capabilities among sub-keys
Interface function:
Array< Array<double> > read_correlator(const string& key);
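The key format and lookup function above can be sketched as follows. This is an illustration under assumptions, not the actual database code: the `CorrelatorKey` struct, its field types, `make_key`, and the in-memory `CorrelatorDB` class are hypothetical, and `std::vector` stands in for the `Array` type in the signature.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical field layout following the key format on the slide.
struct CorrelatorKey {
    std::string source, sink;
    int pfx, pfy, pfz;      // pf components (integer type is an assumption)
    int qx, qy, qz;         // q components (integer type is an assumption)
    int gamma;              // Gamma index
    std::string linkpath;
};

// Join the sub-keys with underscores, in the order given on the slide.
inline std::string make_key(const CorrelatorKey& k) {
    std::ostringstream os;
    os << k.source << '_' << k.sink << '_'
       << k.pfx << '_' << k.pfy << '_' << k.pfz << '_'
       << k.qx << '_' << k.qy << '_' << k.qz << '_'
       << k.gamma << '_' << k.linkpath;
    return os.str();
}

// Stand-in for Array< Array<double> >.
using Correlator = std::vector<std::vector<double>>;

// In-memory stand-in for the database.
class CorrelatorDB {
public:
    void insert(const std::string& key, Correlator c) {
        assert(!key.empty());
        table_[key] = std::move(c);
    }
    // Mirrors read_correlator(const string& key); throws if the key is absent.
    const Correlator& read_correlator(const std::string& key) const {
        auto it = table_.find(key);
        if (it == table_.end()) throw std::out_of_range("no such key: " + key);
        return it->second;
    }
private:
    std::map<std::string, Correlator> table_;
};
```

Because the key is a single opaque string, lookups are exact-match only, which fits the note that no relational capabilities among sub-keys are intended.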
(Clover) Temporal Preconditioning
Consider Dirac op
det(D) = det(D_t + D_s/ )
Temporal precondition: det(D) = det(D_t) det(1 + D_t^{-1} D_s/ )
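The factorization above follows from the multiplicativity of the determinant, assuming D_t is invertible. In the short derivation below, S is shorthand introduced here for the whole spatial term (D_s together with its scale factor):

```latex
\begin{align}
D &= D_t + S = D_t\left(1 + D_t^{-1} S\right), \\
\det(D) &= \det(D_t)\,\det\!\left(1 + D_t^{-1} S\right).
\end{align}
```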
Strategy:
Temporal preconditioning
3D even-odd preconditioning
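One common reading of 3D even-odd preconditioning is that sites are checkerboarded using the spatial coordinates only, so a temporal hop stays within a parity class while a spatial hop moves to the opposite class. The sketch below illustrates that labeling under this assumption; the `Site` struct and `parity3d` function are hypothetical names.

```cpp
#include <cassert>

// Hypothetical lattice site with three spatial coordinates and time.
struct Site { int x, y, z, t; };

// 3D checkerboard parity: computed from x, y, z only; t is ignored.
// Returns 0 for even sites and 1 for odd sites.
// Assumption: coordinates are non-negative.
inline int parity3d(const Site& s) {
    assert(s.x >= 0 && s.y >= 0 && s.z >= 0);
    return (s.x + s.y + s.z) & 1;
}
```

With this labeling, the temporal part of the operator is block-diagonal in parity, which is what lets it be combined with the temporal preconditioning step.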
Expectations
Multi-Threading on Multi-Core Processors
Jie Chen, Ying Chen, Balint Joo and Chip Watson
Scientific Computing Group
IT Division
Jefferson Lab
Motivation
Multi-threading
Test Environment
2.8 GHz, 4 GB memory (DDR2 667 MHz)
2.66 GHz, 4 GB memory (FB-DDR2 667 MHz)
i386, x86_64
Multi-Core Architecture
Figure: block diagrams of the Intel Woodcrest (Intel Xeon 5100) and AMD Opteron (Socket F) platforms. Labels: Core 1, Core 2, Memory Controller, FB DDR2, DDR2, ESB2 I/O, PCI Express, PCI-E Bridge, PCI-E Expansion HUB, PCI-X Bridge.
Multi-Core Architecture
L1 Cache
32 KB Data, 32 KB Instruction
L2 Cache
1 MB dedicated
128 bit width
6.4 GB/s bandwidth to cores
NUMA (DDR2)
64 KB Data, 64 KB Instruction
L2 Cache
Increased Latency
Memory disambiguation allows loads ahead of store instructions
Executions
L1 Cache
FB-DDR2
Executions
AMD Opteron
Memory System Performance
Table (partially recovered): column labels L1, L2, Rand Mem; surviving values: Intel 150.3, AMD 173.8
Performance of Applications
NPB-3.2 (gcc-4.1 x86-64)
Parallel Programming
Figure: Machine 1 and Machine 2 communicate through messages; within each machine, threads use OpenMP/Pthread.
Multi-Threads Provide Higher Memory Bandwidth to a Process
OpenMP
Figure: timeline of the fork-join model, in which the master thread forks a team of threads that later join.
OpenMP
omp_set_num_threads, omp_get_thread_num
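The two calls named above fit into the fork-join model roughly as in this sketch. The function `fork_join_ids` is a hypothetical helper, and the serial fallbacks exist only so the example still builds when OpenMP is not enabled; with OpenMP, compile with the compiler's OpenMP flag (for example `-fopenmp` with gcc).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#else
// Serial fallbacks so the sketch still builds without OpenMP support.
inline void omp_set_num_threads(int) {}
inline int omp_get_thread_num() { return 0; }
#endif

// Fork a team, let each thread record its own id, then join.
// Returns one slot per requested thread; -1 marks slots no thread filled,
// since the runtime may provide fewer threads than requested.
inline std::vector<int> fork_join_ids(int requested) {
    assert(requested > 0);
    std::vector<int> ids(static_cast<std::size_t>(requested), -1);
    omp_set_num_threads(requested);      // request the team size
    #pragma omp parallel                 // fork: the master creates the team
    {
        int tid = omp_get_thread_num();  // 0 is the master thread
        if (tid < requested) ids[static_cast<std::size_t>(tid)] = tid;
    }                                    // join: implicit barrier at region end
    return ids;
}
```

Each thread writes only its own slot, so no locking is needed inside the parallel region.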
POSIX Threads
Complex
QCD Multi-Threading (QMT)
Synchronization Overhead for OMP and QMT on Intel Platform (i386)
Synchronization Overhead for OMP and QMT on AMD Platform (i386)
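A rough way to estimate the OpenMP side of this kind of overhead is to time many empty parallel regions and divide by the count. The sketch below is not the benchmark behind the measurements above; `parallel_region_overhead_us` is a hypothetical helper, it measures fork and join together, and it does not subtract the cost of the timing loop itself. Without OpenMP enabled the pragma is ignored and the result is close to zero.

```cpp
#include <cassert>
#include <chrono>

// Average wall-clock cost, in microseconds, of one empty parallel region.
// Returns 0 when repetitions is 0.
inline double parallel_region_overhead_us(int repetitions) {
    assert(repetitions >= 0);
    if (repetitions == 0) return 0.0;
    using clock = std::chrono::steady_clock;
    auto start = clock::now();
    for (int i = 0; i < repetitions; ++i) {
        #pragma omp parallel
        {
            // Empty body: only the fork and the implicit join barrier are timed.
        }
    }
    auto stop = clock::now();
    std::chrono::duration<double, std::micro> elapsed = stop - start;
    return elapsed.count() / repetitions;
}
```

Running it with a large repetition count (for example 100000) and several thread counts gives per-region costs that can be compared across platforms.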
Conclusions