
Data Transfers between Processes in an SMP System

By Jayesh Patel

Contents
Introduction
Data transfer techniques:
  1. Copying through Shared Buffers
  2. Copying through Message Queues
  3. Copying with the Ptrace System Call
  4. Copying Using a Kernel Module (Kaput)
  5. Copying Using a NIC
Performance comparisons
Conclusion

Introduction
An SMP (symmetric multiprocessing) system is a multiprocessor system with two or more homogeneous processors sharing a centralized main memory, operating under a single operating system.

1. Copying through Shared Buffers


Data transfer is done via a shared buffer located in a shared memory region. One such buffer is preallocated for each pair of processes. The sending process copies the data from its source buffer into the shared buffer; the receiving process then copies the data from the shared buffer into its destination buffer.

[Diagram: Process 1 -> Shared memory -> Process 2]

Copying through Shared Buffers


Synchronization is needed to ensure that one process does not read the data before the other has finished writing, and vice versa. A flag is used to indicate whether the buffer is full or empty.
Limitations:
Scalability: whenever a new process is added, a buffer must be created between it and every other process. If the data to be transferred is very large and a single buffer is allocated, the receiving process has to wait until the entire data has been copied into the buffer. Solution: use a pair of small buffers and a mechanism for switching between them. If the buffer size is too small, however, throughput drops and additional synchronization is needed.

2. Copying through Message Queues


Each process has a pair of queues in a shared memory region accessible to all processes. These queues are called the Free queue and the Receive queue. Each queue is of fixed size, and each element is called a cell.

[Diagram: Process 1's and Process 2's Free queues and Receive queues, all located in shared memory]

Copying through Message Queues


Advantages: High scalability: if a new process is added, only two new queues have to be added in memory.
Disadvantages: If the data to be transferred is larger than the cell size, it is divided into cell-sized chunks, each transferred in a separate cell, which results in additional overhead.

3. Copying with the Ptrace System Call


Both of the previous methods require the data to be copied twice: 1. from the source buffer into the shared-memory buffer/queue; 2. from the shared-memory buffer/queue into the destination buffer. To eliminate one of the copies, a process needs access to the other process's address space. One way is the ptrace mechanism. The ptrace call depends on the hardware architecture and on the operating system. In Linux 2.6, a process's memory is exposed at /proc/<pid>/mem.

[Diagram: Process 2 attaches to Process 1, accesses its memory via /proc/<pid>/mem, and reads the data with the read() system call]
Copying with the Ptrace System Call


Advantage: one copy operation is eliminated.
Disadvantages: it uses a system call, which increases latency. The ptrace call also stops the source process, which therefore cannot do any useful work during the transfer.

4. Copying Using a Kernel Module (Kaput)


Recent Linux kernels allow processes to map their pages into kernel memory. Kaput is a kernel module for Linux 2.6: each process registers its buffer with the module and receives a token that other processes can use to address that buffer.

[Diagram: Process 1 and Process 2 each register with Kaput and receive tokens (e.g. Token=aTrag6R4 and Token=K8Hfr43a) referring to their buffers in kernel memory]
Copying Using a Kernel Module (Kaput)


Advantages: only one copy operation is required, as with ptrace. The source process is not stopped, because the copying is done by the kernel module.
Disadvantages: system call overhead, as with ptrace.

5. Copying Using a NIC


NIC stands for Network Interface Controller. This mechanism is the same as Kaput, except that the NIC performs the copy operation instead of a kernel module. In all the previous methods, when the source processor performs the copy it replaces cached data with the data being copied to the shared memory region, which increases the cache miss rate after the copy. The NIC does not read the data through the processor's cache; instead it reads it directly from the memory allocated to that process in the shared memory region, using RDMA (Remote Direct Memory Access).
Advantages: far fewer cache misses; no source-processor involvement.
Disadvantage: since the NIC uses the I/O bus instead of the system bus, latency is high.

Performance comparisons
Testing environment: dual-SMP 2 GHz Xeon node, 4 GB of main memory, 512 KB 8-way associative L2 cache, OS: Linux 2.6.10, Myrinet 2000 PC164C NIC.

1. Latency:

   Mechanism        Latency (µs)
   Shared buffer     1.5
   Message queue     3.3
   Kaput put         2.1
   Kaput get         2.1
   NIC-copy put     11.6
   NIC-copy get     14.3
   Ptrace           20.2
Performance comparisons
2. Throughput
3. Cache misses
[Throughput and cache-miss charts from the slides are not reproduced in this text version]

Conclusion
In this paper, we described five mechanisms for transferring data between processes in an SMP machine and evaluated them based on bandwidth, latency, and their impact on the application's cache. Not all mechanisms are available in all environments; for instance, users typically cannot load a kernel module for the Kaput mechanism. The NIC-copy mechanism analyzed in this paper used a NIC that is a few years old. Faster NICs and communication subsystems, such as Myricom's MX, can provide up to 495 MB/s of bandwidth and latency down to 2.6 µs.
