• No texture rendering.
• Profiler.
#include <stdio.h>
#include <stdlib.h>   // free
#include <assert.h>
#include <cuda.h>

#define COUNT 10

int main(void)
{
    float* pDataCPU = 0;
    float* pDataGPU = 0;
    int i = 0;

    // release memory (both calls are no-ops on null pointers)
    free(pDataCPU);
    cudaFree(pDataGPU);

    return 0;
}
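The listing above keeps only the declarations and the cleanup. As a minimal sketch of what the steps in between might look like, the program below round-trips COUNT floats through device memory. The allocation and copy steps, the second host buffer `pCopyCPU`, and the helper name `roundTrip` are assumptions made for illustration; they are not taken from the original listing.

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

#define COUNT 10

// Hypothetical helper: copies COUNT floats host -> device -> host.
// Returns how many elements came back unchanged, or -1 on any error.
int roundTrip(void)
{
    const size_t size = COUNT * sizeof(float);
    float* pDataCPU = (float*)malloc(size);                  // source buffer on the host
    float* pCopyCPU = (float*)calloc(COUNT, sizeof(float));  // destination buffer on the host
    float* pDataGPU = 0;                                     // buffer in device global memory
    int matches = -1;
    int i = 0;

    if (pDataCPU && pCopyCPU &&
        cudaMalloc((void**)&pDataGPU, size) == cudaSuccess)
    {
        for (i = 0; i < COUNT; ++i)
            pDataCPU[i] = (float)i;

        if (cudaMemcpy(pDataGPU, pDataCPU, size, cudaMemcpyHostToDevice) == cudaSuccess &&
            cudaMemcpy(pCopyCPU, pDataGPU, size, cudaMemcpyDeviceToHost) == cudaSuccess)
        {
            matches = 0;
            for (i = 0; i < COUNT; ++i)
                if (pCopyCPU[i] == pDataCPU[i])
                    ++matches;
        }
    }

    // release memory (safe even if an allocation failed)
    free(pDataCPU);
    free(pCopyCPU);
    cudaFree(pDataGPU);
    return matches;
}

int main(void)
{
    int matches = roundTrip();
    printf("%d of %d elements matched\n", matches, COUNT);
    return matches == COUNT ? 0 : 1;
}
```

Checking every return code keeps the cleanup path identical whether or not a device is available.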
CUDA Example 1 (notes)
CUDA - Memory Units Description
• Registers:
o Fastest.
o Accessible only by the owning thread.
o Lifetime of a thread.
• Shared memory:
o Can be as fast as registers when there are no bank conflicts or when all threads read from the same address.
o Accessible by any thread within the block where it was created.
o Lifetime of a block.
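To make the register and shared memory notes above concrete, here is a small sketch of a per-block sum. The kernel name `blockSum`, the host helper `sumOfOnes`, and the sizes `BLOCK_SIZE` and `NUM_BLOCKS` are illustrative assumptions. Scalar locals such as `tid` would normally be held in registers, while the `__shared__` array exists once per block and is visible to every thread of that block.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

#define BLOCK_SIZE 64   // threads per block (power of two)
#define NUM_BLOCKS 4

// Each block adds up its own BLOCK_SIZE inputs and writes one partial sum.
__global__ void blockSum(const float* in, float* out)
{
    __shared__ float tile[BLOCK_SIZE];   // one copy per block, lives as long as the block
    int tid = threadIdx.x;               // scalar local, normally kept in a register
    int stride;

    tile[tid] = in[blockIdx.x * BLOCK_SIZE + tid];
    __syncthreads();                     // make every thread's write visible to the block

    // Tree reduction: consecutive threads touch consecutive words,
    // an access pattern that is expected to avoid bank conflicts.
    for (stride = BLOCK_SIZE / 2; stride > 0; stride /= 2)
    {
        if (tid < stride)
            tile[tid] += tile[tid + stride];
        __syncthreads();
    }

    if (tid == 0)
        out[blockIdx.x] = tile[0];
}

// Hypothetical host helper: sums NUM_BLOCKS * BLOCK_SIZE ones on the GPU.
// Returns the total, or -1.0f on any error.
float sumOfOnes(void)
{
    const int n = NUM_BLOCKS * BLOCK_SIZE;
    float hostIn[NUM_BLOCKS * BLOCK_SIZE];
    float hostOut[NUM_BLOCKS];
    float* devIn = 0;
    float* devOut = 0;
    float total = -1.0f;
    int i;

    for (i = 0; i < n; ++i)
        hostIn[i] = 1.0f;

    if (cudaMalloc((void**)&devIn, n * sizeof(float)) == cudaSuccess &&
        cudaMalloc((void**)&devOut, NUM_BLOCKS * sizeof(float)) == cudaSuccess &&
        cudaMemcpy(devIn, hostIn, n * sizeof(float), cudaMemcpyHostToDevice) == cudaSuccess)
    {
        blockSum<<<NUM_BLOCKS, BLOCK_SIZE>>>(devIn, devOut);

        if (cudaGetLastError() == cudaSuccess &&
            cudaMemcpy(hostOut, devOut, NUM_BLOCKS * sizeof(float),
                       cudaMemcpyDeviceToHost) == cudaSuccess)
        {
            total = 0.0f;
            for (i = 0; i < NUM_BLOCKS; ++i)
                total += hostOut[i];
        }
    }

    cudaFree(devIn);
    cudaFree(devOut);
    return total;
}

int main(void)
{
    printf("sum = %.1f\n", sumOfOnes());
    return 0;
}
```

Because each block has its own `tile`, the partial sums are combined on the host; threads in different blocks cannot see each other's shared memory.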
CUDA - Memory Units Description (continued)
• Global Memory:
o Up to 150x slower than registers or shared memory.
o Accessible from either host or device.
o Lifetime of an application.
• Local memory:
o Resides in global memory. Can be 150x slower than registers and shared memory.
o Accessible only by the owning thread.
o Lifetime of a thread.
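The global and local memory notes above can be illustrated with a short sketch. The kernel name `pickMultiple`, the helper `countCorrect`, and the sizes `N` and `SCRATCH` are hypothetical. The host fills a global memory buffer through `cudaMemcpy`, the kernel reads and rewrites it, and each thread uses a private array that is indexed at run time. Whether that array ends up in local memory or in registers is decided by the compiler, so treat the placement as an assumption.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

#define N 10        // number of elements
#define SCRATCH 4   // size of the per-thread array

__global__ void pickMultiple(float* data, int n)
{
    // Per-thread array indexed at run time: the compiler may place it in
    // local memory (which resides in global memory) rather than in registers.
    float scratch[SCRATCH];
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    int k;

    if (idx >= n)
        return;

    for (k = 0; k < SCRATCH; ++k)
        scratch[k] = data[idx] * (float)(k + 1);   // read from global memory

    data[idx] = scratch[idx % SCRATCH];            // write back to global memory
}

// Hypothetical host helper: returns how many elements hold the expected
// value after the kernel ran, or -1 on any error.
int countCorrect(void)
{
    float host[N];
    float* dev = 0;   // global memory: written by the host, updated by the device
    int correct = -1;
    int i;

    for (i = 0; i < N; ++i)
        host[i] = 1.0f;

    if (cudaMalloc((void**)&dev, N * sizeof(float)) == cudaSuccess &&
        cudaMemcpy(dev, host, N * sizeof(float), cudaMemcpyHostToDevice) == cudaSuccess)
    {
        pickMultiple<<<1, 16>>>(dev, N);   // 16 threads, 6 of them exit at the bounds check

        if (cudaGetLastError() == cudaSuccess &&
            cudaMemcpy(host, dev, N * sizeof(float), cudaMemcpyDeviceToHost) == cudaSuccess)
        {
            correct = 0;
            for (i = 0; i < N; ++i)
                if (host[i] == (float)(i % SCRATCH + 1))
                    ++correct;
        }
    }

    cudaFree(dev);
    return correct;
}

int main(void)
{
    printf("%d of %d elements correct\n", countCorrect(), N);
    return 0;
}
```

The buffer behind `dev` stays valid across kernel launches until `cudaFree` is called, whereas `scratch` exists only while its thread runs.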
CUDA - Uses
• CUDA, Wikipedia.
o http://en.wikipedia.org/wiki/CUDA.