Image Formats [API 5.3.1]

Image Channel Order Values [API Table 16]
CL_R            CL_RG     CL_RGBA    CL_RGBx
CL_A            CL_RA     CL_ARGB    CL_sRGB
CL_DEPTH        CL_Rx     CL_BGRA    CL_sRGBA
CL_LUMINANCE    CL_RGB    CL_ABGR    CL_sBGRA
CL_INTENSITY    CL_RGx               CL_sRGBx

Image Channel Data Types [API Table 17]
CL_SNORM_INT8            CL_SIGNED_INT8
CL_SNORM_INT16           CL_SIGNED_INT16
CL_UNORM_INT8            CL_SIGNED_INT32
CL_UNORM_INT16           CL_UNSIGNED_INT8
CL_UNORM_SHORT_565       CL_UNSIGNED_INT16
CL_UNORM_SHORT_555       CL_UNSIGNED_INT32
CL_UNORM_INT_101010      CL_HALF_FLOAT
CL_UNORM_INT_101010_2    CL_FLOAT
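
To make the normalized channel data types above more concrete, here is a minimal sketch in plain host-side C (not OpenCL API code) of how raw channel values are commonly decoded to floats. The helper names and the `rgb_f` struct are hypothetical, and the bit layout used for CL_UNORM_SHORT_565 (red in the top 5 bits, green in the middle 6, blue in the low 5) is an assumption taken from the usual description of that packed format.

```c
#include <stdint.h>

/* Hypothetical helper: CL_UNORM_INT8 maps 0..255 onto 0.0..1.0. */
float unorm_int8_to_float(uint8_t c) {
    return (float)c / 255.0f;
}

/* Hypothetical helper: CL_SNORM_INT8 maps -127..127 onto -1.0..1.0;
   the extra value -128 is clamped to -1.0. */
float snorm_int8_to_float(int8_t c) {
    float f = (float)c / 127.0f;
    return f < -1.0f ? -1.0f : f;
}

typedef struct { float r, g, b; } rgb_f; /* illustrative type only */

/* Hypothetical helper: unpack a 5-6-5 packed pixel (assumed layout:
   bits 15..11 red, bits 10..5 green, bits 4..0 blue). */
rgb_f unorm_short_565_to_float(uint16_t p) {
    rgb_f out;
    out.r = (float)((p >> 11) & 0x1Fu) / 31.0f;
    out.g = (float)((p >> 5) & 0x3Fu) / 63.0f;
    out.b = (float)(p & 0x1Fu) / 31.0f;
    return out;
}
```

For example, `unorm_short_565_to_float(0xF800)` yields pure red under the assumed layout.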
CL_PROGRAM_BUILD_GLOBAL_VARIABLE_TOTAL_SIZE
cl_program program, cl_uint spec_id, size_t spec_size, const void* spec_value)

cl_int clFlush (cl_command_queue command_queue)
cl_int clFinish (cl_command_queue command_queue)
NOTE: The ability to execute native kernels is optional within OpenCL and the semantics of native kernels are implementation-defined. The OpenCL API includes functions to query device capabilities and determine if this capability is supported.

Event Objects [API 5.11 - 5.14]
Event objects can be used to refer to a kernel execution command, and read, write, map, and copy commands on memory objects or user events.

Event objects
cl_event clCreateUserEvent (cl_context context, cl_int *errcode_ret)
cl_int clSetUserEventStatus (cl_event event, cl_int execution_status)
cl_int clWaitForEvents (cl_uint num_events, const cl_event *event_list)
cl_int clGetEventInfo (cl_event event, cl_event_info param_name, size_t param_value_size,
    void *param_value, size_t *param_value_size_ret)
    param_name:
    CL_EVENT_COMMAND_{QUEUE, TYPE},
    CL_EVENT_{CONTEXT, REFERENCE_COUNT},
    CL_EVENT_COMMAND_EXECUTION_STATUS
cl_int clRetainEvent (cl_event event)
cl_int clReleaseEvent (cl_event event)
cl_int clSetEventCallback (cl_event event, cl_int command_exec_callback_type,
    void (CL_CALLBACK *pfn_event_notify) (cl_event event, cl_int event_command_exec_status,
    void *user_data), void *user_data)

Markers, barriers, & waiting for events
cl_int clEnqueueMarkerWithWaitList (cl_command_queue command_queue,
    cl_uint num_events_in_wait_list, const cl_event *event_wait_list, cl_event *event)
cl_int clEnqueueBarrierWithWaitList (cl_command_queue command_queue,
    cl_uint num_events_in_wait_list, const cl_event *event_wait_list, cl_event *event)

Profiling operations
cl_int clGetEventProfilingInfo (cl_event event, cl_profiling_info param_name,
    size_t param_value_size, void *param_value, size_t *param_value_size_ret)
    param_name:
    CL_PROFILING_COMMAND_COMPLETE,
    CL_PROFILING_COMMAND_QUEUED,
    CL_PROFILING_COMMAND_{SUBMIT, START, END}
● Memory Model: SVM [API 3.3.3]
OpenCL extends the global memory region into host memory through a shared virtual memory (SVM) mechanism. Three types of SVM in OpenCL:
• Coarse-Grained buffer SVM: Sharing occurs at the granularity of regions of OpenCL buffer memory objects.
• Fine-Grained buffer SVM: Sharing occurs at the granularity of individual loads/stores into bytes within OpenCL buffer memory objects.
• Fine-Grained system SVM: Sharing occurs at the granularity of individual loads/stores into bytes occurring anywhere within the host memory.
OpenCL C Language Reference
Section and table references are to the OpenCL 3.0 C Language specification.

Supported Data Types [C 6.1]
Double types require that CL_DEVICE_DOUBLE_FP_CONFIG is not zero.

Built-in Scalar Data Types
OpenCL Type               API Type     Description
bool                      --           true (1) or false (0)
char                      cl_char      8-bit signed
unsigned char, uchar      cl_uchar     8-bit unsigned
short                     cl_short     16-bit signed
unsigned short, ushort    cl_ushort    16-bit unsigned
int                       cl_int       32-bit signed
unsigned int, uint        cl_uint      32-bit unsigned
long                      cl_long      64-bit signed
unsigned long, ulong      cl_ulong     64-bit unsigned
float                     cl_float     32-bit float
double                    cl_double    64-bit IEEE 754
half                      cl_half      16-bit float (storage only)
size_t                    --           32- or 64-bit unsigned integer
ptrdiff_t                 --           32- or 64-bit signed integer
intptr_t                  --           32- or 64-bit signed integer
uintptr_t                 --           32- or 64-bit unsigned integer
void                      void         void

Built-in Vector Data Types
n is 2, 3, 4, 8, or 16.
OpenCL Type    API Type        Description
[u]charn       cl_[u]charn     8-bit [un]signed
[u]shortn      cl_[u]shortn    16-bit [un]signed
[u]intn        cl_[u]intn      32-bit [un]signed
[u]longn       cl_[u]longn     64-bit [un]signed
floatn         cl_floatn       32-bit float
doublen        cl_doublen      64-bit float

Other Built-in Data Types
OpenCL Type           Description
event_t               event handle
● queue_t             Requires support for OpenCL C 2.0 or the __opencl_c_device_enqueue feature macro.
● ndrange_t           Requires support for OpenCL C 2.0 or the __opencl_c_device_enqueue feature macro.
● clk_event_t         Requires support for OpenCL C 2.0 or the __opencl_c_device_enqueue feature macro.
● reserve_id_t        Requires support for OpenCL C 2.0 or the __opencl_c_pipes feature macro.
cl_mem_fence_flags

The following types are only defined if the CL_DEVICE_IMAGE_SUPPORT device query is CL_TRUE.
OpenCL Type              Description
image2d_t                2D image handle
image3d_t                3D image handle
image2d_array_t          2D image array
image1d_t                1D image handle
image1d_buffer_t         1D image buffer
image1d_array_t          1D image array
image2d_depth_t          2D depth image
image2d_array_depth_t    2D depth image array
sampler_t                sampler handle

Reserved Data Types
booln
halfn
quad, quadn
complex half, complex halfn
imaginary half, imaginary halfn
complex float, complex floatn
imaginary float, imaginary floatn
complex double, complex doublen
imaginary double, imaginary doublen
complex quad, complex quadn
imaginary quad, imaginary quadn
floatnxm
doublenxm
long double, long doublen
long long, long longn
unsigned long long, ulong long, ulong longn
Preprocessor Directives & Macros [C 6.10]
#pragma OPENCL FP_CONTRACT on-off-switch
    on-off-switch: ON, OFF, DEFAULT
#pragma OPENCL EXTENSION extensionname : behavior
#pragma OPENCL EXTENSION all : behavior

__FILE__                    Current source file
__LINE__                    Integer line number
__OPENCL_VERSION__          Integer version number, e.g: 300
CL_VERSION_1_0              Substitutes integer 100 for 1.0
CL_VERSION_1_1              Substitutes integer 110 for 1.1
CL_VERSION_1_2              Substitutes integer 120 for 1.2
● CL_VERSION_2_0            Substitutes integer 200 for 2.0
CL_VERSION_3_0              Substitutes integer 300 for 3.0
__OPENCL_C_VERSION__        Substitutes integer for OpenCL C version
__ENDIAN_LITTLE__           1 if device is little endian
__IMAGE_SUPPORT__           1 if images are supported
__FAST_RELAXED_MATH__       1 if -cl-fast-relaxed-math optimization option is specified
__kernel_exec (X, typen)    Same as: __kernel __attribute__((work_group_size_hint(X, 1, 1)))
                            __attribute__((vec_type_hint(typen)))

Attribute Qualifiers [C 6.11]
Use to specify special attributes of enum, struct, and union types.
__attribute__((aligned(n)))    __attribute__((endian(host)))
__attribute__((aligned))       __attribute__((endian(device)))
__attribute__((packed))        __attribute__((endian))

Use to specify special attributes of variables or structure fields.
__attribute__((aligned(alignment)))

Use to specify basic blocks and control-flow-statements.
__attribute__((attr1)) {…}

Use to specify that a loop (for, while, and do loops) can be unrolled. (Must appear immediately before the loop to be affected.)
__attribute__((opencl_unroll_hint(n)))
__attribute__((opencl_unroll_hint))

Conversions, Type Casting Examples [C 6.2]
T a = (T)b; // Scalar to scalar, or scalar to vector
T a = convert_T(b);
T a = convert_T_R(b);
T a = as_T(b);
T a = convert_T_sat_R(b);

R: one of the rounding modes
_rte    to nearest even
_rtz    toward zero
_rtp    toward + infinity
_rtn    toward - infinity
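
The rounding-mode suffixes and the _sat modifier can be illustrated with a small plain-C emulation. This is only a sketch of the intended arithmetic for float-to-int and int-to-uchar conversions, not the OpenCL C built-ins themselves: the `emul_` function names are hypothetical, and the float helpers assume the input already fits in the range of int.

```c
/* _rtz: round toward zero (a C cast already truncates). */
int emul_convert_int_rtz(float x) {
    return (int)x;
}

/* _rtn: round toward negative infinity. */
int emul_convert_int_rtn(float x) {
    int n = (int)x;
    return ((float)n > x) ? n - 1 : n;
}

/* _rtp: round toward positive infinity. */
int emul_convert_int_rtp(float x) {
    int n = (int)x;
    return ((float)n < x) ? n + 1 : n;
}

/* _rte: round to nearest, ties go to the even integer. */
int emul_convert_int_rte(float x) {
    int lo = emul_convert_int_rtn(x);
    float frac = x - (float)lo;      /* in [0, 1) */
    if (frac > 0.5f) return lo + 1;
    if (frac < 0.5f) return lo;
    return (lo % 2 == 0) ? lo : lo + 1;
}

/* _sat: out-of-range inputs clamp to the destination range
   (here int to uchar, i.e. 0..255) instead of wrapping. */
unsigned char emul_convert_uchar_sat(int x) {
    if (x < 0) return 0;
    if (x > 255) return 255;
    return (unsigned char)x;
}
```

In this emulation 2.5f and -2.5f round to 2 and -2 under _rte, while 3.5f rounds to 4.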
©2020 Khronos Group - Rev. 0420 www.khronos.org/opencl
OpenCL 3.0 (Provisional) Reference Guide OpenCL C Language Page 6
Access Qualifiers [C 6.6]
Apply to 2D and 3D image types to declare if the image memory object is being read or written by a kernel.
__read_only, read_only
__write_only, write_only
● __read_write, read_write (Requires OpenCL C 2.0 or the __opencl_c_read_write_images feature macro.)

● Blocks [C 6.12]
A result value type with a list of parameter types. Requires support for the __opencl_c_device_enqueue feature macro or OpenCL C 2.0. For example:
1. The ^ declares variable “myBlock” is a Block.
2. The return type for the Block “myBlock” is int.
3. myBlock takes a single argument of type int.
4. The argument is named “num.”
5. Multiplier captured from block’s environment.

    (2)  (1)       (3)
    int (^myBlock)(int) =
              (4)                  (5)
        ^(int num) {return num * multiplier; };

Work-Item Built-in Functions [C 6.13.1]
Query the number of dimensions, global, and local work size specified to clEnqueueNDRangeKernel, and global and local identifier of each work-item when this kernel is executed on a device. The sub-group functions require the feature macro __opencl_c_subgroups.

uint get_work_dim ()                          Number of dimensions in use
size_t get_global_size (uint dimindx)         Number of global work-items
size_t get_global_id (uint dimindx)           Global work-item ID value
size_t get_local_size (uint dimindx)          Number of work-items in the work-group
size_t get_enqueued_local_size (uint dimindx) Number of work-items in a uniform work-group
size_t get_local_id (uint dimindx)            Local work-item ID
size_t get_num_groups (uint dimindx)          Number of work-groups
size_t get_group_id (uint dimindx)            Work-group ID
size_t get_global_offset (uint dimindx)       Global offset
size_t get_global_linear_id ()                Work-item's 1-dimensional global ID
size_t get_local_linear_id ()                 Work-item's 1-dimensional local ID
● uint get_sub_group_size ()                  Number of work-items in the subgroup
● uint get_max_sub_group_size ()              Maximum size of a subgroup
● uint get_num_sub_groups ()                  Number of subgroups in the work-group
● uint get_enqueued_num_sub_groups ()         Number of subgroups in a uniform work-group
● uint get_sub_group_id ()                    Sub-group ID
● uint get_sub_group_local_id ()              Unique work-item ID
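
The relationship between the work-item queries can be sketched in plain C. The struct and function names below are hypothetical, and the sketch assumes uniform work-groups (the global size is an exact multiple of the local size in each dimension) with unused dimensions given size 1 and offset 0.

```c
#include <stddef.h>

/* Hypothetical description of an ND-range; not an OpenCL type. */
typedef struct {
    size_t global_size[3];
    size_t global_offset[3];
    size_t local_size[3];
} ndrange_desc;

/* What get_num_groups(d) would report for a uniform ND-range. */
size_t emul_num_groups(const ndrange_desc *r, unsigned d) {
    return r->global_size[d] / r->local_size[d];
}

/* What get_global_id(d) would report for the work-item with the
   given group ID and local ID in dimension d. */
size_t emul_global_id(const ndrange_desc *r, unsigned d,
                      size_t group_id, size_t local_id) {
    return group_id * r->local_size[d] + local_id + r->global_offset[d];
}

/* What get_global_linear_id() would report: a row-major index built
   from the global IDs with the global offset removed. */
size_t emul_global_linear_id(const ndrange_desc *r, const size_t gid[3]) {
    size_t x = gid[0] - r->global_offset[0];
    size_t y = gid[1] - r->global_offset[1];
    size_t z = gid[2] - r->global_offset[2];
    return (z * r->global_size[1] + y) * r->global_size[0] + x;
}
```

For an 8 x 4 global range with 4 x 2 work-groups, the work-item at global ID (7, 3) has linear ID 3 * 8 + 7 = 31.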
uint4 read_imageui (read_only image2d_array_t image, sampler_t sampler, {int4, float4} coord)
float read_imagef (read_only image2d_depth_t image, sampler_t sampler, {int2, float2} coord)
float read_imagef (read_only image2d_array_depth_t image, sampler_t sampler, {int4, float4} coord)
float4 read_imagef (aQual image2d_t image, int2 coord)
int4 read_imagei (aQual image2d_t image, int2 coord)
uint4 read_imageui (aQual image2d_t image, int2 coord)
float4 read_imagef (aQual image2d_array_t image, int4 coord)
int4 read_imagei (aQual image2d_array_t image, int4 coord)
uint4 read_imageui (aQual image2d_array_t image, int4 coord)
float read_imagef (aQual image2d_depth_t image, int2 coord)
float read_imagef (aQual image2d_array_depth_t image, int4 coord)
void write_imagef (aQual image2d_t image, int2 coord, float4 color)
void write_imagei (aQual image2d_t image, int2 coord, int4 color)
void write_imageui (aQual image2d_t image, int2 coord, uint4 color)

uint4 read_imageui (read_only image1d_t image, sampler_t sampler, {int, float} coord)
float4 read_imagef (read_only image1d_array_t image, sampler_t sampler, {int2, float2} coord)
int4 read_imagei (read_only image1d_array_t image, sampler_t sampler, {int2, float2} coord)
uint4 read_imageui (read_only image1d_array_t image, sampler_t sampler, {int2, float2} coord)
float4 read_imagef (aQual image1d_t image, int coord)
float4 read_imagef (aQual image1d_buffer_t image, int coord)
int4 read_imagei (aQual image1d_t image, int coord)
uint4 read_imageui (aQual image1d_t image, int coord)
int4 read_imagei (aQual image1d_buffer_t image, int coord)
uint4 read_imageui (aQual image1d_buffer_t image, int coord)
float4 read_imagef (aQual image1d_array_t image, int2 coord)
int4 read_imagei (aQual image1d_array_t image, int2 coord)
uint4 read_imageui (aQual image1d_array_t image, int2 coord)

Read and write functions for 3D images
Read an element from a 3D image, or write a color value to a location in a 3D image.
float4 read_imagef (read_only image3d_t image, sampler_t sampler, {int4, float4} coord)
int4 read_imagei (read_only image3d_t image, sampler_t sampler, int4 coord)
int4 read_imagei (read_only image3d_t image, sampler_t sampler, float4 coord)
uint4 read_imageui (read_only image3d_t image, sampler_t sampler, {int4, float4} coord)
float4 read_imagef (aQual image3d_t image, int4 coord)
int4 read_imagei (aQual image3d_t image, int4 coord)
uint4 read_imageui (aQual image3d_t image, int4 coord)
The write_image{f, i, ui} functions require support for OpenCL C 2.0 or the __opencl_c_3d_image_writes feature macro.

Image Query Functions [C 6.13.14]

Query image width, height, and depth in pixels
int get_image_width (aQual image{1,2,3}d_t image)
int get_image_width (aQual image1d_buffer_t image)
int get_image_width (aQual image{1,2}d_array_t image)
● int get_image_width (aQual image2d_[array_]depth_t image)
int get_image_height (aQual image{2,3}d_t image)
int get_image_height (aQual image2d_array_t image)
● int get_image_height (aQual image2d_[array_]depth_t image)
int get_image_depth (image3d_t image)

Query image array size
size_t get_image_array_size (aQual image1d_array_t image)
size_t get_image_array_size (aQual image2d_array_t image)
● size_t get_image_array_size (aQual image2d_array_depth_t image)

Query image dimensions
int2 get_image_dim (aQual image2d_t image)
int2 get_image_dim (aQual image2d_array_t image)
int4 get_image_dim (aQual image3d_t image)
● int2 get_image_dim (aQual image2d_[array_]depth_t image)

Query image channel data type and order
int get_image_channel_data_type (aQual image{1,2,3}d_t image)
int get_image_channel_data_type (aQual image1d_buffer_t image)
int get_image_channel_data_type (aQual image{1,2}d_array_t image)
● int get_image_channel_data_type (aQual image2d_[array_]depth_t image)
int get_image_channel_order (aQual image{1,2,3}d_t image)
int get_image_channel_order (aQual image1d_buffer_t image)
int get_image_channel_order (aQual image{1,2}d_array_t image)
● int get_image_channel_order (aQual image2d_[array_]depth_t image)

Common Built-in Functions [C 6.13.4]
These functions operate component-wise and use round to nearest even rounding mode. Ts is type float, optionally double (if double precision is supported). Tn is the vector form of Ts, where n is 2, 3, 4, 8, or 16. T is Ts and Tn.

T clamp (T x, T min, T max)
Tn clamp (Tn x, Ts min, Ts max)
    Clamp x to range given by min, max
T degrees (T radians)
    radians to degrees
T max (T x, T y)
Tn max (Tn x, Ts y)
    Max of x and y
T min (T x, T y)
Tn min (Tn x, Ts y)
    Min of x and y
T mix (T x, T y, T a)
Tn mix (Tn x, Tn y, Ts a)
    Linear blend of x and y
T radians (T degrees)
    degrees to radians
T step (T edge, T x)
Tn step (Ts edge, Tn x)
    0.0 if x < edge, else 1.0
T smoothstep (T edge0, T edge1, T x)
T smoothstep (Ts edge0, Ts edge1, T x)
    Step and interpolate
T sign (T x)
    Sign of x
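
As a rough guide to what the common functions compute, the following plain-C sketch emulates the scalar float forms of clamp, mix, step, smoothstep, and sign. The `emul_` names are hypothetical; the formulas follow the usual definitions (linear blend x + (y - x) * a, Hermite interpolation t * t * (3 - 2 * t)), and NaN inputs as well as the sign of zero are deliberately not modelled.

```c
/* clamp: limit x to the range [lo, hi]; assumes lo <= hi. */
float emul_clamp(float x, float lo, float hi) {
    return x < lo ? lo : (x > hi ? hi : x);
}

/* mix: linear blend of x and y; a = 0 gives x, a = 1 gives y. */
float emul_mix(float x, float y, float a) {
    return x + (y - x) * a;
}

/* step: 0.0 if x < edge, else 1.0. */
float emul_step(float edge, float x) {
    return x < edge ? 0.0f : 1.0f;
}

/* smoothstep: 0 below edge0, 1 above edge1, smooth Hermite
   interpolation in between; assumes edge0 < edge1. */
float emul_smoothstep(float edge0, float edge1, float x) {
    float t = emul_clamp((x - edge0) / (edge1 - edge0), 0.0f, 1.0f);
    return t * t * (3.0f - 2.0f * t);
}

/* sign: 1.0 for positive x, -1.0 for negative x, 0.0 for zero. */
float emul_sign(float x) {
    if (x > 0.0f) return 1.0f;
    if (x < 0.0f) return -1.0f;
    return 0.0f;
}
```

The vector forms apply the same arithmetic to each component independently.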
Relational Built-in Functions [C 6.13.6]
These functions can be used with built-in scalar or vector types as arguments and return a scalar or vector integer result. T is type float, floatn, char, charn, uchar, ucharn, short, shortn, ushort, ushortn, int, intn, uint, uintn, long, longn, ulong, ulongn, or optionally double or doublen (if double precision is supported). Ti is type char, charn, short, shortn, int, intn, long, or longn. Tu is type uchar, ucharn, ushort, ushortn, uint, uintn, ulong, or ulongn. n is 2, 3, 4, 8, or 16.

int isequal (float x, float y)
intn isequal (floatn x, floatn y)
int isequal (double x, double y)
longn isequal (doublen x, doublen y)
    Compare of x == y
int isnotequal (float x, float y)
intn isnotequal (floatn x, floatn y)
int isnotequal (double x, double y)
longn isnotequal (doublen x, doublen y)
    Compare of x != y
int isgreater (float x, float y)
intn isgreater (floatn x, floatn y)
int isgreater (double x, double y)
longn isgreater (doublen x, doublen y)
    Compare of x > y
int isgreaterequal (float x, float y)
intn isgreaterequal (floatn x, floatn y)
int isgreaterequal (double x, double y)
longn isgreaterequal (doublen x, doublen y)
    Compare of x >= y
int isless (float x, float y)
intn isless (floatn x, floatn y)
int isless (double x, double y)
longn isless (doublen x, doublen y)
    Compare of x < y
int islessequal (float x, float y)
intn islessequal (floatn x, floatn y)
int islessequal (double x, double y)
longn islessequal (doublen x, doublen y)
    Compare of x <= y
int islessgreater (float x, float y)
intn islessgreater (floatn x, floatn y)
int islessgreater (double x, double y)
longn islessgreater (doublen x, doublen y)
    Compare of (x < y) || (x > y)
int isfinite (float)
intn isfinite (floatn)
int isfinite (double)
longn isfinite (doublen)
    Test for finite value
int isinf (float)
intn isinf (floatn)
int isinf (double)
longn isinf (doublen)
    Test for + or - infinity
int isnan (float)
intn isnan (floatn)
int isnan (double)
longn isnan (doublen)
    Test for a NaN
int isnormal (float)
intn isnormal (floatn)
int isnormal (double)
longn isnormal (doublen)
    Test for a normal value
int isordered (float x, float y)
intn isordered (floatn x, floatn y)
int isordered (double x, double y)
longn isordered (doublen x, doublen y)
    Test if arguments are ordered
int isunordered (float x, float y)
intn isunordered (floatn x, floatn y)
int isunordered (double x, double y)
longn isunordered (doublen x, doublen y)
    Test if arguments are unordered
int signbit (float)
intn signbit (floatn)
int signbit (double)
longn signbit (doublen)
    Test for sign bit
int any (Ti x)
    1 if MSB in any component of x is set; else 0
int all (Ti x)
    1 if MSB in all components of x are set; else 0
T bitselect (T a, T b, T c)
    Each bit of result is corresponding bit of a if corresponding bit of c is 0, else corresponding bit of b
T select (T a, T b, Ti c)
T select (T a, T b, Tu c)
    For each component of a vector type, result[i] = if MSB of c[i] is set ? b[i] : a[i]
    For scalar type, result = c ? b : a
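
The selection rules for bitselect, select, any, and all can be sketched in plain C. The `emul_` names are hypothetical and a 4-element array stands in for an int4. The vector forms test only the most significant bit of each mask component, which pairs naturally with the vector relational functions because those return -1 (all bits set) for true in OpenCL C.

```c
#include <stdint.h>

/* bitselect: take each bit from a where c has a 0 bit, from b where
   c has a 1 bit. */
uint32_t emul_bitselect(uint32_t a, uint32_t b, uint32_t c) {
    return (a & ~c) | (b & c);
}

/* select, scalar form: result = c ? b : a. */
int32_t emul_select_scalar(int32_t a, int32_t b, int32_t c) {
    return c ? b : a;
}

/* select, vector form (int4 modelled as an array of 4): only the MSB
   of each c[i] matters, i.e. whether c[i] is negative. */
void emul_select_int4(const int32_t a[4], const int32_t b[4],
                      const int32_t c[4], int32_t out[4]) {
    for (int i = 0; i < 4; ++i)
        out[i] = (c[i] < 0) ? b[i] : a[i];
}

/* any: 1 if the MSB of at least one component is set. */
int emul_any_int4(const int32_t x[4]) {
    for (int i = 0; i < 4; ++i)
        if (x[i] < 0) return 1;
    return 0;
}

/* all: 1 if the MSB of every component is set. */
int emul_all_int4(const int32_t x[4]) {
    for (int i = 0; i < 4; ++i)
        if (x[i] >= 0) return 0;
    return 1;
}
```

Note the difference between the two select forms: a mask component of 1 selects b in the scalar form but a in the vector form, because its MSB is clear.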
Vector Data Load/Store [C 6.13.7]
T is type char, uchar, short, ushort, int, uint, long, ulong, or float, optionally double (if double precision is supported). Tn refers to vector form of type T, where n is 2, 3, 4, 8, or 16. The default rounding mode is round to nearest even.

Load functions support global, local, private, and constant address spaces. Store functions support global, local, and private address spaces. For all, the generic address space may be supported with the __opencl_c_generic_address_space feature macro.

Tn vloadn (size_t offset, const [constant] T *p)
    Read vector data from address (p + (offset * n))
void vstoren (Tn data, size_t offset, T *p)
    Write vector data to address (p + (offset * n))
float vload_half (size_t offset, const [constant] half *p)
    Read a half from address (p + offset)
floatn vload_halfn (size_t offset, const [constant] half *p)
    Read a halfn from address (p + (offset * n))
void vstore_half (float data, size_t offset, half *p)
void vstore_half_R (float data, size_t offset, half *p)
void vstore_half (double data, size_t offset, half *p)
void vstore_half_R (double data, size_t offset, half *p)
    Write a half to address (p + offset)
void vstore_halfn (floatn data, size_t offset, half *p)
void vstore_halfn_R (floatn data, size_t offset, half *p)
void vstore_halfn (doublen data, size_t offset, half *p)
void vstore_halfn_R (doublen data, size_t offset, half *p)
    Write a half vector to address (p + (offset * n))
floatn vloada_halfn (size_t offset, const [constant] half *p)
    Read half vector data from aligned (p + (offset * n)). For half3, read from aligned (p + (offset * 4)).
void vstorea_halfn (floatn data, size_t offset, half *p)
void vstorea_halfn_R (floatn data, size_t offset, half *p)
void vstorea_halfn (doublen data, size_t offset, half *p)
void vstorea_halfn_R (doublen data, size_t offset, half *p)
    Write half vector data to aligned (p + (offset * n)). For half3, write to aligned (p + (offset * 4)).
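
The addressing rules for the half load functions can be sketched in plain C. In this emulation a half is modelled as a raw 16-bit pattern (uint16_t) in the IEEE 754 binary16 layout; the helper names are hypothetical and this is not how an OpenCL implementation is required to perform the conversion.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Widen a binary16 bit pattern to float (always exact). */
float half_bits_to_float(uint16_t h) {
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp = (h >> 10) & 0x1Fu;
    uint32_t man = h & 0x3FFu;
    uint32_t bits;
    if (exp == 0 && man == 0) {
        bits = sign;                              /* signed zero */
    } else if (exp == 0) {
        int e = -1;                               /* subnormal: renormalize */
        do { e++; man <<= 1; } while ((man & 0x400u) == 0);
        bits = sign | ((uint32_t)(112 - e) << 23) | ((man & 0x3FFu) << 13);
    } else if (exp == 31) {
        bits = sign | 0x7F800000u | (man << 13);  /* infinity or NaN */
    } else {
        bits = sign | ((exp + 112u) << 23) | (man << 13);
    }
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Like vload_half: read one half from (p + offset). */
float emul_vload_half(size_t offset, const uint16_t *p) {
    return half_bits_to_float(p[offset]);
}

/* Like vload_halfn: read n halfs starting at (p + (offset * n)). */
void emul_vload_halfn(size_t n, size_t offset, const uint16_t *p, float *out) {
    for (size_t i = 0; i < n; ++i)
        out[i] = half_bits_to_float(p[offset * n + i]);
}

/* Like vloada_half3: three halfs, but the stride is 4 elements. */
void emul_vloada_half3(size_t offset, const uint16_t *p, float out[3]) {
    for (size_t i = 0; i < 3; ++i)
        out[i] = half_bits_to_float(p[offset * 4 + i]);
}
```

The point to notice is the stride: the n-element loads step through memory n elements at a time, except the aligned 3-element form, which steps by 4.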
int vec_step (Tn a)
int vec_step (typename)
    Takes built-in scalar or vector data type argument. Returns 1 for scalar, 4 for 3-component vector, else number of elements in the specified type.
Tn shuffle (Tm x, TUn mask)
Tn shuffle2 (Tm x, Tm y, TUn mask)
    Construct permutation of elements from one or two input vectors, return a vector with same element type as input and length that is the same as the shuffle mask.

● Address Space Qualifier Functions [C 6.13.9]
T refers to any of the built-in data types supported by OpenCL C or a user-defined type. These functions require the __opencl_c_generic_address_space feature macro.

[const] global T * to_global ([const] T *ptr)            global address space
[const] local T * to_local ([const] T *ptr)              local address space
[const] private T * to_private ([const] T *ptr)          private address space
[const] cl_mem_fence_flags get_fence ([const] T *ptr)    Memory fence value: CLK_GLOBAL_MEM_FENCE, CLK_IMAGE_MEM_FENCE, CLK_LOCAL_MEM_FENCE
Atomic Functions [C 6.13.11]
OpenCL C implements a subset of the C11 atomics (see section 7.17 of the C11 specification) and synchronization operations.

In the following tables, A refers to an atomic_* type (not including atomic_flag). C refers to its corresponding non-atomic type. M refers to the type of the other argument for arithmetic operations. For atomic integer types, M is C. For atomic pointer types, M is ptrdiff_t.

The atomic_double type is available if double precision is supported. The default scope is memory_scope_work_group for local atomics and memory_scope_device for global atomics, therefore the non-explicit functions require OpenCL C 2.0 or both the feature macros __opencl_c_atomic_order_seq_cst and __opencl_c_atomic_scope_device.

void atomic_init(volatile A *obj, C value)
    Initializes the atomic object pointed to by obj to the value value.
void atomic_work_item_fence(cl_mem_fence_flags flags, memory_order order, memory_scope scope)
    Effects based on value of order. flags must be CLK_{GLOBAL, LOCAL, IMAGE}_MEM_FENCE or a combination of these.
● void atomic_store(volatile A *object, C desired)
void atomic_store_explicit(volatile A *object, C desired, memory_order order[ , memory_scope scope])
    Atomically replace the value pointed to by object with the value of desired. Memory is affected according to the value of order.
● C atomic_load(volatile A *object)
C atomic_load_explicit(volatile A *object, memory_order order[ , memory_scope scope])
    Atomically returns the value pointed to by object. Memory is affected according to the value of order.
● C atomic_exchange(volatile A *object, C desired)
C atomic_exchange_explicit(volatile A *object, C desired, memory_order order[ , memory_scope scope])
    Atomically replace the value pointed to by object with desired. Memory is affected according to the value of order.
● bool atomic_compare_exchange_strong(volatile A *object, C *expected, C desired)
bool atomic_compare_exchange_strong_explicit(volatile A *object, C *expected, C desired, memory_order success, memory_order failure[ , memory_scope scope])
● bool atomic_compare_exchange_weak(volatile A *object, C *expected, C desired)
bool atomic_compare_exchange_weak_explicit(volatile A *object, C *expected, C desired, memory_order success, memory_order failure[ , memory_scope scope])
    Atomically compares the value pointed to by object for equality with that in expected, and if true, replaces the value pointed to by object with desired, and if false, updates the value in expected with the value pointed to by object. These operations are atomic read-modify-write operations.
● C atomic_fetch_<key>(volatile A *object, M operand)
C atomic_fetch_<key>_explicit(volatile A *object, M operand, memory_order order[ , memory_scope scope])
    Atomically replaces the value pointed to by object with the result of the computation applied to the value pointed to by object and the given operand.
● bool atomic_flag_test_and_set(volatile atomic_flag *object)
bool atomic_flag_test_and_set_explicit(volatile atomic_flag *object, memory_order order[ , memory_scope scope])
    Atomically sets the value pointed to by object to true. Memory is affected according to the value of order. Returns atomically, the value of the object immediately before the effects.
● void atomic_flag_clear(volatile atomic_flag *object)
void atomic_flag_clear_explicit(volatile atomic_flag *object, memory_order order[ , memory_scope scope])
    Atomically sets the value pointed to by object to false. The order argument shall not be memory_order_acquire nor memory_order_acq_rel. Memory is affected according to the value of order.

Values for key for atomic_fetch and modify functions
key    op     computation
add    +      addition
sub    -      subtraction
or     |      bitwise inclusive or
xor    ^      bitwise exclusive or
and    &      bitwise and
min    min    compute min
max    max    compute max

Atomic Types and Enum Constants

Parameter type: memory_order
Values                    Optional requirements
memory_order_relaxed
● memory_order_acquire    With any built-in atomic function except atomic_work_item_fence, requires OpenCL C 2.0 or __opencl_c_atomic_order_acq_rel
● memory_order_release    (same requirement as memory_order_acquire)
memory_order_acq_rel      (same requirement as memory_order_acquire)
● memory_order_seq_cst    Requires OpenCL C 2.0 or __opencl_c_atomic_order_seq_cst

Parameter type: memory_scope
Values                            Optional requirements
memory_scope_work_group
memory_scope_work_item            Only used with atomic_work_item_fence with flags: CLK_IMAGE_MEM_FENCE
● memory_scope_sub_group          Requires __opencl_c_subgroups
● memory_scope_device             Requires OpenCL C 2.0 or __opencl_c_atomic_scope_device
● memory_scope_all_svm_devices    Requires OpenCL C 2.0 or __opencl_c_atomic_scope_all_svm_devices
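
Because these functions mirror the C11 atomics, their behaviour can be tried out on the host with `<stdatomic.h>`. The sketch below is ordinary C11, not OpenCL C: host C11 has no memory_scope argument and no atomic_fetch_min or atomic_fetch_max, so `emul_fetch_max` (a hypothetical name) rebuilds a fetch-max from a compare-exchange loop, and the two lock helpers show the test-and-set and clear pair on an atomic_flag.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Spin until the flag was previously clear, then own it. */
void emul_lock_acquire(atomic_flag *f) {
    while (atomic_flag_test_and_set_explicit(f, memory_order_acquire)) {
        /* busy-wait */
    }
}

/* Release ordering is allowed for clear; acquire and acq_rel are not. */
void emul_lock_release(atomic_flag *f) {
    atomic_flag_clear_explicit(f, memory_order_release);
}

/* Fetch-max built from compare-exchange: returns the value observed
   before the update, like the atomic_fetch_<key> family. */
int emul_fetch_max(atomic_int *obj, int operand) {
    int old = atomic_load_explicit(obj, memory_order_relaxed);
    /* On failure the CAS refreshes old with the current value, so the
       loop re-evaluates whether an update is still needed. */
    while (old < operand &&
           !atomic_compare_exchange_weak_explicit(obj, &old, operand,
                                                  memory_order_acq_rel,
                                                  memory_order_relaxed)) {
    }
    return old;
}
```

The weak compare-exchange may fail spuriously, which is why it sits inside a loop; the strong form would also work here at a possibly higher cost on some architectures.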
Atomic macros
#define ATOMIC_VAR_INIT(C value)
    Expands to a token sequence to initialize an atomic object of a type that is initialization-compatible with value.
#define ATOMIC_FLAG_INIT
    Global atomic objects declared with the atomic_flag type can be initialized to a clear state with the ATOMIC_FLAG_INIT macro, for example:
    global atomic_flag guard = ATOMIC_FLAG_INIT;

Atomic integer and floating-point types
† indicates types supported by a limited subset of atomic operations.
‡ indicates size depends on whether implemented on 64-bit or 32-bit architecture.
§ indicates types supported only with these extensions enabled: cl_khr_int64_base_atomics and cl_khr_int64_extended_atomics
atomic_int        atomic_double †§    atomic_ptrdiff_t ‡§
atomic_uint       atomic_long §       atomic_intptr_t ‡§
atomic_flag       atomic_ulong §      atomic_uintptr_t ‡§
atomic_float †    atomic_size_t ‡§

● Work-group Functions [C 6.13.15]
T is type int, uint, long, ulong, or float, optionally double (if double precision is supported). The sub_group_* work-group functions require the feature macro __opencl_c_subgroups. All other work-group functions require OpenCL C 2.0 or __opencl_c_work_group_collective_functions.

Returns a non-zero value if predicate evaluates to non-zero for all or any work-items in the work-group.
int work_group_all (int predicate)
int work_group_any (int predicate)
int sub_group_all (int predicate)
int sub_group_any (int predicate)

Broadcast the value of a to all work-items in the work-group. local_id must be the same value for all work-items in the work-group.
T work_group_broadcast (T a, size_t local_id)
T work_group_broadcast (T a, size_t local_id_x, size_t local_id_y)
T work_group_broadcast (T a, size_t local_id_x, size_t local_id_y, size_t local_id_z)
T sub_group_broadcast (T x, size_t local_id)

Return result of reduction operation specified by <op> for all values of x specified by work-items in work-group. <op> may be min, max, or add.
T work_group_reduce_<op> (T x)
T sub_group_reduce_<op> (T x)

Do an exclusive or inclusive scan operation specified by <op> of all values specified by work-items in the work-group. The scan results are returned for each work-item. <op> may be min, max, or add.
T work_group_scan_exclusive_<op> (T x)
T work_group_scan_inclusive_<op> (T x)
T sub_group_scan_exclusive_<op> (T x)
T sub_group_scan_inclusive_<op> (T x)
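
The difference between the reduce, inclusive scan, and exclusive scan collectives is easiest to see on a small array. The plain-C sketch below emulates the add variants sequentially, with array index i standing in for the work-item's position in the scan order; the `emul_` names are hypothetical and nothing here reflects how a device actually schedules the operation.

```c
#include <stddef.h>

/* Like work_group_reduce_add: every work-item receives the same total. */
int emul_reduce_add(const int *x, size_t n) {
    int total = 0;
    for (size_t i = 0; i < n; ++i)
        total += x[i];
    return total;
}

/* Like work_group_scan_inclusive_add: item i receives x[0] + ... + x[i]. */
void emul_scan_inclusive_add(const int *x, int *out, size_t n) {
    int running = 0;
    for (size_t i = 0; i < n; ++i) {
        running += x[i];
        out[i] = running;
    }
}

/* Like work_group_scan_exclusive_add: item i receives x[0] + ... + x[i-1];
   item 0 receives the identity of the operation (0 for add). */
void emul_scan_exclusive_add(const int *x, int *out, size_t n) {
    int running = 0;
    for (size_t i = 0; i < n; ++i) {
        out[i] = running;
        running += x[i];
    }
}
```

For the input {3, 1, 4, 1, 5} the reduction is 14, the inclusive scan is {3, 4, 8, 9, 14}, and the exclusive scan is {0, 3, 4, 8, 9}.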
reserve_id_t work_group_reserve_read_pipe (pipe T p, uint num_packets)
reserve_id_t work_group_reserve_write_pipe (pipe T p, uint num_packets)
reserve_id_t sub_group_reserve_read_pipe (pipe T p, uint num_packets)
reserve_id_t sub_group_reserve_write_pipe (pipe T p, uint num_packets)
    Reserve num_packets entries for reading from or writing to p. Returns a valid reservation ID if the reservation is successful.

printf output synchronization
When the event associated with a particular kernel invocation completes, the output of applicable printf calls is flushed to the implementation-defined output stream.

printf format string
The format string follows C99 conventions and supports an optional vector specifier:
%[flags][width][.precision][vector][length] conversion

Examples:
The following examples show the use of the vector specifier in the printf format string.