Reason for CL_INVALID_WORK_GROUP_SIZE

When I change the work-group size from 16 to 32 or something larger, I get a CL_INVALID_WORK_GROUP_SIZE error. matrix_size is 64.
localWorkSize[0] = groupsize;
localWorkSize[1] = localWorkSize[0];
globalWorkSize[0] = matrix_size;
globalWorkSize[1] = globalWorkSize[0];
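First I checked the documentation for clEnqueueNDRangeKernel, which lists four (5) different reasons for CL_INVALID_WORK_GROUP_SIZE, but I don't think any of them apply. Please check my conclusions. (I hope you don't mind my Q&A style.)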
Q: CL_INVALID_WORK_GROUP_SIZE if local_work_size is specified and number of work-items specified by global_work_size is not evenly divisable by size of work-group given by local_work_size
A: 64 % 32 = 0
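The divisibility condition quoted above applies to each dimension separately. A minimal sketch of that check in plain C, with no OpenCL calls; the helper name `sizes_divide_evenly` is hypothetical and only used for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if every global size is a whole
 * multiple of the matching local size, 0 otherwise. */
int sizes_divide_evenly(const size_t *global, const size_t *local,
                        size_t work_dim)
{
    for (size_t d = 0; d < work_dim; ++d) {
        assert(local[d] != 0);   /* guard the modulo below */
        if (global[d] % local[d] != 0)
            return 0;
    }
    return 1;
}
```

With globalWorkSize = {64, 64}, a local size of {32, 32} passes this check, while something like {48, 48} would not.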
Q: or does not match the work-group size specified for kernel using the __attribute__((reqd_work_group_size(X, Y, Z))) qualifier in program source.
A: As far as I understand the documentation, I am not using __attribute__.
Q: CL_INVALID_WORK_GROUP_SIZE if local_work_size is specified and the total number of work-items in the work-group computed as local_work_size[0] *... local_work_size[work_dim - 1] is greater than the value specified by CL_DEVICE_MAX_WORK_GROUP_SIZE in the table of OpenCL Device Queries for clGetDeviceInfo.
A: I queried clGetDeviceInfo, and CL_DEVICE_MAX_WORK_GROUP_SIZE is 512, 512, 64
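The condition quoted above compares the product of all local sizes with a single number. In the OpenCL API, CL_DEVICE_MAX_WORK_GROUP_SIZE is returned as one size_t, while a triple of values is what CL_DEVICE_MAX_WORK_ITEM_SIZES returns. A small sketch of the product check in plain C follows; the function names are hypothetical, and the limit of 512 used in the note below is only an assumed value for illustration, not something confirmed for this device:

```c
#include <assert.h>
#include <stddef.h>

/* Total work-items in one work-group:
 * local[0] * ... * local[work_dim - 1]. */
size_t work_group_items(const size_t *local, size_t work_dim)
{
    size_t total = 1;
    for (size_t d = 0; d < work_dim; ++d)
        total *= local[d];
    return total;
}

/* Hypothetical helper: returns 1 if the work-group fits within
 * max_group_size (the device's single-valued limit), 0 otherwise. */
int fits_max_work_group_size(const size_t *local, size_t work_dim,
                             size_t max_group_size)
{
    assert(work_dim >= 1);
    return work_group_items(local, work_dim) <= max_group_size;
}
```

For a two-dimensional launch, a local size of {16, 16} gives 256 work-items per group and {32, 32} gives 1024. If the device limit were 512, the first would fit and the second would not.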
Q: CL_INVALID_WORK_GROUP_SIZE if local_work_size is NULL and the __attribute__((reqd_work_group_size(X, Y, Z))) qualifier is used to declare the work-group size for kernel in the program source.
A: local_work_size is not NULL.
Q: CL_INVALID_WORK_ITEM_SIZE if the number of work-items specified in any of local_work_size[0], ... local_work_size[work_dim - 1] is greater than the corresponding values specified by CL_DEVICE_MAX_WORK_ITEM_SIZES[0], .... CL_DEVICE_MAX_WORK_ITEM_SIZES[work_dim - 1].
A:
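This last condition is a per-dimension comparison rather than a product. A minimal sketch in plain C is shown below. The helper name is hypothetical, and treating the triple 512, 512, 64 mentioned earlier in the post as per-dimension limits is an assumption made only for the example:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if each local size is within the
 * matching per-dimension limit, 0 otherwise. */
int fits_max_work_item_sizes(const size_t *local,
                             const size_t *max_item_sizes,
                             size_t work_dim)
{
    assert(work_dim >= 1);
    for (size_t d = 0; d < work_dim; ++d) {
        if (local[d] > max_item_sizes[d])
            return 0;
    }
    return 1;
}
```

Under that assumption, a local size of {32, 32} stays within {512, 512, 64} in both dimensions, so this particular check would not be the one that fails.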
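I hope I haven't overlooked anything. Please tell me if you know what could cause CL_INVALID_WORK_GROUP_SIZE, or if you find an error in my conclusions.
Thank you for taking the time to read all this :)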
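This question is old, but I just wanted to thank you for this very clear explanation, because it led me straight to the solution to my problem! – 2012-02-22 14:28:54
@BigBourin. You're very welcome. If you haven't done so already, please also +1 Quantumboredom's answer. – Framester 2012-02-22 14:38:46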