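Calling a device function from a global function in PyCUDA

I am new to PyCUDA. I want to call a function declared with `__device__` from a function declared with `__global__`. How can I do this in PyCUDA?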
import pycuda.driver as cuda
from pycuda.compiler import SourceModule
import numpy as n
import pycuda.autoinit
import pycuda.gpuarray as gp
d=gp.zeros(shape=(128,128),dtype=n.int32)
h=n.zeros(shape=(128,128),dtype=n.int32)
mod=SourceModule("""
__global__ void matAdd(int *a)
{
int px=blockIdx.x*blockDim.x+threadIdx.x;
int py=blockIdx.y*blockDim.y+threadIdx.y;
a[px*128+py]+=1;
matMul(px);
}
__device__ void matMul(int px)
{
px=5;
}
""")
m=mod.get_function("matAdd")
m(d,block=(32,32,1),grid=(4,4))
d.get(h)
The code above gives me the following error:
7-linux-i686.egg/pycuda/../include/pycuda kernel.cu]
[stderr:
kernel.cu(8): error: identifier "matMul" is undefined
kernel.cu(12): warning: parameter "px" was set but never used
1 error detected in the compilation of "/tmp/tmpxft_00002286_00000000-6_kernel.cpp1.ii".
]
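I'm not sure I understand the question. In PyCUDA you still write the device code in CUDA C. It is no different from writing the host code in C++ instead of Python. So what are you asking? – talonmies 2012-08-10 13:29:28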
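The `identifier "matMul" is undefined` error is what a C compiler reports when a function is called before any declaration of it has been seen, and the kernel string passed to `SourceModule` is compiled as ordinary CUDA C. Below is a minimal sketch of one way to reorder the source so that the `__device__` function comes first. The names `KERNEL_SOURCE`, `addOne`, and `run_mat_add` are hypothetical and chosen only for illustration. The device function here returns a value, on the assumption that the intent was for the call to have a visible effect (the original `matMul` only modifies its by-value parameter, which is what the compiler warning points at).

```python
# CUDA C source for the module. The __device__ function is defined
# before the __global__ kernel, so the compiler has already seen it
# when it reaches the call site.
KERNEL_SOURCE = """
__device__ int addOne(int value)
{
    // Hypothetical helper: returns its argument plus one.
    return value + 1;
}

__global__ void matAdd(int *a)
{
    int px = blockIdx.x * blockDim.x + threadIdx.x;
    int py = blockIdx.y * blockDim.y + threadIdx.y;
    // Call the device function from the kernel.
    a[px * 128 + py] = addOne(a[px * 128 + py]);
}
"""


def run_mat_add():
    """Compile KERNEL_SOURCE and run matAdd on a 128x128 int32 array.

    Requires a CUDA-capable GPU and PyCUDA. The imports are kept inside
    the function so the source string can be inspected without a GPU.
    """
    import numpy as np
    import pycuda.autoinit  # noqa: F401  (creates the CUDA context)
    import pycuda.gpuarray as gpuarray
    from pycuda.compiler import SourceModule

    d = gpuarray.zeros((128, 128), dtype=np.int32)
    mod = SourceModule(KERNEL_SOURCE)
    mat_add = mod.get_function("matAdd")
    # Same launch configuration as in the question:
    # 4x4 blocks of 32x32 threads cover the 128x128 array.
    mat_add(d, block=(32, 32, 1), grid=(4, 4))
    return d.get()  # copy the result back to the host
```

Calling `run_mat_add()` on a machine with a working PyCUDA installation should compile without the undefined-identifier error, and each element of the returned array would be expected to have been incremented once. If you would rather keep the original order of the two functions, placing a prototype such as `__device__ void matMul(int px);` above the kernel should work as well.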