CUDA - using cudaMallocPitch and cudaMemcpy2D, errors: InvalidValue, InvalidPitchValue

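OK, so I'm trying to get a 2D array working in CUDA, but it's turning into a pain. The errors are the ones in the title, and they occur at cudaMemcpy2D. I suspect the problem is obvious to a trained eye. Thanks in advance for any help; I've gotten ahead of my class, which is currently learning pointers.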

#include <cuda_runtime.h> 
#include <iostream> 
#pragma comment (lib, "cudart") 

/* Program purpose: pass a 100 x 100 matrix and add another 100 x 100 matrix to it */ 

float matrix1_host[100][100]; 
float matrix2_host[100][100]; 

float* matrix1_device; 
float* matrix2_device; 
size_t pitch; 
cudaError_t err; 

__global__ void addMatrix(float* matrix1_device,float* matrix2_device, size_t pitch){ 
    // How this works 
    // first we start to cycle through the rows by using the thread's ID 
    // then we calculate an address from the address of a point in the row, by adding the pitch (size of each row) and * it by 
    // the amount of rows we've already completed, then we can use that address of somewhere at a start of a row to get the colums 
    // in the row with a normal array grab. 

    int r = threadIdx.x; 

     float* rowofMat1 = (float*)((char*)matrix1_device + r * pitch); 
     float* rowofMat2 = (float*)((char*)matrix2_device + r * pitch); 
     for (int c = 0; c < 100; ++c) { 
      rowofMat1[c] += rowofMat2[c]; 
     } 

} 

void initCuda(){ 
    err = cudaMallocPitch((void**)matrix1_device, &pitch, 100 * sizeof(float), 100); 
    err = cudaMallocPitch((void**)matrix2_device, &pitch, 100 * sizeof(float), 100); 
    //err = cudaMemcpy(matrix1_device, matrix1_host, 100*100*sizeof(float), cudaMemcpyHostToDevice); 
    //err = cudaMemcpy(matrix2_device, matrix2_host, 100*100*sizeof(float), cudaMemcpyHostToDevice); 
    err = cudaMemcpy2D(matrix1_device, 100*sizeof(float), matrix1_host, pitch, 100*sizeof(float), 100, cudaMemcpyHostToDevice); 
    err = cudaMemcpy2D(matrix2_device, 100*sizeof(float), matrix2_host, pitch, 100*sizeof(float), 100, cudaMemcpyHostToDevice); 
} 

void populateArrays(){ 
    for(int x = 0; x < 100; x++){ 
     for(int y = 0; y < 100; y++){ 
      matrix1_host[x][y] = (float) x + y; 
      matrix2_host[y][x] = (float) x + y; 
     } 
    } 
} 

void runCuda(){ 
    dim3 dimBlock (100); 
    dim3 dimGrid (1); 
    addMatrix<<<dimGrid, dimBlock>>>(matrix1_device, matrix2_device, 100*sizeof(float)); 
    //err = cudaMemcpy(matrix1_host, matrix1_device, 100*100*sizeof(float), cudaMemcpyDeviceToHost); 
    err = cudaMemcpy2D(matrix1_host, 100*sizeof(float), matrix1_device, pitch, 100*sizeof(float),100, cudaMemcpyDeviceToHost); 
    //cudaMemcpy(matrix1_host, matrix1_device, 100*100*sizeof(float), cudaMemcpyDeviceToHost); 
} 

void cleanCuda(){ 
    err = cudaFree(matrix1_device); 
    err = cudaFree(matrix2_device); 

    err = cudaDeviceReset(); 
} 


int main(){ 
    populateArrays(); 
    initCuda(); 
    runCuda(); 
    cleanCuda(); 
    std::cout << cudaGetErrorString(cudaGetLastError()); 
    system("pause"); 
    return 0; 
}

2013-03-15 Joshua Waring

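First, in general you should have a separate pitch variable for matrix1 and for matrix2. In this case the two cudaMallocPitch calls will return the same pitch value, but in the general case they may not.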
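
A minimal sketch of that first point, assuming the 100 x 100 float matrices from the question. The helper name allocTwoPitched and the variables pitch1 and pitch2 are illustrative and do not appear in the original code.

```cuda
#include <cuda_runtime.h>
#include <cstddef>

// Illustrative helper (not from the question): allocates two 100 x 100 float
// matrices with cudaMallocPitch and reports each allocation's own pitch.
cudaError_t allocTwoPitched(float** m1, size_t* pitch1,
                            float** m2, size_t* pitch2) {
    const size_t widthBytes = 100 * sizeof(float);  // bytes of real data per row
    const size_t height = 100;                      // number of rows

    // Each call writes the device pointer through its first argument and the
    // row stride in bytes through its second argument.
    cudaError_t err = cudaMallocPitch((void**)m1, pitch1, widthBytes, height);
    if (err != cudaSuccess) return err;

    err = cudaMallocPitch((void**)m2, pitch2, widthBytes, height);
    if (err != cudaSuccess) {
        // Do not leak the first allocation if the second one fails.
        cudaFree(*m1);
        *m1 = nullptr;
    }
    return err;
}
```

A caller would write something like allocTwoPitched(&matrix1_device, &pitch1, &matrix2_device, &pitch2) and from then on always pair pitch1 with matrix1_device and pitch2 with matrix2_device.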

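In your cudaMemcpy2D lines, the second parameter to the call is the destination pitch. This is just the pitch value that was returned when you called cudaMallocPitch for this particular destination matrix (i.e. the first parameter).

The fourth parameter is the source pitch. Since the source was allocated with an ordinary host allocation, it has no pitch other than its row width in bytes.

So you have your second and fourth parameters swapped.

So instead of this: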

err = cudaMemcpy2D(matrix1_device, 100*sizeof(float), matrix1_host, pitch, 100*sizeof(float), 100, cudaMemcpyHostToDevice);

try this:

err = cudaMemcpy2D(matrix1_device, pitch, matrix1_host, 100*sizeof(float), 100*sizeof(float), 100, cudaMemcpyHostToDevice);

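The same applies to the second call to cudaMemcpy2D. The third call is actually OK: it copies in the opposite direction, so the source and destination matrices are swapped and they line up correctly with your pitch parameters.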
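
To make the role of the two pitch parameters concrete, here is a host-only model of a pitched 2D copy. It is a sketch for illustration, not how the CUDA runtime implements cudaMemcpy2D; the function name copy2DModel is made up for this example. The width check mirrors the documented requirement that the row width must not exceed either pitch.

```cuda
#include <cstddef>
#include <cstring>

// Host-only model of a pitched 2D copy (NOT the CUDA implementation).
// Copies `height` rows of `widthBytes` bytes each. Consecutive rows start
// `spitch` bytes apart in `src` and `dpitch` bytes apart in `dst`.
bool copy2DModel(void* dst, size_t dpitch, const void* src, size_t spitch,
                 size_t widthBytes, size_t height) {
    // A row of data has to fit inside each row stride.
    if (widthBytes > dpitch || widthBytes > spitch) return false;

    char* d = static_cast<char*>(dst);
    const char* s = static_cast<const char*>(src);
    for (size_t row = 0; row < height; ++row) {
        // Only the payload is copied; padding bytes in dst stay untouched.
        std::memcpy(d + row * dpitch, s + row * spitch, widthBytes);
    }
    return true;
}
```

For the host-to-device copies in the question, dpitch corresponds to the value returned by cudaMallocPitch and spitch to 100*sizeof(float), because the host array rows are tightly packed. For the device-to-host copy the roles are reversed.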

2013-03-15 04:53:42

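OK, so I changed a few things. I had thought the first pitch was the pitch of the host array, which confused me a lot. I'm still getting error 11 InvalidValue, though – 2013-03-15 05:03:39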

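Well, you're doing rather sloppy error checking, so you don't really know which line the error is coming from. Is this how they teach you to do error checking in class? You should [check every CUDA return value](http://stackoverflow.com/questions/14038589/what-is-the-canonical-way-to-check-for-errors-using-the-cuda-runtime-api), especially when you're having trouble. Anyway, I missed that your parameters to 'cudaMallocPitch' are also incorrect; you need an ampersand to pass the address of your pointer: 'err = cudaMallocPitch((void **)&matrix1_device, ...' – 2013-03-15 05:37:57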

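Actually, I stepped through it line by line in the debugger, checking the error values, and the error came right from the first cudaMemcpy2D line. But apart from that, thank you, that was the problem; I'd been stuck on this for a while. – 2013-03-15 05:41:54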
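
The two pieces of advice from the comments (check every return value, and pass the address of the device pointer) could be combined roughly as follows. This is a sketch under assumptions: CUDA_CHECK and uploadMatrix are illustrative names, not part of the CUDA toolkit or the original code, and the matrix size is fixed at 100 x 100 floats as in the question.

```cuda
#include <cuda_runtime.h>
#include <cstdio>
#include <cstdlib>

// CUDA_CHECK is an illustrative name, not part of the CUDA toolkit.
// It reports the file and line of the first failing runtime call and exits.
#define CUDA_CHECK(call)                                                  \
    do {                                                                  \
        cudaError_t err_ = (call);                                        \
        if (err_ != cudaSuccess) {                                        \
            std::fprintf(stderr, "CUDA error at %s:%d: %s\n",             \
                         __FILE__, __LINE__, cudaGetErrorString(err_));   \
            std::exit(EXIT_FAILURE);                                      \
        }                                                                 \
    } while (0)

// Hypothetical helper: allocates a pitched 100 x 100 device matrix and
// uploads the tightly packed host data into it.
void uploadMatrix(const float* host, float** device, size_t* pitch) {
    const size_t widthBytes = 100 * sizeof(float);
    const size_t height = 100;

    // `device` is the address of the caller's pointer, so cudaMallocPitch
    // can store the new allocation in it.
    CUDA_CHECK(cudaMallocPitch((void**)device, pitch, widthBytes, height));

    // Destination pitch comes from cudaMallocPitch; the source pitch is the
    // plain row width because host rows have no padding.
    CUDA_CHECK(cudaMemcpy2D(*device, *pitch, host, widthBytes,
                            widthBytes, height, cudaMemcpyHostToDevice));
}
```

With the question's globals, the call would look like uploadMatrix(&matrix1_host[0][0], &matrix1_device, &pitch1). Any failure then names the exact line instead of surfacing later through cudaGetLastError.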
