Meaning of the xsize and ysize fields in the pitched pointer output by cudaMalloc3D

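The documentation of cudaMalloc3D says:

The returned cudaPitchedPtr contains additional fields xsize and ysize, the logical width and height of the allocation, which are equivalent to the width and height extent parameters provided by the programmer during allocation.

However, if I run the following minimal example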

#include <stdio.h>
#include <stdlib.h>
#include <cuda.h>
#include <cuda_runtime.h>
#include <device_launch_parameters.h>

#define Nrows 64
#define Ncols 64
#define Nslices 16

/********************/
/* CUDA ERROR CHECK */
/********************/
// --- Credit to http://stackoverflow.com/questions/14038589/what-is-the-canonical-way-to-check-for-errors-using-the-cuda-runtime-api
void gpuAssert(cudaError_t code, const char *file, int line, bool abort = true)
{
    if (code != cudaSuccess)
    {
        fprintf(stderr, "GPUassert: %s %s %d\n", cudaGetErrorString(code), file, line);
        if (abort) { exit(code); }
    }
}

// --- Macro, so that __FILE__ and __LINE__ refer to the call site
#define gpuErrchk(ans) { gpuAssert((ans), __FILE__, __LINE__); }

/********/
/* MAIN */
/********/
int main() {

    // --- 3D pitched allocation
    cudaExtent extent = make_cudaExtent(Ncols * sizeof(float), Nrows, Nslices);
    cudaPitchedPtr devPitchedPtr;
    gpuErrchk(cudaMalloc3D(&devPitchedPtr, extent));

    // --- The fields are of type size_t; the second value printed is devPitchedPtr.pitch
    printf("xsize = %zu; xsize in bytes = %zu; ysize = %zu\n", devPitchedPtr.xsize, devPitchedPtr.pitch, devPitchedPtr.ysize);

    return 0;
}

I get:

xsize = 256; xsize in bytes = 512; ysize = 64

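So ysize is indeed equal to Nrows, but xsize differs from both Ncols and xsize in bytes / sizeof(float).

Could you help me understand the meaning of the xsize and ysize fields in the cudaPitchedPtr returned by cudaMalloc3D?

Thank you very much for your help.

My system: Windows 10, CUDA 8.0, GT 920M, cc 3.5.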

2017-05-08 JackOLantern

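xsize is the width you requested, in bytes. pitch is the actual pitch (the padded row width) in bytes. ysize is the number of rows you requested. – talonmies

Note that the documentation says "Allocates *at least* width * height * depth bytes of linear memory" and "The function *may pad* the allocation...". – Shadow

@talonmies Thank you very much for your prompt comment. – JackOLantern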

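xsize = Ncols * sizeof(float)

xsize is the logical width of the allocation in bytes, as opposed to the pitched width. In your example:

logical width = 256 bytes

pitched width = 512 bytes

xsize is equal to the width parameter you provided during allocation (i.e. the first argument you passed to make_cudaExtent).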
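
To make the distinction concrete, here is a minimal host-only sketch of the arithmetic implied by the three fields. It does not call the CUDA runtime: PitchedLayout is a hypothetical struct that only mirrors the pitch, xsize and ysize members of cudaPitchedPtr, and the helper function names are made up for illustration. The offset formula follows the usual pitched addressing, where one slice spans pitch * ysize bytes.

```cpp
#include <cstddef>

// Hypothetical host-side mirror of the size fields of cudaPitchedPtr.
struct PitchedLayout {
    std::size_t pitch;  // bytes per row as actually allocated (padding included)
    std::size_t xsize;  // bytes per row as requested (logical width)
    std::size_t ysize;  // rows per slice as requested (logical height)
};

// Number of elements in one logical row: xsize is in bytes, so divide by the element size.
std::size_t logicalElementsPerRow(const PitchedLayout &p, std::size_t elemSize) {
    return p.xsize / elemSize;
}

// Bytes of padding appended to each row by the allocator.
std::size_t paddingBytesPerRow(const PitchedLayout &p) {
    return p.pitch - p.xsize;
}

// Byte offset of element (col, row, slice) from the base pointer.
// Rows are stepped by pitch (not xsize); slices by pitch * ysize.
std::size_t elementByteOffset(const PitchedLayout &p, std::size_t col,
                              std::size_t row, std::size_t slice,
                              std::size_t elemSize) {
    const std::size_t slicePitch = p.pitch * p.ysize;
    return slice * slicePitch + row * p.pitch + col * elemSize;
}

// Total bytes spanned by the padded allocation for a given depth.
std::size_t paddedBytes(const PitchedLayout &p, std::size_t depth) {
    return p.pitch * p.ysize * depth;
}
```

With the values printed in the question (pitch = 512, xsize = 256, ysize = 64), logicalElementsPerRow(layout, sizeof(float)) gives 64, which is Ncols, and paddingBytesPerRow(layout) gives 256. The pitch of 512 is specific to that run; the runtime chooses it, so it should always be read from the returned cudaPitchedPtr rather than assumed.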

2017-05-08 19:20:31

Thank you, Robert, for your prompt answer. It is now clear to me that 'xsize' is the number of columns "measured" in bytes. – JackOLantern
