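CUDA: invalid device pointer error when reallocating memory
In the code below, I simply call the function foo twice from main. The function just allocates device memory and then increments the pointer. Then it exits and returns to main.
The first time foo is called, the memory is allocated correctly. But as you can see in the output, when I call foo again, the CUDA memory allocation fails with the error "invalid device pointer".
I tried calling cudaThreadSynchronize() between the two calls to foo, but it did not help. Why does the memory allocation fail?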
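Actually, the error is caused by
matrixd += 3;
because if I remove this increment, the error disappears.
But why does this happen even though I call cudaFree()?
Please help me understand this.
My output is here: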
Calling foo for the first time
Allocation of matrixd passed:
I came back to main safely :-)
I am going back to foo again :-)
Allocation of matrixd failed, the reason is: invalid device pointer
My main() is here:
#include <stdio.h>
#include <cstdlib> // malloc(), free()
#include <iostream> // cout, stream
#include <math.h>
#include <ctime> // time(), clock()
#include <bitset>
bool foo();
/***************************************
Main method.
****************************************/
int main()
{
    // Perform one warm-up pass and validate
    std::cout << "Calling foo for the first time" << std::endl;
    foo();
    std::cout << "I came back to main safely :-) " << std::endl;
    std::cout << "I am going back to foo again :-) " << std::endl;
    foo();
    getchar();
    return 0;
}
The definition of foo() is in this file:
#include <cuda.h>
#include <cuda_runtime_api.h>
#include <device_launch_parameters.h>
#include <iostream>
bool foo()
{
    // Error return value
    cudaError_t status;
    // Number of bytes in the matrix.
    int bytes = 9 * sizeof(float);
    // Pointers to the device arrays
    float *matrixd = NULL;
    // Allocate memory on the device to store matrix
    cudaMalloc((void**) &matrixd, bytes);
    status = cudaGetLastError(); // To check the error
    if (status != cudaSuccess) {
        std::cout << "Allocation of matrixd failed, the reason is: " << cudaGetErrorString(status) << std::endl;
        cudaFree(matrixd); // Free call for memory
        return false;
    }
    std::cout << "Allocation of matrixd passed: " << std::endl;
    ////// Increment address
    for (int i = 0; i < 3; i++) {
        matrixd += 3;
    }
    // Free device memory
    cudaFree(matrixd);
    return true;
}
Update
I added better error checking. Also, I now increment the device pointer only once. This time I get the following output:
Calling foo for the first time
Allocation of matrixd passed:
Increamented the pointer and going to free cuda memory:
GPUassert: invalid device pointer C:/Users/user/Desktop/Gauss/Gauss/GaussianEleminationGPU.cu 44
Line 44 is the cudaFree() call. Why does it still fail?
#define gpuErrchk(ans) { gpuAssert((ans), __FILE__, __LINE__); }
inline void gpuAssert(cudaError_t code, const char *file, int line, bool abort=true)
{
    if (code != cudaSuccess)
    {
        fprintf(stderr, "GPUassert: %s %s %d\n", cudaGetErrorString(code), file, line);
        if (abort) exit(code);
    }
}
// GPU function for the direct Gauss-Jordan method.
bool foo()
{
    // Error return value
    cudaError_t status;
    // Number of bytes in the matrix.
    int bytes = 9 * sizeof(float);
    // Pointers to the device arrays
    float *matrixd = NULL;
    // Allocate memory on the device to store each matrix
    gpuErrchk(cudaMalloc((void**) &matrixd, bytes));
    //cudaMemset(outputMatrixd, 0, bytes);
    std::cout << "Allocation of matrixd passed: " << std::endl;
    ////// Increment address
    matrixd += 1;
    std::cout << "Increamented the pointer and going to free cuda memory: " << std::endl;
    // Free device memory
    gpuErrchk(cudaFree(matrixd));
    return true;
}
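What happens if you check the return status of the `cudaFree` call? – talonmies
@talonmies You are right. I just checked by calling cudaGetLastError() right after cudaFree, and yes, it shows that the call fails. But again, why? – user3891236
Exactly. So your problem is basically caused by incomplete error checking. You can see how to do it properly [here](http://stackoverflow.com/q/14038589/681865). The memory allocation is not failing. – talonmies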
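To make the point from the comments concrete: per the CUDA runtime documentation, cudaFree() expects the exact pointer that cudaMalloc() returned, so passing an incremented pointer is rejected. In the first version of foo() that failure went unchecked, and cudaGetLastError() reported it only after the next (successful) cudaMalloc(), which made the allocation look like the culprit. The following is a minimal sketch of one way to avoid this, not the asker's final code: it keeps the base pointer untouched and advances a copy instead. The names `CHECK`, `fooFixed` and `cursor` are hypothetical and introduced only for this illustration.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper for this sketch: report a failed runtime call and exit.
#define CHECK(call)                                                     \
    do {                                                                \
        cudaError_t err_ = (call);                                      \
        if (err_ != cudaSuccess) {                                      \
            std::fprintf(stderr, "%s failed: %s (%s:%d)\n", #call,      \
                         cudaGetErrorString(err_), __FILE__, __LINE__); \
            std::exit(EXIT_FAILURE);                                    \
        }                                                               \
    } while (0)

// Allocates 9 floats, steps through them with a separate cursor,
// and frees the allocation through the original base pointer.
bool fooFixed()
{
    const size_t bytes = 9 * sizeof(float);

    // Base pointer: this is the value cudaFree() must receive later.
    float *matrixd = NULL;
    CHECK(cudaMalloc((void **)&matrixd, bytes));

    // Advance a copy of the pointer; the base pointer stays unchanged.
    float *cursor = matrixd;
    for (int i = 0; i < 3; i++) {
        cursor += 3;
    }
    (void)cursor;  // not used further in this sketch

    // Free with the pointer returned by cudaMalloc(), and check the result.
    CHECK(cudaFree(matrixd));
    return true;
}

int main()
{
    // Call the function twice, as in the question.
    bool first = fooFixed();
    bool second = fooFixed();
    std::printf("first=%d second=%d\n", first, second);
    return (first && second) ? 0 : 1;
}
```

Checking the return value of every runtime call, including cudaFree(), reports the error at the call that actually caused it rather than at a later, unrelated call.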