CUDA pinned memory flushing from the device
CUDA 5, compute capability 3.5, VS 2012, 64-bit Windows Server 2012.
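There is no shared memory access between threads; every thread is standalone.
I am using pinned memory with zero-copy. On the host, I can read the pinned memory
the device has written only after I issue cudaDeviceSynchronize on the host.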
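I want to be able to:
- Flush to the pinned memory as soon as the device has updated it.
- Not block the device thread (perhaps by copying asynchronously).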
I tried calling __threadfence_system and __threadfence after each device write, but that did not flush the data.
Below is a complete CUDA sample that demonstrates my problem:
#include <conio.h>
#include <cstdio>
#include <cstdlib>   // system()
#include "cuda.h"
#include "cuda_runtime.h"
#include "device_launch_parameters.h"

__global__ void Kernel(volatile float* hResult)
{
    int tid = threadIdx.x + blockIdx.x * blockDim.x;
    printf("Kernel %d: Before Writing in Kernel\n", tid);
    hResult[tid] = tid + 1;
    __threadfence_system();
    // expecting that the data is getting flushed to host here!
    printf("Kernel %d: After Writing in Kernel\n", tid);
    // time waster for-loop (sleep)
    for (int timeWater = 0; timeWater < 100000000; timeWater++);
}
int main()
{
    size_t blocks = 2;   // used below as the number of threads in a single block
    volatile float* hResult;
    cudaHostAlloc((void**)&hResult, blocks * sizeof(float), cudaHostAllocMapped);
    // zero the buffer so the polling loop starts from a defined value
    for (size_t i = 0; i < blocks; i++) hResult[i] = 0;
    Kernel<<<1, blocks>>>(hResult);
    size_t filledElementsCounter = 0;
    // naive polling loop that could be implemented using
    // another host thread
    while (filledElementsCounter < blocks)
    {
        // blocks until the value changes, this moves sequentially
        // while threads have no order (fine for this sample).
        while (hResult[filledElementsCounter] == 0);
        printf("%f\n", hResult[filledElementsCounter]);
        filledElementsCounter++;
    }
    cudaFreeHost((void *)hResult);
    system("pause");
    return 0;
}
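Currently this sample waits indefinitely, because nothing is read from the device unless I issue cudaDeviceSynchronize.
The sample below works, but it is NOT what I want, as it defeats the purpose of asynchronous copying: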
int main()
{
    size_t blocks = 2;
    volatile float* hResult;
    cudaHostAlloc((void**)&hResult, blocks * sizeof(float), cudaHostAllocMapped);
    Kernel<<<1, blocks>>>(hResult);
    // blocks the host until the kernel has finished
    cudaError_t error = cudaDeviceSynchronize();
    if (error != cudaSuccess)
    {
        printf("CUDA error: %s\n", cudaGetErrorString(error));
        return 1;
    }
    for (size_t i = 0; i < blocks; i++)
    {
        printf("%f\n", hResult[i]);
    }
    cudaFreeHost((void *)hResult);
    system("pause");
    return 0;
}
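For comparison, here is a minimal sketch of the polling pattern the question is after. It rests on an assumption the post does not confirm: that on a Windows WDDM driver the kernel launch may sit in a batched command queue, so the kernel has not actually started while the host spins. The non-blocking cudaStreamQuery(0) call is included only as a possible way to prompt the driver to submit the launch. The name dResult and the clock64-based delay are illustrative choices, not part of the original code.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void Kernel(volatile float* dResult)
{
    int tid = threadIdx.x + blockIdx.x * blockDim.x;
    dResult[tid] = tid + 1.0f;
    // make the write visible to the host before the kernel goes on working
    __threadfence_system();
    // stand-in for further work: spin for a fixed number of clock ticks
    long long start = clock64();
    while (clock64() - start < 100000000LL) { }
}

int main()
{
    const int threads = 2;

    // must be set before the context is created to allow mapped pinned memory
    cudaSetDeviceFlags(cudaDeviceMapHost);

    volatile float* hResult = nullptr;
    if (cudaHostAlloc((void**)&hResult, threads * sizeof(float),
                      cudaHostAllocMapped) != cudaSuccess)
    {
        printf("cudaHostAlloc failed\n");
        return 1;
    }
    // cudaHostAlloc does not zero the buffer; the poll below relies on zeros
    for (int i = 0; i < threads; i++) hResult[i] = 0.0f;

    // device-side alias of the same pinned allocation
    float* dResult = nullptr;
    cudaHostGetDevicePointer((void**)&dResult, (void*)hResult, 0);

    Kernel<<<1, threads>>>(dResult);

    // ASSUMPTION: non-blocking query used to nudge a batching driver into
    // submitting the launch; it returns cudaErrorNotReady while the kernel runs
    cudaStreamQuery(0);

    for (int i = 0; i < threads; i++)
    {
        while (hResult[i] == 0.0f) { }   // spin until the device write lands
        printf("%f\n", hResult[i]);
    }

    cudaDeviceSynchronize();             // wait for the kernel before cleanup
    cudaFreeHost((void*)hResult);
    return 0;
}
```

If the values still only show up after cudaDeviceSynchronize, then launch batching is probably not the cause on that setup and this sketch does not address the problem.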
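Did you solve this problem? Did you try writing the data to the CPU host memory using dynamic parallelism, by calling 'cudaMemcpyAsync(uva_host_ptr, device_ptr, size);' inside the kernel function, as shown in this link: http://on-demand.gputechconf.com/gtc/2012/presentations/S0338-GTC2012-CUDA-Programming-Model.pdf – Alex 2013-10-13 21:34:50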