Simple example of the custom layer API (TensorRT 2.1)?

I am using TensorRT 2.1 and want to implement a simple custom layer. (Our goal is to run Single Shot Detector on an embedded system using 3210.)

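For practice, I want to write an Inc layer that just adds 1.0 to every value of the input tensor and keeps the dimensions the same.

I implemented the Inc class by following the `class Reshape : public IPlugin` example in sampleFasterRCNN.cpp. I kept almost everything identical except getOutputDimensions(), which I changed so that the output keeps the input dimensions. (This part seems to work fine.)

Where should I implement the "add 1.0" part? I guess it belongs in enqueue(), so I tried: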

int enqueue(int batchSize, const void* const* inputs, void** outputs, void*, cudaStream_t stream) override
{
    // The line below is from the Reshape class; it seems to copy the input to the output.
    CHECK(cudaMemcpyAsync(outputs[0], inputs[0], mCopySize * batchSize, cudaMemcpyDeviceToDevice, stream));
    // Add 1.0 to the first ten values.
    float* foutputs = (float*) outputs[0];
    int i; for (i = 0; i < 10; i++) foutputs[i] += 1.0;
    return 0;
}

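However, this part causes a segmentation fault.

My questions are:

Where and how can I implement some computation between the input and the output?
Can anyone provide a simple example?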

Source

2017-08-10 YW P Kwon

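See the file samples/samplePlugin/samplePlugin.cpp and look at the FCPlugin class. Your actual computation should go in the enqueue method. You will probably have to write a CUDA kernel that performs the increment.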
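As a rough illustration of that suggestion, here is a minimal sketch of such a kernel. The pointers passed to enqueue refer to device (GPU) memory, so a host-side loop that dereferences outputs[0], as in the question, is the likely cause of the segmentation fault; the addition has to run on the GPU. The names incKernel, launchInc and incRoundTrip are made up for this example and are not part of TensorRT or its samples. The sketch assumes FP32 data and leaves out most error checking for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Each thread adds 1.0f to one element. `in` and `out` are device pointers
// and may refer to the same buffer (in-place increment).
__global__ void incKernel(const float* in, float* out, int count)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < count)
        out[i] = in[i] + 1.0f;
}

// Hypothetical helper: queues the kernel on `stream` and returns any launch error.
cudaError_t launchInc(const float* dIn, float* dOut, int count, cudaStream_t stream)
{
    if (count <= 0)
        return cudaSuccess;
    const int threads = 256;
    const int blocks = (count + threads - 1) / threads;  // round up
    incKernel<<<blocks, threads, 0, stream>>>(dIn, dOut, count);
    return cudaGetLastError();
}

// Host-side round trip, only for trying the kernel outside TensorRT:
// copy to the device, increment, copy back.
std::vector<float> incRoundTrip(const std::vector<float>& host)
{
    std::vector<float> result(host.size());
    const size_t bytes = host.size() * sizeof(float);
    float* dIn = nullptr;
    float* dOut = nullptr;
    cudaMalloc(&dIn, bytes);
    cudaMalloc(&dOut, bytes);
    cudaMemcpy(dIn, host.data(), bytes, cudaMemcpyHostToDevice);
    launchInc(dIn, dOut, static_cast<int>(host.size()), 0);  // default stream
    // A blocking cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(result.data(), dOut, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dOut);
    return result;
}

// Sketch of how enqueue might call it (not compiled here; assumes mCopySize
// is the per-sample size in bytes, as in the Reshape class, and FP32 data):
//
// int enqueue(int batchSize, const void* const* inputs, void** outputs,
//             void*, cudaStream_t stream) override
// {
//     const int count = static_cast<int>(mCopySize * batchSize / sizeof(float));
//     return launchInc(static_cast<const float*>(inputs[0]),
//                      static_cast<float*>(outputs[0]), count, stream)
//                == cudaSuccess ? 0 : 1;
// }
```

Because the kernel reads from the input and writes to the output directly, the separate cudaMemcpyAsync from the question would no longer be needed in this variant. The file containing the kernel has to be compiled with nvcc (for example as a .cu file).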

Source

2017-08-24 09:53:41
