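How does Caffe's convolution really work?

So I was playing around with pycaffe's convolution function, implemented as part of a basic convolution layer. Here's my convolution.prototxt file: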

name: "convolution" 
input: "data" 
input_dim: 1 
input_dim: 1 
input_dim: 227 
input_dim: 227 

layer { 
    name: "conv" 
    type: "Convolution" 
    bottom: "data" 
    top: "conv" 
    convolution_param {
        num_output: 96
        kernel_size: 11
        stride: 1
    }
}

These parameters match those of AlexNet's first conv layer (except for the stride, which is actually 4 in AlexNet).
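To make the effect of the stride concrete, here is a small sketch of the output size and the multiply-accumulate (MAC) count this layer implies. It assumes the usual no-padding formula with integer division, a 227x227 single-channel input and 96 filters of size 11x11 as in the prototxt above. The helper names `conv_output_size` and `conv_macs` are made up for this illustration and are not part of Caffe.

```python
def conv_output_size(input_size, kernel_size, stride, pad=0):
    """Spatial output size of a convolution, using integer (floor) division."""
    return (input_size + 2 * pad - kernel_size) // stride + 1


def conv_macs(input_size, kernel_size, stride, in_channels, num_output):
    """Multiply-accumulate count for one forward pass on a square input."""
    out = conv_output_size(input_size, kernel_size, stride)
    return out * out * num_output * in_channels * kernel_size * kernel_size


# Values taken from convolution.prototxt: 1x1x227x227 input, 96 filters, 11x11 kernel.
for stride in (1, 2, 3, 4):
    out = conv_output_size(227, 11, stride)
    macs = conv_macs(227, 11, stride, in_channels=1, num_output=96)
    print("stride %d: output %dx%d, %d MACs" % (stride, out, out, macs))
```

The arithmetic work shrinks roughly with the square of the stride, since the output goes from 217x217 at stride 1 to 55x55 at stride 4.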

I have a Macbook Pro with an NVIDIA GeForce GT 650M GPU with 1024 MB of memory. I'm not sure what this means, but my laptop also has an Intel HD 4000 as an integrated GPU.

I ran a few tests on my laptop while varying the stride hyperparameter, first in GPU mode and then in CPU mode.

1) Varying the stride after calling caffe.set_device(0); caffe.set_mode_gpu():

Stride 1: 27.26 ms 
Stride 2: 14.27 ms 
Stride 3: 10.57 ms 
Stride 4: 7.45 ms

2) Varying the stride after calling caffe.set_mode_cpu():

Stride 1: 49.77 ms # expected 
Stride 2: 9.92 ms # this and the results after this don't make sense 
Stride 3: 4.50 ms 
Stride 4: 1.96 ms

(Each figure is the average of 3 runs.)
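One way to read these numbers is to compare the measured speed-up at each stride against the reduction in output positions. The sketch below only does arithmetic on the timings listed above; it assumes no padding, and the helper names are invented for this illustration.

```python
# Timings copied from the measurements above (milliseconds, mean of 3 runs).
gpu_ms = {1: 27.26, 2: 14.27, 3: 10.57, 4: 7.45}
cpu_ms = {1: 49.77, 2: 9.92, 3: 4.50, 4: 1.96}


def output_size(stride, input_size=227, kernel_size=11):
    """Spatial output size for the layer in convolution.prototxt (no padding)."""
    return (input_size - kernel_size) // stride + 1


def speedups(timings):
    """Speed-up of each stride relative to stride 1."""
    return {s: timings[1] / t for s, t in timings.items()}


def work_ratio(stride):
    """How many times fewer output positions there are than at stride 1."""
    return (output_size(1) / float(output_size(stride))) ** 2


gpu_speedup = speedups(gpu_ms)
cpu_speedup = speedups(cpu_ms)
for s in sorted(gpu_ms):
    print("stride %d: %.2fx less work, GPU %.2fx faster, CPU %.2fx faster"
          % (s, work_ratio(s), gpu_speedup[s], cpu_speedup[s]))
```

With these figures, going from stride 1 to stride 4 cuts the output positions by roughly 16x, while the GPU time drops by less than 4x and the CPU time by about 25x. That gap is what the question is about.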

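I'm just trying to understand how Caffe's convolution works based on these tests. Can anyone help me make sense of this? Why does CPU mode execute faster than GPU mode?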
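For context, Caffe's default convolution layer (when not using cuDNN) is generally described as lowering the convolution to a single matrix multiply: each input patch is unrolled into a column ("im2col") and the filters are applied with one GEMM call. Below is a minimal NumPy sketch of that idea, assuming no padding and no bias. It is an illustration of the technique, not Caffe's actual code, and the function names `im2col` and `conv_forward` are chosen here.

```python
import numpy as np


def im2col(image, kernel_size, stride):
    """Unroll every kernel-sized patch of a (C, H, W) image into one column."""
    channels, height, width = image.shape
    out_h = (height - kernel_size) // stride + 1
    out_w = (width - kernel_size) // stride + 1
    cols = np.empty((channels * kernel_size * kernel_size, out_h * out_w),
                    dtype=image.dtype)
    col = 0
    for y in range(0, out_h * stride, stride):
        for x in range(0, out_w * stride, stride):
            patch = image[:, y:y + kernel_size, x:x + kernel_size]
            cols[:, col] = patch.ravel()
            col += 1
    return cols, out_h, out_w


def conv_forward(image, weights, stride):
    """Cross-correlation (no padding, no bias) computed as one matrix multiply."""
    num_output = weights.shape[0]
    kernel_size = weights.shape[-1]
    cols, out_h, out_w = im2col(image, kernel_size, stride)
    # (num_output, C*k*k) x (C*k*k, out_h*out_w) -> (num_output, out_h*out_w)
    out = weights.reshape(num_output, -1).dot(cols)
    return out.reshape(num_output, out_h, out_w)
```

In this formulation a larger stride shrinks the number of columns, so both the unrolling step and the matrix multiply get smaller. Calling `conv_forward(np.random.randn(1, 227, 227), np.random.randn(96, 1, 11, 11), stride=4)` returns an array of shape (96, 55, 55).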

Here is the test code I used, in case you're interested:

import numpy as np 
import caffe 
import time 

caffe.set_device(0) 
caffe.set_mode_gpu() # caffe.set_mode_cpu() 

net = caffe.Net('convolution.prototxt', caffe.TEST) 
total = 0.0 
for _ in range(3): 
    net.blobs['data'].data[...] = np.random.randn(1, 1, 227, 227) # there really is an ellipsis there 
    net.params['conv'][0].data[...] = np.random.randn(96, 1, 11, 11) 
    s = time.time() 
    r = net.forward() 
    e = time.time() 
    total += (e - s) 

print(total / 3 * 1000)  # mean forward time in milliseconds
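A possible refinement of the timing loop, sketched under the assumption that the first forward pass may include one-time costs (such as memory allocation or device initialization) that would skew a 3-run average. The helper `mean_forward_ms` is hypothetical and not part of pycaffe. It takes any callable, runs a few untimed warm-up calls, then averages the rest.

```python
from timeit import default_timer


def mean_forward_ms(forward, warmup=1, repeats=10):
    """Mean wall-clock time of forward() in ms, after untimed warm-up calls."""
    for _ in range(warmup):
        forward()
    total = 0.0
    for _ in range(repeats):
        start = default_timer()
        forward()
        total += default_timer() - start
    return total / repeats * 1000.0


# Hypothetical usage with the net built in the test code above:
# print(mean_forward_ms(net.forward, warmup=2, repeats=20))
```

Whether warm-up changes the picture here is not something these measurements show. It is only a way to rule out one possible source of noise.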

Source

2016-07-08 cᴏʟᴅsᴘᴇᴇᴅ