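Softmax input dimensions in the Pyfaster RCNN ZF network model

I am interested in reproducing the steps in the prototxt file of the ZF net. The part I am not sure about is the softmax layer. First, rpn_cls_score is created here: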

layer { 
    name: "rpn_cls_score" 
    type: "Convolution" 
    bottom: "rpn/output" 
    top: "rpn_cls_score" 
    convolution_param {
        num_output: 18   # 2 (bg/fg) * 9 (anchors)
        kernel_size: 1 pad: 0 stride: 1
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
    }
}

It is then reshaped to dimensions (1, 2, 9*H, W) here:

layer { 
    bottom: "rpn_cls_score" 
    top: "rpn_cls_score_reshape" 
    name: "rpn_cls_score_reshape" 
    type: "Reshape" 
    reshape_param { shape { dim: 0 dim: 2 dim: -1 dim: 0 } } 
}
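
To make the effect of this Reshape concrete, here is a minimal NumPy sketch of the same index arithmetic. It assumes Caffe's row-major (N, C, H, W) blob layout; the feature-map size H = 3, W = 4 and the variable names are made up for illustration and do not come from the prototxt.

```python
import numpy as np

H, W, A = 3, 4, 9  # hypothetical feature-map size; A = anchors per position

# Stand-in for the rpn_cls_score blob: shape (N, 2*A, H, W) = (1, 18, H, W).
rpn_cls_score = np.arange(1 * 2 * A * H * W, dtype=np.float32).reshape(1, 2 * A, H, W)

# Mirrors reshape_param { shape { dim: 0 dim: 2 dim: -1 dim: 0 } }:
# a 0 copies the corresponding bottom dimension, -1 is inferred from the rest.
n, _, h, w = rpn_cls_score.shape
rpn_cls_score_reshape = rpn_cls_score.reshape(n, 2, -1, w)

print(rpn_cls_score_reshape.shape)  # (1, 2, 27, 4), i.e. (1, 2, 9*H, W)

# Channel k*A + a at row hh ends up in channel k at row a*H + hh.
k, a, hh, ww = 1, 2, 1, 3
assert rpn_cls_score_reshape[0, k, a * H + hh, ww] == rpn_cls_score[0, k * A + a, hh, ww]
```

Under this layout the first 9 of the 18 channels land in channel 0 and the last 9 in channel 1, so the new axis 1 has exactly two entries per anchor position.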

Finally the reshaped blob, which started out as rpn_cls_score with dimensions (1, 18, H, W), is passed to the softmax here:

layer { 
    name: "rpn_cls_prob" 
    type: "Softmax" 
    bottom: "rpn_cls_score_reshape" 
    top: "rpn_cls_prob" 
}

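My question is this. According to the Caffe online documentation, softmax takes a 1-dimensional input, but rpn_cls_score_reshape has dimensions (1, 2, 9*H, W). Does the softmax sum the exponentials over all indices, or does it choose a canonical axis and only sum over the remaining indices (as the C++ code seems to indicate)? In that case, it would split rpn_cls_score_reshape into two arrays, (1, channel = 1, 9*H, W) and (1, channel = 2, 9*H, W), perform a softmax in each by summing the exponentials over its 9*H*W components, then reassemble them into an array with the original dimensions (1, 2, 9*H, W) and return it as rpn_cls_prob. If not, how does softmax handle input arrays with multiple dimensions?

Thanks.

Source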

2017-10-13 Mjas

As documented for SoftmaxParameter in caffe.proto, the layer has an axis parameter whose default value is 1:

// The axis along which to perform the softmax -- may be negative to index 
// from the end (e.g., -1 for the last axis). 
// Any other axes will be evaluated as independent softmaxes. 
optional int32 axis = 2 [default = 1];

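So your reading of the C++ implementation is correct in that the softmax picks one canonical axis instead of summing the exponentials over all indices. For an N-D input with N > 1, the normalization runs along that axis only, and every position along the other axes is evaluated as an independent softmax. With the default axis = 1 and an input of shape (1, 2, 9*H, W), each of the 9*H*W positions therefore gets its own two-way (bg/fg) softmax, rather than one softmax over 9*H*W components per channel.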
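
The axis semantics described in the caffe.proto comment can be checked with a small NumPy sketch. This is an illustration of a softmax along axis 1, not Caffe's actual code; the function name, the random input, and the sizes H = 3, W = 4 are assumptions made for the example.

```python
import numpy as np

def softmax(x, axis=1):
    """Softmax along one axis; every other index is an independent softmax."""
    shifted = x - x.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

H, W, A = 3, 4, 9  # hypothetical sizes
rng = np.random.default_rng(0)
scores = rng.standard_normal((1, 2, A * H, W))  # stand-in for rpn_cls_score_reshape

prob = softmax(scores, axis=1)  # stand-in for rpn_cls_prob

# The two channel entries (bg, fg) at every position sum to 1 ...
assert np.allclose(prob.sum(axis=1), 1.0)
# ... whereas the 9*H*W entries inside a single channel do not.
assert not np.allclose(prob.sum(axis=(2, 3)), 1.0)
```

Each output value depends only on the two scores at the same (n, row, column) position, which is why the reshape to two channels is done before the softmax.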
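As for Faster R-CNN, if you are only interested in the foreground boxes, you can split the rpn_cls_score blob and use only its second half (i.e., set num_output: 9 instead of 18 in your network after training, or use a Slice layer during training to take only the second half). If you train as usual and change num_output after training, remember to change the caffemodel accordingly.

Source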
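
As a rough illustration of "use only the second half", the NumPy sketch below slices the foreground channels out of an 18-channel blob. It assumes the channel order implied by the prototxt comment (the first 9 channels are background, the last 9 foreground); the array contents are random placeholders and all variable names are hypothetical.

```python
import numpy as np

H, W, A = 3, 4, 9  # hypothetical sizes
rng = np.random.default_rng(0)

# Placeholder values standing in for rpn_cls_prob, shape (1, 2, 9*H, W).
rpn_cls_prob = rng.random((1, 2, A * H, W))

# Undo the earlier reshape: (1, 2, 9*H, W) -> (1, 18, H, W).
rpn_cls_prob_reshape = rpn_cls_prob.reshape(1, 2 * A, H, W)

# Assumed layout: channels [0, A) are background, channels [A, 2A) are foreground.
fg_prob = rpn_cls_prob_reshape[:, A:, :, :]

print(fg_prob.shape)  # (1, 9, 3, 4): one foreground score per anchor and position
```

The slice is the same data as channel 1 of the two-channel blob, only laid out as one map per anchor.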

2017-10-15 10:06:33 rkellerm
