Benchmarking quantization on Android

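I've been benchmarking TensorFlow models on the Exynos 7420 with benchmark_model. I wanted to test the speed of quantization as described in Pete Warden's blog, but I haven't been able to compile benchmark_model with the quantization code, because it breaks a lot of things.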

I have followed the guidelines listed in this Stack Overflow thread:

// tensorflow/tools/benchmark/BUILD cc_binary

deps = [":benchmark_model_lib", 
      "//tensorflow/contrib/quantization/kernels:quantized_ops", 
      ],

// tensorflow/contrib/quantization/kernels/BUILD:

deps = [ 
    "//tensorflow/contrib/quantization:cc_array_ops", 
    "//tensorflow/contrib/quantization:cc_math_ops", 
    "//tensorflow/contrib/quantization:cc_nn_ops", 
    #"//tensorflow/core", 
    #"//tensorflow/core:framework", 
    #"//tensorflow/core:lib", 
    #"//tensorflow/core/kernels:concat_lib_hdrs", 
    #"//tensorflow/core/kernels:conv_ops", 
    #"//tensorflow/core/kernels:eigen_helpers", 
    #"//tensorflow/core/kernels:ops_util", 
    #"//tensorflow/core/kernels:pooling_ops", 
    "//third_party/eigen3", 
    "@gemmlowp//:eight_bit_int_gemm", 
],

Then running:

bazel build -c opt --cxxopt='--std=gnu++11' --crosstool_top=//external:android/crosstool --cpu=armeabi-v7a --host_crosstool_top=@bazel_tools//tools/cpp:toolchain tensorflow/tools/benchmark:benchmark_model --verbose_failures

This (after following all the other instructions in the link) succeeds, with the exception that it cannot link against pthread.

I've tried removing -lpthread from tf_copts() in tensorflow/tensorflow.bzl, and likewise in tensorflow/tools/proto_text/BUILD and tensorflow/cc/BUILD.

def tf_copts(): 
    return (["-fno-exceptions", "-DEIGEN_AVOID_STL_ARRAY"] + 
      if_cuda(["-DGOOGLE_CUDA=1"]) + 
      if_android_arm(["-mfpu=neon"]) + 
      select({"//tensorflow:android": [ 
        "-std=c++11", 
        "-DMIN_LOG_LEVEL=0", 
        "-DTF_LEAN_BINARY", 
        "-O2", 
        ], 
        "//tensorflow:darwin": [], 
        "//tensorflow:ios": ["-std=c++11",], 
        #"//conditions:default": ["-lpthread"]})) 
        "//conditions:default": []}))

I still get the linking error.

external/androidndk/ndk/toolchains/arm-linux-androideabi-4.9/prebuilt/linux-x86_64/bin/../lib/gcc/arm-linux-androideabi/4.9/../../../../arm-linux-androideabi/bin/ld: error: cannot find -lpthread 
collect2: error: ld returned 1 exit status
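
Android's Bionic libc provides pthreads itself and ships no separate libpthread, so any -lpthread that still reaches the NDK linker fails this way. One way to check whether a stray flag survives the edits above is to scan the source tree for active (uncommented) occurrences. This is a minimal sketch: the helper name `find_flag` is hypothetical, and it assumes the flag can only come from files named BUILD or ending in .bzl, which may not cover every place it is set.

```python
import os


def find_flag(root, flag="-lpthread", names=("BUILD",), suffixes=(".bzl",)):
    """Return sorted (path, line_number, line) tuples for uncommented lines containing flag."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            # Only look at Bazel build files (assumption: BUILD and *.bzl).
            if filename not in names and not filename.endswith(suffixes):
                continue
            path = os.path.join(dirpath, filename)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    stripped = line.strip()
                    # Skip lines that are commented out with '#'.
                    if flag in stripped and not stripped.startswith("#"):
                        hits.append((path, number, stripped))
    return sorted(hits)


if __name__ == "__main__":
    # Run from the repository root; prints nothing if no active flag is found.
    for path, number, line in find_flag("tensorflow"):
        print(f"{path}:{number}: {line}")
```

An empty result would suggest the flag is coming from somewhere other than these files, for example a dependency's own build rules.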

Any help is much appreciated; I'm rather stuck.

Env:

Ubuntu 14.04
tensorflow commit #4462
android_ndk_r11c
android-sdk-linux r24.4.1
Python 2.7.12 :: Continuum Analytics, Inc.
./configure without support for GCP, HDFS, or GPU

2016-09-21 Dwight Crow

Transcribing the GitHub answer from Andrew Harp of the TF team. Thanks!!!

None of the changes above are necessary. You can get quantization working for benchmark_model (or any target that depends on android_tensorflow_lib) with the following:

git pull --recurse-submodules (to get the @gemmlowp library; alternatively, git clone --recursive)
Make the following edit to //tensorflow/core/BUILD

diff --git a/tensorflow/core/BUILD b/tensorflow/core/BUILD 
--- a/tensorflow/core/BUILD 
+++ b/tensorflow/core/BUILD 
@@ -713,8 +713,11 @@ cc_library(
 # binary size (by packaging a reduced operator set) is a concern.
 cc_library(
     name = "android_tensorflow_lib",
-    srcs = if_android([":android_op_registrations_and_gradients"]),
-    copts = tf_copts(),
+    srcs = if_android([":android_op_registrations_and_gradients",
+                       "//tensorflow/contrib/quantization:android_ops",
+                       "//tensorflow/contrib/quantization/kernels:android_ops",
+                       "@gemmlowp//:eight_bit_int_gemm_sources"]),
+    copts = tf_copts() + ["-Iexternal/gemmlowp"],
     linkopts = ["-lz"],
     tags = [
         "manual",

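Just tested, and it works great. Interestingly, quantization produces a graph a quarter of the size, but inference runs 4-5x slower than with the unquantized graph; it seems the quantized ops are still being optimized.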
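
The quarter-size result is what you would expect if each 32-bit float weight is stored as a single 8-bit code. Here is a minimal sketch of that idea, assuming plain min/max linear quantization, which may differ in detail from what TensorFlow's quantization tooling actually does; `quantize` and `dequantize` are hypothetical helper names, and the weights are made-up values.

```python
from array import array


def quantize(values):
    """Map floats linearly onto 0..255 and return (codes, minimum, scale)."""
    low, high = min(values), max(values)
    # Fall back to a scale of 1.0 when all values are equal (range is zero).
    scale = (high - low) / 255.0 or 1.0
    codes = array("B", (round((v - low) / scale) for v in values))
    return codes, low, scale


def dequantize(codes, low, scale):
    """Recover approximate floats from 8-bit codes."""
    return [low + c * scale for c in codes]


# Illustrative weights stored as 32-bit floats (4 bytes each).
weights = array("f", [-1.0, -0.5, 0.0, 0.25, 1.0])
codes, low, scale = quantize(weights)
restored = dequantize(codes, low, scale)

float_bytes = len(weights) * weights.itemsize
quantized_bytes = len(codes) * codes.itemsize
print(float_bytes, quantized_bytes)
```

The sketch only accounts for storage size. It says nothing about speed, which depends on how well the quantized kernels are optimized for the target CPU.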

2016-09-23 04:18:53

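Great, glad that's working now. And yes, we're still optimizing the quantized ops, so don't take the current speed as the maximum you'll be able to get.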
