2017-06-05 57 views

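Unusual CUDA-related error in TensorFlow

I have been using TensorFlow for nearly two years and have never seen anything like this. On a new Ubuntu box, I installed TensorFlow in a virtualenv. When I run the sample code, I get an invalid device error. It happens at the call to tf.Session().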

WARNING:tensorflow:From full_code.py:27: initialize_all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02. 
Instructions for updating: 
Use `tf.global_variables_initializer` instead. 
2017-06-05 11:01:55.853842: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations. 
2017-06-05 11:01:55.853867: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations. 
2017-06-05 11:01:55.853876: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations. 
2017-06-05 11:01:55.853886: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations. 
2017-06-05 11:01:55.853893: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations. 
2017-06-05 11:01:55.937978: I tensorflow/core/common_runtime/gpu/gpu_device.cc:887] Found device 0 with properties: 
name: GeForce GTX 660 Ti 
major: 3 minor: 0 memoryClockRate (GHz) 1.0455 
pciBusID 0000:04:00.0 
Total memory: 2.95GiB 
Free memory: 2.91GiB 
2017-06-05 11:01:55.938063: W tensorflow/stream_executor/cuda/cuda_driver.cc:485] creating context when one is currently active; existing: 0x19e5370 
2017-06-05 11:01:56.014220: E tensorflow/core/common_runtime/direct_session.cc:137] Internal: failed initializing StreamExecutor for CUDA device ordinal 1: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_INVALID_DEVICE 

Here are the full specs.

Ubuntu 14.04 
CUDA 8.0 
GeForce GTX 660 Ti 
python 3.4.3 
Have you verified your CUDA installation? –

@RobertCrovella Not sure how? – horaceT

Check the CUDA Linux installation guide. –
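One quick sanity check, short of the full verification steps in the installation guide, is to ask the driver which GPUs it can see. The sketch below is only an illustration: the helper name `list_nvidia_gpus` is made up for this example, and it assumes the `nvidia-smi` utility that ships with the NVIDIA driver is on the PATH. It confirms the driver is working, not that the CUDA toolkit itself is installed correctly.

```python
import shutil
import subprocess


def list_nvidia_gpus():
    """Return one line per GPU reported by `nvidia-smi -L`, or [] if unavailable.

    Hypothetical helper for illustration; not part of TensorFlow or CUDA.
    """
    if shutil.which("nvidia-smi") is None:
        return []  # driver utilities are not installed or not on PATH
    try:
        out = subprocess.check_output(["nvidia-smi", "-L"],
                                      universal_newlines=True)
    except (subprocess.CalledProcessError, OSError):
        return []  # the tool exists but could not query the driver
    return [line.strip() for line in out.splitlines() if line.strip()]


for gpu in list_nvidia_gpus():
    print(gpu)
```

Note that `nvidia-smi` lists cards in PCI bus order, which is not necessarily the order in which CUDA numbers them by default.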

Answer

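Thanks to someone from Google, I figured out what went wrong. This Dell box has two Nvidia graphics cards. The first one came with the machine from the manufacturer and is an NVS 310. As far as I know, it has no compute capability to speak of, and I never intended to use it heavily.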

I then added a second card, a GTX 660 Ti, which I intend to use for all computation.

When TensorFlow is invoked, it defaults to device 0, which is the NVS 310, so of course it throws an invalid device error.

When I run

CUDA_VISIBLE_DEVICES=1 python myscript.py

it works.
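The same restriction can also be applied from inside the script rather than on the command line. The following is a minimal sketch, not the only way to do it: the helper name `select_gpu` is invented for this example, and the index 1 simply mirrors the command above. The variables have to be set before TensorFlow initializes CUDA, so the safest place is before the first `import tensorflow`.

```python
import os


def select_gpu(index, pci_order=False):
    """Expose only one GPU to CUDA by setting CUDA_VISIBLE_DEVICES.

    Hypothetical helper for illustration. Call it before TensorFlow
    (or any other CUDA library) initializes the driver.
    """
    if pci_order:
        # Optional: number devices by PCI bus ID so the index matches
        # the order shown by nvidia-smi instead of CUDA's default order.
        os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)
    return os.environ["CUDA_VISIBLE_DEVICES"]


select_gpu(1)  # assumption: index 1 is the card to use, as in the command above
# import tensorflow as tf  # import only after the variable is set
```

With a single device exposed this way, the selected card shows up inside the process as device 0.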

So the solution involves hardware details that you completely neglected to mention in your question? – talonmies

@talonmies Totally my bad. I have since learned more about how CUDA behaves when there are multiple GPUs. – horaceT