2015-02-09 139 views
3

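Importing pyspark for a standalone application

I am learning to use Spark. So far I have been following this article. When I try to import pyspark, I get the errors below, even though the pyspark directory does contain a file named accumulators.py.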

>>> import os 
>>> import sys 
>>> os.environ['SPARK_HOME'] = "E:\\spark-1.2.0" 
>>> sys.path.append("E:\\spark-1.2.0\\python") 
>>> from pyspark import SparkContext 
Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
    File "E:\spark-1.2.0\python\pyspark\__init__.py", line 41, in <module> 
    from pyspark.context import SparkContext 
    File "E:\spark-1.2.0\python\pyspark\context.py", line 30, in <module> 
    from pyspark.java_gateway import launch_gateway 
    File "E:\spark-1.2.0\python\pyspark\java_gateway.py", line 26, in <module> 
    from py4j.java_gateway import java_import, JavaGateway, GatewayClient 
ImportError: No module named py4j.java_gateway 
>>> sys.path.append("E:\\spark-1.2.0\\python\\build") 
>>> from pyspark import SparkContext 
Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
    File "E:\spark-1.2.0\python\pyspark\__init__.py", line 41, in <module> 
    from pyspark.context import SparkContext 
    File "E:\spark-1.2.0\python\pyspark\context.py", line 25, in <module> 
    from pyspark import accumulators 
ImportError: cannot import name accumulators 

How can I resolve this error? I am using Windows 7 and Java 8. The Python version is Python 2.7.6 :: Anaconda 1.9.2 (64-bit).

Could you print the value of 'sys.path' after your appends? – 2015-02-09 20:52:19

Answers

0

Try adding E:\spark-1.2.0\python\lib\py4j-0.8.2.1-src.zip to your PYTHONPATH.
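
One way to try this without changing the system-wide PYTHONPATH is to add the archive to sys.path from inside the session, since Python can import packages directly from a zip file. This is a minimal sketch, assuming the archive name given in this answer (it differs between Spark releases); add_zip_to_path is a hypothetical helper name, not part of PySpark.

```python
import os
import sys


def add_zip_to_path(zip_path):
    """Prepend a zip archive to sys.path if it exists; return True on success."""
    if not os.path.isfile(zip_path):
        return False
    if zip_path not in sys.path:
        sys.path.insert(0, zip_path)
    return True


# Path taken from the answer above; adjust it to the py4j archive that
# actually ships in your Spark distribution's python\lib directory.
py4j_zip = os.path.join("E:\\spark-1.2.0", "python", "lib", "py4j-0.8.2.1-src.zip")
if not add_zip_to_path(py4j_zip):
    print("py4j archive not found at: " + py4j_zip)
```

If the helper returns True, the py4j import that failed in the question's first traceback should be worth retrying in the same session.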

2

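I ran into the same problem while following the same article, and was able to fix it by changing the 00-pyspark-setup.py script so that it adds the SPARK_HOME/python/lib path directly to Python's sys.path, rather than only SPARK_HOME/python.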

My full 00-pyspark-startup.py script is as follows:

import os 
import sys 

# Configure the environment 
# if 'SPARK_HOME' not in os.environ:
#     os.environ['SPARK_HOME'] = '/srv/spark'

# Create a variable for our root path 
SPARK_HOME = os.environ['SPARK_HOME'] 

# Add the PySpark/py4j to the Python Path 
sys.path.insert(0, os.path.join(SPARK_HOME, "python", "lib")) 
sys.path.insert(0, os.path.join(SPARK_HOME, "python")) 

Thanks, but I got another error: 'No module named py4j.protocol' – Fan 2017-12-18 01:08:22

Then I tried adding the 'python/lib/py4j-0.10.4-src.zip' file to the PATH, and it worked. – Fan 2017-12-18 01:20:29
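
The py4j archive name changes between Spark releases (0.8.2.1 in the first answer, 0.10.4 in this comment), so a setup script can look the archive up instead of hard-coding it. This is a minimal sketch, assuming the archive sits under SPARK_HOME/python/lib and matches py4j-*-src.zip; find_py4j_zip and setup_pyspark_path are hypothetical helper names, not part of PySpark.

```python
import glob
import os
import sys


def find_py4j_zip(spark_home):
    """Return the path of the py4j source zip under spark_home, or None."""
    pattern = os.path.join(spark_home, "python", "lib", "py4j-*-src.zip")
    # If several archives match, take the last one in lexicographic order
    # (this is not a true version comparison).
    matches = sorted(glob.glob(pattern))
    return matches[-1] if matches else None


def setup_pyspark_path(spark_home):
    """Put PySpark and its bundled py4j archive on sys.path."""
    py4j_zip = find_py4j_zip(spark_home)
    if py4j_zip is None:
        raise IOError("no py4j-*-src.zip found under " + spark_home)
    for entry in (py4j_zip, os.path.join(spark_home, "python")):
        if entry not in sys.path:
            sys.path.insert(0, entry)
    return py4j_zip
```

Calling setup_pyspark_path(os.environ['SPARK_HOME']) before importing pyspark could stand in for the two sys.path.insert lines in the startup script above.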