I am learning to use Spark. So far I have been following this article, "Importing pyspark for a standalone application". When I try to import pyspark, I get the errors below, even though the file accumulators.py does exist inside pyspark.
>>> import os
>>> import sys
>>> os.environ['SPARK_HOME'] = "E:\\spark-1.2.0"
>>> sys.path.append("E:\\spark-1.2.0\\python")
>>> from pyspark import SparkContext
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "E:\spark-1.2.0\python\pyspark\__init__.py", line 41, in <module>
    from pyspark.context import SparkContext
  File "E:\spark-1.2.0\python\pyspark\context.py", line 30, in <module>
    from pyspark.java_gateway import launch_gateway
  File "E:\spark-1.2.0\python\pyspark\java_gateway.py", line 26, in <module>
    from py4j.java_gateway import java_import, JavaGateway, GatewayClient
ImportError: No module named py4j.java_gateway
>>> sys.path.append("E:\\spark-1.2.0\\python\\build")
>>> from pyspark import SparkContext
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "E:\spark-1.2.0\python\pyspark\__init__.py", line 41, in <module>
    from pyspark.context import SparkContext
  File "E:\spark-1.2.0\python\pyspark\context.py", line 25, in <module>
    from pyspark import accumulators
ImportError: cannot import name accumulators
How do I resolve this error? I am using Windows 7 and Java 8. The Python version is Python 2.7.6 :: Anaconda 1.9.2 (64-bit).
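For context, the first ImportError suggests that py4j is not on sys.path: Spark ships a py4j source zip under SPARK_HOME\python\lib that also needs to be appended, not just the python directory. Below is a minimal sketch of how the path entries could be collected; the helper name pyspark_paths is mine, and the glob pattern assumes the zip follows the usual py4j-<version>-src.zip naming of Spark releases.

```python
import glob
import os
import sys

def pyspark_paths(spark_home):
    """Return the entries to put on sys.path for pyspark:
    the python/ directory itself plus the bundled py4j source zip
    (its exact filename depends on the Spark release, hence the glob)."""
    python_dir = os.path.join(spark_home, "python")
    py4j_zips = glob.glob(os.path.join(python_dir, "lib", "py4j-*-src.zip"))
    return [python_dir] + py4j_zips

# Hypothetical usage with the install location from the question:
# sys.path.extend(pyspark_paths("E:\\spark-1.2.0"))
# from pyspark import SparkContext
```

This is only a sketch under the assumption that the py4j zip is present in that lib directory; checking the actual contents of E:\spark-1.2.0\python\lib would confirm the filename.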
Can you print the value of `sys.path` after your appends? – 2015-02-09 20:52:19