
PySpark works fine with Python 2.7. I installed Python 3.5.1 (built from source), but when I try to run pyspark from the terminal I get this error:

Python 3.5.1 (default, Apr 25 2016, 12:41:28)
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
Traceback (most recent call last):
  File "/home/himaprasoon/apps/spark-1.6.0-bin-hadoop2.6/python/pyspark/shell.py", line 30, in <module>
    import pyspark
  File "/home/himaprasoon/apps/spark-1.6.0-bin-hadoop2.6/python/pyspark/__init__.py", line 41, in <module>
    from pyspark.context import SparkContext
  File "/home/himaprasoon/apps/spark-1.6.0-bin-hadoop2.6/python/pyspark/context.py", line 28, in <module>
    from pyspark import accumulators
  File "/home/himaprasoon/apps/spark-1.6.0-bin-hadoop2.6/python/pyspark/accumulators.py", line 98, in <module>
    from pyspark.serializers import read_int, PickleSerializer
  File "/home/himaprasoon/apps/spark-1.6.0-bin-hadoop2.6/python/pyspark/serializers.py", line 58, in <module>
    import zlib
ImportError: No module named 'zlib'

I also tried Python 3.4.3, and it works fine.

Answers


Have you checked to make sure that zlib is actually present in your Python installation? It should be there by default, but strange things happen.
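
A quick way to confirm is to run this with the very interpreter that PySpark launches (a minimal sketch; the package name in the comment assumes a Debian/Ubuntu system):

try:
    import zlib
    print("zlib is available, version:", zlib.ZLIB_VERSION)
except ImportError:
    # A Python built from source silently skips the zlib module when the
    # zlib development headers (zlib1g-dev on Debian/Ubuntu) were missing
    # at configure time; install them and rebuild Python to fix this.
    print("zlib is missing from this Python build")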


Did you set 'PYSPARK_PYTHON' in your .bashrc to the exact path of the system Python 3.5.1? (A sketch of this setting follows at the end of this answer.)

Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.1.1
      /_/

Using Python version 3.6.1 (default, Jun 23 2017 16:20:09) 
SparkSession available as 'spark'. 

This is what my PySpark prompt shows. The Apache Spark version is 2.1.1.

PS: I use Anaconda3 (Python 3.6.1) for my day-to-day PySpark code, with PYSPARK_DRIVER_PYTHON set to 'jupyter'.

The example above uses my system default Python 3.6.
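
As a minimal sketch of the two settings discussed above, the environment variables can also be set from Python itself before pyspark is first imported (the interpreter path /usr/local/bin/python3.5 is an assumption; substitute the actual location of your build):

import os

# Both paths below are assumptions: point them at the Python 3.5.1 you built.
# PYSPARK_PYTHON selects the interpreter used by the executors,
# PYSPARK_DRIVER_PYTHON the one used by the driver.
os.environ["PYSPARK_PYTHON"] = "/usr/local/bin/python3.5"
os.environ["PYSPARK_DRIVER_PYTHON"] = "/usr/local/bin/python3.5"

import pyspark  # import only after the variables are in place

Exporting the same variables in .bashrc has the same effect and also covers the pyspark shell script, so the driver and the executors agree on which interpreter to use.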