2017-09-25 141 views

Python Spark Streaming output

I am trying to run this program, but the pprint statement does not write any output to the console.

from __future__ import print_function

import sys

from pyspark import SparkContext
from pyspark.streaming import StreamingContext

if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: hdfs_wordcount.py <directory>", file=sys.stderr)
        sys.exit(-1)

    sc = SparkContext(appName="PythonStreamingHDFSWordCount")
    ssc = StreamingContext(sc, 1)  # 1-second batch interval

    # Watch the given directory for newly created text files
    lines = ssc.textFileStream(sys.argv[1])
    counts = lines.flatMap(lambda line: line.split(" "))\
        .map(lambda x: (x, 1))\
        .reduceByKey(lambda a, b: a+b)
    counts.pprint()  # print the first elements of each batch to the console

    ssc.start()
    ssc.awaitTermination()

https://github.com/apache/spark/blob/master/examples/src/main/python/streaming/hdfs_wordcount.py

Answers
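A likely cause, assuming the script itself starts without errors: textFileStream only processes files that appear in the monitored directory after the stream has started, and the Spark Streaming guide asks for them to be moved or renamed into that directory atomically. Files that were already there, or that are written in place, may produce batches with nothing for pprint to show. The sketch below shows one way to feed the stream under the assumption that the directory is on the local filesystem; `drop_file` and `staging_dir` are hypothetical names for illustration and are not part of Spark.

```python
import os
import tempfile


def drop_file(staging_dir, watched_dir, name, text):
    """Write text in staging_dir, then rename it into watched_dir.

    Hypothetical helper: staging_dir must be on the same filesystem as
    watched_dir so that os.rename is a single atomic step.
    """
    # Write the complete contents outside the watched directory first.
    fd, tmp_path = tempfile.mkstemp(dir=staging_dir)
    with os.fdopen(fd, "w") as f:
        f.write(text)

    # Rename in one step so the stream never sees a half-written file.
    final_path = os.path.join(watched_dir, name)
    os.rename(tmp_path, final_path)
    return final_path
```

Start the streaming job first, then call something like `drop_file("/tmp/staging", "/tmp/watched", "words.txt", "a b a\n")` while it is running, with the second argument matching the directory passed to the script. For a directory on HDFS, the equivalent step would be writing the file elsewhere and moving it in with `hdfs dfs -mv`.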