2017-04-24 119 views

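PySpark: fix/remove the console progress bar

As shown below, the Spark console progress bar is getting interleaved with my program's output. Is there a configuration option or flag that turns off the stage progress bar? Or, better yet, how can I fix the console logging so that the progress bar disappears once a stage has completed? This may just be a PySpark bug, but I'm not sure.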

(CID, (v1/n1, v2/n2)) 
[Stage 46:============================================>   (19 + 4)/24]('1', (0.020000000000000035, 4.805)) 
('5', (6.301249999999998, 0.125)) 
('10', (21.78000000000001, 3.125)) 
('7', (0.005000000000000009, 0.6049999999999996)) 

(CID, sqrt(v1/n1 + v2/n2)) 
('1', 2.19658826364888) 
('5', 2.5350049309616733) 
('10', 4.990490957811667) 
('7', 0.7810249675906652) 

(CID, (AD_MEAN, NCI_MEAN)) 
('7', (1.0, 5.5)) 
('5', (7.75, 5.3)) 
('10', (13.5, 6.0)) 
('1', (3.0, 5.0)) 

(CID, (AD_MEAN - NCI_MEAN)) 
('7', -4.5) 
('5', 2.45) 
('1', -2.0) 
('10', 7.5) 

(CID, (NUMER, DENOM)) 
[Stage 100:===================================================> (30 + 2)/32]('10', (7.5, 4.990490957811667)) 
('5', (2.45, 2.5350049309616733)) 
('7', (-4.5, 0.7810249675906652)) 
('1', (-2.0, 2.19658826364888)) 

Sometimes it gets even worse (scroll to the right):

$ spark-submit main.py 
17/04/28 11:36:23 WARN Utils: Your hostname, Pandora resolves to a loopback address: 127.0.1.1; using 146.95.36.193 instead (on interface wlp3s0) 
17/04/28 11:36:23 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address 
17/04/28 11:36:24 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
[Stage 0:>               (0 + 2                   [Stage 32:=============================>       (4 + 4[Stage 37:>               (0 + 0[Stage 35:=====>   (4 + 2)/12][Stage 37:>     (0 + 0[Stage 35:===========>  (8 + 4)/12][Stage 37:>     (0 + 0[Stage 37:=======>             (1 + 3[Stage 37:=============================>       (4 + 0[Stage 36:========>  (13 + 4)/24][Stage 37:=========>  (4 + 0[Stage 36:==============> (21 + 3)/24][Stage 37:=========>  (4 + 1[Stage 37:====================================>      (5 + 3[Stage 38:===================================>     (20 + 4)[Stage 38:====================================================> (30 + 2)                   SORTED (t-value, CID) 
[(-5.761659596980321, '7'), (-0.9105029072119708, '1'), (0.9664675480810896, '5'), (1.5028581483070664, '10')] 

Answer


You can either

  • disable the progress bar by setting spark.ui.showConsoleProgress=false, or

  • raise the logging level in log4j.properties above INFO, e.g. to ERROR.
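As a minimal sketch of the first option, the setting can be passed either on the spark-submit command line with --conf or in code before the session is created. The helper names spark_submit_args and build_session are hypothetical, chosen only for illustration; main.py is the script name from the question. This assumes no SparkContext exists yet, since getOrCreate() would otherwise reuse the existing one and the setting would likely have no effect.

```python
# Hypothetical helpers for illustration; only the property name comes from the answer.
PROGRESS_BAR_CONF = {"spark.ui.showConsoleProgress": "false"}


def spark_submit_args(script, conf=PROGRESS_BAR_CONF):
    """Build the argv for `spark-submit --conf key=value ... script`."""
    args = ["spark-submit"]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    args.append(script)
    return args


def build_session(conf=PROGRESS_BAR_CONF):
    """Create a SparkSession with the same settings applied in code."""
    # Imported lazily so the command-line helper works without PySpark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


if __name__ == "__main__":
    # Show the command line that disables the progress bar for main.py.
    print(" ".join(spark_submit_args("main.py")))
```

Inside main.py, calling build_session() in place of a bare SparkSession.builder.getOrCreate() should apply the same setting without changing how the job is submitted.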

Related Spark JIRAs:

spark.ui.showConsoleProgress has been available in Spark since version 1.2, but it will only be documented starting with Spark 2.2.


Is the SparkContext the spark variable, i.e. the one created from SparkSession.builder().getOrCreate()? – juanpscotto