2016-12-03 62 views

MapReduce: ValueError: too many values to unpack (expected 2)

I am running the following Python code as a MapReduce job:

from mrjob.job import MRJob 
import collections 

bigram = collections.defaultdict(float) 
unigram = collections.defaultdict(float) 


class MRWordFreqCount(MRJob): 

    def mapper(self, _, line):
        # Now we loop over lines in the system input
        line = line.strip().split()
        # go through each word in sentence
        i = 0
        for word in line:
            if i > 0:
                hist = word
            else:
                hist = ''

            word = CleanWord(word)  # Get the new word

            # If CleanWord didn't return a string, move on
            if word is None:
                continue

            i += 1
            yield word.lower(), hist.lower(), 1.0

if __name__ == '__main__': 
    MRWordFreqCount.run() 

The error I get is: ValueError: too many values to unpack (expected 2), but I can't figure out why. Any suggestions? The command I am running is: python myjob.py Test.txt --mapper

+1

You are yielding 3 values from 'mapper', whereas it seems you can only yield 2.

+0

Thanks. Yes, you are right: mrjob's mapper function expects a single key, value pair as output. https://pythonhosted.org/mrjob/guides/concepts.html#mapreduce-and-apache-hadoop – user1761806

Answers

1

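In a MapReduce job, each record you emit must be a single (key, value) pair. To carry both the word and its history, group them into one tuple and use that tuple as the key: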

yield (word.lower(), hist.lower()), 1.0 
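
To see why the extra parentheses matter, here is a minimal sketch that runs without mrjob. The names bad_mapper, good_mapper and consume are hypothetical and only for illustration; consume is an assumed stand-in for the framework, which unpacks every yielded record into exactly two names. The real mrjob internals may differ in detail, but the error message in the question suggests the same kind of unpacking.

```python
def bad_mapper(line):
    # Yields three items per record, as in the question's code.
    for word in line.split():
        yield word.lower(), "", 1.0


def good_mapper(line):
    # Yields one (key, value) pair; the key itself is a 2-tuple.
    for word in line.split():
        yield (word.lower(), ""), 1.0


def consume(records):
    # Hypothetical stand-in for the framework: every record is
    # unpacked into exactly a key and a value.
    return [(key, value) for key, value in records]


print(consume(good_mapper("The cat")))

try:
    consume(bad_mapper("The cat"))
except ValueError as err:
    # Three items cannot be unpacked into two names.
    print("ValueError:", err)
```

Note that with mrjob's default JSON-based protocols, a tuple key will likely reach the reducer as a list, so a reducer should not rely on the key being a tuple.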