
Trying to load a text file from HDFS into MongoDB through a map-reduce Java program, I am getting the following error:

12/07/26 19:19:02 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId= 
12/07/26 19:19:02 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
12/07/26 19:19:02 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same. 
12/07/26 19:19:02 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String). 
12/07/26 19:19:02 INFO input.FileInputFormat: Total input paths to process : 1 
12/07/26 19:19:02 WARN snappy.LoadSnappy: Snappy native library not loaded 
12/07/26 19:19:02 INFO mapred.JobClient: Running job: job_local_0001 
should setup context 
12/07/26 19:19:02 INFO mapred.MapTask: io.sort.mb = 100 
12/07/26 19:19:02 INFO mapred.MapTask: data buffer = 79691776/99614720 
12/07/26 19:19:02 INFO mapred.MapTask: record buffer = 262144/327680 
12/07/26 19:19:02 INFO mapred.MapTask: Starting flush of map output 
12/07/26 19:19:02 INFO mapred.MapTask: Finished spill 0 
12/07/26 19:19:02 INFO mapred.Task: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting 
12/07/26 19:19:02 INFO mapred.LocalJobRunner: 
12/07/26 19:19:02 INFO mapred.Task: Task attempt_local_0001_m_000000_0 is allowed to commit now 
should commit task 
12/07/26 19:19:02 INFO mapred.LocalJobRunner: 
12/07/26 19:19:02 INFO mapred.Task: Task 'attempt_local_0001_m_000000_0' done. 
should setup context 
12/07/26 19:19:02 INFO mapred.LocalJobRunner: 
12/07/26 19:19:02 INFO mapred.Merger: Merging 1 sorted segments 
12/07/26 19:19:02 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 20 bytes 
12/07/26 19:19:02 INFO mapred.LocalJobRunner: 
12/07/26 19:19:02 WARN mapred.FileOutputCommitter: Output path is null in cleanup 
12/07/26 19:19:02 WARN mapred.LocalJobRunner: job_local_0001 
java.lang.IllegalArgumentException: Unable to connect to MongoDB Output Collection. 
    at com.mongodb.hadoop.util.MongoConfigUtil.getOutputCollection(MongoConfigUtil.java:272) 
    at com.mongodb.hadoop.MongoOutputFormat.getRecordWriter(MongoOutputFormat.java:41) 
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:559) 
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:414) 
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:256) 
Caused by: java.lang.IllegalArgumentException: Unable to connect to collection: null 
    at com.mongodb.hadoop.util.MongoConfigUtil.getCollection(MongoConfigUtil.java:262) 
    at com.mongodb.hadoop.util.MongoConfigUtil.getOutputCollection(MongoConfigUtil.java:269) 
    ... 4 more 
Caused by: java.lang.NullPointerException 
    at com.mongodb.Mongo$Holder._toKey(Mongo.java:679) 
    at com.mongodb.Mongo$Holder.connect(Mongo.java:657) 
    at com.mongodb.hadoop.util.MongoConfigUtil.getCollection(MongoConfigUtil.java:259) 
    ... 5 more 
12/07/26 19:19:03 INFO mapred.JobClient: map 100% reduce 0% 
12/07/26 19:19:03 INFO mapred.JobClient: Job complete: job_local_0001 
12/07/26 19:19:03 INFO mapred.JobClient: Counters: 14 
12/07/26 19:19:03 INFO mapred.JobClient: FileSystemCounters 
12/07/26 19:19:03 INFO mapred.JobClient:  FILE_BYTES_READ=219 
12/07/26 19:19:03 INFO mapred.JobClient:  HDFS_BYTES_READ=11 
12/07/26 19:19:03 INFO mapred.JobClient:  FILE_BYTES_WRITTEN=57858 
12/07/26 19:19:03 INFO mapred.JobClient: Map-Reduce Framework 
12/07/26 19:19:03 INFO mapred.JobClient:  Reduce input groups=0 
12/07/26 19:19:03 INFO mapred.JobClient:  Combine output records=1 
12/07/26 19:19:03 INFO mapred.JobClient:  Map input records=1 
12/07/26 19:19:03 INFO mapred.JobClient:  Reduce shuffle bytes=0 
12/07/26 19:19:03 INFO mapred.JobClient:  Reduce output records=0 
12/07/26 19:19:03 INFO mapred.JobClient:  Spilled Records=1 
12/07/26 19:19:03 INFO mapred.JobClient:  Map output bytes=16 
12/07/26 19:19:03 INFO mapred.JobClient:  Combine input records=1 
12/07/26 19:19:03 INFO mapred.JobClient:  Map output records=1 
12/07/26 19:19:03 INFO mapred.JobClient:  SPLIT_RAW_BYTES=133 
12/07/26 19:19:03 INFO mapred.JobClient:  Reduce input records=0 

Modeled on code that loads data from MongoDB into MongoDB, I am executing the following code:

import java.io.IOException;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

import com.mongodb.hadoop.MongoOutputFormat;
import com.mongodb.hadoop.util.MongoConfigUtil;

public class WordCountH2M { 
    private static final Log log = LogFactory.getLog(WordCountH2M.class); 
    public static class TokenizerMapper extends Mapper<LongWritable, Text ,Text, IntWritable> { 
     static IntWritable one = new IntWritable(1); 
     public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException { 
      context.write(new Text(value.toString()),one); 
     } 
    } 

    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> { 
     // The reduce method must take Iterable<IntWritable> to override
     // Reducer.reduce; a single-IntWritable signature is never called
     // by the framework.
     public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException { 
      int sum = 0; 
      for (IntWritable value : values) { 
       sum += value.get(); 
      } 
      context.write(key, new IntWritable(sum)); 
     } 
    } 

    public static void main(String[] args) { 
     try { 
      final Configuration conf = new Configuration(); 
      final Job job = new Job(conf, "word count"); 
      job.setJarByClass(WordCountH2M.class); 
      FileInputFormat.addInputPath(job, new Path("hdfs:****user/user1/input-data/samplefile/file.txt")); 
      MongoConfigUtil.setOutputURI(conf, "mongodb://127.0.0.1:12333/test.ss1"); 
      job.setMapperClass(TokenizerMapper.class); 
      job.setReducerClass(IntSumReducer.class); 
      job.setOutputKeyClass(Text.class); 
      job.setOutputValueClass(IntWritable.class); 
      job.setInputFormatClass(TextInputFormat.class); 
      job.setOutputFormatClass(MongoOutputFormat.class); 
      System.exit(job.waitForCompletion(true) ? 0 : 1); 
     } catch (Exception e) { 
      System.out.println(e.getMessage()); 
     } 
    } 
} 
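One detail in this code is worth flagging: 'MongoConfigUtil.setOutputURI' is called on 'conf' after the 'Job' has already been constructed. 'new Job(conf, ...)' copies the configuration at construction time, so a property set on 'conf' afterwards never reaches the job, which would leave the output collection URI null at reduce time, exactly as the stack trace reports. A minimal sketch of the reordered setup (same URI as above; that this reordering alone clears the error is an assumption):

    // Sketch: set the Mongo output URI before constructing the Job, because
    // new Job(conf, ...) takes a copy of the Configuration at that point.
    final Configuration conf = new Configuration();
    MongoConfigUtil.setOutputURI(conf, "mongodb://127.0.0.1:12333/test.ss1");
    final Job job = new Job(conf, "word count");

    // Equivalent alternative: mutate the job's own copy after construction.
    // MongoConfigUtil.setOutputURI(job.getConfiguration(), "mongodb://127.0.0.1:12333/test.ss1");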

Are you sure that 'MongoConfigUtil.setOutputURI(conf, "mongodb://127.0.0.1:12333/test.ss1");' is correct? It says it cannot connect to the output collection. Is mongodb running on the host and port you listed? – ranman 2012-07-30 17:47:29
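Following up on that comment: 12333 is not the default mongod port (27017), so it is worth confirming that a mongod is actually listening there. A minimal standalone check, sketched against the 2.x Java driver of that era (host, port, and database name are taken from the question's URI):

    import com.mongodb.DB;
    import com.mongodb.Mongo;

    public class MongoPing {
        public static void main(String[] args) throws Exception {
            // Same host/port as the job's output URI.
            Mongo mongo = new Mongo("127.0.0.1", 12333);
            DB db = mongo.getDB("test");
            // Forces a real round trip; throws if the server is unreachable.
            System.out.println("Collections: " + db.getCollectionNames());
            mongo.close();
        }
    }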

Answer


I ran into a similar problem; it was solved by running the Spark job as the spark user instead of as root.

Hope this helps.

Thanks,

John