2017-02-25

I have a CSV file uploaded to HDFS, and I am using the opencsv parser to read the data. My jar file is on the Hadoop classpath, and I have also uploaded it to HDFS at /jars/opencsv-3.9.jar. The error I get (attached below) is a ClassNotFoundException for the CSV class.

Here is my code snippet:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

import com.opencsv.CSVParser;

public class TermLabelledPapers {

    public static class InputMapper extends Mapper<LongWritable, Text, Text, Text> {

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {

            CSVParser parser = new CSVParser();
            String[] lines = parser.parseLine(value.toString());
            //readEntry.readHeaders();
            String doi = lines[0];
            String keyphrases = lines[3];

            Get g = new Get(Bytes.toBytes(doi.toString()));
            context.write(new Text(doi), new Text(keyphrases));
        }
    }

    public static class PaperEntryReducer extends TableReducer<Text, Text, ImmutableBytesWritable> {

        @Override
        protected void reduce(Text doi, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {

        }
    }

    public static void main(String[] args) throws Exception {

        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "172.17.25.18");
        conf.set("hbase.zookeeper.property.clientPort", "2183");
        //add the external jar to hadoop distributed cache
        //addJarToDistributedCache(CsvReader.class, conf);

        Job job = new Job(conf, "TermLabelledPapers");
        job.setJarByClass(TermLabelledPapers.class);
        job.setMapperClass(InputMapper.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);
        job.addFileToClassPath(new Path("/jars/opencsv-3.9.jar"));
        FileInputFormat.setInputPaths(job, new Path(args[0])); // "metadata.csv"

        TableMapReduceUtil.initTableReducerJob("PaperBagofWords", PaperEntryReducer.class, job);
        job.setReducerClass(PaperEntryReducer.class);
        job.waitForCompletion(true);
    }
}
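Aside from the classpath question, the mapper's parsing step can be sanity-checked off-cluster. The sketch below is a simplified stand-in for what `parseLine` does on one CSV line (it handles double-quoted fields containing commas, but not escaped quotes), not opencsv's actual implementation:

```java
import java.util.ArrayList;
import java.util.List;

public class SimpleCsvSplit {

    // Simplified stand-in for a CSV parser's parseLine():
    // splits one line on commas, honouring double-quoted fields.
    public static String[] parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inQuotes = false;
        for (char c : line.toCharArray()) {
            if (c == '"') {
                inQuotes = !inQuotes;        // toggle quoted state, drop the quote
            } else if (c == ',' && !inQuotes) {
                fields.add(cur.toString());  // unquoted comma ends the field
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        fields.add(cur.toString());          // last field has no trailing comma
        return fields.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Hypothetical row shaped like the metadata.csv the question describes.
        String[] f = parseLine("10.1000/xyz,Title,Authors,\"term1, term2\"");
        System.out.println(f[0]); // prints 10.1000/xyz
        System.out.println(f[3]); // prints term1, term2
    }
}
```

Running a test like this in a plain JVM (no cluster) separates "my parsing logic is wrong" from "the task JVM cannot load the parser class", which is what the stack trace below points to.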

After running the job, the error shown is:

Error: java.lang.ClassNotFoundException: com.csvreader.CsvReader 
at java.net.URLClassLoader.findClass(URLClassLoader.java:381) 
at java.lang.ClassLoader.loadClass(ClassLoader.java:424) 
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331) 
at java.lang.ClassLoader.loadClass(ClassLoader.java:357) 
at mcad.TermLabelledPapers$InputMapper.map(TermLabelledPapers.java:69) 
at mcad.TermLabelledPapers$InputMapper.map(TermLabelledPapers.java:1) 
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146) 
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787) 
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) 
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:422) 
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) 
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) 

Did you add it to the Hadoop classpath? Then check with the `hadoop classpath` command to make sure it is actually there. –
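That check can be scripted; a sketch, where `CP` is a made-up stand-in for the real `hadoop classpath` output (on a cluster you would pipe `hadoop classpath` directly into `tr`):

```shell
# Split the colon-separated classpath onto lines and search for the jar.
# CP is a hypothetical example string, not real output.
CP="/etc/hadoop/conf:/usr/lib/hadoop/lib/guava-11.0.2.jar:/jars/opencsv-3.9.jar"
echo "$CP" | tr ':' '\n' | grep -i opencsv
# → /jars/opencsv-3.9.jar
```

If the grep prints nothing on the real cluster, the jar is not on the classpath of the submitting JVM, though note that task JVMs on worker nodes have their own classpath.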

Answer


Ideally, this error should not occur if the jar is on the Hadoop classpath. If yours is a Maven project, you can try building a jar-with-dependencies, which bundles all dependent jars together with your own classes. That can help diagnose the problem.
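The jar-with-dependencies mentioned above can be produced with the `maven-assembly-plugin`; a minimal pom.xml sketch (plugin coordinates are standard, but verify the version against your Maven setup):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-assembly-plugin</artifactId>
  <configuration>
    <descriptorRefs>
      <!-- predefined descriptor that unpacks all dependencies into one jar -->
      <descriptorRef>jar-with-dependencies</descriptorRef>
    </descriptorRefs>
  </configuration>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>single</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```

After `mvn package`, a `*-jar-with-dependencies.jar` appears in `target/`; submitting that jar with `hadoop jar` means opencsv's classes ship inside your job jar, so the task JVMs no longer depend on `/jars/opencsv-3.9.jar` being resolvable.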