
I want to create a new data type in Hadoop, but I am getting the following error from my custom InputFormat class. My code is below.

Error: WholeFileRecordReader cannot be cast to org.apache.hadoop.mapred.RecordReader

Code:

import java.io.IOException;

import org.apache.hadoop.fs.FileSystem; 
import org.apache.hadoop.fs.Path; 
import org.apache.hadoop.io.LongWritable; 
import org.apache.hadoop.io.Text; 
import org.apache.hadoop.mapred.FileInputFormat; 
import org.apache.hadoop.mapred.FileSplit; 
import org.apache.hadoop.mapred.InputSplit; 
import org.apache.hadoop.mapred.JobConf; 
import org.apache.hadoop.mapred.RecordReader; 
import org.apache.hadoop.mapred.Reporter; 
import org.apache.hadoop.mapred.TaskAttemptContext; 



public class wholeFileInputFormat extends FileInputFormat<Text, apriori>{ 

public RecordReader<Text, apriori> getRecordReader(
      InputSplit input, JobConf job, Reporter reporter) 
      throws IOException { 

     reporter.setStatus(input.toString()); 

    return (RecordReader<Text, apriori>) new WholeFileRecordReader(job, (FileSplit) input); 

     } 

} 

My custom record reader is as follows:

import java.io.FileInputStream; 
import java.io.FileNotFoundException; 
import java.io.IOException; 
import java.io.InputStream; 

import org.apache.hadoop.conf.Configuration; 
import org.apache.hadoop.fs.FileSystem; 
import org.apache.hadoop.fs.Path; 
import org.apache.hadoop.io.IOUtils; 
import org.apache.hadoop.io.LongWritable; 
import org.apache.hadoop.io.Text; 
import org.apache.hadoop.mapred.FileSplit; 
import org.apache.hadoop.mapred.JobConf; 
import org.apache.hadoop.mapreduce.InputSplit; 
import org.apache.hadoop.mapreduce.RecordReader; 
import org.apache.hadoop.mapreduce.TaskAttemptContext; 

class WholeFileRecordReader extends RecordReader<Text, apriori> { 


private FileSplit fileSplit; 
private Configuration conf; 
private InputStream in; 
private Text key = new Text(""); 
private apriori value = new apriori(); 
private boolean processed = false; 


public void initialize(JobConf job, FileSplit split) 
     throws IOException { 

    this.fileSplit = split; 
    this.conf = job; 
    final Path file = fileSplit.getPath(); 
    String StringPath = new String(fileSplit.getPath().toString()); 
    String StringPath2 = new String(); 
    StringPath2 = StringPath.substring(5); 
    System.out.println(StringPath2); 
    in = new FileInputStream(StringPath2); 

    FileSystem fs = file.getFileSystem(conf); 
    in = fs.open(file); 
    } 


public boolean nextKeyValue() throws IOException, InterruptedException { 
    if (!processed) { 
     byte[] contents = new byte[(int) fileSplit.getLength()]; 
     Path file = fileSplit.getPath(); 
     key.set(file.getName()); 

     try { 
      IOUtils.readFully(in, contents, 0, contents.length); 
      value.set(contents, 0, contents.length); 
     } finally { 
      IOUtils.closeStream(in); 
     } 

     processed = true; 
     return true; 
    } 

    return false; 
} 

@Override 
public Text getCurrentKey() throws IOException, InterruptedException { 
    return key; 
} 

@Override 
public apriori getCurrentValue() throws IOException, InterruptedException { 
    return value; 
} 

@Override 
public float getProgress() throws IOException { 
    return processed ? 1.0f : 0.0f; 
} 

@Override 
public void close() throws IOException { 
    // Do nothing 
} 

@Override 
public void initialize(InputSplit arg0, TaskAttemptContext arg1) 
     throws IOException, InterruptedException { 
    // TODO Auto-generated method stub 

} 

} 

Could you share the fully qualified name of the WholeFileRecordReader class? – donut


Hi donut, I have edited my question. – user2758378

Answers


The WholeFileRecordReader class is a subclass of org.apache.hadoop.mapreduce.RecordReader, and that class cannot be cast to org.apache.hadoop.mapred.RecordReader. Could you try using the same API in both classes?
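For reference, the MRv1 org.apache.hadoop.mapred.RecordReader is an interface with a pull-style next(key, value) contract, not a base class, so a reader written against it looks quite different. Below is a minimal skeleton of that shape, assuming the asker's existing apriori writable type; the actual file-reading logic is omitted.

import java.io.IOException;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.RecordReader;

// Sketch only: the methods WholeFileRecordReader would have to implement
// if it stayed on the MRv1 API that wholeFileInputFormat uses.
class WholeFileRecordReaderV1 implements RecordReader<Text, apriori> {

    private boolean processed = false;

    @Override
    public boolean next(Text key, apriori value) throws IOException {
        if (processed) {
            return false;
        }
        // Read the whole split into `value` here (omitted for brevity).
        processed = true;
        return true;
    }

    @Override
    public Text createKey() { return new Text(); }

    @Override
    public apriori createValue() { return new apriori(); }

    @Override
    public long getPos() throws IOException { return 0; }

    @Override
    public float getProgress() throws IOException { return processed ? 1.0f : 0.0f; }

    @Override
    public void close() throws IOException { }
}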

Under the rules of the Java programming language, only classes or interfaces (collectively, types) from the same type hierarchy can be cast to one another. If you try to cast two objects that do not share a type hierarchy, i.e. there is no parent-child relationship between them, you get a compile-time error. You can refer to this link.
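As a minimal illustration of that rule (the classes here are made up purely for the example):

// Hypothetical types, only to illustrate the casting rule.
class Animal { }
class Dog extends Animal { }
class Car { }

public class CastDemo {
    public static void main(String[] args) {
        Animal a = new Dog();
        Dog d = (Dog) a;     // allowed: Dog and Animal share a type hierarchy
        // Car c = (Car) a;  // compile-time error ("inconvertible types"):
        //                   // Animal and Car have no parent-child relationship
        System.out.println(d);
    }
}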


The error comes from a package mismatch.

In your code you have combined MRv1 and MRv2, which is why you are getting the error.

The package org.apache.hadoop.mapred is MRv1 (MapReduce version 1), while org.apache.hadoop.mapreduce is MRv2 (MapReduce version 2).

Your code mixes imports from both:

import org.apache.hadoop.mapred.FileSplit; 
import org.apache.hadoop.mapred.JobConf; 
import org.apache.hadoop.mapreduce.InputSplit; 
import org.apache.hadoop.mapreduce.RecordReader; 
import org.apache.hadoop.mapreduce.TaskAttemptContext; 

Take all of your imports from either org.apache.hadoop.mapred (MRv1) or org.apache.hadoop.mapreduce (MRv2), not from both. An example staying entirely on MRv2 follows.
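A minimal sketch of the input format on MRv2, under two assumptions: the asker's apriori writable and WholeFileRecordReader classes are kept, and the setup currently in initialize(JobConf, FileSplit) is moved into the reader's initialize(InputSplit, TaskAttemptContext) override, which is currently an empty stub.

import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

public class WholeFileInputFormat extends FileInputFormat<Text, apriori> {

    @Override
    protected boolean isSplitable(JobContext context, Path file) {
        // Each file becomes exactly one record, so never split it.
        return false;
    }

    @Override
    public RecordReader<Text, apriori> createRecordReader(
            InputSplit split, TaskAttemptContext context)
            throws IOException, InterruptedException {
        WholeFileRecordReader reader = new WholeFileRecordReader();
        reader.initialize(split, context);
        return reader;
    }
}

The job would then register it with job.setInputFormatClass(WholeFileInputFormat.class) on an org.apache.hadoop.mapreduce.Job.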

Hope this helps.
