2013-03-07 50 views

EOF exception when using a custom data type in Hadoop

This is my first time using a custom data type in Hadoop. Here is my code:

Custom data type:

public class TwitterData implements Writable { 

private Long id; 
private String text; 
private Long createdAt; 

public TwitterData(Long id, String text, Long createdAt) { 
    super(); 
    this.id = id; 
    this.text = text; 
    this.createdAt = createdAt; 
} 

public TwitterData() { 
    this(new Long(0L), new String(), new Long(0L)); 
} 

@Override 
public void readFields(DataInput in) throws IOException { 
    System.out.println("In readFields..."); 
    id = in.readLong(); 
    text = in.readLine(); 
    createdAt = in.readLong(); 
} 

@Override 
public void write(DataOutput out) throws IOException { 
    System.out.println("In write..."); 
    out.writeLong(id); 
    out.writeChars(text); 
    out.writeLong(createdAt); 
} 

public Long getId() { 
    return id; 
} 

public void setId(Long id) { 
    this.id = id; 
} 

public String getText() { 
    return text; 
} 

public void setText(String text) { 
    this.text = text; 
} 

public Long getCreatedAt() { 
    return createdAt; 
} 

public void setCreatedAt(Long createdAt) { 
    this.createdAt = createdAt; 
} 
} 

Mapper:

public class Map extends Mapper<Object, BSONObject, Text, TwitterData>{ 

@Override 
public void map(Object key, BSONObject value, Context context) throws IOException, InterruptedException { 
    BSONObject user = (BSONObject) value.get("user"); 
    String location = (String) user.get("location"); 

    TwitterData twitterData = new TwitterData((Long) value.get("id"), 
      (String) value.get("text"), (Long) value.get("createdAt")); 

    if(location.toLowerCase().indexOf("india") != -1) { 
     context.write(new Text("India"), twitterData); 
    } else { 
     context.write(new Text("Other"), twitterData); 
    } 
} 
} 

Main job code:

job.setMapOutputKeyClass(Text.class); 
job.setMapOutputValueClass(TwitterData.class); 

I get the EOF exception after the map phase. I can't figure out why this error shows up. Can anyone help me? Thanks in advance.

Answer


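You write chars, but you read a line. Those are two different serialization formats: writeChars() emits two bytes per character with no length prefix and no terminator, while readLine() consumes bytes until it finds a line break or the end of the stream. It therefore swallows the createdAt bytes as well, and the following readLong() fails with an EOFException.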

What you need to do is use a matching pair, such as writeUTF() and readUTF():

@Override 
public void readFields(DataInput in) throws IOException { 
    id = in.readLong(); 
    text = in.readUTF(); 
    createdAt = in.readLong(); 
} 

@Override 
public void write(DataOutput out) throws IOException { 
    out.writeLong(id); 
    out.writeUTF(text); 
    out.writeLong(createdAt); 
} 
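
To see the mismatch outside Hadoop, here is a minimal sketch that uses plain java.io streams in place of the DataInput/DataOutput objects Hadoop passes in. The class and method names (SerializationRoundTrip, writeWithChars, writeWithUtf, readLineHitsEof, readUtfText) are made up for this illustration and are not part of the question's code or of Hadoop.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

class SerializationRoundTrip {

    // Same field order as the question's write(): writeChars emits 2 bytes
    // per char, with no length prefix and no terminator.
    static byte[] writeWithChars(long id, String text, long createdAt) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeLong(id);
            out.writeChars(text);
            out.writeLong(createdAt);
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Same field order as the answer's write(): writeUTF emits a 2-byte
    // length followed by the modified UTF-8 bytes of the string.
    static byte[] writeWithUtf(long id, String text, long createdAt) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeLong(id);
            out.writeUTF(text);
            out.writeLong(createdAt);
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Same calls as the question's readFields(); returns true if the final
    // readLong() runs off the end of the data.
    @SuppressWarnings("deprecation")
    static boolean readLineHitsEof(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            in.readLong();
            in.readLine(); // reads until '\n', '\r' or end of stream
            in.readLong();
            return false;
        } catch (EOFException e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Same calls as the answer's readFields(); returns the decoded text field.
    static String readUtfText(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            in.readLong();
            String text = in.readUTF();
            in.readLong();
            return text;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bad = writeWithChars(1L, "hello", 2L);
        System.out.println("writeChars + readLine hits EOF: " + readLineHitsEof(bad));

        byte[] good = writeWithUtf(1L, "hello", 2L);
        System.out.println("writeUTF + readUTF text: " + readUtfText(good));
    }
}
```

Two things to keep in mind with writeUTF(): it throws a NullPointerException if text is null, and a UTFDataFormatException if the encoded string is longer than 65535 bytes. For tweets the size limit should not matter, but a null text field coming out of the BSONObject would need to be handled before writing.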

Thanks Thomas, it worked for me. So what is readLine() for when there is no writeLine()? The same question goes for writeChars(). – 2013-03-07 09:34:18


@AbhendraSingh It is for when you read from a text file and go through it line by line. Writing a line would probably be writeUTF + '\n'. – 2013-03-07 11:53:01
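
As a rough illustration of the line-by-line case, here is a sketch that writes text lines terminated by '\n' and reads them back with readLine(). Note that it swaps in a plain Writer and BufferedReader rather than writeUTF(): writeUTF() puts a 2-byte length in front of the string, and DataInputStream.readLine() would hand those two bytes back as part of the line. The names LineRoundTrip, writeLines and readLines are hypothetical.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LineRoundTrip {

    // Writes each string as UTF-8 text followed by a '\n' terminator.
    static byte[] writeLines(List<String> lines) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (Writer w = new OutputStreamWriter(buf, StandardCharsets.UTF_8)) {
                for (String line : lines) {
                    w.write(line);
                    w.write('\n');
                }
            }
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the data back one line at a time; readLine() strips the terminator
    // and returns null at the end of the stream.
    static List<String> readLines(byte[] data) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), StandardCharsets.UTF_8))) {
            String line;
            while ((line = r.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> tweets = Arrays.asList("first tweet", "second tweet");
        System.out.println(readLines(writeLines(tweets)));
    }
}
```

This only works as long as the text itself contains no line breaks, which is one reason a length-prefixed format such as writeUTF()/readUTF() is the safer choice inside a Writable.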