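Parsing a text file in Java to get a HashMap of fields

I am trying to parse multiple files and split each one into a set of fields stored in a HashMap. Here is a sample file: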
COCONUT OIL CONTRACT TO CHANGE - DUTCH TRADERS
ROTTERDAM, March 18 - Contract terms for trade in coconut
oil are to be changed from long tons to tonnes with effect from
the Aug/Sep contract onwards, Dutch vegetable oil traders said.
Operators have already started to take account of the
expected change and reported at least one trade in tonnes for
Aug/Sept shipment yesterday.
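I need the program to parse this document into a custom Document class with the following keys: file name, file title, place, date, author, content, and category.
Here is what I have tried: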
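import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public static Document parse(String filename) throws IOException {
    File f = new File(filename);
    if (f.isFile()) {
        // Strip the extension to get the file ID; fall back to the full name.
        String fileId = filename;
        if (filename.indexOf(".") > 0) {
            fileId = filename.substring(0, filename.lastIndexOf("."));
        }
        // The parent directory path is used as the category.
        String category = f.getParent();
        // try-with-resources closes the stream even if read() throws.
        try (InputStream in = new FileInputStream(f)) {
            byte[] buf = new byte[1024];
            int len = in.read(buf);
            while (len > 0) {
                // ..........
                len = in.read(buf); // read the next chunk so the loop can end
            }
        }
    }
    return null;
}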
I'm sorry, what are you trying to accomplish here? :O – 2014-09-19 19:18:44
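Well, it's a start, but it will be hard to continue in the same way. If I were you, I would stop writing code now and first work out the high-level steps that need to be taken. Write those steps down on a piece of paper: '1. Read the file completely into a string. 2. Extract the file title...' and so on. Then you can start coding one step at a time, testing the result after each step. – biziclop 2014-09-19 19:20:17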
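
The step-by-step approach suggested in the comment could look roughly like the sketch below. It is not the asker's code, and it rests on assumptions: the class name ArticleParser, the method names, and the map keys ("title", "place", "date", "content", "fileId", "category") are all hypothetical, and the dateline layout "PLACE, Month Day - text" is inferred from the single sample file, so other files may need a different pattern. The sample has no author line, so no author field is extracted here.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ArticleParser {

    // Assumed dateline layout, inferred from one sample: "PLACE, Month Day - text".
    // DOTALL lets the last group span the remaining lines of the article.
    private static final Pattern DATELINE = Pattern.compile(
            "^([^,\\n]+),\\s*(\\w+\\s+\\d{1,2})\\s+-\\s+(.*)$", Pattern.DOTALL);

    // Step 1: read the file completely into a string.
    static String readAll(Path path) throws IOException {
        return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
    }

    // Step 2 onwards: extract the title, then place, date and content.
    static Map<String, String> parseText(String text) {
        Map<String, String> fields = new HashMap<>();
        String normalized = text.replace("\r\n", "\n").trim();

        // The first line is treated as the title.
        int firstBreak = normalized.indexOf('\n');
        if (firstBreak < 0) {
            fields.put("title", normalized);
            return fields;
        }
        fields.put("title", normalized.substring(0, firstBreak).trim());

        // Everything after the title is the body, which starts with the dateline.
        String body = normalized.substring(firstBreak + 1).trim();
        Matcher m = DATELINE.matcher(body);
        if (m.matches()) {
            fields.put("place", m.group(1).trim());
            fields.put("date", m.group(2).trim());
            fields.put("content", m.group(3).trim());
        } else {
            // No recognizable dateline: keep the whole body as content.
            fields.put("content", body);
        }
        return fields;
    }

    // Combines the steps and adds the fields derived from the file path.
    static Map<String, String> parseFile(String filename) throws IOException {
        Path path = Paths.get(filename);
        Map<String, String> fields = parseText(readAll(path));

        // File ID: the file name without its extension.
        String name = path.getFileName().toString();
        int dot = name.lastIndexOf('.');
        fields.put("fileId", dot > 0 ? name.substring(0, dot) : name);

        // Category: the name of the parent directory, if there is one (an assumption).
        Path parent = path.getParent();
        if (parent != null && parent.getFileName() != null) {
            fields.put("category", parent.getFileName().toString());
        }
        return fields;
    }
}
```

Because parseText works on a plain string, each step can be tested without touching the disk. For the sample file above it should yield the first line as the title, "ROTTERDAM" as the place and "March 18" as the date. Copying the map entries into the custom Document class is left out, since that class is not shown in the question.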