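Java Apache POI: memory leak with Excel files
I need to read about 15,000 Excel files for my thesis. I use Apache POI to open and then analyze them, but after roughly 5,000 files I get the following exception and stack trace: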
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
at org.apache.xmlbeans.impl.store.Cur$CurLoadContext.attr(Cur.java:3044)
at org.apache.xmlbeans.impl.store.Cur$CurLoadContext.attr(Cur.java:3065)
at org.apache.xmlbeans.impl.store.Locale$SaxHandler.startElement(Locale.java:3263)
at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.reportStartTag(Piccolo.java:1082)
at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.parseAttributesNS(PiccoloLexer.java:1822)
at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.parseOpenTagNS(PiccoloLexer.java:1521)
at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.parseTagNS(PiccoloLexer.java:1362)
at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.yylex(PiccoloLexer.java:4682)
at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.yylex(Piccolo.java:1290)
at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.yyparse(Piccolo.java:1400)
at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.parse(Piccolo.java:714)
at org.apache.xmlbeans.impl.store.Locale$SaxLoader.load(Locale.java:3479)
at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:1277)
at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:1264)
at org.apache.xmlbeans.impl.schema.SchemaTypeLoaderBase.parse(SchemaTypeLoaderBase.java:345)
at org.apache.poi.POIXMLTypeLoader.parse(POIXMLTypeLoader.java:92)
at org.openxmlformats.schemas.spreadsheetml.x2006.main.WorksheetDocument$Factory.parse(Unknown Source)
at org.apache.poi.xssf.usermodel.XSSFSheet.read(XSSFSheet.java:173)
at org.apache.poi.xssf.usermodel.XSSFSheet.onDocumentRead(XSSFSheet.java:165)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.parseSheet(XSSFWorkbook.java:417)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.onDocumentRead(XSSFWorkbook.java:382)
at org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:178)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:249)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:302)
at de.spreadsheet_realtions.analysis.WorkbookAnalysis.analyze(WorkbookAnalysis.java:18)
Code (for now it only opens each file and closes it again):
public static void main(String[] args) {
    // start() is an instance method, so it cannot be called directly from static main
    new WorkbookAnalysis().start();
}

public void start() {
    File[] files = getAllFiles(Config.folder);
    // a ratio of 0 disables POI's zip-bomb check entirely
    ZipSecureFile.setMinInflateRatio(0.00);
    for (File f : files) {
        analyze(f);
    }
}

public void analyze(File file) {
    Workbook workbook = null;
    try {
        workbook = new XSSFWorkbook(file); // line 18 in the stack trace
    } catch (Exception e1) {
        e1.printStackTrace();
        return;
    }
    // the code that analyzes the workbook will go here later
    try {
        workbook.close();
    } catch (Exception e) {
        e.printStackTrace();
    }
}
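I also tried OPCPackage.open(file) and got the same result.
What am I doing wrong, or what can I do to fix this? Thanks for your help.
Edit: the following code behaves the same way.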
try (XSSFWorkbook workbook = new XSSFWorkbook(file)){
} catch (Exception e1) {e1.printStackTrace(); return;}
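This could be one very large file that causes the OOM, depending on the memory settings you have defined for the Java process. Can you try running only the file on which the OOM occurs, to see whether that file alone already triggers it? – centic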
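The stack trace does not say which file was being opened, so one way to follow this suggestion is to record the file at the moment the error is thrown. The following is only a minimal sketch using the standard library: OomFileFinder and firstFileCausingOom are hypothetical names, and the per-file step is passed in as a callback standing in for the analyze method from the question. Catching OutOfMemoryError is done here purely for diagnosis; the JVM may be in poor shape afterwards, so the reported file should then be run on its own in a fresh process.

```java
import java.io.File;
import java.util.List;
import java.util.Optional;
import java.util.function.Consumer;

// Hypothetical diagnostic helper, not part of Apache POI.
final class OomFileFinder {

    /**
     * Runs the given step on each file in order and returns the first file
     * on which the step failed with an OutOfMemoryError, if there was one.
     */
    static Optional<File> firstFileCausingOom(List<File> files, Consumer<File> step) {
        Runtime rt = Runtime.getRuntime();
        for (File f : files) {
            try {
                step.accept(f);
            } catch (OutOfMemoryError e) {
                // Report the file and the heap state at the time of failure.
                long usedMb = (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
                long maxMb = rt.maxMemory() / (1024 * 1024);
                System.err.printf("OOM on %s (%d bytes), heap %d of %d MB used%n",
                        f.getPath(), f.length(), usedMb, maxMb);
                return Optional.of(f);
            }
        }
        return Optional.empty();
    }
}
```

With the code from the question, the callback could be something like f -> analyze(f), provided analyze lets the error propagate instead of swallowing it.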
Yes, it is a large file (42 MB), and that is the one being processed :-) Thanks. – MichaD
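An .xlsx file is a zip archive, so a 42 MB file can expand to far more XML than its size on disk suggests, and XSSFWorkbook builds the whole sheet in memory. A rough pre-check, sketched below with only java.util.zip, sums the uncompressed size of the worksheet parts so that oversized workbooks can be skipped or handled separately (for example with a larger -Xmx setting, or with POI's event-based XSSFReader). XlsxSizeCheck and uncompressedSheetBytes are hypothetical names, and the sketch assumes the conventional xl/worksheets/ layout, which the format does not strictly require.

```java
import java.io.File;
import java.io.IOException;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of Apache POI.
final class XlsxSizeCheck {

    /**
     * Returns the summed uncompressed size in bytes of all worksheet XML
     * parts in the archive. Entries whose size is unknown (-1) are ignored.
     */
    static long uncompressedSheetBytes(File xlsx) throws IOException {
        long total = 0;
        try (ZipFile zip = new ZipFile(xlsx)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                String name = entry.getName();
                // Assumption: sheets live under xl/worksheets/ as .xml parts.
                boolean isSheet = name.startsWith("xl/worksheets/") && name.endsWith(".xml");
                if (isSheet && entry.getSize() > 0) {
                    total += entry.getSize();
                }
            }
        }
        return total;
    }
}
```

Any threshold for "too large" depends on the heap size and on POI's per-cell overhead, so it would have to be found by experiment rather than taken from this sketch.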