2010-10-19 100 views

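GC overhead limit exceeded error when reading a text file

I am getting java.lang.OutOfMemoryError: GC overhead limit exceeded while reading from a text file, and I am not sure what is going wrong. I am running my program with sufficient memory. The outer loop iterates 16,000 times, and for each iteration of the outer loop the inner loop iterates about 300,000 times. The error is thrown when the code tries to read a line in the inner loop. Any suggestions would be greatly appreciated. Below is my code snippet: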

//Read from the test data output file till not equals null 
//Reads a single line at a time from the test data 
while((line=br.readLine())!=null) 
{ 
    //Clears the hashmap 
    leastFive.clear(); 

    //Clears the arraylist 
    fiveTrainURLs.clear(); 
    try 
    { 
     StringTokenizer st=new StringTokenizer(line," "); 
     while(st.hasMoreTokens()) 
     { 
      String currentToken=st.nextToken(); 

      if(currentToken.contains("File")) 
      { 
       testDataFileNo=st.nextToken(); 
       String tok=""; 
       while((tok=st.nextToken())!=null) 
       { 
        if (tok==null) break; 

        int topic_no=Integer.parseInt(tok); 
        String prob=st.nextToken(); 

        //Obtains the double value of the probability 
        double double_prob=Double.parseDouble(prob); 
        p1[topic_no]=double_prob; 

       } 
       break; 
      } 
     } 
    } 
    catch(Exception e) 
    { 
    } 

    //Used to read over all the training data file 
    FileReader fr1=new FileReader("/homes/output_train_2000.txt"); 

    BufferedReader br1=new BufferedReader(fr1); 
    String line1=""; 

    //Reads the training data output file,one row at a time 
    //This is the line on which an exception occurs! 
    while((line1=br1.readLine())!=null) 
    { 
     try 
     { 
      StringTokenizer st=new StringTokenizer(line1," "); 

      while(st.hasMoreTokens()) 
      { 
       String currentToken=st.nextToken(); 

       if(currentToken.contains("File")) 
       { 
        trainDataFileNo=st.nextToken(); 
        String tok=""; 
        while((tok=st.nextToken())!=null) 
        { 
         if(tok==null) 
          break; 

         int topic_no=Integer.parseInt(tok); 
         String prob=st.nextToken(); 

         double double_prob=Double.parseDouble(prob); 

         //p2 will contain the probability values of each of the topics based on the indices 
         p2[topic_no]=double_prob; 

        } 
        break; 
       } 
      } 
     } 
     catch(Exception e) 
     { 
      double result=klDivergence(p1,p2); 

      leastFive.put(trainDataFileNo,result); 
     } 
    } 
} 

Answers


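16000 * 300,000 = 4.8 billion. If each token took up only 6 bytes, that alone is over 24GB. When the garbage collector finally kicks in to collect 24GB, it is going to run for a very long time. It seems like you need to break this up into smaller chunks. You could limit your application's memory to something reasonable, such as 1GB, so that the GC kicks in sooner and gets some work done along the way.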
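One way to cut down the repeated work, which is not taken from the answer itself, is to parse the training file once and reuse the parsed rows for every test line, instead of re-reading and re-tokenizing 300,000 lines on each of the 16,000 outer iterations. The sketch below assumes the line layout implied by the question's code (a token containing "File", then a file number, then topic/probability pairs) and assumes the parsed rows fit in the heap. The names `TrainingCache`, `Row`, `parseLine`, `loadAll`, and `topicCount` are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TrainingCache {
    /** One parsed row: the file number plus its per-topic probabilities. */
    static final class Row {
        final String fileNo;
        final double[] probs;
        Row(String fileNo, double[] probs) { this.fileNo = fileNo; this.probs = probs; }
    }

    /** Parses "File <id> <topic> <prob> ..."; returns null if the line has no "File" token. */
    static Row parseLine(String line, int topicCount) {
        StringTokenizer st = new StringTokenizer(line, " ");
        while (st.hasMoreTokens()) {
            if (!st.nextToken().contains("File") || !st.hasMoreTokens()) continue;
            String fileNo = st.nextToken();
            double[] probs = new double[topicCount];
            // Stop when the tokens run out instead of relying on an exception.
            while (st.hasMoreTokens()) {
                int topic = Integer.parseInt(st.nextToken());
                if (!st.hasMoreTokens()) break; // topic index with no probability after it
                probs[topic] = Double.parseDouble(st.nextToken());
            }
            return new Row(fileNo, probs);
        }
        return null;
    }

    /** Reads every line once and keeps only the rows that parsed. */
    static List<Row> loadAll(BufferedReader reader, int topicCount) throws IOException {
        List<Row> rows = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            Row row = parseLine(line, topicCount);
            if (row != null) rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // In-memory sample standing in for the real training file.
        String sample = "File 7 0 0.25 2 0.75\nno marker here\nFile 9 1 1.0\n";
        // try-with-resources closes the reader even if parsing fails.
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            List<Row> rows = loadAll(reader, 3);
            System.out.println(rows.size() + " rows, first file " + rows.get(0).fileNo);
        }
    }
}
```

With the rows cached, the outer loop over test lines would only iterate the in-memory list. If the training data is too large to hold at once, the same `parseLine` helper could be applied to the file in batches. The 1GB cap the answer mentions can be set with the standard `-Xmx1g` JVM option.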


Also, I believe Windows ignores VM max size limits above 1.2GB. – Noah 2011-07-28 22:26:21
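The comment's claim is not verified here. A simple way to see what heap ceiling a running JVM actually ended up with is to ask the runtime directly. The sketch below uses the standard `Runtime.maxMemory()` call; the class name `HeapInfo` and the helper `maxHeapMegabytes` are hypothetical.

```java
public class HeapInfo {
    /** Maximum heap the JVM will attempt to use, in megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap available to this JVM: " + maxHeapMegabytes() + " MB");
    }
}
```

Running it with different `-Xmx` values on the target machine shows whether the requested limit was applied.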
