2017-06-01 148 views

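Printing actual and predicted class labels using random forest in Java

I have a large data set of 10,000 records, of which 5,000 belong to class 1 and the remaining 5,000 to class -1. I used a random forest and got a good accuracy of over 90%.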

Now, suppose I have an ARFF file:

@relation cds_orf
@attribute start numeric
@attribute end numeric
@attribute score numeric
@attribute orf_coverage numeric
@attribute class {1,-1}
@data
(suppose this contains 5 records)

My output should look like this:

No  Actual_class  Predicted_class
1    1             1
2    1             1
3   -1            -1
4    1            -1
5    1             1

I would like the Java code that prints this output. Thanks. (Note: I have already tried classifier.classifyInstance(), but it throws a NullPointerException.)

Answer


Well, after a lot of research I found the answer myself. The following code does what I asked for and writes the output to another file, orf_out.

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.PrintWriter;
import java.util.Random;

import weka.classifiers.Evaluation;
import weka.classifiers.trees.RandomForest;
import weka.core.Instances;

/**
 * @author samy
 */
public class WekaTest {

    /**
     * Cross-validates a random forest on orf_arff, then writes the true and
     * predicted label of every instance, plus the evaluation statistics, to orf_out.
     *
     * @throws java.lang.Exception
     */
    public static void rfnew() throws Exception {
        int numFolds = 10;

        // Load the ARFF data; the last attribute is the class.
        Instances trainData;
        try (BufferedReader br = new BufferedReader(new FileReader("orf_arff"))) {
            trainData = new Instances(br);
        }
        trainData.setClassIndex(trainData.numAttributes() - 1);

        RandomForest rf = new RandomForest();
        // Older Weka API; as far as I know, Weka 3.8+ uses setNumIterations(100) instead.
        rf.setNumTrees(100);

        // Cross-validation trains internal copies of rf; rf itself stays unbuilt.
        Evaluation evaluation = new Evaluation(trainData);
        evaluation.crossValidateModel(rf, trainData, numFolds, new Random(1));

        // Build rf on the full data set so that classifyInstance() can be called.
        rf.buildClassifier(trainData);

        try (PrintWriter out = new PrintWriter("orf_out")) {
            out.println("No.\tTrue\tPredicted");
            // Note: these are predictions on the training data itself, so they will
            // usually look better than the cross-validated statistics printed below.
            for (int i = 0; i < trainData.numInstances(); i++) {
                String trueClassLabel =
                        trainData.instance(i).toString(trainData.classIndex());

                // Discrete prediction: the index of the predicted class value.
                double predictionIndex = rf.classifyInstance(trainData.instance(i));

                // Get the predicted class label from the predictionIndex.
                String predictedClassLabel =
                        trainData.classAttribute().value((int) predictionIndex);

                out.println((i + 1) + "\t" + trueClassLabel + "\t" + predictedClassLabel);
            }

            // Cross-validated statistics.
            out.println(evaluation.toSummaryString("\nResults\n======\n", true));
            out.println(evaluation.toClassDetailsString());

            // Class indices follow the order declared in the ARFF header:
            // for {1,-1}, index 0 is class 1 and index 1 is class -1.
            for (int c = 0; c < trainData.numClasses(); c++) {
                out.println("Results For Class " + trainData.classAttribute().value(c));
                out.println("Precision= " + evaluation.precision(c));
                out.println("Recall= " + evaluation.recall(c));
                out.println("F-measure= " + evaluation.fMeasure(c));
            }
        }
    }

    public static void main(String[] args) throws Exception {
        rfnew();
    }
}

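I needed to call buildClassifier in my code. Evaluation.crossValidateModel trains copies of the classifier internally and leaves rf itself untrained, so calling classifyInstance without first calling buildClassifier is the likely cause of the NullPointerException.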
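
For anyone wondering what the per-class precision, recall and F-measure figures mean, here is a minimal Weka-free sketch that computes them directly from actual and predicted labels, using the five example rows from the question. The class PerClassMetrics and its methods are hypothetical names made up for this illustration and are not part of Weka. Returning 0.0 when a denominator is zero is my own assumption, and Weka may report such cases differently.

```java
import java.util.Locale;

/** Hypothetical helper (not part of Weka): per-class metrics from label arrays. */
public class PerClassMetrics {

    /** Returns {truePositives, falsePositives, falseNegatives} for one class label. */
    public static int[] counts(String[] actual, String[] predicted, String label) {
        int tp = 0, fp = 0, fn = 0;
        for (int i = 0; i < actual.length; i++) {
            boolean isActual = actual[i].equals(label);
            boolean isPredicted = predicted[i].equals(label);
            if (isActual && isPredicted) {
                tp++;
            } else if (isPredicted) {
                fp++;
            } else if (isActual) {
                fn++;
            }
        }
        return new int[] {tp, fp, fn};
    }

    /** TP / (TP + FP); 0.0 if the class was never predicted (assumption). */
    public static double precision(String[] actual, String[] predicted, String label) {
        int[] c = counts(actual, predicted, label);
        int denom = c[0] + c[1];
        return denom == 0 ? 0.0 : (double) c[0] / denom;
    }

    /** TP / (TP + FN); 0.0 if the class never occurs in the data (assumption). */
    public static double recall(String[] actual, String[] predicted, String label) {
        int[] c = counts(actual, predicted, label);
        int denom = c[0] + c[2];
        return denom == 0 ? 0.0 : (double) c[0] / denom;
    }

    /** Harmonic mean of precision and recall; 0.0 if both are zero (assumption). */
    public static double fMeasure(String[] actual, String[] predicted, String label) {
        double p = precision(actual, predicted, label);
        double r = recall(actual, predicted, label);
        return (p + r) == 0.0 ? 0.0 : 2.0 * p * r / (p + r);
    }

    public static void main(String[] args) {
        // The five example rows from the question.
        String[] actual    = {"1", "1", "-1", "1", "1"};
        String[] predicted = {"1", "1", "-1", "-1", "1"};
        for (String label : new String[] {"1", "-1"}) {
            System.out.printf(Locale.ROOT,
                    "class %s: precision=%.3f recall=%.3f F-measure=%.3f%n",
                    label,
                    precision(actual, predicted, label),
                    recall(actual, predicted, label),
                    fMeasure(actual, predicted, label));
        }
    }
}
```

For the five example rows, class 1 has three true positives, no false positives and one false negative, so its precision is 1.0 and its recall is 0.75. Class -1 has one true positive and one false positive, so its precision is 0.5 and its recall is 1.0.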