2013-04-25

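Why is the performance of this Java code so inconsistent?

I am running a small test which, although it is a micro-benchmark, mimics fairly well something we actually do in production.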

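I create a two-dimensional array with 5 columns and 10,000,000 rows, filled with random integers between 0 and 19 (inclusive). I then want to sum all the numbers in the third column for which the value in the second column is even. I do 100 warm-up runs, then time how long another 100 runs take.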

On my machine this takes about 9 seconds the vast majority of the time, but occasionally it takes only 6 seconds.

It does not look like garbage collection or JIT compilation.
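One way to cross-check that observation from inside the program, rather than by reading the -verbose:gc and -XX:+PrintCompilation output, is to read the JVM's management beans before and after the timed loop. The following is only a sketch: the class name `GcJitCheck`, the helper names, and the small placeholder workload are assumptions made for illustration and are not part of the original test.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcJitCheck {
    // Total number of collections reported so far by all collectors.
    static long totalGcCount() {
        long count = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long c = gc.getCollectionCount(); // -1 if the collector does not report it
            if (c > 0) {
                count += c;
            }
        }
        return count;
    }

    // Accumulated JIT compilation time in milliseconds, or -1 if unavailable.
    static long totalJitMillis() {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null || !jit.isCompilationTimeMonitoringSupported()) {
            return -1;
        }
        return jit.getTotalCompilationTime();
    }

    // Placeholder workload (hypothetical): sums the even values of an array.
    static int work(int[] data) {
        int sum = 0;
        for (int v : data) {
            if ((v % 2) == 0) {
                sum += v;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = new int[1000000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i % 20;
        }

        long gcBefore = totalGcCount();
        long jitBefore = totalJitMillis();
        long start = System.nanoTime();

        int result = 0;
        for (int i = 0; i < 100; i++) {
            result += work(data);
        }

        long elapsedMs = (System.nanoTime() - start) / 1000000;
        System.out.println("elapsed ms: " + elapsedMs + " result: " + result);
        System.out.println("collections during timed loop: " + (totalGcCount() - gcBefore));
        if (jitBefore >= 0) {
            System.out.println("JIT ms during timed loop: " + (totalJitMillis() - jitBefore));
        }
    }
}
```

If both deltas stay at or near zero while the elapsed time still varies between launches, that would be consistent with the claim that neither garbage collection nor JIT compilation explains the difference.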

Does anyone have any idea why it is, very occasionally, so much faster?

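I am running the code on Linux with JDK7u11 and the following arguments: -server -XX:+PrintCompilation -Xms500m -Xmx500m -verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails. However, using a variety of different JDKs (from 6 all the way to 8) and removing all of these arguments does not seem to affect the timings significantly.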

Here is the code:

import java.util.ArrayList;
import java.util.Random;

public class JavaPerformanceTest {
    public static void main(String[] args) {
        int numColumns = 5;
        int numRows = 10000000;
        // column-major layout: data[column][row]
        int[][] data = new int[numColumns][numRows];
        Random rand = new Random(1234);
        for (int j = 0; j < numColumns; j++) {
            for (int i = 0; i < numRows; i++) {
                data[j][i] = rand.nextInt(20);
            }
        }
        int warmUp = 100;
        // stores the warm-up results
        ArrayList<Integer> sums = new ArrayList<Integer>();
        System.out.println("warm up " + warmUp + " times");
        long warmUpStart = System.nanoTime();
        for (int i = 0; i < warmUp; i++) {
            sums.add(sum(numRows, data));
        }
        long warmUpEnd = System.nanoTime();
        // elapsed warm-up time in milliseconds
        System.out.println("warm up complete " + (warmUpEnd - warmUpStart)/1000000);
        int numberOfRuns = 100;
        int finalSum = 0;
        long startTime = System.nanoTime();
        for (int i = 0; i < numberOfRuns; i++) {
            finalSum = sum(numRows, data);
        }
        long endTime = System.nanoTime();
        // elapsed time of the timed runs in milliseconds
        long diff = (endTime - startTime)/1000000;
        System.out.println("Time taken: " + diff + " Sum: " + finalSum);
    }

    // sums the third column wherever the second column is even
    public static int sum(int numRows, int[][] columnBased) {
        int sum = 0;
        for (int i = 0; i < numRows; i++) {
            if ((columnBased[1][i] % 2) == 0) {
                sum += columnBased[2][i];
            }
        }
        return sum;
    }
}

Thanks, Nick.

Answers

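There are many possible reasons for the drop in performance, including cache misses and branch prediction failures. I would first make sure your code is optimal, and then repeat the measurement to make sure your results are stable.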

import java.util.ArrayList;
import java.util.Random;

public class JavaPerformanceTest {
    public static void main(String[] args) {
        int numColumns = 5;
        int numRows = 10000000;
        // values are 0-19, so a byte per cell is enough
        byte[][] data = new byte[numColumns][numRows];
        Random rand = new Random(1234);
        for (int j = 0; j < numColumns; j++) {
            for (int i = 0; i < numRows; i++) {
                data[j][i] = (byte) rand.nextInt(20);
            }
        }
        int warmUp = 10;
        ArrayList<Integer> sums = new ArrayList<Integer>();
        System.out.println("warm up " + warmUp + " times");
        long warmUpStart = System.nanoTime();
        for (int i = 0; i < warmUp; i++) {
            sums.add(sum(numRows, data));
        }
        long warmUpEnd = System.nanoTime();
        System.out.println("warm up complete " + (warmUpEnd - warmUpStart)/1000000);
        // repeat the timed batch three times to check that the result is stable
        for (int t = 0; t < 3; t++) {
            int numberOfRuns = 100;
            int finalSum = 0;
            long startTime = System.nanoTime();
            for (int i = 0; i < numberOfRuns; i++) {
                finalSum = sum(numRows, data);
            }
            long endTime = System.nanoTime();
            long diff = (endTime - startTime)/1000000;
            System.out.println("Time taken: " + diff + " Sum: " + finalSum);
        }
    }

    public static int sum(int numRows, byte[][] columnBased) {
        int sum = 0;
        // look up the two columns once, outside the loop
        byte[] col1 = columnBased[1];
        byte[] col2 = columnBased[2];
        for (int i = 0; i < numRows; i++)
            // use multiplication instead of "if" to avoid branch prediction failures
            sum += ((col1[i] + 1) & 1) * col2[i];
        return sum;
    }
}

prints

warm up 10 times 
warm up complete 109 
Time taken: 1006 Sum: 47505460 
Time taken: 1006 Sum: 47505460 
Time taken: 1026 Sum: 47505460 
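The multiplication trick relies on `((k + 1) & 1)` being 1 when `k` is even and 0 when `k` is odd, so each term is either `col2[i]` or 0. A minimal sketch that checks this against the original `if` version on a small random sample follows; the class name `BranchlessCheck`, the method names, and the sample size are assumptions for illustration only.

```java
import java.util.Random;

public class BranchlessCheck {
    // Reference version: the "if" test from the question.
    static int sumBranching(byte[] keys, byte[] values) {
        int sum = 0;
        for (int i = 0; i < keys.length; i++) {
            if ((keys[i] % 2) == 0) {
                sum += values[i];
            }
        }
        return sum;
    }

    // Branchless version: ((k + 1) & 1) is 1 for even k and 0 for odd k.
    static int sumBranchless(byte[] keys, byte[] values) {
        int sum = 0;
        for (int i = 0; i < keys.length; i++) {
            sum += ((keys[i] + 1) & 1) * values[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 1000; // small sample; the benchmark itself uses 10,000,000 rows
        byte[] keys = new byte[n];
        byte[] values = new byte[n];
        Random rand = new Random(1234);
        for (int i = 0; i < n; i++) {
            keys[i] = (byte) rand.nextInt(20);
            values[i] = (byte) rand.nextInt(20);
        }
        int a = sumBranching(keys, values);
        int b = sumBranchless(keys, values);
        System.out.println("branching: " + a + " branchless: " + b + " equal: " + (a == b));
    }
}
```

The two versions also agree for negative keys, because in two's complement the lowest bit of `k + 1` is set exactly when `k` is even.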

In summary: optimizing the code will improve its performance far more than playing with command-line arguments.
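To check stability at a finer grain than one total per batch of 100 runs, each call can be timed on its own and the spread inspected. This is a sketch under assumptions: the class name `PerRunTiming`, the helper `timeRuns`, and the reduced array size are illustrative and not taken from the code above.

```java
import java.util.Arrays;
import java.util.Random;

public class PerRunTiming {
    // Written to after every call so the JIT cannot discard the work.
    static int sink;

    // Same branchless sum as in the answer, on two separate columns.
    static int sum(byte[] col1, byte[] col2) {
        int sum = 0;
        for (int i = 0; i < col1.length; i++) {
            sum += ((col1[i] + 1) & 1) * col2[i];
        }
        return sum;
    }

    // Times each call individually; returns the elapsed nanoseconds, sorted ascending.
    static long[] timeRuns(int runs, byte[] col1, byte[] col2) {
        long[] nanos = new long[runs];
        for (int r = 0; r < runs; r++) {
            long start = System.nanoTime();
            sink += sum(col1, col2);
            nanos[r] = System.nanoTime() - start;
        }
        Arrays.sort(nanos);
        return nanos;
    }

    public static void main(String[] args) {
        int numRows = 1000000; // smaller than the benchmark's 10,000,000 to keep the sketch quick
        byte[] col1 = new byte[numRows];
        byte[] col2 = new byte[numRows];
        Random rand = new Random(1234);
        for (int i = 0; i < numRows; i++) {
            col1[i] = (byte) rand.nextInt(20);
            col2[i] = (byte) rand.nextInt(20);
        }

        timeRuns(10, col1, col2); // warm-up, results discarded
        long[] t = timeRuns(100, col1, col2);
        System.out.println("min us: " + t[0] / 1000
                + " median us: " + t[t.length / 2] / 1000
                + " max us: " + t[t.length - 1] / 1000);
    }
}
```

If the minimum, median, and maximum are close within one launch but shift between launches, the variation is more likely tied to something fixed at JVM start-up than to anything happening during the timed loop.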