2016-02-14 60 views
1

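Split method vs. substring and indexOf

I'm writing a program that parses CSV. I use the split method to separate the values into a String array, but I've read some articles saying that using substring and indexOf is faster. I basically wrote out what I would do with those two methods, and it seems like split would be better. Can someone explain how the substring/indexOf approach is better, or whether I'm not using these methods correctly? Here's what I wrote: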

int indexOne = 0, indexTwo;
for (int i = 0; i < 4; i++) // there are four different values in one line
{
    indexTwo = line.indexOf(',', indexOne); // search once and reuse the result
    if (indexTwo != -1)
    {
        lineArr[i] = line.substring(indexOne, indexTwo);
        indexOne = indexTwo + 1;
    }
    else
    {
        lineArr[i] = line.substring(indexOne); // last value has no trailing comma
        break;
    }
}
+0

Could you link some of those articles? –

+0

Consider using lodash or underscore or something similar to handle this kind of thing. – Michael

+1

@AustinD Here's a link: http://demeranville.com/battle-of-the-tokenizers-delimited-text-parser-performance/ Someone posted it in a comment on StackExchange; here's that thread: http://programmers.stackexchange.com/questions/221997/quickest-way-to-split-a-delimited-string-in-java – trevalexandro

Answers

1

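Below is the code taken from the source that ships with Oracle JDK 8 update 73. As you can see in the "fastpath" case, when you pass in a one-character string that is not a regex metacharacter, it falls into a loop that uses indexOf with logic similar to yours.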

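The short answer is that your code is a little faster, but I'll leave it to you to decide whether that is enough to justify avoiding split in your use case.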

Personally, I tend to agree with @pczeus's comment: use split unless you actually have evidence that it is causing a problem.
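To check the equivalence yourself, here is a minimal sketch of a general indexOf/substring splitter compared against split. The class name `ManualSplit`, the helper `splitOnChar`, and the sample row are hypothetical and only for illustration. Neither approach handles quoted CSV fields that contain commas.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ManualSplit {
    // Hypothetical helper: splits on a single delimiter char with indexOf/substring.
    // Empty fields are kept, including trailing ones, like split(",", -1).
    static String[] splitOnChar(String line, char delimiter) {
        List<String> fields = new ArrayList<>();
        int start = 0;
        int next;
        while ((next = line.indexOf(delimiter, start)) != -1) {
            fields.add(line.substring(start, next));
            start = next + 1;
        }
        fields.add(line.substring(start)); // remaining segment after the last delimiter
        return fields.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String line = "id,name,,42"; // made-up sample row
        String[] manual = splitOnChar(line, ',');
        String[] builtIn = line.split(",", -1);
        System.out.println(Arrays.toString(manual));        // [id, name, , 42]
        System.out.println(Arrays.equals(manual, builtIn)); // true
    }
}
```

Because both versions do the same indexOf/substring work for a one-character delimiter, any speed difference comes mostly from split's extra checks and its intermediate list, which is why measuring on your own data matters more than the general claim.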

public String[] split(String regex, int limit) {
    /* fastpath if the regex is a
     (1)one-char String and this character is not one of the
        RegEx's meta characters ".$|()[{^?*+\\", or
     (2)two-char String and the first char is the backslash and
        the second is not the ascii digit or ascii letter.
     */
    char ch = 0;
    if (((regex.value.length == 1 &&
         ".$|()[{^?*+\\".indexOf(ch = regex.charAt(0)) == -1) ||
         (regex.length() == 2 &&
          regex.charAt(0) == '\\' &&
          (((ch = regex.charAt(1))-'0')|('9'-ch)) < 0 &&
          ((ch-'a')|('z'-ch)) < 0 &&
          ((ch-'A')|('Z'-ch)) < 0)) &&
        (ch < Character.MIN_HIGH_SURROGATE ||
         ch > Character.MAX_LOW_SURROGATE))
    {
        int off = 0;
        int next = 0;
        boolean limited = limit > 0;
        ArrayList<String> list = new ArrayList<>();
        while ((next = indexOf(ch, off)) != -1) {
            if (!limited || list.size() < limit - 1) {
                list.add(substring(off, next));
                off = next + 1;
            } else {    // last one
                //assert (list.size() == limit - 1);
                list.add(substring(off, value.length));
                off = value.length;
                break;
            }
        }
        // If no match was found, return this
        if (off == 0)
            return new String[]{this};

        // Add remaining segment
        if (!limited || list.size() < limit)
            list.add(substring(off, value.length));

        // Construct result
        int resultSize = list.size();
        if (limit == 0) {
            while (resultSize > 0 && list.get(resultSize - 1).length() == 0) {
                resultSize--;
            }
        }
        String[] result = new String[resultSize];
        return list.subList(0, resultSize).toArray(result);
    }
    return Pattern.compile(regex).split(this, limit);
}
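One detail in the source above matters for CSV: with a limit of 0 (which is what the one-argument split(",") uses), trailing empty strings are removed from the result. The following sketch illustrates how the limit argument changes the output; the class name `SplitLimitDemo` and the sample row are made up for the example.

```java
import java.util.Arrays;

class SplitLimitDemo {
    public static void main(String[] args) {
        String line = "a,b,,"; // made-up row whose last two fields are empty

        // limit == 0 (same as split(",")): trailing empty strings are dropped
        System.out.println(Arrays.toString(line.split(",")));     // [a, b]
        // negative limit: every field is kept, including trailing empty ones
        System.out.println(Arrays.toString(line.split(",", -1))); // [a, b, , ]
        // positive limit: at most that many pieces, the last one holds the rest
        System.out.println(Arrays.toString(line.split(",", 2)));  // [a, b,,]
        // '|' is a regex metacharacter, so it has to be escaped; the two-char
        // form "\\|" still matches condition (2) of the fastpath comment
        System.out.println(Arrays.toString("a|b".split("\\|")));  // [a, b]
    }
}
```

If a row can end with empty columns and you index into the result by position, passing a negative limit avoids an unexpectedly short array.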