2014-11-03

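How do I change or split nodes while iterating in Jsoup?

I have an HTML file and the document parsed from it. While processing it I split text nodes, which causes a ConcurrentModificationException.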

private void processInContent(Node ele) {
    String text = "";
    for (Node child : ele.childNodes()) {
        Node parentNode = child.parentNode();
        if (child instanceof TextNode && !("a").equalsIgnoreCase(parentNode.nodeName())) {
            TextNode childText = (TextNode) child;
            text = childText.text();
            System.out.println(text);
            Matcher m = pattern.matcher(text);
            while (m.find()) {
                String matched = null;
                boolean url = false;
                if (m.group(2) != null) {
                    matched = m.group(6);
                } else {
                    break;
                }
                text = childText.text();
                TextNode replaceNode = childText.splitText(text.indexOf(matched));
                TextNode lastNode = replaceNode.splitText(matched.length());
                Element anchorEle = ele.ownerDocument().createElement("a");
                anchorEle.attr("href", "mailto:" + matched);
                anchorEle.attr("target", "_blank");
                anchorEle.text(matched);
                replaceNode.replaceWith(anchorEle);
                childText = lastNode;
            }
        }
    }
}

Sample content

<div id="abc"><br>---- The email address is [email protected]</b> contains abc 
domain email address <br></div> 

I want to wrap the email address in an anchor tag, but doing so results in the exception below.

java.util.ConcurrentModificationException 
     at java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372) 
     at java.util.AbstractList$Itr.next(AbstractList.java:343) 
     at java.util.Collections$UnmodifiableCollection$1.next(Collections.java:1008) 
     at JSOUPParse.processInContent(JSOUPParse.java:253) 
     at JSOUPParse.main(JSOUPParse.java:318) 

Please help me resolve this issue.

Answer

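The problem is caused by adding nodes to the Node ele while you are iterating over its children: splitText and replaceWith both change the child list that the for-each loop is walking. That is not allowed, and it is exactly what java.util.ConcurrentModificationException reports.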
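
The failure is not specific to Jsoup. As a minimal sketch using only java.util (the class and method names here are made up for illustration), the following reproduces it with an unmodifiable view over an ArrayList, which is the same combination that appears in the stack trace above:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.List;

class CmeDemo {
    /** Returns true if changing the backing list mid-iteration triggers the exception. */
    static boolean failsWhenModifiedDuringIteration() {
        List<String> backing = new ArrayList<>(Arrays.asList("a", "b", "c"));
        // Read-only view that still reflects every change made to 'backing'.
        List<String> view = Collections.unmodifiableList(backing);
        try {
            for (String s : view) {
                if (s.equals("a")) {
                    // Structural change behind the iterator's back.
                    backing.add("x");
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            // Thrown by the iterator's next() call after the list was changed.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsWhenModifiedDuringIteration());
    }
}
```

The iterator notices the change on its next call to next(), not at the moment the list is modified, which is why the trace points at the loop header rather than at splitText or replaceWith.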

You can collect the nodes you want to process in one loop, and then make the modifications in a second loop.

private void processInContent(Node ele) {
    // Pass 1: only collect the text nodes; the child list is not modified here.
    ArrayList<Node> toReplace = new ArrayList<Node>();
    for (Node child : ele.childNodes()) {
        Node parentNode = child.parentNode();
        if (child instanceof TextNode && !("a").equalsIgnoreCase(parentNode.nodeName())) {
            toReplace.add(child);
        }
    }
    // Pass 2: iterate over the collected nodes, so splitting and replacing is safe.
    for (Node child : toReplace) {
        TextNode childText = (TextNode) child;
        String text = childText.text();
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            // more code .........
            Element anchorEle = ele.ownerDocument().createElement("a");
            // more code .........
        }
    }
}

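This code does not throw a ConcurrentModificationException, because the second loop iterates over toReplace rather than over the child list that is being modified.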
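
To see the collect-then-modify idea run end to end without Jsoup on the classpath, here is a rough sketch on a plain list of strings standing in for the child nodes. Everything in it is an assumption for illustration: the class TwoPassSplit, the linkify method, and the deliberately simple email regex are not from the question, and the anchor is built as a string rather than as a Jsoup Element.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TwoPassSplit {
    // Simplified, hypothetical email pattern; not the asker's regex.
    private static final Pattern EMAIL = Pattern.compile("[\\w.]+@[\\w.]+\\.\\w+");

    /** Replaces each entry containing emails with text / anchor / text pieces, in place. */
    static List<String> linkify(List<String> children) {
        // Pass 1: snapshot the list so the loop never iterates what it edits.
        List<String> snapshot = new ArrayList<>(children);

        // Pass 2: walk the snapshot and modify the original list freely.
        for (String text : snapshot) {
            Matcher m = EMAIL.matcher(text);
            List<String> pieces = new ArrayList<>();
            int last = 0;
            while (m.find()) {
                if (m.start() > last) {
                    pieces.add(text.substring(last, m.start())); // text before the match
                }
                pieces.add("<a href=\"mailto:" + m.group() + "\">" + m.group() + "</a>");
                last = m.end();
            }
            if (pieces.isEmpty()) {
                continue; // no email in this entry, leave it untouched
            }
            if (last < text.length()) {
                pieces.add(text.substring(last)); // text after the last match
            }
            int at = children.indexOf(text);
            children.remove(at);
            children.addAll(at, pieces);
        }
        return children;
    }

    public static void main(String[] args) {
        List<String> children = new ArrayList<>(
                Arrays.asList("hello ", "mail bob@example.com now", "bye"));
        System.out.println(linkify(children));
    }
}
```

In the Jsoup version the snapshot is the toReplace list, and the remove/addAll step corresponds to the splitText and replaceWith calls from the question.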

Hope this helps.