Java XPath: looping through nodes and extracting specific child node values

From Googling, I understand that it makes more sense to use XPath to extract data from XML than to loop over the DOM.

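At the moment I have implemented a solution using DOM, but the code is verbose, feels untidy and is hard to maintain, so I would like to switch to a cleaner XPath solution.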

Let's say I have this structure:

<products> 
    <product> 
     <title>Some title 1</title> 
     <image>Some image 1</image> 
    </product> 
    <product> 
     <title>Some title 2</title> 
     <image>Some image 2</image> 
    </product> 
    ... 
</products>

I want to be able to run a for loop over each <product> element and, inside that loop, extract the title and image node values.

My code looks like this:

InputStream is = conn.getInputStream();   
DocumentBuilder builder = 
    DocumentBuilderFactory.newInstance().newDocumentBuilder(); 
Document doc = builder.parse(is); 
XPathFactory factory = XPathFactory.newInstance(); 
XPath xpath = factory.newXPath(); 
XPathExpression expr = xpath.compile("/products/product"); 
Object result = expr.evaluate(doc, XPathConstants.NODESET); 
NodeList products = (NodeList) result; 
for (int i = 0; i < products.getLength(); i++) {
    Node n = products.item(i);
    if (n != null && n.getNodeType() == Node.ELEMENT_NODE) {
        Element product = (Element) n;
        // do some DOM navigation to get the title and image
    }
}

Inside my for loop I get each <product> as a Node, which is then cast to an Element.

Can I simply use my XPath instance to compile and run another XPath expression against that Node or Element?

2010-10-22 BoomShaka

Yes, you can always do it like this:

XPathFactory factory = XPathFactory.newInstance(); 
XPath xpath = factory.newXPath(); 
XPathExpression expr = xpath.compile("/products/product"); 
Object result = expr.evaluate(doc, XPathConstants.NODESET); 
// New expression, relative to the context node: finds 'title' within a 'product'.
expr = xpath.compile("title");

NodeList products = (NodeList) result;
for (int i = 0; i < products.getLength(); i++) {
    Node n = products.item(i);
    if (n != null && n.getNodeType() == Node.ELEMENT_NODE) {
        Element product = (Element) n;
        // Evaluate the relative expression with this 'product' as the context node.
        NodeList nodes = (NodeList) expr.evaluate(product, XPathConstants.NODESET);
        System.out.println("TITLE: " + nodes.item(0).getTextContent()); // and here is the title
    }
}

Here I have shown how to extract the value of 'title' as an example. You can get 'image' in the same way.
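
For completeness, here is a minimal, self-contained sketch that pulls both 'title' and 'image' from each product using relative expressions. It assumes the XML is available as a string; the class name ProductXPathDemo and the method extractProducts are made up for illustration and are not part of any library.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, named only for this example.
class ProductXPathDemo {

    // Returns one {title, image} pair per <product> element.
    static List<String[]> extractProducts(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // Compile each expression once and reuse it inside the loop.
        XPathExpression productExpr = xpath.compile("/products/product");
        XPathExpression titleExpr = xpath.compile("title"); // relative to a product
        XPathExpression imageExpr = xpath.compile("image"); // relative to a product

        NodeList products =
            (NodeList) productExpr.evaluate(doc, XPathConstants.NODESET);
        List<String[]> rows = new ArrayList<>();
        for (int i = 0; i < products.getLength(); i++) {
            Node product = products.item(i);
            // evaluate(Object) returns the string value of the first match,
            // or an empty string when the child element is missing.
            String title = titleExpr.evaluate(product);
            String image = imageExpr.evaluate(product);
            rows.add(new String[] { title, image });
        }
        return rows;
    }

    public static void main(String[] args) throws Exception {
        String xml =
            "<products>"
            + "<product><title>Some title 1</title><image>Some image 1</image></product>"
            + "<product><title>Some title 2</title><image>Some image 2</image></product>"
            + "</products>";
        for (String[] row : extractProducts(xml)) {
            System.out.println("TITLE: " + row[0] + ", IMAGE: " + row[1]);
        }
    }
}
```

Using the single-argument evaluate avoids the NodeList cast and the nodes.item(0) null check, at the cost of not being able to tell a missing child from an empty one.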

2010-10-22 11:58:04 Gopi

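I'm not a big fan of this approach, because you have to build a document (which may be expensive) before you can apply XPaths to it.

I have found VTD-XML a lot more efficient when it comes to applying XPaths to documents, because you don't need to load the whole document into memory. Here is some sample code: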

final VTDGen vg = new VTDGen(); 
vg.parseFile("file.xml", false); 
final VTDNav vn = vg.getNav(); 
final AutoPilot ap = new AutoPilot(vn); 

ap.selectXPath("/products/product"); 
while (ap.evalXPath() != -1) {
    System.out.println("PRODUCT:");

    // you could either apply another xpath or simply get the first child
    if (vn.toElement(VTDNav.FIRST_CHILD, "title")) {
        int val = vn.getText();
        if (val != -1) {
            System.out.println("Title: " + vn.toNormalizedString(val));
        }
        vn.toElement(VTDNav.PARENT);
    }
    if (vn.toElement(VTDNav.FIRST_CHILD, "image")) {
        int val = vn.getText();
        if (val != -1) {
            System.out.println("Image: " + vn.toNormalizedString(val));
        }
        vn.toElement(VTDNav.PARENT);
    }
}

See also this article: Faster XPaths with VTD-XML.

2010-10-22 12:49:55 dogbane
