Problem reading XHTML tags using XPath

I am using XPath to read an XHTML document, and I want to read all the elements inside a <p> tag of the XHTML file. To do that, I am doing something like this:

XPath xpath = XPathFactory.newInstance().newXPath();     
XPathExpression expr = xpath.compile("//p[2]/*");     
Object result = expr.evaluate(doc, XPathConstants.NODESET); 
NodeList nodes = (NodeList) result; 
for (int i = 0; i < nodes.getLength(); i++) { 
    System.out.println("Nodes>>>>>>>>"+nodes.item(i).getNodeValue()); 
}

The XHTML sample looks like this:

<?xml version="1.0" encoding="UTF-8" standalone="no"?> 
<html xmlns="http://www.w3.org/1999/xhtml"> 
    <head><title>test</title></head> 
    <body> 
     <p class="default"> <span style="color: #000000; font-size: 12pt; font-family: sans-serif"> Test Doc</span> </p> 
     <p class="default"> <span style="color: #000000; font-size: 12pt; font-family: sans-serif"> Test Doc1</span> </p> 
     <p class="default"> <span style="color: #000000; font-size: 12pt; font-family: sans-serif"> Test Doc2</span> </p> 
    </body> 
</html>

But I cannot get the nodes inside the <p> tag; execution never even enters the for loop.

Can anyone help me solve this problem?

Thanks in advance.

Source

2011-11-02 user972590

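Try local-name() in the XPath – Kris

I am new to this; could you give a detailed answer? – user972590

Please add an XHTML sample to your question - a complete file, including the html tag - and say what works as you expect and what does not. – Alohci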

Your code tries to print the nodeValue of Element nodes, which is unlikely to be what you want. I expect you want the nodeValue of the Text nodes.
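To make the nodeValue point concrete, here is a minimal sketch. The class name NodeValueDemo and the helper firstSpan are made up for illustration, and the input is a cut-down, namespace-free version of the sample from the question. It only shows how the standard DOM API treats Element and Text nodes.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class NodeValueDemo {
    // Hypothetical helper: parses the XML string and returns its first <span> element.
    static Element firstSpan(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return (Element) doc.getElementsByTagName("span").item(0);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample", e);
        }
    }

    public static void main(String[] args) {
        Element span = firstSpan("<p><span> Test Doc1</span></p>");

        // The DOM defines nodeValue as null for Element nodes.
        System.out.println(span.getNodeValue());
        // The text is held by the child Text node ...
        System.out.println(span.getFirstChild().getNodeValue());
        // ... or can be gathered from all descendants with getTextContent().
        System.out.println(span.getTextContent());
    }
}
```

So even once the loop in the question is entered, getNodeValue() on the selected span elements would print null; getTextContent() is probably what is wanted there.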

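Another problem may be namespaces. It looks like your XPath is trying to match p elements in no namespace, when it should be trying to match p elements in the http://www.w3.org/1999/xhtml namespace.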
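One way to handle the namespace with plain JAXP is sketched below. It assumes the document is parsed with a namespace-aware DocumentBuilderFactory (the question does not show how doc was built). The class name, the select helper, and the prefix "x" are arbitrary choices for illustration; only the namespace URI has to match the document.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class XhtmlNamespaceDemo {
    static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    // Cut-down version of the sample document from the question.
    static final String SAMPLE =
            "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body>"
            + "<p class=\"default\"><span> Test Doc</span></p>"
            + "<p class=\"default\"><span> Test Doc1</span></p>"
            + "</body></html>";

    // Minimal NamespaceContext that binds the prefix "x" to the XHTML namespace.
    static final NamespaceContext XHTML_CONTEXT = new NamespaceContext() {
        @Override public String getNamespaceURI(String prefix) {
            return "x".equals(prefix) ? XHTML_NS : XMLConstants.NULL_NS_URI;
        }
        @Override public String getPrefix(String uri) {
            return XHTML_NS.equals(uri) ? "x" : null;
        }
        @Override public Iterator<String> getPrefixes(String uri) {
            return XHTML_NS.equals(uri)
                    ? Collections.singletonList("x").iterator()
                    : Collections.<String>emptyIterator();
        }
    };

    // Evaluates the expression and returns the text content of each selected node.
    static List<String> select(String xhtml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // namespace awareness is off by default
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(XHTML_CONTEXT);
            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);

            List<String> texts = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                texts.add(nodes.item(i).getTextContent());
            }
            return texts;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Unprefixed p only matches elements in no namespace: nothing is selected.
        System.out.println(select(SAMPLE, "//p[2]/*"));
        // The prefixed name matches the XHTML p elements: the span of the second p.
        System.out.println(select(SAMPLE, "//x:p[2]/*"));
    }
}
```

The first call shows the symptom from the question (an empty node list, so the loop body never runs), while the second call returns the span inside the second paragraph.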

Source

2011-11-02 08:04:32 Alohci

 XPathExpression expr = xpath.compile(".//*[local-name()='p'][@id='ur_id']");

Can you check this? I think it will get you your node. It is also worth visiting http://saxon.sourceforge.net/saxon6.5/expressions.html to learn the basics of XPath expressions.
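Applied to the sample in the question, which has no id attributes, the local-name() idea might look like the sketch below. The class name, the textOf helper, and the expression //*[local-name()='p'][2]/* are adaptations for illustration, not something taken from the original post. The parser is assumed to be namespace-aware.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class LocalNameDemo {
    // Cut-down version of the sample document from the question.
    static final String SAMPLE =
            "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body>"
            + "<p class=\"default\"><span> Test Doc</span></p>"
            + "<p class=\"default\"><span> Test Doc1</span></p>"
            + "</body></html>";

    // Evaluates the expression and returns the text content of each selected node.
    static List<String> textOf(String xhtml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));

            // No NamespaceContext is needed: local-name() ignores the namespace.
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);

            List<String> texts = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                texts.add(nodes.item(i).getTextContent());
            }
            return texts;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Element children of the second p, matched by local name only.
        System.out.println(textOf(SAMPLE, "//*[local-name()='p'][2]/*"));
    }
}
```

This avoids writing a NamespaceContext, at the cost of also matching p elements from any other namespace that might appear in the document.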

Source

2011-11-02 09:34:10 Kris

"//XXX[@attrib='abc']" will select the XXX nodes whose attribute attrib equals 'abc' – Kris

You can use XPathAPI (javadoc) to extract your nodes as a generic Java list.

// needs java.util.HashMap, java.util.List and java.util.Map
// the html prefix used here is bound to the XHTML namespace in the map below
String expr = "//html:p[2]/*";

Map<String, String> ns = new HashMap<String, String>();
ns.put("html", "http://www.w3.org/1999/xhtml");

List<String> nodeValues = XPathAPI.selectNodeListAsStrings(doc, expr, ns);
for (String nodeValue : nodeValues) {
    System.out.println("Nodes>>>>>>>> " + nodeValue);
}

or

List<Node> nodes = XPathAPI.selectListOfNodes(doc, expr, ns);
for (Node node : nodes) {
    System.out.println("Nodes>>>>>>>> " + node.getTextContent());
}

Disclaimer: I am the author of the XPathAPI library.

Source

2011-11-02 10:45:02 gioele
