2013-04-17

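I have an XML document that contains a CDATA section. Below is the XML I am trying to load into a Java DOM. The DOM parser does not seem to recognize the CDATA.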

<?xml version="1.0" encoding="utf-8"?><search:Search xmlns:search="Search"><search:Response xmlns="Search"><search:Store xmlns="Search">
<search:Result xmlns="Search">
<search:Properties xmlns="Search">
<email2:ConversationId xmlns:email2="Email2"><![CDATA["B3:5F:18:81:37:4B:E4:4C:97:CE:9A:5A:18:6E:DE:8D:"]]></email2:ConversationId>
<email:Categories xmlns:email="Email"></email:Categories> 
</search:Properties> 
</search:Result> 
</search:Store> 
</search:Response> 
</search:Search> 

Here is the code that loads it:

import java.io.ByteArrayInputStream;
import java.io.UnsupportedEncodingException;
import org.w3c.dom.Attr;
import org.w3c.dom.CDATASection;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;
...
...
    try {
     // Look up a DOM implementation that supports the Load and Save ("LS") feature
     DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
     DOMImplementationLS impl = (DOMImplementationLS)registry.getDOMImplementation("LS");
     LSParser builder = impl.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
     // createLSInput() is the standard factory method; it stands in for the
     // implementation-specific DOMInputImpl class used originally
     LSInput input = impl.createLSInput();
     input.setByteStream(new ByteArrayInputStream(xmlString.getBytes("utf-8")));
     xmlDoc = builder.parse(input);
     return xmlDoc;
    } catch (ClassNotFoundException | InstantiationException
      | IllegalAccessException | ClassCastException | UnsupportedEncodingException e) {
     throw new MyException(e);
    }

However, the parsed document does not contain any node of type org.w3c.dom.CDATASection. Instead, the content shows up as a plain text node (#text).

Any help would be appreciated.

Answer


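Interesting! LSParser "converts" CDATA section nodes into text nodes (this is probably stated somewhere in the spec). However, if you use the JAXP API instead (which is also a lot less noisy), you do get #cdata-section nodes: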

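import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

    DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
    f.setNamespaceAware(true);
    DocumentBuilder builder = f.newDocumentBuilder();
    Document doc = builder.parse(...);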
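
To make the comparison concrete, here is a minimal, self-contained sketch that parses the same small document both ways and prints the name of the node found inside the element. The class name CdataDemo, the helper method names, and the shortened sample XML are made up for illustration and are not from the question. As far as I can tell, the LSParser behavior follows from its "infoset" configuration parameter being true by default, which implies "cdata-sections" is false. The sketch assumes the parser lets you set "cdata-sections" back to true through getDomConfig(); I believe this holds for the Xerces-based parser bundled with the JDK, but treat it as an assumption for other implementations.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;

class CdataDemo {
    // Hypothetical sample input, much smaller than the document in the question.
    static final String XML = "<root><id><![CDATA[B3:5F:18]]></id></root>";

    /** Parses with JAXP. Coalescing is off by default, so CDATA sections are kept. */
    static Document parseWithJaxp(String xml) throws Exception {
        DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
        f.setNamespaceAware(true);
        DocumentBuilder builder = f.newDocumentBuilder();
        return builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    /** Parses with LSParser, optionally asking it to keep CDATA sections. */
    static Document parseWithLs(String xml, boolean keepCdata) throws Exception {
        DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
        DOMImplementationLS impl = (DOMImplementationLS) registry.getDOMImplementation("LS");
        LSParser parser = impl.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
        if (keepCdata) {
            // Assumption: the implementation supports changing this parameter.
            parser.getDomConfig().setParameter("cdata-sections", Boolean.TRUE);
        }
        LSInput input = impl.createLSInput();
        input.setByteStream(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return parser.parse(input);
    }

    /** Returns the single child node of the <id> element in the sample document. */
    static Node firstChildOfId(Document doc) {
        return doc.getDocumentElement().getFirstChild().getFirstChild();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("JAXP:                      "
                + firstChildOfId(parseWithJaxp(XML)).getNodeName());
        System.out.println("LSParser (defaults):       "
                + firstChildOfId(parseWithLs(XML, false)).getNodeName());
        System.out.println("LSParser (cdata-sections): "
                + firstChildOfId(parseWithLs(XML, true)).getNodeName());
    }
}
```

Either way the text content is the same (getNodeValue() or getTextContent() returns the characters inside the CDATA section), so the node type only matters if you need to tell CDATA apart from ordinary text, for example to preserve it when serializing the document again. Note that calling setCoalescing(true) on the DocumentBuilderFactory would make JAXP merge CDATA into text nodes as well.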