2012-04-12 146 views

Parsing XML to HTML with Java org.w3c.dom

Through an API I receive an XML file, which I am trying to parse with org.w3c.dom and XPath. Part of the XML file describes HTML content:

<Para>Since 2001, state and local health departments in the US have accelerated efforts to prepare for bioterrorism and other high-impact public health emergencies. These activities have been spurred by federal funding and guidance from the US Centers for Disease Control and Prevention (CDC) and the Health Resources and Services Administration (HRSA) 
    <CitationRef CitationID="B1">1</CitationRef> 
    <CitationRef CitationID="B2">2</CitationRef> . Over time, the emphasis of this guidance has expanded from bioterrorism to include "terrorism and non-terrorism events, including infectious disease, environmental and occupational related emergencies" 
    <CitationRef CitationID="B4">4</CitationRef> as well as pandemic influenza. 
</Para> 

This should become:

<p>Since 2001, state and local health departments in the US have accelerated efforts to prepare for bioterrorism and other high-impact public health emergencies. These activities have been spurred by federal funding and guidance from the US Centers for Disease Control and Prevention (CDC) and the Health Resources and Services Administration (HRSA) 
    <a href="link/B1">1</a> 
    <a href="link/B2">2</a> . Over time, the emphasis of this guidance has expanded from bioterrorism to include "terrorism and non-terrorism events, including infectious disease, environmental and occupational related emergencies" 
    <a href="link/B4">4</a> as well as pandemic influenza. 
</p> 

Any suggestions on how I can do this? The main problem is retrieving the tags and replacing them while keeping their position in the text.
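For readers who want to stay with org.w3c.dom alone, one possible approach is sketched below. It creates a new element for each old one, moves the children across, and swaps it in with `replaceChild`, so the surrounding text nodes keep their order. The class name `ParaToHtmlDom` and its helper methods are hypothetical names chosen for this illustration, and the `link/` prefix simply mirrors the desired output shown above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original thread.
class ParaToHtmlDom {

    // Copies the elements of a NodeList into a plain list, so that later
    // modifications of the tree cannot disturb the iteration.
    static List<Element> snapshot(NodeList nodes) {
        List<Element> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add((Element) nodes.item(i));
        }
        return result;
    }

    // Moves all children of 'old' into 'replacement', then puts 'replacement'
    // at the exact position 'old' had under its parent.
    static void replaceElement(Element old, Element replacement) {
        while (old.getFirstChild() != null) {
            // appendChild removes the node from its previous parent first
            replacement.appendChild(old.getFirstChild());
        }
        old.getParentNode().replaceChild(replacement, old);
    }

    // Parses the XML string, rewrites Para -> p and CitationRef -> a,
    // and returns the modified document as a string.
    static String convert(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        for (Element ref : snapshot(doc.getElementsByTagName("CitationRef"))) {
            Element a = doc.createElement("a");
            a.setAttribute("href", "link/" + ref.getAttribute("CitationID"));
            replaceElement(ref, a);
        }
        for (Element para : snapshot(doc.getElementsByTagName("Para"))) {
            replaceElement(para, doc.createElement("p"));
        }

        // An identity transform serializes the DOM tree back to text.
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }
}
```

Calling `ParaToHtmlDom.convert(xmlString)` returns the rewritten markup as a string. This works for a small, fixed set of element names; as the number of mappings grows, a stylesheet is likely to be easier to maintain.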


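This sounds like a perfect job for XSLT, since it is a language for transforming XML input into some other XML format or into HTML. If you need help with the XSLT code, add the XSLT tag to your question. – 2012-04-13 09:35:17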

Answers


Here is how you can do it with XSLT:

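<xsl:stylesheet 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
    version="1.0"> 

<!-- Identity template: copy every node and attribute unchanged by default -->
<xsl:template match="@* | node()"> 
    <xsl:copy> 
        <xsl:apply-templates select="@* | node()"/> 
    </xsl:copy> 
</xsl:template> 

<!-- Para becomes p; its children are processed in document order -->
<xsl:template match="Para"> 
    <p> 
        <xsl:apply-templates select="@* | node()"/> 
    </p> 
</xsl:template> 

<!-- CitationRef becomes a link built from its CitationID attribute -->
<xsl:template match="CitationRef[@CitationID]"> 
    <a href="link/{@CitationID}"> 
        <xsl:apply-templates/> 
    </a> 
</xsl:template> 

</xsl:stylesheet> 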

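Thanks for your reply. I am looking into XSLT (http://www.rgagnon.com/javadetails/java-0407.html). Is there a way to supply the XSL you provided and the XML to be parsed, and to get the output, all as strings (so not as files)? – user485659 2012-04-13 10:53:39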


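I am fairly sure that having the input, the stylesheet, and the result as strings is possible with JAXP. It is just a matter of using the right source (http://docs.oracle.com/javase/6/docs/api/javax/xml/transform/stream/StreamSource.html) and result types, for example a StreamSource over a StringReader. I will leave the details to someone more familiar with the Java API than I am. – 2012-04-13 11:56:08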
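As a minimal sketch of that suggestion, the following wraps both strings in a `StreamSource` over a `StringReader` and collects the result in a `StringWriter` through a `StreamResult`. The class name `StringXslt` and the method name `transform` are hypothetical; the JAXP classes themselves are the standard ones from `javax.xml.transform`.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, not part of the original thread.
class StringXslt {

    // Applies the stylesheet text 'xslt' to the document text 'xml'
    // and returns the transformation result as a string.
    static String transform(String xml, String xslt) throws TransformerException {
        // Compile the stylesheet from an in-memory string
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));

        // Read the input from a string and write the output into a StringWriter
        StringWriter out = new StringWriter();
        transformer.transform(
                new StreamSource(new StringReader(xml)),
                new StreamResult(out));
        return out.toString();
    }
}
```

With the stylesheet from the answer held in a string, `StringXslt.transform(xml, xslt)` would return the converted markup. The result normally starts with an XML declaration; setting `OutputKeys.OMIT_XML_DECLARATION` to `"yes"` on the transformer suppresses it if only the fragment is wanted.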


Thanks for the tip, I got it working! For the input XML, I use the following code: 'nl = (Node) xpath.evaluate("//expression/here", doc, XPathConstants.NODE); DOMSource source = new DOMSource(nl);' – user485659 2012-04-13 12:58:02
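The snippet in that comment can be expanded into a complete sketch along the following lines. It selects one node with XPath, wraps it in a `DOMSource`, and transforms it to a string. The class name `XPathNodeXslt`, the method name `transformNode`, and the null check are assumptions added for illustration; the actual XPath expression used by the commenter is not given in the thread.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original thread.
class XPathNodeXslt {

    // Parses 'xml', selects the first node matching 'expression',
    // and applies the stylesheet text 'xslt' to that node only.
    static String transformNode(String xml, String expression, String xslt) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // XSLT processing expects a namespace-aware DOM
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // Pick the node that should serve as transformation input
        XPath xpath = XPathFactory.newInstance().newXPath();
        Node node = (Node) xpath.evaluate(expression, doc, XPathConstants.NODE);
        if (node == null) {
            throw new IllegalArgumentException("No node matches " + expression);
        }

        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(node), new StreamResult(out));
        return out.toString();
    }
}
```

For the sample above, a call such as `XPathNodeXslt.transformNode(xml, "//Para", xslt)` would pass only the selected `Para` element to the stylesheet, which is useful when the paragraph sits inside a larger API response.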