2016-11-22 111 views

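Strip CDATA from XML using XSLT

I am using XSLT to extract data from the following XML. Is there any way to strip the CDATA? At the moment the extracted output still includes the CDATA markers.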

<Product> 
<ExternalId><![CData[55037]]></ExternalId> 
<Name><![CData[Reindeer Booties]]></Name> 
<Description><![CData[Everybody say, "Aww!" Prepare for maximum cuteness when these plush reindeer booties are unwrapped from their special box. Faux fur provides plenty of warmth for tiny toes and softness for delicate skin. A pompom nose with 3D ears and antlers are enough to bring out the festive spirit in anyone.]]></Description> 
<Brand>XYZ</Brand> 
<CategoryExternalId>1_15_1</CategoryExternalId> 
<ProductPageUrl><![CData[http://www.xyz.co.uk/baby-accessories/SE037/baby-reindeer-booties]]></ProductPageUrl> 
<ImageUrl><![CData[http://www.xyzimages.com/images/product/16S_550.jpg]]></ImageUrl> 
<SwatchImageUrl><![CData[]]></SwatchImageUrl> 
<Price>84.8000</Price> 
<Wasprice>154.9500</Wasprice> 
<ManufacturerPartNumber></ManufacturerPartNumber> 
<EAN></EAN> 
<Colours><![CData[blue-pink]]</Colours> 
</Product> 

I expect the following output:

<Product> 
<ExternalId>55037</ExternalId> 
<Name>Reindeer Booties</Name> 
<Description>Everybody say, "Aww!" Prepare for maximum cuteness when these plush reindeer booties are unwrapped from their special box. Faux fur provides plenty of warmth for tiny toes and softness for delicate skin. A pompom nose with 3D ears and antlers are enough to bring out the festive spirit in anyone.</Description> 
<Brand>XYZ</Brand> 
<CategoryExternalId>1_15_1</CategoryExternalId> 
<ProductPageUrl>http://www.xyz.co.uk/baby-accessories/SE037/baby-reindeer-booties</ProductPageUrl> 
<ImageUrl>http://www.xyzimages.com/images/product/16S_550.jpg</ImageUrl> 
<SwatchImageUrl></SwatchImageUrl> 
<Price>84.8000</Price> 
<Wasprice>154.9500</Wasprice> 
<ManufacturerPartNumber></ManufacturerPartNumber> 
<EAN></EAN> 
<Colours>blue-pink</Colours> 
</Product> 

Can you show (the relevant part of) your XSLT?

Answers


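The input you show us is not well-formed XML and cannot be processed by XSLT:

  • First, a CDATA section must start with <![CDATA[, not with <![CData[ as you have it (XML is case-sensitive).

  • Next, a CDATA section must end with ]]>. This terminator is missing in line 14 of your input (you have only ]]).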
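
Both flaws can be confirmed before XSLT is involved at all by running the input through any conforming XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` purely as an illustration (Python is not part of the original question, and the helper name `is_well_formed` is made up here); any conforming parser should reject the same two inputs.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Wrong letter case in the CDATA keyword, as in the question's input.
wrong_case = "<ExternalId><![CData[55037]]></ExternalId>"
# CDATA section that is never closed with ]]>.
unterminated = "<Colours><![CDATA[blue-pink]]</Colours>"
# Corrected form of the second sample.
corrected = "<Colours><![CDATA[blue-pink]]></Colours>"

for sample in (wrong_case, unterminated, corrected):
    print(is_well_formed(sample), sample)
```

Only the corrected sample should be accepted; the other two fail during parsing, which is the same point at which an XSLT processor would give up.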

Once you fix these flaws and have well-formed XML input, such as:

XML

<Product> 
    <ExternalId><![CDATA[55037]]></ExternalId> 
    <Name><![CDATA[Reindeer Booties]]></Name> 
    <Description><![CDATA[Everybody say, "Aww!" Prepare for maximum cuteness when these plush reindeer booties are unwrapped from their special box. Faux fur provides plenty of warmth for tiny toes and softness for delicate skin. A pompom nose with 3D ears and antlers are enough to bring out the festive spirit in anyone.]]></Description> 
    <Brand>XYZ</Brand> 
    <CategoryExternalId>1_15_1</CategoryExternalId> 
    <ProductPageUrl><![CDATA[http://www.xyz.co.uk/baby-accessories/SE037/baby-reindeer-booties]]></ProductPageUrl> 
    <ImageUrl><![CDATA[http://www.xyzimages.com/images/product/16S_550.jpg]]></ImageUrl> 
    <SwatchImageUrl><![CDATA[]]></SwatchImageUrl> 
    <Price>84.8000</Price> 
    <Wasprice>154.9500</Wasprice> 
    <ManufacturerPartNumber></ManufacturerPartNumber> 
    <EAN></EAN> 
    <Colours><![CDATA[blue-pink]]></Colours> 
</Product> 

then applying a simple stylesheet that contains nothing but the identity transform:

XSLT 1.0

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> 
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/> 
<xsl:strip-space elements="*"/> 

<!-- identity transform --> 
<xsl:template match="@*|node()"> 
    <xsl:copy> 
     <xsl:apply-templates select="@*|node()"/> 
    </xsl:copy> 
</xsl:template> 

</xsl:stylesheet> 

returns:

Result

<?xml version="1.0" encoding="UTF-8"?> 
<Product> 
    <ExternalId>55037</ExternalId> 
    <Name>Reindeer Booties</Name> 
    <Description>Everybody say, "Aww!" Prepare for maximum cuteness when these plush reindeer booties are unwrapped from their special box. Faux fur provides plenty of warmth for tiny toes and softness for delicate skin. A pompom nose with 3D ears and antlers are enough to bring out the festive spirit in anyone.</Description> 
    <Brand>XYZ</Brand> 
    <CategoryExternalId>1_15_1</CategoryExternalId> 
    <ProductPageUrl>http://www.xyz.co.uk/baby-accessories/SE037/baby-reindeer-booties</ProductPageUrl> 
    <ImageUrl>http://www.xyzimages.com/images/product/16S_550.jpg</ImageUrl> 
    <SwatchImageUrl/> 
    <Price>84.8000</Price> 
    <Wasprice>154.9500</Wasprice> 
    <ManufacturerPartNumber/> 
    <EAN/> 
    <Colours>blue-pink</Colours> 
</Product> 
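
The reason a plain identity transform is enough is that CDATA is only an escaping syntax: once the input is parsed, its content is ordinary text, and the serializer writes ordinary text back out. The same effect shows up in any parse-and-serialize round trip. The sketch below uses Python's standard `xml.etree.ElementTree` as a stand-in for the XSLT processor (an assumption made for illustration, not something used in the thread), with a shortened version of the product record.

```python
import xml.etree.ElementTree as ET

source = (
    "<Product>"
    "<ExternalId><![CDATA[55037]]></ExternalId>"
    "<Name><![CDATA[Reindeer Booties]]></Name>"
    "<Brand>XYZ</Brand>"
    "</Product>"
)

root = ET.fromstring(source)

# The parser hands the CDATA content over as plain text.
external_id = root.find("ExternalId").text

# Serializing the tree again writes that text without CDATA markers.
round_tripped = ET.tostring(root, encoding="unicode")
print(round_tripped)
```

If CDATA markers survive a round trip like this, the likely explanation is that they were never real CDATA sections in the input, which is worth checking before changing the stylesheet.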

Thanks for your help, but it still does not strip the CDATA. Any other hints, please?


Actually, I just realized that my C# application creates the CDATA like this: <![CDATA [52011]] > and not like this: <![CDATA [52011]]>. Is there any solution, please?


@Ibex Please edit your question and show a small but complete example of your real input - see: [mcve].


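Your real problem is that your XML is corrupted; you should fix the source of the error rather than patch the result. CData should not be placed in an angle-bracket tag. It should start with '!' and end with ']'. The following regular expression will fix the error.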

using System.Xml;
using System.Xml.Linq;
using System.IO;
using System.Text.RegularExpressions;

namespace ConsoleApplication28
{
    class Program
    {
        const string FILENAME = @"c:\temp\test.xml";
        static void Main(string[] args)
        {
            string xml = File.ReadAllText(FILENAME);
            string pattern = @"(?'open'<)(?'cdata'!\[CData[^\>]+)(?'close'>)";
            string fixedXml = Regex.Replace(xml, pattern, "${cdata}");
            XDocument doc = XDocument.Parse(fixedXml);
        }
    }
}

"*CData should not be placed in an angle-bracket tag.*" I'm afraid you are very much mistaken about this; see: https://en.wikipedia.org/wiki/CDATA


Since you are using C#, you can do without XSLT entirely and just use LINQ to XML.

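using System.Linq;
using System.Xml.Linq;

var doc = XDocument.Load("test.xml");

// ToList() takes a snapshot so nodes can be replaced while iterating.
foreach (var n in doc.DescendantNodes().OfType<XCData>().ToList())
{
    // Replace each CDATA node with a plain text node holding the same value.
    n.ReplaceWith(n.Value);
}

doc.Save("test2.xml");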

Of course, your input XML must be well-formed, as michael.hor257k pointed out.