2016-09-20 40 views

Flattening a complex XDocument without knowing the DOM

Sample XML:

<Pricing>
    <PriceGuide id="e4c3db5c">
        <Name>Price Guide A</Name>
        <Products>
            <Product id="1">
                <Name>Product 1</Name>
                <Prices>
                    <Price>
                        <Region id="40">Chicago</Region>
                        <PriceLow>48</PriceLow>
                        <PriceHigh>52</PriceHigh>
                        <UnitOfMeasure>MT</UnitOfMeasure>
                    </Price>
                    <Price>
                        <Region id="71">Dallas</Region>
                        <PriceLow>45.5</PriceLow>
                        <PriceHigh>47</PriceHigh>
                        <UnitOfMeasure>MT</UnitOfMeasure>
                    </Price>
                </Prices>
            </Product>
            <Product id="2">
                <Name>Product 2</Name>
                <Prices>
                    <Price>
                        <Region id="40">Chicago</Region>
                        <PriceLow>48</PriceLow>
                        <PriceHigh>49</PriceHigh>
                        <UnitOfMeasure>MT</UnitOfMeasure>
                    </Price>
                    <Price>
                        <Region id="101">Los Angeles </Region>
                        <PriceLow>43</PriceLow>
                        <PriceHigh>45</PriceHigh>
                        <UnitOfMeasure>MT</UnitOfMeasure>
                    </Price>
                    <Price>
                        <Region id="71">Dallas</Region>
                        <PriceLow>45.5</PriceLow>
                        <PriceHigh>48.5</PriceHigh>
                        <UnitOfMeasure>MT</UnitOfMeasure>
                    </Price>
                </Prices>
            </Product>
        </Products>
    </PriceGuide>
</Pricing>



Expected result (the data is written to a CSV file or dumped into a DataTable):

Price Guide A, Product 1, Chicago, 48, 52, MT 
Price Guide A, Product 1, Dallas, 45.5, 47, MT 
Price Guide A, Product 2, Chicago, 48, 49, MT 
Price Guide A, Product 2, Los Angeles, 43, 45, MT 
Price Guide A, Product 2, Dallas, 45.5, 48.5, MT 
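
A minimal sketch of the output step only, assuming each flattened record is already available as a string[]. The RowExport class and the generic Column1..ColumnN names are hypothetical; they are used because the element names are not known in advance. The CSV quoting follows the usual convention of doubling embedded quotes.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical rows, shaped like the expected result above.
var rows = new List<string[]>
{
    new[] { "Price Guide A", "Product 1", "Chicago", "48", "52", "MT" },
    new[] { "Price Guide A", "Product 2", "Los Angeles", "43", "45", "MT" },
};

DataTable table = RowExport.ToDataTable(rows);
Console.WriteLine(table.Rows.Count + " rows, " + table.Columns.Count + " columns");

foreach (var row in rows)
    Console.WriteLine(RowExport.ToCsvLine(row));

public static class RowExport
{
    // Builds a table with one string column per position in the widest row.
    public static DataTable ToDataTable(IEnumerable<string[]> rows)
    {
        var list = rows.ToList();
        int width = list.Count == 0 ? 0 : list.Max(r => r.Length);

        var table = new DataTable();
        for (int i = 1; i <= width; i++)
            table.Columns.Add("Column" + i, typeof(string));

        // Shorter rows leave their trailing columns as DBNull.
        foreach (var row in list)
            table.Rows.Add(row.Cast<object>().ToArray());

        return table;
    }

    // Joins fields with commas, quoting any field that contains
    // a comma, a double quote, or a line break.
    public static string ToCsvLine(IEnumerable<string> fields)
    {
        return string.Join(",", fields.Select(f =>
            f.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0
                ? "\"" + f.Replace("\"", "\"\"") + "\""
                : f));
    }
}
```

The lines returned by ToCsvLine can be passed to File.WriteAllLines to produce the CSV file.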



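Main problem:
I am essentially handed an unknown XML file that I have to display as a flat table.

This is just one example of the many files I may have to process. I don't know the DOM in advance, so I can't write a straightforward LINQ query against given node names. I tried walking the DOM recursively, but once you are inside the recursion it is hard to know when to write out a record.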

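Bonus points:
As the sample shows, nodes sometimes carry attributes. If a node has an "id" attribute, I would like to include its value in the output. In that case my output would be: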

e4c3db5c, Price Guide A, 1, Product 1, 40, Chicago, 48, 52, MT 
e4c3db5c, Price Guide A, 1, Product 1, 71, Dallas, 45.5, 47, MT 
e4c3db5c, Price Guide A, 2, Product 2, 40, Chicago, 48, 49, MT 
e4c3db5c, Price Guide A, 2, Product 2, 101, Los Angeles, 43, 45, MT 
e4c3db5c, Price Guide A, 2, Product 2, 71, Dallas, 45.5, 48.5, MT 



Thanks in advance.

Edit:
The following works, but it requires me to know the XML structure in advance. I am looking to generalize this code:

// Hard-coded version: every element name must be known in advance.
var details =
    from guide in _xmlDoc.Root.Elements("PriceGuide")
    from product in guide.Elements("Products").Elements("Product")
    from price in product.Elements("Prices").Elements("Price")
    select new
    {
        PriceGuideId = (string)guide.Attribute("id"),
        PriceGuideName = (string)guide.Element("Name"),
        ProductId = (string)product.Attribute("id"),
        ProductName = (string)product.Element("Name"),
        RegionId = (string)price.Element("Region")?.Attribute("id"),
        RegionName = (string)price.Element("Region"),
        PriceLow = (string)price.Element("PriceLow"),
        PriceHigh = (string)price.Element("PriceHigh"),
        UnitOfMeasure = (string)price.Element("UnitOfMeasure"),
    };

I know this is not much help.

+0

If you don't know the DOM, how do you know which data to extract? – Jules

+0

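Basically the value of every node in the hierarchy (plus the value of the "id" attribute where there is one). I wouldn't even mind empty values for nodes that only contain other nodes, but those values aren't needed if leaving them out makes the coding easier. – Rumtis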

+0

I added what works if I hard-code the node names, but I need a way to generalize the code. – Rumtis

Answer

0

I don't know how to do this in LINQ. Here is some quick and dirty code that runs:

XmlDocument dom = new XmlDocument(); 
    dom.LoadXml("<Pricing><PriceGuide id=\"e4c3db5c\"><Name>Price Guide A</Name><Products><Product id=\"1\"><Name>Product 1</Name><Prices><Price><Region id=\"40\">Chicago</Region><PriceLow>48</PriceLow><PriceHigh>52</PriceHigh><UnitOfMeasure>MT</UnitOfMeasure></Price><Price><Region id=\"71\">Dallas</Region><PriceLow>45.5</PriceLow><PriceHigh>47</PriceHigh><UnitOfMeasure>MT</UnitOfMeasure></Price></Prices></Product><Product id=\"2\"><Name>Product 2</Name><Prices><Price><Region id=\"40\">Chicago</Region><PriceLow>48</PriceLow><PriceHigh>49</PriceHigh><UnitOfMeasure>MT</UnitOfMeasure></Price><Price><Region id=\"101\">Los Angeles </Region><PriceLow>43</PriceLow><PriceHigh>45</PriceHigh><UnitOfMeasure>MT</UnitOfMeasure></Price><Price><Region id=\"71\">Dallas</Region><PriceLow>45.5</PriceLow><PriceHigh>48.5</PriceHigh><UnitOfMeasure>MT</UnitOfMeasure></Price></Prices></Product></Products></PriceGuide></Pricing>"); 

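    // Requires: using System.Collections.Generic; using System.Linq; using System.Xml;
    List<KeyValuePair<int, String>> result = FlattenXML(dom.DocumentElement, "", 0);

    // Complete rows carry the most leaf values, so keep only the deepest level.
    // This heuristic assumes every branch collects the same number of leaf values.
    int maxLevel = result.Max(b => b.Key);
    var q = result.Where(c => c.Key == maxLevel).Select(b => b.Value.Substring(0, b.Value.Length - 1)).ToArray();

    Console.WriteLine(String.Join(System.Environment.NewLine, q));

    // Walks the tree depth-first. "parent" accumulates the values seen on the way down;
    // "level" counts the leaf values collected so far.
    private static List<KeyValuePair<int, String>> FlattenXML(XmlElement node, String parent, int level)
    {
        List<KeyValuePair<int, String>> result = new List<KeyValuePair<int, String>>();
        String detail = "";

        if (node.HasAttribute("id"))
            parent += node.GetAttribute("id") + ",";

        // Only true when the starting node is itself a leaf.
        if (node.InnerText == node.InnerXml && node.InnerText != "")
        {
            parent += node.InnerText + ",";
        }

        // OfType skips text, comment and whitespace nodes, which a direct cast would reject.
        foreach (XmlElement child in node.ChildNodes.OfType<XmlElement>())
        {
            bool hasChildElements = child.ChildNodes.OfType<XmlElement>().Any();

            if (!hasChildElements && child.InnerText != "")
            {
                // Leaf: emit its "id" attribute (if any), then its text.
                if (child.HasAttribute("id"))
                    detail += child.GetAttribute("id") + ",";
                detail += child.InnerText.Trim() + ",";
                level++;
            }

            if (hasChildElements)
            {
                // Container: recurse, even when it holds a single child element.
                List<KeyValuePair<int, String>> childResult = FlattenXML(child, parent + detail, level);
                result.AddRange(childResult);
            }
        }
        result.Add(new KeyValuePair<int, String>(level, parent + detail));
        return result;
    }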
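
For comparison, here is a rough LINQ to XML sketch of the same idea, since the question works with XDocument. It is not taken from the answer above. It rests on one assumption that may not hold for every file: a "record" is any element whose child elements are all leaves (Price in the sample). The XmlFlattener class and its method names are hypothetical. Each row is built by walking from the root down to the record and collecting, for every element on that path, its "id" attribute and the values of its leaf children.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Shortened fragment of the sample XML from the question.
var doc = XDocument.Parse(
    "<Pricing><PriceGuide id=\"e4c3db5c\"><Name>Price Guide A</Name><Products>" +
    "<Product id=\"1\"><Name>Product 1</Name><Prices>" +
    "<Price><Region id=\"40\">Chicago</Region><PriceLow>48</PriceLow>" +
    "<PriceHigh>52</PriceHigh><UnitOfMeasure>MT</UnitOfMeasure></Price>" +
    "<Price><Region id=\"71\">Dallas</Region><PriceLow>45.5</PriceLow>" +
    "<PriceHigh>47</PriceHigh><UnitOfMeasure>MT</UnitOfMeasure></Price>" +
    "</Prices></Product></Products></PriceGuide></Pricing>");

foreach (var row in XmlFlattener.Flatten(doc.Root))
    Console.WriteLine(string.Join(", ", row));

public static class XmlFlattener
{
    // Values contributed by one element: its own "id" attribute (if any),
    // then the "id" and trimmed text of each leaf child.
    private static IEnumerable<string> OwnValues(XElement element)
    {
        var id = (string)element.Attribute("id");
        if (id != null) yield return id;

        foreach (var leaf in element.Elements().Where(e => !e.HasElements))
        {
            var leafId = (string)leaf.Attribute("id");
            if (leafId != null) yield return leafId;
            yield return leaf.Value.Trim();
        }
    }

    // One row per record element, i.e. an element that has child elements,
    // none of which has child elements of its own.
    public static List<string[]> Flatten(XElement root)
    {
        return root.DescendantsAndSelf()
            .Where(e => e.HasElements && e.Elements().All(c => !c.HasElements))
            .Select(record => record.AncestorsAndSelf()
                .Reverse()                      // outermost ancestor first
                .SelectMany(e => OwnValues(e))
                .ToArray())
            .ToList();
    }
}
```

On this fragment the sketch should print the first two rows of the bonus output in the question. Rows can end up with different column counts when branches of the document differ in shape, so a real file may need padding or per-path column names.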
+0

Looks promising. Thank you very much. I will test it on a few sample XML files and update this question by tomorrow. – Rumtis