2011-04-25
9

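This is my first post on StackOverflow, so please bear with me. I apologize if my code example is a bit long. Using C# and LINQ, I'm trying to identify a series of third-level id elements (000049 in this example) in a much larger XML file. Each third-level id is unique, and the ones I want are determined by the descendant information of each. More specifically, if type == A, location type(old) == vault, and location type(new) == out, then I want to select that id. Below are the XML and the C# code I'm using. How do I use C# and LINQ to extract information deep within XML?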

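In general my code works. As written below, it returns the id 000049 twice, which is correct. However, I've found a glitch. If I delete the first history block containing type == A, my code still returns the id 000049 twice, when it should return it only once. I know why this happens, but I can't figure out a better way to run the query. Is there a better way to run my query that gets the output I want and still uses LINQ?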

My XML:

<?xml version="1.0" encoding="ISO8859-1" ?> 
<data type="historylist"> 
    <date type="runtime"> 
     <year>2011</year> 
     <month>04</month> 
     <day>22</day> 
     <dayname>Friday</dayname> 
     <hour>15</hour> 
     <minutes>24</minutes> 
     <seconds>46</seconds> 
    </date> 
    <customer> 
     <id>0001</id> 
     <description>customer</description> 
     <mediatype> 
      <id>kit</id> 
      <description>customer kit</description> 
      <volume> 
       <id>000049</id> 
       <history> 
        <date type="optime"> 
         <year>2011</year> 
         <month>04</month> 
         <day>22</day> 
         <dayname>Friday</dayname> 
         <hour>03</hour> 
         <minutes>00</minutes> 
         <seconds>02</seconds> 
        </date> 
        <userid>batch</userid> 
        <type>OD</type> 
        <location type="old"> 
         <repository>vault</repository> 
         <slot>0</slot> 
        </location> 
        <location type="new"> 
         <repository>out</repository> 
         <slot>0</slot> 
        </location> 
        <container>0001.kit.000049</container> 
        <date type="movedate"> 
         <year>2011</year> 
         <month>04</month> 
         <day>22</day> 
         <dayname>Friday</dayname> 
        </date> 
       </history> 
       <history> 
        <date type="optime"> 
         <year>2011</year> 
         <month>04</month> 
         <day>22</day> 
         <dayname>Friday</dayname> 
         <hour>06</hour> 
         <minutes>43</minutes> 
         <seconds>33</seconds> 
        </date> 
        <userid>vaultred</userid> 
        <type>A</type> 
        <location type="old"> 
         <repository>vault</repository> 
         <slot>0</slot> 
        </location> 
        <location type="new"> 
         <repository>out</repository> 
         <slot>0</slot> 
        </location> 
        <container>0001.kit.000049</container> 
        <date type="movedate"> 
         <year>2011</year> 
         <month>04</month> 
         <day>22</day> 
         <dayname>Friday</dayname> 
        </date> 
       </history> 
       <history> 
        <date type="optime"> 
         <year>2011</year> 
         <month>04</month> 
         <day>22</day> 
         <dayname>Friday</dayname> 
         <hour>06</hour> 
         <minutes>43</minutes> 
         <seconds>33</seconds> 
        </date> 
        <userid>vaultred</userid> 
        <type>S</type> 
        <location type="old"> 
         <repository>vault</repository> 
         <slot>0</slot> 
        </location> 
        <location type="new"> 
         <repository>out</repository> 
         <slot>0</slot> 
        </location> 
        <container>0001.kit.000049</container> 
        <date type="movedate"> 
         <year>2011</year> 
         <month>04</month> 
         <day>22</day> 
         <dayname>Friday</dayname> 
        </date> 
       </history> 
       <history> 
        <date type="optime"> 
         <year>2011</year> 
         <month>04</month> 
         <day>22</day> 
         <dayname>Friday</dayname> 
         <hour>06</hour> 
         <minutes>45</minutes> 
         <seconds>00</seconds> 
        </date> 
        <userid>batch</userid> 
        <type>O</type> 
        <location type="old"> 
         <repository>out</repository> 
         <slot>0</slot> 
        </location> 
        <location type="new"> 
         <repository>site</repository> 
         <slot>0</slot> 
        </location> 
        <container>0001.kit.000049</container> 
        <date type="movedate"> 
         <year>2011</year> 
         <month>04</month> 
         <day>22</day> 
         <dayname>Friday</dayname> 
        </date> 
       </history> 
       <history> 
        <date type="optime"> 
         <year>2011</year> 
         <month>04</month> 
         <day>22</day> 
         <dayname>Friday</dayname> 
         <hour>11</hour> 
         <minutes>25</minutes> 
         <seconds>59</seconds> 
        </date> 
        <userid>ihcmdm</userid> 
        <type>A</type> 
        <location type="old"> 
         <repository>out</repository> 
         <slot>0</slot> 
        </location> 
        <location type="new"> 
         <repository>site</repository> 
         <slot>0</slot> 
        </location> 
        <container>0001.kit.000049</container> 
        <date type="movedate"> 
         <year>2011</year> 
         <month>04</month> 
         <day>22</day> 
         <dayname>Friday</dayname> 
        </date> 
       </history> 
       <history> 
        <date type="optime"> 
         <year>2011</year> 
         <month>04</month> 
         <day>22</day> 
         <dayname>Friday</dayname> 
         <hour>11</hour> 
         <minutes>25</minutes> 
         <seconds>59</seconds> 
        </date> 
        <userid>ihcmdm</userid> 
        <type>S</type> 
        <location type="old"> 
         <repository>out</repository> 
         <slot>0</slot> 
        </location> 
        <location type="new"> 
         <repository>site</repository> 
         <slot>0</slot> 
        </location> 
        <container>0001.kit.000049</container> 
        <date type="movedate"> 
         <year>2011</year> 
         <month>04</month> 
         <day>22</day> 
         <dayname>Friday</dayname> 
        </date> 
       </history> 
      </volume> 
      ... 

My C# code:

IEnumerable<XElement> caseIdLeavingVault = 
    from volume in root.Descendants("volume") 
    where 
     (from type in volume.Descendants("type") 
     where type.Value == "A" 
     select type).Any() && 
     (from locationOld in volume.Descendants("location") 
     where 
      ((String)locationOld.Attribute("type") == "old" && 
       (String)locationOld.Element("repository") == "vault") && 
      (from locationNew in volume.Descendants("location") 
       where 
        ((String)locationNew.Attribute("type") == "new" && 
        (String)locationNew.Element("repository") == "out") 
       select locationNew).Any() 
     select locationOld).Any() 
    select volume.Element("id"); 

    ... 

foreach (XElement volume in caseIdLeavingVault) 
{ 
    Console.WriteLine(volume.Value.ToString()); 
} 

Thanks.


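OK, I'm stuck again. Given the same situation and @Elian's solution below (which works great), I need the "optime" and "movedate" dates of the history that was used to select the id. Does that make sense? I'm hoping to end up with something like this: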

select new { 
    id = volume.Element("id").Value, 

    // this is from "optime" 
    opYear = <whatever>("year").Value, 
    opMonth = <whatever>("month").Value, 
    opDay = <whatever>("day").Value, 

    // this is from "movedate" 
    mvYear = <whatever>("year").Value, 
    mvMonth = <whatever>("month").Value, 
    mvDay = <whatever>("day").Value 
} 

I've tried many different combinations, but the attributes on <date type="optime"> and <date type="movedate"> keep getting in my way, and I can't seem to get what I want.


OK. I found a solution that works:

select new { 
    caseId = volume.Element("id").Value, 

    // this is from "optime" 
    opYear = volume.Descendants("date").Where(t => t.Attribute("type").Value == "optime").First().Element("year").Value, 
    opMonth = volume.Descendants("date").Where(t => t.Attribute("type").Value == "optime").First().Element("month").Value, 
    opDay = volume.Descendants("date").Where(t => t.Attribute("type").Value == "optime").First().Element("day").Value, 

    // this is from "movedate" 
    mvYear = volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").First().Element("year").Value, 
    mvMonth = volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").First().Element("month").Value, 
    mvDay = volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").First().Element("day").Value 
}; 

However, it fails when it finds an id that has no "movedate". A few of those exist, so that's what I'm working on now.


OK, late yesterday afternoon I finally figured out the solution I had been after:

var caseIdLeavingSite = 
    from volume in root.Descendants("volume") 
    where volume.Elements("history").Any(
     h => h.Element("type").Value == "A" && 
     h.Elements("location").Any(l => l.Attribute("type").Value == "old" && ((l.Element("repository").Value == "site") || 
                       (l.Element("repository").Value == "init"))) && 
     h.Elements("location").Any(l => l.Attribute("type").Value == "new" && l.Element("repository").Value == "toVault") 
     ) 
    select new { 
     caseId = volume.Element("id").Value, 
     opYear = volume.Descendants("date").Where(t => t.Attribute("type").Value == "optime").First().Element("year").Value, 
     opMonth = volume.Descendants("date").Where(t => t.Attribute("type").Value == "optime").First().Element("month").Value, 
     opDay = volume.Descendants("date").Where(t => t.Attribute("type").Value == "optime").First().Element("day").Value, 
     mvYear = (volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").Any() == true) ? 
       (volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").First().Element("year").Value) : "0", 
     mvMonth = (volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").Any() == true) ? 
        (volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").First().Element("month").Value) : "0", 
     mvDay = (volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").Any() == true) ? 
       (volume.Descendants("date").Where(t => t.Attribute("type").Value == "movedate").First().Element("day").Value) : "0" 
    }; 

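This meets the requirement that @Elian helped with, and it also grabs the additional date information I needed. It uses the ternary operator ?: to account for the few cases that have no "movedate" element.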

Now, if anyone knows how to make this more efficient, I'm still interested. Thanks.
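
One possible way to cut the repeated lookups, offered only as a sketch: find each <date> element once with a `let` clause and `FirstOrDefault`, then read its children through the `(string)` cast, which returns null rather than throwing when an element is missing. The trimmed-down sample XML is hypothetical, the code assumes C# 9 or later for top-level statements, and unlike the query above it also falls back to "0" when "optime" is missing.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical sample: one volume whose history has an "optime" date but no "movedate".
var root = XElement.Parse(@"
<data>
  <volume>
    <id>000049</id>
    <history>
      <date type='optime'><year>2011</year><month>04</month><day>22</day></date>
      <type>A</type>
      <location type='old'><repository>site</repository></location>
      <location type='new'><repository>toVault</repository></location>
    </history>
  </volume>
</data>");

var caseIdLeavingSite =
    (from volume in root.Descendants("volume")
     where volume.Elements("history").Any(h =>
         (string)h.Element("type") == "A" &&
         h.Elements("location").Any(l =>
             (string)l.Attribute("type") == "old" &&
             ((string)l.Element("repository") == "site" ||
              (string)l.Element("repository") == "init")) &&
         h.Elements("location").Any(l =>
             (string)l.Attribute("type") == "new" &&
             (string)l.Element("repository") == "toVault"))
     // Look up each <date> once; FirstOrDefault yields null when none exists.
     let op = volume.Descendants("date")
                    .FirstOrDefault(d => (string)d.Attribute("type") == "optime")
     let mv = volume.Descendants("date")
                    .FirstOrDefault(d => (string)d.Attribute("type") == "movedate")
     select new
     {
         caseId = (string)volume.Element("id"),
         // The (string) cast of a null XElement is null, so ?? supplies the default.
         opYear = (string)op?.Element("year") ?? "0",
         opMonth = (string)op?.Element("month") ?? "0",
         opDay = (string)op?.Element("day") ?? "0",
         mvYear = (string)mv?.Element("year") ?? "0",
         mvMonth = (string)mv?.Element("month") ?? "0",
         mvDay = (string)mv?.Element("day") ?? "0"
     }).ToList();

foreach (var c in caseIdLeavingSite)
{
    Console.WriteLine($"{c.caseId} op={c.opYear}-{c.opMonth}-{c.opDay} mv={c.mvYear}-{c.mvMonth}-{c.mvDay}");
}
```

Like the query above, this takes the first "optime" and "movedate" found anywhere in the volume, not necessarily the ones inside the history that matched.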

Answers

8

I think you want something like this:

IEnumerable<XElement> caseIdLeavingVault = 
    from volume in document.Descendants("volume") 
    where volume.Elements("history").Any(
     h => h.Element("type").Value == "A" && 
      h.Elements("location").Any(l => l.Attribute("type").Value == "old" && l.Element("repository").Value == "vault") && 
      h.Elements("location").Any(l => l.Attribute("type").Value == "new" && l.Element("repository").Value == "out") 
     ) 
    select volume.Element("id"); 

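Your code checks independently whether the volume has a <history> element of type A and whether it has a (not necessarily the same) <history> element that contains the required <location> elements.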

The code above checks whether there is a single <history> element that is both of type A and contains the required <location> elements.
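
To make the difference concrete, here is a minimal sketch that runs both styles of check against a hypothetical, trimmed-down volume in which the type A entry and the vault-to-out move sit in different <history> elements. The sample data and variable names are illustrative only, and the code assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical sample: the vault -> out move is type OD; the type A entry is out -> site.
var root = XElement.Parse(@"
<data>
  <volume>
    <id>000049</id>
    <history>
      <type>OD</type>
      <location type='old'><repository>vault</repository></location>
      <location type='new'><repository>out</repository></location>
    </history>
    <history>
      <type>A</type>
      <location type='old'><repository>out</repository></location>
      <location type='new'><repository>site</repository></location>
    </history>
  </volume>
</data>");

// Independent checks: each condition may be satisfied by a different <history>.
var independent = root.Descendants("volume")
    .Where(v =>
        v.Descendants("type").Any(t => t.Value == "A") &&
        v.Descendants("location").Any(l =>
            (string)l.Attribute("type") == "old" &&
            (string)l.Element("repository") == "vault") &&
        v.Descendants("location").Any(l =>
            (string)l.Attribute("type") == "new" &&
            (string)l.Element("repository") == "out"))
    .Select(v => (string)v.Element("id"))
    .ToList();

// Combined check: one <history> must satisfy all three conditions at once.
var sameHistory = root.Descendants("volume")
    .Where(v => v.Elements("history").Any(h =>
        (string)h.Element("type") == "A" &&
        h.Elements("location").Any(l =>
            (string)l.Attribute("type") == "old" &&
            (string)l.Element("repository") == "vault") &&
        h.Elements("location").Any(l =>
            (string)l.Attribute("type") == "new" &&
            (string)l.Element("repository") == "out")))
    .Select(v => (string)v.Element("id"))
    .ToList();

Console.WriteLine(independent.Count); // 1: the volume is selected by mistake
Console.WriteLine(sameHistory.Count); // 0: no single history meets every condition
```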

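Update: abatishchev suggested a solution that uses an XPath query instead of LINQ to XML, but his query is too simple and does not return exactly what you asked for. The following XPath query will do the trick, but it is also a bit longer: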

data/customer/mediatype/volume[history[type = 'A' and location[@type = 'old' and repository = 'vault'] and location[@type = 'new' and repository = 'out']]]/id 
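
If you would rather stay with the LINQ to XML types, the same XPath expression can be run through the `XPathSelectElements` extension method from `System.Xml.XPath`. This is a sketch against a hypothetical two-volume sample (the 000050 volume is made up for contrast), assuming C# 9 or later for top-level statements.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath; // provides the XPathSelectElements extension method

// Hypothetical sample: 000049 has a type A vault -> out history, 000050 does not.
var document = XDocument.Parse(@"
<data>
  <customer>
    <mediatype>
      <volume>
        <id>000049</id>
        <history>
          <type>A</type>
          <location type='old'><repository>vault</repository></location>
          <location type='new'><repository>out</repository></location>
        </history>
      </volume>
      <volume>
        <id>000050</id>
        <history>
          <type>S</type>
          <location type='old'><repository>vault</repository></location>
          <location type='new'><repository>out</repository></location>
        </history>
      </volume>
    </mediatype>
  </customer>
</data>");

const string xpath =
    "data/customer/mediatype/volume[history[type = 'A' " +
    "and location[@type = 'old' and repository = 'vault'] " +
    "and location[@type = 'new' and repository = 'out']]]/id";

// Evaluate the XPath relative to the document node and collect the matching ids.
var ids = document.XPathSelectElements(xpath)
    .Select(e => e.Value)
    .ToList();

foreach (var id in ids)
{
    Console.WriteLine(id); // 000049
}
```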
+0

@Elian Thanks for your answer. I'll give it a try. – meffordm 2011-04-25 21:53:30

+0

@Elian This seems to work. Thanks again! – meffordm 2011-04-25 22:08:44

+0

@meffordm: Don't forget to accept this answer as the correct one – abatishchev 2011-04-26 07:10:23

1

Why use such a complex and expensive LINQ to XML query when you can use a simple XPath query:

using System; // Console
using System.Xml; 

string xml = @"..."; 
string xpath = "data/customer/mediatype/volume/history/type[text()='A']/../location[@type='old' or @type='new']/../../id"; 

var doc = new XmlDocument(); 
doc.LoadXml(xml); // or use Load(path); 

var nodes = doc.SelectNodes(xpath); 

foreach (XmlNode node in nodes) 
{ 
    Console.WriteLine(node.InnerText); // 000049 
} 

Or, if you don't need the XML DOM model:

using System.IO; // StringReader
using System.Xml.XPath; 

XPathDocument doc = null; 
using (var stream = new StringReader(xml)) 
{ 
    doc = new XPathDocument(stream); // specify just path to file if you have such one 
} 
var nav = doc.CreateNavigator(); 
XPathNodeIterator nodes = (XPathNodeIterator)nav.Evaluate(xpath); 
foreach (XPathNavigator node in nodes) 
{ 
    Console.WriteLine(node.Value); 
} 
+0

+1; sometimes a native query is the answer. – 2011-04-25 20:42:20

+0

Your XPath query doesn't do the same thing, although I guess you're right that an XPath query would be shorter in this case. – 2011-04-25 20:48:15

+0

@Elian: It probably doesn't. I'm not very good at XPath, but I've shown the general idea. – abatishchev 2011-04-25 20:54:09