2010-12-02 73 views
2

我打算实现一个方法来比较两个大的XML文件(但其他每个都少于10,000个元素行)。如何高效地逐项比较两个大的XML文件?

下面的方法工作,但是当文件超过100行时它不好。它开始非常缓慢。我如何找到更有效的解决方案。也许需要高C#编程设计或更好的C#算法#& XML处理。

感谢您的意见提前。

//Remove the item which not in Event Xml and ConfAddition Xml files 
XmlDocument doc = new XmlDocument(); 
doc.Load(xmlFile_AlarmSettingUp); 

bool isNewAlid_Event = false; 
bool isNewAlid_ConfAddition = false; 
int alid = 0; 

XmlNodeList xnList = doc.SelectNodes("/Equipment/AlarmSettingUp/EnabledALIDs/ALID"); 

foreach (XmlNode xn in xnList) 
{       
    XmlAttributeCollection attCol = xn.Attributes; 

    for (int i = 0; i < attCol.Count; ++i) 
    { 
     if (attCol[i].Name == "alid") 
     { 
      alid = int.Parse(attCol[i].Value.ToString()); 
      break; 
     } 
    } 

    //alid = int.Parse(attCol[1].Value.ToString()); 

    XmlDocument docEvent_Alarm = new XmlDocument(); 
    docEvent_Alarm.Load(xmlFile_Event); 
    XmlNodeList xnListEvent_Alarm = docEvent_Alarm.SelectNodes("/Equipment/Alarms/ALID"); 
    foreach (XmlNode xnEvent_Alarm in xnListEvent_Alarm) 
    { 
     XmlAttributeCollection attColEvent_Alarm = xnEvent_Alarm.Attributes; 
     int alidEvent_Alarm = int.Parse(attColEvent_Alarm[1].Value.ToString()); 
     if (alid == alidEvent_Alarm) 
     { 
      isNewAlid_Event = false; 
      break; 
     } 
     else 
     { 
      isNewAlid_Event = true; 
      //break; 
     } 
    } 

    XmlDocument docConfAddition_Alarm = new XmlDocument(); 
    docConfAddition_Alarm.Load(xmlFile_ConfAddition); 
    XmlNodeList xnListConfAddition_Alarm = docConfAddition_Alarm.SelectNodes("/Equipment/Alarms/ALID"); 
    foreach (XmlNode xnConfAddition_Alarm in xnListConfAddition_Alarm) 
    { 
     XmlAttributeCollection attColConfAddition_Alarm = xnConfAddition_Alarm.Attributes; 
     int alidConfAddition_Alarm = int.Parse(attColConfAddition_Alarm[1].Value.ToString()); 
     if (alid == alidConfAddition_Alarm) 
     { 
      isNewAlid_ConfAddition = false; 
      break; 
     } 
     else 
     { 
      isNewAlid_ConfAddition = true; 
      //break; 
     } 
    }       

    if (isNewAlid_Event && isNewAlid_ConfAddition) 
    { 
     // Store the root node of the destination document into an XmlNode 
     XmlNode rootDest = doc.SelectSingleNode("/Equipment/AlarmSettingUp/EnabledALIDs"); 
     rootDest.RemoveChild(xn); 
    } 

} 
doc.Save(xmlFile_AlarmSettingUp); 

我的XML文件是这样的。这两个XML文件是相同的样式。除了有些时候,其中一个可能会被我的应用程序修改。这就是为什么我需要比较他们,如果修改。

<?xml version="1.0" encoding="utf-8"?> 
<Equipment xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> 
    <Licence LicenseId="" LicensePath="" /> 
    <!--Alarm Setting Up XML File--> 
    <AlarmSettingUp> 
    <EnabledALIDs> 
     <ALID logicalName="Misc_EV_RM_STATION_ALREADY_RESERVED" alid="536870915" alcd="7" altx="Misc_Station 1 UnitName 2 SlotId already reserved" ceon="Misc_AlarmOn_EV_RM_STATION_ALREADY_RESERVED" ceoff="Misc_AlarmOff_EV_RM_STATION_ALREADY_RESERVED" /> 
     <ALID logicalName="Misc_EV_RM_SEQ_READ_ERROR" alid="536870916" alcd="7" altx="Misc_Sequence ID 1 d step 2 d read error for wafer in 3 UnitName 4 SlotId" ceon="Misc_AlarmOn_EV_RM_SEQ_READ_ERROR" ceoff="Misc_AlarmOff_EV_RM_SEQ_READ_ERROR" /> 
... 
... 
... 
    </EnabledALIDs> 
    </AlarmSettingUp> 
</Equipment> 
+1

呃,当您尝试对性能进行基准测试时,嵌套迭代不适用。清理你的代码。 – 2010-12-02 10:02:41

回答

1

的“ALID/@ alid”似乎是你的关键,所以我会做(foreach (XmlNode xn in xnList)之前)的第一件事就是建立一个字典(假设这是唯一的)在docEvent_Alarm.SelectNodes("/Equipment/Alarms/ALID") @alid值 - 那么你可以在没有O(n * m)性能的情况下完成大部分工作 - 它会更多O(n + m)(这是一个很大的差别)。

var lookup = new Dictionary<string, XmlElement>(); 
foreach(XmlElement el in docEvent_Alarm.SelectNodes("/Equipment/Alarms/ALID")) { 
    lookup.Add(el.GetAttribute("alid"), el); 
} 

那么你可以使用:

XmlElement other; 
if(lookup.TryGetValue(otherKey, out other)) { 
    // exists; element now in "other" 
} else { 
    // doesn't exist 
} 
+0

`var`来自.net 2.0?我必须基于.net 2.0。 – 2010-12-02 10:04:53

1

的XmlDocument和相关的类(XmlNode的,...)不是非常快的XML处理。改为尝试使用XmlTextReader。

此外,您还可以拨打docEvent_Alarm.Load(xmlFile_Event);docConfAddition_Alarm.Load(xmlFile_ConfAddition);亲代循环的每次迭代 - 这并不好。如果您的xmlFile_EventxmlFile_ConfAddition在所有处理过程中都是永久性的 - 最好在主循环之前初始化它。