2017-06-02

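XDocument.Validate does not catch all errors against the XSD

I have a very strange problem validating an XML document against a valid XSD, using C# XDocument.Validate or XmlReaderSettings with the required configuration. The problem: when the XML document contains errors, the validation process fails to catch all of them under certain conditions, and I cannot find a pattern to this anomaly.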

Here is my XSD:

<?xml version="1.0" encoding="utf-8"?>
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified"
           targetNamespace="http://www.somesite.com/somefolder/messages"
           xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Message">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Header">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="MessageId" type="xs:string" />
              <xs:element name="MessageSource" type="xs:string" />
            </xs:sequence>
          </xs:complexType>
        </xs:element>
        <xs:element name="Body">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="Abc001">
                <xs:complexType>
                  <xs:sequence>
                    <xs:element name="Abc002" type="xs:string" />
                    <xs:element name="Abc003" type="xs:string" minOccurs="0" />
                    <!--<xs:element name="Abc004" type="xs:string" />-->
                    <xs:element name="Abc004">
                      <xs:simpleType>
                        <xs:restriction base="xs:string">
                          <xs:maxLength value="200"/>
                        </xs:restriction>
                      </xs:simpleType>
                    </xs:element>
                    <xs:element name="Abc005">
                      <xs:complexType>
                        <xs:sequence>
                          <xs:element name="Abc006" type="xs:unsignedShort" />
                          <xs:element name="Abc007">
                            <xs:complexType>
                              <xs:sequence>
                                <xs:element name="Abc008" type="xs:string"/>
                                <xs:element name="Abc009" type="xs:string" minOccurs="0"/>
                                <xs:element name="Abc010" type="xs:string"/>
                              </xs:sequence>
                            </xs:complexType>
                          </xs:element>
                          <xs:element name="Abc011" type="xs:date" />
                          <xs:element name="Abc012">
                            <xs:complexType>
                              <xs:sequence>
                                <xs:element name="Abc013" type="xs:string" />
                                <xs:element name="Abc014" type="xs:string" />
                              </xs:sequence>
                            </xs:complexType>
                          </xs:element>
                        </xs:sequence>
                      </xs:complexType>
                    </xs:element>
                  </xs:sequence>
                </xs:complexType>
              </xs:element>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

Here is the XML document being validated against that XSD:

<?xml version="1.0" encoding="utf-8"?>
<Message xmlns="http://www.somesite.com/somefolder/messages">
  <Header>
    <MessageId>Lorem</MessageId>
    <MessageSource>Ipsum</MessageSource>
  </Header>
  <Body>
    <Abc001>
      <Abc002>dolor</Abc002>
      <Abc003>sit amet</Abc003>
      <Abc004>consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.</Abc004>
      <Abc005>
        <Abc006>1234</Abc006>
        <Abc007>
          <Abc008>Ut enim</Abc008>
          <Abc009>ad</Abc009>
          <Abc010>minim</Abc010>
        </Abc007>
        <Abc011>1982-10-17</Abc011>
        <Abc012>
          <Abc013>veniam</Abc013>
          <Abc014>nostrud</Abc014>
        </Abc012>
      </Abc005>
    </Abc001>
  </Body>
</Message>

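Now, when I introduce a few validation errors into the XML and validate it against the XSD, it does find all of them. Here is the erroneous XML (I have marked where the errors were introduced):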

<?xml version="1.0" encoding="utf-8"?>
<Message xmlns="http://www.somesite.com/somefolder/messages">
  <Header>
    <MessageId>Lorem</MessageId>
    <MessageSource>Ipsum</MessageSource>
  </Header>
  <Body>
    <Abc001>
      <Abc002>dolor</Abc002>
      <Abc003>sit amet</Abc003>

      <!--The value for Abc004 is increased beyond the allowed 200 characters-->
      <Abc004>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</Abc004>
      <Abc005>
        <Abc006>1234</Abc006>
        <Abc007>
          <Abc008>Ut enim</Abc008>
          <ABC009>AD</ABC009>

          <!--<Abc010>minim</Abc010> Required element removed-->
        </Abc007>

        <!--Date format below is wrong-->
        <Abc011>1982-10-37</Abc011>

        <Abc012>
          <Abc013>veniam</Abc013>
          <Abc014>nostrud</Abc014>
        </Abc012>
      </Abc005>

      <!--the element below is not allowed-->
      <Abc15>Not allowed</Abc15>
    </Abc001>
  </Body>
</Message>

Here is the response XML I get, showing all the errors:

<MessageResponse xmlns="http://www.somesite.com/somefolder/messages">
  <Result>false</Result>
  <Status>Failed</Status>
  <FaultCount>4</FaultCount>
  <Faults>
    <Fault>
      <FaultCode>ERR01</FaultCode>
      <FaultMessage>The 'http://www.somesite.com/somefolder/messages:Abc004' element is invalid - The value 'Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.' is invalid according to its datatype 'String' - The actual length is greater than the MaxLength value.</FaultMessage>
    </Fault>
    <Fault>
      <FaultCode>ERR02</FaultCode>
      <FaultMessage>The element 'Abc007' in namespace 'http://www.somesite.com/somefolder/messages' has invalid child element 'ABC009' in namespace 'http://www.somesite.com/somefolder/messages'. List of possible elements expected: 'Abc009, Abc010' in namespace 'http://www.somesite.com/somefolder/messages'.</FaultMessage>
    </Fault>
    <Fault>
      <FaultCode>ERR03</FaultCode>
      <FaultMessage>The 'http://www.somesite.com/somefolder/messages:Abc011' element is invalid - The value '1982-10-37' is invalid according to its datatype 'http://www.w3.org/2001/XMLSchema:date' - The string '1982-10-37' is not a valid Date value.</FaultMessage>
    </Fault>
    <Fault>
      <FaultCode>ERR04</FaultCode>
      <FaultMessage>The element 'Abc001' in namespace 'http://www.somesite.com/somefolder/messages' has invalid child element 'Abc15' in namespace 'http://www.somesite.com/somefolder/messages'.</FaultMessage>
    </Fault>
  </Faults>
</MessageResponse>

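Here is the strange part. When I introduce one more error at the start of the "Abc001" element while keeping all the other existing errors, the result is completely off. Here is the XML with the newly introduced error: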

<?xml version="1.0" encoding="utf-8"?>
<Message xmlns="http://www.somesite.com/somefolder/messages">
  <Header>
    <MessageId>Lorem</MessageId>
    <MessageSource>Ipsum</MessageSource>
  </Header>
  <Body>
    <Abc001>
      <!--newly introduced error - removed the following element-->
      <!--<Abc002>dolor</Abc002>-->
      <Abc003>sit amet</Abc003>
      <!--The value for Abc004 is increased beyond the allowed 200 characters-->
      <Abc004>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</Abc004>
      <Abc005>
        <Abc006>1234</Abc006>
        <Abc007>
          <Abc008>Ut enim</Abc008>
          <ABC009>AD</ABC009>
          <!--<Abc010>minim</Abc010>-->
        </Abc007>
        <Abc011>1982-10-37</Abc011>
        <Abc012>
          <Abc013>veniam</Abc013>
          <Abc014>nostrud</Abc014>
        </Abc012>
      </Abc005>
      <!--the element below is not allowed-->
      <Abc15>Not allowed</Abc15>
    </Abc001>
  </Body>
</Message>

Finally, here is the validation result:

<MessageResponse xmlns="http://www.somesite.com/somefolder/messages">
  <Result>false</Result>
  <Status>Failed</Status>
  <FaultCount>1</FaultCount>
  <Faults>
    <Fault>
      <FaultCode>ERR01</FaultCode>
      <FaultMessage>The element 'Abc001' in namespace 'http://www.somesite.com/somefolder/messages' has invalid child element 'Abc003' in namespace 'http://www.somesite.com/somefolder/messages'. List of possible elements expected: 'Abc002' in namespace 'http://www.somesite.com/somefolder/messages'.</FaultMessage>
    </Fault>
  </Faults>
</MessageResponse>

Here is the C# code I am using to validate:

// Namespaces used: System.Threading.Tasks, System.Web, System.Xml,
// System.Xml.Linq, System.Xml.Schema
public async Task<IMIDPreValidationAckMessage> ValidateXmlMessage(XDocument doc)
{
    var result = new PreValidationAckMessage();
    result.Result = true;
    result.Status = "Succeeded";

    var xsd = HttpContext.Current.Server.MapPath("~/message01.xsd");

    try
    {
        var uri = new System.Uri(xsd);
        var localPath = uri.LocalPath;
        var docNameSpace = doc.Root.Name.Namespace.NamespaceName;

        // Load the XSD under the namespace of the document's root element.
        XmlSchemaSet schemas = new XmlSchemaSet();
        schemas.Add(docNameSpace, localPath);

        XmlReaderSettings xrs = new XmlReaderSettings();
        xrs.ValidationType = ValidationType.Schema;
        xrs.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
        xrs.Schemas = schemas;

        result.XSDNamespace = doc.Root.GetDefaultNamespace().NamespaceName;
        var errCode = 1;

        // Record every reported validation event as a numbered fault (ERR01, ERR02, ...).
        xrs.ValidationEventHandler += (s, e) =>
        {
            result.Result = false;
            result.Status = "Failed";
            result.FaultCount++;
            result.Faults.Add(new Fault
            {
                FaultCode = "ERR" + errCode++.ToString().PadLeft(2, '0'),
                FaultMessage = e.Message
            });
        };

        // Reading the document to the end is what drives the validation.
        using (XmlReader xr = XmlReader.Create(doc.CreateReader(), xrs))
        {
            while (xr.Read()) { }
        }
    }
    catch (System.Exception)
    {
        // Anything other than a validation event (e.g. the XSD fails to load) ends up here.
        result.Result = false;
        result.Status = "Unknown Error";
    }
    return result;
}

Can anyone tell me what is wrong here?

Include all the classes in the last snippet so that it can be copy-pasted and run. – Evk

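@Evk: Thanks for the quick reply. The code I did not post consists of nested classes, and they really are not relevant here. If you simply add the error messages to a list of strings inside the validation event handler, that should be enough for testing. My code just collects the error messages and builds another XML document from them. That's all.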
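For anyone who wants to reproduce this without the nested classes, the suggestion in the comment above could be sketched roughly as follows. The names `XsdCheck` and `CollectMessages`, and the choice to pass the XSD as a string rather than a mapped file path, are assumptions made for illustration and are not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;
using System.Xml.Schema;

// Hypothetical helper: not from the original post.
public static class XsdCheck
{
    // Validates doc against the schema text and returns every message
    // the validating reader reports (errors and warnings alike).
    public static List<string> CollectMessages(XDocument doc, string xsdText)
    {
        var messages = new List<string>();

        var schemas = new XmlSchemaSet();
        // A null namespace tells the set to use the targetNamespace
        // declared inside the XSD itself.
        schemas.Add(null, XmlReader.Create(new StringReader(xsdText)));

        var settings = new XmlReaderSettings
        {
            ValidationType = ValidationType.Schema,
            Schemas = schemas
        };
        settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
        settings.ValidationEventHandler += (sender, e) =>
            messages.Add(e.Severity + ": " + e.Message);

        // Same pattern as in the question: wrap the XDocument reader in a
        // validating reader and read to the end.
        using (XmlReader source = doc.CreateReader())
        using (XmlReader validating = XmlReader.Create(source, settings))
        {
            while (validating.Read()) { }
        }
        return messages;
    }
}
```

Calling `XsdCheck.CollectMessages(XDocument.Load(xmlPath), File.ReadAllText(xsdPath))` (both paths being placeholders) and printing the returned list should be enough to compare the error counts for the different XML variants shown in the question.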

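Answer

It seems that XmlReader stops validating an element at the first error it encounters in it. Here is the description from the documentation of the old (obsolete) XmlValidatingReader.ValidationEventHandler: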

If an element reports a validation error, the rest of the content model for that element is not validated, however, its children are validated. The reader only reports the first error for a given element.

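The same seems to apply to the regular XmlReader (although its documentation does not mention this explicitly).

In the first example, the errors are either in innermost elements (such as an invalid text value) or in the last child of an element, so all of them are reported and none is skipped. In the last example, however, you introduced an error at the very start of the Abc001 element, so the rest of Abc001's content was skipped, along with all the errors in it.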

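Thanks again for the quick reply. What you say makes sense, although I was under the impression that validation runs through the entire element tree and reports all errors. I will wait a while to see whether anyone else offers feedback. If no other explanation comes in, I will mark yours as the accepted answer. Thanks.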

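You can check this by introducing errors in various parts of the tree. In general, validation does not stop at the first error overall, only at the first error within a given subtree (element). In your last example, if you had multiple 'Abc001' elements, it would skip only the first one (because it has an error at the start) and continue with the subsequent ones. If you introduce an error after the '' element, the contents of 'Abc005' will be analyzed up to that point. – Evk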
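One way to run the experiment described in this comment is a small toy schema with a repeatable element, validated with XDocument.Validate. This is only a sketch: the schema, the element names `Root`, `Item`, `A`, `B`, and the class `SubtreeExperiment` are all made up for illustration and have nothing to do with the asker's real XSD.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;
using System.Xml.Schema;

// Hypothetical experiment: not from the original thread.
public static class SubtreeExperiment
{
    // Toy schema: Root holds one or more Item elements, each requiring
    // child A (string) followed by child B (int).
    const string Xsd = @"
<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:element name='Root'>
    <xs:complexType>
      <xs:sequence>
        <xs:element name='Item' maxOccurs='unbounded'>
          <xs:complexType>
            <xs:sequence>
              <xs:element name='A' type='xs:string'/>
              <xs:element name='B' type='xs:int'/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>";

    // Returns the messages reported by XDocument.Validate for the given XML.
    public static List<string> Validate(string xml)
    {
        var schemas = new XmlSchemaSet();
        schemas.Add(null, XmlReader.Create(new StringReader(Xsd)));

        var messages = new List<string>();
        XDocument.Parse(xml).Validate(schemas, (sender, e) => messages.Add(e.Message));
        return messages;
    }

    // Prints whatever the validator reports for a document with two Items.
    public static void Demo()
    {
        // First Item: A is missing (error at the start) and B is not an int.
        // Second Item: structure is fine, but B is not an int.
        string xml = "<Root><Item><B>x</B></Item>"
                   + "<Item><A>ok</A><B>y</B></Item></Root>";
        foreach (string message in Validate(xml))
            Console.WriteLine(message);
    }
}
```

If the behavior described in the answer holds, the bad `B` value in the first `Item` would go unreported while the one in the second `Item` would still show up; running `SubtreeExperiment.Demo()` from your own entry point lets you confirm or refute that on your runtime.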

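I forgot to come back to this question and mark the answer. Sorry! Since no better feedback came in, I tested your suggestion and it looks like you are right. So I consider this the accepted answer. Again, sorry for the delay.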