Correct XML serialization and deserialization of "mixed" types in .NET

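My current task involves writing a class library for processing HL7 CDA files.
These HL7 CDA files are XML files with a defined XML schema, so I used xsd.exe to generate .NET classes for XML serialization and deserialization.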

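The XML schema contains various types that carry the mixed="true" attribute, which specifies that an XML node of this type may contain plain text mixed with other XML nodes.
The relevant part of the XML schema for one of these types looks like this: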

<xs:complexType name="StrucDoc.Paragraph" mixed="true"> 
    <xs:sequence> 
     <xs:element name="caption" type="StrucDoc.Caption" minOccurs="0"/> 
     <xs:choice minOccurs="0" maxOccurs="unbounded"> 
      <xs:element name="br" type="StrucDoc.Br"/> 
      <xs:element name="sub" type="StrucDoc.Sub"/> 
      <xs:element name="sup" type="StrucDoc.Sup"/> 
      <!-- ...other possible nodes... --> 
     </xs:choice> 
    </xs:sequence> 
    <xs:attribute name="ID" type="xs:ID"/> 
    <!-- ...other attributes... --> 
</xs:complexType>

The generated code for this type looks like this:

/// <remarks/> 
[System.CodeDom.Compiler.GeneratedCodeAttribute("xsd", "2.0.50727.3038")] 
[System.SerializableAttribute()] 
[System.Diagnostics.DebuggerStepThroughAttribute()] 
[System.ComponentModel.DesignerCategoryAttribute("code")] 
[System.Xml.Serialization.XmlTypeAttribute(TypeName="StrucDoc.Paragraph", Namespace="urn:hl7-org:v3")] 
public partial class StrucDocParagraph { 

    private StrucDocCaption captionField; 

    private object[] itemsField; 

    private string[] textField; 

    private string idField; 

    // ...fields for other attributes... 

    /// <remarks/> 
    public StrucDocCaption caption { 
     get { 
      return this.captionField; 
     } 
     set { 
      this.captionField = value; 
     } 
    } 

    /// <remarks/> 
    [System.Xml.Serialization.XmlElementAttribute("br", typeof(StrucDocBr))] 
    [System.Xml.Serialization.XmlElementAttribute("sub", typeof(StrucDocSub))] 
    [System.Xml.Serialization.XmlElementAttribute("sup", typeof(StrucDocSup))] 
    // ...other possible nodes... 
    public object[] Items { 
     get { 
      return this.itemsField; 
     } 
     set { 
      this.itemsField = value; 
     } 
    } 

    /// <remarks/> 
    [System.Xml.Serialization.XmlTextAttribute()] 
    public string[] Text { 
     get { 
      return this.textField; 
     } 
     set { 
      this.textField = value; 
     } 
    } 

    /// <remarks/> 
    [System.Xml.Serialization.XmlAttributeAttribute(DataType="ID")] 
    public string ID { 
     get { 
      return this.idField; 
     } 
     set { 
      this.idField = value; 
     } 
    } 

    // ...properties for other attributes... 
}

If I deserialize an XML element whose paragraph node looks like this:

<paragraph>first line<br /><br />third line</paragraph>

the result is that the Items and Text arrays read like this:

itemsField = new object[] 
{ 
    new StrucDocBr(), 
    new StrucDocBr(), 
}; 
textField = new string[] 
{ 
    "first line", 
    "third line", 
};

From this there is no way to determine the exact order of the text and the other nodes.
If I serialize this again, the result looks like this:

<paragraph> 
    <br /> 
    <br />first linethird line 
</paragraph>

The default serializer simply writes the items first and then the text.
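The behaviour described above can be reproduced outside the HL7 classes with a cut-down model. The following is only a minimal sketch: the names `Br`, `SplitParagraph`, and `SplitDemo` are hypothetical stand-ins for the xsd.exe output, and the XML namespace is omitted for brevity.

```csharp
using System.IO;
using System.Xml.Serialization;

// Hypothetical stand-in for the generated StrucDocBr class.
public class Br { }

// Hypothetical cut-down version of the generated paragraph class:
// elements and text live in two separate arrays, as xsd.exe emits them.
[XmlRoot("paragraph")]
public class SplitParagraph
{
    [XmlElement("br", typeof(Br))]
    public object[] Items { get; set; }

    [XmlText]
    public string[] Text { get; set; }
}

public static class SplitDemo
{
    // Deserializes a paragraph; the relative order of text and
    // elements is not recoverable from the two resulting arrays.
    public static SplitParagraph Load(string xml)
    {
        var serializer = new XmlSerializer(typeof(SplitParagraph));
        using (var reader = new StringReader(xml))
        {
            return (SplitParagraph)serializer.Deserialize(reader);
        }
    }
}
```

Calling `SplitDemo.Load` with the paragraph from the question should yield two `Br` items and two text entries, with nothing that records how they were interleaved.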

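I tried implementing IXmlSerializable on the StrucDocParagraph class so that I could control the deserialization and serialization of the content, but it is rather complex because so many classes are involved, and I have not arrived at a solution yet because I don't know whether the effort would pay off.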

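Is there some kind of easy workaround for this problem, or is it even possible by doing custom serialization via IXmlSerializable? Or should I just use XmlDocument or XmlReader/XmlWriter to process these files?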
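For comparison, reading the document as a node tree does keep mixed content in document order. The sketch below uses LINQ to XML (`System.Xml.Linq`) rather than the XmlDocument or XmlReader mentioned in the question, purely because it is shorter; `MixedContentReader` and its output format are made up for illustration.

```csharp
using System.Collections.Generic;
using System.Xml.Linq;

public static class MixedContentReader
{
    // Walks the child nodes of the root element in document order and
    // records whether each one is a text node or an element.
    public static List<string> Describe(string xml)
    {
        var result = new List<string>();
        foreach (XNode node in XElement.Parse(xml).Nodes())
        {
            var text = node as XText;
            var element = node as XElement;
            if (text != null)
            {
                result.Add("text:" + text.Value);
            }
            else if (element != null)
            {
                result.Add("element:" + element.Name.LocalName);
            }
        }
        return result;
    }
}
```

The trade-off is that this route gives up the strongly typed classes generated from the schema.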


2010-04-02 Stefan

To solve this problem I had to modify the generated classes:

Move the XmlTextAttribute from the Text property to the Items property and add the parameter Type = typeof(string)
Remove the Text property
Remove the textField field

As a result, the generated code (modified) looks like this:

/// <remarks/> 
[System.CodeDom.Compiler.GeneratedCodeAttribute("xsd", "2.0.50727.3038")] 
[System.SerializableAttribute()] 
[System.Diagnostics.DebuggerStepThroughAttribute()] 
[System.ComponentModel.DesignerCategoryAttribute("code")] 
[System.Xml.Serialization.XmlTypeAttribute(TypeName="StrucDoc.Paragraph", Namespace="urn:hl7-org:v3")] 
public partial class StrucDocParagraph { 

    private StrucDocCaption captionField; 

    private object[] itemsField; 

    private string idField; 

    // ...fields for other attributes... 

    /// <remarks/> 
    public StrucDocCaption caption { 
     get { 
      return this.captionField; 
     } 
     set { 
      this.captionField = value; 
     } 
    } 

    /// <remarks/> 
    [System.Xml.Serialization.XmlElementAttribute("br", typeof(StrucDocBr))] 
    [System.Xml.Serialization.XmlElementAttribute("sub", typeof(StrucDocSub))] 
    [System.Xml.Serialization.XmlElementAttribute("sup", typeof(StrucDocSup))] 
    // ...other possible nodes... 
    [System.Xml.Serialization.XmlTextAttribute(typeof(string))] 
    public object[] Items { 
     get { 
      return this.itemsField; 
     } 
     set { 
      this.itemsField = value; 
     } 
    } 

    /// <remarks/> 
    [System.Xml.Serialization.XmlAttributeAttribute(DataType="ID")] 
    public string ID { 
     get { 
      return this.idField; 
     } 
     set { 
      this.idField = value; 
     } 
    } 

    // ...properties for other attributes... 
}

Now if I deserialize an XML element whose paragraph node looks like this:

<paragraph>first line<br /><br />third line</paragraph>

the result is that the Items array looks like this:

itemsField = new object[] 
{ 
    "first line", 
    new StrucDocBr(), 
    new StrucDocBr(), 
    "third line", 
};

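This is exactly what I need: the order of the items and their content are both correct.
And if I serialize this again, the result is again correct: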

<paragraph>first line<br /><br />third line</paragraph>

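What pointed me in the right direction was the answer by Guillaume; I also thought that it must be possible this way. And then there was this in the MSDN documentation for XmlTextAttribute: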

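You can apply the XmlTextAttribute to a field or property that returns an array of strings. You can also apply the attribute to an array of type Object but you must set the Type property to string. In such a case, any strings inserted into the array are serialized as XML text.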

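So serialization and deserialization now work correctly, but I don't know whether there are any other side effects. Maybe it is no longer possible to generate a schema from these classes with xsd.exe, but I don't need that anyway.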
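The modification described in this answer can be tried in isolation with a cut-down model. This is a minimal sketch, not the generated HL7 code: `Br`, `MixedParagraph`, and `MixedRoundTrip` are hypothetical names, and the namespace is omitted.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical stand-in for the generated StrucDocBr class.
public class Br { }

// Hypothetical cut-down paragraph class with the fix applied:
// text nodes and elements share one ordered Items array.
[XmlRoot("paragraph")]
public class MixedParagraph
{
    [XmlElement("br", typeof(Br))]
    [XmlText(typeof(string))]
    public object[] Items { get; set; }
}

public static class MixedRoundTrip
{
    private static readonly XmlSerializer Serializer =
        new XmlSerializer(typeof(MixedParagraph));

    public static MixedParagraph Load(string xml)
    {
        using (var reader = new StringReader(xml))
        {
            return (MixedParagraph)Serializer.Deserialize(reader);
        }
    }

    public static string Save(MixedParagraph paragraph)
    {
        // Omit the XML declaration and the default xsi/xsd namespace
        // declarations so the output is easy to compare with the input.
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };
        var noNamespaces = new XmlSerializerNamespaces();
        noNamespaces.Add("", "");

        using (var text = new StringWriter())
        {
            using (var writer = XmlWriter.Create(text, settings))
            {
                Serializer.Serialize(writer, paragraph, noNamespaces);
            }
            return text.ToString();
        }
    }
}
```

Passing the result of `MixedRoundTrip.Load` straight back into `MixedRoundTrip.Save` is a quick way to check that text and elements keep their original order.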


2010-04-06 10:03:32 Stefan

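This no longer seems to work (my System.Xml version is 4.0.0). The problem is that it tracks the names of the elements in the Items array through the ItemsElementName string array, and the entries must match one to one. This requirement causes an error if you are working with an object model that was populated by deserializing an XML document, because the XmlSerializer does not place a representative entry for text nodes in the ItemsElementName array. So a text node followed by an XML element followed by a text node results in 3 entries in the Items array but only 1 entry in ItemsElementName. – shahzbot 2017-10-30 20:25:19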

What about

itemsField = new object[] 
{ 
    "first line", 
    new StrucDocBr(), 
    new StrucDocBr(), 
    "third line", 
};

?


2010-04-02 15:25:34 Guillaume

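This causes an InvalidOperationException when I try to serialize the object (because of the strings in itemsField; the itemsField array may only contain objects of the types specified by the [XmlElement] attributes on the public property 'Items'). – Stefan 2010-04-06 06:37:54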

You may find help here: http://msdn.microsoft.com/en-us/library/kz8z99ds.aspx Any schema validation warnings? – Guillaume 2010-04-06 08:06:23

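I had already found that page during my search; it is about a different problem. The schema of my XML documents is correct; I validate them before deserializing and after serializing. But I have just found the answer to my question. Your suggestion about the itemsField array was close; it only needed one further modification in the generated code. I will post it in a few minutes. – Stefan 2010-04-06 09:09:38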

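I had the same problem and came across this solution of altering the .cs file generated by xsd.exe. Although it did work, I wasn't comfortable altering generated code, since I would need to remember to do it every time I regenerated the classes. It also led to some awkward code that had to test for and cast to XmlNode[] for the mailto elements.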

My solution was to rethink the xsd. I dropped the use of the mixed type and essentially defined my own mixed type.

I had this

XML: <text>some text <mailto>[email protected]</mailto>some more text</text> 

<xs:complexType name="text" mixed="true"> 
    <xs:sequence> 
     <xs:element minOccurs="0" maxOccurs="unbounded" name="mailto" type="xs:string" /> 
    </xs:sequence> 
    </xs:complexType>

and changed it to

XML: <mytext><text>some text </text><mailto>[email protected]</mailto><text>some more text</text></mytext> 

<xs:complexType name="mytext"> 
    <xs:sequence> 
     <xs:choice minOccurs="0" maxOccurs="unbounded"> 
     <xs:element name="text"> 
      <xs:complexType> 
      <xs:simpleContent> 
       <xs:extension base="xs:string" /> 
      </xs:simpleContent> 
      </xs:complexType> 
     </xs:element> 
     <xs:element name="mailto"> 
      <xs:complexType> 
      <xs:simpleContent> 
       <xs:extension base="xs:string" /> 
      </xs:simpleContent> 
      </xs:complexType> 
     </xs:element> 
     </xs:choice> 
    </xs:sequence> 
    </xs:complexType>

My generated code now gives me a class myText:

public partial class myText{ 

    private object[] itemsField; 

    /// <remarks/> 
    [System.Xml.Serialization.XmlElementAttribute("mailto", typeof(myTextTextMailto))] 
    [System.Xml.Serialization.XmlElementAttribute("text", typeof(myTextText))] 
    public object[] Items { 
     get { 
      return this.itemsField; 
     } 
     set { 
      this.itemsField = value; 
     } 
    } 
}

The order of the elements is now preserved on serialization/deserialization, but I do have to test for and cast to the types myTextTextMailto and myTextText.
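The test-and-cast step mentioned above might look roughly like the following. This is a sketch under assumptions: `TextItem` and `MailtoItem` are hypothetical stand-ins for the generated myTextText and myTextTextMailto classes (whose members are not shown in this answer), each assumed to expose its content through a `Value` property, and `ItemFormatter` is made up for illustration.

```csharp
using System;
using System.Text;
using System.Xml.Serialization;

// Hypothetical stand-ins for the generated element classes.
public class TextItem
{
    [XmlText]
    public string Value { get; set; }
}

public class MailtoItem
{
    [XmlText]
    public string Value { get; set; }
}

public static class ItemFormatter
{
    // Walks the ordered Items array and handles each entry by its runtime type.
    public static string Describe(object[] items)
    {
        var builder = new StringBuilder();
        foreach (object item in items ?? new object[0])
        {
            var text = item as TextItem;
            var mailto = item as MailtoItem;
            if (text != null)
            {
                builder.Append(text.Value);
            }
            else if (mailto != null)
            {
                // Mark mail addresses so they stand out in the flattened string.
                builder.Append('<').Append(mailto.Value).Append('>');
            }
            else
            {
                throw new InvalidOperationException("Unexpected item in Items array.");
            }
        }
        return builder.ToString();
    }
}
```

Throwing on an unknown type keeps the code honest if the schema later gains another element in the choice.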

Just thought I'd throw this in as an alternative approach that worked for me.


2011-03-24 14:43:22 Feenster

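I agree that your approach is the preferred solution for anyone who defines and uses their own XML schema. My problem was that I had no option to change the XSD, because it is controlled by a third party. So I had to modify the generated classes, which, for the reasons you state, should only be done when there is no other way. – Stefan 2011-03-25 08:04:46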
