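Java XMLReader with continued validation: undefined behavior
I am parsing and validating (against an XSD) long XML files, which are always well-formed, and I want every validation problem to be reported.
My parser reports each error and carries on, with one strange exception: when a parent node made up of several child nodes fails validation on any one child, all the children are still parsed correctly, but validation stops until the next parent node begins.
Consider this simple XSD: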
<?xml version="1.0" encoding="UTF-8" ?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="customerDataFile">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="customerList"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="customerList">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="customerData" minOccurs="1" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="customerData">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="NameField1"/>
        <xsd:element ref="NameField2"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element type="name_field" name="NameField1"/>
  <xsd:element type="name_field" name="NameField2"/>
  <xsd:simpleType name="name_field">
    <xsd:restriction base="xsd:string">
      <xsd:maxLength value="45"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
And these five example documents:
<?xml version="1.0" encoding="UTF-8"?>
<customerDataFile xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="customerDataFile.xsd">
  <customerList>
    <customerData>
      <NameField1>Somecompany</NameField1>
      <NameField2>Somefirstname</NameField2>
    </customerData>
    <customerData>
      <NameField1>Somecompany</NameField1>
      <NameField2>Somefirstname</NameField2>
    </customerData>
  </customerList>
</customerDataFile>

<?xml version="1.0" encoding="UTF-8"?>
<customerDataFile xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="customerDataFile.xsd">
  <customerList>
    <customerData>
      <Unknown1>Somecompany</Unknown1>
      <NameField2>Somefirstname</NameField2>
    </customerData>
    <customerData>
      <Unknown1>Somecompany</Unknown1>
      <NameField2>Somefirstname</NameField2>
    </customerData>
  </customerList>
</customerDataFile>

<?xml version="1.0" encoding="UTF-8"?>
<customerDataFile xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="customerDataFile.xsd">
  <customerList>
    <customerData>
      <NameField1>Somecompany</NameField1>
      <Unknown2>Somefirstname</Unknown2>
    </customerData>
    <customerData>
      <NameField1>Somecompany</NameField1>
      <Unknown2>Somefirstname</Unknown2>
    </customerData>
  </customerList>
</customerDataFile>

<?xml version="1.0" encoding="UTF-8"?>
<customerDataFile xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="customerDataFile.xsd">
  <customerList>
    <customerData>
      <Unknown1>Somecompany</Unknown1>
      <Unknown2>Somefirstname</Unknown2>
    </customerData>
    <customerData>
      <Unknown1>Somecompany</Unknown1>
      <Unknown2>Somefirstname</Unknown2>
    </customerData>
  </customerList>
</customerDataFile>

<?xml version="1.0" encoding="UTF-8"?>
<customerDataFile xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="customerDataFile.xsd">
  <customerList>
    <customerData>
      <Unknown2>Somefirstname</Unknown2>
    </customerData>
    <customerData>
      <Unknown1>Somecompany</Unknown1>
    </customerData>
  </customerList>
</customerDataFile>
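The output for the five documents, in order, is:
- No errors - correct
- 2 errors (one per customerData) - correct
- 2 errors (one per customerData) - correct
- 2 errors (only one per customerData, although each contains two invalid children) - incorrect
- 2 errors (even though the missing element is a serious problem in its own right) - incorrect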
This is absurd; I cannot find any reference to similar behavior anywhere (and it looks like a major issue).
The relevant code is:
public void process(String schemaLocation, String xmlLocation) throws Exception {
    // Compile the XSD and attach it to the parser factory so parsing also validates.
    Source source = new StreamSource(new File(schemaLocation));
    SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
    Schema schema = schemaFactory.newSchema(source);
    SAXParserFactory spf = SAXParserFactory.newInstance();
    spf.setSchema(schema);
    spf.setNamespaceAware(true);
    SAXParser saxParser = spf.newSAXParser();
    // SAXParser has no handler setters; they live on the underlying XMLReader.
    XMLReader xmlReader = saxParser.getXMLReader();
    CustomerHandler handler = new CustomerHandler();
    CustomerErrorHandler errorHandler = new CustomerErrorHandler();
    InputStream inputStream = new FileInputStream(new File(xmlLocation));
    Reader reader = new InputStreamReader(inputStream, "UTF-8");
    InputSource is = new InputSource(reader);
    is.setEncoding("UTF-8");
    xmlReader.setContentHandler(handler);
    xmlReader.setErrorHandler(errorHandler);
    xmlReader.parse(is);
}
where CustomerErrorHandler is simply:
public class CustomerErrorHandler implements ErrorHandler {
    @Override
    public void error(SAXParseException arg0) throws SAXException {
        System.out.println(arg0.getMessage());
    }
    @Override
    public void fatalError(SAXParseException arg0) throws SAXException {
        System.out.println(arg0.getMessage());
    }
    @Override
    public void warning(SAXParseException arg0) throws SAXException {
        System.out.println(arg0.getMessage());
    }
}
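For anyone who wants to reproduce this in isolation, here is a minimal sketch that validates the fourth example from in-memory strings and collects every reported problem together with its line and column. The class name `ValidationRepro`, its constants and the `validate` helper are made up for illustration and are not part of my real code. The schema is the one above with whitespace removed, and the `xsi:noNamespaceSchemaLocation` attribute is dropped because the schema is already supplied through `SAXParserFactory.setSchema`.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class ValidationRepro {
    // Same schema as in the question, collapsed onto a few lines.
    public static final String XSD =
        "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
      + "<xsd:element name='customerDataFile'><xsd:complexType><xsd:sequence>"
      + "<xsd:element ref='customerList'/>"
      + "</xsd:sequence></xsd:complexType></xsd:element>"
      + "<xsd:element name='customerList'><xsd:complexType><xsd:sequence>"
      + "<xsd:element ref='customerData' minOccurs='1' maxOccurs='unbounded'/>"
      + "</xsd:sequence></xsd:complexType></xsd:element>"
      + "<xsd:element name='customerData'><xsd:complexType><xsd:sequence>"
      + "<xsd:element ref='NameField1'/><xsd:element ref='NameField2'/>"
      + "</xsd:sequence></xsd:complexType></xsd:element>"
      + "<xsd:element type='name_field' name='NameField1'/>"
      + "<xsd:element type='name_field' name='NameField2'/>"
      + "<xsd:simpleType name='name_field'><xsd:restriction base='xsd:string'>"
      + "<xsd:maxLength value='45'/></xsd:restriction></xsd:simpleType>"
      + "</xsd:schema>";

    // Fourth example: both children of every customerData are unknown elements.
    public static final String EXAMPLE_4 =
        "<customerDataFile><customerList>\n"
      + "<customerData><Unknown1>Somecompany</Unknown1>"
      + "<Unknown2>Somefirstname</Unknown2></customerData>\n"
      + "<customerData><Unknown1>Somecompany</Unknown1>"
      + "<Unknown2>Somefirstname</Unknown2></customerData>\n"
      + "</customerList></customerDataFile>";

    /** Parses xml with validation against xsd and returns every reported problem. */
    public static List<String> validate(String xsd, String xml) {
        final List<String> problems = new ArrayList<>();
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(xsd)));
            SAXParserFactory spf = SAXParserFactory.newInstance();
            spf.setNamespaceAware(true);
            spf.setSchema(schema);
            XMLReader reader = spf.newSAXParser().getXMLReader();
            reader.setContentHandler(new DefaultHandler());
            reader.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { record(e); }
                // Not rethrowing here is what lets the parser continue after an error.
                @Override public void error(SAXParseException e) { record(e); }
                // Fatal errors mean the input is not well-formed, so give up.
                @Override public void fatalError(SAXParseException e) throws SAXParseException {
                    throw e;
                }
                private void record(SAXParseException e) {
                    problems.add(e.getLineNumber() + ":" + e.getColumnNumber()
                            + " " + e.getMessage());
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return problems;
    }

    public static void main(String[] args) {
        List<String> problems = validate(XSD, EXAMPLE_4);
        System.out.println(problems.size() + " problem(s) reported:");
        for (String p : problems) {
            System.out.println("  " + p);
        }
    }
}
```

Running `main` prints one line per reported problem, so the count can be compared directly against the number of invalid children in the document.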
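Does anyone have any pointers as to why this happens and what I am doing wrong? Most importantly, if this approach cannot work, how can I perform a complete validation of an XML document correctly?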
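Which parser are you using, Xerces? Validator behavior is parser-specific, especially when it comes to error handling and extended features. – Ironluca
Yes, Xerces 2, which is the default. –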
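To answer the "which parser" question with certainty rather than by assumption, the JAXP factory can be asked which implementation classes it actually resolves to. The following is a small sketch; the class name `WhichParser` and the helper `readerClassName` are hypothetical. The Xerces-derived parser bundled with the JDK usually reports class names under `com.sun.org.apache.xerces.internal`, while a standalone Xerces-J jar on the classpath reports names under `org.apache.xerces`.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLReader;

public class WhichParser {
    /** Returns the concrete class name of the XMLReader that JAXP hands out. */
    public static String readerClassName() {
        try {
            SAXParserFactory spf = SAXParserFactory.newInstance();
            XMLReader reader = spf.newSAXParser().getXMLReader();
            return reader.getClass().getName();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("SAXParserFactory: "
                + SAXParserFactory.newInstance().getClass().getName());
        System.out.println("XMLReader:        " + readerClassName());
    }
}
```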