Invalid byte 2 of 4-byte UTF-8 sequence when parsing a document

I am trying to parse a document from bytes as follows:

String result = /* some valid XML document */;
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder parser = factory.newDocumentBuilder();
try {
    // getBytes() with no argument encodes using the platform default charset
    Document document = parser.parse(new ByteArrayInputStream(result.getBytes()));
} catch (MalformedByteSequenceException e) {
    System.out.println("(MalformedByteSequenceException) " + e.getMessage());
}

A MalformedByteSequenceException is thrown, and the following is printed to the console:

"(MalformedByteSequenceException) Invalid byte 2 of 4-byte UTF-8 sequence."

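Strangely, the same code works in my local environment (Windows 10) but not in the remote environment (Windows Server 2012).

To reproduce the error locally, I tried changing the TomEE version from 1.7.4 to 1.7.1, changing the JRE from 1.7.0_80 to 1.7.0, and copying the entire TomEE folder from the remote system to my local machine. The error still occurs only in the remote environment.

Using result.getBytes(Charset.forName("UTF-8")) instead of result.getBytes() did not work either.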

Source

2017-08-16 Gustavo Arias Méndez

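Does your Unicode file declare an encoding in its prolog? – bmargulies

The XML comes from a web service response, not from a file. –

Same question. Is there an encoding=? – bmargulies

I found the solution. Set this at the start of setenv.bat: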

rem Set encoding 
set JAVA_OPTS=%JAVA_OPTS% -Dfile.encoding=UTF-8

I don't know the reasoning behind this, but it seems the JVM uses some odd Windows encoding instead of the UTF-8 you need.
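For what it's worth, here is a minimal sketch of how a single-byte default encoding could produce this exact message. It assumes the XML has no encoding declaration (so the parser falls back to UTF-8) and uses ISO-8859-1 as a stand-in for whatever Windows code page the server's JVM picks up. The class name, method name, and sample document are made up for illustration. In ISO-8859-1 the letter ñ is the single byte 0xF1, which a UTF-8 decoder reads as the start of a 4-byte sequence.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

public class EncodingMismatchDemo {

    // Returns true if the XML, encoded with the given charset, parses cleanly.
    static boolean parses(String xml, Charset charset) throws ParserConfigurationException {
        byte[] bytes = xml.getBytes(charset);
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes));
            return true;
        } catch (IOException | SAXException e) {
            // Depending on the parser implementation, the malformed-byte error
            // may surface as either exception type.
            System.out.println("Parse failed: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample; no XML declaration, so the parser assumes UTF-8.
        String xml = "<name>Espa\u00f1a</name>";

        // Encoded as UTF-8, the bytes match what the parser expects.
        System.out.println("UTF-8 bytes parse:      " + parses(xml, StandardCharsets.UTF_8));

        // Encoded as ISO-8859-1, the letter n-tilde becomes the lone byte 0xF1,
        // which is not valid UTF-8 when followed by a plain ASCII byte.
        System.out.println("ISO-8859-1 bytes parse: " + parses(xml, StandardCharsets.ISO_8859_1));
    }
}
```

If this is what is happening on the server, setting -Dfile.encoding=UTF-8 would change what result.getBytes() produces, which would be consistent with the fix above.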

Source

2017-08-16 19:43:08

Calling String.getBytes() is exactly the same as calling String.getBytes("<value of file.encoding>").

However, there is no need to call it at all. Call parse and pass it an InputSource that wraps a StringReader.
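A quick way to check this on each machine is to print the default charset and compare the two calls. This is only a sketch; the class name, helper method, and sample string are placeholders of mine.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class DefaultCharsetDemo {

    // True when getBytes() yields the same bytes as encoding with the default charset.
    static boolean matchesDefault(String s) {
        return Arrays.equals(s.getBytes(), s.getBytes(Charset.defaultCharset()));
    }

    public static void main(String[] args) {
        // These two values are what differ between a working and a failing machine.
        System.out.println("file.encoding  = " + System.getProperty("file.encoding"));
        System.out.println("defaultCharset = " + Charset.defaultCharset());

        String s = "Espa\u00f1a"; // hypothetical sample with one non-ASCII letter
        System.out.println("getBytes() matches default charset: " + matchesDefault(s));

        // For comparison: the bytes the XML parser expects when no encoding is declared.
        System.out.println("UTF-8 bytes: " + Arrays.toString(s.getBytes(StandardCharsets.UTF_8)));
    }
}
```

Running it in both environments should show whether the default charset is what differs.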
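A minimal sketch of that approach, assuming the XML is already held in a String. The class name, method name, and sample document are placeholders. DocumentBuilder.parse has no overload that takes a Reader directly, so the StringReader is wrapped in an org.xml.sax.InputSource.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class ParseFromString {

    // Parses XML held in a String without converting it to bytes first,
    // so the platform default charset never comes into play.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder parser = factory.newDocumentBuilder();
        return parser.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample; the escaped pair is U+1F600, which takes four bytes in UTF-8.
        String result = "<greeting>hi \uD83D\uDE00</greeting>";
        Document document = parse(result);
        System.out.println(document.getDocumentElement().getTextContent());
    }
}
```

Because the parser receives characters rather than bytes, no byte decoding happens at this step. If the String was already decoded incorrectly when the web service response was read, that has to be fixed where the response is read.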

Source

2017-08-16 20:26:46 bmargulies
