Characters are converted to question marks on retrieval

I have a strange problem. I am fetching messages from Amazon AWS SQS. Before putting a message on the queue, I compress and encode it, for example:

String responseMessageBodyOriginal = gson.toJson(responseData); 
String responseMessageBodyCompressed = compressToBase64String(responseMessageBodyOriginal); 
AmazonSqsHelper.sendMessage(responseMessageBodyCompressed, queue, null);

The compression and encoding function looks like this:

public static String compressToBase64String(String data) throws IOException { 
    ByteArrayOutputStream bos = new ByteArrayOutputStream(data.length()); 
    GZIPOutputStream gzip = new GZIPOutputStream(bos); 
    gzip.write(data.getBytes()); 
    gzip.close(); 
    byte[] compressedBytes = bos.toByteArray(); 
    bos.close(); 
    return new String(Base64.encodeBase64(compressedBytes)); 
}

On the other side, when receiving the message, this is the code:

List<Message> sqsMessageList = AmazonSqsHelper.receiveMessages(queueUrl, max_message_read_count, 
        default_visibility_timeout); 
int num_messages = sqsMessageList.size(); 
if (num_messages > 0) { 
    for (Message m : sqsMessageList) { 
     String responseMessageBodyCompressed = m.getBody(); 
     String responseMessageBodyOriginal = decompressFromBase64String(responseMessageBodyCompressed); 
    } 
}

And the function used for decoding and decompressing is this:

public static String decompressFromBase64String(String compressedString) throws IOException { 
    byte[] compressedBytes = Base64.decodeBase64(compressedString); 
    ByteArrayInputStream bis = new ByteArrayInputStream(compressedBytes); 
    GZIPInputStream gis = new GZIPInputStream(bis); 
    BufferedReader br = new BufferedReader(new InputStreamReader(gis, "UTF-8")); 
    StringBuilder sb = new StringBuilder(); 
    String line; 
    while ((line = br.readLine()) != null) { 
     sb.append(line); 
    } 
    br.close(); 
    gis.close(); 
    bis.close(); 
    return sb.toString(); 
}

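The problem is that sometimes, if I pass characters like “Â®”, they come back as ???? when I print the message after decoding.

I can't figure out why the encoding and decoding behave so strangely. Any help would be appreciated.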

Asked 2016-11-27 by HackAround

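Do you understand the difference between ASCII and Unicode, and how Unicode is encoded as UTF-8? If not, check Wikipedia. Somewhere in your processing flow there is a mismatch: the data is encoded one way but decoded another way.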

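What is 'â®'? Is that two Unicode code points ('0xe2 0xae')? Or is it a UTF-8 encoding? If the latter, it is invalid, because '0xe2' marks the start of a 3-byte encoding. Where does this data come from? Without knowing what you think these characters represent, it isn't really possible to determine where your problem lies.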

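@JimGarrison These are URLs. The URLs contain these characters. I know the difference :) and I made sure that encoding and decoding are done the same way. For example: https://www.amazon.com/beavers-officially-licensed-university-keyscaperâ®/dp/b00ikp2ccq?psc=1 – HackAround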

The problem is that you encode using the platform's default charset (data.getBytes()) while decoding using UTF-8.
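
To illustrate the mismatch, here is a minimal sketch. The class and method names are hypothetical, and the sample string only borrows the end of the URL from the question. It encodes with an explicitly chosen charset (standing in for whatever the platform default happens to be) and always decodes as UTF-8, as decompressFromBase64String does:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original code.
class CharsetMismatchDemo {
    // Encode with the given charset, then decode as UTF-8. The charset argument
    // plays the role of the platform default that data.getBytes() would pick.
    static String encodeThenDecodeUtf8(String text, Charset encodeCharset) {
        byte[] bytes = text.getBytes(encodeCharset);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "keyscaper\u00AE"; // ends with U+00AE, the registered sign

        // US-ASCII cannot represent U+00AE, so getBytes substitutes the byte for '?'.
        System.out.println(encodeThenDecodeUtf8(original, StandardCharsets.US_ASCII));

        // ISO-8859-1 writes the single byte 0xAE, which is malformed as UTF-8
        // and decodes to the replacement character U+FFFD.
        System.out.println(encodeThenDecodeUtf8(original, StandardCharsets.ISO_8859_1));

        // Same charset on both sides: the text survives unchanged.
        System.out.println(encodeThenDecodeUtf8(original, StandardCharsets.UTF_8));
    }
}
```

Which of the first two cases applies depends on the default charset of the sending machine, which the question does not state.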

In compressToBase64String, change data.getBytes() to data.getBytes(StandardCharsets.UTF_8).
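
A self-contained sketch of the round trip with that change applied might look like the following. It is not the asker's exact code: the class name is hypothetical, java.util.Base64 (Java 8+) is swapped in for the Base64 helper used in the question so the example needs no extra library, and the decompressor copies raw bytes instead of reading line by line.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical class name; the method names follow the question.
class Utf8GzipBase64 {
    static String compressToBase64String(String data) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(bos)) {
            // Explicit charset: no dependence on the platform default.
            gzip.write(data.getBytes(StandardCharsets.UTF_8));
        }
        return Base64.getEncoder().encodeToString(bos.toByteArray());
    }

    static String decompressFromBase64String(String compressedString) throws IOException {
        byte[] compressedBytes = Base64.getDecoder().decode(compressedString);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(compressedBytes))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = gis.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
        // Decode with the same charset that was used for encoding.
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        String original = "keyscaper\u00AE";
        String roundTripped = decompressFromBase64String(compressToBase64String(original));
        System.out.println(original.equals(roundTripped));
    }
}
```

Copying bytes and decoding once at the end also keeps line breaks intact, whereas the readLine() loop in the question silently drops them.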

Answered 2016-11-28 07:56:52
