2015-07-04

Wrong endianness with wstring_convert

I recently discovered the <codecvt> header, so I wanted to use it to convert between UTF-8 and UTF-16.

I'm using wstring_convert with the codecvt_utf8_utf16 facet from C++11. The problem is that when I convert a UTF-16 string to UTF-8 and then back to UTF-16, the byte order of each code unit changes.

For this code:

#include <codecvt>
#include <string>
#include <locale>
#include <iostream>

using namespace std;

int main(int argc, char const *argv[])
{
    wstring_convert<codecvt_utf8_utf16<char16_t>, char16_t> convert;

    u16string utf16 = u"\ub098\ub294\ud0dc\uc624";

    // Print the original UTF-16 code units in hex.
    cout << hex << "UTF-16\n\n";
    for (char16_t c : utf16)
        cout << "[" << c << "] ";

    string utf8 = convert.to_bytes(utf16);

    // Print the UTF-8 bytes produced by the conversion.
    cout << "\n\nUTF-16 to UTF-8\n\n";
    for (unsigned char c : utf8)
        cout << "[" << int(c) << "] ";
    cout << "\n\nConverting back to UTF-16\n\n";

    utf16 = convert.from_bytes(utf8);

    // Print the code units after the round trip.
    for (char16_t c : utf16)
        cout << "[" << c << "] ";
    cout << endl;
}

I get this output:

UTF-16

[b098] [b294] [d0dc] [c624]

UTF-16 to UTF-8

[eb] [82] [98] [eb] [8a] [94] [ed] [83] [9c] [ec] [98] [a4]

Converting back to UTF-16

[98b0] [94b2] [dcd0] [24c6]

When I change the third template parameter of codecvt_utf8_utf16 (the facet passed to wstring_convert) to std::little_endian, the bytes are reversed.
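
As a cross-check that does not depend on <codecvt>, here is a minimal sketch that encodes and decodes the same characters by hand. The helper names bmp_to_utf8 and utf8_to_bmp are made up for illustration, and the code assumes well-formed input with no surrogate pairs (which holds for the string above).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: encode UTF-16 code units from the Basic Multilingual
// Plane as UTF-8. Assumption: the input contains no surrogate pairs.
std::string bmp_to_utf8(const std::u16string& in)
{
    std::string out;
    for (char16_t c : in) {
        if (c < 0x80) {
            // 1 byte: 0xxxxxxx
            out += static_cast<char>(c);
        } else if (c < 0x800) {
            // 2 bytes: 110xxxxx 10xxxxxx
            out += static_cast<char>(0xC0 | (c >> 6));
            out += static_cast<char>(0x80 | (c & 0x3F));
        } else {
            // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            out += static_cast<char>(0xE0 | (c >> 12));
            out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return out;
}

// Hypothetical helper: decode UTF-8 back to UTF-16 code units.
// Assumption: well-formed input using at most 3 bytes per character.
std::u16string utf8_to_bmp(const std::string& in)
{
    std::u16string out;
    for (std::size_t i = 0; i < in.size();) {
        unsigned char b0 = static_cast<unsigned char>(in[i]);
        if (b0 < 0x80) {
            out += static_cast<char16_t>(b0);
            i += 1;
        } else if ((b0 & 0xE0) == 0xC0) {
            unsigned char b1 = static_cast<unsigned char>(in[i + 1]);
            out += static_cast<char16_t>(((b0 & 0x1F) << 6) | (b1 & 0x3F));
            i += 2;
        } else {
            unsigned char b1 = static_cast<unsigned char>(in[i + 1]);
            unsigned char b2 = static_cast<unsigned char>(in[i + 2]);
            out += static_cast<char16_t>(((b0 & 0x0F) << 12) |
                                         ((b1 & 0x3F) << 6) |
                                         (b2 & 0x3F));
            i += 3;
        }
    }
    return out;
}
```

For the string in the question, bmp_to_utf8 yields the same twelve bytes shown in the output, and utf8_to_bmp gives back b098, b294, d0dc, c624. So the UTF-8 step looks right, and only the conversion back to UTF-16 comes out byte-swapped.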

What am I missing?

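Cannot reproduce: http://coliru.stacked-crooked.com/a/5599be701f3ebb32 – Cubbi

Thanks for the reply. That's strange; I'm using gcc 5. I'll try compiling it from source tonight and see whether I get the same behavior. – Dante

Switching the compiler to gcc doesn't reproduce the problem on coliru either: http://coliru.stacked-crooked.com/a/cbac3e56d8f55c30 – Cubbi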

Answers