Assigning Unicode to a wchar_t variable

How do I assign a Unicode character, such as an up arrow, to a wchar_t variable?

2016-07-15 pik_jan

wchar_t variable = L'\u1234'; (replace 1234 with the four hex digits of the Unicode code point you want).
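As a small illustration of the comment above, the following sketch stores U+2191 (upwards arrow) in a wchar_t. The names kUpArrow and kUpArrowText are made up for this example.

```cpp
// U+2191 UPWARDS ARROW lies in the Basic Multilingual Plane, so it fits in a
// single wchar_t even where wchar_t is only 16 bits wide (e.g. on Windows).
const wchar_t kUpArrow = L'\u2191';

// A wide string literal accepts the same universal-character-name escape.
const wchar_t kUpArrowText[] = L"\u2191";
```

The \u escape takes exactly four hex digits; code points above U+FFFF need the eight-digit \U form.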

@IgorTandetnik is absolutely right; this is the only reliable way. If you need to look up a Unicode code point value, just Google it. For example, here is a page of arrows: https://en.wikipedia.org/wiki/Template:Unicode_chart_Arrows

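Assigning a character directly does not work on Windows for code points U+10000 and above, because wchar_t is 16 bits there. In that case you need a surrogate pair, which occupies two wchar_t. – Sergio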
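To see the effect described here, one can let the compiler encode a code point above U+FFFF in a wide string literal and count the resulting units. This is only a sketch; kNerdFace and kNerdFaceUnits are illustrative names, and the result depends on the platform's wchar_t width.

```cpp
#include <cstddef>

// U+1F913 lies outside the Basic Multilingual Plane. In a wide string literal
// the compiler emits a surrogate pair when wchar_t is 16 bits wide, and a
// single unit when wchar_t is 32 bits wide.
const wchar_t kNerdFace[] = L"\U0001F913";

// Number of wchar_t units in the literal, not counting the terminating L'\0'.
const std::size_t kNerdFaceUnits = sizeof(kNerdFace) / sizeof(wchar_t) - 1;
```

With a 16-bit wchar_t (typical on Windows) kNerdFaceUnits should be 2; with a 32-bit wchar_t (typical on Linux) it should be 1.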

wchar_t may be 32 bits on Linux, but on Windows it is 16 bits, and the UTF-16LE encoding sometimes needs two wchar_t to store one Unicode code point.

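UTF-16LE and UTF-16BE are not fixed-width: they contain surrogate pairs, and a string must not be split in the middle of such a pair. On top of that, wchar_t is not portable, because its width differs between platforms.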
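The point about pairs that cannot be split can be made concrete with a check for safe cut positions. The sketch below uses std::u16string so that it behaves the same whatever the width of wchar_t; the function names are hypothetical.

```cpp
#include <cstddef>
#include <string>

// True for a high (leading) surrogate, 0xD800..0xDBFF.
bool is_high_surrogate(char16_t u) { return u >= 0xD800 && u <= 0xDBFF; }

// True for a low (trailing) surrogate, 0xDC00..0xDFFF.
bool is_low_surrogate(char16_t u) { return u >= 0xDC00 && u <= 0xDFFF; }

// Returns true if s may be cut just before index pos without separating the
// two halves of a surrogate pair.
bool is_safe_split(const std::u16string& s, std::size_t pos) {
    if (pos == 0 || pos >= s.size()) {
        return true;  // cutting at either end never splits a pair
    }
    return !(is_high_surrogate(s[pos - 1]) && is_low_surrogate(s[pos]));
}
```

For the string u"a\U0001F913b" (units a, D83E, DD13, b), position 2 is the only unsafe cut, because it falls between the two surrogates.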

It is therefore better to use UTF-8 and char.

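#include <string>

// Appends code point cp to s, encoded as UTF-8.
void append_utf8(std::string& s, unsigned int cp) {
    if (cp < 0x80 && cp != 0) {
        // Plain ASCII is a single byte. U+0000 is deliberately excluded: it
        // takes the multi-byte path and becomes C0 80 (Modified UTF-8), so the
        // string never contains a 0x00 terminator byte.
        s.append(1, (char) cp);
    } else {
        // Bytes are produced last-to-first. Valid code points need at most 4;
        // 7 is enough for any 32-bit input, so the buffer cannot overflow.
        char cpBytes[7];
        int bi = 0;
        int lastPrefix = 0xC0;  // lead-byte marker for a 2-byte sequence
        int lastMask = 0x1F;    // payload bits available in that lead byte
        for (;;) {
            // Emit the low 6 bits as a continuation byte (10xxxxxx).
            int b = 0x80 | (cp & 0x3F);
            cpBytes[bi] = (char) b;
            ++bi;
            cp >>= 6;
            if ((cp & ~lastMask) == 0) {
                // The remaining bits fit into the lead byte.
                cpBytes[bi] = (char) (lastPrefix | cp);
                ++bi;
                break;
            }
            // Another continuation byte is needed: the lead-byte marker gains
            // one bit and its payload loses one.
            lastPrefix = 0x80 | (lastPrefix >> 1);
            lastMask >>= 1;
        }
        // Append in the correct order, lead byte first.
        while (bi > 0) {
            --bi;
            s.append(1, cpBytes[bi]);
        }
    }
}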

std::string s;
append_utf8(s, 0x2191);  // For U+2191 up arrow.
append_utf8(s, 0x1F913); // For U+1F913 emoji nerd face.
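For comparison, here is a sketch of a different, branch-per-length encoder that sticks to standard UTF-8 (RFC 3629). It is not the answer's method: it writes U+0000 as a single 0x00 byte instead of the Modified UTF-8 form, and it rejects surrogates and values above U+10FFFF. The name append_utf8_strict is hypothetical.

```cpp
#include <string>

// Appends cp to s as standard UTF-8. Returns false and leaves s unchanged
// for surrogate code points and for values above U+10FFFF.
bool append_utf8_strict(std::string& s, char32_t cp) {
    if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogates are not characters
    if (cp > 0x10FFFF) return false;                 // beyond the Unicode range
    if (cp < 0x80) {                                 // 1 byte: 0xxxxxxx
        s += static_cast<char>(cp);
    } else if (cp < 0x800) {                         // 2 bytes: 110xxxxx 10xxxxxx
        s += static_cast<char>(0xC0 | (cp >> 6));
        s += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {                       // 3 bytes: 1110xxxx + 2 continuation
        s += static_cast<char>(0xE0 | (cp >> 12));
        s += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        s += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                                         // 4 bytes: 11110xxx + 3 continuation
        s += static_cast<char>(0xF0 | (cp >> 18));
        s += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        s += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        s += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return true;
}
```

Under these assumptions U+2191 comes out as the bytes E2 86 91 and U+1F913 as F0 9F A4 93.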

Similarly, for Windows wide characters (UTF-16):

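void append_wch(std::wstring& s, unsigned int cp) {
    if (cp < 0x10000) {
        // Code points up to U+FFFF fit in one 16-bit unit.
        s.append(1, (wchar_t) cp);
    } else {
        // Split the 20 bits above U+FFFF into a surrogate pair.
        cp -= 0x10000;
        unsigned int w = (cp >> 10) + 0xD800;  // high surrogate: upper 10 bits
        s.append(1, (wchar_t) w);
        w = (cp & 0x3FF) + 0xDC00;             // low surrogate: lower 10 bits
        s.append(1, (wchar_t) w);
    }
}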

(Mind you, my style is tainted by Java.)
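The surrogate-pair arithmetic can be checked on a concrete value. The sketch below uses char16_t and std::u16string so that it gives the same result whatever the width of wchar_t; append_utf16 is a hypothetical name, and the input is assumed to be a valid code point.

```cpp
#include <string>

// Appends cp to a UTF-16 string.
// Assumes cp is at most 0x10FFFF and is not itself a surrogate.
void append_utf16(std::u16string& s, char32_t cp) {
    if (cp < 0x10000) {
        s += static_cast<char16_t>(cp);                     // one unit is enough
    } else {
        cp -= 0x10000;                                      // 20 bits remain
        s += static_cast<char16_t>(0xD800 + (cp >> 10));    // upper 10 bits
        s += static_cast<char16_t>(0xDC00 + (cp & 0x3FF));  // lower 10 bits
    }
}
```

For U+1F913: 0x1F913 - 0x10000 = 0xF913; the upper 10 bits are 0x3E, giving the high surrogate 0xD83E; the lower 10 bits are 0x113, giving the low surrogate 0xDD13.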


2016-07-15 15:13:27

The Windows API is UTF-16, so using UTF-8 on that platform requires a lot of extra work. See http://utf8everywhere.org/

@MarkRansom Yes, Windows-only desktop development may well be all the OP needs. Thanks.
