2010-03-05 58 views
+--------------------------+--------------------------------------------------------+
| Variable_name            | Value                                                  |
+--------------------------+--------------------------------------------------------+
| character_set_client     | utf8                                                   |
| character_set_connection | utf8                                                   |
| character_set_database   | utf8                                                   |
| character_set_filesystem | binary                                                 |
| character_set_results    | utf8                                                   |
| character_set_server     | utf8                                                   |
| character_set_system     | utf8                                                   |
| character_sets_dir       | /usr/local/mysql-5.1.41-osx10.5-x86_64/share/charsets/ |
+--------------------------+--------------------------------------------------------+
8 rows in set (0.00 sec)

mysql> select version(); 
+-----------+ 
| version() | 
+-----------+ 
| 5.1.41    |
+-----------+ 
1 row in set (0.00 sec) 

mysql> select char(0x00FC); 
+--------------+ 
| char(0x00FC) | 
+--------------+ 
| ?            |
+--------------+ 
1 row in set (0.00 sec)

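MySQL CHAR() function and UTF8 output?

I expected the actual UTF-8 character "ü" rather than "?". I also tried char(0x00FC using utf8), but no luck.

Using MySQL version 5.1.41.

I have searched all over Google and could not find anything on this. The MySQL documentation simply says that, as of MySQL 5.0.14, multi-byte output is expected for values greater than 255.

Thanks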


What character set is your shell using? – thetaiko 2010-03-05 03:26:21

Answer


You are confusing UTF-8 with Unicode.

0x00FC is the Unicode code point for ü:

mysql> select char(0x00FC using ucs2);
+-------------------------+
| char(0x00FC using ucs2) |
+-------------------------+
| ü                       |
+-------------------------+

In the UTF-8 encoding, the code point 0x00FC is represented by two bytes, 0xC3 0xBC:

mysql> select char(0xC3BC using utf8); 
+-------------------------+ 
| char(0xC3BC using utf8) | 
+-------------------------+ 
| ü                       |
+-------------------------+ 

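UTF-8 is merely one way of encoding Unicode characters in binary form. It is designed to save space, which is why ASCII characters take only one byte, while non-ASCII ISO-8859-1 characters such as ü take two bytes. Some other characters need three or four bytes, but they are less common.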
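To see the difference between a code point and its encoded bytes outside of MySQL, here is a minimal Python sketch. It is only an illustration, not part of the MySQL session above: the variable names are made up, and utf-16-be is used as a stand-in for ucs2 (the two agree for characters in the Basic Multilingual Plane such as ü). The last part is an assumption about where the "?" in the question comes from: a lone 0xFC byte is not valid UTF-8, so a lenient decoder substitutes a placeholder character.

```python
# Code point U+00FC ("ü") versus its encoded bytes.
code_point = 0x00FC
char = chr(code_point)

# UCS-2 / UTF-16 big-endian: the bytes mirror the code point itself.
ucs2_bytes = char.encode("utf-16-be")
# UTF-8: the same character becomes two different bytes.
utf8_bytes = char.encode("utf-8")

print(ucs2_bytes.hex())  # 00fc
print(utf8_bytes.hex())  # c3bc

# UTF-8 byte counts for an ASCII letter, ü, the euro sign, and an emoji.
samples = ["A", "\u00fc", "\u20ac", "\U0001F600"]
utf8_lengths = [len(s.encode("utf-8")) for s in samples]
print(utf8_lengths)  # [1, 2, 3, 4]

# Assumption: treating the raw byte 0xFC as UTF-8 text fails, so a
# lenient decoder replaces it with U+FFFD (often displayed as "?").
lone_byte_decoded = bytes([0xFC]).decode("utf-8", errors="replace")
print(lone_byte_decoded == "\ufffd")  # True
```

So passing 0xC3BC with "using utf8", or 0x00FC with "using ucs2", gives the bytes that match the declared character set, which is what the two queries above do.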


Thanks, very helpful. – jason 2010-03-05 05:10:17