2016-09-26

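Setting the character encoding to read multiple character sets in the Cygwin shell

I have a file that contains text in a variety of languages (both ASCII and native characters). I would like my shell to handle any language: English, Arabic, Chinese, Japanese, and so on.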

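I have read the Cygwin page on 'Internationalization' and its list of supported character sets (below). I have also read the FAQ entry on weird characters: https://cygwin.com/faq-nochunks.html#faq.using.weirdchars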

Charset              Codepage
-------------------  -------------------------------------------
ASCII                20127 (US_ASCII)

CP437                437 (OEM United States)
CP720                720 (DOS Arabic)
CP737                737 (OEM Greek)
CP775                775 (OEM Baltic)
CP850                850 (OEM Latin 1, Western European)
CP852                852 (OEM Latin 2, Central European)
CP855                855 (OEM Cyrillic)
CP857                857 (OEM Turkish)
CP858                858 (OEM Latin 1 + Euro Symbol)
CP862                862 (OEM Hebrew)
CP866                866 (OEM Russian)
CP874                874 (ANSI/OEM Thai)
CP932                932 (Shift_JIS, not exactly identical to SJIS)
CP1125               1125 (OEM Ukraine)
CP1250               1250 (ANSI Central European)
CP1251               1251 (ANSI Cyrillic)
CP1252               1252 (ANSI Latin 1, Western European)
CP1253               1253 (ANSI Greek)
CP1254               1254 (ANSI Turkish)
CP1255               1255 (ANSI Hebrew)
CP1256               1256 (ANSI Arabic)
CP1257               1257 (ANSI Baltic)
CP1258               1258 (ANSI/OEM Vietnamese)

ISO-8859-1           28591 (ISO-8859-1)
ISO-8859-2           28592 (ISO-8859-2)
ISO-8859-3           28593 (ISO-8859-3)
ISO-8859-4           28594 (ISO-8859-4)
ISO-8859-5           28595 (ISO-8859-5)
ISO-8859-6           28596 (ISO-8859-6)
ISO-8859-7           28597 (ISO-8859-7)
ISO-8859-8           28598 (ISO-8859-8)
ISO-8859-9           28599 (ISO-8859-9)
ISO-8859-10          - (not available)
ISO-8859-11          - (not available)
ISO-8859-13          28603 (ISO-8859-13)
ISO-8859-14          - (not available)
ISO-8859-15          28605 (ISO-8859-15)
ISO-8859-16          - (not available)

Big5                 950 (ANSI/OEM Traditional Chinese)
EUCCN or euc-CN      936 (ANSI/OEM Simplified Chinese)
EUCJP or euc-JP      20932 (EUC Japanese)
EUCKR or euc-KR      949 (EUC Korean)
GB2312               936 (ANSI/OEM Simplified Chinese)
GBK                  936 (ANSI/OEM Simplified Chinese)
GEORGIAN-PS          - (not available)
KOI8-R               20866 (KOI8-R Russian Cyrillic)
KOI8-U               21866 (KOI8-U Ukrainian Cyrillic)
PT154                - (not available)
SJIS                 - (not available, almost, but not exactly CP932)
TIS620 or TIS-620    874 (ANSI/OEM Thai)

UTF-8 or utf8        65001 (UTF-8)

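My main question: is it possible to have the Cygwin shell read multiple languages at the same time? I have not been able to find much on this. Any pointers would be highly appreciated.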

By default Cygwin uses UTF-8 as its encoding. You can use iconv to convert from any code page to another. See 'man iconv' for details. – matzeri
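
As a rough illustration of the iconv conversion mentioned in this comment, here is a minimal sketch. It assumes the legacy text is in CP1251 (one of the code pages listed in the question); the file names `sample_cp1251.txt` and `sample_utf8.txt` are hypothetical and exist only for this example.

```shell
# Create a hypothetical sample file: the Russian word "Привет" encoded as CP1251.
# Octal escapes are used so the bytes are written exactly, whatever the locale is.
printf '\317\360\350\342\345\362\n' > sample_cp1251.txt

# Convert from the legacy code page (-f, "from") to UTF-8 (-t, "to").
iconv -f CP1251 -t UTF-8 sample_cp1251.txt > sample_utf8.txt

# In a UTF-8 terminal this should now display readable Cyrillic text.
cat sample_utf8.txt
```

The same pattern should apply to the other code pages in the list, provided the local iconv build supports them; `iconv -l` prints the names it accepts.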

Answer

What exactly do you mean?

With a recent Cygwin on a modern Windows (Windows 10), I can get Cygwin to display all kinds of characters. For example:

$ env LANG=ru_RU.UTF-8 cp --help 
$ env LANG=zh_CN.UTF-8 cp --help 
$ env LANG=ja_JP.UTF-8 cp --help 

These will display Russian, Chinese, and Japanese text respectively.

If that does not work, you can also do this from Windows PowerShell, with an extra iconv step to post-process the output:

PS C:\cygwin\bin> .\env.exe LANG=zh_CN.UTF-8 .\cp.exe --help | .\iconv.exe -f UTF-8 -t UTF-16
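
On the original question of reading several languages at once: since UTF-8 can encode all of these scripts, a single UTF-8 file can mix them, and no per-language switching should be needed as long as the terminal and locale are UTF-8. The following is only a sketch under that assumption; the file name `mixed.txt` is hypothetical, and whether every glyph actually renders depends on the terminal font.

```shell
# Write one file containing English, Russian, Chinese and Japanese lines.
# The strings are stored as UTF-8 because this script itself is UTF-8.
printf '%s\n' 'hello' 'привет' '你好' 'こんにちは' > mixed.txt

# Line counting works on bytes, so it does not depend on the locale.
wc -l < mixed.txt

# Converting UTF-8 to UTF-8 is a simple validity check: iconv exits with
# an error if the file contains byte sequences that are not valid UTF-8.
iconv -f UTF-8 -t UTF-8 mixed.txt > /dev/null && echo "mixed.txt is valid UTF-8"

# Display the file; in a UTF-8 terminal all four lines should be readable.
cat mixed.txt
```

If a file fails the validity check, it is probably in one of the legacy code pages and would need converting with iconv first.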