2017-09-13 110 views
0

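Convert text to hex on a Mac

I am trying to convert text to hex characters in order to build a dmg on a Mac.
The problem I keep running into is that, for characters above 127, a given hex value does not point to the same character on Mac and on Windows. The basic JavaScript functions also seem to give only the "Windows" translation.
I need the "Mac" translation to hex...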

So far, I am doing this:

const fileData = await parseJson(readFile(item.file, "utf-8")) 
const buttonsStr = labelToHex(fileData.lang) 

function labelToHex(label: string) { 
    return hexEncode(label).toString().toUpperCase() 
} 

function hexEncode(str: string) { 
    let i 
    let result = "" 

    for (i = 0; i < str.length; i++) {
        result += unicodeToHex(str.charCodeAt(i))
    }

    return result 
} 

function unicodeToHex(unicode: number) { 
    const hex = unicode.toString(16) 
    return ("0" + hex).slice(-2) 
} 

If I pass: Françaiséàè
I get: 46 72 61 6E E7 61 69 73 E9 E0 E8
But when I read it back, I get: FranÀaisè‡Ë
I was expecting to get: 46 72 61 6E 8D 61 69 73 8E 88 8F
so that reading it back gives: 46 72 61 6E E7 61 69 73 E9 E0 E8 (that is, Françaiséàè)

This corresponds to these character tables:
https://academic.evergreen.edu/projects/biophysics/technotes/program/ascii_ext-mac.htm https://academic.evergreen.edu/projects/biophysics/technotes/program/ascii_ext-pc.htm

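However, I have not been able to find an npm package that converts to hex depending on the OS. Or is there some obscure JS function that I still have not discovered?
I am running out of ideas and am about to simply do this: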

function unicodeToHex(unicode: number) { 
    if (unicode < 128) {
        const hex = unicode.toString(16)
        return ("0" + hex).slice(-2)
    }

    if (unicode === 233) { return "8E" }//é 
    if (unicode === 224) { return "88" }//à 
    if (unicode === 232) { return "8F" }//è 

    return "3F" //? 
} 

but I would really like to avoid that...
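To make the behaviour described in the question concrete, here is a minimal sketch that reuses the question's own trimming logic. `charCodeAt` returns UTF-16 code units, which for é, à, è and ç happen to equal the Latin-1 byte values, hence the "Windows-looking" output. The sketch also shows a side effect that is easy to miss: `slice(-2)` silently drops the high byte of any code unit above 0xFF. The sample strings are only illustrative.

```typescript
// Same trimming logic as unicodeToHex in the question.
function unicodeToHex(unicode: number): string {
    const hex = unicode.toString(16)
    return ("0" + hex).slice(-2)
}

// Hex-encode every UTF-16 code unit of the string, as the question does.
function hexEncode(str: string): string {
    let result = ""
    for (let i = 0; i < str.length; i++) {
        result += unicodeToHex(str.charCodeAt(i))
    }
    return result.toUpperCase()
}

// "Françaiséàè": the accented letters come out as E7, E9, E0, E8,
// i.e. their Unicode code points, not their Mac OS Roman bytes.
const accented = hexEncode("Fran\u00e7ais\u00e9\u00e0\u00e8")
// accented === "4672616EE7616973E9E0E8"

// "€" is U+20AC: toString(16) gives "20ac" and slice(-2) keeps only "ac".
const euro = hexEncode("\u20ac")
// euro === "AC"
```

So the output is only meaningful for characters up to U+00FF, and even there it is Latin-1, not a Mac encoding.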

+0

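charCodeAt returns a Unicode value, which involves a lot more than your 1-byte trimming can handle. If you just want the hex value of a string in Node, try this -> `new Buffer('Françaiséàè').toString('hex')` = '4672616ec3a761697320c3a9c3a0c3a8', i.e. 16 bytes for your 11-character string. – Keith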

+0

That seems logical, but it still does not work in my dmg. I get garbage instead of éàè: √ß√˘ß©ß®_ which makes me think it may be because the dmg does not know it is utf8? – ether

+1

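'the dmg does not know it is utf8' is quite likely... If so, there are tools to convert utf8 to a chosen code page, so if you can find out which code page the dmg file uses, you could use something like -> https://www.npmjs.com/package/codepage – Keith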
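For reference, the byte count in Keith's first comment can be checked with a small hand-written UTF-8 encoder. This is only a sketch: `utf8Hex` is a hypothetical helper, not part of any library, and in Node `Buffer.from(text, "utf8").toString("hex")` should give the same result. Note that the hex string quoted in the comment contains a `20`, so it corresponds to the text with a space before éàè.

```typescript
// Hypothetical helper: hex of the UTF-8 bytes of a string.
function utf8Hex(text: string): string {
    let hex = ""
    for (let i = 0; i < text.length; i++) {
        let cp = text.charCodeAt(i)
        // Combine a surrogate pair into a single code point.
        if (cp >= 0xd800 && cp <= 0xdbff && i + 1 < text.length) {
            const low = text.charCodeAt(i + 1)
            if (low >= 0xdc00 && low <= 0xdfff) {
                cp = 0x10000 + ((cp - 0xd800) << 10) + (low - 0xdc00)
                i++
            }
        }
        // UTF-8 uses 1 to 4 bytes depending on the code point.
        const bytes: number[] =
            cp < 0x80 ? [cp]
            : cp < 0x800 ? [0xc0 | (cp >> 6), 0x80 | (cp & 0x3f)]
            : cp < 0x10000 ? [0xe0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3f), 0x80 | (cp & 0x3f)]
            : [0xf0 | (cp >> 18), 0x80 | ((cp >> 12) & 0x3f), 0x80 | ((cp >> 6) & 0x3f), 0x80 | (cp & 0x3f)]
        for (const b of bytes) {
            hex += ("0" + b.toString(16)).slice(-2)
        }
    }
    return hex
}

// "Français éàè" (with a space): each accented letter takes two bytes.
const utf8 = utf8Hex("Fran\u00e7ais \u00e9\u00e0\u00e8")
// utf8 === "4672616ec3a761697320c3a9c3a0c3a8"
```

Each accented letter becomes a two-byte sequence (for example `c3 a9` for é), so a consumer that expects one byte per character in a legacy code page will display two garbage characters per letter.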

Answer

0

I have found a way to do it, thanks to the codepage package mentioned by @Keith:

const cptable = require("codepage") 
function hexEncode(str: string, lang: string, langWithRegion: string) { 
    let code 
    let hex 
    let i 
    const macCodePages = getMacCodePage(lang, langWithRegion) 
    let result = "" 

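    for (i = 0; i < str.length; i++) {
        try {
            code = getMacCharCode(str, i, macCodePages)
            if (code === undefined) {
                hex = "3F" //? (no mapping found in the selected code pages)
            } else {
                hex = code.toString(16)
                //pad to whole bytes so that e.g. a line feed becomes "0a", not "a"
                if (hex.length % 2 === 1) {
                    hex = "0" + hex
                }
            }

            result += hex
        } catch (e) {
            debug("there was a problem while trying to convert a char to hex: " + e)
            result += "3F" //?
        }
    }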

    return result 
} 

function getMacCodePage(lang: string, langWithRegion: string) { 
    switch (lang) {
        case "ja": //japanese
            return [10001] //Apple Japanese
        case "zh": //chinese
            if (langWithRegion === "zh_CN") {
                return [10008] //Apple Simplified Chinese (GB 2312)
            }
            return [10002] //Apple Traditional Chinese (Big5)
        case "ko": //korean
            return [10003] //Apple Korean
        case "ar": //arabic
        case "ur": //urdu
            return [10004] //Apple Arabic
        case "he": //hebrew
            return [10005] //Apple Hebrew
        case "el": //greek
        case "elc": //greek
            return [10006] //Apple Greek
        case "ru": //russian
        case "be": //belarusian
        case "sr": //serbian
        case "bg": //bulgarian
        case "uz": //uzbek
            return [10007] //Apple Macintosh Cyrillic
        case "ro": //romanian
            return [10010] //Apple Romanian
        case "uk": //ukrainian
            return [10017] //Apple Ukrainian
        case "th": //thai
            return [10021] //Apple Thai
        case "et": //estonian
        case "lt": //lithuanian
        case "lv": //latvian
        case "pl": //polish
        case "hu": //hungarian
        case "cs": //czech
        case "sk": //slovak
            return [10029] //Apple Macintosh Central Europe
        case "is": //icelandic
        case "fo": //faroese
            return [10079] //Apple Icelandic
        case "tr": //turkish
            return [10081] //Apple Turkish
        case "hr": //croatian
        case "sl": //slovenian
            return [10082] //Apple Croatian
        default:
            return [10000] //Apple Macintosh Roman
    }
} 

function getMacCharCode(str: string, i: number, macCodePages: any) {
    let code = str.charCodeAt(i)
    let j
    if (code >= 128 && code < 256) {
        //Latin-1 range: always look it up in code page 10000 = Mac OS Roman
        code = cptable[10000].enc[str[i]]
    }
    else if (code >= 256) {
        //anything else: try each language-specific code page in turn
        for (j = 0; j < macCodePages.length; j++) {
            code = cptable[macCodePages[j]].enc[str[i]]
            if (code !== undefined) {
                break
            }
        }
    }
    //code < 128 is plain ASCII and is returned unchanged

    return code
}
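
A quick way to sanity-check this kind of encoder is a round trip on the sample string from the question. The sketch below does not use the codepage package: `MAC_ROMAN_EXCERPT` is a hand-written stand-in for `cptable[10000].enc` that contains only the four accented letters from the question, and both helper functions are hypothetical names introduced for illustration.

```typescript
// Stand-in excerpt of Mac OS Roman (code page 10000): character -> byte.
const MAC_ROMAN_EXCERPT: Record<string, number> = {
    "\u00e7": 0x8d, // ç
    "\u00e9": 0x8e, // é
    "\u00e0": 0x88, // à
    "\u00e8": 0x8f, // è
}

// Hypothetical helper: encode text to uppercase hex using the excerpt above.
function encodeMacRomanHex(text: string): string {
    let result = ""
    for (let i = 0; i < text.length; i++) {
        const unit = text.charCodeAt(i)
        // ASCII is unchanged; everything else goes through the table.
        const code: number | undefined =
            unit < 128 ? unit : MAC_ROMAN_EXCERPT[text.charAt(i)]
        const byte = code === undefined ? 0x3f : code // 0x3F is "?"
        result += ("0" + byte.toString(16)).slice(-2).toUpperCase()
    }
    return result
}

// Hypothetical helper: decode the hex back using the inverted excerpt.
function decodeMacRomanHex(hex: string): string {
    const inverse: Record<number, string> = {}
    for (const ch of Object.keys(MAC_ROMAN_EXCERPT)) {
        inverse[MAC_ROMAN_EXCERPT[ch]] = ch
    }
    let text = ""
    for (let i = 0; i < hex.length; i += 2) {
        const byte = parseInt(hex.slice(i, i + 2), 16)
        const ch: string | undefined =
            byte < 128 ? String.fromCharCode(byte) : inverse[byte]
        text += ch === undefined ? "?" : ch
    }
    return text
}

const sample = "Fran\u00e7ais\u00e9\u00e0\u00e8" // "Françaiséàè"
const encoded = encodeMacRomanHex(sample)
// encoded === "4672616E8D6169738E888F", the bytes expected in the question
const decoded = decodeMacRomanHex(encoded)
// decoded === sample
```

With the real package, the same check would presumably use the full `enc` table for encoding and the matching `dec` table for decoding, once per code page returned by `getMacCodePage`.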