Python - Problem modifying an XML file encoded as Unicode / Latin-1


I am new to Python, and I am trying to change a few lines in an XML file using the DOM library.

However, I am running into a Unicode problem when doing this. My XML file contains this line:

<!--Selector generado a partir del proyecto de configuraciÃ³n : RigelJars_Configuration -->

It seems that this line is Latin-1 and does not work with Unicode (see the word "configuraciÃ³n"). The script fails with:

writer.write("%s<!--%s-->%s" % (indent, self.data, newl)) 
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf3' in position 58: ordinal not in range(128)

Here is the simple code:

# -*- coding: utf-8 -*-
import os
from xml.dom.minidom import parse

for dirname, subdirs, files in os.walk('/tmp/lab/resource.jar'):
    for filename in files:
        if filename == 'variableCfgSelector.xml':
            domapp = parse('/tmp/lab/resource.jar/variableCfgSelector.xml')
            print('Arquivo atualmente sendo alterado: variableCfgSelector.xml ')
            # Top-level elements of the document
            childs = [node for node in domapp.childNodes if node.nodeType == domapp.ELEMENT_NODE]
            for parent in childs:
                # Element children of each top-level element
                childs2 = [node for node in parent.childNodes if node.nodeType == domapp.ELEMENT_NODE]
                for child in childs2:
                    if child.nodeName == 'environment':
                        child.firstChild.replaceWholeText('ebanking')
# The output file is opened without an explicit encoding; this is the call
# that raises the UnicodeEncodeError shown above.
domapp.writexml(open('/tmp/lab/resource.jar/novo_VariableCfgSelection.xml', 'w'), addindent='', newl='', encoding='UTF-8')
domapp.unlink()
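
For reference, the traceback is consistent with `writexml` writing Unicode text into a file object that falls back to the ASCII codec. A minimal sketch of one way around it, assuming Python 3 and an output file opened with an explicit encoding, is below. The sample document and the temporary output path are hypothetical stand-ins for the real files; under Python 2, `codecs.open` should play the same role as `io.open` here.

```python
import io
import os
import tempfile
from xml.dom.minidom import parseString

# Hypothetical sample standing in for variableCfgSelector.xml (Latin-1 bytes).
SAMPLE = (b'<?xml version="1.0" encoding="ISO-8859-1"?>\n'
          b'<!--Selector generado a partir del proyecto de configuraci\xf3n-->\n'
          b'<config><environment>dev</environment></config>')

dom = parseString(SAMPLE)
for node in dom.getElementsByTagName("environment"):
    node.firstChild.replaceWholeText("ebanking")

# Hypothetical output location; the explicit encoding is the important part.
out_path = os.path.join(tempfile.mkdtemp(), "novo_VariableCfgSelection.xml")
with io.open(out_path, "w", encoding="utf-8") as out:
    # encoding="UTF-8" only sets the XML declaration; the file object does the encoding.
    dom.writexml(out, encoding="UTF-8")
dom.unlink()

with io.open(out_path, "rb") as f:
    written = f.read()
```

The `encoding` argument of `writexml` and the encoding of the file object should agree; otherwise the XML declaration will not match the bytes actually written.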


2017-03-07 Daniel Camilo

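Either it is Unicode or it is Latin-1; it cannot be both at the same time. The copy/paste you show looks like UTF-8 encoded Unicode, but the software you are using is configured to display Latin-1. Showing the code that attempts to read the file might help clarify this, but this is almost certainly a duplicate of an existing question. Have you searched for similar errors? – tripleee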
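
The effect described in this comment can be reproduced in a few lines. This is only an illustrative sketch, assuming Python 3; the variable names are made up for the example.

```python
original = u"configuraci\u00f3n"           # the intended word, with U+00F3 (o with acute)

utf8_bytes = original.encode("utf-8")      # U+00F3 becomes the two bytes 0xC3 0xB3
misread = utf8_bytes.decode("latin-1")     # each byte is displayed as one Latin-1 character

# ascii() keeps the output printable on any terminal encoding.
print(ascii(utf8_bytes))
print(ascii(misread))
```

Here `misread` is the same garbled spelling that appears in the XML comment quoted in the question, which is why the pasted text looks like UTF-8 viewed as Latin-1.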

See also the [Stack Overflow `character-encoding` tag wiki](http://stackoverflow.com/tags/character-encoding/info) for some hints on how to ask a question about this. – tripleee

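Either the file is Latin-1, in which case you should specify this encoding when opening it, or you have double-encoded text, where text that was already UTF-8 was mistakenly converted from Latin-1 to UTF-8 a second time. In that case, you can try to undo the conversion and end up with valid UTF-8. Without knowing the actual bytes in the file, we can only guess; but I would put my bet on the first scenario. – tripleee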
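
Both scenarios from this comment can be sketched as follows, assuming Python 3. The helper names `undo_double_encoding` and `decode_latin1` are hypothetical, chosen only for this illustration.

```python
def undo_double_encoding(text):
    """Reverse one round of 'UTF-8 bytes misread as Latin-1', if possible."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The text is not (cleanly) double-encoded, so return it unchanged.
        return text


def decode_latin1(raw_bytes):
    """First scenario: the bytes really are Latin-1, so decode them as such."""
    return raw_bytes.decode("latin-1")


garbled = u"configuraci\u00c3\u00b3n"       # double-encoded form
repaired = undo_double_encoding(garbled)    # second scenario
from_latin1 = decode_latin1(b"configuraci\xf3n")
```

Which of the two applies depends on the actual bytes in the file: a single `0xF3` byte points to Latin-1, while the pair `0xC3 0xB3` points to UTF-8.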

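I fixed my problem by simply opening the file as a plain text file instead of parsing it with the DOM library. My source file was encoded in ANSI format, and with open I can specify the correct encoding to work with. Thanks to all.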
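
The answer does not include the code that was used, so the following is only a rough sketch of that approach, assuming Python 3. The function `replace_environment`, the sample content, the temporary paths, and the choice of `latin-1` for "ANSI" are all assumptions; on Windows, "ANSI" often means `cp1252` instead.

```python
import io
import os
import re
import tempfile


def replace_environment(src_path, dst_path, new_value, encoding="latin-1"):
    """Rewrite the text of every <environment> element, keeping the file's encoding."""
    with io.open(src_path, "r", encoding=encoding) as src:
        text = src.read()
    # Naive text substitution: assumes <environment> has no attributes or nested tags.
    text = re.sub(r"(<environment>)[^<]*(</environment>)",
                  lambda m: m.group(1) + new_value + m.group(2),
                  text)
    with io.open(dst_path, "w", encoding=encoding) as dst:
        dst.write(text)


# Hypothetical sample file written as Latin-1 bytes into a temporary directory.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "variableCfgSelector.xml")
dst = os.path.join(workdir, "novo_VariableCfgSelection.xml")
sample = (u'<?xml version="1.0" encoding="ISO-8859-1"?>\n'
          u'<!--Selector generado a partir del proyecto de configuraci\u00f3n-->\n'
          u'<config><environment>dev</environment></config>\n')
with io.open(src, "wb") as f:
    f.write(sample.encode("latin-1"))

replace_environment(src, dst, u"ebanking")

with io.open(dst, "rb") as f:
    result = f.read()
```

Treating the XML as plain text avoids the encoding round trip through the DOM, at the cost of not validating the document structure.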


2017-03-09 18:30:59
