2017-08-09 87 views
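Cannot convert csv from utf-8 to ansi with the csv writer in Python 2.6

I am trying to load a .csv file in utf-8 text format and write it out in cp1252 (ansi) format with a pipe delimiter. The following code works in Python 3.6, but I need it to work in Python 2.6. However, the 'open' function does not accept the encoding keyword in Python 2.6.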

import datetime 
import csv 

# Define what filenames to read 
filenames = ["FILE1","FILE2"] 
infilenames = [filename+".csv" for filename in filenames] 
outfilenames = [filename+"_out_.csv" for filename in filenames] 

# Read filenames in utf-8 and write them in cp1252 
for infilename,outfilename in zip(infilenames,outfilenames): 
    infile = open(infilename, "rt",encoding="utf8") 
    reader = csv.reader(infile,delimiter=',',quotechar='"',quoting=csv.QUOTE_MINIMAL) 

    outfile = open(outfilename, "wt",encoding="cp1252") 
    writer = csv.writer(outfile, delimiter='|', quotechar='"', quoting=csv.QUOTE_NONE,escapechar='\\') 
    for row in reader: 
        writer.writerow(row)

    # Close both files inside the loop so every pair is closed, not only the last one
    infile.close() 
    outfile.close() 

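I have tried a few solutions:

  • Not defining the encoding. This results in errors for certain Unicode characters.
  • Using the io library (io.open instead of open). This results in "TypeError: can't write str to text stream".

Does anyone know the correct solution for Python 2.X?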

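The 'csv' module in Python 2 does not like 'unicode' strings, so there is no easy fix in the standard library. However, there are third-party solutions. For example, see the answers to this question: https://stackoverflow.com/questions/904041/reading-a-utf8-csv-file-with-python – lenz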
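
The io.open failure can be illustrated in isolation. The sketch below assumes the cause is the one the comment describes: the Python 2 csv writer hands byte strings to the file object, while io text streams only accept text. The helper name text_stream_accepts is hypothetical, and io.StringIO stands in for a file opened with io.open in text mode.

```python
import io


def text_stream_accepts(value):
    """Return True if an io text stream accepts `value`, False on TypeError."""
    stream = io.StringIO()
    try:
        stream.write(value)
    except TypeError:
        # Text streams reject byte strings on both Python 2 and Python 3.
        return False
    return True


print(text_stream_accepts(u"a|b"))   # text (unicode)
print(text_stream_accepts(b"a|b"))   # bytes (a plain str on Python 2)
```

The first call succeeds and the second is rejected, which is why passing an io.open text stream to the Python 2 csv writer fails.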

Answer

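There may be some redundant code here, but I got it working by doing the following:

  • First, I did the encoding to "cp1252" using the .decode and .encode functions.
  • Then I read the CSV from the cp1252-encoded file and wrote it to a new CSV.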

...

import datetime 
import csv 

# Define what filenames to read 
filenames = ["FILE1","FILE2"] 


infilenames = [filename+".csv" for filename in filenames] 
outfilenames = [filename+"_out_.csv" for filename in filenames] 
midfilenames = [filename+"_mid_.csv" for filename in filenames] 

# Iterate over each file 
for infilename,outfilename,midfilename in zip(infilenames,outfilenames,midfilenames): 

    # Read the raw utf-8 bytes, decode them, then re-encode as cp1252.
    # Note: "ignore" silently drops characters that cp1252 cannot represent.
    infile = open(infilename, "rb") 
    infilet = infile.read() 
    infile.close() 
    infilet = infilet.decode("utf-8") 
    infilet = infilet.encode("cp1252","ignore") 

    # Write the cp1252-encoded intermediate file 
    midfile = open(midfilename,"wb") 
    midfile.write(infilet) 
    midfile.close() 

    # Read the csv from the cp1252-encoded file (the Python 2 csv module expects binary mode) 
    midfile = open(midfilename,"rb") 
    reader = csv.reader(midfile,delimiter=',', quotechar='"',quoting=csv.QUOTE_MINIMAL) 

    # Define the output 
    outfile = open(outfilename, "wb") 
    writer = csv.writer(outfile, delimiter='|', quotechar='"',quoting=csv.QUOTE_NONE,escapechar='\\') 

    # Write the output to the new csv file 
    for row in reader: 
        writer.writerow(row) 

    # A single formatted string prints the same text on Python 2 and 3 
    print("written file %s" % outfilename) 
    midfile.close() 
    outfile.close()
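

The intermediate file is probably not needed. As a minimal sketch, assuming the input is valid utf-8 without a BOM, each cell can be transcoded row by row instead. The names convert_csv and PY2 are hypothetical, and "replace" is used here instead of "ignore" so that characters missing from cp1252 show up as "?" rather than vanishing. The Python 3 branch is included only so that the same function runs on both versions.

```python
import csv
import io
import sys

PY2 = sys.version_info[0] == 2  # hypothetical helper flag


def convert_csv(infilename, outfilename):
    """Rewrite a comma-separated utf-8 file as a pipe-separated cp1252 file."""
    if PY2:
        # The Python 2 csv module works on byte strings in binary mode.
        infile = open(infilename, "rb")
        outfile = open(outfilename, "wb")
    else:
        # On Python 3 the file objects do the decoding and encoding.
        infile = io.open(infilename, "r", encoding="utf-8", newline="")
        outfile = io.open(outfilename, "w", encoding="cp1252",
                          errors="replace", newline="")
    try:
        reader = csv.reader(infile, delimiter=",", quotechar='"')
        writer = csv.writer(outfile, delimiter="|", quoting=csv.QUOTE_NONE,
                            escapechar="\\")
        for row in reader:
            if PY2:
                # Transcode each cell: utf-8 bytes -> unicode -> cp1252 bytes.
                row = [cell.decode("utf-8").encode("cp1252", "replace")
                       for cell in row]
            writer.writerow(row)
    finally:
        infile.close()
        outfile.close()
```

With the file lists from the code above, it would be called as convert_csv(infilename, outfilename) inside the loop over zip(infilenames, outfilenames).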