Python 2 and 3 CSV reader

I'm trying to use the csv module to read a UTF-8 CSV file, and because of encoding issues I'm having trouble writing generic code that works on both Python 2 and 3.

Here is the original code for Python 2.7:

import csv

with open(filename, 'rb') as csvfile:
    csv_reader = csv.reader(csvfile, quotechar='"')
    langs = next(csv_reader)[1:]  # first row without its first column
    for row in csv_reader:
        pass

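But when I run it with Python 3, it doesn't like the fact that I open the file without an "encoding". I tried this:

import codecs
import csv

with codecs.open(filename, 'r', encoding='utf-8') as csvfile:
    csv_reader = csv.reader(csvfile, quotechar='"')
    langs = next(csv_reader)[1:]  # first row without its first column
    for row in csv_reader:
        pass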

Now Python 2 can't decode the line in the "for" loop. So... how should I do it?

2011-03-03 Syl

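So you want code that runs unchanged on both Python 2.7 and 3? That may not be possible, because so much string handling etc. has changed. – 2011-03-03 12:20:47

Is it possible to specify blocks of code for Python 2 or 3? – Syl 2011-03-03 12:22:49

You can check 'sys.version' and wrap your code in an 'if-else' statement, yes. – 2011-03-03 12:31:20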

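Indeed, in Python 2 the file should be opened in binary mode, but in Python 3 in text mode. In Python 3, newline='' should also be specified (which you forgot).

You'll have to open the file in an if block.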

import sys 

if sys.version_info[0] < 3: 
    infile = open(filename, 'rb') 
else: 
    infile = open(filename, 'r', newline='', encoding='utf8') 


with infile as csvfile: 
    ...
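For illustration, the version check above can be wrapped in a small helper so the rest of the code stays identical on both versions. This is only a sketch: the names open_for_csv and read_rows and the sample data are made up for this example and are not part of the csv module.

```python
import csv
import os
import sys
import tempfile


def open_for_csv(path, encoding='utf-8'):
    """Open path in the mode csv.reader expects on the running Python."""
    if sys.version_info[0] < 3:
        # Python 2: csv.reader works on byte strings, so use binary mode.
        return open(path, 'rb')
    # Python 3: text mode; newline='' lets the csv module handle line endings.
    return open(path, 'r', newline='', encoding=encoding)


def read_rows(path, encoding='utf-8'):
    """Return all rows of the CSV file as a list of lists."""
    with open_for_csv(path, encoding) as csvfile:
        return list(csv.reader(csvfile, quotechar='"'))


# Write a small sample file (hypothetical data) and read it back.
handle, sample_path = tempfile.mkstemp(suffix='.csv')
os.write(handle, u'key,en,fr\r\nhello,"Hello, world",Salut\r\n'.encode('utf-8'))
os.close(handle)

rows = read_rows(sample_path)
os.remove(sample_path)
print(rows)
```

Note that on Python 2 the cells come back as UTF-8 encoded byte strings, while on Python 3 they are already decoded text.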

2011-03-03 13:06:38

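Can you use 'with' on a file handle? – 2011-03-03 13:09:08

@Tim: It's not a file handle, it's a file object, and yes, you can use 'with' on a file object. That's what you do when you write 'with open(...)'. – 2011-03-03 13:20:25

Makes sense. You never really see it done that way; in the docs it's always 'with open(...)'. But this way isn't half bad: you can wrap 'open()' in a 'try' block, catch 'File not found' and similar errors, and then hand the file over to the 'with' block. – 2011-03-03 13:41:32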
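The idea from the last comment could look roughly like this. It is a minimal sketch, not code from the thread: the function name first_row and the choice to return None on failure are assumptions made for the example.

```python
import csv
import sys


def first_row(path, encoding='utf-8'):
    """Return the first CSV row, or None if the file is missing or empty."""
    try:
        if sys.version_info[0] < 3:
            infile = open(path, 'rb')
        else:
            infile = open(path, 'r', newline='', encoding=encoding)
    except IOError:
        # Python 2 raises IOError for a missing file; on Python 3 IOError is
        # an alias of OSError, which also covers FileNotFoundError.
        return None
    # The file object is only handed to 'with' once it was opened successfully.
    with infile as csvfile:
        return next(csv.reader(csvfile), None)
```

Only the open() call sits inside the 'try' block, so errors raised while reading the rows are not swallowed by accident.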

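Old question, I know, but I was looking for how to do this. Just in case someone comes across this and finds it useful.

This is how I solved my problem, thanks to Lennart Regebro for the hint: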

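import csv
import sys

# Compare the numeric major version; comparing the sys.version string is fragile.
if sys.version_info[0] >= 3:
    rd = csv.reader(open(input_file, 'r', newline='', encoding='iso8859-1'),
                    delimiter=';', quotechar='"')
else:
    rd = csv.reader(open(input_file, 'rb'), delimiter=';', quotechar='"')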

Then do whatever you need to do:

for row in rd: 
     ......

2013-08-20 10:31:09 jscurtu

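Update: While the code in my original answer works, I have meanwhile released a small package at https://pypi.python.org/pypi/csv342 that provides a Python 3-like interface for Python 2. So independent of your Python version you can simply do: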

import csv342 as csv
import io
with io.open('some.csv', 'r', encoding='utf-8', newline='') as csv_file:
    for row in csv.reader(csv_file, delimiter='|'):
        print(row)

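Original answer: Here is a solution that, even with Python 2, actually decodes the text to Unicode strings and consequently works with encodings other than UTF-8.

The code below defines a function csv_rows() that returns the contents of a file as a sequence of lists. Example usage: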

for row in csv_rows('some.csv', encoding='iso-8859-15', delimiter='|'): 
    print(row)

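Here are the two variants of csv_rows(): one for Python 3+ and another for Python 2.6+. At runtime the proper variant is picked automatically. UTF8Recoder and UnicodeReader are verbatim copies of the examples in the Python 2.7 library documentation.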

import csv
import io
import sys


if sys.version_info[0] >= 3:
    # Python 3 variant.
    def csv_rows(csv_path, encoding, **keywords):
        with io.open(csv_path, 'r', newline='', encoding=encoding) as csv_file:
            for row in csv.reader(csv_file, **keywords):
                yield row

else:
    # Python 2 variant.
    import codecs

    class UTF8Recoder:
        """
        Iterator that reads an encoded stream and reencodes the input to UTF-8
        """
        def __init__(self, f, encoding):
            self.reader = codecs.getreader(encoding)(f)

        def __iter__(self):
            return self

        def next(self):
            return self.reader.next().encode("utf-8")


    class UnicodeReader:
        """
        A CSV reader which will iterate over lines in the CSV file "f",
        which is encoded in the given encoding.
        """

        def __init__(self, f, dialect=csv.excel, encoding="utf-8", **kwds):
            f = UTF8Recoder(f, encoding)
            self.reader = csv.reader(f, dialect=dialect, **kwds)

        def next(self):
            row = self.reader.next()
            return [unicode(s, "utf-8") for s in row]

        def __iter__(self):
            return self


    def csv_rows(csv_path, encoding, **kwds):
        with io.open(csv_path, 'rb') as csv_file:
            for row in UnicodeReader(csv_file, encoding=encoding, **kwds):
                yield row
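As a quick check of the claim about non-UTF-8 encodings, the following sketch writes a small ISO-8859-15 file and reads it back. It assumes Python 3 and restates only the Python 3 variant of csv_rows() so that it runs on its own; the sample data and the temporary file are invented for this example.

```python
import csv
import io
import os
import tempfile


def csv_rows(csv_path, encoding, **keywords):
    # Python 3 variant only, repeated here to keep the sketch self-contained.
    with io.open(csv_path, 'r', newline='', encoding=encoding) as csv_file:
        for row in csv.reader(csv_file, **keywords):
            yield row


# Write a '|'-delimited sample file encoded as ISO-8859-15 (hypothetical data).
handle, sample_path = tempfile.mkstemp(suffix='.csv')
os.write(handle, u'item|price\r\ncaf\u00e9|3 \u20ac\r\n'.encode('iso-8859-15'))
os.close(handle)

rows = list(csv_rows(sample_path, encoding='iso-8859-15', delimiter='|'))
os.remove(sample_path)
print(rows)
```

The accented letter and the euro sign come back as proper text because the file is decoded with the encoding passed to csv_rows() rather than assumed to be UTF-8.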

2014-12-05 00:27:58 roskakori
