How to read data from an XLSX file using Python

[Image: the spreadsheet data]

I want to convert this data into a dictionary like this:

{ 
    0:{ 
     'a':1, 
     'b':100, 
     'c':2, 
     'd':10 
    }, 
    1:{ 
     'a':8, 
     'b':480, 
     'c':3, 
     'd':14 
    } 
... 
}

Does anyone know a Python library that can do this, starting at row 124 and ending at row 141?

Thanks


2011-04-02 zjm1126

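Your first output dictionary has data from rows 124 and 125; your second has data from row 126... Please edit your question. Please confirm that the data columns you want are B, C, E and G. – 2011-04-02 04:43:26

'xlrd' (since version 0.8.0) supports reading '.xlsx' files directly. (The "bolt-on" module that John Machin mentions in his answer was eventually merged into the 'xlrd' package.) Related: http://stackoverflow.com/questions/4371163/reading-xlsx-files-using-python – 2013-03-05 15:53:13

I think you mean the first part is 'd: 12'; how big is your file? – 2014-02-27 12:42:03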

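Options with xlrd:

(1) Your xlsx file doesn't look very large; save it as xls and use xlrd. (2) Use xlrd plus the bolt-on beta-test module xlsxrd (find my e-mail address and ask for it). The combination will read data from xls and xlsx files seamlessly (same API; it examines the file contents to determine whether it's xls, xlsx, or an imposter).

In either case, something like the following (untested) code should do what you want: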

from xlrd import open_workbook
# from xlsxrd import open_workbook
# Choose one of the above (xlsxrd was a beta-test bolt-on module)

# These could be function args in real live code
column_map = {
    # The numbers are zero-relative column indexes
    'a': 1,
    'b': 2,
    'c': 4,
    'd': 6,
}
first_row_index = 124 - 1
last_row_index = 141 - 1
file_path = 'your_file.xls'

# The action starts here
book = open_workbook(file_path)
sheet = book.sheet_by_index(0)  # first worksheet
key0 = 0
result = {}
# range() and dict.items() replace the Python 2 xrange() and iteritems()
for row_index in range(first_row_index, last_row_index + 1):
    d = {}
    for key1, column_index in column_map.items():
        d[key1] = sheet.cell_value(row_index, column_index)
    result[key0] = d
    key0 += 1


2011-04-02 05:04:29

Another option is openpyxl. I've been meaning to try it out but haven't gotten around to it yet, so I can't say how good it is.
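
For reference, here is a minimal sketch of how the requested dictionary might be built with openpyxl. It assumes an openpyxl version that supports `iter_rows(..., values_only=True)` (2.6 or later, as far as I know) and assumes the wanted columns are B, C, E and G, as suggested in the comments on the question. The names `COLUMN_MAP`, `rows_to_dict`, `read_range` and the file name are illustrative only.

```python
# Sketch only: the column positions and the file name are assumptions.
COLUMN_MAP = {'a': 1, 'b': 2, 'c': 4, 'd': 6}  # zero-based indexes: B, C, E, G


def rows_to_dict(rows, column_map=COLUMN_MAP):
    """Turn an iterable of row tuples into {0: {'a': ..., ...}, 1: {...}, ...}."""
    result = {}
    for key0, row in enumerate(rows):
        # Rows shorter than expected yield None for the missing cells.
        result[key0] = {
            key1: (row[col] if col < len(row) else None)
            for key1, col in column_map.items()
        }
    return result


def read_range(path, first_row=124, last_row=141):
    """Read 1-based rows first_row..last_row from the first worksheet."""
    from openpyxl import load_workbook  # third-party: pip install openpyxl

    book = load_workbook(path, read_only=True, data_only=True)
    try:
        sheet = book.worksheets[0]
        rows = sheet.iter_rows(
            min_row=first_row,
            max_row=last_row,
            max_col=max(COLUMN_MAP.values()) + 1,
            values_only=True,  # yield plain cell values instead of Cell objects
        )
        return rows_to_dict(rows)
    finally:
        book.close()

# Usage (hypothetical file name):
# result = read_range('your_file.xlsx')
```

Keeping the row-to-dictionary step separate from the file access makes it easy to check the mapping on hand-written rows before pointing it at a real workbook.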


2011-04-03 09:54:20 joshayers

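Since posting this answer, I've had a chance to try openpyxl. It's very easy to use. I managed to write out a fairly large spreadsheet - 20 tabs, each with 200 columns and 500 rows. The operation used about 2GB of memory. It also has an optimized append-only writer, which the author claims can write spreadsheets of unlimited size, but I haven't had a reason to try it yet. – joshayers 2011-06-19 22:06:12

Here is a very rough implementation that uses only the standard library.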

def xlsx(fname):
    import zipfile
    from xml.etree.ElementTree import iterparse
    z = zipfile.ZipFile(fname)
    # Shared strings are stored once and referenced by index from the cells
    strings = [el.text for e, el in iterparse(z.open('xl/sharedStrings.xml')) if el.tag.endswith('}t')]
    rows = []
    row = {}
    value = ''
    for e, el in iterparse(z.open('xl/worksheets/sheet1.xml')):
        if el.tag.endswith('}v'):  # <v>84</v>
            value = el.text
        if el.tag.endswith('}c'):  # <c r="A3" t="s"><v>84</v></c>
            if el.attrib.get('t') == 's':
                value = strings[int(value)]
            letter = el.attrib['r']  # AZ22
            while letter[-1].isdigit():
                letter = letter[:-1]
            row[letter] = value
            value = ''  # do not carry this value over to a cell without <v>
        if el.tag.endswith('}row'):
            rows.append(row)
            row = {}
    return dict(enumerate(rows))
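
The function above returns every row keyed by position, with column letters as inner keys and cell contents as strings. A possible follow-up step, sketched here under assumptions, picks rows 124 to 141 and renames columns B, C, E and G to 'a' to 'd'. The helpers `to_number` and `pick_rows` are hypothetical names. The sketch also assumes that the sheet has no skipped rows, so that key 0 corresponds to spreadsheet row 1.

```python
def to_number(text):
    """Convert a cell string to int or float when possible, else return it unchanged."""
    if text is None:
        return None
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def pick_rows(table, first_row, last_row, column_map):
    """Select 1-based rows first_row..last_row from a dict shaped like xlsx()'s result.

    column_map maps output keys to column letters, e.g. {'a': 'B'}.
    """
    result = {}
    for key0, row_number in enumerate(range(first_row, last_row + 1)):
        row = table.get(row_number - 1, {})  # the table is keyed from 0
        result[key0] = {
            key1: to_number(row.get(letter))
            for key1, letter in column_map.items()
        }
    return result


# Hand-written sample in the same shape as the output of xlsx()
sample = {
    0: {'B': '1', 'C': '100', 'E': '2', 'G': '10'},
    1: {'B': '8', 'C': '480', 'E': '3', 'G': '14'},
}
columns = {'a': 'B', 'b': 'C', 'c': 'E', 'd': 'G'}
print(pick_rows(sample, 1, 2, columns))
```

With a real file the call would look like `pick_rows(xlsx('your_file.xlsx'), 124, 141, columns)`. Note that `to_number` also converts text cells that happen to look numeric, which may or may not be what you want.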


2014-02-27 12:14:32

Suppose you have data like this:

a,b,c,d 
1,2,3,4 
2,3,4,5 
...

One of many possible answers in 2014 is:

import pyexcel 


r = pyexcel.SeriesReader("yourfile.xlsx") 
# make a filter function 
filter_func = lambda row_index: row_index < 124 or row_index > 141 
# apply the filter on the reader 
r.filter(pyexcel.filters.RowIndexFilter(filter_func)) 
# get the data 
data = pyexcel.utils.to_records(r) 
print(data)

Now data is an array of dictionaries:

[{ 
    'a':1, 
    'b':100, 
    'c':2, 
    'd':10 
}, 
{ 
    'a':8, 
    'b':480, 
    'c':3, 
    'd':14 
}... 
]

See the pyexcel documentation for details.
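
To make the "records" idea concrete without depending on a particular pyexcel version, here is a plain-Python sketch of the same transformation: the first row supplies the keys, and only data rows inside an index range are kept. `rows_to_records` is a hypothetical helper written for illustration, not pyexcel's own function, and the indexes here count data rows from 0, which is an assumption of this sketch.

```python
def rows_to_records(rows, first=0, last=None):
    """Convert rows (header row first) into a list of dictionaries.

    first and last are zero-based, inclusive indexes of the data rows to keep;
    last=None means "up to the end".
    """
    rows = iter(rows)
    header = next(rows)  # the first row holds the column names
    records = []
    for index, row in enumerate(rows):
        if index < first or (last is not None and index > last):
            continue  # outside the requested range
        records.append(dict(zip(header, row)))
    return records


sample = [
    ['a', 'b', 'c', 'd'],
    [1, 2, 3, 4],
    [2, 3, 4, 5],
]
print(rows_to_records(sample))
```

For the range in the question, the call might look like `rows_to_records(all_rows, first=122, last=139)` if the header sits in row 1 and rows 124 to 141 are wanted, though the exact offsets depend on how the sheet is laid out.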


2014-09-21 21:29:20 chfw
