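Calling a column from a pandas DataFrame

I'm practicing importing stock market data from Google Finance into a pandas DataFrame, and I get a Python error when calling a column from the DataFrame.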

import pandas as pd 
from pandas import Series 

path = 'http://www.google.com/finance/historical?cid=542029859096076&startdate=Sep+22%2C+2001&enddate=Sep+20%2C+2016&num=30&ei=3HvhV4n3D8XGmAGp4q74Ag&output=csv' 
df = pd.read_csv(path)

So far so good: df displays the complete dataset I need.

However, when I call a specific column, like

df['Date']

Python shows the following error:

Traceback (most recent call last): 

    File "<ipython-input-31-cb486dd31fbc>", line 1, in <module> 
    df['Date'] 

    File "/Users/Username/anaconda/lib/python3.5/site-packages/pandas/core/frame.py", line 1997, in __getitem__ 
    return self._getitem_column(key) 

    File "/Users/Username/anaconda/lib/python3.5/site-packages/pandas/core/frame.py", line 2004, in _getitem_column 
    return self._get_item_cache(key) 

    File "/Users/Username/anaconda/lib/python3.5/site-packages/pandas/core/generic.py", line 1350, in _get_item_cache 
    values = self._data.get(item) 

    File "/Users/Username/anaconda/lib/python3.5/site-packages/pandas/core/internals.py", line 3290, in get 
    loc = self.items.get_loc(item) 

    File "/Users/Username/anaconda/lib/python3.5/site-packages/pandas/indexes/base.py", line 1947, in get_loc 
    return self._engine.get_loc(self._maybe_cast_indexer(key)) 

    File "pandas/index.pyx", line 137, in pandas.index.IndexEngine.get_loc (pandas/index.c:4154) 

    File "pandas/index.pyx", line 159, in pandas.index.IndexEngine.get_loc (pandas/index.c:4018) 

    File "pandas/hashtable.pyx", line 675, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12368) 

    File "pandas/hashtable.pyx", line 683, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12322) 

KeyError: 'Date'

On the other hand, other columns, e.g. df['High'], work without problems. Is there any way I can fix this?

Source

2016-09-20 carl_pch

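When I tried it, it worked fine and parsed correctly. – ayhan

(Based on MaxU's answer, it probably works because I'm using Python 3.5.) – ayhan

@ayhan, did df['Date'] work for you? It shouldn't work under Python 3.5 either... – MaxU

This CSV file contains a BOM (Byte Order Mark) signature, so try it this way: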

df = pd.read_csv(path, encoding='utf-8-sig')
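To see why the encoding matters, here is a minimal sketch using only the standard library. The bytes in raw are a made-up stand-in for the downloaded file (an assumption, not the real Google Finance data). The plain utf-8 codec keeps the BOM as the character U+FEFF at the start of the text, while utf-8-sig drops it if it is present.

```python
import codecs

# Hypothetical sample standing in for the downloaded CSV:
# a UTF-8 BOM followed by a header row and one data row.
raw = codecs.BOM_UTF8 + b"Date,Open,High\n20-Sep-16,1.0,2.0\n"

plain = raw.decode("utf-8")      # BOM survives as the character U+FEFF
clean = raw.decode("utf-8-sig")  # BOM is stripped when present

header_plain = plain.splitlines()[0].split(",")
header_clean = clean.splitlines()[0].split(",")

print(header_plain)  # first name carries the BOM: '\ufeffDate'
print(header_clean)  # first name is plain 'Date'
```

With the BOM attached, the first column is really named '\ufeffDate', so a lookup for 'Date' finds nothing and raises KeyError.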

Here is how you can easily spot this problem (thanks to @jezrael's hint):

In [11]: print(df.columns.tolist()) 
['\ufeffDate', 'Open', 'High', 'Low', 'Close', 'Volume']

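Pay attention to the first column name: it starts with '\ufeff'.

NOTE: as @ayhan has already noticed, starting from version 0.19.0 pandas will take care of it automatically:

Bug in pd.read_csv() which caused BOM files to be read in without ignoring the BOM (GH4793)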
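If the file has already been read, one possible workaround is to clean the column names afterwards. This is only a sketch: strip_bom is a hypothetical helper name, not a pandas function, and the list below simply copies the column names printed above.

```python
def strip_bom(names):
    """Return the names with any leading U+FEFF (BOM) character removed."""
    return [name.lstrip("\ufeff") for name in names]

# Column names as printed by df.columns.tolist() above.
columns = ['\ufeffDate', 'Open', 'High', 'Low', 'Close', 'Volume']
fixed = strip_bom(columns)
print(fixed)

# Applied to the DataFrame this would be:
# df.columns = strip_bom(df.columns)
```

After that assignment df['Date'] should resolve, since the column name no longer carries the invisible prefix.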

Source

2016-09-20 18:30:20 MaxU

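Hey, thanks! This works well. Could you explain in more detail why it makes a difference, or point me to some sources about the BOM signature? Thanks again. –

It looks nicer if you use print(df.columns.tolist()), +1 – jezrael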
