How do I create a new column in a pandas dataframe that was itself created after using apply()?
After reading an Excel file:
import pandas as pd
In:
df = pd.read_excel('file.xlsx')
df = df.drop('Unnamed: 0', axis=1)
df
Out:
A B C D E
0 2345 typeA NO http://www.example.com/...
2 23423 483 NO http://www.example.com/...
3 234234 typeC NO http://www.example.com/...
4 2343 typeA NO http://www.example.com/...
5 23423 typeA NO http://www.example.com/...
6 234 typeA NO http://www.example.com/...
I am applying several functions to columns of this pandas dataframe with apply(), and each one adds a new column:
In:
df['E'] = df['D'].apply(checker)
df
Out:
A B C D E
0 2345 typeA NO http://www.example.com/... OK
1 234 483 NO http://www.example.com/... FALSE
2 23423 483 NO http://www.example.com/... OK
3 234234 typeC NO http://www.example.com/... OK
4 2343 typeA NO http://www.example.com/... OK
5 23423 typeA NO http://www.example.com/... FALSE
6 234 typeA NO http://www.example.com/... OK
Then I do: df = df[df.E == 'OK']
and df = df.loc[df.E == 'OK']
Next, I apply a new function to the resulting dataframe:
In:
df['F'] = df['D'].apply(new_function_foo)
It does work the way I want, but I get this warning:
Out:
/usr/local/lib/python3.5/site-packages/ipykernel/__main__.py:10: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
I looked this up and tried to follow the instructions:
df['F'] = df.loc[['E']].apply(function_foo)
and
df['ColF'] = df.loc[:,'ColE'].apply(function_foo)
However, I still don't understand how to fix the warning above. How should I apply the function correctly?
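The warning does not come from this line, if that is all you are doing. You probably created a copy of df earlier in your code. Look for something like df2 = df, or similar, before this line. –
@StevenG, I forgot to mention: I did 'df = df[df.E == 'OK']' – tumbleweed
Instead of that, do: 'df = df.loc[df.E == 'OK']' –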
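For reference, here is a minimal sketch of one common way to avoid the warning: take an explicit copy of the filtered rows with .copy() before adding the new column. This is a swapped-in approach, not the one suggested in the comments above (filtering with .loc alone may still produce the warning in some pandas versions). The bodies of checker and new_function_foo and the sample data are hypothetical stand-ins, since the question does not show them.

```python
import pandas as pd


# Hypothetical stand-ins: the real checker / new_function_foo are not shown
# in the question, so these bodies exist only to make the sketch runnable.
def checker(url):
    return "OK" if url.endswith("ok") else "FALSE"


def new_function_foo(url):
    return len(url)


# Hypothetical sample data shaped loosely like the question's dataframe.
df = pd.DataFrame({
    "A": [2345, 234, 23423],
    "D": [
        "http://www.example.com/ok",
        "http://www.example.com/bad",
        "http://www.example.com/ok",
    ],
})

df["E"] = df["D"].apply(checker)

# Take an explicit copy of the filtered rows, so pandas knows the result is
# an independent dataframe rather than a possible view of the original.
df = df[df["E"] == "OK"].copy()

# Adding a column to the independent copy does not trigger
# SettingWithCopyWarning.
df["F"] = df["D"].apply(new_function_foo)
```

With the explicit copy, the last assignment modifies a dataframe that pandas treats as independent, so the ambiguity the warning describes does not arise.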