Iterate over a grouped pandas DataFrame and export individual plots

The documentation seems a bit sparse on how each element works, so here goes.

I have a bunch of files that I want to iterate over, producing one plot for every single file.

df_all.head()

     Dem-Dexc    Aem-Dexc    Aem-Aexc       S       E    fit  frame                     filename
0  18150.0595  18548.2451  15263.7451  0.7063  0.5054  0.879    1.0  Traces_exp22_tif_pair16.txt
1    596.9286   7161.7353   1652.8922  0.8244  0.9231  0.879    2.0  Traces_exp22_tif_pair16.txt
2     93.2976   3112.3725   2632.6667  0.5491  0.9709  0.879    3.0  Traces_exp22_tif_pair16.txt
3   1481.1310   4365.4902    769.3333  0.8837  0.7467  0.879    4.0  Traces_exp22_tif_pair16.txt
4    583.1786   6192.6373   1225.5392  0.8468  0.9139  0.879    5.0  Traces_exp22_tif_pair16.txt

Now I want to group and iterate:

for group in df_all.groupby("filename"): 
    plot = sns.regplot(data = group, x = "Dem-Dexc", y = "frame")

But I get TypeError: tuple indices must be integers or slices, not str. Why am I getting this error?


2017-10-16 komodovaran_

I think you need to change:

for group in df_all.groupby("filename")

to:

for i, group in df_all.groupby("filename"): 
    plot = sns.regplot(data = group, x = "Dem-Dexc", y = "frame")

so that each tuple yielded by groupby is unpacked into its key and its sub-DataFrame.

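Or select the second value of the tuple with [1]. Without unpacking, each item is a (key, DataFrame) tuple, as printing it shows: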

for group in df_all.groupby("filename"): 
    print (group) 

('Traces_exp22_tif_pair16.txt',      Dem-Dexc    Aem-Dexc    Aem-Aexc       S       E    fit  frame  \
0  18150.0595  18548.2451  15263.7451  0.7063  0.5054  0.879    1.0
1    596.9286   7161.7353   1652.8922  0.8244  0.9231  0.879    2.0
2     93.2976   3112.3725   2632.6667  0.5491  0.9709  0.879    3.0
3   1481.1310   4365.4902    769.3333  0.8837  0.7467  0.879    4.0
4    583.1786   6192.6373   1225.5392  0.8468  0.9139  0.879    5.0

                      filename
0  Traces_exp22_tif_pair16.txt
1  Traces_exp22_tif_pair16.txt
2  Traces_exp22_tif_pair16.txt
3  Traces_exp22_tif_pair16.txt
4  Traces_exp22_tif_pair16.txt  )

So, if you do not unpack, pass group[1] as the data:

for group in df_all.groupby("filename"): 
    plot = sns.regplot(data = group[1], x = "Dem-Dexc", y = "frame")

You can check what unpacking gives you by printing group, which is now the DataFrame alone:

for i, group in df_all.groupby("filename"): 
    print (group) 

     Dem-Dexc    Aem-Dexc    Aem-Aexc       S       E    fit  frame  \
0  18150.0595  18548.2451  15263.7451  0.7063  0.5054  0.879    1.0
1    596.9286   7161.7353   1652.8922  0.8244  0.9231  0.879    2.0
2     93.2976   3112.3725   2632.6667  0.5491  0.9709  0.879    3.0
3   1481.1310   4365.4902    769.3333  0.8837  0.7467  0.879    4.0
4    583.1786   6192.6373   1225.5392  0.8468  0.9139  0.879    5.0

                      filename
0  Traces_exp22_tif_pair16.txt
1  Traces_exp22_tif_pair16.txt
2  Traces_exp22_tif_pair16.txt
3  Traces_exp22_tif_pair16.txt
4  Traces_exp22_tif_pair16.txt
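
To make the tuple structure concrete, here is a minimal sketch on a toy DataFrame. The values and the short filenames a.txt and b.txt are made up for illustration and are not the asker's data; only the column names come from the question.

```python
import pandas as pd

# Toy stand-in for df_all; values and filenames are invented for illustration.
df_all = pd.DataFrame({
    "Dem-Dexc": [18150.0595, 596.9286, 93.2976, 1481.1310],
    "frame": [1.0, 2.0, 1.0, 2.0],
    "filename": ["a.txt", "a.txt", "b.txt", "b.txt"],
})

# Without unpacking, every item yielded by groupby is a (key, DataFrame) tuple.
items = list(df_all.groupby("filename"))
first = items[0]
print(type(first).__name__)     # tuple
print(type(first[1]).__name__)  # DataFrame

# With unpacking, the key and the sub-DataFrame get their own names.
group_sizes = {}
for name, group in df_all.groupby("filename"):
    group_sizes[name] = len(group)
print(group_sizes)  # {'a.txt': 2, 'b.txt': 2}
```

Indexing the tuple with a column name such as "Dem-Dexc" is what raises the TypeError from the question, since a tuple only accepts integer or slice indices.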

If you want to save the output as a PNG image:

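for i, group in df_all.groupby("filename"):
    plot = sns.regplot(data = group, x = "Dem-Dexc", y = "frame")
    fig = plot.get_figure()
    # name the image after the source file, minus its extension
    fig.savefig("{}.png".format(i.split('.')[0]))
    # clear the figure so the next group is not drawn on top of this one
    fig.clf()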
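
When no ax argument is passed, sns.regplot draws on the current matplotlib axes, so plots from successive groups can end up layered in the same figure. A minimal sketch that gives each group its own figure and closes it after saving is shown here. The helper name save_group_plots, the toy data, the Agg backend, and the temporary output directory are all assumptions made for illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns


def save_group_plots(df, out_dir):
    """Hypothetical helper: save one regression plot per filename group."""
    saved = []
    for name, group in df.groupby("filename"):
        fig, ax = plt.subplots()  # fresh figure for every group
        sns.regplot(data=group, x="Dem-Dexc", y="frame", ax=ax)
        ax.set_title(name)
        # Build the image name from the source filename without its extension.
        path = os.path.join(out_dir, os.path.splitext(name)[0] + ".png")
        fig.savefig(path)
        plt.close(fig)  # release the figure so plots do not accumulate
        saved.append(path)
    return saved


# Toy data, invented for illustration only.
df_all = pd.DataFrame({
    "Dem-Dexc": [10.0, 12.5, 11.0, 14.0, 3.0, 4.5, 2.0, 5.5],
    "frame": [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0],
    "filename": ["a.txt"] * 4 + ["b.txt"] * 4,
})

out_dir = tempfile.mkdtemp()
saved_paths = save_group_plots(df_all, out_dir)
print(saved_paths)
```

Passing ax explicitly keeps each plot independent of whatever figure happens to be current, which matters once the loop runs over many files.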


2017-10-16 10:11:39 jezrael

Now I get 'TypeError: Could not convert 24540.50005151.2500633.5000550.0000-1204.50002094.0000-1045.2500742.750052.75003610.75009719(...) to numeric.' There now seems to be some problem parsing the DataFrame, even though it prints just fine. What gives? –

I think you need df['Dem-Dexc'] = df['Dem-Dexc'].astype(float), because the column contains strings. – jezrael

You're right. That was my mistake. It seems some of my DataFrames were combined as strings during the initial import. Thanks! –
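
The conversion suggested in the comments can be sketched as follows. The sample strings are made up, and the pd.to_numeric variant is an addition not mentioned in the thread; it may help when some entries cannot be parsed as numbers.

```python
import pandas as pd

# Made-up column of numbers that were read in as strings.
df = pd.DataFrame({"Dem-Dexc": ["24540.5", "5151.25", "-1204.5"]})

# Summing strings concatenates them, which is why the error message in the
# comment above shows one long run of digits.
print(df["Dem-Dexc"].sum())

# Convert the column to floats, as suggested in the thread.
df["Dem-Dexc"] = df["Dem-Dexc"].astype(float)
total = df["Dem-Dexc"].sum()
print(total)

# Alternative (an assumption, not from the thread): pd.to_numeric with
# errors="coerce" turns unparseable entries into NaN instead of raising.
messy = pd.Series(["1.5", "n/a", "3"])
cleaned = pd.to_numeric(messy, errors="coerce")
print(cleaned.isna().sum())
```

Checking df.dtypes right after import is a quick way to spot columns that were loaded as strings before any plotting is attempted.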
