Python pandas: selecting elements based on the complement of an index

I have a DataFrame from which I have selected two subsets, df_a and df_b. For example, with the iris dataset:

df_a = iris[iris.Name == "Iris-setosa"] 
df_b = iris[iris.Name == "Iris-virginica"]

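What is the best way to get all the elements of iris that are in neither df_a nor df_b? I would rather not refer to the original conditions that define df_a and df_b. I am only assuming that df_a and df_b are subsets of iris, so I want to extract elements from iris based on the indices of df_a and df_b. Basically, assume: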

df_a = get_a_subset(iris) 
df_b = get_b_subset(iris) 
# retrieve the subset of iris that 
# has all elements not in df_a or in df_b 
# ...

EDIT: here is a solution that seems inefficient and inelegant; I am sure pandas has a better way:

# get subset of iris that is not in a nor in b 
df_rest = iris[map(lambda x: (x not in df_a.index) & (x not in df_b.index), iris.index)]

And a second one:

df_rest = iris.ix[iris.index - df_a.index - df_b.index]

How can this be done most efficiently and elegantly in pandas? Thanks.

This seems to be a bit faster than your second solution. There is some overhead when indexing with .ix:

df[~df.index.isin(df_a.index+df_b.index)]
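
For readers on a recent pandas release, here is a minimal sketch of the same index-complement idea. It is not the exact code from this thread: arithmetic operators on Index objects (`+`, `-`) now act elementwise rather than as set operations, and `.ix` has been removed, so the sketch uses `Index.union`, `Index.isin`, and `DataFrame.drop` instead. The small `iris` frame is a hypothetical stand-in for the real dataset, and the index is assumed to hold unique labels.

```python
import pandas as pd

# Hypothetical stand-in for the iris data (made-up values, default integer index).
iris = pd.DataFrame({
    "SepalLength": [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],
    "Name": ["Iris-setosa", "Iris-setosa", "Iris-versicolor",
             "Iris-versicolor", "Iris-virginica", "Iris-virginica"],
})

# The two subsets; how they were built is irrelevant from here on.
df_a = iris[iris.Name == "Iris-setosa"]
df_b = iris[iris.Name == "Iris-virginica"]

# Labels that appear in either subset.
taken = df_a.index.union(df_b.index)

# Option 1: boolean mask over the original index.
df_rest = iris[~iris.index.isin(taken)]

# Option 2: drop the same labels directly.
df_rest_drop = iris.drop(taken)
```

Both options rely only on the indices of df_a and df_b, not on the conditions that produced them. If the index contained duplicate labels, either option would remove every row sharing a label with a subset.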

2013-02-20 18:18:56 Zelazny7

Answer