Pandas DataFrame: update one column using another column

I have a DataFrame df with two columns, phone and label, where label can only be 0 or 1.
Here is an example:

phone label 
    a  0 
    b  1 
    a  1 
    a  0 
    c  0 
    b  0

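What I want to do is count the number of 1s for each 'phone' value and then replace the 'phone' column with that count. I suspect groupby is the right tool, but I'm not familiar with it.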

The answer should be:

Count the number of each 'phone' 
phone count 
    a   1 
    b   1 
    c   0 

replace the 'phone' with 'count' in the original table 
phone 
    1 
    1 
    1 
    1 
    0 
    1


2016-07-15 Fan

Do you want to find the number of rows for each 'phone' given label == 1? –

Do you want df.groupby('phone').sum()? – bernie

But how can I replace 'phone' with the sum? – Fan

Taking into account that the label column can only contain 0 or 1, you can use the .transform('sum') method:

In [4]: df.label = df.groupby('phone')['label'].transform('sum') 

In [5]: df 
Out[5]: 
    phone label 
0  a  1 
1  b  1 
2  a  1 
3  a  1 
4  c  0 
5  b  1

Explanation:

In [2]: df 
Out[2]: 
    phone label 
0  a  0 
1  b  1 
2  a  1 
3  a  0 
4  c  0 
5  b  0 

In [3]: df.groupby('phone')['label'].transform('sum') 
Out[3]: 
0 1 
1 1 
2 1 
3 1 
4 0 
5 1 
dtype: int64
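
The session above writes the per-phone sum into label, whereas the question asks for the phone column to be replaced. A minimal sketch of that variant, assuming the six-row frame from the question; the names per_row_count and result are illustrative only, and assign is used so the original df stays untouched:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "phone": ["a", "b", "a", "a", "c", "b"],
    "label": [0, 1, 1, 0, 0, 0],
})

# transform('sum') returns one value per original row (same index as df),
# so the result can be assigned straight back to a column.
per_row_count = df.groupby("phone")["label"].transform("sum")

# assign() returns a new frame with 'phone' replaced; df itself is unchanged.
result = df.assign(phone=per_row_count)

print(result["phone"].tolist())
```

Because label holds only 0 and 1, summing it is the same as counting the rows where label == 1.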


2016-07-15 07:11:51 MaxU

You can filter and group data in pandas. For your case it looks like this.

Assuming this data:

phone label 
0  a  0 
1  b  1 
2  a  1 
3  a  1 
4  c  1 
5  d  1 
6  a  0 
7  c  0 
8  b  0 

df.groupby(['phone','label'])['label'].count() 
phone label 
a  0  2 
     1  2 
b  0  1 
     1  1 
c  0  1 
     1  1 
d  1  1

If you need the number of rows per phone given label==1, then do this -

#first filter to get only label==1 rows 
phone_rows_label_one_df = df[df.label==1] 

#then do groupby 
phone_rows_label_one_df.groupby(['phone'])['label'].count() 

phone 
a 2 
b 1 
c 1 
d 1
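
One thing to keep in mind with the filter-then-group approach: a phone that has no label==1 rows disappears from the result instead of showing up with a count of 0. The sketch below, run on the six-row data from the question (where c has no 1s), compares it with grouping a boolean mask; the names filtered_counts and all_counts are illustrative only:

```python
import pandas as pd

# Sample data copied from the question; phone 'c' has no label == 1 rows.
df = pd.DataFrame({
    "phone": ["a", "b", "a", "a", "c", "b"],
    "label": [0, 1, 1, 0, 0, 0],
})

# Filter first, then count: groups with no matching rows are dropped.
filtered_counts = df[df.label == 1].groupby("phone")["label"].count()

# Group the boolean mask instead: every phone is kept, zeros included.
all_counts = df["label"].eq(1).groupby(df["phone"]).sum()

print(filtered_counts)
print(all_counts)
```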

To get the count as a column of a new DataFrame, do this:

phone_rows_label_one_df.groupby(['phone'])['label'].count().reset_index(name='count') 
    phone count 
0  a  2 
1  b  1 
2  c  1 
3  d  1
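
To put these counts back onto the original table, one possible approach (not the only one) is to look up each row's phone in the counts Series with Series.map. This is a sketch on the nine-row sample data above; the column name count and the fillna(0) step for phones without any label==1 rows are assumptions added for illustration:

```python
import pandas as pd

# The nine-row sample data used in this answer.
df = pd.DataFrame({
    "phone": ["a", "b", "a", "a", "c", "d", "a", "c", "b"],
    "label": [0, 1, 1, 1, 1, 1, 0, 0, 0],
})

# Count label == 1 rows per phone (a Series indexed by phone).
counts = df[df.label == 1].groupby("phone")["label"].count()

# Look up each row's phone in that Series; phones missing from it
# would become NaN, so fill with 0 and convert back to integers.
df["count"] = df["phone"].map(counts).fillna(0).astype(int)

print(df)
```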


2016-07-15 02:44:18

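Actually, I want to find the number of rows for each 'phone' value given label == 1. – Fan

How can I replace 'phone' with the count in the original table? – Fan

@Fan Done. Pandas is awesome! –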
