Pandas: conditionally replace values if there is more than 1 unique value in another column

Given the following DataFrame:

import pandas as pd 
df = pd.DataFrame(
     {'A':['A','A','B','B','C','C'], 
     'B':['Y','Y','N','N','Y','N'], 
     }) 
df 

  A B
0 A Y 
1 A Y 
2 B N 
3 B N 
4 C Y 
5 C N

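I need code that:

1. Identifies whether column B has more than one unique value within each category of column A (here, category 'C' in column A has 2 unique values in column B, whereas categories 'A' and 'B' each have only 1).
2. Changes the values in column B to 'Y' only where a category has more than one unique value (that is, both rows of category 'C' in column A should end up with 'Y' in column B).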

Here is the desired result:

  A B
0 A Y
1 A Y
2 B N
3 B N
4 C Y
5 C Y

Thanks in advance.

Source

2016-01-13 Dance Party2

Alternatively: if column B contains both 'Y' and 'N' for a given category in column A, then change all values in column B for that category to 'Y'.

Sounds equivalent to 'more than 1 unique value'? – Stefan
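As a quick check of the two phrasings in the comments above, the sketch below computes both conditions on the question's data and compares them. The variable names (`many_unique`, `has_both`, `both_categories`) are illustrative and not from the original thread. The two conditions only coincide when column B holds nothing but 'Y' and 'N'.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['A', 'A', 'B', 'B', 'C', 'C'],
    'B': ['Y', 'Y', 'N', 'N', 'Y', 'N'],
})

# Phrasing 1: the A category has more than one unique value in B.
many_unique = df.groupby('A')['B'].transform('nunique') > 1

# Phrasing 2: the A category contains both 'Y' and 'N' in B.
both_categories = {
    name for name, values in df.groupby('A')['B']
    if {'Y', 'N'} <= set(values)
}
has_both = df['A'].isin(both_categories)

# On this data the two row masks are identical.
print((many_unique == has_both).all())
```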

You could:

df['B'] = df.groupby('A')['B'].transform(lambda x: 'Y' if x.nunique() > 1 else x)

to get:

  A B
0 A Y 
1 A Y 
2 B N 
3 B N 
4 C Y 
5 C Y
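A possible variant of the same idea, shown here only as a sketch: build a boolean row mask with `transform('nunique')` and assign through `.loc`, which avoids a Python-level lambda. The name `mask` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['A', 'A', 'B', 'B', 'C', 'C'],
    'B': ['Y', 'Y', 'N', 'N', 'Y', 'N'],
})

# Per-row count of unique B values within the row's A category.
mask = df.groupby('A')['B'].transform('nunique') > 1

# Overwrite B only in rows whose category has more than one unique value.
df.loc[mask, 'B'] = 'Y'
print(df)
```

Rows in categories with a single unique value are left untouched, so 'B' keeps its 'N' values.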

Source

2016-01-13 19:14:48 Stefan

Nice use of a conditional transform, and more concise than mine. Well done. – cwharland

This should work:

import pandas as pd 
df = pd.DataFrame(
     {'A':['A','A','B','B','C','C'], 
     'B':['Y','Y','N','N','Y','N'], 
     }) 

# Count the unique B values within each category of column A
group_counts = df.groupby('A').B.apply(lambda x: len(x.unique()))
# Keep the categories that have more than 1 unique value
cols_to_impute = group_counts[group_counts > 1].index.values
# Set column B to 'Y' for every row in those categories
df.loc[df.A.isin(cols_to_impute),'B'] = 'Y'

In [20]: df 
Out[20]: 
  A B
0 A Y 
1 A Y 
2 B N 
3 B N 
4 C Y 
5 C Y
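If the same steps are needed for other columns, they could be wrapped in a small function. This is a sketch only: `fill_mixed_groups` and its parameters are hypothetical names, not part of pandas or of the answer above. It uses `nunique()`, which ignores NaN by default, whereas `len(x.unique())` counts NaN as a value.

```python
import pandas as pd


def fill_mixed_groups(df, key='A', col='B', fill='Y'):
    """Hypothetical helper: return a copy of df in which `col` is set to
    `fill` for every `key` category holding more than one unique `col` value."""
    out = df.copy()
    # Number of distinct `col` values per category (NaN ignored by default).
    counts = out.groupby(key)[col].nunique()
    # Categories with more than one distinct value.
    mixed = counts[counts > 1].index
    out.loc[out[key].isin(mixed), col] = fill
    return out


df = pd.DataFrame({
    'A': ['A', 'A', 'B', 'B', 'C', 'C'],
    'B': ['Y', 'Y', 'N', 'N', 'Y', 'N'],
})
result = fill_mixed_groups(df)
print(result)
```

Working on a copy leaves the input DataFrame unchanged, which makes the before and after easy to compare.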

Source

2016-01-13 19:11:19 cwharland
