Pandas: change column values in selected rows

I am trying to create a new DataFrame by first splitting the original one in two:

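df1 - contains only the rows of the original frame whose selected column has a value from a given list

df2 - contains only the rows of the original frame whose selected column has other values, with those values then changed to a new given value.

Return a new DataFrame that is the concatenation of df1 and df2.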

This works fine:

import pandas as pd

l1 = ['a','b','c','d','a','b']
l2 = [1,2,3,4,5,6]
df = pd.DataFrame({'cat':l1,'val':l2})
print(df)

  cat  val
0   a    1
1   b    2
2   c    3
3   d    4
4   a    5
5   b    6

df['cat'] = df['cat'].apply(lambda x: 'other') 
print(df) 

     cat  val
0  other    1
1  other    2
2  other    3
3  other    4
4  other    5
5  other    6

However, when I define a function:

def create_df(df, select, vals, other): 
    df1 = df.loc[df[select].isin(vals)] 
    df2 = df.loc[~df[select].isin(vals)] 
    df2[select] = df2[select].apply(lambda x: other) 
    result = pd.concat([df1, df2]) 
    return result

and call it:

df3 = create_df(df, 'cat', ['a','b'], 'xxx') 
print(df3)

this produces exactly what I need:

   cat  val
0    a    1
1    b    2
4    a    5
5    b    6
2  xxx    3
3  xxx    4

For some reason, though, in this case I get a warning:

..\usr\conda\lib\site-packages\ipykernel\__main__.py:10: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame. 
Try using .loc[row_indexer,col_indexer] = value instead

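So how is this case (assigning values to a column inside a function) different from the first one, where the assignment is not inside a function?

What is the correct way to change column values?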

Asked 2017-08-05 by dokondr

Possible duplicate of "How to deal with SettingWithCopyWarning in Pandas?" (https://stackoverflow.com/questions/20625582/how-to-deal-with-settingwithcopywarning-in-pandas)

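Still, it is strange that in my case this behaves differently in different parts of the code: I get the warning inside the function definition but not in the main program. Why is that? – dokondr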
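The difference is most likely not the function itself. In the main program the assignment targets `df`, a frame built directly from a dict, while inside `create_df` it targets `df2`, which was obtained by indexing another frame. pandas versions that emit `SettingWithCopyWarning` flag assignments to such derived frames wherever they occur. The sketch below repeats the pattern at top level with no function involved; the names `direct`, `subset` and `caught` are illustrative, and whether a warning is actually recorded depends on the pandas version and its settings (releases with Copy-on-Write enabled may not emit it at all).

```python
import warnings

import pandas as pd

df = pd.DataFrame({'cat': ['a', 'b', 'c', 'd', 'a', 'b'],
                   'val': [1, 2, 3, 4, 5, 6]})

# Assigning to a column of a frame that is not derived from another frame
# by indexing: nothing for pandas to flag.
direct = df.copy()
direct['cat'] = 'other'

# Assigning to a column of a frame obtained by indexing another frame:
# this is the pattern behind the warning, inside a function or not.
subset = df.loc[~df['cat'].isin(['a', 'b'])]
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    subset['cat'] = 'xxx'

# Names of any warnings recorded (version dependent, may be empty).
print([type(w.message).__name__ for w in caught])

# Boolean indexing returned new data, so the original frame is untouched.
print(df['cat'].tolist())
```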

Well, I guess there are many ways this code could be optimized, but to make it work you can simply take copies of the input DataFrame and concat those:

def create_df(df, select, vals, other): 
    df1 = df.copy()[df[select].isin(vals)]   # boolean indexing
    df2 = df.copy()[~df[select].isin(vals)]  # boolean indexing
    df2[select] = other  # assigning the scalar directly is sufficient
    result = pd.concat([df1, df2]) 
    return result
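A variation on the function above, offered as a sketch rather than as the answerer's code: it computes the mask once and copies only the filtered rows that will be modified, instead of copying the whole frame twice. The name `create_df_copy` is hypothetical.

```python
import pandas as pd


def create_df_copy(df, select, vals, other):
    # Boolean mask: True where the selected column holds one of `vals`.
    mask = df[select].isin(vals)
    # Rows to keep unchanged; they are never assigned to.
    df1 = df.loc[mask]
    # Rows to modify; the explicit copy makes df2 an independent frame.
    df2 = df.loc[~mask].copy()
    df2[select] = other
    return pd.concat([df1, df2])


df = pd.DataFrame({'cat': ['a', 'b', 'c', 'd', 'a', 'b'],
                   'val': [1, 2, 3, 4, 5, 6]})
df3 = create_df_copy(df, 'cat', ['a', 'b'], 'xxx')
print(df3)
```

Copying after filtering keeps the intent explicit: `df2` is meant to be a separate frame, so assigning to it is unambiguous.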

Alternative version:

l1 = ['a','b','c','d','a','b'] 
l2 = [1,2,3,4,5,6] 
df = pd.DataFrame({'cat':l1,'val':l2}) 

# define a mask
mask = df['cat'].isin(list("ab"))

# concatenate the masked rows, then the non-masked rows
df2 = pd.concat([df[mask], df[~mask]])

# change values in the non-masked rows to 'xxx'
df2.loc[~mask, "cat"] = "xxx"

Output:

   cat  val
0    a    1
1    b    2
4    a    5
5    b    6
2  xxx    3
3  xxx    4

Or as a function:

def create_df(df, filter_, isin_, value):

    # define a mask
    mask = df[filter_].isin(isin_)

    # concatenate the masked rows, then the non-masked rows
    df = pd.concat([df[mask], df[~mask]])

    # change values in the non-masked rows to the given value
    df.loc[~mask, filter_] = value

    return df

df2 = create_df(df, 'cat', ['a','b'], 'xxx') 
df2
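Another possible formulation, sketched here as an assumption about what reads cleanly rather than as part of the answer: replace the values with `Series.where` on a new frame, then reorder the rows with a stable sort so the selected rows come first. No intermediate slice is ever assigned to. The name `create_df_where` is hypothetical.

```python
import numpy as np
import pandas as pd


def create_df_where(df, select, vals, other):
    # Boolean mask: True where the selected column holds one of `vals`.
    mask = df[select].isin(vals)
    # Keep values where the mask is True, replace the rest with `other`;
    # assign() returns a new frame and leaves `df` unchanged.
    out = df.assign(**{select: df[select].where(mask, other)})
    # Stable sort on the inverted mask: selected rows (False) come first,
    # and the original order is preserved within each group.
    order = np.argsort((~mask).to_numpy(), kind='stable')
    return out.iloc[order]


df = pd.DataFrame({'cat': ['a', 'b', 'c', 'd', 'a', 'b'],
                   'val': [1, 2, 3, 4, 5, 6]})
df4 = create_df_where(df, 'cat', ['a', 'b'], 'xxx')
print(df4)
```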

Answered 2017-08-05 22:05:49
