Python pandas: fill values based on group

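I want to replicate something like Excel's "fill right" function, which fills a value to the right until the next value that is not null / NaN / empty. The fill should only be applied where the value in the immediately following row is not empty or NaN. It also has to be done separately for each group. I have the following pandas DataFrame. My current input table is "have". My desired output table is "want".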
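To make the rule concrete, here is a minimal sketch on a single pair of rows. The names `header`, `below`, and `fillable` are illustrative only and do not come from the dataset, and empty cells are assumed to be stored as empty strings.

```python
import pandas as pd

# Hypothetical two-row example: a header row and the row directly below it.
header = pd.Series(["abc", "", "", "something", "", ""])
below = pd.Series(["1", "2", "3", "1", "2", ""])

# A header cell may be filled only where the cell below it is non-empty.
fillable = below.notna() & below.ne("")

# Treat empty strings as missing, then carry the last seen value to the right.
candidates = header[fillable]
filled = header.copy()
filled.loc[fillable] = candidates.where(candidates.ne("")).ffill()

print(filled.tolist())
# → ['abc', 'abc', 'abc', 'something', 'something', '']
```

The last cell stays empty because the cell below it is empty, which is the condition described above.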

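I am just a Python beginner, so any help would be appreciated. For anyone who wants to try this group-wise operation, the data is below. Table "have", with the grouping field "groups":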

import pandas as pd

have = pd.DataFrame({
    "groups": pd.Series(["group1","group1","group1","group2","group2","group2"]),
    "0": pd.Series(["abc","1","something here","abc2","1","something here"]),
    "1": pd.Series(["","2","something here","","","something here"]),
    "2": pd.Series(["","3","something here","","3","something here"]),
    "3": pd.Series(["something","1","something here","something","1","something here"]),
    "4": pd.Series(["","2","something here","","2","something here"]),
    "5": pd.Series(["","","something here","","","something here"]),
    "6": pd.Series(["","","something here","","","something here"]),
    "7": pd.Series(["cdf","5","something here","mnop","5","something here"]),
    "8": pd.Series(["","6","something here","","6","something here"]),
    "9": pd.Series(["xyz","1","something here","xyz","1","something here"]),
})

Table "want", with the grouping field "groups":

import pandas as pd

want = pd.DataFrame({
    "groups": pd.Series(["group1","group1","group1","group2","group2","group2"]),
    "0": pd.Series(["abc","1","something here","anything","1","something here"]),
    "1": pd.Series(["abc","2","something here"," anything ","2","something here"]),
    "2": pd.Series(["abc","3","something here"," anything ","3","something here"]),
    "3": pd.Series(["something","1","something here","","","something here"]),
    "4": pd.Series(["something ","2","something here","","","something here"]),
    "5": pd.Series(["","","something here","","","something here"]),
    "6": pd.Series(["","","something here","","","something here"]),
    "7": pd.Series(["cdf","5","something here","mnop","5","something here"]),
    "8": pd.Series(["cdf ","6","something here"," mnop ","6","something here"]),
    "9": pd.Series(["xyz","1","something here","xyz","1","something here"]),
})

I tried the code below, but I am still getting familiar with groupby and apply:

grouped=have.groupby('groups') 
have.groupby('groups').apply(lambda g: have.loc[g].isnull()) 
#cond = have.loc[1].isnull() | have.loc[1].ne('') 
want.loc[0, cond] = want.loc[0, cond].str.strip().replace('', None) 
want

2017-01-02 Seb

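def fill(df):
    # Work on a copy so the group passed in by groupby is not modified.
    df = df.copy()
    # Index labels of the group's first row and the row directly below it.
    i0, i1 = df.index[0], df.index[1]
    # Columns whose value in the second row is NaN or a non-empty string.
    cond = df.loc[i1].isnull() | df.loc[i1].ne('')
    # In those columns, turn blanks in the first row into NaN and fill right.
    row = df.loc[i0, cond].str.strip()
    df.loc[i0, cond] = row.where(row.ne('')).ffill()
    return df


have.groupby('groups', group_keys=False).apply(fill)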
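The same idea can also be sketched without `groupby.apply`, whose handling of the grouping column has varied between pandas releases. This is an illustrative variant under stated assumptions, not the answer's own code: the helper name `fill_first_row` and the small `demo` frame are made up here, empty cells are assumed to be empty strings, and groups with fewer than two rows are returned unchanged.

```python
import pandas as pd

def fill_first_row(group, value_cols):
    """Fill-right the first row of `group` where the second row is non-empty."""
    group = group.copy()
    if len(group) < 2:
        return group  # there is no row below the first one to check against
    i0, i1 = group.index[0], group.index[1]
    below = group.loc[i1, value_cols]
    # Columns that may be filled: the cell below is present and not ''.
    mask = below.notna() & below.ne("")
    cols = below.index[mask]
    row = group.loc[i0, cols].str.strip()
    # Empty strings become missing, then the last seen value is carried right.
    group.loc[i0, cols] = row.where(row.ne("")).ffill()
    return group

# Made-up demo data in the same shape as `have` (one grouping column).
demo = pd.DataFrame({
    "groups": ["g1", "g1", "g2", "g2"],
    "0": ["abc", "1", "xyz", "1"],
    "1": ["", "2", "", ""],
    "2": ["", "3", "", "3"],
})
value_cols = [c for c in demo.columns if c != "groups"]

# Iterate over the groups directly instead of using groupby.apply.
pieces = [fill_first_row(g, value_cols) for _, g in demo.groupby("groups")]
result = pd.concat(pieces)
print(result)
```

In the `g2` group, column "1" stays empty because the cell below it is empty, while column "2" still receives "xyz". As with `fill` above, the fill is computed only over the selected columns, so a value can be carried across a skipped column.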

2017-01-02 10:19:37 piRSquared

Thanks, piRSquared. You're a genius :) – Seb
