2017-01-02 95 views

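Python pandas: fill values to the right, by group

I want to replicate an Excel-like "fill right" function that fills a value to the right until the next cell that is not null/NaN/empty. A cell should only be filled if the value directly below it, in the next row, is not empty or NaN. This also has to be done separately for each group. I have the following pandas DataFrame. My current input table is "have". My desired output table is "want".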

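I am just a Python beginner, so any help would be appreciated. For anyone who wants to run this as a per-group operation, the data is below. Table "have", with the grouping field "groups":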

import pandas as pd

have = pd.DataFrame({
    "groups": pd.Series(["group1", "group1", "group1", "group2", "group2", "group2"]),
    "0": pd.Series(["abc", "1", "something here", "abc2", "1", "something here"]),
    "1": pd.Series(["", "2", "something here", "", "", "something here"]),
    "2": pd.Series(["", "3", "something here", "", "3", "something here"]),
    "3": pd.Series(["something", "1", "something here", "something", "1", "something here"]),
    "4": pd.Series(["", "2", "something here", "", "2", "something here"]),
    "5": pd.Series(["", "", "something here", "", "", "something here"]),
    "6": pd.Series(["", "", "something here", "", "", "something here"]),
    "7": pd.Series(["cdf", "5", "something here", "mnop", "5", "something here"]),
    "8": pd.Series(["", "6", "something here", "", "6", "something here"]),
    "9": pd.Series(["xyz", "1", "something here", "xyz", "1", "something here"]),
})

Table "want", with the grouping field "groups":

import pandas as pd

want = pd.DataFrame({
    "groups": pd.Series(["group1", "group1", "group1", "group2", "group2", "group2"]),
    "0": pd.Series(["abc", "1", "something here", "anything", "1", "something here"]),
    "1": pd.Series(["abc", "2", "something here", " anything ", "2", "something here"]),
    "2": pd.Series(["abc", "3", "something here", " anything ", "3", "something here"]),
    "3": pd.Series(["something", "1", "something here", "", "", "something here"]),
    "4": pd.Series(["something ", "2", "something here", "", "", "something here"]),
    "5": pd.Series(["", "", "something here", "", "", "something here"]),
    "6": pd.Series(["", "", "something here", "", "", "something here"]),
    "7": pd.Series(["cdf", "5", "something here", "mnop", "5", "something here"]),
    "8": pd.Series(["cdf ", "6", "something here", " mnop ", "6", "something here"]),
    "9": pd.Series(["xyz", "1", "something here", "xyz", "1", "something here"]),
})

I tried the code below, but I am still getting familiar with the groupby and apply statements:

grouped=have.groupby('groups') 
have.groupby('groups').apply(lambda g: have.loc[g].isnull()) 
#cond = have.loc[1].isnull() | have.loc[1].ne('') 
want.loc[0, cond] = want.loc[0, cond].str.strip().replace('', None) 
want 

Answer

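def fill(df):
    df = df.copy()
    # Labels of the group's first two rows: the row to fill and the row below it.
    i0, i1 = df.index[0], df.index[1]
    # Columns whose second-row value is not '' (NaN also compares unequal to '').
    cond = df.loc[i1].isnull() | df.loc[i1].ne('')
    # Relies on replace('', None) padding from the previous value, as older pandas did.
    df.loc[i0, cond] = df.loc[i0, cond].str.strip().replace('', None)
    return df


have.groupby('groups', group_keys=False).apply(fill)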
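For readers on a newer pandas, here is a minimal sketch of the same per-group "fill right" idea with the forward fill written out explicitly. As far as I know, recent pandas versions no longer treat `replace('', None)` as a pad, so this version swaps in `replace('', np.nan)` followed by `ffill()`. The names `fill_right`, `demo`, and `result` are hypothetical, `demo` is a cut-down copy of the "have" table, and the condition is assumed to mean "the cell below is neither NaN nor an empty string". It also loops over the groups and concatenates instead of calling `groupby(...).apply`, so the "groups" column is kept whichever pandas version is in use.

```python
import numpy as np
import pandas as pd


def fill_right(df):
    """Forward-fill blanks in a group's first row, left to right.

    Only columns whose second-row value is present are touched.
    """
    df = df.copy()
    i0, i1 = df.index[0], df.index[1]
    # Assumption: "present" means neither NaN nor an empty string.
    cond = df.loc[i1].notna() & df.loc[i1].ne("")
    # Turn blanks into NaN so ffill can pad them from the value to the left.
    row = df.loc[i0, cond].str.strip().replace("", np.nan)
    # Anything still missing (nothing to its left) goes back to an empty string.
    df.loc[i0, cond] = row.ffill().fillna("")
    return df


# Hypothetical cut-down version of the "have" table from the question.
demo = pd.DataFrame({
    "groups": ["group1", "group1", "group1", "group2", "group2", "group2"],
    "0": ["abc", "1", "something here", "abc2", "1", "something here"],
    "1": ["", "2", "something here", "", "", "something here"],
    "2": ["", "3", "something here", "", "3", "something here"],
    "3": ["something", "1", "something here", "something", "1", "something here"],
    "4": ["", "2", "something here", "", "2", "something here"],
})

# Process each group separately, then stitch the pieces back together.
result = pd.concat(
    [fill_right(g) for _, g in demo.groupby("groups")]
).sort_index()
print(result)
```

In this sketch, column "1" of the first group2 row stays empty because the cell below it is empty, which is the condition described in the question.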



Thanks piRSquared. U genius :) – Seb