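Pandas: apply multiple custom functions

I am new to Python and pandas. I need to do some simple parsing of a pandas DataFrame to get a new DataFrame, and this involves several functions. Here is a toy example: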
import re
import pandas as pd

df = pd.DataFrame({
    'A': ["T100", "T100", "M100", "M100"],
    'B': ["520", "620", "720", "820"],
    'C': ["10/50", "20/50", "30/50", "50/50"],
})
>>> df
      A    B      C
0  T100  520  10/50
1  T100  620  20/50
2  M100  720  30/50
3  M100  820  50/50
Here is what I have tried. Naturally it does not work (it raises AttributeError: 'DataFrame' object has no attribute 'agg'), but it shows the idea of what I want to do:
def get_pat_ID(row):
    sample = row['A']
    # Capture the digits that follow the leading T or M
    patID = re.match(r"[TM](\d+)", sample).group(1)
    return patID

def get_funcB(row):
    sample, b, c = row['A'], row['B'], row['C']
    if sample == "T100":
        output = b + "_" + c
    else:
        output = "NA"
    return output

def cust(dataset, funcname):
    # I want the function to be performed on each row of my dataframe
    f = dataset.apply(funcname, axis=1)
    return f

# contains all the functions that I want to pass to my dataframe
funcdict = {"pat_ID": get_pat_ID, "funcB": get_funcB}
# creates one column for output of each function
funcs = {'PatID': cust(df, funcdict["pat_ID"]), 'AnotherFunc': cust(df, funcdict["funcB"])}
newdf = pd.DataFrame()
newdf = df.agg(funcs)  # this is the call that raises the AttributeError
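For reference, here is a minimal sketch of one way the attempt above could be made to run. It assumes the goal is simply one output column per function. Each `apply(..., axis=1)` call already returns a Series aligned with the rows of `df`, so the dict of Series can be handed straight to the `pd.DataFrame` constructor and no call to `agg` is needed. This is an illustration under that assumption, not necessarily the intended design.

```python
import re
import pandas as pd

df = pd.DataFrame({
    'A': ["T100", "T100", "M100", "M100"],
    'B': ["520", "620", "720", "820"],
    'C': ["10/50", "20/50", "30/50", "50/50"],
})

def get_pat_ID(row):
    # Capture the digits that follow the leading T or M
    return re.match(r"[TM](\d+)", row['A']).group(1)

def get_funcB(row):
    # Join B and C for T100 samples, otherwise return the string "NA"
    if row['A'] == "T100":
        return row['B'] + "_" + row['C']
    return "NA"

# One Series per function, each indexed like df
funcs = {
    'PatID': df.apply(get_pat_ID, axis=1),
    'AnotherFunc': df.apply(get_funcB, axis=1),
}

# A dict of equally indexed Series becomes one column per key
newdf = pd.DataFrame(funcs)
print(newdf)
```

This still walks over the rows once per function, so it fixes the error without addressing the efficiency concern.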
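I know my approach is not the most efficient, because apply iterates over the same rows again every time I compute one of the functions. Can anyone help me?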
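The repeated row iteration described above can be avoided in at least two ways, sketched below under the assumption that the two toy functions are representative. The first wraps all functions in a single helper (the name `apply_all` is hypothetical) so that `apply` visits each row only once. The second drops row-wise `apply` entirely and uses column-wise string operations, which is usually faster when the logic can be expressed that way.

```python
import re
import pandas as pd

df = pd.DataFrame({
    'A': ["T100", "T100", "M100", "M100"],
    'B': ["520", "620", "720", "820"],
    'C': ["10/50", "20/50", "30/50", "50/50"],
})

def get_pat_ID(row):
    return re.match(r"[TM](\d+)", row['A']).group(1)

def get_funcB(row):
    if row['A'] == "T100":
        return row['B'] + "_" + row['C']
    return "NA"

# Output column name -> row function
funcdict = {"PatID": get_pat_ID, "AnotherFunc": get_funcB}

def apply_all(row):
    # Hypothetical helper: run every function on the same row and
    # return the results as a Series, which apply expands into columns
    return pd.Series({name: func(row) for name, func in funcdict.items()})

# Option 1: a single pass over the rows
newdf = df.apply(apply_all, axis=1)

# Option 2: vectorized column operations, no per-row Python calls
vectorized = pd.DataFrame({
    'PatID': df['A'].str.extract(r"[TM](\d+)", expand=False),
    'AnotherFunc': (df['B'] + "_" + df['C']).where(df['A'] == "T100", "NA"),
})

print(newdf)
print(vectorized)
```

Option 1 keeps the existing row functions unchanged, which may matter if they are hard to vectorize. Option 2 requires rewriting each function in terms of whole columns.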
Sorry for my late reply! Thanks for your answer! – phusion