Creating a pandas Series by looping with a placeholder

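I want to automatically rename the missing values in a pandas column based on an if condition, preferably using the pattern 'string_name_number'. The numbers should start at 1 and end at the last missing value. I set up my loop as shown below to pick the data from the string.

However, the missing values in the result (df2) stay unchanged, as follows: respondent i, jakson, respondent i, respondent i, jane, respondent i, mary, ...

I expect to see the following result (df2): respondent 1, jakson, respondent 2, respondent 3, jane, respondent 4, mary, ...

Please assist.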

import pandas as pd 

df = pd.read_csv('232 responses.csv', sep=',',header=0, parse_dates=True, 
       index_col='Timestamp') 

missing_rows_list = list(range(0, len (df))) 

for i in missing_rows_list: 
    i = 1 
    df2 = [df['Name (optional)']\ 
      .replace(np.nan, 'respondent {d[i]}'\ 
      .format(d=missing_rows_list)) if pd.isnull(df['Name (optional)']) \ 
      else df['Name (optional)'] == word in df['Name (optional)']] 
    i += 1

Source

2017-07-29 Gwiji

Before you seek further advice: df['Name (optional)'].isnull is _not_ a method call but a reference to a method. That expression is always "true". – DyZ

Let me check and get back to you. – Gwiji

Adjusted to pd.isnull(df['Name (optional)']); I hope this is a method call. – Gwiji
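The distinction raised in the comments can be illustrated with a minimal sketch. The `names` Series and its values are hypothetical sample data, not the asker's CSV. It also shows why using a whole Series as an `if` condition, as in the adjusted code, fails rather than testing each row.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for df['Name (optional)']
names = pd.Series(["jakson", np.nan, "jane"])

method_ref = names.isnull   # no parentheses: just a reference to the bound method
mask = names.isnull()       # parentheses: calls the method, returns a boolean Series

ref_is_truthy = bool(method_ref)  # a method object is always truthy
mask_list = mask.tolist()         # one True/False per row

# A Series has no single truth value, so using it as an `if` condition raises ValueError
try:
    if pd.isnull(names):
        pass
    raised = False
except ValueError:
    raised = True
```

Row-wise decisions therefore need a boolean mask applied with something like `where` or `mask`, not a plain `if`.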

I think this should handle it in a more convenient way:

import numpy as np
import pandas as pd

df = pd.DataFrame({"a": ["test1", "test2", "test3", "test4", np.nan],
                   "b": ["test5", np.nan, "test7", np.nan, "test9"]})

# Create the "respondent" + index number labels --> you can also save these in an extra df column if you like
c = pd.Series(["respondent{0}".format(idx) for idx in df.index], index=df.index)

# Replace the missing values in each column with the label for that row
for col in df.columns:
    mask = df[col].isnull()
    df[col] = df[col].mask(mask, c)

print(df)



             a            b
0        test1        test5
1        test2  respondent1
2        test3        test7
3        test4  respondent3
4  respondent4        test9

Source

2017-07-31 06:01:07 2Obe
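Note that the answer above labels each missing value with its row index, whereas the question asks for numbers that start at 1 and increase only with each missing value. A minimal sketch of that variant follows, assuming a running count of missing values is acceptable. The helper name `fill_missing_names` and the sample data are hypothetical and not part of the original thread.

```python
import numpy as np
import pandas as pd

def fill_missing_names(names, prefix="respondent"):
    """Replace each missing entry with '<prefix> <k>', where k counts missing entries from 1."""
    mask = names.isnull()                       # True where the name is missing
    counter = mask.cumsum()                     # running count of missing values so far
    labels = prefix + " " + counter.astype(str)
    # Keep existing names; take the numbered label only where the name is missing
    return names.where(~mask, labels)

# Hypothetical sample data mirroring the pattern described in the question
names = pd.Series([np.nan, "jakson", np.nan, np.nan, "jane", np.nan, "mary"])
result = fill_missing_names(names)
print(result.tolist())
```

Because `cumsum` only increments on missing rows, the labels stay consecutive regardless of where the gaps fall, and no explicit Python loop is needed.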
