Python pandas: VLOOKUP-style matching of column values based on headers

I have the following DataFrame df:

Customer_ID | 2015 | 2016 |2017 | Year_joined_mailing 
ABC   5  6  10  2015 
BCD   6  7  3  2016   
DEF   10  4  5  2017 
GHI   8  7  10  2016

I want to look up each customer's value for the year they joined the mailing list and save it in a new column.

The output would be:

Customer_ID | 2015 | 2016 |2017 | Year_joined_mailing | Purchases_1st_year 
ABC   5  6  10  2015      5 
BCD   6  7  3  2016      7  
DEF   10  4  5  2017      5 
GHI   8  7  10  2016      7

I have found some solutions for replicating VLOOKUP in Python, but none that use the headers of other columns as the lookup key.


2017-07-19 jeangelj

The lookup is over the columns 2015, 2016 and 2017 – jeangelj

Use pd.DataFrame.lookup
Keep in mind that I'm assuming Customer_ID is the index.

df.lookup(df.index, df.Year_joined_mailing) 

array([5, 7, 5, 7])

df.assign(
    Purchases_1st_year=df.lookup(df.index, df.Year_joined_mailing) 
) 

      2015 2016 2017 Year_joined_mailing Purchases_1st_year 
Customer_ID               
ABC    5  6 10     2015     5 
BCD    6  7  3     2016     7 
DEF   10  4  5     2017     5 
GHI    8  7 10     2016     7
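DataFrame.lookup was deprecated in pandas 1.2 and removed in pandas 2.0, so the call above fails on current versions. A minimal sketch of the same row/column pairing with NumPy indexing follows; it assumes integer column labels, Customer_ID as the index, and that every year value is a real column. The names col_pos, row_pos and first_year are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question, with Customer_ID as the index
df = pd.DataFrame(
    {
        2015: [5, 6, 10, 8],
        2016: [6, 7, 4, 7],
        2017: [10, 3, 5, 10],
        "Year_joined_mailing": [2015, 2016, 2017, 2016],
    },
    index=pd.Index(["ABC", "BCD", "DEF", "GHI"], name="Customer_ID"),
)

# Column position of each row's joining year.
# get_indexer returns -1 for a label that is not a column, which would
# silently select the last column, so this assumes all years are valid.
col_pos = df.columns.get_indexer(df["Year_joined_mailing"])

# Pair every row position with its own column position
row_pos = np.arange(len(df))
first_year = df.to_numpy()[row_pos, col_pos]

df = df.assign(Purchases_1st_year=first_year)
print(df)
```

Like lookup, this picks one cell per row rather than a full row-by-column block.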

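However, you have to be careful when comparing column names, which may be strings, with the integers in the year-joined column...

The nuclear option, to make sure the types match when compared, is to cast both sides to strings.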

df.assign(
    Purchases_1st_year=df.rename(columns=str).lookup(
     df.index, df.Year_joined_mailing.astype(str) 
    ) 
) 

      2015 2016 2017 Year_joined_mailing Purchases_1st_year 
Customer_ID               
ABC    5  6 10     2015     5 
BCD    6  7  3     2016     7 
DEF   10  4  5     2017     5 
GHI    8  7 10     2016     7
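To make the type pitfall concrete, here is a small sketch with a hypothetical variant of the data in which the year headers are strings (as they often are after reading a CSV) while the year values are integers. The frame df_str and the variable names are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical variant: string year headers, integer year values
df_str = pd.DataFrame(
    {"2015": [5, 6], "2016": [6, 7], "Year_joined_mailing": [2015, 2016]},
    index=pd.Index(["ABC", "BCD"], name="Customer_ID"),
)

# The integer 2015 does not match the string header "2015"
mismatch = 2015 in df_str.columns

# Casting the year values to str makes every one of them a valid header
years = df_str["Year_joined_mailing"].astype(str)
all_valid = bool(years.isin(df_str.columns).all())

# Row-wise lookup using the string form of the year
purchases = df_str.apply(
    lambda row: row.loc[str(row["Year_joined_mailing"])], axis=1
)
print(mismatch, all_valid, purchases.tolist())
```

Casting both sides to one type before matching avoids a KeyError caused purely by the int/str difference.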


2017-07-19 17:49:11 piRSquared

Wow! I was still thinking about 'melt', but you got it! +1 – Wen

Magic... I didn't think this was possible in one line - thank you – jeangelj

You're welcome! – piRSquared

You can use "apply" on each row

df.apply(lambda x: x[x['Year_joined_mailing']],axis=1)
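As a sketch of how that one-liner might be used to create the new column, here it is on the question's data. This version keeps Customer_ID as an ordinary column (an assumption; it works the same with Customer_ID as the index) and uses .loc so the year is always treated as a label.

```python
import pandas as pd

# Sample data from the question, Customer_ID kept as a regular column
df = pd.DataFrame(
    {
        "Customer_ID": ["ABC", "BCD", "DEF", "GHI"],
        2015: [5, 6, 10, 8],
        2016: [6, 7, 4, 7],
        2017: [10, 3, 5, 10],
        "Year_joined_mailing": [2015, 2016, 2017, 2016],
    }
)

# For each row, read the cell whose column label equals the joining year
df["Purchases_1st_year"] = df.apply(
    lambda row: row.loc[row["Year_joined_mailing"]], axis=1
)
print(df)
```

apply with axis=1 runs a Python function once per row, so it is easy to read but slower than array indexing on large frames.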


2017-07-19 17:52:02 galaxyan

Thanks - this worked too! I upvoted it – jeangelj

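I would do it like this, assuming the column headers and the Year_joined_mailing values have the same data type and every Year_joined_mailing value is a valid column. If the data types differ, you can add str() or int() where appropriate to convert.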

df['Purchases_1st_year'] = [df[df['Year_joined_mailing'][i]][i] for i in df.index]

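What we're doing here is iterating over the DataFrame's index to get the 'Year_joined_mailing' field for each index value, using that to pick the column we want, and then selecting the same index value from that column. All of this is pushed into a list, which is assigned to our new column 'Purchases_1st_year'.

If your 'Year_joined_mailing' values won't always be valid column names, then try: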

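from numpy import nan

new_col = []
for i in df.index:
    try:
        # Column named by this row's year, then this row's value in it
        new_col.append(df[df['Year_joined_mailing'][i]][i])
    except KeyError:
        # A missing column label raises KeyError, not IndexError
        new_col.append(nan)  # or whatever null value you want here
df['Purchases_1st_year'] = new_col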

This longer snippet does the same thing, but won't break if a 'Year_joined_mailing' value is not in df.columns.
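The same guard can also be written as a membership check instead of exception handling. The sketch below uses a hypothetical two-row frame in which one year (2014) has no matching column; the helper name first_year_purchase is made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 2014 is not one of the columns
df = pd.DataFrame(
    {
        2015: [5, 6],
        2016: [6, 7],
        "Year_joined_mailing": [2015, 2014],
    },
    index=pd.Index(["ABC", "BCD"], name="Customer_ID"),
)


def first_year_purchase(row):
    """Return the row's value in its joining-year column, or NaN if absent."""
    year = row["Year_joined_mailing"]
    return row.loc[year] if year in row.index else np.nan


df["Purchases_1st_year"] = df.apply(first_year_purchase, axis=1)
print(df)
```

Because one entry is NaN, the new column ends up with a float dtype rather than an integer one.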


2017-07-19 17:57:30

Thank you very much - this works well, so I upvoted it – jeangelj
