2017-07-14 32 views

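Populate the values in the rows of a DataFrame column when a condition based on the row values of two other columns in the same DataFrame is met

I am new to Python and pandas, and I have a CSV file that I read into a pandas DataFrame. You can find it below.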

I want to populate the columns OND_ORIGIN and OND_DEST based on the row values in PLDATE.

The logic is that for each flight flown on the same day, OND_ORIGIN and OND_DEST should be the same as the departure_from and Arr_to columns.

import pandas as pd
import numpy as np
import csv


location = r'C:\Users\bi.reports\Desktop\output.csv'
# Split on commas and strip surrounding whitespace; a regex separator needs the python engine
df = pd.read_csv(location, sep=r'\s*,\s*', engine='python')

# Attempt: note that every branch assigns a whole column, not just the current row
for i, row in df.iterrows():
    if row['COUPON_NUMBER'] == 1:
        df.OND_ORIGIN = df.DEP_FROM
        #df.OND_DEST = df.DEP_FROM
    elif row['COUPON_NUMBER'] == 2:
        #df.OND_ORIGIN = df.DEP_FROM
        df.OND_DEST = df.ARR_TO
    elif row['COUPON_NUMBER'] == 3:
        #df.OND_ORIGIN = df.DEP_FROM
        df.OND_DEST = df.ARR_TO
    else:
        df.OND_ORIGIN = df.DEP_FROM
        #df.OND_DEST = df.ARR_TO

# Write the result once, after the loop has finished
df.to_csv('out.csv', sep=',', index=False)

csv file in use

Answers

Try this:

df.loc[df['COUPON_NUMBER'] == 1, 'OND_ORIGIN'] = df.DEP_FROM 
df.loc[df['COUPON_NUMBER'].isin([2,3]), 'OND_DEST'] = df.ARR_TO 
df.loc[~df['COUPON_NUMBER'].isin([1,2,3]), 'OND_ORIGIN'] = df.DEP_FROM 

Or, slightly optimized:

df.loc[df['COUPON_NUMBER'].isin([2,3]), 'OND_DEST'] = df.ARR_TO 
df.loc[~df['COUPON_NUMBER'].isin([2,3]), 'OND_ORIGIN'] = df.DEP_FROM 
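
To see what the two-line version does, here is a minimal sketch on a tiny made-up frame. The column names come from the question; the rows, the placeholder stop 'XXX', and the initially empty OND_ columns are assumptions for illustration, not the contents of the real CSV.

```python
import pandas as pd

# Hypothetical sample rows; 'XXX' stands in for an unknown intermediate stop.
df = pd.DataFrame({
    'COUPON_NUMBER': [1, 2, 3, 4],
    'DEP_FROM': ['HRE', 'XXX', 'KGL', 'XXX'],
    'ARR_TO': ['XXX', 'KGL', 'XXX', 'HRE'],
    'OND_ORIGIN': None,  # assumed to start out empty
    'OND_DEST': None,    # assumed to start out empty
})

# Boolean mask: True for coupons 2 and 3, False for everything else.
mask = df['COUPON_NUMBER'].isin([2, 3])

# Each assignment touches only the rows selected by its mask;
# the right-hand side is aligned on the index, so no loop is needed.
df.loc[mask, 'OND_DEST'] = df['ARR_TO']
df.loc[~mask, 'OND_ORIGIN'] = df['DEP_FROM']

print(df)
```

Because the two masks are complements of each other, every row ends up with exactly one of the two OND_ columns filled, which mirrors the branches of the original loop.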

Thanks for the quick reply, but when I run it, only one column is filled for each row, i.e. if OND_DEST is filled, OND_ORIGIN is blank, and vice versa. – MTK

@MTK, you shouldn't run it for each row. This is a vectorized solution, so just replace your 'for ... loop' with these two lines... – MaxU

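If you could take a look at the CSV: I want to derive the OND_ columns based on the PLDATE column. For example, since coupons 1 and 2 are flown on the same day, ond_origin should be HRE and OND_Destination KGL for both coupons, and for coupons 3 and 4, ond_origin should be KGL and ond_destination HRE. – MTK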
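
One possible way to get the behaviour described in this last comment is to group the rows by PLDATE and broadcast the first DEP_FROM and the last ARR_TO of each day to all rows of that day. This is only a sketch under assumptions: the dates and the placeholder stop 'XXX' are made up, the frame holds a single ticket, and coupons within a day are ordered by COUPON_NUMBER. If the real file contains several tickets, a ticket identifier column (not shown in the question) would presumably have to be added to the grouping keys.

```python
import pandas as pd

# Hypothetical sample rows matching the example in the comment above.
df = pd.DataFrame({
    'PLDATE': ['2017-07-01', '2017-07-01', '2017-07-05', '2017-07-05'],  # made-up dates
    'COUPON_NUMBER': [1, 2, 3, 4],
    'DEP_FROM': ['HRE', 'XXX', 'KGL', 'XXX'],
    'ARR_TO': ['XXX', 'KGL', 'XXX', 'HRE'],
})

# Make sure coupons are in flight order within each day.
df = df.sort_values(['PLDATE', 'COUPON_NUMBER'])

# transform() returns a result with the same index as df,
# so every row of a day receives that day's first origin / last destination.
df['OND_ORIGIN'] = df.groupby('PLDATE')['DEP_FROM'].transform('first')
df['OND_DEST'] = df.groupby('PLDATE')['ARR_TO'].transform('last')

print(df[['PLDATE', 'COUPON_NUMBER', 'OND_ORIGIN', 'OND_DEST']])
```

With this sample, coupons 1 and 2 get HRE and KGL, and coupons 3 and 4 get KGL and HRE, in line with the example in the comment.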
