2017-08-29

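Dataframe datetime column as an argument to a function

When I pass several dataframe columns as arguments to a function, the datetime column refuses to go through the formatting function below. I can get by with the inline solution shown, but it would be nice to know why. Should I have used a different date-like data type? Thanks (PS: pandas = great)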

import pandas as pd 
import numpy as np 
import datetime as dt 

def fmtfn(arg_dttm, arg_int): 
    retstr = arg_dttm.strftime('%Y-%m-%d') + '~{:0>3}'.format(arg_int) 
    # bombs with: 'numpy.datetime64' object has no attribute 'strftime' 

    # retstr = '{:%Y-%m-%d}~{:0>3}'.format(arg_dttm, arg_int) 
    # bombs with: invalid format specifier 
    return retstr 

def fmtfn2(arg_dtstr, arg_int): 
    retstr = '{}~{:0>3}'.format(arg_dtstr, arg_int) 
    return retstr 


# The source data. 
# I want to add a 3rd column newhg that carries e.g. 2017-06-25~066 
# i.e. a concatenation of the other two columns. 
df1 = pd.DataFrame({'mydt': ['2017-05-07', '2017-06-25', '2015-08-25'], 
        'myint': [66, 201, 100]}) 


df1['mydt'] = pd.to_datetime(df1['mydt'], errors='raise') 


# THIS WORKS (without calling a function) 
print('\nInline solution') 
df1['newhg'] = df1[['mydt', 'myint']].apply(lambda x: '{:%Y-%m-%d}~{:0>3}'.format(x['mydt'], x['myint']), axis=1) 
print(df1) 

# THIS WORKS 
print('\nConvert to string first') 
df1['mydt2'] = df1['mydt'].apply(lambda x: x.strftime('%Y-%m-%d')) 
df1['newhg'] = np.vectorize(fmtfn2)(df1['mydt2'], df1['myint']) 
print(df1) 

# Bombs in the function - see above 
print('\nPass a datetime') 
df1['newhg'] = np.vectorize(fmtfn)(df1['mydt'], df1['myint']) 
print(df1) 
+1

Where is the code that calls the function and produces the error? –

+0

@cᴏʟᴅsᴘᴇᴇᴅ Scroll down to the end of the code. –

Answers

1

You could also use the built-in functions from pandas, which makes it a bit easier to read:

df1['newhg'] = df1.mydt.dt.strftime('%Y-%m-%d') + '~' + df1.myint.astype(str).str.zfill(3) 
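
To try that line on its own, here is a small self-contained sketch built from the question's sample data. The intermediate names date_part and int_part are only there for illustration and are not part of the original answer.

```python
import pandas as pd

# Sample data from the question, with the date column already parsed.
df1 = pd.DataFrame({'mydt': pd.to_datetime(['2017-05-07', '2017-06-25', '2015-08-25']),
                    'myint': [66, 201, 100]})

# Format the whole datetime column at once through the .dt accessor.
date_part = df1['mydt'].dt.strftime('%Y-%m-%d')
# Turn the integers into strings and left-pad them with zeros to width 3.
int_part = df1['myint'].astype(str).str.zfill(3)

# String Series concatenate element-wise with +.
df1['newhg'] = date_part + '~' + int_part
print(df1['newhg'].tolist())
# ['2017-05-07~066', '2017-06-25~201', '2015-08-25~100']
```

Everything here stays at the column level, so no per-row Python function is called.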
+0

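Thank you, yours is the most elegant solution ... I will look a bit more into the whole 'calling a function with dataframe columns' topic, since it interests me. – Relaxed1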
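
On the 'calling a function with dataframe columns' point, here is a minimal sketch of what seems to be going on, assuming a reasonably recent pandas and NumPy. np.vectorize works on plain NumPy arrays, so the function is handed numpy.datetime64 scalars, which have no strftime. Iterating the Series itself yields pandas Timestamp objects, which do support it. The zip loop below is just one possible way to call a function per row, not something from the thread, and the names from_series and from_array are made up for the illustration.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'mydt': ['2017-05-07', '2017-06-25', '2015-08-25'],
                    'myint': [66, 201, 100]})
df1['mydt'] = pd.to_datetime(df1['mydt'], errors='raise')

# An element taken from the Series is a pandas Timestamp (a datetime subclass).
from_series = df1['mydt'].iloc[0]
# An element taken from the converted NumPy array is a numpy.datetime64.
from_array = np.asarray(df1['mydt']).flat[0]
print(type(from_series).__name__, hasattr(from_series, 'strftime'))
print(type(from_array).__name__, hasattr(from_array, 'strftime'))

def fmtfn(arg_dttm, arg_int):
    # Relies on datetime-style formatting, so it needs a Timestamp/datetime.
    return '{:%Y-%m-%d}~{:0>3}'.format(arg_dttm, arg_int)

# Iterating the Series directly hands Timestamp objects to the function.
df1['newhg'] = [fmtfn(d, i) for d, i in zip(df1['mydt'], df1['myint'])]
print(df1)
```

So the datetime64 column type itself looks fine; the difference is in how each element reaches the function.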