2017-03-15

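How do I get rows by the maximum date of certain columns?

I want to keep only the rows with the latest date in column III for each combination of columns I and II. For example, 1/30/2017 beats 1/27/2017.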

I II III                  IV
A X 1/30/2017 9:33:00 AM some_data 
A Y 1/30/2017 9:33:00 AM some_data 
A Z 1/30/2017 9:33:00 AM some_data 
A X 1/27/2017 4:53:00 PM some_data 
A Y 1/27/2017 4:53:00 PM some_data 
A Z 1/27/2017 4:53:00 PM some_data 
B X 1/30/2017 9:33:00 AM some_data 
B Y 1/30/2017 9:33:00 AM some_data 
B Z 1/30/2017 9:33:00 AM some_data 
B X 1/27/2017 4:53:00 PM some_data 
B Y 1/27/2017 4:53:00 PM some_data 
B Z 1/27/2017 4:53:00 PM some_data 

This is the result I want.

I II III                  IV
A X 1/30/2017 9:33:00 AM some_data 
A Y 1/30/2017 9:33:00 AM some_data 
A Z 1/30/2017 9:33:00 AM some_data 
B X 1/30/2017 9:33:00 AM some_data 
B Y 1/30/2017 9:33:00 AM some_data 
B Z 1/30/2017 9:33:00 AM some_data 

Can someone help me figure out how to extract these rows?

Can you show some code that you have tried? – Rednivrug

Answer

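It looks like what you want is a groupby() followed by transform() with max: transform() returns each group's maximum date aligned to the original rows, so it can be compared against column III directly.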

Code:

data = [ 
    ('I', 'II', 'III', 'IV'), 
    ('A', 'X', '1/30/2017 9:33:00 AM', 'some_data'), 
    ('A', 'Y', '1/30/2017 9:33:00 AM', 'some_data'), 
    ('A', 'Z', '1/30/2017 9:33:00 AM', 'some_data'), 
    ('A', 'X', '1/27/2017 4:53:00 PM', 'some_data'), 
    ('A', 'Y', '1/27/2017 4:53:00 PM', 'some_data'), 
    ('A', 'Z', '1/27/2017 4:53:00 PM', 'some_data'), 
    ('B', 'X', '1/30/2017 9:33:00 AM', 'some_data'), 
    ('B', 'Y', '1/30/2017 9:33:00 AM', 'some_data'), 
    ('B', 'Z', '1/30/2017 9:33:00 AM', 'some_data'), 
    ('B', 'X', '1/27/2017 4:53:00 PM', 'some_data'), 
    ('B', 'Y', '1/27/2017 4:53:00 PM', 'some_data'), 
    ('B', 'Z', '1/27/2017 4:53:00 PM', 'some_data'), 
] 

import pandas as pd 
df = pd.DataFrame(data[1:], columns=data[0]) 
df['III'] = pd.to_datetime(df['III']) 

# group by the first two columns and broadcast each group's maximum of the
# third column back to every row, then compare to build a boolean mask
mask = df.groupby(['I', 'II'])['III'].transform('max') == df['III']

# use the boolean mask to select the matching rows
print(df[mask])

Result:

I II     III   IV 
0 A X 2017-01-30 09:33:00 some_data 
1 A Y 2017-01-30 09:33:00 some_data 
2 A Z 2017-01-30 09:33:00 some_data 
6 B X 2017-01-30 09:33:00 some_data 
7 B Y 2017-01-30 09:33:00 some_data 
8 B Z 2017-01-30 09:33:00 some_data
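
One caveat with the mask approach above: if several rows in a group share the same maximum date, all of them are kept. If exactly one row per (I, II) pair is wanted, a possible alternative is to sort by date and drop duplicates. The following is only a sketch on a small made-up frame (the values in column IV are placeholders, not data from the question):

```python
import pandas as pd

# Hypothetical sample frame using the same column names as the question.
df = pd.DataFrame(
    {
        "I": ["A", "A", "B", "B"],
        "II": ["X", "X", "X", "X"],
        "III": pd.to_datetime(
            [
                "1/30/2017 9:33:00 AM",
                "1/27/2017 4:53:00 PM",
                "1/27/2017 4:53:00 PM",
                "1/30/2017 9:33:00 AM",
            ]
        ),
        "IV": ["a", "b", "c", "d"],
    }
)

# Sort newest first, keep the first row seen for each (I, II) pair,
# then restore the original row order.
latest = (
    df.sort_values("III", ascending=False)
    .drop_duplicates(subset=["I", "II"], keep="first")
    .sort_index()
)
print(latest)
```

Which variant fits depends on whether tied rows should all survive (the boolean mask) or be collapsed to a single row (sort plus drop_duplicates).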