2017-03-03 146 views

Appending a pandas DataFrame to a column

I'm stuck and need some help. I have the following DataFrame:

+-----+---+---+
|     | A | B |
+-----+---+---+
| 288 | 1 | 4 |
+-----+---+---+
| 245 | 2 | 3 |
+-----+---+---+
| 543 | 3 | 6 |
+-----+---+---+
| 867 | 1 | 9 |
+-----+---+---+
| 345 | 2 | 7 |
+-----+---+---+
| 122 | 3 | 8 |
+-----+---+---+
| 233 | 1 | 1 |
+-----+---+---+
| 346 | 2 | 6 |
+-----+---+---+
| 765 | 3 | 3 |
+-----+---+---+

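Column A has repeating values, as shown. What I want to do is this: each time a value in column A repeats, take the corresponding value from column B and put it in a new column C on the row of the previous occurrence, as shown below: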

+-----+---+---+-----+ 
|  | A | B | C | 
+-----+---+---+-----+ 
| 288 | 1 | 4 | 9 | 
+-----+---+---+-----+ 
| 245 | 2 | 3 | 7 | 
+-----+---+---+-----+ 
| 543 | 3 | 6 | 8 | 
+-----+---+---+-----+ 
| 867 | 1 | 9 | 1 | 
+-----+---+---+-----+ 
| 345 | 2 | 7 | 6 | 
+-----+---+---+-----+ 
| 122 | 3 | 8 | 3 | 
+-----+---+---+-----+ 
| 233 | 1 | 1 | NaN | 
+-----+---+---+-----+ 
| 346 | 2 | 6 | NaN | 
+-----+---+---+-----+ 
| 765 | 3 | 3 | NaN | 
+-----+---+---+-----+ 

Thanks.

Where is your attempt? – blacksite

Sounds like your best option is to work with 'df.groupby('A')' – BallpointBen

Answers

Assuming val is one of the repeated values,

slice = df.loc[df.A == val, 'B'].shift(-1) 

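will create a single-column Series in which the values are re-indexed to their new positions: each row where A equals val now holds the B value from the next such row.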

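Since none of the re-assigned index values should overlap, you can use pandas.concat to stitch the different slices together without worrying about losing data. Then just assign the result as a new column: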

df['C'] = pd.concat([df.loc[df['A'] == x, 'B'].shift(-1) for x in [1, 2, 3]]) 

When the column is assigned, the index values make everything line up:

A B C 
0 1 4 9.0 
1 2 3 7.0 
2 3 6 8.0 
3 1 9 1.0 
4 2 7 6.0 
5 3 8 3.0 
6 1 1 NaN 
7 2 6 NaN 
8 3 3 NaN 
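
The list [1, 2, 3] above is hard-coded. As a minimal sketch, assuming the index labels are unique, the same idea can be written against whatever values column A happens to contain. The sample frame and the name pieces are only for illustration; the data is copied from the question.

```python
import pandas as pd

# Sample data copied from the question, including its index labels.
df = pd.DataFrame(
    {"A": [1, 2, 3, 1, 2, 3, 1, 2, 3],
     "B": [4, 3, 6, 9, 7, 8, 1, 6, 3]},
    index=[288, 245, 543, 867, 345, 122, 233, 346, 765],
)

# For each distinct value in A, move that group's B values up by one row.
# Each shifted slice keeps its original index labels.
pieces = [df.loc[df["A"] == val, "B"].shift(-1) for val in df["A"].unique()]

# The slices cover disjoint index labels, so concat loses nothing, and the
# column assignment aligns on those labels.
df["C"] = pd.concat(pieces)
print(df)
```

This relies on index alignment, so duplicate index labels would need a reset_index first.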

Thanks. This works. – magicsword

Reverse the DataFrame's row order, group by the first column and shift within each group, then reverse the order back:

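df = df[::-1].copy()  # reverse the rows; copy so the assignment below does not target a slice
df['C'] = df.groupby('A')['B'].shift()  # previous B in reversed order, i.e. the next B per A group
df = df[::-1]  # restore the original row order
df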

A B C 
0 1 4 9.0 
1 2 3 7.0 
2 3 6 8.0 
3 1 9 1.0 
4 2 7 6.0 
5 3 8 3.0 
6 1 1 NaN 
7 2 6 NaN 
8 3 3 NaN 
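
The double reversal is probably not needed. A hedged alternative, along the lines of the groupby suggestion in the comments, is to shift by a negative period within each group. This is a sketch on the question's data with a default integer index, not the answerer's own code.

```python
import pandas as pd

# Sample data copied from the question (default 0..8 index).
df = pd.DataFrame(
    {"A": [1, 2, 3, 1, 2, 3, 1, 2, 3],
     "B": [4, 3, 6, 9, 7, 8, 1, 6, 3]}
)

# Within each group of A, pull the next B value up onto the current row.
# The last row of every group has no successor and becomes NaN.
df["C"] = df.groupby("A")["B"].shift(-1)
print(df)
```

Because groupby keeps the original row order and index, no reordering or concatenation is required.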