2016-11-13 73 views
1

I have a large pandas DataFrame with 8 columns and several NaN values. What is the most efficient way to flatten a pandas DataFrame?

0 1 2 3 4 5 6 7 8 
1 Google, Inc. (Date 11/07/2016) NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN 
2 Apple Inc. (Date 07/01/2016) Amazon (Date 11/01/2016) NaN  NaN  NaN  NaN  NaN  NaN  NaN 
3 IBM, Inc. (Date 11/08/2016)  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN 
4 Microsoft (Date 11/10/2016)  Google, Inc. (Date 11/10/1990) Google, Inc. (Date 11/07/2016) Samsung (Date 05/02/2016) NaN  NaN  NaN  NaN  NaN 

How can I flatten it down to something like this:

0 companies 
1 Google, Inc. (Date 11/07/2016) 
2 Apple Inc. (Date 07/01/2016) 
3 Amazon (Date 11/01/2016) 
4 IBM, Inc. (Date 11/08/2016) 
5 Microsoft (Date 11/10/2016) 
6 Google, Inc. (Date 11/10/1990) 
7 Google, Inc. (Date 11/07/2016) 
8 Samsung (Date 05/02/2016) 

I read the docs and tried:

df.iloc[:,0] 

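The problem is that I lose the information in the other columns, along with its ordering. How can I flatten the remaining cells, in order, without losing data?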

Answers

2

You can stack the columns and optionally reset the index. By default, stack drops NaNs.

df.stack() 
Out: 
0 0 Google, Inc. (Date 11/07/2016) 
1 0  Apple Inc. (Date 07/01/2016) 
    1   Amazon (Date 11/01/2016) 
2 0  IBM, Inc. (Date 11/08/2016) 
3 0  Microsoft (Date 11/10/2016) 
    1 Google, Inc. (Date 11/10/1990) 
    2 Google, Inc. (Date 11/07/2016) 
    3   Samsung (Date 05/02/2016) 
dtype: object 

df.stack().reset_index(drop=True) 
Out: 
0 Google, Inc. (Date 11/07/2016) 
1  Apple Inc. (Date 07/01/2016) 
2   Amazon (Date 11/01/2016) 
3  IBM, Inc. (Date 11/08/2016) 
4  Microsoft (Date 11/10/2016) 
5 Google, Inc. (Date 11/10/1990) 
6 Google, Inc. (Date 11/07/2016) 
7   Samsung (Date 05/02/2016) 
dtype: object 
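To get the single `companies` column the question asks for, the stacked Series can be turned back into a one-column DataFrame. The following is a minimal sketch, assuming a shortened 4-column version of the question's data (the frame below is reconstructed for illustration, not the asker's actual data). It adds an explicit `dropna()` because newer pandas versions are changing how `stack` handles NaN, so relying on the default may not be portable.

```python
import numpy as np
import pandas as pd

# Illustrative frame modelled on the question (shortened to 4 columns).
df = pd.DataFrame([
    ["Google, Inc. (Date 11/07/2016)", np.nan, np.nan, np.nan],
    ["Apple Inc. (Date 07/01/2016)", "Amazon (Date 11/01/2016)", np.nan, np.nan],
    ["IBM, Inc. (Date 11/08/2016)", np.nan, np.nan, np.nan],
    ["Microsoft (Date 11/10/2016)", "Google, Inc. (Date 11/10/1990)",
     "Google, Inc. (Date 11/07/2016)", "Samsung (Date 05/02/2016)"],
])

companies = (
    df.stack()                  # row by row, left to right
      .dropna()                 # explicit, in case stack() kept the NaNs
      .reset_index(drop=True)   # replace the (row, column) index with 0..n-1
      .to_frame("companies")    # one-column DataFrame named "companies"
)
print(companies)
```

Duplicates such as the two Google entries are kept, since nothing here deduplicates.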
+0

Thanks for the help. What if I want to keep the NaN slots in the stacked result? Should I use `drop=False`? – student

+1

That `drop` is for dropping the index. To keep the NaNs, use `df.stack(dropna=False)` instead. – ayhan

+0

Thanks, I got: `AttributeError: 'Series' object has no attribute 'stack'` – student
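The comment does not show the code that raised the error, so the cause can only be guessed. One plausible explanation (an assumption) is that `stack` was called on a Series, for example the result of `df.iloc[:, 0]` or of an earlier `df.stack()`, rather than on the DataFrame itself. A small sketch on made-up two-column data:

```python
import numpy as np
import pandas as pd

# Illustrative data, not the asker's frame.
df = pd.DataFrame([
    ["Google, Inc. (Date 11/07/2016)", np.nan],
    ["Apple Inc. (Date 07/01/2016)", "Amazon (Date 11/01/2016)"],
])

first_column = df.iloc[:, 0]  # selecting one column returns a Series

has_stack_df = hasattr(df, "stack")                # DataFrame provides stack()
has_stack_series = hasattr(first_column, "stack")  # Series does not

print(type(first_column).__name__, has_stack_df, has_stack_series)
```

If that is what happened, calling `stack` on the original DataFrame should avoid the error.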

1

This might do the trick:

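import pandas as pd

df = pd.DataFrame([ 
     ["Google, Inc. (Date 11/07/2016)", float("NaN")], 
     ["Apple Inc. (Date 07/01/2016)", "Amazon (Date 11/01/2016)"]]) 
# Transpose, then unstack, so values come out row by row of the original frame.
unstacked = df.T.unstack() 
unstacked.dropna(inplace=True)  # drop the NaN cells
unstacked.reset_index(drop=True, inplace=True)  # renumber the index from 0
unstacked 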

Output:

0 Google, Inc. (Date 11/07/2016) 
1  Apple Inc. (Date 07/01/2016) 
2   Amazon (Date 11/01/2016) 
dtype: object 

P.S. Please see this question on how to provide good pandas examples in a question.

+0

It seems @ayhan's answer is good. –

+0

Thanks for the help. If I want to keep the NaN slots in the stacked result, what should I do: `drop=False`? – student

+1

You have to remove the `unstacked.dropna(inplace=True)` line. –
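A short sketch of that NaN-preserving variant, using the same two-row sample as the answer. The NumPy-based alternative at the end is not from the thread. It is added here as an assumption-labelled option that flattens row by row without depending on how `stack` treats NaN.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([
    ["Google, Inc. (Date 11/07/2016)", np.nan],
    ["Apple Inc. (Date 07/01/2016)", "Amazon (Date 11/01/2016)"],
])

# Same steps as the answer, minus the dropna() call, so NaN cells survive.
with_nans = df.T.unstack().reset_index(drop=True)

# Alternative (not from the thread): flatten the underlying array row by row.
flat = pd.Series(df.to_numpy().ravel())

print(with_nans)
print(flat)
```

Both results have one entry per cell of the original frame, with NaN left in place where a cell was empty.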