2016-04-25 49 views

Given the table order_size below, stack the data and remove the 0/NaN entries

Symbol  BAX BTP CT D DX ESTX50 GBM GBP GBS GE I LE NZD S ZL 
Date                    
2016-03-03 0 0 -2 0 0  0 0 0 0 0 0 0 0 0 0 
2016-03-04 -12 0 0 0 0  0 0 0 0 1 0 0 -1 0 0 
2016-03-07 0 0 0 0 -1  0 1 -1 4 -1 1 0 1 1 0 
2016-03-08 0 0 0 0 0  0 0 0 0 0 0 0 -1 0 0 
2016-03-10 0 0 0 0 0  0 0 1 -1 0 0 0 0 0 0 
2016-03-11 0 0 0 0 0  0 -1 -1 -1 0 -1 0 1 -1 0 
2016-03-14 0 0 0 0 0  0 0 0 0 0 0 0 -1 1 0 
2016-03-15 -1 0 0 0 0  0 0 0 0 1 0 0 1 0 0 
2016-03-17 0 0 0 0 0  0 0 0 0 -1 0 0 0 0 -1 

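I need to convert it into a stacked view with the layout Date | Symbol | Value, where Value is never 0, meaning all entries equal to 0 are dropped. If I use df.stack(), it converts the table to a pd.TimeSeries, which is not what I want (because I would be missing the third column).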

Date  Symbol 
2016-03-03 BAX  0 
      BTP  0 
      CT  -2 
      D   0 
      DX   0 
      ESTX50  0 
      GBM  0 
      GBP  0 

This makes it seemingly impossible to run order_size.loc[:, (order_size.Value != 0).any(axis=0)] to drop the 0s (because Value is not a column of the pd.Series).

Edit

Running df.stack() on order_size.replace('0', np.NaN) almost does the trick, but a pd.Series is still undesirable, because I need a third column, Value.

Answer


I think you can first replace all values equal to 0 with NaN, and then use stack and reset_index:

print(df != 0)
       BAX BTP  CT  D  DX ESTX50 GBM GBP GBS \ 
Date                   
2016-03-03 False False True False False False False False False 
2016-03-04 True False False False False False False False False 
2016-03-07 False False False False True False True True True 
2016-03-08 False False False False False False False False False 
2016-03-10 False False False False False False False True True 
2016-03-11 False False False False False False True True True 
2016-03-14 False False False False False False False False False 
2016-03-15 True False False False False False False False False 
2016-03-17 False False False False False False False False False 

       GE  I  LE NZD  S  ZL 
Date             
2016-03-03 False False False False False False 
2016-03-04 True False False True False False 
2016-03-07 True True False True True False 
2016-03-08 False False False True False False 
2016-03-10 False False False False False False 
2016-03-11 False True False True True False 
2016-03-14 False False False True True False 
2016-03-15 True False False True False False 
2016-03-17 True False False False False True 
print(df[df != 0])
      BAX BTP CT D DX ESTX50 GBM GBP GBS GE I LE NZD \ 
Date                    
2016-03-03 NaN NaN -2.0 NaN NaN  NaN NaN NaN NaN NaN NaN NaN NaN 
2016-03-04 -12.0 NaN NaN NaN NaN  NaN NaN NaN NaN 1.0 NaN NaN -1.0 
2016-03-07 NaN NaN NaN NaN -1.0  NaN 1.0 -1.0 4.0 -1.0 1.0 NaN 1.0 
2016-03-08 NaN NaN NaN NaN NaN  NaN NaN NaN NaN NaN NaN NaN -1.0 
2016-03-10 NaN NaN NaN NaN NaN  NaN NaN 1.0 -1.0 NaN NaN NaN NaN 
2016-03-11 NaN NaN NaN NaN NaN  NaN -1.0 -1.0 -1.0 NaN -1.0 NaN 1.0 
2016-03-14 NaN NaN NaN NaN NaN  NaN NaN NaN NaN NaN NaN NaN -1.0 
2016-03-15 -1.0 NaN NaN NaN NaN  NaN NaN NaN NaN 1.0 NaN NaN 1.0 
2016-03-17 NaN NaN NaN NaN NaN  NaN NaN NaN NaN -1.0 NaN NaN NaN 

       S ZL 
Date     
2016-03-03 NaN NaN 
2016-03-04 NaN NaN 
2016-03-07 1.0 NaN 
2016-03-08 NaN NaN 
2016-03-10 NaN NaN 
2016-03-11 -1.0 NaN 
2016-03-14 1.0 NaN 
2016-03-15 NaN NaN 
2016-03-17 NaN -1.0 
df1 = df[df != 0].stack().reset_index()
# set custom column names
df1.columns = ['Date', 'Symbol', 'Value']
print(df1)
      Date Symbol Value 
0 2016-03-03  CT -2.0 
1 2016-03-04 BAX -12.0 
2 2016-03-04  GE 1.0 
3 2016-03-04 NZD -1.0 
4 2016-03-07  DX -1.0 
5 2016-03-07 GBM 1.0 
6 2016-03-07 GBP -1.0 
7 2016-03-07 GBS 4.0 
8 2016-03-07  GE -1.0 
9 2016-03-07  I 1.0 
10 2016-03-07 NZD 1.0 
11 2016-03-07  S 1.0 
12 2016-03-08 NZD -1.0 
13 2016-03-10 GBP 1.0 
14 2016-03-10 GBS -1.0 
15 2016-03-11 GBM -1.0 
16 2016-03-11 GBP -1.0 
17 2016-03-11 GBS -1.0 
18 2016-03-11  I -1.0 
19 2016-03-11 NZD 1.0 
20 2016-03-11  S -1.0 
21 2016-03-14 NZD -1.0 
22 2016-03-14  S 1.0 
23 2016-03-15 BAX -1.0 
24 2016-03-15  GE 1.0 
25 2016-03-15 NZD 1.0 
26 2016-03-17  GE -1.0 
27 2016-03-17  ZL -1.0 
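
For anyone who wants to try this without the full table, here is a minimal Python 3 sketch of the same idea. The three-column DataFrame is a made-up subset of order_size, and the explicit dropna() is my addition: older pandas versions drop NaN rows in stack() by default, but I would not assume that holds for every version. Passing name= to reset_index on the stacked Series is one way to get the third column directly.

```python
import pandas as pd

# Hypothetical three-day, three-symbol subset of the order_size table above.
df = pd.DataFrame(
    {"BAX": [0, -12, 0], "CT": [-2, 0, 0], "GE": [0, 1, -1]},
    index=pd.Index(["2016-03-03", "2016-03-04", "2016-03-07"], name="Date"),
)
df.columns.name = "Symbol"

# Mask zeros as NaN, stack to a Series, and drop the NaN rows explicitly
# so the result does not depend on stack()'s default NaN handling.
stacked = df[df != 0].stack().dropna()

# reset_index(name=...) turns the Series into a DataFrame whose
# value column is called "Value"; the index levels become columns.
df1 = stacked.reset_index(name="Value")
print(df1)
```

Because the index and the columns are named Date and Symbol here, the column names come out right without assigning df1.columns afterwards.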

Another solution, using replace and reset_index:

import numpy as np
df1 = df.replace({0: np.nan}).stack().reset_index()
# set custom column names
df1.columns = ['Date', 'Symbol', 'Value']
print(df1)
      Date Symbol Value 
0 2016-03-03  CT -2.0 
1 2016-03-04 BAX -12.0 
2 2016-03-04  GE 1.0 
3 2016-03-04 NZD -1.0 
4 2016-03-07  DX -1.0 
5 2016-03-07 GBM 1.0 
6 2016-03-07 GBP -1.0 
7 2016-03-07 GBS 4.0 
8 2016-03-07  GE -1.0 
9 2016-03-07  I 1.0 
10 2016-03-07 NZD 1.0 
11 2016-03-07  S 1.0 
12 2016-03-08 NZD -1.0 
13 2016-03-10 GBP 1.0 
14 2016-03-10 GBS -1.0 
15 2016-03-11 GBM -1.0 
16 2016-03-11 GBP -1.0 
17 2016-03-11 GBS -1.0 
18 2016-03-11  I -1.0 
19 2016-03-11 NZD 1.0 
20 2016-03-11  S -1.0 
21 2016-03-14 NZD -1.0 
22 2016-03-14  S 1.0 
23 2016-03-15 BAX -1.0 
24 2016-03-15  GE 1.0 
25 2016-03-15 NZD 1.0 
26 2016-03-17  GE -1.0 
27 2016-03-17  ZL -1.0 
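
A different route, not part of the answer above, is to skip the NaN step and reshape with melt, then filter the zeros out with a plain boolean mask. This is only a sketch on a made-up subset of order_size. One side effect is that the values stay integers, since no NaN is ever introduced.

```python
import pandas as pd

# Hypothetical subset of the order_size table, for illustration only.
df = pd.DataFrame(
    {"BAX": [0, -12, 0], "CT": [-2, 0, 0], "GE": [0, 1, -1]},
    index=pd.Index(["2016-03-03", "2016-03-04", "2016-03-07"], name="Date"),
)

# melt() goes straight to long format: one row per (Date, Symbol) pair.
long_df = df.reset_index().melt(
    id_vars="Date", var_name="Symbol", value_name="Value"
)

# Keep the non-zero rows, then sort so the order matches the stacked output.
df1 = (
    long_df[long_df["Value"] != 0]
    .sort_values(["Date", "Symbol"])
    .reset_index(drop=True)
)
print(df1)
```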

Please check the solution - do you need to replace '0' with 'NaN', or replace all values that are not '0' with 'NaN'? – jezrael


I need to replace '0' with 'NaN'! – nlsdfnbch