Pandas unstack should not sort the remaining index levels

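I am wondering whether it is possible to unstack one level of a multi-indexed DataFrame so that the remaining index levels of the returned DataFrame are not sorted. Example code: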

import pandas as pd

# One list per index level: room, bed, item
arrays = [["room1", "room1", "room1", "room1", "room1", "room1",
           "room2", "room2", "room2", "room2", "room2", "room2"],
          ["bed1", "bed1", "bed1", "bed2", "bed2", "bed2",
           "bed1", "bed1", "bed1", "bed2", "bed2", "bed2"],
          ["blankets", "pillows", "all", "blankets", "pillows", "all",
           "blankets", "pillows", "all", "blankets", "pillows", "all"]]

# Combine the three lists into one tuple per row
tuples = list(zip(*arrays))

index = pd.MultiIndex.from_tuples(tuples, names=['first index',
                                                 'second index', 'third index'])

series = pd.Series([1, 2, 3, 1, 1, 2, 2, 2, 4, 2, 1, 3], index=index)

series

first index  second index  third index
room1        bed1          blankets       1
                           pillows        2
                           all            3
             bed2          blankets       1
                           pillows        1
                           all            2
room2        bed1          blankets       2
                           pillows        2
                           all            4
             bed2          blankets       2
                           pillows        1
                           all            3

Unstacking the second index level:

series.unstack(1) 

second index             bed1  bed2
first index third index
room1       all             3     2
            blankets        1     1
            pillows         2     1
room2       all             4     3
            blankets        2     2
            pillows         2     1

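The problem is that the order of the third index level has changed, because the index is automatically sorted alphabetically. The 'all' row, which is the sum of the 'blankets' and 'pillows' rows, is now the first row instead of the last. How can this be fixed? There does not seem to be an option that prevents the automatic sorting. It also does not seem possible to sort the index of a DataFrame with a key, something like myDataFrame.sort_index(..., key=['some_key']).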
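The reordering described above can be reproduced in isolation. The following is a minimal sketch that uses a shortened version of the series from the question (only room1) and compares the order of the third-level labels before and after unstacking; the variable names `before` and `after` are illustrative only.

```python
import pandas as pd

# Shortened version of the question's data; the third level is
# deliberately not in alphabetical order.
index = pd.MultiIndex.from_tuples(
    [
        ("room1", "bed1", "blankets"),
        ("room1", "bed1", "pillows"),
        ("room1", "bed1", "all"),
        ("room1", "bed2", "blankets"),
        ("room1", "bed2", "pillows"),
        ("room1", "bed2", "all"),
    ],
    names=["first index", "second index", "third index"],
)
series = pd.Series([1, 2, 3, 1, 1, 2], index=index)

# Third-level labels in order of first appearance in the input.
before = list(series.index.get_level_values("third index").unique())

# Third-level labels of the rows after unstacking the second level.
after = list(
    series.unstack("second index").index.get_level_values("third index")
)

print(before)  # ['blankets', 'pillows', 'all']
print(after)   # ['all', 'blankets', 'pillows']
```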


2017-08-01 John

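One possible solution is reindex (or, on older pandas versions, reindex_axis) with the parameter level=1. Note that reindex_axis was deprecated in pandas 0.21 and removed in pandas 1.0, so on current versions only reindex is available: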

s = series.unstack(1).reindex(['blankets','pillows','all'], level=1)
print(s)
second index             bed1  bed2
first index third index
room1       blankets        1     1
            pillows         2     1
            all             3     2
room2       blankets        2     2
            pillows         2     1
            all             4     3

# reindex_axis exists only in pandas < 1.0
s = series.unstack(1).reindex_axis(['blankets','pillows','all'], level=1)
print(s)
second index             bed1  bed2
first index third index
room1       blankets        1     1
            pillows         2     1
            all             3     2
room2       blankets        2     2
            pillows         2     1
            all             4     3

A more dynamic solution:

# Labels of the third level in their original order of appearance
a = series.index.get_level_values('third index').unique()
print(a)
Index(['blankets', 'pillows', 'all'], dtype='object', name='third index')

# reindex is used here; on pandas < 1.0, reindex_axis(a, level=1) is equivalent
s = series.unstack(1).reindex(a, level=1)
print(s)
second index             bed1  bed2
first index third index
room1       blankets        1     1
            pillows         2     1
            all             3     2
room2       blankets        2     2
            pillows         2     1
            all             4     3
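The dynamic approach can be generalized so that the original order of every remaining level is restored, not only the third one. The sketch below is one way to do that and is not part of the original answer: instead of reindex with level=1, it reindexes against the full remaining index, taken in order of first appearance from the input. The helper name `unstack_keep_order` is hypothetical.

```python
import pandas as pd


def unstack_keep_order(series: pd.Series, level) -> pd.DataFrame:
    """Unstack `level` and restore the input's row order (hypothetical helper)."""
    # Remaining index entries, in order of first appearance in the input.
    target = series.index.droplevel(level).unique()
    # unstack() sorts the rows; reindexing puts them back in input order.
    return series.unstack(level).reindex(target)


# Same data as in the question, built with from_product for brevity.
index = pd.MultiIndex.from_product(
    [["room1", "room2"], ["bed1", "bed2"], ["blankets", "pillows", "all"]],
    names=["first index", "second index", "third index"],
)
series = pd.Series([1, 2, 3, 1, 1, 2, 2, 2, 4, 2, 1, 3], index=index)

wide = unstack_keep_order(series, "second index")
print(wide)
```

Because the target index is derived from the input itself, no label list has to be typed by hand, and rows that were never present in the input are not introduced.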


2017-08-01 10:13:21 jezrael
