Objects with circular references are not garbage collected

I use a lot of small convenience classes like the following throughout my code:

class Structure(dict):
    def __init__(self, **kwargs):
        dict.__init__(self, **kwargs)
        self.__dict__ = self

The nice thing about it is that you can access its contents with either dictionary key syntax or the usual object attribute style:

myStructure = Structure(name="My Structure") 
print myStructure["name"] 
print myStructure.name

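Today I noticed that my application's memory consumption was slightly increasing in a situation where I expected it to decrease. It looks to me as if instances of the Structure class are not being garbage collected. Here is a small snippet that illustrates this: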

import gc 

class Structure(dict):
    def __init__(self, **kwargs):
        dict.__init__(self, **kwargs)
        self.__dict__ = self

structures = [Structure(name="__{0}".format(str(value))) for value in range(4096)] 
print "Structure name: ", structures[16].name 
print "Structure name: ", structures[16]["name"] 
del structures 
gc.collect() 
print "Structures count: ", len([obj for obj in gc.get_objects() if type(obj) is Structure])

With the following output:

Structure name: __16 
Structure name: __16 
Structures count: 4096

As you can see, the Structure instance count is still 4096.

I then commented out the line that creates the convenient self-reference:

import gc 

class Structure(dict):
    def __init__(self, **kwargs):
        dict.__init__(self, **kwargs)
        # self.__dict__ = self

structures = [Structure(name="__{0}".format(str(value))) for value in range(4096)] 
# print "Structure name: ", structures[16].name 
print "Structure name: ", structures[16]["name"] 
del structures 
gc.collect() 
print "Structures count: ", len([obj for obj in gc.get_objects() if type(obj) is Structure])

Now that the circular reference is removed, the output makes sense:

Structure name: __16 
Structures count: 0

I pushed the test a bit further and used Meliae to analyze the memory consumption:

import gc 
import pprint 
from meliae import scanner 
from meliae import loader 

class Structure(dict):
    def __init__(self, **kwargs):
        dict.__init__(self, **kwargs)
        self.__dict__ = self

structures = [Structure(name="__{0}".format(str(value))) for value in range(4096)] 
print "Structure name: ", structures[16].name 
print "Structure name: ", structures[16]["name"] 
del structures 
gc.collect() 
print "Structures count: ", len([obj for obj in gc.get_objects() if type(obj) is Structure]) 

scanner.dump_all_objects("Test_001.json") 
om = loader.load("Test_001.json") 
summary = om.summarize() 
print summary 

structures = om.get_all("Structure") 
if structures: 
    pprint.pprint(structures[0].c)

This produces the following output:

Structure name: __16 
Structure name: __16 
Structures count: 4096 
loading... line 5001, 5002 objs, 0.6/ 1.8 MiB read in 0.2s 
loading... line 10002, 10003 objs, 1.1/ 1.8 MiB read in 0.3s 
loading... line 15003, 15004 objs, 1.7/ 1.8 MiB read in 0.5s 
loaded line 16405, 16406 objs, 1.8/ 1.8 MiB read in 0.5s   
checked  1/ 16406 collapsed  0  
checked 16405/ 16406 collapsed  157  
compute parents  0/ 16249   
compute parents 16248/ 16249   
set parents 16248/ 16249   
collapsed in 0.2s 
Total 16249 objects, 58 types, Total size = 3.2MiB (3306183 bytes) 
Index Count %  Size % Cum  Max Kind 
    0 4096 25 1212416 36 36  296 Structure 
    1  390 2 536976 16 52 49432 dict 
    2 5135 31 417550 12 65 12479 str 
    3  82 0 290976 8 74 12624 module 
    4  235 1 212440 6 80  904 type 
    5  947 5 121216 3 84  128 code 
    6 1008 6 120960 3 88  120 function 
    7 1048 6  83840 2 90  80 wrapper_descriptor 
    8  654 4  47088 1 92  72 builtin_function_or_method 
    9  562 3  40464 1 93  72 method_descriptor 
    10  517 3  37008 1 94  216 tuple 
    11  139 0  35832 1 95 2280 set 
    12  351 2  30888 0 96  88 weakref 
    13  186 1  23200 0 97 1664 list 
    14  63 0  21672 0 97  344 WeakSet 
    15  21 0  18984 0 98  904 ABCMeta 
    16  197 1  14184 0 98  72 member_descriptor 
    17  188 1  13536 0 99  72 getset_descriptor 
    18  284 1  6816 0 99  24 int 
    19  14 0  5296 0 99 2280 frozenset 
[Structure(4312707312 296B 2refs 2par), 
type(4298634592 904B 4refs 100par 'Structure')]

Memory usage is 3.2 MiB. Removing the self-reference line results in the following output:

Structure name: __16 
Structures count: 0 
loading... line 5001, 5002 objs, 0.6/ 1.4 MiB read in 0.1s 
loading... line 10002, 10003 objs, 1.1/ 1.4 MiB read in 0.3s 
loaded line 12308, 12309 objs, 1.4/ 1.4 MiB read in 0.4s   
checked  12/ 12309 collapsed  0  
checked 12308/ 12309 collapsed  157  
compute parents  0/ 12152   
compute parents 12151/ 12152   
set parents 12151/ 12152   
collapsed in 0.1s 
Total 12152 objects, 57 types, Total size = 2.0MiB (2093714 bytes) 
Index Count %  Size % Cum  Max Kind 
    0  390 3 536976 25 25 49432 dict 
    1 5134 42 417497 19 45 12479 str 
    2  82 0 290976 13 59 12624 module 
    3  235 1 212440 10 69  904 type 
    4  947 7 121216 5 75  128 code 
    5 1008 8 120960 5 81  120 function 
    6 1048 8  83840 4 85  80 wrapper_descriptor 
    7  654 5  47088 2 87  72 builtin_function_or_method 
    8  562 4  40464 1 89  72 method_descriptor 
    9  517 4  37008 1 91  216 tuple 
    10  139 1  35832 1 92 2280 set 
    11  351 2  30888 1 94  88 weakref 
    12  186 1  23200 1 95 1664 list 
    13  63 0  21672 1 96  344 WeakSet 
    14  21 0  18984 0 97  904 ABCMeta 
    15  197 1  14184 0 98  72 member_descriptor 
    16  188 1  13536 0 98  72 getset_descriptor 
    17  284 2  6816 0 99  24 int 
    18  14 0  5296 0 99 2280 frozenset 
    19  22 0  2288 0 99  104 classobj

This confirms that the Structure instances have been destroyed and that memory usage dropped to 2.0 MiB.

Any idea how I can make sure this class gets properly garbage collected? By the way, all of this was run on Python 2.7.2 (Darwin).

Cheers,

Thomas

Source

2011-12-31 Kel Solaar

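Why would you want such a self-reference in the first place? Even if you insist on the duality of attribute access and item lookup (questionable, IMHO, according to the Zen of Python), there are better and simpler ways to achieve it. – delnan 2011-12-31 11:53:14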

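You can implement your Structure class more directly with __getattr__ and __setattr__, so that attribute access is forwarded to the underlying dictionary.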

class Structure(dict):
    def __getattr__(self, k):
        return self[k]
    def __setattr__(self, k, v):
        self[k] = v
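
One detail worth noting about the class above: a missing key surfaces as KeyError from __getattr__, and on Python 3 that makes hasattr() raise instead of returning False. The following is a minimal sketch of a slightly more defensive variant, assuming Python 3 (the question itself targets Python 2.7); the __delattr__ method and the sample field names are additions for illustration, not part of the answer.

```python
import weakref


class Structure(dict):
    """dict whose keys are also reachable as attributes, with no self-reference."""

    def __getattr__(self, key):
        # Only called when normal attribute lookup has already failed.
        try:
            return self[key]
        except KeyError:
            # AttributeError keeps hasattr() and getattr(obj, name, default) working.
            raise AttributeError(key) from None

    def __setattr__(self, key, value):
        self[key] = value

    def __delattr__(self, key):
        try:
            del self[key]
        except KeyError:
            raise AttributeError(key) from None


s = Structure(name="My Structure")
s.size = 3  # stored as a dictionary item

name_via_attr = s.name
size_via_key = s["size"]
has_missing = hasattr(s, "missing")
fallback = getattr(s, "missing", "n/a")

# No cycle is involved, so reference counting alone frees the instance.
probe = weakref.ref(s)
del s
freed_immediately = probe() is None

print(name_via_attr, size_via_key, has_missing, fallback, freed_immediately)
```

Because __getattr__ only runs after normal lookup fails, keys that collide with dict method names (for example "items" or "keys") remain reachable through item syntax only.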

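Reference cycles are garbage collected in Python, but only periodically (unlike ordinary reference-counted objects, which are collected as soon as their reference count drops to 0).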
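
The difference between the two mechanisms can be observed with a weak reference. This is a small sketch, assuming Python 3 on CPython; the class names CyclicStructure and PlainStructure and the helper survives_del are made up for the demonstration. The last check assumes an interpreter that includes the fix for http://bugs.python.org/issue1469629, which is not the case for the Python 2.7.2 used in the question.

```python
import gc
import weakref


class CyclicStructure(dict):
    """The pattern from the question: the instance is its own __dict__."""

    def __init__(self, **kwargs):
        dict.__init__(self, **kwargs)
        self.__dict__ = self  # the object now references itself


class PlainStructure(dict):
    """The same class without the self-reference."""


def survives_del(cls):
    """Return True if an instance is still alive right after its only name is deleted."""
    obj = cls(name="demo")
    probe = weakref.ref(obj)
    del obj
    return probe() is not None


gc.disable()  # keep the cycle collector from running on its own during the demo
try:
    plain_alive = survives_del(PlainStructure)
    cyclic_alive = survives_del(CyclicStructure)

    # An explicit collection pass is what reclaims the cyclic instance.
    probe = weakref.ref(CyclicStructure(name="demo"))
    gc.collect()
    cyclic_alive_after_collect = probe() is not None
finally:
    gc.enable()

print("plain instance alive after del:   ", plain_alive)
print("cyclic instance alive after del:  ", cyclic_alive)
print("cyclic instance alive after gc:   ", cyclic_alive_after_collect)
```

The plain instance disappears the moment its reference count reaches zero, while the self-referencing one stays around until the cycle collector runs.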

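Avoiding cycles (as a Structure class built on __getattr__ and __setattr__ does) means you will get better GC behavior. You may also want to look at collections.namedtuple as an alternative: it is not exactly what you implemented, but it might suit your purpose.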
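
For comparison, here is a short sketch of what the collections.namedtuple alternative looks like, assuming Python 3. The type name Record and its fields are hypothetical and chosen only for this illustration.

```python
from collections import namedtuple

# "Record" and its field names are hypothetical, used only for illustration.
Record = namedtuple("Record", ["name", "size"])

r = Record(name="My Structure", size=3)
by_attr = r.name   # attribute access
by_index = r[0]    # positional access; string keys such as r["name"] are not supported

# namedtuple instances are immutable, so attribute assignment fails.
try:
    r.size = 4
    mutable = True
except AttributeError:
    mutable = False

# _replace returns a modified copy and leaves the original untouched.
r2 = r._replace(size=4)

print(by_attr, by_index, mutable, r.size, r2.size)
```

The fixed field set and the immutability are the main differences from the dict-based Structure, so it only fits when the data does not need to change in place.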

Source

2011-12-31 11:54:59

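Hi Paul, cheers! It looks like a good alternative. I actually got the pattern from this post: http://ruslanspivak.com/2011/06/12/the-bunch-pattern/. Apparently the garbage collection bug is also known: http://bugs.python.org/issue1469629. Regarding namedtuple: I looked at it a long time ago, but I need my data to be mutable. – 2011-12-31 12:01:27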
