Python：排序依赖列表

我想解决如果我的问题是可以解决使用内置排序（）函数，或者如果我需要自己做 - 使用cmp的老学校本来会相对容易。Python：排序依赖列表

我的数据集看起来像：

 
x = [ 
('business', Set('fleet','address')) 
('device', Set('business','model','status','pack')) 
('txn', Set('device','business','operator')) 
....

排序规则应基本对于N & Y，其中Y> N，X [N] [0]不是在X [Y] [所有值1]

尽管我使用Python 2.6，其中cmp参数仍然可用，但我试图让这个Python 3安全。

那么，这可以使用一些lambda魔术和关键参数来完成？

- == UPDATE == -

感谢礼&温斯顿！我真的不认为使用钥匙会起作用，或者如果我能怀疑它会是一种不理想的鞋子喇叭解决方案。

因为我的问题是数据库表依赖关系，我不得不对Eli的代码做一个小的添加，以便从它的依赖列表中删除一个项目（在设计良好的数据库中，这不会发生，但是谁住在那个神奇的完美世界）

我的解决方案：

def topological_sort(source): 
    """perform topo sort on elements. 

    :arg source: list of ``(name, set(names of dependancies))`` pairs 
    :returns: list of names, with dependancies listed first 
    """ 
    pending = [(name, set(deps)) for name, deps in source]   
    emitted = [] 
    while pending: 
     next_pending = [] 
     next_emitted = [] 
     for entry in pending: 
      name, deps = entry 
      deps.difference_update(set((name,)), emitted) # <-- pop self from dep, req Py2.6 
      if deps: 
       next_pending.append(entry) 
      else: 
       yield name 
       emitted.append(name) # <-- not required, but preserves original order 
       next_emitted.append(name) 
     if not next_emitted: 
      raise ValueError("cyclic dependancy detected: %s %r" % (name, (next_pending,))) 
     pending = next_pending 
     emitted = next_emitted

来源

2012-07-19 DisabledLeopard

什么是'Set'？ – 2012-07-19 08:56:56

http://en.wikipedia.org/wiki/Topological_sorting – interjay 2012-07-19 09:05:14

@JonClements我会假设他的意思是'sets.Set'，尽管在他说他使用的Python 2.6中已经废弃了。但是，如果这就是他的意思，那么他需要向构造函数提供一个迭代器，而不是多个参数。 – Duncan 2012-07-19 12:50:32

你想要什么叫做topological sort。虽然可以使用内置的sort()来实现，但它很尴尬，最好是直接在python中实现拓扑排序。

为什么它会变得尴尬？如果您在维基页面上研究这两种算法，则它们都依赖于一组正在运行的“标记节点”，由于key=xxx（或甚至cmp=xxx）对无状态比较函数的效果最好，因此这个概念很难扭曲到sort()可以使用的形式，特别是因为timsort不能保证元素将被检查的顺序。我（很漂亮）确定确实使用使用sort()的任何解决方案最终将冗余计算每次调用key/cmp函数的一些信息，以解决无国籍问题。

以下是我一直在使用（一些JavaScript库的依赖关系排序）ALG：

编辑：重新设计，这大大的基础上，温斯顿·尤尔特的解决方案

def topological_sort(source): 
    """perform topo sort on elements. 

    :arg source: list of ``(name, [list of dependancies])`` pairs 
    :returns: list of names, with dependancies listed first 
    """ 
    pending = [(name, set(deps)) for name, deps in source] # copy deps so we can modify set in-place  
    emitted = []   
    while pending: 
     next_pending = [] 
     next_emitted = [] 
     for entry in pending: 
      name, deps = entry 
      deps.difference_update(emitted) # remove deps we emitted last pass 
      if deps: # still has deps? recheck during next pass 
       next_pending.append(entry) 
      else: # no more deps? time to emit 
       yield name 
       emitted.append(name) # <-- not required, but helps preserve original ordering 
       next_emitted.append(name) # remember what we emitted for difference_update() in next pass 
     if not next_emitted: # all entries have unmet deps, one of two things is wrong... 
      raise ValueError("cyclic or missing dependancy detected: %r" % (next_pending,)) 
     pending = next_pending 
     emitted = next_emitted

旁注：它是可能鞋号cmp()函数key=xxx，如在这个python bug跟踪器message。

来源

2012-07-19 15:37:30

在寻找坏的格式和这个陌生的Set类型...（我已经把他们作为元组和正确分隔列表项...）...并使用networkx库来方便...

x = [ 
    ('business', ('fleet','address')), 
    ('device', ('business','model','status','pack')), 
    ('txn', ('device','business','operator')) 
] 

import networkx as nx 

g = nx.DiGraph() 
for key, vals in x: 
    for val in vals: 
     g.add_edge(key, val) 

print nx.topological_sort(g)

来源

2012-07-19 09:08:38

这是唯一的解决方案，它适用于我...我不喜欢依赖于其他libs，但它是值得的小代码 – devarni 2013-05-04 13:19:04

关于此解决方案的一个警告;它只在你的依赖关系形成一个完全连通的图时才起作用。如果有节点没有任何依赖关系（因此没有任何边到其他节点），它们将不会包含在'topological_sort（）'的输出中。 – larsks 2014-10-16 13:00:52

我做一个拓扑排序是这样的：

def topological_sort(items): 
    provided = set() 
    while items: 
     remaining_items = [] 
     emitted = False 

     for item, dependencies in items: 
      if dependencies.issubset(provided): 
        yield item 
        provided.add(item) 
        emitted = True 
      else: 
        remaining_items.append((item, dependencies)) 

     if not emitted: 
      raise TopologicalSortFailure() 

     items = remaining_items

我认为它是一个略多于利的版本简单的，我不知道效率。

来源

2012-07-19 16:00:27

这比我的版本更*更直接。我认为你的主要效率汇是issubset（）调用，但这对大数据集来说只是一个问题 - 我的版本受到初始设置成本的影响，对于小数据集来说速度较慢，但是它可以让它修改集合当有很多依赖关系时避免issubset。尽管如此，你的基本结构仍然更好，我希望你不要介意我已经重写了我的实现来借用你的一堆解决方案:) – 2012-07-19 17:58:33

这是温斯顿的建议，文档字符串和一个微小的调整，与provided.issuperset(dependencies)。该更改允许您将每个输入对中的dependencies作为任意迭代来传递，而不一定是set。

我的用例涉及dict，其键是项目字符串，每个键的值都是该键所依赖的项目名称的list。一旦我确定dict非空，我可以将其iteritems()传递给修改的算法。

再次感谢温斯顿。

def topological_sort(items): 
    """ 
    'items' is an iterable of (item, dependencies) pairs, where 'dependencies' 
    is an iterable of the same type as 'items'. 

    If 'items' is a generator rather than a data structure, it should not be 
    empty. Passing an empty generator for 'items' (zero yields before return) 
    will cause topological_sort() to raise TopologicalSortFailure. 

    An empty iterable (e.g. list, tuple, set, ...) produces no items but 
    raises no exception. 
    """ 
    provided = set() 
    while items: 
     remaining_items = [] 
     emitted = False 

     for item, dependencies in items: 
      if provided.issuperset(dependencies): 
        yield item 
        provided.add(item) 
        emitted = True 
      else: 
        remaining_items.append((item, dependencies)) 

     if not emitted: 
      raise TopologicalSortFailure() 

     items = remaining_items

来源

2016-01-26 15:56:26

Python：排序依赖列表

回答

相关问题