2009-06-25 54 views
1

Formatting output when writing a list to a text file

I have a list of lists that looks like this:

dupe = [['95d1543adea47e88923c3d4ad56e9f65c2b40c76', 'ron\\c', 'apa.txt'], ['95d1543adea47e88923c3d4ad56e9f65c2b40c76', 'ron\\c', 'knark.txt'], ['b5cc17d3a35877ca8b76f0b2e07497039c250696', 'ron\\a', 'apa2.txt'], ['b5cc17d3a35877ca8b76f0b2e07497039c250696', 'ron\\a', 'jude.txt']]

I write it to a file with a very basic write() loop:

try:
    file_name = open("dupe.txt", "w")
except IOError:
    pass

for a in range(len(dupe)):
    file_name.write(dupe[a][0] + " " + dupe[a][1] + " " + dupe[a][2] + "\n")

file_name.close()

and the output in the file looks like this:

95d1543adea47e88923c3d4ad56e9f65c2b40c76 ron\c apa.txt 
95d1543adea47e88923c3d4ad56e9f65c2b40c76 ron\c knark.txt 
b5cc17d3a35877ca8b76f0b2e07497039c250696 ron\a apa2.txt 
b5cc17d3a35877ca8b76f0b2e07497039c250696 ron\a jude.txt 

But how can I make the output in dupe.txt look like this:

95d1543adea47e88923c3d4ad56e9f65c2b40c76 ron\c apa.txt, knark.txt 
b5cc17d3a35877ca8b76f0b2e07497039c250696 ron\a apa2.txt, jude.txt 
+0

Is the second column always the same if the hashes are equal? (Tasteful choice of file names, by the way. What's wrong with foo and bar? :P) – 2009-06-25 19:16:50

+0

This seems to be basically the same question as http://stackoverflow.com/questions/1034145/python-list-question. – 2009-06-25 19:19:03

Answers

2

First, group the lines by "key" (the first two elements of each array):

dupedict = {} 
for a, b, c in dupe: 
    dupedict.setdefault((a,b),[]).append(c) 

Then print them out:

for key, values in dupedict.items():
    print(' '.join(key) + ' ' + ', '.join(values))
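The loop above prints to the console, while the question asks for the lines to end up in dupe.txt. A minimal sketch of the same grouping written to a file instead, assuming Python 3.7+ (where a plain dict keeps insertion order); `write_grouped` is a hypothetical helper name, not part of the answer:

```python
def write_grouped(rows, path):
    """Group rows by (hash, directory) and write one line per group."""
    grouped = {}
    for file_hash, directory, filename in rows:
        grouped.setdefault((file_hash, directory), []).append(filename)

    # "with" closes the file even if a write fails
    with open(path, "w") as out:
        for key, filenames in grouped.items():
            out.write(" ".join(key) + " " + ", ".join(filenames) + "\n")
    return grouped


dupe = [
    ['95d1543adea47e88923c3d4ad56e9f65c2b40c76', 'ron\\c', 'apa.txt'],
    ['95d1543adea47e88923c3d4ad56e9f65c2b40c76', 'ron\\c', 'knark.txt'],
    ['b5cc17d3a35877ca8b76f0b2e07497039c250696', 'ron\\a', 'apa2.txt'],
    ['b5cc17d3a35877ca8b76f0b2e07497039c250696', 'ron\\a', 'jude.txt'],
]
write_grouped(dupe, "dupe.txt")
```

Keying on the tuple (hash, directory) rather than the hash alone keeps files apart if the same hash ever turns up in two directories.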
0

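If this is your actual data, you can:

  1. Output one line for every two elements of dupe. This is easy. Or,
  2. If your data is not that neatly structured, build a dictionary in which the long hash is the key and the rest of the string is your output. Make sense?

For idea one, that means you could do something like this: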

tmp_string = ""
for a in range(len(dupe)):
    if a % 2 == 0:
        # even index: the first row of a pair starts a new output line
        tmp_string = dupe[a][0] + " " + dupe[a][1] + " " + dupe[a][2]
    else:
        # odd index: the second row adds its file name, then the line is written
        tmp_string += ", " + dupe[a][2]
        file_name.write(tmp_string + "\n")

For idea two, you might have something like this:

x = dict()
for a in range(len(dupe)):
    key = dupe[a][0]
    # "key in x" tests whether the hash is already in the dictionary
    if key in x:
        x[key] += ", " + dupe[a][2]
    else:
        x[key] = dupe[a][0] + " " + dupe[a][1] + " " + dupe[a][2]
for b in x:  # iterate over every key in dictionary x
    file_name.write(x[b] + "\n")
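Idea one only works when every hash appears exactly twice. If the rows for one hash are merely adjacent, as they are in the sample data, `itertools.groupby` from the standard library copes with runs of any length. This is a different technique from the one in the answer, sketched here with shortened made-up rows; `format_runs` is a hypothetical name. Rows that share a key but are not adjacent would land on separate lines unless the list is sorted first.

```python
from itertools import groupby


def format_runs(rows):
    """Return one output line per run of consecutive rows sharing hash and directory."""
    lines = []
    # groupby only merges neighbours, so rows with equal keys must be adjacent
    for (file_hash, directory), run in groupby(rows, key=lambda r: (r[0], r[1])):
        filenames = ", ".join(row[2] for row in run)
        lines.append(file_hash + " " + directory + " " + filenames)
    return lines


# Made-up, shortened rows: one run of three and one run of one
rows = [
    ["aaa", "ron\\c", "apa.txt"],
    ["aaa", "ron\\c", "knark.txt"],
    ["aaa", "ron\\c", "third.txt"],
    ["bbb", "ron\\a", "jude.txt"],
]
for line in format_runs(rows):
    print(line)
```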
0

Use a dictionary to group them:

data = [['95d1543adea47e88923c3d4ad56e9f65c2b40c76', 'ron\\c', 'apa.txt'],
    ['95d1543adea47e88923c3d4ad56e9f65c2b40c76', 'ron\\c', 'knark.txt'],
    ['b5cc17d3a35877ca8b76f0b2e07497039c250696', 'ron\\a', 'apa2.txt'],
    ['b5cc17d3a35877ca8b76f0b2e07497039c250696', 'ron\\a', 'jude.txt']]

dupes = {}
for row in data:
    # "in" replaces dict.has_key(), which Python 3 removed
    if row[0] in dupes:
        dupes[row[0]].append(row)
    else:
        dupes[row[0]] = [row]

for dupe in dupes.values():
    print("%s\t%s\t%s" % (dupe[0][0], dupe[0][1], ",".join([x[2] for x in dupe])))
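The grouping above keys on the hash alone and prints the directory of the first row in each group, which is only right if equal hashes always share a directory (the point raised in the comments under the question). A small sketch of the difference, using made-up rows in which one hash shows up in two directories; `group_by` is a hypothetical helper:

```python
def group_by(rows, key_len):
    """Group file names by the first key_len columns of each row."""
    groups = {}
    for row in rows:
        groups.setdefault(tuple(row[:key_len]), []).append(row[2])
    return groups


# Made-up rows: same hash, different directories
rows = [
    ["aaa", "ron\\c", "apa.txt"],
    ["aaa", "ron\\a", "apa2.txt"],
]

by_hash = group_by(rows, 1)          # one group; the directory of apa2.txt is lost
by_hash_and_dir = group_by(rows, 2)  # two groups, one per directory

print(len(by_hash), len(by_hash_and_dir))
```

If the directory can differ for one hash, keying on both columns is the safer choice.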
1

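I take it your last question didn't solve your problem?

Instead of keeping a separate list for every entry with a repeated ID and directory, why not make the file element of each list another sub-list that holds all the files sharing the same ID and directory?

That way dupe would look like this: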

dupe = [['95d1543adea47e88923c3d4ad56e9f65c2b40c76', 'ron\\c', ['apa.txt','knark.txt']],
['b5cc17d3a35877ca8b76f0b2e07497039c250696', 'ron\\a', ['apa2.txt','jude.txt']]]

Then the print loop could look something like this:

for i in dupe:
    # end=" " keeps the following output on the same line
    print(i[0], i[1], end=" ")
    for j in i[2]:
        print(j, end=" ")
    print()
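If the flat list already exists, it can be converted into the nested shape suggested here. A minimal sketch; `nest` is a hypothetical helper name, not something from the answer:

```python
def nest(flat):
    """Turn [hash, dir, file] rows into [hash, dir, [files...]] rows."""
    index = {}   # (hash, dir) -> the nested row that collects its files
    nested = []  # nested rows in order of first appearance
    for file_hash, directory, filename in flat:
        key = (file_hash, directory)
        if key not in index:
            index[key] = [file_hash, directory, []]
            nested.append(index[key])
        index[key][2].append(filename)
    return nested


flat = [
    ['95d1543adea47e88923c3d4ad56e9f65c2b40c76', 'ron\\c', 'apa.txt'],
    ['95d1543adea47e88923c3d4ad56e9f65c2b40c76', 'ron\\c', 'knark.txt'],
    ['b5cc17d3a35877ca8b76f0b2e07497039c250696', 'ron\\a', 'apa2.txt'],
    ['b5cc17d3a35877ca8b76f0b2e07497039c250696', 'ron\\a', 'jude.txt'],
]
for file_hash, directory, filenames in nest(flat):
    print(file_hash, directory, ", ".join(filenames))
```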
1
from collections import defaultdict 

dupe = [ 
    ['95d1543adea47e88923c3d4ad56e9f65c2b40c76', 'ron\\c', 'apa.txt'], 
    ['95d1543adea47e88923c3d4ad56e9f65c2b40c76', 'ron\\c', 'knark.txt'], 
    ['b5cc17d3a35877ca8b76f0b2e07497039c250696', 'ron\\a', 'apa2.txt'], 
    ['b5cc17d3a35877ca8b76f0b2e07497039c250696', 'ron\\a', 'jude.txt'], 
] 
with open("dupe.txt", "w") as f:
    data = defaultdict(list)
    # collect the file names under their (hash, directory) pair
    for hash, dir, fn in dupe:
        data[(hash, dir)].append(fn)
    for hash_dir, fns in data.items():
        f.write("{0[0]} {0[1]} {1}\n".format(hash_dir, ', '.join(fns)))