2016-04-27 108 views

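Nested loops reading different CSV files in Python

Basically, I have to open a CSV report (about 30,000 lines) and replace the ARTIST and TITLE in each line whenever they appear in a second CSV file (about 10,000 lines) of corrected ARTIST and TITLE values.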

The code I came up with scans all 31,400 lines, but for some reason it only replaces the first instance it finds.

Here is my code:

def convert(): # StackOverflow refuses to display the indents correctly
    global modified
    print "\n\nConverting: " + logfile + "\n\n"
    songCount = 0  # Number of lines required to be reported
    unclaimedCount = 0 # Number of lines not required to be reported (used to double check accuracy or report)
    freport = open(musicreportname, "w") # This is the new report we will create
    flogfile = open(logfile, "r")  # This is the existing report
    freplacefile = open(replacefile, "r") # This file contains corrected names to be substituted and ISRC Codes
    freport.write("^NAME_OF_SERVICE^|^TRANSMISSION_CATEGORY^|^FEATURED_ARTIST^|^SOUND_RECORDING_TITLE^|^ISRC^|^ALBUM_TITLE^|^MARKETING_LABEL^|^ACTUAL_TOTAL_PERFORMANCES^\n")
    lineCount = 0
    rlinecount = 0
    for line in csv.reader(flogfile, quotechar='"', delimiter=',', quoting=csv.QUOTE_ALL, skipinitialspace=True):
        lineCount += 1
        if line[0][0] == "#":
            continue
        if line[16] == "S":
            songCount += 1
            matched = "FALSE"
            rlineCount = 0
            for rline in csv.reader(freplacefile, delimiter=',', quoting=csv.QUOTE_ALL, skipinitialspace=True):
                rlineCount += 1
                if line[3] == rline[2]:
                    print "Matched " + line[3]
                    if line[4] == rline[1]:
                        print "Matched " + line[3], rline[1]
                        output = "^" + service + "^|^" + "B" + "^|^" + rline[8] + "^|^" + rline[7] + "^|^" + rline[6] + "^|^" + line[5] + "^|^" + line[6] + "^|^" + line[2] + "^\n"
                        freport.write(output)
                        matched = "TRUE"
                        modified += 1
                        break
                if matched == "FALSE":
                    output = "^" + service + "^|^" + "B" + "^|^" + line[3] + "^|^" + line[4] + "^|^" + line[8] + "^|^" + line[5] + "^|^" + line[6] + "^|^" + line[2] + "^\n"
                    freport.write(output)
        else:
            unclaimedCount += 1
    freport.close()
    flogfile.close()
    freplacefile.close()
    print str(songCount) + " Total Songs Found."
    print "Checked " + str(lineCount) + " lines."
    print "Replaced " + str(modified) + " lines."

Any help would be greatly appreciated! Thanks in advance!

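Another user and I reformatted the code to make it a bit clearer. Can you confirm that we didn't mess up the indentation? For future reference, if you put four spaces in front of each line of code, it will be placed in a code block and be easier to read. – thegrinner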

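Nested loops are the wrong way to do this, because they mean the second file has to be read 30,000 times. Read the second file once and create a dictionary containing all the mappings. Then read the first file and use the mapping dictionary to perform all the renames. – Barmar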

I'm still trying to edit this so that it displays the correct indentation... –

Answers

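Read the second file once and create a dictionary containing all the mappings. Then read the first file and use the mapping dictionary to perform all the renames. - Barmar, Apr 27 at 18:12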
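
The suggestion above could be sketched roughly as follows. This is only an illustration, written for Python 3 rather than the Python 2 of the question. The helper names `build_mapping` and `corrected_fields` and the sample rows are made up for the example; the column positions are taken from the question's code (`rline[2]`/`rline[1]` to match, `rline[8]`/`rline[7]`/`rline[6]` as the corrected values).

```python
import csv
import io


def build_mapping(replace_rows):
    """Map (artist, title) -> (corrected artist, corrected title, ISRC)."""
    mapping = {}
    for rline in replace_rows:
        if len(rline) < 9:
            continue  # skip blank or short rows
        # setdefault keeps the first entry for a key, like the `break`
        # in the original nested loop.
        mapping.setdefault((rline[2], rline[1]), (rline[8], rline[7], rline[6]))
    return mapping


def corrected_fields(line, mapping):
    """Return (artist, title, ISRC) for one log row, corrected if known."""
    return mapping.get((line[3], line[4]), (line[3], line[4], line[8]))


# Hypothetical sample data standing in for the real replacement file.
replace_csv = io.StringIO(
    '"x","Helo","Adelle","","","","ISRC1","Hello","Adele"\n'
)
mapping = build_mapping(csv.reader(replace_csv, skipinitialspace=True))

# Hypothetical log rows: artist at index 3, title at 4, ISRC at 8.
matched_row = ["", "", "1", "Adelle", "Helo", "Album", "Label", "", "OLD"]
unmatched_row = ["", "", "1", "Muse", "Uprising", "", "", "", "GB123"]

print(corrected_fields(matched_row, mapping))
print(corrected_fields(unmatched_row, mapping))
```

The replacement file is read exactly once, and each log row then costs a single dictionary lookup instead of a scan over roughly 10,000 rows.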

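I followed Barmar's advice. I'm just getting started with Python. I used tuples instead of a dictionary, but it's the same idea. I never found the error in the code above, but now everything works as expected. Thanks, Barmar.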
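
A likely explanation for the original behaviour, though nobody in the thread confirmed it: the inner `csv.reader` is created on the same open file object each time, and a file object does not rewind by itself. Once the inner loop has run to the end of the file, every later `csv.reader` on that object yields nothing unless the file is rewound with `seek(0)`. A minimal Python 3 sketch, using an in-memory `io.StringIO` with made-up contents as a stand-in for the replacement file:

```python
import csv
import io

# Stand-in for an open replacement file (contents are invented).
replace_file = io.StringIO("a,1\nb,2\n")

# The first reader consumes the file to the end.
first_pass = list(csv.reader(replace_file))

# A new reader on the same object starts at end-of-file, so it is empty.
second_pass = list(csv.reader(replace_file))

# Rewinding the underlying file makes the rows available again.
replace_file.seek(0)
third_pass = list(csv.reader(replace_file))

print(len(first_pass), len(second_pass), len(third_pass))
```

Rewinding before every inner loop would make the nested version work, but it would still reread the file once per song, which is why loading the replacements into memory once is the better fix.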