f.readline vs f.read print output

I am new to Python (using Python 3.6). I have a read.txt file containing information about a firm. The file starts with various report characteristics:

CONFORMED PERIOD REPORT:    20120928 #this is 1 line 
DATE OF REPORT:      20121128 #this is another line 

and then starts all the text about the firm..... #lots of lines here

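I am trying to extract the two dates (['20120928', '20121128']) and also check whether certain strings occur in the text (i.e. if the string is present, I want a '1'). In the end, I want a vector giving me the two dates plus the 1s and 0s for the different strings, i.e. ['20120928', '20121128', '1', '0']. My code is the following: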

import re

exemptions = []  # vector I want

with open('read.txt', 'r') as f:
    line2 = f.read()  # read the txt file
    for line in f:
        if "CONFORMED PERIOD REPORT" in line:
            # add the line without "CONFORMED PERIOD REPORT", just the date
            exemptions.append(line.strip('\n').replace("CONFORMED PERIOD REPORT:\t", ""))
        elif "DATE OF REPORT" in line:
            exemptions.append(line.rstrip('\n').replace("DATE OF REPORT:\t", ""))  # idem above

    var1 = re.findall("string1", line2, re.I)  # find string1 in line2, case-insensitive
    if len(var1) > 0:  # if the string appears, it will have length > 0
        exemptions.append('1')
    else:
        exemptions.append('0')
    var2 = re.findall("string2", line2, re.I)
    if len(var2) > 0:
        exemptions.append('1')
    else:
        exemptions.append('0')

print(exemptions)

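If I run this code, I get ['1', '0']: the dates are omitted, while the checks on the file contents are correct, since var1 exists (OK, '1') and var2 does not (OK, '0'). What I don't understand is why it doesn't report the dates. Importantly, when I change line2 to "line2 = f.readline()", I obtain ['20120928', '20121128', '0', '0']. The dates are now fine, but I know var1 exists, so it seems it doesn't read the rest of the file? If I omit "line2 = f.read()", it outputs a vector of 0s for each line, apart from my desired output. How can I omit these 0s?

My desired output would be: ['20120928', '20121128', '1', '0']

Sorry for the bother. Anyway, thank you!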

Source

2017-07-24 martins

This is how I finally went about it:

import re

exemptions = []  # vector I want

with open('read.txt', 'r') as f:
    line2 = ""  # create an empty string variable outside the "for line" loop
    for line in f:
        line2 = line2 + line  # append each line to the string created above
        if "CONFORMED PERIOD REPORT" in line:
            # add the line without "CONFORMED PERIOD REPORT", just the date
            exemptions.append(line.strip('\n').replace("CONFORMED PERIOD REPORT:\t", ""))
        elif "DATE OF REPORT" in line:
            exemptions.append(line.rstrip('\n').replace("DATE OF REPORT:\t", ""))  # idem above

    var1 = re.findall("string1", line2, re.I)  # find string1 in line2, case-insensitive
    if len(var1) > 0:  # if the string appears, it will have length > 0
        exemptions.append('1')
    else:
        exemptions.append('0')
    var2 = re.findall("string2", line2, re.I)
    if len(var2) > 0:
        exemptions.append('1')
    else:
        exemptions.append('0')

print(exemptions)

So far, this is what I've got. It works for me, although I guess working with Beautiful Soup would make the code more efficient. Next step :)
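One possible tidy-up of the date extraction, offered as a sketch rather than as the poster's method: capture the date with a regular expression, so it does not matter whether the label is followed by a tab or by spaces. The 8-digit date format is assumed from the sample lines in the question, and `find_date` and the sample text are hypothetical.

```python
import re

def find_date(label, text):
    """Return the 8-digit date that follows `label:`, or None if it is absent."""
    match = re.search(re.escape(label) + r":\s*(\d{8})", text)
    return match.group(1) if match else None

# Illustrative header: one label followed by spaces, the other by a tab.
sample = (
    "CONFORMED PERIOD REPORT:    20120928\n"
    "DATE OF REPORT:\t20121128\n"
)

period = find_date("CONFORMED PERIOD REPORT", sample)
report = find_date("DATE OF REPORT", sample)
missing = find_date("SOME OTHER LABEL", sample)  # label not present in the sample
print(period, report, missing)
```

Returning None for a missing label lets the caller decide whether to skip the entry or record a placeholder.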

Source

2017-07-24 15:31:42 martins

line2 = f.read() reads the entire file into line2, so there is nothing left for your for line in f: loop to read.
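A minimal sketch of this behaviour, using an in-memory `io.StringIO` object in place of the real read.txt (the sample contents are made up for illustration). After `read()` the file position sits at the end, so iterating yields nothing; `seek(0)` rewinds it.

```python
import io

# Stand-in for open('read.txt'); the contents are illustrative only.
f = io.StringIO("HEADER\nDATE OF REPORT:\t20121128\nsome text\n")

whole = f.read()        # consumes everything; the position is now at the end
after_read = list(f)    # iterating at this point yields no lines

f.seek(0)               # rewind to the start of the file
after_seek = list(f)    # iterating yields all three lines again

print(after_read)
print(len(after_seek))
```

The same applies to `readline()`: it consumes only the first line, so a later loop over `f` starts from the second line and the variable holds just that first line.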

Source

2017-07-24 12:31:18

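The line f.read() will read the entire file into the variable line2. If you want to read line by line, you can skip f.read() altogether and just iterate, like this: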

with open('read.txt', 'r') as f:
    for line in f:
        pass  # handle each line here

Otherwise, as written, once you .read() into line2 there is no more text left to read out of f, because it is all contained in the line2 variable.
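If both the line-by-line checks and the whole-text search are needed, one option is to read the file once and reuse the resulting string. The sketch below assumes the label and the date are separated by a tab, as in the question's code; `build_vector` and the sample text are hypothetical, for illustration only.

```python
import re

def build_vector(text, patterns=("string1", "string2")):
    """Return the report dates followed by a '1'/'0' flag for each pattern."""
    result = []
    for line in text.splitlines():   # iterate over the text that was already read
        if "CONFORMED PERIOD REPORT" in line:
            result.append(line.replace("CONFORMED PERIOD REPORT:\t", "").strip())
        elif "DATE OF REPORT" in line:
            result.append(line.replace("DATE OF REPORT:\t", "").strip())
    for pattern in patterns:         # case-insensitive search over the whole text
        result.append('1' if re.search(pattern, text, re.I) else '0')
    return result

# Illustrative contents; the real file would be read with f.read().
sample = (
    "CONFORMED PERIOD REPORT:\t20120928\n"
    "DATE OF REPORT:\t20121128\n"
    "text about the firm mentioning String1 somewhere\n"
)
vector = build_vector(sample)
print(vector)
```

With the real file, the string passed in would be the result of a single f.read() inside the with block.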

Source

2017-07-24 12:31:46 CoryKramer

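It would be better to use f.readlines() and then iterate over the lines rather than splitting on \n, as that may not give you the expected result. – Ajurna

I'm not sure the first code snippet is even worth mentioning as a suggestion; the second way is clearly the way to go –

Good point. Dropped the first approach. – CoryKramer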
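On the readlines() versus split on \n point raised in the comments, a small sketch of how the results differ. It uses `io.StringIO` as a stand-in for a real file, and the two-line sample text is made up.

```python
import io

text = "first line\nsecond line\n"

# Splitting on "\n" leaves a trailing empty string when the text ends with a newline.
by_split = text.split("\n")

# readlines() keeps the newline characters and adds no empty trailing entry.
by_readlines = io.StringIO(text).readlines()

# splitlines() drops the newlines and also adds no empty trailing entry.
by_splitlines = text.splitlines()

print(by_split)
print(by_readlines)
print(by_splitlines)
```

The extra empty string from split("\n") is the kind of unexpected result the comment warns about, for example when producing one output entry per line.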
