Read a file split on \n, but ignore the last \n

I have a file named list.txt that looks like this:

input1 
input2 
input3

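I'm sure there is no blank line after the last line (input3). I then have a Python script that reads this file line by line and writes some text to create 3 files (one per line):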

import os
os.chdir("/Users/user/Desktop/Folder")

# read the whole file and split it on '\n'
with open('list.txt', 'r') as f:
    lines = f.read().split('\n')

# write one script per element of lines
for l in lines:
    header = "#!/bin/bash \n#BSUB -J %s.sh \n#BSUB -o /scratch/DBC/user/%s.sh.out \n#BSUB -e /scratch/DBC/user/%s.sh.err \n#BSUB -n 1 \n#BSUB -q normal \n#BSUB -P DBCDOBZAK \n#BSUB -W 168:00\n" % (l, l, l)
    script = "cd /scratch/DBC/user\n"
    script2 = 'grep "input" %s > result.%s.txt\n' % (l, l)
    all = "\n".join([header, script, script2])

    with open('script_{}.sh'.format(l), 'w') as output:
        output.write(all)

My problem is that this creates 4 files, not 3: script_input1.sh, script_input2.sh, script_input3.sh and script_.sh. The last file has no text where the other files have input1, input2 or input3.

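It seems that Python reads my list.txt line by line, but when it reaches "input3" it somehow keeps going? How can I make Python read my file line by line, split on "\n", but stop after the last piece of text?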

Source

2017-10-11 m93

Possible duplicate of "Remove the newline character in a list read from a file" (https://stackoverflow.com/questions/4319236/remove-the-newline-character-in-a-list-read-from-a-file) – Mort

I'll say it again (https://stackoverflow.com/questions/46685755/python-script-to-make-multiple-bash-scripts#comment80321657_46685755): you should probably rethink your approach. – tripleee

First, don't read the whole file into memory when you don't have to. Files are iterable, so the proper way to read a file line by line is:

with open("/path/to/file.ext") as f:
    for line in f:
        do_something_with(line)

Now, in your for loop, you just need to strip the line and skip it if it is empty:

with open("/path/to/file.ext") as f:
    for line in f:
        line = line.strip()
        if not line:
            continue
        do_something_with(line)

Slightly unrelated, but Python has multiline strings, so you don't need the concatenation either:

# not sure I got it right actually ;)
# the backslash after the opening quotes keeps the shebang on the first line
script_tpl = """\
#!/bin/bash
#BSUB -J {line}.sh
#BSUB -o /scratch/DBC/user/{line}.sh.out
#BSUB -e /scratch/DBC/user/{line}.sh.err
#BSUB -n 1
#BSUB -q normal
#BSUB -P DBCDOBZAK
#BSUB -W 168:00
cd /scratch/DBC/user
grep "input" {line} > result.{line}.txt
"""

with open("/path/to/file.ext") as f:
    for line in f:
        line = line.strip()
        if not line:
            continue
        script = script_tpl.format(line=line)
        with open('script_{}.sh'.format(line), 'w') as output:
            output.write(script)

One last point: avoid changing directories in your scripts; build the paths with os.path.join() instead.
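As a rough sketch of that last point, the paths can be built explicitly instead of relying on os.chdir(). The function name write_scripts and its base_dir and list_name parameters are invented for this illustration, and the script body is shortened:

```python
import os

def write_scripts(base_dir, list_name="list.txt"):
    """Create one script per non-empty line of base_dir/list_name.

    Hypothetical helper: the name and parameters are assumptions made
    for this example, not part of the original script.
    """
    created = []
    list_path = os.path.join(base_dir, list_name)
    with open(list_path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines, including a trailing one
            script_path = os.path.join(base_dir, "script_{}.sh".format(line))
            with open(script_path, "w") as output:
                # shortened body; the full template would go here
                output.write("#!/bin/bash\n#BSUB -J {0}.sh\n".format(line))
            created.append(script_path)
    return created
```

Calling write_scripts("/Users/user/Desktop/Folder") would then read and write inside that folder without changing the process's working directory.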

Source

2017-10-11 15:01:41

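Thanks @bruno desthuilliers. A question about the last snippet: in the line "with open('script_{}.sh'.format(l), 'w') as output:", I should replace "l" with "line", right? Since "l" is no longer defined in this script – m93

Of course, I fixed it. –

One last question about the part that says "line = line.strip(); if not line: continue": does it strip whitespace or newline characters? And if nothing is left, it continues? Sorry, I'm very new to Python, so this isn't very clear to me – m93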
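To illustrate what those two lines do: strip() with no argument removes all leading and trailing whitespace, newlines included, and an empty string is falsy, so "if not line: continue" skips any line that is empty after stripping. A small sketch, with sample strings made up for the example:

```python
samples = ["input1\n", "  input2  \n", "\n", "   \n", "input3"]

kept = []
for line in samples:
    line = line.strip()   # removes spaces, tabs and newlines at both ends
    if not line:          # '' is falsy: nothing was left after stripping
        continue          # jump to the next line without doing any work
    kept.append(line)

print(kept)  # ['input1', 'input2', 'input3']
```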

With your current approach, you would want to:

Check whether the last element of lines is empty (lines[-1] == '').
If it is, discard it (lines = lines[:-1]).

with open('list.txt','r') as f: 
    lines = f.read().split('\n') 

if lines[-1] == '': 
    lines = lines[:-1] 

for line in lines:  
    print(line)

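Don't forget that it is legal for a file not to end with a newline (that is, to have no empty last line); the check above handles that case too.

Also, as @setsquare pointed out, you might want to try using readlines():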

with open('list.txt', 'r') as f: 
    lines = [ line.rstrip('\n') for line in f.readlines() ] 

for line in lines: 
    print(line)

Source

2017-10-11 14:47:38 Attie

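What if there are multiple blank lines at the end? – randomir

If handling blank lines is a concern, then we have a different problem... this only handles the common "_empty last line_" case – Attie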
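If several trailing blank lines are a possibility, one option (an assumption about what is wanted, not part of the answer above) is to strip trailing newlines from the whole text before splitting. The helper name split_lines is invented for this sketch:

```python
def split_lines(text):
    """Split text into lines after dropping any trailing newlines.

    Hypothetical helper for illustration only.
    """
    text = text.rstrip("\n")
    # an empty or newline-only text yields no lines at all
    return text.split("\n") if text else []

print(split_lines("input1\ninput2\ninput3"))        # ['input1', 'input2', 'input3']
print(split_lines("input1\ninput2\ninput3\n"))      # ['input1', 'input2', 'input3']
print(split_lines("input1\ninput2\ninput3\n\n\n"))  # ['input1', 'input2', 'input3']
```

Blank lines in the middle of the file are kept as empty strings with this approach; only the trailing ones are dropped.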

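Have you considered using readlines() instead of read()? That lets Python deal with whether or not the last line ends with a \n.

Keep in mind that if the input file has a \n at the end of the last line, then using read() and splitting on '\n' will create an extra value. For example: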

my_string = 'one\ntwo\nthree\n'
my_list = my_string.split('\n')
print(my_list)
# >> ['one', 'two', 'three', '']

Potential solution

with open('list.txt') as f:
    lines = f.readlines()
# remove newlines
lines = [line.strip() for line in lines]
# remove any empty values, just in case (list() is needed on Python 3)
lines = list(filter(bool, lines))

For a simple example, see: How do I read a file line-by-line into a list?

Source

2017-10-11 14:52:44 setsquare

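Why use 'readlines()' at all? 'lines = [line.strip() for line in f]' does the same thing. But this doesn't solve the OP's problem: you still need to filter out the empty lines. –

Fair enough, added an edit – setsquare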

I think you are using split the wrong way.

If you have the following:

text = 'xxx yyy' 
text.split(' ') # or simply text.split()

The result will be

['xxx', 'yyy']

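Now, if you have:

text = 'xxx yyy ' # extra space at the end
text.split(' ') # explicit separator; a bare text.split() would drop the empty string

The result will be

['xxx', 'yyy', '']

because split returns whatever comes before and after each ' ' (space). In this case there is an empty string after the last space.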
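The difference between passing a separator and passing none is easy to miss, so here is a small sketch of both behaviours. The variable names are made up for the example:

```python
text = "xxx yyy "  # note the trailing space

with_sep = text.split(" ")   # explicit separator: every single space splits
no_sep = text.split()        # no separator: splits on runs of whitespace,
                             # ignoring whitespace at both ends

print(with_sep)  # ['xxx', 'yyy', '']
print(no_sep)    # ['xxx', 'yyy']

# the same applies to newlines in file contents
file_text = "input1\ninput2\ninput3\n"
print(file_text.split("\n"))  # ['input1', 'input2', 'input3', '']
print(file_text.split())      # ['input1', 'input2', 'input3']
```

Note that split() with no argument also splits on spaces and tabs inside a line, so it only suits files where each line is a single word.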

There are some functions you can use:

strip([chars]) # This removes all the given chars at the beginning or end of a string

Example:

text = '___text_about_something___' 
text.strip('_')

The result will be:

'text_about_something'

For your specific problem, you could simply write:

lines = f.readlines() # read all lines of the file; each one keeps its trailing '\n'
for l in lines:
    l = l.strip() # remove the newline and any extra whitespace at the start or end of the line

Source

2017-10-11 15:07:52 klaus

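f.read() returns a string that ends with a newline, which split treats as separating the last line from an empty string. It's not clear why you explicitly read the whole file into memory; just iterate over the file object and let it handle the line splitting.

with open('list.txt','r') as f:
    for l in f:
        # ...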
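A slightly fuller sketch of that idea follows. Iterating a file yields lines that still carry their trailing newline, so it is removed here before use. The function name iter_names is an assumption made for this example, and io.StringIO stands in for the real file so the snippet runs on its own:

```python
import io

def iter_names(lines):
    """Yield each non-blank line without its trailing newline.

    `lines` can be any iterable of strings, such as an open file.
    Hypothetical helper for illustration only.
    """
    for l in lines:
        l = l.rstrip("\n")
        if l:
            yield l

# io.StringIO stands in for open('list.txt') so the example is self-contained
fake_file = io.StringIO("input1\ninput2\ninput3\n")
names = list(iter_names(fake_file))
print(names)  # ['input1', 'input2', 'input3']
```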

Source

2017-10-11 15:08:37 chepner
