How can I format a text file using Python or any other programming/scripting language?


I would like to know how to format a text file using Python or any other programming/scripting language.

In the text file, the current format looks like this:

ABALONE 
Ab`a*lo"ne, n. (Zoöl.) 

Defn: A univalve mollusk of the genus Haliotis. The shell is lined 
with mother-of-pearl, and used for ornamental purposes; the sea-ear. 
Several large species are found on the coast of California, clinging 
closely to the rocks.

I want it to look like this (everything on one line, with some of the words left out, etc.):

ABALONE : A univalve mollusk of the genus Haliotis. The shell is lined with

Source

2013-02-08 DevCon

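What have you tried? – Cfreak 2013-02-08 15:28:49

What are the rules for turning the first block of text into the second? Manipulating strings in Python is relatively simple. What have you tried, and what didn't work? – 2013-02-08 15:29:29

Start with http://docs.python.org/3/tutorial/index.html – 2013-02-08 15:30:12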

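Assuming the format is always exactly as you described (word, pronunciation, blank line, "Defn:" followed by the definition), this is a simple matter of splitting and joining strings: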

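def reformat(text):
    # Split into: word, pronunciation, blank line, and the rest (the definition).
    lines = text.split('\n', 3)
    word = lines[0].strip()
    # Drop the leading 'Defn:' marker from the definition paragraph.
    definition_paragraph = lines[3][len('Defn:'):]
    # Collapse line breaks and runs of spaces into single spaces.
    definition_line = ' '.join(definition_paragraph.split())
    return word + ' : ' + definition_line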

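The idea is to write a piece of code that can easily be called to fix the text. Here the function is called reformat. It splits the given text into the first three lines plus the definition paragraph, extracts the definition from that paragraph, and glues the word and the definition together.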

Another solution is a regular expression, which is better suited to the task but can be hard to understand because of its odd syntax:

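import re
# Group 1 is the word; group 2 is everything after 'Defn: '.
pattern = re.compile(r'(.+?)\n.+?\n\nDefn: (.+)', re.DOTALL)
def reformat(text):
    word, definition = pattern.search(text).groups()
    # Trim the word and collapse line breaks and runs of spaces in the definition.
    return word.strip() + ' : ' + ' '.join(definition.split())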

This should work exactly like the other code above, but it is simpler, more flexible, and the pattern can be ported to other languages.
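As one illustration of that flexibility, here is a minimal sketch that applies a similar pattern to a text holding several entries. It assumes that entries are separated by completely empty lines and that each one follows the word / pronunciation / blank line / "Defn:" layout. The name reformat_all and the second sample entry are made up for this example.

```python
import re

# One entry: word line, pronunciation line, blank line, then 'Defn:' followed
# by one or more non-empty lines. Without re.DOTALL, '.' never crosses a
# newline, so the definition stops at the first empty line.
entry_pattern = re.compile(
    r'^(.+)\n.+\n[ \t]*\nDefn:[ \t]*(.+(?:\n.+)*)',
    re.MULTILINE,
)

def reformat_all(text):
    """Return one 'WORD : definition' string per entry found in text."""
    results = []
    for match in entry_pattern.finditer(text):
        word, definition = match.groups()
        # Collapse line breaks and runs of spaces into single spaces.
        results.append(word.strip() + ' : ' + ' '.join(definition.split()))
    return results

sample = (
    'ABALONE \n'
    'Ab`a*lo"ne, n. \n'
    '\n'
    'Defn: A univalve mollusk of the genus Haliotis. The shell is lined \n'
    'with mother-of-pearl. \n'
    '\n'
    'EXAMPLE \n'
    'Ex*am"ple, n. \n'
    '\n'
    'Defn: A made-up second entry \n'
    'for illustration. \n'
)

lines = reformat_all(sample)
for line in lines:
    print(line)
```

If the real file separates entries with lines that contain only spaces, the pattern would need adjusting, since such a line counts as part of the definition here.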

To use either of the approaches above, just call the function and pass the text as its argument.
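A call might look like the sketch below, which uses the sample entry from the question. The function is restated so the snippet runs on its own. Trimming the word and collapsing whitespace are assumptions made here so that the trailing spaces in the sample do not end up in the output.

```python
def reformat(text):
    # Split into: word, pronunciation, blank line, definition paragraph.
    lines = text.split('\n', 3)
    word = lines[0].strip()
    definition = lines[3][len('Defn:'):]
    # Collapse line breaks and runs of spaces into single spaces.
    return word + ' : ' + ' '.join(definition.split())

sample = (
    'ABALONE \n'
    'Ab`a*lo"ne, n. (Zoöl.) \n'
    '\n'
    'Defn: A univalve mollusk of the genus Haliotis. The shell is lined \n'
    'with mother-of-pearl, and used for ornamental purposes; the sea-ear. \n'
    'Several large species are found on the coast of California, clinging \n'
    'closely to the rocks.\n'
)

result = reformat(sample)
print(result)  # starts with: ABALONE : A univalve mollusk of the genus Haliotis.
```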

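To replace the text in a file, you need to open the file, read its contents, format them with either of the functions above, and save the result back to the file: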

with open('word.txt') as open_file: 
    text = open_file.read() 

with open('word.txt', 'w') as open_file: 
    open_file.write(reformat(text))

If you need to do this for, say, every file in a directory, take a look at listdir in the os module.
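A rough sketch of that idea with os.listdir follows. It assumes each file holds exactly one entry in the layout described above, that the files end in .txt, and that they are UTF-8 encoded. The helper name reformat_directory and the folder name in the final comment are hypothetical.

```python
import os

def reformat(text):
    # Split into: word, pronunciation, blank line, definition paragraph.
    lines = text.split('\n', 3)
    word = lines[0].strip()
    definition = lines[3][len('Defn:'):]
    return word + ' : ' + ' '.join(definition.split())

def reformat_directory(directory):
    """Rewrite every .txt file in directory in place; return the names rewritten."""
    changed = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # Skip subdirectories and anything that is not a .txt file.
        if not (name.endswith('.txt') and os.path.isfile(path)):
            continue
        with open(path, encoding='utf-8') as open_file:
            text = open_file.read()
        with open(path, 'w', encoding='utf-8') as open_file:
            open_file.write(reformat(text))
        changed.append(name)
    return changed

# Example call (hypothetical folder name):
# reformat_directory('entries')
```

Because the files are overwritten in place, it may be worth trying this on a copy of the directory first.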

Source

2013-02-08 18:31:15 BoppreH
