2014-09-03 64 views

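Finding the maximum and minimum number of words per sentence in an input file

I have an assignment that asks me to find the minimum and maximum number of words per sentence in a text file. I have finished three of the five parts; the remaining two ask for the minimum and the maximum, and I cannot come up with a solution for them. Here is my code. Thanks for your help.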

lines, blanklines, sentences, words = 0, 0, 0, 0
print '-' * 50 
full_text = 'input.txt' 
empty_text = 'output.txt' 

text_file = open(full_text, 'r') 
out_file = open(empty_text, "w") 


for line in text_file:
    print line
    lines += 1

    if line.startswith('\n'):
        blanklines += 1
    else:
        # assume that each sentence ends with . or ! or ?
        # so simply count these characters
        sentences += line.count('.') + line.count('!') + line.count('?')

        # split() with no argument splits at any run of whitespace,
        # so for instance a double space counts as one space
        # add this line's words to the total word count
        words += len(line.split())

# guard against an input file that contains no sentence terminators
average = float(words) / sentences if sentences else 0.0



text_file.close() 
out_file.close() 

######## T E S T P R O G R A M ######## 

print 
print '-' * 50 
print "Total number of sentences in the input file : ", sentences 
print "Total number of words in the input file  : ", words 
print "Average number of words per sentence   : ", average 

I suggest you take a look at https://docs.python.org/2.7/library/re.html#re.split – 2014-09-03 05:41:38
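
To illustrate the suggestion in this comment, here is a minimal sketch of re.split breaking text into sentences. The sample string and the helper name split_sentences are made up for illustration, and treating ., ! and ? as the only sentence terminators is the same simplifying assumption the question makes.

```python
import re

def split_sentences(text):
    # Split on runs of '.', '!' or '?'. re.split leaves an empty or
    # whitespace-only piece after the final terminator, so drop those.
    pieces = re.split(r"[.!?]+", text)
    return [p.strip() for p in pieces if p.strip()]

sample = "Hello world. How are you today? Fine!"
print(split_sentences(sample))
```

Abbreviations such as "e.g." or decimal numbers would be split as well, so this is only a rough approximation of real sentence boundaries.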

Answers


You can use a regex to find the words, like this:

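import re

thefilepath = 'input.txt'  # path to the input file (name taken from the question)

# read the whole file so that sentences spanning several lines stay intact
with open(thefilepath) as f:
    text = f.read()

# every word in the file
words = re.findall(r"[\w'-]+", text)
# split on periods, dropping empty pieces such as the one after the last period
sentences = [s for s in re.split(r"\.", text) if s.strip()]

summ = 0
for s in sentences:
    words_in_sent = re.findall(r"[\w'-]+", s)
    summ += len(words_in_sent)

print "Total number of sentences in the input file: {0}\nTotal number of words in the input file: {1}\nAverage number of words per sentence: {2}".format(len(sentences), len(words), float(summ) / len(sentences))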
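
The totals and the average still do not give the minimum and maximum the question asks for. One possible approach, sketched here under the question's own assumption that sentences end with ., ! or ? and using the same [\w'-]+ word pattern, is to keep the word count of each sentence in a list and apply the built-in min() and max() to it. The function name words_per_sentence and the sample text are hypothetical; with a real file the text would come from reading input.txt. The print calls use a single argument so the sketch runs under Python 2.7 and Python 3 alike.

```python
import re

def words_per_sentence(text):
    """Return a list with the number of words in each sentence of text."""
    counts = []
    for sentence in re.split(r"[.!?]+", text):
        n = len(re.findall(r"[\w'-]+", sentence))
        if n:  # skip empty pieces, e.g. the one after the last terminator
            counts.append(n)
    return counts

sample = "This is short. This one is a little bit longer! Why?"
counts = words_per_sentence(sample)

print("Words per sentence: %s" % counts)
print("Minimum words in a sentence: %d" % min(counts))
print("Maximum words in a sentence: %d" % max(counts))
print("Average words per sentence: %.2f" % (float(sum(counts)) / len(counts)))
```

Note that min() and max() raise ValueError on an empty list, so a file without any words needs a check before these calls.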

Use collections.Counter, a data type designed for this kind of counting:

>>> from collections import Counter 
>>> lines=""" 
... foo bar baz hello world foo 
... a b c z d 
... 0 foo 1 bar""" 
>>> counter = Counter() 
>>> 
>>> for line in lines.split("\n"): 
...  counter.update(line.split()) 
... 
>>> print counter.most_common(1) #print max 
[('foo', 3)] 
>>> print counter.most_common()[-1] #print min 
('hello', 1) 
>>> print len(list(counter.elements())) #print total words 
15 