2016-07-31 78 views
1

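Using the count method to count a word in a text file

I am trying to count how many times the word 'the' appears in two books saved as text files. The code I am running returns zero for each book.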

Here is my code:

def word_count(filename):
    """Count specified words in a text"""
    try:
        with open(filename) as f_obj:
            contents = f_obj.readlines()
            for line in contents:
                word_count = line.lower().count('the')
            print(word_count)

    except FileNotFoundError:
        msg = "Sorry, the file you entered, " + filename + ", could not be found."
    print(msg)

dracula = 'C:\\Users\\HP\\Desktop\\Programming\\Python\\Python Crash Course\\TEXT files\\dracula.txt'
siddhartha = 'C:\\Users\\HP\\Desktop\\Programming\\Python\\Python Crash Course\\TEXT files\\siddhartha.txt'

word_count(dracula)
word_count(siddhartha)

What exactly am I doing wrong here?

+0

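No. I tried using your line with the increment, but I have to assign word_count before I can increment it. So I added a second line that increments word_count by itself, and it still gives me zero for both books.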

Answers

1

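Unless the word 'the' appears on the last line of each file, you will see zero, because word_count is overwritten on every pass through the loop and only the count for the final line survives.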
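To make the difference concrete, here is a minimal sketch that uses a hypothetical three-line sample in place of a book (the lines list and the variable names last_only and total are illustrative, not from the question). Plain assignment keeps only the last line's count, while a running total adds up every line.

```python
# Hypothetical sample text standing in for the lines of a book.
lines = [
    "The cat sat on the mat.\n",
    "Then the dog barked.\n",
    "Nothing else happened.\n",
]

# Plain assignment: each pass overwrites the previous value,
# so only the last line's count survives the loop.
last_only = 0
for line in lines:
    last_only = line.lower().count('the')

# Running total: each line's count is added to what came before.
total = 0
for line in lines:
    total += line.lower().count('the')

print(last_only)  # 0, because the last line contains no 'the'
print(total)      # 4, which includes the 'the' inside 'Then'
```

Note that count() matches substrings, which is why 'Then' contributes to the total here.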

You probably want to initialize the variable word_count to zero, then use augmented addition (+=):

For example:

def word_count(filename):
    """Count specified words in a text"""
    try:
        word_count = 0                                 # <- change #1 here
        with open(filename) as f_obj:
            contents = f_obj.readlines()
            for line in contents:
                word_count += line.lower().count('the')  # <- change #2 here
            print(word_count)

    except FileNotFoundError:
        msg = "Sorry, the file you entered, " + filename + ", could not be found."
        print(msg)  # kept inside the except block, where msg is defined

dracula = 'C:\\Users\\HP\\Desktop\\Programming\\Python\\Python Crash Course\\TEXT files\\dracula.txt'
siddhartha = 'C:\\Users\\HP\\Desktop\\Programming\\Python\\Python Crash Course\\TEXT files\\siddhartha.txt'

word_count(dracula)
word_count(siddhartha)

Augmented addition is not required, just convenient. This line:

word_count += line.lower().count('the') 

could be written as

word_count = word_count + line.lower().count('the') 

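You also do not need to read all the lines into memory at once. You can iterate over the lines directly from the file object. For example: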

def word_count(filename):
    """Count specified words in a text"""
    try:
        word_count = 0
        with open(filename) as f_obj:
            for line in f_obj:                         # <- change here
                word_count += line.lower().count('the')
        print(word_count)

    except FileNotFoundError:
        msg = "Sorry, the file you entered, " + filename + ", could not be found."
        print(msg)

dracula = 'C:\\Users\\HP\\Desktop\\Programming\\Python\\Python Crash Course\\TEXT files\\dracula.txt'
siddhartha = 'C:\\Users\\HP\\Desktop\\Programming\\Python\\Python Crash Course\\TEXT files\\siddhartha.txt'

word_count(dracula)
word_count(siddhartha)
+0

Thanks jedwards .... that worked :)

3

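You are reassigning word_count on every iteration. That means that at the end it equals the number of occurrences of 'the' in the last line of the file. You want the sum instead. Another thing: should 'there' match? Probably not. You may want to use line.split(). Also, you can iterate over the file object directly; there is no need for .readlines(). Finally, a generator expression simplifies things. My first example is without a generator expression; the second is with one: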

def word_count(filename):
    with open(filename) as f_obj:
        total = 0
        for line in f_obj:
            total += line.lower().split().count('the')
        print(total)

def word_count(filename):
    with open(filename) as f_obj:
        total = sum(line.lower().split().count('the') for line in f_obj)
        print(total)
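
One caveat with split(): it separates on whitespace only, so a token with punctuation attached, such as 'the.' or '"the"', is not equal to 'the' and goes uncounted. A possible alternative, which is not part of the answer above, is a regular expression with word boundaries. The sketch below is only an illustration; the helper name count_word and the sample string are made up for the example.

```python
import re

def count_word(text, word):
    """Count whole-word, case-insensitive occurrences of word in text."""
    # \b matches a word boundary, so 'there' and 'other' are not counted;
    # re.escape guards against regex metacharacters in the word.
    pattern = re.compile(r'\b' + re.escape(word) + r'\b', re.IGNORECASE)
    return len(pattern.findall(text))

# Hypothetical sample with substrings and attached punctuation.
sample = 'The end, there: "the" other the.'

print(sample.lower().count('the'))          # 5: substring matches
print(sample.lower().split().count('the'))  # 1: whitespace tokens only
print(count_word(sample, 'the'))            # 3: whole words
```

To apply this to a file, the same helper could be called on each line inside the loop and the results summed, just as in the two versions above.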
0
import os

def word_count(filename):
    """Count specified words in a text"""
    if os.path.exists(filename):
        if not os.path.isdir(filename):
            with open(filename) as f_obj:
                # Count the substring 'the' (not the single letter 't')
                # across the whole file, case-insensitively.
                print(f_obj.read().lower().count('the'))
        else:
            print("is path to folder, not to file '%s'" % filename)
    else:
        print("path not found '%s'" % filename)