Using the count method to count a certain word in a text file

I am trying to count the number of times the word 'the' appears in two books saved as text files. The code I am running returns zero for each book.

Here is my code:

def word_count(filename):
    """Count specified words in a text"""
    try:
        with open(filename) as f_obj:
            contents = f_obj.readlines()
            for line in contents:
                word_count = line.lower().count('the')
            print(word_count)

    except FileNotFoundError:
        msg = "Sorry, the file you entered, " + filename + ", could not be found."
        print(msg)

dracula = 'C:\\Users\\HP\\Desktop\\Programming\\Python\\Python Crash Course\\TEXT files\\dracula.txt'
siddhartha = 'C:\\Users\\HP\\Desktop\\Programming\\Python\\Python Crash Course\\TEXT files\\siddhartha.txt'

word_count(dracula)
word_count(siddhartha)

What exactly am I doing wrong here?

Source

2016-07-31 David Agabi

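No. I tried using your line to increment, but I have to assign word_count before I can increment it. So I added a second line that increments word_count by itself, and it still gives me zero for both books.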

Unless the word 'the' appears on the last line of each file, you will see zero.
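To see why, here is a minimal sketch that uses a list of made-up lines in place of a real file (the sample text is an assumption for illustration only). Plain assignment keeps just the count from the last line, while accumulating keeps the running total:

```python
# Hypothetical sample lines standing in for the contents of a file.
lines = ["The cat sat on the mat.\n", "Then the dog left.\n", "Goodbye.\n"]

# Reassignment: every iteration overwrites the previous value.
last_only = 0
for line in lines:
    last_only = line.lower().count('the')

# Accumulation: every iteration adds to the running total.
total = 0
for line in lines:
    total += line.lower().count('the')

print(last_only)  # 0, because 'the' does not occur in the last line
print(total)      # 4, which includes the 'the' inside 'Then' (substring match)
```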

You probably want to initialize the variable word_count to zero, then use augmented addition (+=):

For example:

def word_count(filename):
    """Count specified words in a text"""
    try:
        word_count = 0          # <- change #1 here
        with open(filename) as f_obj:
            contents = f_obj.readlines()
            for line in contents:
                word_count += line.lower().count('the')  # <- change #2 here
            print(word_count)

    except FileNotFoundError:
        msg = "Sorry, the file you entered, " + filename + ", could not be found."
        print(msg)

dracula = 'C:\\Users\\HP\\Desktop\\Programming\\Python\\Python Crash Course\\TEXT files\\dracula.txt'
siddhartha = 'C:\\Users\\HP\\Desktop\\Programming\\Python\\Python Crash Course\\TEXT files\\siddhartha.txt'

word_count(dracula)
word_count(siddhartha)

The augmented addition is not required, just convenient. This line:

word_count += line.lower().count('the')

can be written as

word_count = word_count + line.lower().count('the')

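But you also don't need to read all the lines into memory at once. You can iterate over the lines directly from the file object. For example: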

def word_count(filename):
    """Count specified words in a text"""
    try:
        word_count = 0
        with open(filename) as f_obj:
            for line in f_obj:      # <- change here
                word_count += line.lower().count('the')
        print(word_count)

    except FileNotFoundError:
        msg = "Sorry, the file you entered, " + filename + ", could not be found."
        print(msg)

dracula = 'C:\\Users\\HP\\Desktop\\Programming\\Python\\Python Crash Course\\TEXT files\\dracula.txt'
siddhartha = 'C:\\Users\\HP\\Desktop\\Programming\\Python\\Python Crash Course\\TEXT files\\siddhartha.txt'

word_count(dracula)
word_count(siddhartha)
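If you want to reuse the result rather than only print it, one possible variation is to return the total. This is a sketch and not part of the code above: the name count_the, the throwaway sample file, and the choice to return None for a missing file are all assumptions made for illustration.

```python
import os
import tempfile

def count_the(filename):
    """Return how often the substring 'the' occurs in the file, or None if the file is missing."""
    try:
        total = 0
        with open(filename) as f_obj:
            for line in f_obj:  # iterate lazily, one line at a time
                total += line.lower().count('the')
        return total
    except FileNotFoundError:
        return None

# Usage with a temporary file (hypothetical sample text).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    with open(path, "w") as f:
        f.write("The first line.\nNothing here.\nAnd the end.\n")
    print(count_the(path))                              # 2
    print(count_the(os.path.join(tmp, "missing.txt")))  # None
```

Returning the value keeps the counting separate from the reporting, so the caller decides whether to print it, compare books, or handle a missing file.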

Source

2016-07-31 02:06:20 jedwards

Thanks jedwards .... that worked :)

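You are reassigning word_count on every iteration, so at the end it holds only the number of occurrences of 'the' in the last line of the file. You should be computing the sum. Another thing: should 'there' match? Probably not, so you probably want to use line.split(). Also, you can iterate over the file object directly; there is no need for .readlines(). Finally, a generator expression simplifies things. My first example is without the generator expression; the second uses it: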

def word_count(filename):
    with open(filename) as f_obj:
        total = 0
        for line in f_obj:
            total += line.lower().split().count('the')
        print(total)

def word_count(filename):
    with open(filename) as f_obj:
        total = sum(line.lower().split().count('the') for line in f_obj)
        print(total)
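One caveat with split(): it separates on whitespace only, so tokens such as 'the,' or 'the.' keep their punctuation and are not counted. If that matters, a regular expression with word boundaries is one option. The sketch below rests on an assumption about what should count as a word, and the function name count_word and the sample sentence are hypothetical:

```python
import re

def count_word(text, word):
    """Count case-insensitive whole-word occurrences of word in text."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return len(pattern.findall(text))

sample = "The end, the; there and then the."

print(sample.lower().count('the'))          # 5: substring matches, including 'there' and 'then'
print(sample.lower().split().count('the'))  # 1: 'the;' and 'the.' are different tokens
print(count_word(sample, 'the'))            # 3: whole words, adjacent punctuation ignored
```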

Source

2016-07-31 02:09:22 zondo

import os
def word_count(filename):
    """Count specified words in a text"""
    if os.path.exists(filename):
        if not os.path.isdir(filename):
            with open(filename) as f_obj:
                # Read the whole file at once and count the substring 'the'
                print(f_obj.read().lower().count('the'))
        else:
            print("is path to folder, not to file '%s'" % filename)
    else:
        print("path not found '%s'" % filename)

Source

2016-07-31 02:10:04 andreytata
