How to read files and calculate a specific value

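How can I find out how many keywords from one file also appear in another file? I have a file containing a list of words, and I'm trying to figure out whether those words occur in another file.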

I have a file containing keywords (keywords.txt), and I'm trying to find out whether another file (tweets.txt), which contains sentences, includes any of those keywords.

def main():
    done = False
    while not done:
        try:
            keywords = input("Enter the filename titled keywords: ")
            with open(keywords, "r") as words:
                done = True
        except IOError:
            print("Error: file not found.")

total = 0
try:
    tweets = input("Enter the file Name titled tweets: ")
    with open(tweets, 'r') as tweets:
except IOError:
    print("Error: file not found.")

def sentiment_of_msg(msg_words_counter):
    summary = 0
    for line in tweets:
        if happy_dict in line:
            summary += 10 * **The number of keywords in the sentence of the file**
        elif veryUnhappy_dict in line:
            summary += 1 * quantity
        elif neutral_dict in line:
            summary += 5 * quantity
    return summary

Source

2016-11-09 HelloWorld4382

First read the text from the files and store it. Right now you open the files but then do nothing with them. The calculations come after that. – furas

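Nobody wants to do your homework for you, for many reasons. Ask a specific question that addresses one part of your problem. Right now you're not even close. What happens after "with open(tweets, 'r') as tweets:"? –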

@AlexHall If you're not going to offer any suggestions or help, I'd appreciate it if you didn't comment. Thanks! – HelloWorld4382

I sense this is homework, so the best I can do is give you an outline of a solution.

If you can afford to load the files into memory:

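Load keywords.txt, read its lines, split them into tokens, and build a set from them. You now have a data structure that supports fast membership queries (i.e. you can ask if token in set and get an answer in constant time on average).
Load the tweets file the same way you loaded the keywords, and read its contents line by line (or in whatever format they come in). You may need to do some preprocessing (strip whitespace, replace unnecessary characters, remove invalid words, commas, etc.). For each line, split it to get the words of each tweet, and check whether any of those words is in the keyword set.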

The pseudocode looks something like this:

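import string

def preprocess(line):
    # Placeholder for your custom logic; here: lowercase and drop punctuation.
    return line.lower().translate(str.maketrans("", "", string.punctuation))

def do_stuff(word):
    # Placeholder for your custom logic.
    print("found keyword:", word)

# Build the keyword set (file names taken from the question).
keywords_set = set()
with open("keywords.txt") as file:
    for line in file:
        for word in line.split():
            keywords_set.add(word.lower())

# Check every word of every tweet against the set.
with open("tweets.txt") as file:
    for line in file:
        for item in preprocess(line).split():
            if item in keywords_set:
                do_stuff(item)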

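If you need the frequency of each keyword, build a dictionary of the form {key: key_frequency}. Or take a look at Counter and think about how it could solve your problem.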
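As a rough sketch of the Counter idea, assuming whitespace tokenization and lowercase keywords (the function name keyword_frequencies and the sample data are made up for illustration, not taken from the question):

```python
from collections import Counter

def keyword_frequencies(tweet_lines, keywords):
    """Count how often each keyword occurs across the given lines."""
    counts = Counter()
    for line in tweet_lines:
        # Naive tokenization (an assumption): lowercase, split on whitespace.
        for word in line.lower().split():
            if word in keywords:
                counts[word] += 1
    return counts

# Hypothetical sample data standing in for keywords.txt and tweets.txt.
keywords = {"happy", "sad"}
tweets = ["So happy today, happy happy", "Feeling sad", "Nothing here"]
freq = keyword_frequencies(tweets, keywords)
print(freq)
```

A Counter returns 0 for keys it has never seen, so freq["angry"] is safe to read without first checking whether the key exists.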

If you cannot load the tweets file into memory, consider a lazy solution that reads from the file using a generator.
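One possible shape for such a lazy reader is sketched below. The name matching_lines is hypothetical, and the tokenization (lowercase, split on whitespace) is an assumption; the point is only that the file is consumed one line at a time instead of being read in full.

```python
def matching_lines(path, keywords):
    """Lazily yield (line, hits) for every line that contains a keyword."""
    with open(path, "r", encoding="utf-8") as f:
        # A file object is itself a lazy iterator over its lines,
        # so only one line is held in memory at a time.
        for line in f:
            hits = sum(1 for word in line.lower().split() if word in keywords)
            if hits:
                yield line.rstrip("\n"), hits
```

It could be used as "for line, hits in matching_lines('tweets.txt', keywords_set): ...", where hits is the number of keyword occurrences in that line, which is the quantity the question wants to multiply by.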

Source

2016-11-10 08:15:50 themistoklik

Thanks! It's supposed to prompt the user for the file names, which is why I ask for them. I'll take your suggestions into account! – HelloWorld4382
