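How do I read a CSV file line by line and write each line to a new line of a new CSV file?

I am new to Python. I am trying to read a CSV file, remove the stop words from it, and store the result in a new CSV file. My code does remove the stop words, but it copies the first line onto a single line of the output once for every line in the input. (For example, if the file has three lines, the first line appears three times on the first line of the output.) How can I read a CSV file line by line and write each line to a new line of a new CSV file?

From my analysis, I think the problem is in the loop, but I can't figure it out. My code is attached below.

Code: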

import nltk 
import csv 
from nltk.corpus import stopwords 
from nltk.tokenize import word_tokenize 

def stop_Words(fileName,fileName_out):
    file_out=open(fileName_out,'w')
    with open(fileName,'r') as myfile:
        line=myfile.readline()
        stop_words=set(stopwords.words("english"))
        words=word_tokenize(line)
        filtered_sentence=[" "]
        for w in myfile:
            for n in words:
                if n not in stop_words:
                    filtered_sentence.append(' '+n)
        file_out.writelines(filtered_sentence)
    print "All Done SW"

stop_Words("A_Nehra_updated.csv","A_Nehra_final.csv") 
print "all done :)"

Source

2016-06-08 SmartF

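This isn't very clear; you should show an example of the input, the current output, and the expected output. – polku

You are only reading the first line of the file: line=myfile.readline(). You want to iterate over every line in the file. One way to do this is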

with open(fileName,'r') as myfile:
    for line in myfile:
        # the rest of your code here, i.e.:
        stop_words=set(stopwords.words("english"))
        words=word_tokenize(line)

Also, you have this loop

for w in myfile:
    for n in words:
        if n not in stop_words:
            filtered_sentence.append(' '+n)

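But notice that w, the variable defined by the outermost loop, is never used inside the loop. You should be able to remove that loop and just write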

for n in words:
    if n not in stop_words:
        filtered_sentence.append(' '+n)
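
To see the per-line behaviour in isolation, here is a minimal sketch that runs without NLTK. The small STOP_WORDS set, the whitespace tokenization via str.split(), and the filter_lines helper are stand-ins invented for illustration, not part of the question's code. The point is only that tokenizing inside the loop handles each line separately.

```python
import io

# Hypothetical stand-in for stopwords.words("english")
STOP_WORDS = {"the", "is", "a", "on"}

def filter_lines(infile):
    """Return one filtered string per input line."""
    results = []
    for line in infile:        # iterating a file object yields each line in turn
        words = line.split()   # tokenize *this* line, inside the loop
        kept = [w for w in words if w.lower() not in STOP_WORDS]
        results.append(" ".join(kept))
    return results

# An in-memory file stands in for the CSV on disk
sample = io.StringIO("the cat is on a mat\nthe dog is loud\n")
filtered = filter_lines(sample)
print(filtered)  # ['cat mat', 'dog loud']
```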

Edit:

import nltk
import csv
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

def stop_Words(fileName,fileName_out):
    # Build the stop-word set once instead of once per line.
    stop_words=set(stopwords.words("english"))
    # Open both files in one with statement so both are closed on exit.
    with open(fileName,'r') as myfile, open(fileName_out,'w') as file_out:
        for line in myfile:
            words=word_tokenize(line)
            # Keep only the tokens that are not stop words.
            filtered_sentence=[n for n in words if n not in stop_words]
            # Write one output line per input line, words separated by spaces.
            file_out.write(" ".join(filtered_sentence)+"\n")
    print("All Done SW")
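
Since the input is a CSV file, it may also be worth letting the csv module handle the row boundaries. The following is only a sketch under some assumptions: it targets Python 3 (the newline="" argument to open), it uses a small hard-coded STOP_WORDS set and str.split() in place of NLTK, and the function name remove_stop_words_csv is made up for this example.

```python
import csv

# Hypothetical stand-in for stopwords.words("english")
STOP_WORDS = {"the", "is", "a", "on"}

def remove_stop_words_csv(in_path, out_path, stop_words=STOP_WORDS):
    """Copy a CSV file row by row, dropping stop words from every cell."""
    # newline="" lets the csv module control line endings (Python 3)
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        for row in reader:
            cleaned = [
                " ".join(w for w in cell.split() if w.lower() not in stop_words)
                for cell in row
            ]
            # writerow() ends each row with a line terminator,
            # so every input row lands on its own output line
            writer.writerow(cleaned)
```

With this approach there is no need to append "\n" by hand, and cells that contain commas or quotes are escaped by the writer.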

Source

2016-06-08 16:04:20 Greg

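for line in myfile: line=myfile.readline() stop_words=set(stopwords.words("english")) words=word_tokenize(line) filtered_sentence=[" "] for n in words: if n not in stop_words: filtered_sentence.append(' '+n) file_out.writelines(filtered_sentence) I used this code. It gives the following error: line=myfile.readline() ValueError: Mixing iteration and read methods would lose data – SmartF

You don't need 'line = myfile.readline()'. Replace it with 'for line in myfile'. – Greg

Thanks a lot. One problem is solved. But it still stores all the data on a single line. I can't manage to append the '\n' character to the string. Any help, please? – SmartF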
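
On the single-line output: writelines() writes the strings it is given back to back and adds no separators of its own, so a newline has to be written explicitly. A small sketch with in-memory buffers (the filtered_rows data is made up for illustration):

```python
import io

# Hypothetical tokens left over after stop-word removal, one list per input line
filtered_rows = [["cat", "mat"], ["dog", "loud"]]

# writelines() concatenates; nothing separates words or lines
joined = io.StringIO()
for words in filtered_rows:
    joined.writelines(words)

# Joining with spaces and appending "\n" gives one output line per input line
separated = io.StringIO()
for words in filtered_rows:
    separated.write(" ".join(words) + "\n")

print(repr(joined.getvalue()))     # 'catmatdogloud'
print(repr(separated.getvalue()))  # 'cat mat\ndog loud\n'
```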
