Copy a column, add some text, and write it to a new CSV file

I want to write a script that copies column 2 from multiple CSV files in a folder, adds some text to it, and saves the result to a single CSV file.

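Here is what I want to do:

1) Get the data in column 2 from all the CSV files

2) Add the text "Hello" at the start and "welcome" at the end of each row

3) Write the data to a single file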

I tried doing this with pandas:

import os 
import pandas as pd 
dataframes = [pd.read_csv(p, index_col=2, header=None) for p in ('1.csv','2.csv','3.csv')] 
merged_dataframe = pd.concat(dataframes, axis=0) 
merged_dataframe.to_csv("all.csv", index=False)

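The problems are:

In the code above I have to list the file names manually, which is impractical. I need it to pick up all the CSV files (*.csv).
I need to use something like writer.writerow(("Hello"+r[1]+"welcome"))
There are multiple CSV files and each one has many rows (about 100k), so it needs to be fast.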

Here is a sample of the CSV files:

"1.csv"  "2.csv"   "3.csv" 
    a,Jac   b,William   c,James

Here is how I would like the output in all.csv to look:

Hello Jac welcome 
Hello William welcome 
Hello James welcome

Is there a solution using .merge(), .append(), or .concat()?

How can I do this in Python?

Source

2017-06-21 Nancy

Hi Nancy. You can get all the CSV files with the glob module like this: paths = glob.glob('foo/*.csv').
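The glob suggestion can be sketched as a small helper. This is only an illustration: the function name `list_csv_files` is hypothetical, and sorting the result is an added assumption so that the files are processed in a predictable order.

```python
import glob
import os


def list_csv_files(directory):
    """Return the sorted paths of all *.csv files directly inside `directory`."""
    # glob.glob makes no ordering guarantee, so sort for reproducible output.
    return sorted(glob.glob(os.path.join(directory, "*.csv")))
```

For example, `list_csv_files('foo')` would collect the same files as the pattern 'foo/*.csv'.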

You don't need pandas for this. Here is a solution using the csv module:

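import csv
import glob

# Stream each row straight to the output file so nothing is held in memory.
with open("path/to/output", 'w') as outfile:
    for fpath in glob.glob('path/to/directory/*.csv'):
        # newline='' is what the csv module recommends for files it reads
        with open(fpath, newline='') as infile:
            for row in csv.reader(infile):
                # row[1] is the second column of the current row
                outfile.write("Hello {} welcome\n".format(row[1]))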

Source

2017-06-21 17:46:11 inspectorG4dget

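Wouldn't pandas speed up the work? – Nancy

@Nancy: I can't say for sure, but I don't think pandas would speed this up "enough" for this application; you would still be bottlenecked by writing the output – inspectorG4dget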
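Since neither side is sure about the speed, one way to settle it is to time each approach on the real files. The helper below is a minimal sketch using `time.perf_counter` from the standard library; the name `time_call` is hypothetical, and a single run gives only a rough figure.

```python
import time


def time_call(func, *args, **kwargs):
    """Run `func` once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed
```

Wrapping the csv version and a pandas version in functions and passing each to `time_call` on the same folder would show whether the difference matters for about 100k rows per file.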

1) If you want to import all the .csv files in a folder, a very simple way is to use

import os
for i in [a for a in os.listdir() if a[-4:] == '.csv']:
    # code to read in .csv file and concatenate to existing dataframe
    pass

2) To add the text and write to a file, you can map a function over each element of column 2 of the dataframe.

#existing dataframe called df 
df[df.columns[1]].map(lambda x: "Hello {} welcome".format(x)).to_csv(<targetpath>) 
#replace <targetpath> with your target path

For all the parameters you can pass to to_csv, see http://pandas.pydata.org/pandas-docs/version/0.17.0/generated/pandas.Series.to_csv.html.
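Putting the two steps together might look like the sketch below. It rests on assumptions: the function name `merge_second_column` is hypothetical, the files have no header row (as in the sample in the question), and the folder contains at least one CSV file, since `pd.concat` raises an error on an empty list.

```python
import glob
import os

import pandas as pd


def merge_second_column(directory, output_path):
    """Wrap column 2 of every CSV in `directory` and write one line per row."""
    paths = sorted(glob.glob(os.path.join(directory, "*.csv")))
    # Skip the output file in case it lives in the same directory.
    out_abs = os.path.abspath(output_path)
    paths = [p for p in paths if os.path.abspath(p) != out_abs]
    # usecols=[1] reads only the second column; header=None treats the
    # first line of each file as data rather than as column names.
    frames = [pd.read_csv(p, header=None, usecols=[1], dtype=str) for p in paths]
    column = pd.concat(frames, ignore_index=True)[1]
    # Vectorised string concatenation instead of a per-element lambda.
    wrapped = "Hello " + column + " welcome"
    # index=False and header=False keep the output to the text alone.
    wrapped.to_csv(output_path, index=False, header=False)
    return wrapped
```

Passing `index=False` and `header=False` explicitly matters because the defaults of `Series.to_csv` differ between pandas versions.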

Source

2017-06-21 17:52:30 victor

Here is a non-pandas solution using the built-in csv module. I'm not sure about the speed.

import os
import csv

path_to_files = "path to files"
all_csv = os.path.join(path_to_files, "all.csv")
file_list = os.listdir(path_to_files)

names = []

for file in file_list:
    # Skip the output file so a rerun does not read its own results.
    if file.endswith(".csv") and file != "all.csv":
        path_to_current_file = os.path.join(path_to_files, file)

        with open(path_to_current_file, "r", newline="") as current_csv:
            reader = csv.reader(current_csv, delimiter=',')

            for row in reader:
                names.append(row[1])

with open(all_csv, "w", newline="") as out_csv:
    # Write to out_csv; current_csv is already closed at this point.
    writer = csv.writer(out_csv, delimiter=',')

    for name in names:
        writer.writerow(["Hello {} welcome".format(name)])

Source

2017-06-21 17:58:46 Hopeless
