Pandas: reading a CSV file in order to create a 3D array

First time posting here. My question is about how to read a CSV file in pandas with the goal of creating a 2D array that holds a matrix in each element.

For example, take this CSV file:

1,1,1;2,2,2;3,3,3 
1,1,1;2,2,2;3,3,3 
1,1,1;2,2,2;3,3,3

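Here each new line represents a separate matrix, each semicolon separates the rows within a matrix, and each comma separates the elements within a row.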

From this, I would like to obtain an array of this type:

[ 
    [[1,1,1],[2,2,2],[3,3,3]], 
    [[1,1,1],[2,2,2],[3,3,3]], 
    [[1,1,1],[2,2,2],[3,3,3]] 
]

Currently, when I use pandas.read_csv() on something like this, it does not treat the semicolon as a delimiter, so something like 1;2 is read as a string.

Thanks!

Source

2016-07-05 Jason

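You can use read_csv with the parameters sep=';' and header=None (if the CSV has no header). Then you need to apply str.split to each column, because the string functions only work on a Series (a column of the df):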

import pandas as pd 
import io 

temp=u"""1,1,1;2,2,2;3,3,3 
1,1,1;2,2,2;3,3,3 
1,1,1;2,2,2;3,3,3""" 
# after testing, replace io.StringIO(temp) with the filename
df = pd.read_csv(io.StringIO(temp), sep=";", header=None) 
print (df) 
       0      1      2
0  1,1,1  2,2,2  3,3,3
1  1,1,1  2,2,2  3,3,3
2  1,1,1  2,2,2  3,3,3

print (df.apply(lambda x: x.str.split(','))) 
           0          1          2
0  [1, 1, 1]  [2, 2, 2]  [3, 3, 3]
1  [1, 1, 1]  [2, 2, 2]  [3, 3, 3]
2  [1, 1, 1]  [2, 2, 2]  [3, 3, 3]

print (df.apply(lambda x: x.str.split(',')).values.tolist()) 
[[['1', '1', '1'], ['2', '2', '2'], ['3', '3', '3']], 
[['1', '1', '1'], ['2', '2', '2'], ['3', '3', '3']], 
[['1', '1', '1'], ['2', '2', '2'], ['3', '3', '3']]]

But if you need lists of ints:

import pandas as pd 
import io 

temp=u"""1,1,1;2,2,2;3,3,3 
1,1,1;2,2,2;3,3,3 
1,1,1;2,2,2;3,3,3""" 
# after testing, replace io.StringIO(temp) with the filename
df = pd.read_csv(io.StringIO(temp), sep=";", header=None) 
print (df) 
       0      1      2
0  1,1,1  2,2,2  3,3,3
1  1,1,1  2,2,2  3,3,3
2  1,1,1  2,2,2  3,3,3

for col in df.columns: 
    df[col] = df[col].str.split(',') 
    # convert the string numbers to int, if needed
    df[col] = [[int(y) for y in x] for x in df[col]]  

print (df.values.tolist()) 
[[[1, 1, 1], [2, 2, 2], [3, 3, 3]], 
[[1, 1, 1], [2, 2, 2], [3, 3, 3]], 
[[1, 1, 1], [2, 2, 2], [3, 3, 3]]]
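
Since the title asks for a 3D array, the nested list can also be turned into a NumPy array. This is only a minimal sketch: it assumes NumPy is installed and that every matrix in the file has the same dimensions. The name parse_cell is a hypothetical helper introduced here for illustration, not part of pandas.

```python
import io

import numpy as np
import pandas as pd

temp = u"""1,1,1;2,2,2;3,3,3
1,1,1;2,2,2;3,3,3
1,1,1;2,2,2;3,3,3"""


def parse_cell(cell):
    # hypothetical helper: turns the string "1,1,1" into [1, 1, 1]
    return [int(v) for v in cell.split(",")]


# after testing, replace io.StringIO(temp) with the filename
df = pd.read_csv(io.StringIO(temp), sep=";", header=None)

# one CSV line per matrix, one ';'-separated cell per matrix row
nested = [[parse_cell(cell) for cell in row] for row in df.values.tolist()]

# assumption: all matrices have identical dimensions; otherwise
# np.array cannot build a regular 3D array from the nested list
arr = np.array(nested)
print(arr.shape)
```

For the sample data this should print (3, 3, 3), and arr[0] is the first matrix. If the matrices can differ in size, keeping the plain nested list from df.values.tolist() is probably the safer choice.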

Source

2016-07-05 07:07:16 jezrael
