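Load and reshape multiple CSV files with pandas

I need help with the following problem: loading multiple CSV files and reshaping them with pandas.

I have multiple CSV files like the ones below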

1.csv

2.csv

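I want to reshape every 10 rows of the Length column and place the results from each file side by side. Below is an example of the output I want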

[[12 23 44 34 11] [[ 52.1 32.2 44.6 99.1 122.3] 
[39 79 45 56 15]] [ 43.2 79.4 45.5 56.3 15.4]] 
[[35 23 66 33 12] [[ 35.7 23.7 66.7 33.8 12.9] 
[34 21 43 44 55]] [ 34.8 21.6 43.7 44.2 55.8]]

I tried the script below, but it gives me a TypeError.

myscript.py

import pandas as pd 
import glob 

df = [pd.read_csv(filename) for filename in glob.glob("Users/Ling/workspace/testing/*.csv")] 

start = 0 
for i in range(0, len(df.index)): 
    if (i + 1)%10 == 0: 
     result = df['Length'].iloc[start:i+1].reshape(2,5) 
     start = i + 1 
     print result

Error

TypeError: object of type 'builtin_function_or_method' has no len()

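I don't understand the error. Should I put another for loop after start = 0 so that the program reads each file, or is there another way to solve this?

Thanks for your help.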

[UPDATE]

Following @cmaher's suggestion, I modified myscript.py like this

import pandas as pd 
import glob 

df = [pd.read_csv(filename) for filename in glob.glob("Users/Ling/workspace/testing/*.csv")] 

df = pd.concat(df) 
start = 0 
for i in range(0, len(df.index)): 
    if (i + 1)%10 == 0: 
     result = df['Length'].iloc[start:i+1].reshape(2,5) 
     start = i + 1 
     print result

The output looks like this

[[ 52.1 32.2 44.6 99.1 122.3] 
[ 43.2 79.4 45.5 56.3 15.4]] 
[[ 35.7 23.7 66.7 33.8 12.9] 
[ 34.8 21.6 43.7 44.2 55.8]] 
[[ 12. 23. 44. 34. 11.] 
[ 39. 79. 45. 56. 15.]] 
[[ 35. 23. 66. 33. 12.] 
[ 34. 21. 43. 44. 55.]]

This is different from what I expected. I want the results placed side by side, as in the desired output I showed above.

Source

2017-02-14 Ling

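It's telling you that 'df.index' is a function, and a function has no 'len()'. What does 'df' look like? What does 'df.index' look like? – Batman

@Batman The original script was actually like this: I loaded only one CSV file with 'df = pd.read_csv("1.csv")'. To answer your question, I think 'df.index' here means the list of all the data in 1.csv. – Ling

Do you need them this way just to inspect them visually in an interactive shell? Or is there a bigger reason you need them in this specific format? Do they need to be a list or a numpy array, or does that matter? – Jarad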

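As written, df is a list of DataFrames, not a DataFrame, so .index is a reference to the list method .index(). Before your for loop, just add df = pd.concat(df) (see http://pandas.pydata.org/pandas-docs/stable/generated/pandas.concat.html), a pandas function built for concatenating a sequence of pandas objects.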
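To illustrate the point, here is a small sketch that uses made-up in-memory DataFrames in place of the CSV files (the name frames and the sample values are assumptions for illustration, not taken from the question). On a plain list, .index is the built-in list method, so len() fails with the same TypeError; after pd.concat, .index is a row index and len() works.

```python
import pandas as pd

# Hypothetical stand-ins for the DataFrames read from 1.csv and 2.csv.
frames = [
    pd.DataFrame({"Length": [12, 23]}),
    pd.DataFrame({"Length": [52.1, 32.2]}),
]

# On a list, .index is the built-in method list.index, not a row index.
list_index_is_method = callable(frames.index)

try:
    len(frames.index)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# After concatenation there is a single DataFrame with a real row index.
combined = pd.concat(frames)
row_count = len(combined.index)

print(list_index_is_method)
print(error_message)
print(row_count)
```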

Edit: here is your code with the added step

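import glob
import pandas as pd

# Read every CSV into its own DataFrame; this gives a list of DataFrames.
df = [pd.read_csv(filename) for filename in glob.glob("Users/Ling/workspace/testing/*.csv")]

# Concatenate the list into one DataFrame so that df.index is a row index.
df = pd.concat(df)

start = 0
for i in range(0, len(df.index)):
    if (i + 1) % 10 == 0:
        # Take the next 10 values of Length and reshape them to 2 rows x 5 columns.
        result = df['Length'].iloc[start:i+1].values.reshape(2, 5)
        start = i + 1
        print(result)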
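Concatenating the files end to end stacks their rows, so the blocks from one file are printed after the blocks from the other rather than beside them. One possible way to get the side-by-side layout asked for in the question is to reshape each file on its own and then join the blocks along the last axis. The following is only a sketch under assumptions: the helper name blocks_side_by_side is hypothetical, the sample values are copied from the desired output in the question, and joining integer and float columns into one array turns the integers into floats.

```python
import numpy as np
import pandas as pd


def blocks_side_by_side(frames, column="Length", rows=2, cols=5):
    """Reshape `column` of each frame into (n_blocks, rows, cols) blocks
    and join the frames' blocks along the last axis."""
    per_file = []
    for frame in frames:
        values = np.asarray(frame[column])
        # Drop a trailing partial block, if any, so the reshape is valid.
        usable = len(values) - len(values) % (rows * cols)
        per_file.append(values[:usable].reshape(-1, rows, cols))
    # Keep only as many blocks as the shortest file provides.
    n_blocks = min(block.shape[0] for block in per_file)
    return np.concatenate([block[:n_blocks] for block in per_file], axis=2)


# Sample values taken from the desired output shown in the question.
first = pd.DataFrame({"Length": [12, 23, 44, 34, 11, 39, 79, 45, 56, 15,
                                 35, 23, 66, 33, 12, 34, 21, 43, 44, 55]})
second = pd.DataFrame({"Length": [52.1, 32.2, 44.6, 99.1, 122.3,
                                  43.2, 79.4, 45.5, 56.3, 15.4,
                                  35.7, 23.7, 66.7, 33.8, 12.9,
                                  34.8, 21.6, 43.7, 44.2, 55.8]})

side_by_side = blocks_side_by_side([first, second])
for block in side_by_side:
    print(block)
```

Each printed block has 2 rows and 10 columns: the first 5 columns come from the first file and the last 5 from the second.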

Source

2017-02-14 02:31:15 cmaher

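Can you elaborate on 'before your for loop, just add df = pd.concat(df)'? Do you mean putting 'df = pd.concat(df)' right before 'for i in range(0, len(df.index))'? – Ling

That's right - I've reposted your code with the extra step – cmaher

I did what you suggested, but it returned the following error: 'ValueError: No objects to concatenate' – Ling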
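pd.concat raises this ValueError when it is given an empty list, which happens here if glob.glob matches no files. The pattern "Users/Ling/workspace/testing/*.csv" has no leading slash, so it is resolved relative to the current working directory; that may be why nothing matched, though this is a guess. A minimal sketch of a loader that fails with a clearer message follows; the helper name load_csvs is hypothetical.

```python
import glob

import pandas as pd


def load_csvs(pattern):
    """Read every CSV matching `pattern` into one DataFrame."""
    paths = sorted(glob.glob(pattern))
    if not paths:
        # Fail early with the pattern in the message instead of letting
        # pd.concat complain about an empty list.
        raise FileNotFoundError("no files match pattern: " + pattern)
    return pd.concat([pd.read_csv(path) for path in paths], ignore_index=True)


# An empty list reproduces the error from the comment above.
try:
    pd.concat([])
    empty_concat_error = None
except ValueError as exc:
    empty_concat_error = str(exc)

print(empty_concat_error)
```

Printing glob.glob(pattern) before calling pd.concat is a quick way to confirm which files, if any, were actually found.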
