How to select the same columns of a numpy array based on the indices of a second array

I need to filter a DataFrame or numpy array so that it keeps only the columns that were selected in an earlier one, i.e. the same columns as in the first array.

Here is my approach:

Drop the zero-variance columns (the filter; the columns are selected on the first DF)

df_NN_70 = df_NN_70.loc[:, (df_NN_70 != df_NN_70.iloc[0]).any()]
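As a rough illustration of what this filter does, here is a minimal sketch on a made-up frame. The column names and values are invented for the example, and `.iloc[0]` is used because `.ix` no longer exists in recent pandas releases.

```python
import pandas as pd

# Toy data (hypothetical): column "b" is constant, so it carries no information.
df = pd.DataFrame({"a": [1, 2, 3], "b": [5, 5, 5], "c": [0, 0, 1]})

# Compare every row with the first row; a column is kept if any value differs.
keep = (df != df.iloc[0]).any()
df_nonconstant = df.loc[:, keep]

print(list(df_nonconstant.columns))
```

The boolean Series `keep` is indexed by column label, which is why it can be passed straight to `.loc` on the column axis.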

Sampling (a separate 70% of the data will be used as the train/test set)

df_NN = df_NN_70.sample(frac=0.7, replace=False, weights=None, random_state=seed, axis=None)

Convert to an array for the NN with Keras (it needs arrays)

df_NN_array = df_NN.to_numpy()

Split the data into input (X) and output (Y) variables

X = df_NN_array[:, 0:df_NN_array.shape[1]-1] 
Y = df_NN_array[:, 427] 

print(type(df_NN_70.columns)) 
index_list= list(df_NN_70.columns) 

index_list = index_list[0:427] 
print(index_list)
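The split above hard-codes the position 427. A minimal sketch of the same idea without the magic number, assuming the target is the last column (which the slicing in the question suggests but does not state outright); the frame and its names are made up:

```python
import pandas as pd

# Toy frame (hypothetical); the last column plays the role of the target.
df = pd.DataFrame({"f1": [1.0, 2.0], "f2": [3.0, 4.0], "target": [0.0, 1.0]})
arr = df.to_numpy()

X = arr[:, :-1]                         # every column except the last
Y = arr[:, -1]                          # the last column only
feature_names = list(df.columns[:-1])   # labels of the columns that went into X

print(X.shape, Y.shape, feature_names)
```

Keeping `feature_names` next to `X` preserves the column labels, which the plain array no longer carries.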

Filter the second frame, df_NN_x, down to the same columns using the list

filter_columns = index_list 
df_filtered = np.array(df_NN_x)[filter_columns] 
df_filtered.shape

This does filter, but it does not work as intended: it treats index_list as row indices of the second array df_NN_x, not as column indices!
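The row-versus-column behaviour can be reproduced on a tiny array. This is only a sketch with integer positions; the array and the index list are invented for illustration:

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)   # 3 rows, 4 columns
cols = [0, 2]                       # hypothetical positions to keep

rows_picked = arr[cols]      # a single index list selects along axis 0 (rows)
cols_picked = arr[:, cols]   # ":" keeps all rows; the list selects columns

print(rows_picked.shape)  # (2, 4)
print(cols_picked.shape)  # (3, 2)
```

Note that a plain numpy array only understands integer positions, so a list of column labels taken from `df.columns` has to be translated to positions before it can be used this way.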


2017-07-04 Mauro Nogueira

You need to filter on the second dimension, for example:

df_filtered = df_NN_x.loc[:, filter_columns]

Or:

# Positional alternative: filter_columns must hold integer positions here.
df_filtered = df_NN_x[df_NN_x.columns[filter_columns]]

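Note that the two versions differ once you pass a slice instead of a list: a label slice in the first version includes its end label, while a positional slice in the second does not include its end position.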

Edit: my answer was written for a numpy array; updated for pandas.
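For completeness, a minimal sketch covering both cases: selecting by label while the data is still a DataFrame, and translating labels to positions when only the array is indexed. The two frames are hypothetical stand-ins for the ones in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the first (already filtered) and second frames.
df_first = pd.DataFrame({"a": [1, 2], "c": [3, 4]})
df_second = pd.DataFrame({"a": [10, 20], "b": [30, 40], "c": [50, 60]})

filter_columns = list(df_first.columns)   # column labels, not positions

# Option 1: select by label while still a DataFrame, then convert.
filtered = df_second.loc[:, filter_columns].to_numpy()

# Option 2: translate labels to integer positions, then index the array's columns.
positions = df_second.columns.get_indexer(filter_columns)   # -1 marks a missing label
filtered_from_array = df_second.to_numpy()[:, positions]

print(filtered.shape)
```

Option 1 is usually the simpler choice, since `.loc` raises an error for a missing label, whereas a `-1` from `get_indexer` would silently pick the last column.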


2017-07-04 11:59:52 Nyps

Thanks @Nyps. The first line of code worked for me with a pandas df.
