2016-09-29 79 views
2

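sklearn fit_predict does not accept a 2-dimensional numpy array

I want to run some cluster analysis using three different clustering algorithms. I load my data from standard input as follows: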

import sys

import numpy
import sklearn.cluster as cluster

X = [] 
for line in sys.stdin: 
    x1, x2 = line.strip().split() 
    X.append([float(x1), float(x2)]) 
X = numpy.array(X) 

I then store the algorithm names and their parameters in a list, like this:

clustering_configs = [ 
    ### K-Means 
    ['KMeans', {'n_clusters' : 5}], 
    ### Ward 
    ['AgglomerativeClustering', { 
       'n_clusters' : 5, 
       'linkage' : 'ward' 
       }], 
    ### DBSCAN 
    ['DBSCAN', {'eps' : 0.15}] 
] 

and I try to instantiate and run each of them in a for loop:

for alg_name, alg_params in clustering_configs: 

    class_ = getattr(cluster, alg_name) 
    instance_ = class_(alg_params) 

    instance_.fit_predict(X) 

Everything works except the instance_.fit_predict(X) call, which raises this error:

Traceback (most recent call last): 
    File "meta_cluster.py", line 47, in <module> 
    instance_.fit_predict(X) 
    File "/usr/local/lib/python2.7/dist-packages/scikit_learn-0.17.1-py2.7-linux-x86_64.egg/sklearn/cluster/k_means_.py", line 830, in fit_predict 
    return self.fit(X).labels_ 
    File "/usr/local/lib/python2.7/dist-packages/scikit_learn-0.17.1-py2.7-linux-x86_64.egg/sklearn/cluster/k_means_.py", line 812, in fit 
    X = self._check_fit_data(X) 
    File "/usr/local/lib/python2.7/dist-packages/scikit_learn-0.17.1-py2.7-linux-x86_64.egg/sklearn/cluster/k_means_.py", line 789, in _check_fit_data 
    X.shape[0], self.n_clusters)) 
TypeError: %d format: a number is required, not dict 

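Does anyone have a clue where I might be going wrong? I read the sklearn documentation, which says that all you need is an array-like or sparse matrix, shape=(n_samples, n_features), and I believe that is what I have.

Any suggestions? Thanks!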

Answer

2
class sklearn.cluster.KMeans(n_clusters=8, init='k-means++', n_init=10, max_iter=300, tol=0.0001, precompute_distances='auto', verbose=0, random_state=None, copy_x=True, n_jobs=1, algorithm='auto')[source] 

Given that signature, the way you would call the KMeans class is

KMeans(n_clusters=5) 

With your current code, you are calling

KMeans({'n_clusters': 5}) 

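which passes alg_params as a single dict rather than as keyword arguments to the class. The dict ends up as the first positional argument, n_clusters, which is why the %d formatting in the traceback complains about receiving a dict instead of a number. The same applies to the other algorithms.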
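To make the difference concrete, here is a minimal sketch using a made-up stand-in class. ToyKMeans and its describe method are hypothetical and are not part of scikit-learn; they only imitate how a dict passed positionally lands in n_clusters and later breaks a %d format.

```python
class ToyKMeans:
    """Hypothetical stand-in for an estimator; NOT scikit-learn's KMeans."""

    def __init__(self, n_clusters=8, max_iter=300):
        self.n_clusters = n_clusters
        self.max_iter = max_iter

    def describe(self, n_samples):
        # %d needs a number, so a dict stored in n_clusters fails here,
        # similar in spirit to the check shown in the traceback.
        return "n_samples=%d, n_clusters=%d" % (n_samples, self.n_clusters)


params = {'n_clusters': 5}

wrong = ToyKMeans(params)    # the whole dict becomes n_clusters
right = ToyKMeans(**params)  # equivalent to ToyKMeans(n_clusters=5)

try:
    wrong.describe(100)
    wrong_failed = False
except TypeError:
    wrong_failed = True

summary = right.describe(100)
```

The only change between the two calls is the ** in front of the dict, which tells Python to unpack the keys as keyword argument names.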

+0

Is there a simple way to convert these values from the dict into the required format? – wKavey

+2

@wKavey: 'KMeans(**{'n_clusters': 5})' –

+0

So in my example that would be 'instance_ = class_(**alg_params)'? – wKavey
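That reading looks consistent with the answer above. As a rough sketch of the loop with that one change, assuming numpy and scikit-learn are installed, and using made-up random points in place of the stdin data (the labels_by_algorithm dict is only added here to keep the results):

```python
import numpy
import sklearn.cluster as cluster

# Made-up data standing in for the stdin input: 60 points in 2-D.
rng = numpy.random.RandomState(0)
X = rng.rand(60, 2)

clustering_configs = [
    ['KMeans', {'n_clusters': 5}],
    ['AgglomerativeClustering', {'n_clusters': 5, 'linkage': 'ward'}],
    ['DBSCAN', {'eps': 0.15}],
]

labels_by_algorithm = {}
for alg_name, alg_params in clustering_configs:
    class_ = getattr(cluster, alg_name)
    # ** unpacks the dict into keyword arguments, e.g. KMeans(n_clusters=5).
    instance_ = class_(**alg_params)
    labels_by_algorithm[alg_name] = instance_.fit_predict(X)
```

Each entry in labels_by_algorithm should then hold one cluster label per row of X.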