Spectral clustering by constructing a similarity matrix with the Jaccard coefficient

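I have a categorical dataset on which I am performing spectral clustering, but I am not getting very good output. I choose the eigenvectors corresponding to the largest eigenvalues as the centroids for k-means.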

Please find below the process I follow:

1. Create a symmetric similarity matrix (m*m) using the Jaccard coefficient.
    For example, for a data set
    a,b,c,d
    a,b,x,y
    the similarity matrix I compute would look like:
    |1     0.33|
    |0.33  1   |
2. Compute the first k eigenvectors corresponding to the largest eigenvalues, where k is the number of clusters.
3. Normalize the symmetric similarity matrix.
4. Perform the clustering on the normalized similarity matrix, using the eigenvectors as initial centroids for k-means.
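The similarity matrix from step 1 can be sketched in plain Python as follows. This is only an illustration of the Jaccard coefficient on the two example records above; the function names `jaccard` and `similarity_matrix` are made up for this sketch, and treating two empty records as identical is an assumption.

```python
def jaccard(a, b):
    """Jaccard coefficient |A n B| / |A u B| of two collections of items."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0  # assumption: two empty records count as identical
    return len(a & b) / len(union)


def similarity_matrix(records):
    """Symmetric m*m matrix of pairwise Jaccard coefficients."""
    m = len(records)
    sim = [[1.0] * m for _ in range(m)]  # diagonal: each record matches itself
    for i in range(m):
        for j in range(i + 1, m):
            sim[i][j] = sim[j][i] = jaccard(records[i], records[j])
    return sim


records = [["a", "b", "c", "d"], ["a", "b", "x", "y"]]
S = similarity_matrix(records)
for row in S:
    print([round(v, 2) for v in row])
```

The two records share {a, b} out of six distinct items, so the off-diagonal entry is 2/6, which rounds to the 0.33 shown in the example.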

My questions are:

Is computing a Jaccard similarity matrix the right choice for spectral clustering?

Is selecting eigenvectors as cluster centroids the right approach for spectral clustering? I ask because I don't see other options for a categorical dataset.

Is there anything wrong with the procedure I follow?

Source

2015-06-10 Sam

From what I can tell, you have mixed and shuffled a number of approaches. No wonder it doesn't work...

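You could simply use Jaccard distance (a simple inversion of Jaccard similarity) + hierarchical clustering.
You could use MDS to project your data and then run k-means (probably what you are trying to do).
Affinity propagation and similar methods are also worth a try.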
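As a rough illustration of the first suggestion, here is a minimal pure-Python sketch of Jaccard distance (1 minus Jaccard similarity) combined with agglomerative clustering. The choice of average linkage, the function names, and the three extra records are assumptions made for this example, not something stated in the question. In practice a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` would normally be used instead of this hand-written loop.

```python
def jaccard_distance(a, b):
    """Jaccard distance = 1 - |A n B| / |A u B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # assumption: two empty records are at distance 0
    return 1.0 - len(a & b) / len(union)


def average_linkage(records, k):
    """Repeatedly merge the two closest clusters until k clusters remain."""
    m = len(records)
    dist = [[jaccard_distance(records[i], records[j]) for j in range(m)]
            for i in range(m)]
    clusters = [[i] for i in range(m)]  # start with one cluster per record

    def linkage(c1, c2):
        # average pairwise distance between the members of two clusters
        return sum(dist[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > k:
        pairs = ((p, q) for p in range(len(clusters))
                 for q in range(p + 1, len(clusters)))
        p, q = min(pairs, key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]))
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [sorted(c) for c in clusters]


# The first two records come from the question; the rest are hypothetical.
records = [
    ["a", "b", "c", "d"],
    ["a", "b", "x", "y"],
    ["a", "b", "c", "y"],
    ["p", "q", "r", "s"],
    ["p", "q", "r", "t"],
]
clusters = average_linkage(records, k=2)
print(clusters)  # [[0, 1, 2], [3, 4]]
```

Because the clustering works directly on pairwise distances, no means or centroids of categorical records are ever needed.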

Source

2015-06-10 20:49:46

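Thanks for the reply. I am just a beginner in the field of cluster analysis and am trying out different approaches. I need to ask one more thing: what is the benefit of creating a similarity matrix (m*m) using the Jaccard coefficient and then running k-means on that matrix? Is this a feasible approach? I tried it on some datasets (Congress, Mushroom) from http://archive.ics.uci.edu/ml/datasets.html, and it gave promising results. Thanks – Sam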

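k-means should be run on the original data. It presumes a linear, Euclidean vector space. **Don't run methods just because you can.** Understand the algorithm *and* the requirements and goals of your problem. If you can get them to align (which often takes a lot of preprocessing), then give it a try. –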
