2016-06-21 617 views

NbClust throws an error

I am trying to run k-means clustering on a data frame with 9 columns and 1064 rows, but I get the following error:

Error in NbClust(df, min.nc = 2, max.nc = 15, method = "kmeans") : The TSS matrix is indefinite. There must be too many missing values. The index cannot be calculated.

However, there are no missing values:

> dim(df) 
[1] 1064 9 

> sum(is.na(df)) 
[1] 0 

Any idea what the problem is and how to solve it?

> head(df) 
  hr_830 hr_930 hr_1030 hr_1130 hr_160 hr_180 hr_190 hr_200 hr_0
1      2      2       2       2      2      2      2      2    2
2      2      2       2       2      2      2      2      2    3
3      2      2       2       2      2      2      2      2    3
4      2      2       2       2      2      2      2      2    2
5      2      2       2       2      2      2      2      2    2
6      2      2       2       2      2      2      2      2    4

Here is a sample of the input:

> dput(input) 
structure(list(hr_830 = c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 4L, 
2L, 4L, 4L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L 
), hr_930 = c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 4L, 2L, 4L, 4L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), hr_1030 = c(2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 4L, 2L, 4L, 4L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), hr_1130 = c(2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 4L, 4L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L), hr_160 = c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 4L, 4L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L), hr_180 = c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 4L, 2L, 4L, 4L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), hr_190 = c(2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 4L, 2L, 4L, 4L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), hr_200 = c(2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 4L, 2L, 4L, 4L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L), hr_0 = c(2L, 3L, 3L, 2L, 2L, 4L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 4L, 2L, 2L, 2L, 2L, 2L, 2L 
)), .Names = c("hr_830", "hr_930", "hr_1030", "hr_1130", "hr_160", 
"hr_180", "hr_190", "hr_200", "hr_0"), row.names = c(NA, 25L), class = "data.frame") 

Answer


You probably have too many constant attributes.

Most (all?) of these methods assume the values are continuous, not integers.
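One way to check this is a minimal base R sketch, assuming the 25-row sample from the question is representative of the full data. It rebuilds the sample from the `dput` output, then looks for near-constant columns, exactly duplicated columns, and a rank-deficient centered data matrix. The names `col_var`, `dup_cols`, `tss_rank` and `reduced` are illustrative, and dropping columns is only one possible workaround, not a confirmed fix for the `NbClust` error.

```r
# Rebuild the 25-row sample from the question (values taken from the dput output).
a <- rep(2L, 25); a[c(9, 11, 12)] <- 4L                  # hr_830, hr_930, hr_1030, hr_180, hr_190, hr_200
b <- rep(2L, 25); b[c(11, 12)] <- 4L                     # hr_1130, hr_160
z <- rep(2L, 25); z[c(2, 3)] <- 3L; z[c(6, 19)] <- 4L    # hr_0
input <- data.frame(hr_830 = a, hr_930 = a, hr_1030 = a, hr_1130 = b, hr_160 = b,
                    hr_180 = a, hr_190 = a, hr_200 = a, hr_0 = z)

# Per-column variance: values at or near zero flag (almost) constant columns.
col_var <- sapply(input, var)

# Columns that exactly duplicate an earlier column carry no extra information.
dup_cols <- duplicated(as.list(input))

# Rank of the centered data: anything below ncol(input) means the
# total sum-of-squares matrix t(X) %*% X is singular.
centered <- scale(as.matrix(input), center = TRUE, scale = FALSE)
tss_rank <- qr(centered)$rank

print(col_var)
print(names(input)[dup_cols])
cat("rank:", tss_rank, "of", ncol(input), "columns\n")

# Possible workaround (an assumption, not a confirmed fix): drop duplicated
# and zero-variance columns before passing the data to a clustering routine.
reduced <- input[, !dup_cols & col_var > 0, drop = FALSE]
```

If the rank is lower than the number of columns, the matrix has zero eigenvalues, and rounding error can push some of them slightly below zero. That would be consistent with the "indefinite" message even though no values are missing.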