ValueError when running SVM in sklearn

I am using NumPy arrays with a support vector machine, as shown below.
import numpy as np
from sklearn import svm
I have 3 classes/labels (male, female, na), represented as follows:
labels = [0,1,2]
Each class is defined by 3 variables (height, weight, age), which serve as the training data:
male_height = np.array([111,121,137,143,157])
male_weight = np.array([60,70,88,99,75])
male_age = np.array([41,32,73,54,35])
males = np.hstack([male_height,male_weight,male_age])
female_height = np.array([91,121,135,98,90])
female_weight = np.array([32,67,98,86,56])
female_age = np.array([51,35,33,67,61])
females = np.hstack([female_height,female_weight,female_age])
na_height = np.array([96,127,145,99,91])
na_weight = np.array([42,97,78,76,86])
na_age = np.array([56,35,49,64,66])
nas = np.hstack([na_height,na_weight,na_age])
Now I need to fit a support vector machine to the training data so that it predicts the class for three given variables:
height_weight_age = [100,100,100]
clf = svm.SVC()
trainingData = np.vstack([males,females,nas])
clf.fit(trainingData, labels)
result = clf.predict(height_weight_age)
print result
Unfortunately, the following error occurs:
ValueError: X.shape[1] = 3 should be equal to 15, the number of features at training time
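The 15 in the message can be traced by checking array shapes. The following is only a quick diagnostic sketch: it rebuilds the males array from the code above and reuses it as a stand-in for females and nas (an assumption made to keep the check short, since all three are built the same way). np.hstack joins the three 5-element arrays into one 15-element vector, so stacking three such vectors gives 3 samples with 15 features each.

```python
import numpy as np

male_height = np.array([111, 121, 137, 143, 157])
male_weight = np.array([60, 70, 88, 99, 75])
male_age = np.array([41, 32, 73, 54, 35])

# hstack concatenates 1-D arrays end to end: 5 + 5 + 5 = 15 values
males = np.hstack([male_height, male_weight, male_age])
males_shape = males.shape

# Stand-in for np.vstack([males, females, nas]); only the shape matters here
training_shape = np.vstack([males, males, males]).shape

print(males_shape, training_shape)
```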
How should I modify trainingData and labels to get the correct answer?
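One possible restructuring is sketched below. It assumes that each person should be one row of (height, weight, age) and that every row needs its own label, so there are 15 samples with 3 features and 15 labels. The use of np.column_stack and np.repeat is my own choice for illustration, not something taken from the original code, and the query is passed as a 2-D array with a single row.

```python
import numpy as np
from sklearn import svm

# One row per person: columns are height, weight, age
males = np.column_stack([[111, 121, 137, 143, 157],
                         [60, 70, 88, 99, 75],
                         [41, 32, 73, 54, 35]])
females = np.column_stack([[91, 121, 135, 98, 90],
                           [32, 67, 98, 86, 56],
                           [51, 35, 33, 67, 61]])
nas = np.column_stack([[96, 127, 145, 99, 91],
                       [42, 97, 78, 76, 86],
                       [56, 35, 49, 64, 66]])

# 15 samples x 3 features
trainingData = np.vstack([males, females, nas])

# One label per row: five 0s (male), five 1s (female), five 2s (na)
labels = np.repeat([0, 1, 2], 5)

clf = svm.SVC()
clf.fit(trainingData, labels)

# predict expects a 2-D array: one row per sample to classify
height_weight_age = np.array([[100, 100, 100]])
result = clf.predict(height_weight_age)
print(result)
```

With this layout the number of features at prediction time (3) matches the number seen during training.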
@jonrsharpe Thanks for editing my original question, it looks great! – jean 2014-10-12 14:13:45