2017-03-27

I have trained a linear SVM on a large dataset, but to reduce the dimensionality I performed PCA first and trained the SVM on a subset of the component scores (the first 650 components explain 99.5% of the variance). Now I want to plot the decision boundary in the original variable space, using the beta weights and bias from the SVM that was created in PCA space. The problem is that I don't know how to project the SVM's bias term back into the original variable space. I have written a demo using Fisher's iris data to illustrate: how do I plot the decision boundary from a linear SVM after PCA in Matlab?

clear; clc; close all 

% load data 
load fisheriris 
inds = ~strcmp(species,'setosa'); 
X = meas(inds,3:4); 
Y = species(inds); 
mu = mean(X); 

% perform the PCA 
[eigenvectors, scores] = pca(X); 

% train the svm 
SVMModel = fitcsvm(scores,Y); 

% plot the result 
figure(1) 
gscatter(scores(:,1),scores(:,2),Y,'rgb','osd') 
title('PCA space') 

% now plot the decision boundary 
betas = SVMModel.Beta; 
m = -betas(1)/betas(2); % my gradient 
b = -SVMModel.Bias;  % my y-intercept 
f = @(x) m.*x + b;  % my linear equation 
hold on 
fplot(f,'k') 
hold off 
axis equal 
xlim([-1.5 2.5]) 
ylim([-2 2]) 

% inverse transform the PCA 
Xhat = scores * eigenvectors'; 
Xhat = bsxfun(@plus, Xhat, mu); 

% plot the result 
figure(2) 
hold on 
gscatter(Xhat(:,1),Xhat(:,2),Y,'rgb','osd') 

% and the decision boundary 
betaHat = betas' * eigenvectors'; 
mHat = -betaHat(1)/betaHat(2); 
bHat = b * eigenvectors'; 
bHat = bHat + mu; % I know I have to add mu somewhere... 
bHat = bHat/betaHat(2); 
bHat = sum(sum(bHat)); % sum to reduce the matrix to a single value 
% the correct value of bHat should be 6.3962 

f = @(x) mHat.*x + bHat; 
fplot(f,'k') 
hold off 

axis equal 
title('Recovered feature space') 
xlim([3 7]) 
ylim([0 4]) 

Any guidance on where my calculation of bHat goes wrong would be greatly appreciated.
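For reference, the inverse PCA transform used in the demo (`Xhat = scores * eigenvectors' + mu`) reconstructs the original data exactly when all components are kept, because the eigenvector matrix is orthogonal. A minimal numpy sketch of this round trip (random data rather than the iris set, PCA done via SVD instead of MATLAB's `pca`):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 2))  # stand-in data, two variables

# PCA by hand: center, then take the right singular vectors
mu = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
eigenvectors = Vt.T               # columns are principal axes, like pca()'s first output
scores = (X - mu) @ eigenvectors  # coordinates in PCA space, like pca()'s second output

# inverse transform: un-rotate, then add the mean back
Xhat = scores @ eigenvectors.T + mu
assert np.allclose(Xhat, X)  # exact reconstruction with all components kept
```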


The correct y-intercept is `b = -SVMModel.Bias/betas(2)`

Answer


Just in case anyone else runs into this problem: the solution is that the y-intercept can be found from the bias term as b = -SVMModel.Bias/betas(2). The y-intercept is just another point in PCA space, [0 b], which can be recovered/un-rotated with the inverse PCA transform. This new point can then be used to solve the linear equation y = mx + b (i.e. b = y - mx). So the code should be:

% and the decision boundary 
betaHat = betas' * eigenvectors'; 
mHat = -betaHat(1)/betaHat(2); 
yint = b/betas(2);     % y-intercept in PCA space (recall b = -SVMModel.Bias) 
yintHat = [0 yint] * eigenvectors';  % recover the boundary point in the original space 
yintHat = yintHat + mu;  
bHat = yintHat(2) - mHat*yintHat(1); % solve the linear equation 
% the correct value of bHat is now 6.3962
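The algebra behind this fix can be checked independently of MATLAB: the boundary normal rotates back as betaHat = eigenvectors * betas, and any point on the PCA-space boundary, once un-rotated and shifted by mu, must land on the recovered line y = mHat*x + bHat. A numpy sketch with made-up SVM weights (not the fitted iris model) illustrating this:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2)) @ np.array([[2.0, 0.5], [0.5, 1.0]])  # stand-in data

# PCA: center, take right singular vectors as eigenvectors (columns)
mu = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
eigenvectors = Vt.T
scores = (X - mu) @ eigenvectors

# Hypothetical linear SVM in PCA space: betas . s + bias = 0
betas = np.array([0.8, -1.3])
bias = 0.4

# Slope in the original space: rotate the normal vector back
betaHat = eigenvectors @ betas          # same as betas' * eigenvectors' in MATLAB
mHat = -betaHat[0] / betaHat[1]

# y-intercept point in PCA space, then recovered as in the answer
yint = -bias / betas[1]                 # [0, yint] lies on the PCA-space boundary
yintHat = np.array([0.0, yint]) @ eigenvectors.T + mu
bHat = yintHat[1] - mHat * yintHat[0]

# Sanity check: another boundary point (s1 = 1) maps onto y = mHat*x + bHat
s = np.array([1.0, (-bias - betas[0]) / betas[1]])
p = s @ eigenvectors.T + mu
assert abs(p[1] - (mHat * p[0] + bHat)) < 1e-9
```

The check passes because the inverse PCA transform is an affine map, and affine maps send the straight-line boundary in PCA space to a straight line in the original space.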