Trouble fitting a polynomial regression curve in sklearn

I am new to sklearn and I have an appropriately simple task: given a scatter plot of 15 dots, I need to

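1. take 11 of them as my 'training sample',
2. fit a degree-3 polynomial curve through these 11 dots;
3. plot the resulting polynomial curve over the 15 dots.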

But I am stuck at the second step.

This is the code that generates and plots the data:

%matplotlib notebook

import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# 15 noisy samples of sin(x) + x/6 on roughly [0, 10]
np.random.seed(0)
n = 15
x = np.linspace(0, 10, n) + np.random.randn(n) / 5
y = np.sin(x) + x / 6 + np.random.randn(n) / 10

# default split: 11 training points, 4 test points
X_train, X_test, y_train, y_test = train_test_split(x, y, random_state=0)

plt.figure()
plt.scatter(X_train, y_train, label='training data')
plt.scatter(X_test, y_test, label='test data')
plt.legend(loc=4);

I then take the 11 points in X_train and transform them with degree-3 polynomial features as follows:

degrees = 3
poly = PolynomialFeatures(degree=degrees)

X_train_poly = poly.fit_transform(X_train)

Then I try to fit a line through the transformed points (note: X_train_poly.size = 364):

linreg = LinearRegression().fit(X_train_poly, y_train)

and I get the following error:

ValueError: Found input variables with inconsistent numbers of samples: [1, 11]

I have read various questions that address similar and often more complex problems (e.g. Multivariate (polynomial) best fit curve in python?), but I could not extract a solution from them.

2017-06-13 Emanuele

Possible duplicate: https://stackoverflow.com/questions/32097392/sklearn-issue-found-arrays-with-inconsistent-numbers-of-samples-when-doing-regr – Moritz

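The problem is in the dimensions of X_train and y_train. Each is a one-dimensional array, so every X record is treated as a separate variable rather than as a separate sample.

Using the .reshape command as follows should do the trick: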

# reshape data to have 11 records rather than 11 columns
X_trainT = X_train.reshape(11, 1)
y_trainT = y_train.reshape(11, 1)

# create polynomial features on the single variable
poly = PolynomialFeatures(degree=3)
X_train_poly = poly.fit_transform(X_trainT)

print(X_train_poly.shape)
# (11, 4)

linreg = LinearRegression().fit(X_train_poly, y_trainT)
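As a variation on the reshape above, here is a minimal sketch that avoids hard-coding the number of rows. It assumes the same data generation as in the question; the name X_train_2d is mine, not from the original post. Passing -1 lets NumPy infer the row count, and the target can stay one-dimensional because LinearRegression accepts a y of shape (n_samples,).

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

# Same synthetic data as in the question
np.random.seed(0)
n = 15
x = np.linspace(0, 10, n) + np.random.randn(n) / 5
y = np.sin(x) + x / 6 + np.random.randn(n) / 10
X_train, X_test, y_train, y_test = train_test_split(x, y, random_state=0)

# -1 lets NumPy infer the number of rows: one row per sample, one column
X_train_2d = X_train.reshape(-1, 1)  # hypothetical name, for illustration

# Columns of the result: 1, x, x**2, x**3
X_train_poly = PolynomialFeatures(degree=3).fit_transform(X_train_2d)

# y_train is left one-dimensional; only X has to be 2-D
linreg = LinearRegression().fit(X_train_poly, y_train)
```

With a one-dimensional y, linreg.coef_ comes back as a flat array of four coefficients instead of a 1x4 matrix, which tends to be easier to work with afterwards.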

2017-06-14 04:57:25 ELO

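The error basically means that your X_train_poly and y_train do not match: your X_train_poly contains only 1 sample, while your y_train has 11 values. I am not quite sure what you want, but I guess the polynomial features are not being generated the way you intend. What your code currently does is generate degree-3 polynomial features for a single 11-dimensional point.

I think you want to generate degree-3 polynomial features for each of your 11 points (that is, for each x). You can use a loop or a list comprehension to do that: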
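To make the 364 concrete, here is a small sketch comparing the two layouts. The arrays are placeholder values of my own, not the question's data, and the 1-D input is reshaped explicitly to a single row to mimic the behaviour described here (as far as I know, more recent scikit-learn releases reject 1-D input outright instead of guessing). The number of monomials of degree at most 3 in 11 variables is C(11 + 3, 3), which is where 364 comes from.

```python
from math import comb

import numpy as np
from sklearn.preprocessing import PolynomialFeatures

poly = PolynomialFeatures(degree=3)

# One sample with 11 features: the layout the question's code ended up with
one_point = np.arange(11, dtype=float).reshape(1, -1)
wide = poly.fit_transform(one_point)

# Eleven samples with one feature each: the intended layout
eleven_points = np.arange(11, dtype=float).reshape(-1, 1)
tall = poly.fit_transform(eleven_points)

print(wide.shape, tall.shape)  # (1, 364) (11, 4)
print(comb(11 + 3, 3))         # 364
```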

X_train_poly = poly.fit_transform([[i] for i in X_train]) 
X_train_poly.shape 
# (11, 4)

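Now you can see that your X_train_poly has 11 points, each of which is 4-dimensional, rather than a single 364-dimensional point. This new X_train_poly matches the shape of y_train, and the regression may give you what you want: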

linreg = LinearRegression().fit(X_train_poly, y_train) 
linreg.coef_ 
# array([ 0.  , -0.79802899, 0.2120088 , -0.01285893])
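For the third step of the question (drawing the fitted curve over all 15 points), a possible sketch follows. It assumes the question's data generation; x_grid and y_grid are names I introduce for illustration. The key point is that new x values must go through the same fitted poly object with transform (not fit_transform) before being passed to predict.

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

# Same synthetic data as in the question
np.random.seed(0)
n = 15
x = np.linspace(0, 10, n) + np.random.randn(n) / 5
y = np.sin(x) + x / 6 + np.random.randn(n) / 10
X_train, X_test, y_train, y_test = train_test_split(x, y, random_state=0)

# Fit the degree-3 polynomial on the 11 training points
poly = PolynomialFeatures(degree=3)
X_train_poly = poly.fit_transform(X_train.reshape(-1, 1))
linreg = LinearRegression().fit(X_train_poly, y_train)

# Evaluate the fitted polynomial on a dense grid covering all 15 points
x_grid = np.linspace(x.min(), x.max(), 100)  # hypothetical helper grid
y_grid = linreg.predict(poly.transform(x_grid.reshape(-1, 1)))

plt.figure()
plt.scatter(X_train, y_train, label='training data')
plt.scatter(X_test, y_test, label='test data')
plt.plot(x_grid, y_grid, label='degree-3 fit')
plt.legend(loc=4)
```

In a notebook the figure appears on its own; in a plain script you would add plt.show() at the end.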

2017-06-13 17:35:05
