2017-04-03

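Fatal error with caret and RStudio

I want to test whether my data is linearly separable by fitting both a linear SVM model and an RBF-kernel model to it, and seeing which of the two gets the better F-score.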

I am using the caret package and have set up two models: fit.SVMKernel uses the 'rbf' method, and fit.SVML uses 'svmLinear'.

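I can run my whole script up to and including the linear model without any problems, but as soon as I run the kernel model I get a fatal error message and have to restart the session (see the screenshot below).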

Can anyone suggest why R crashes every time I run the kernel code segment?

Here are my data and a reproducible version of my code, for anyone who gives two hoots!

[Screenshot of the fatal error message]

I have the caret package loaded, as shown here:

> sessionInfo() 
R version 3.3.2 (2016-10-31) 
Platform: x86_64-w64-mingw32/x64 (64-bit) 
Running under: Windows >= 8 x64 (build 9200) 

locale: 
[1] LC_COLLATE=English_Ireland.1252 LC_CTYPE=English_Ireland.1252 LC_MONETARY=English_Ireland.1252 LC_NUMERIC=C      
[5] LC_TIME=English_Ireland.1252  

attached base packages: 
[1] parallel stats  graphics grDevices utils  datasets methods base  

other attached packages: 
[1] kernlab_0.9-25 doMC_1.3.4  iterators_1.0.8 foreach_1.4.3 Amelia_1.7.4 Rcpp_0.12.9  dplyr_0.5.0  pROC_1.9.1  klaR_0.6-12  
[10] MASS_7.3-45  caret_6.0-73 ggplot2_2.2.1 lattice_0.20-34 

loaded via a namespace (and not attached): 
[1] compiler_3.3.2  nloptr_1.0.4  plyr_1.8.4   class_7.3-14  tools_3.3.2  digest_0.6.12  lme4_1.1-12  tibble_1.2   
[9] nlme_3.1-128  gtable_0.2.0  mgcv_1.8-15  Matrix_1.2-7.1  DBI_0.5-1   SparseM_1.74  e1071_1.6-8  stringr_1.2.0  
[17] MatrixModels_0.4-1 stats4_3.3.2  combinat_0.0-8  grid_3.3.2   nnet_7.3-12  R6_2.2.0   foreign_0.8-67  minqa_1.2.4  
[25] reshape2_1.4.2  car_2.1-4   magrittr_1.5  scales_0.4.1  codetools_0.2-15 ModelMetrics_1.1.0 splines_3.3.2  assertthat_0.1  
[33] pbkrtest_0.4-6  colorspace_1.3-2 labeling_0.3  quantreg_5.29  stringi_1.1.2  lazyeval_0.2.0  munsell_0.4.3 

The outcome variable I want to classify (training.data.raw$Overshooting) is of type factor:

> str(training.data.raw) 
'data.frame': 2846 obs. of 19 variables: 
$ Total.Tx.Height : num 31.2 31.2 31.2 31.2 31.2 ... 
$ Antenna.Tilt  : int 0 0 0 0 0 0 4 4 2 2 ... 
$ Antenna.Gain  : num 15.9 15.9 15.9 18.2 18.2 18.2 15.9 15.9 18.8 18.8 ... 
$ Ant.Vert.Beamwidth: num 10 10 10 4.4 4.4 4.4 9.6 9.6 4.3 4.3 ... 
$ RTWP    : num -106 -104 -105 -105 -105 ... 
$ Voice.Drops  : int 1 12 1 0 1 5 1 18 4 3 ... 
$ Range    : num 11.33 5.14 5.14 11.33 3.88 ... 
$ Max.Distance  : num 12.43 6.24 6.24 12.43 4.98 ... 
$ Environment  : Factor w/ 3 levels "Rural","Suburban",..: 1 1 1 1 1 1 1 1 1 1 ... 
$ Rural    : Factor w/ 2 levels "0","1": 2 2 2 2 2 2 2 2 2 2 ... 
$ Suburban   : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ... 
$ Urban    : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ... 
$ Overshooting  : Factor w/ 2 levels "0","1": 1 1 2 2 2 2 2 2 2 1 ... 
$ OShooting   : Factor w/ 2 levels "Not.Overshooting",..: 1 1 2 2 2 2 2 2 2 1 ... 
$ HSUPA.Throughput : num 164 223 232 241 264 ... 
$ Max.HSDPA.Users : int 10 16 9 5 8 7 14 31 8 12 ... 
$ HS.DSCH.throughput: num 1975 2346 2995 3696 3894 ... 
$ Max.HSUPA.Users : int 13 25 11 5 13 9 15 33 8 13 ... 
$ Avg.CQI   : num 16.2 18 19.4 19.7 21.8 ... 

Here is the training control code I use for both models:

metric <- "Accuracy"
# 10-fold cross-validation repeated 3 times, with class probabilities enabled
train_Control <- trainControl(method = "repeatedCV",
                              number = 10,
                              repeats = 3,
                              classProbs = TRUE)
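Since class probabilities are requested in the training control, it may be worth checking the outcome's class labels before training. As far as I know, caret expects factor levels that are valid R variable names when classProbs is on, and the levels "0" and "1" of Overshooting are not. A minimal base-R sketch of that check follows; the helper name has_valid_levels and the toy factors are made up for illustration and are not part of the original script.

```r
# Hypothetical helper: TRUE when every level of a factor is already a valid R name.
has_valid_levels <- function(f) {
  lv <- levels(f)
  all(lv == make.names(lv))
}

# Toy factors for illustration only (not the real data)
numeric_labels <- factor(c("0", "0", "1", "1"))   # same style as Overshooting
named_labels   <- factor(c("No", "No", "Yes", "Yes"))

numeric_ok <- has_valid_levels(numeric_labels)  # make.names() would turn "0" into "X0"
named_ok   <- has_valid_levels(named_labels)
```

If the check fails for the outcome column, switching to a column with descriptive labels (the data frame already contains OShooting) would be one way around it, assuming that column encodes the same outcome.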

Here are the two models I create, fit.SVML and fit.SVMKernel:

set.seed(123) 
fit.SVML <- train(Overshooting ~ Total.Tx.Height + 
        Antenna.Tilt + 
        Antenna.Gain + 
        Ant.Vert.Beamwidth + 
        RTWP + 
        Voice.Drops + 
        Range + 
        Max.Distance + 
        Rural + 
        Suburban + 
        Urban + 
        HSUPA.Throughput + 
        Max.HSDPA.Users + 
        HS.DSCH.throughput + 
        Max.HSUPA.Users + 
        Avg.CQI, 
       data = training.data.raw, 
       method = 'svmLinear', 
       preProcess = c('center','scale'), 
       trControl=train_Control, 
       tuneLength=5, 
       metric = metric) 

set.seed(123) 
fit.SVMKernel <- train(Overshooting ~ Total.Tx.Height + 
        Antenna.Tilt + 
        Antenna.Gain + 
        Ant.Vert.Beamwidth + 
        RTWP + 
        Voice.Drops + 
        Range + 
        Max.Distance + 
        Rural + 
        Suburban + 
        Urban + 
        HSUPA.Throughput + 
        Max.HSDPA.Users + 
        HS.DSCH.throughput + 
        Max.HSUPA.Users + 
        Avg.CQI, 
        data = training.data.raw, 
        method = 'rbf', 
        preProcess = c('center','scale'), 
        trControl=train_Control, 
        tuneLength=5, 
        metric = metric)#, 
        #summaryFunction=twoClassSummary) 
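Both calls above optimise Accuracy, while the stated goal is to compare the models by F-score. One way to get that number from predicted and actual labels is sketched below in base R; f1_score is a hypothetical helper written for this illustration, and the toy vectors stand in for real predictions.

```r
# Hypothetical helper: F1 score for one class, given actual and predicted labels.
f1_score <- function(actual, predicted, positive) {
  tp <- sum(predicted == positive & actual == positive)  # true positives
  fp <- sum(predicted == positive & actual != positive)  # false positives
  fn <- sum(predicted != positive & actual == positive)  # false negatives
  if (tp == 0) return(0)                                 # avoid dividing by zero
  precision <- tp / (tp + fp)
  recall    <- tp / (tp + fn)
  2 * precision * recall / (precision + recall)
}

# Toy labels for illustration only (not taken from the real data)
actual    <- factor(c("1", "1", "1", "0", "0", "0"))
predicted <- factor(c("1", "1", "0", "1", "0", "0"))

f1 <- f1_score(actual, predicted, positive = "1")
```

With fitted models, the predicted vector could come from something like predict(fit.SVML, newdata = ...) on held-out rows, and the same helper could then be applied to each model in turn.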

Answer


It turns out that the method I was passing to train was not a valid model name. I changed it, and everything started working again.
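The answer does not say which method name was used in the end. Assuming the intent was an SVM with an RBF kernel, caret offers method = "svmRadial" (backed by kernlab) for that. The sketch below checks candidate names against the models caret knows about and then fits a small model on simulated data; the toy data frame and the object names are made up for illustration, and the caret and kernlab packages are assumed to be installed.

```r
library(caret)

# Check candidate method names against the models caret knows about.
known_models <- unique(modelLookup()$model)
names_ok <- c("svmLinear", "svmRadial") %in% known_models

# Toy two-class data with a non-linear (circular) boundary, for illustration only
set.seed(123)
toy <- data.frame(x1 = rnorm(60), x2 = rnorm(60))
toy$y <- factor(ifelse(toy$x1^2 + toy$x2^2 > 1.5, "Out", "In"))

# Assumption: an RBF-kernel SVM was intended, hence method = "svmRadial"
fit_radial <- train(y ~ x1 + x2,
                    data = toy,
                    method = "svmRadial",
                    preProcess = c("center", "scale"),
                    trControl = trainControl(method = "repeatedcv",
                                             number = 5,
                                             repeats = 2),
                    tuneLength = 3)
```

Checking the name up front is cheap, and it separates a typo in the method argument from a problem in the underlying fitting code.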