0
想知道为什么我得到这个错误。我只能重现它,如果我在我的数据框中的级别非法列名称,但为什么它在射频实施工作?非法列名错误,但列名是合法的
考虑使用护林员,因为它似乎运行得更快。
library(caret)
library(ranger)
library(randomForest)
df <- data.frame(class = c(rep(c('A','B'), 10)), var1 = runif(20, 0,10), var2 = runif(20, 0,20), var3 = c(rep(c(' A','1 B', 'C'), 6), 'D','D'))
df
CTRL <- trainControl(method = "repeatedcv",
number = 2,
repeats = 1,
verboseIter = TRUE,
classProbs = TRUE,
returnResamp = "final",
summaryFunction = twoClassSummary)
ranger_model <- caret::train(class ~ .,
df,
method = "ranger",
trControl = CTRL,
preProc = c("center", "scale"),
metric="ROC",
tuneGrid = expand.grid(.mtry=c(1,2)))
rf_model <- caret::train(class ~ .,
df,
method = "rf",
trControl = CTRL,
preProc = c("center", "scale"),
metric="ROC",
tuneGrid = expand.grid(.mtry=c(1,2)))
ranger_model
rf_model
游侠输出:
+ Fold1.Rep1: mtry=1
model fit failed for Fold1.Rep1: mtry=1 Error in parse.formula(formula, data) :
Error: Illegal column names in formula interface. Fix column names or use alternative interface in ranger.
而且,当我检查文档游侠产生错误,我不理解为什么这个计算结果为TRUE,因为当我运行的代码我DF,我没有得到相同的结果:
## Error if illegal column name
if (!all(make.names(independent_vars[!interaction_idx]) == independent_vars[!interaction_idx])) {
stop("Error: Illegal column names in formula interface. Fix column names or use alternative interface in ranger.")
}
https://github.com/cran/ranger/blob/master/R/formula.R
当我在我的DF运行:
formula <- 'class ~ .'
data <- df
f <- as.formula(formula)
t <- terms(f, data = data)
## Get dependent var(s)
response <- data.frame(eval(f[[2]], envir = data))
colnames(response) <- deparse(f[[2]])
## Get independent vars
independent_vars <- attr(t, "term.labels")
interaction_idx <- grepl(":", independent_vars)
## Error if illegal column name
if (!all(make.names(independent_vars[!interaction_idx]) == independent_vars[!interaction_idx])) {
print("Error: Illegal column names in formula interface. Fix column names or use alternative interface in ranger.")
}
> !all(make.names(independent_vars[!interaction_idx]) == independent_vars[!interaction_idx])
## [1] FALSE
是因为因子列做成一个使用因子水平作为列名1热编码矩阵?再次,不知道为什么它可以在RF而不是游侠中工作。
想法?