I can't seem to find any help online for this error: Error in { : task 1524 failed - "cannot open the connection"
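I am running a function in parallel using the 'foreach' and 'doParallel' packages. The function takes a trained model and two data frames as input, makes predictions, then shuffles the values of one variable and predicts again. It computes the RMSE for each variable and returns the variables whose RMSE increased after shuffling. This takes quite a long time, so I have to run it in parallel. Even so, each model still takes about 2 hours.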
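To make the procedure concrete, here is a minimal sequential sketch of the shuffle-and-compare idea in base R. The toy model (`lm` on `mtcars`) and the `rmse` helper are assumptions for illustration only; the real `calc.rmse` and models are not shown in this post. The sketch also shuffles inline with `sample()` rather than reading a cached shuffled data frame.

```r
# Toy data and model standing in for the real ones (assumptions).
set.seed(42)
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)
newdata <- mtcars[, c("wt", "hp", "qsec")]
newoutcomes <- mtcars$mpg

# Hypothetical RMSE helper; the post's own calc.rmse is not shown.
rmse <- function(predicted, observed) {
  sqrt(mean((predicted - observed)^2))
}

# RMSE on the untouched data.
baseline <- rmse(predict(fit, newdata = newdata), newoutcomes)

# Shuffle one column at a time and measure the RMSE again.
shuffled <- vapply(names(newdata), function(variable) {
  sdata <- newdata
  sdata[[variable]] <- sample(sdata[[variable]])
  rmse(predict(fit, newdata = sdata), newoutcomes)
}, numeric(1))

# Keep the variables whose RMSE went up after shuffling.
increase <- shuffled - baseline
keep <- names(increase)[increase > 0]
```

The parallel version replaces the `vapply` call with a `foreach(...) %dopar%` loop over the same variable names.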
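The problem doesn't seem to be in the function's code itself; perhaps it is the input. I have run the function before without problems, and when I checked my log file after the error, it had processed all the variables. I have 5 models that I want to run this function on. I first ran it on one model and saved the results. Now that it works, I want to apply it to the remaining models.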
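The error seems to occur after the foreach loop has finished processing, because the log file indicates that all variables were analyzed. What I don't understand is that the traceback indicates the error occurred inside the loop.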
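The message format `task N failed - "..."` is what foreach produces when the body of a task throws an error under the default `.errorhandling = "stop"`. A minimal sketch of one way to inspect the failing task instead of aborting is to set `.errorhandling = "pass"`, which returns the error object in place of that task's result. The loop body below is a made-up stand-in, and the sequential backend is registered only so the sketch runs without a cluster.

```r
library(foreach)

# Sequential backend, used here only so the example needs no cluster.
registerDoSEQ()

# Stand-in loop: task 2 fails on purpose with the same message.
results <- foreach(i = 1:3, .errorhandling = "pass") %dopar% {
  if (i == 2) stop("cannot open the connection")
  i * 10
}

# Find which tasks returned an error object instead of a value.
failed <- vapply(results, inherits, logical(1), what = "error")
which(failed)
conditionMessage(results[[2]])
```

With a `.combine` such as `"rbind"`, the error objects would be passed to the combine function as well, so it is easier to inspect them without `.combine` first.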
Thanks in advance for any help with this problem. Please let me know if anything is unclear. I am running Windows 7 and R version 3.1.
Here is the error:
Error in { : task 1524 failed - "cannot open the connection"
10 stop(simpleError(msg, call = expr))
9 e$fun(obj, substitute(ex), parent.frame(), e$data)
8 foreach(variable = names(newdata), .export = c("calc.rmse", "catf",
"start.timer", "stop.timer"), .combine = "rbind") %dopar%
{
baseline = NULL ... at feature_selection.R#53
7 FUN(c("pH", "Ca", "P", "Sand")[[1L]], ...)
6 lapply(X = X, FUN = FUN, ...)
5 sapply(names(amodels[2:length(amodels)]), analyze.features, newdata = test.data,
newoutcomes = test.outcomes) at script.R#59
4 eval(expr, envir, enclos)
3 eval(ei, envir)
2 withVisible(eval(ei, envir))
1 source("~/%FILEPATH%")
Here is the code of the function in question:
analyze.features = function(newdata, newoutcomes, model.name) {
    model = amodels[[model.name]]
    file = "data/shuffled_data.csv"
    if(!file.exists(file)) {
        cat("Creating shuffled data frame...\r\n")
        shuffled.data = as.data.frame(sapply(newdata, shuffle))
        cat("Writing shuffled data frame to disk...\r\n")
        write.csv(shuffled.data, file)
    } else {
        cat("Reading shuffled data from file...\r\n")
        shuffled.data = read.csv(file)
    }
    # Send output to a log file.
    writeLines("", "log.txt")
    start.timer("About to enter parallelization...")
    cat("Time is: ", format(Sys.time(), "%a %b %d %X %Y"), "\r\n")
    output = foreach(variable = names(newdata),
                     .export=c("calc.rmse", "catf", "start.timer", "stop.timer"),
                     .combine="rbind") %dopar% {
        baseline = NULL
        shuffle = NULL
        sdata = newdata
        # Write to log file.
        catf("Analyzing ", variable)
        sdata[[variable]] = shuffled.data[[variable]]
        baseline[[variable]] = suppressWarnings(calc.rmse(predict(model, newdata=newdata), newoutcomes))
        shuffle[[variable]] = suppressWarnings(calc.rmse(predict(model, newdata=sdata), newoutcomes))
        cbind(baseline=baseline, shuffle=shuffle)
    }
    stop.timer("Total time to analyze features")
    save.df(output, paste("RMSE_", model.name, sep=""))
    # Reduce list of kept features.
    keep = row.names(output)[which(output[,2] - output[,1] > 0)]
    rm(output, shuffled.data)
    beep(1)
    return(keep)
}