R vector does not print the expected output

corr <- function(directory, threshold = 0){ 

    #get all the cases that have non-NA values 
    complete_cases <- complete(directory) 
    #get all the observations over the threshold amount 
    filter_cases <- complete_cases[complete_cases[["nobs"]] > threshold, ] 

    #The returned data frame contains two columns "ID" and "nobs" 

    #get all file names in a vector 
    all_files <- list.files(directory, full.names=TRUE) 

    correlation <- vector("numeric") 

    for(i in as.numeric(filter_cases[["ID"]])){ 
        #get all the files that are in the filter_cases 
        output <- read.csv(all_files[i]) 
        #remove all NA values from the data 
        output <- output[complete.cases(output), ] 
        #get each of the correlations and store them 
        correlation[i] <- cor(output[["nitrate"]], output[["sulfate"]]) 
    } 

    correlation 
}

What I expect to get from this is something like:

corr("directory", 200) 

[1] -1.023 0.0456 0.8231 etc

What I get is:

NA NA -1.023 NA NA 
NA NA NA 0.0456 NA 
0.8231 NA NA NA NA etc

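I think I am missing something simple, because print(cor(output[["nitrate"]], output[["sulfate"]])) basically gives me what I expect. The output has to be a vector, because I plan to use this function inside other functions.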

Source

2016-02-27 Shawn

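It looks to me like your problem is caused by your loop index. Because you assign to correlation[i] where i is a file ID, every position whose ID is not in filter_cases is skipped and is therefore filled with NA. Without access to your data it is hard to be certain, but the lines above the loop appear to be there so that you only loop over certain files. If that is the case, the for loop variable is serving two purposes (choosing the file and choosing where to store the result), so it may make sense to index the correlation vector with an explicit counter, as shown below.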

cor_index = 0 
for(i in as.numeric(filter_cases[["ID"]])){ 
    #get all the files that are in the filter_cases 
    output <- read.csv(all_files[i]) 
    #remove all NA values from the data 
    output <- output[complete.cases(output), ] 
    #get each of the correlations and store them 
    cor_index = cor_index + 1 
    correlation[cor_index] <- cor(output[["nitrate"]], output[["sulfate"]]) 
}
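
To see the effect in isolation, here is a minimal sketch that needs no data files. The IDs and correlation values are made up for illustration (they are not from the original data); the point is only that assigning to a position past the end of a vector pads the gap with NA, while a position-based index does not.

```r
# Hypothetical IDs that passed the threshold, and stand-in correlation values
ids  <- c(3, 9, 11)
vals <- c(-0.5, 0.0456, 0.8231)

# Indexing by ID, as in the question: unassigned positions are padded with NA
by_id <- vector("numeric")
for (k in seq_along(ids)) {
  by_id[ids[k]] <- vals[k]
}

# Indexing by position, as in the counter-based loop: no gaps
by_pos <- numeric(length(ids))
for (k in seq_along(ids)) {
  by_pos[k] <- vals[k]
}

print(by_id)   # length 11, NA everywhere except positions 3, 9 and 11
print(by_pos)  # length 3, no NA
```

Preallocating with numeric(length(ids)) is a design choice rather than a requirement; growing the vector with a counter, as in the loop above, gives the same result.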

Source

2016-02-28 00:28:38

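That was exactly the problem. For some odd reason I am still getting a few NAs in my result. I am not sure why that would happen with this new code. When I print the "output" variable there are no NAs in it, so why does the correlation (cor) still contain some? – Shawn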
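
One possible cause, offered as a guess since the data is not available: cor() returns NA (with a warning that the standard deviation is zero) when one of its inputs is constant, even if neither input contains an NA. The sketch below uses made-up numbers to show that behaviour and a way to locate the affected entries.

```r
# Made-up values: x varies, y_const is the same in every row and has no NA
x       <- c(1.2, 3.4, 2.2, 5.1)
y_const <- rep(0.5, 4)

# cor() warns "the standard deviation is zero" and returns NA here
r <- suppressWarnings(cor(x, y_const))
print(is.na(r))

# In a result vector, which() shows the positions that need a closer look
correlation <- c(0.31, r, -0.12)
print(which(is.na(correlation)))
```

Checking sd(output[["nitrate"]]) and sd(output[["sulfate"]]) for the files at those positions would confirm or rule out this explanation.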
