2016-05-14

Correlations with the rcorr() function

In R, I'm computing the correlations between two different matrices with the rcorr() function:

res <- rcorr(as.matrix(table1), as.matrix(table2),type="pearson") 

It seems to work fine, but I would like to avoid the within-table correlations. Any suggestions?

Answer


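Consider using base R's cor() to get only the correlations between the two sets, since Hmisc's rcorr() returns every possible combination. Notice below that the upper-right quadrant of the rcorr() output (mirrored across the diagonal in the lower-left quadrant) is the entire result of cor(), rounded to two decimal places.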

library(Hmisc)  # provides rcorr()

table1 <- matrix(rnorm(25), 5)
table2 <- matrix(rnorm(25), 5)

res <- rcorr(table1, table2, type="pearson") 
res 
#       [,1]  [,2]  [,3]  [,4]  [,5] |  [,6]  [,7]  [,8]  [,9] [,10]
# [1,]  1.00 -0.55  0.95 -0.16  0.17 | -0.46  0.15  0.10  0.69  0.16
# [2,] -0.55  1.00 -0.55 -0.60 -0.79 | -0.45 -0.66 -0.22 -0.30  0.12
# [3,]  0.95 -0.55  1.00 -0.09  0.30 | -0.35 -0.05 -0.17  0.57 -0.03
# [4,] -0.16 -0.60 -0.09  1.00  0.91 |  0.92  0.53 -0.21 -0.58 -0.71
# [5,]  0.17 -0.79  0.30  0.91  1.00 |  0.78  0.41 -0.31 -0.32 -0.68
# ------------------------------------------------------------------
# [6,] -0.46 -0.45 -0.35  0.92  0.78 |  1.00  0.44 -0.14 -0.62 -0.58
# [7,]  0.15 -0.66 -0.05  0.53  0.41 |  0.44  1.00  0.68  0.13  0.13
# [8,]  0.10 -0.22 -0.17 -0.21 -0.31 | -0.14  0.68  1.00  0.59  0.80
# [9,]  0.69 -0.30  0.57 -0.58 -0.32 | -0.62  0.13  0.59  1.00  0.80
#[10,]  0.16  0.12 -0.03 -0.71 -0.68 | -0.58  0.13  0.80  0.80  1.00

# pvalues to follow ... 

res <- cor(table1, table2, method="pearson") 
res 

#            [,1]        [,2]       [,3]       [,4]        [,5]
# [1,] -0.4551474  0.15080994  0.1008215  0.6894955  0.16390813
# [2,] -0.4468285 -0.66209106 -0.2154960 -0.2954581  0.11662382
# [3,] -0.3542023 -0.05474287 -0.1720881  0.5669501 -0.02880113
# [4,]  0.9246330  0.53456574 -0.2084105 -0.5807386 -0.71108552
# [5,]  0.7788395  0.40551828 -0.3122606 -0.3209273 -0.67912147
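
If you already have a full all-pairs matrix, the between-table block can also be pulled out by indexing. The sketch below uses only base R: it builds the full matrix with cor(cbind(...)) and compares its upper-right block with the direct cor(table1, table2) call. The set.seed() call and the names p1, p2, full and cross are illustrative additions, not part of the original answer. Since rcorr() returns a list with components r, n and P, the same row and column index should apply to res$r and res$P from an rcorr() result.

```r
# Simulated data as in the answer; set.seed() is added only for reproducibility
set.seed(42)
table1 <- matrix(rnorm(25), 5)
table2 <- matrix(rnorm(25), 5)

p1 <- ncol(table1)
p2 <- ncol(table2)

# Full correlation matrix over the combined columns (all pairings)
full <- cor(cbind(table1, table2), method = "pearson")

# Rows for table1's columns, columns for table2's columns
cross <- full[seq_len(p1), p1 + seq_len(p2)]

# Compare with the direct between-table call
all.equal(cross, cor(table1, table2, method = "pearson"))
```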

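The only caveat is that cor() does not return significance test statistics such as the t-statistic and p-value. You can retrieve them with cor.test(), however, and run it repeatedly over the column pairs with mapply(). The code below demonstrates the test for one pairing and then generalizes it to all other columns. Notice that each test's estimate matches the corresponding value in the cor() output.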

# EXAMPLE OF FIRST COL PAIRING 
res <- cor.test(table1[,1], table2[,1], method="pearson") 
res 

# Pearson's product-moment correlation 

# data: table1[, 1] and table2[, 1] 
# t = -0.88536, df = 3, p-value = 0.4412 
# alternative hypothesis: true correlation is not equal to 0 
# 95 percent confidence interval: 
# -0.9542314 0.7137222 
# sample estimates: 
#  cor 
# -0.4551474 

# OBTAIN ALL MATRIX COL COMBINATIONS 
tblcols <- expand.grid(seq_len(ncol(table1)), seq_len(ncol(table2)))

# MAPPLY COR.TEST ACROSS ALL COLS
cfunc <- function(var1, var2) {
  cor.test(table1[, var1], table2[, var2], method = "pearson")
}

res <- mapply(cfunc, tblcols$Var1, tblcols$Var2)

head(res) 

#             [,1]        [,2]        [,3]        [,4]
# statistic   -0.8853596  -0.8650936  -0.6560274  4.204994
# parameter   3           3           3           3
# p.value     0.4411699   0.4506234   0.5586316   0.02455469
# estimate    -0.4551474  -0.4468285  -0.3542023  0.924633
# null.value  0           0           0           0
# alternative "two.sided" "two.sided" "two.sided" "two.sided"
#             [,5]        [,6]        [,7]        [,8]
# statistic   2.150733    0.2642326   -1.53021    -0.09495982
# parameter   3           3           3           3
# p.value     0.1206246   0.8087132   0.2234562   0.930334
# estimate    0.7788395   0.1508099   -0.6620911  -0.05474287
# null.value  0           0           0           0
# alternative "two.sided" "two.sided" "two.sided" "two.sided"
# ...
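
The mapply() result holds one column per pairing, which is awkward to compare with the cor() matrix. One possible way to reshape it, sketched below, is to pull a single row out and fill a matrix column by column. The set.seed() call and the names pmat and emat are illustrative additions; the reshaping relies on expand.grid() varying its first variable fastest.

```r
# Simulated data as in the answer; set.seed() is added only for reproducibility
set.seed(42)
table1 <- matrix(rnorm(25), 5)
table2 <- matrix(rnorm(25), 5)

# One cor.test() per (table1 column, table2 column) pairing
tblcols <- expand.grid(seq_len(ncol(table1)), seq_len(ncol(table2)))
res <- mapply(function(a, b) cor.test(table1[, a], table2[, b], method = "pearson"),
              tblcols$Var1, tblcols$Var2)

# expand.grid() varies Var1 fastest, so a column-wise fill puts
# table1's columns on the rows and table2's columns on the columns
pmat <- matrix(unlist(res["p.value", ]), nrow = ncol(table1))
emat <- matrix(unlist(res["estimate", ]), nrow = ncol(table1))

# emat should agree with cor(table1, table2); pmat holds the matching p-values
all.equal(emat, cor(table1, table2, method = "pearson"))
```

With this layout, pmat[i, j] is the p-value for column i of table1 against column j of table2, the same indexing as the cor() output.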

Thank you very much, Parfait! – FranciscoC