2016-09-24

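Create a data frame by subtracting row and column name delimiters

I have a distance matrix whose row and column names each contain a numeric value after the first underscore (for example, 7A_0_AAGCCTAGCGAC has the value 0). I would like a way to compare these values row by column. For example, I want to subtract the column's number from the row's number, as in the expected output below.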

Input:

                  7A_0_AAGCCTAGCGAC 7A_4_AAATGACTGGCC 7A_7_CATCTCGTTCTA
7A_0_AAGCCTAGCGAC        0.00000000       0.034312102        0.04539427
7A_4_AAATGACTGGCC        0.03431210       0.000000000        0.01422137
7A_7_CATCTCGTTCTA        0.04539427       0.014221369        0.00000000

Expected output:

                  7A_0_AAGCCTAGCGAC 7A_4_AAATGACTGGCC 7A_7_CATCTCGTTCTA
7A_0_AAGCCTAGCGAC        0.00000000                -4                -7
7A_4_AAATGACTGGCC                 4       0.000000000                -3
7A_7_CATCTCGTTCTA                 7                 3        0.00000000

Any help would be greatly appreciated.

Answers

You can extract the numeric values from the column names and the row names separately, then do an outer subtraction:

# extract the numeric values from the dimension names of the matrix
cols <- as.numeric(sub(".*_(\\d+)_.*", "\\1", colnames(mat)))
rows <- as.numeric(sub(".*_(\\d+)_.*", "\\1", rownames(mat)))

# outer subtraction: output[i, j] is rows[i] - cols[j]
output <- outer(rows, cols, "-")

# restore the dimension names
dimnames(output) <- list(rownames(mat), colnames(mat))

output
#                   7A_0_AAGCCTAGCGAC 7A_4_AAATGACTGGCC 7A_7_CATCTCGTTCTA
# 7A_0_AAGCCTAGCGAC                 0                -4                -7
# 7A_4_AAATGACTGGCC                 4                 0                -3
# 7A_7_CATCTCGTTCTA                 7                 3                 0
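
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The way `mat` is built is an assumption, since the question only shows the printed matrix and not how it was created. `ids`, `label_number`, and `diffs` are hypothetical names used only for illustration.

```r
# Rebuild the example distance matrix from the question (assumed construction).
ids <- c("7A_0_AAGCCTAGCGAC", "7A_4_AAATGACTGGCC", "7A_7_CATCTCGTTCTA")
mat <- matrix(
  c(0.00000000, 0.034312102, 0.04539427,
    0.03431210, 0.000000000, 0.01422137,
    0.04539427, 0.014221369, 0.00000000),
  nrow = 3, byrow = TRUE, dimnames = list(ids, ids)
)

# Hypothetical helper: pull out the digits that sit between two underscores.
label_number <- function(x) as.numeric(sub(".*_(\\d+)_.*", "\\1", x))

# Row number minus column number for every cell of the matrix.
diffs <- outer(label_number(rownames(mat)), label_number(colnames(mat)), "-")
dimnames(diffs) <- dimnames(mat)
diffs
```

With the example labels this should reproduce the table of differences printed above. A label that does not match the pattern is returned unchanged by `sub()`, so `as.numeric()` would turn it into `NA` with a warning; it may be worth checking for `NA` values on real data.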

A related question: is there a way to compare the delimiters in a binary sense (for example, 0 if the values match and 1 if they do not)? – user2117258
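
One possible way to do this binary comparison, sketched under the assumption that the numbers are extracted with the same pattern as in the answer: pass `"!="` to `outer()` instead of `"-"`, which gives a logical matrix, then convert it to 0/1. The `ids` vector is example data copied from the labels in the question, and `nums`, `mismatch`, and `binary` are illustrative names.

```r
# Example labels copied from the question.
ids <- c("7A_0_AAGCCTAGCGAC", "7A_4_AAATGACTGGCC", "7A_7_CATCTCGTTCTA")

# Extract the number between the underscores, as in the answer.
nums <- as.numeric(sub(".*_(\\d+)_.*", "\\1", ids))

# TRUE where the two numbers differ, FALSE where they match.
mismatch <- outer(nums, nums, "!=")

# Multiplying by 1 converts TRUE/FALSE to 1/0 and keeps the matrix shape.
binary <- mismatch * 1
dimnames(binary) <- list(ids, ids)
binary
```

For a non-symmetric matrix, the same idea should work with separate row and column vectors, for example `outer(rows, cols, "!=")`.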