2016-08-25

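How to plot data comparing three groups in R

I have a data frame called mydf with a column GENE_SYMBOL and three further columns, one per cancer type: AML, CLL, and MDS. I want to plot the percentage for each gene in each of these cancers. What is a good way to show this in a plot?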

mydf <- structure(list(GENE_SYMBOL = c("NPM1", "DNMT3A", "TET2", "IDH1", 
"IDH2"), AML = c("28.00%", "24.00%", "8.00%", "9.00%", "10.00%" 
), CLL = c("0.00%", "8.00%", "0.00%", "3.00%", "1.00%"), MDS = c("7.00%", 
"28.00%", "7.00%", "10.00%", "3.00%")), .Names = c("GENE_SYMBOL", 
"AML", "CLL", "MDS"), row.names = c(NA, 5L), class = "data.frame") 

Answer

We can use barplot from base R. First loop over the percentage columns, strip the % sign with sub, and convert the result to numeric; then plot.

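# strip the "%" sign from every column except GENE_SYMBOL and convert to numeric
mydf[-1] <- lapply(mydf[-1], function(x) as.numeric(sub("[%]", "", x)))
# rows are genes, columns are cancers: bars are grouped by cancer,
# so one color per gene is needed (five genes, five colors)
barplot(`row.names<-`(as.matrix(mydf[-1]), mydf$GENE_SYMBOL), beside=TRUE,
      legend = TRUE, col = c("red", "green", "blue", "yellow", "purple"))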
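To check the conversion step on its own before plotting, something like the following sketch may help. It rebuilds the question's data so it can run by itself; `pct_to_num`, `pct`, and `m` are hypothetical names introduced here for illustration, not part of the original answer or of base R.

```r
# Same data as in the question, rebuilt so the sketch is self-contained.
mydf <- data.frame(
  GENE_SYMBOL = c("NPM1", "DNMT3A", "TET2", "IDH1", "IDH2"),
  AML = c("28.00%", "24.00%", "8.00%", "9.00%", "10.00%"),
  CLL = c("0.00%", "8.00%", "0.00%", "3.00%", "1.00%"),
  MDS = c("7.00%", "28.00%", "7.00%", "10.00%", "3.00%"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: drop the literal "%" and convert to numeric.
pct_to_num <- function(x) as.numeric(sub("%", "", x, fixed = TRUE))

# Work on a copy so the original character columns stay available.
pct <- mydf
pct[-1] <- lapply(pct[-1], pct_to_num)

# Numeric matrix with genes as row names, the shape barplot() expects.
m <- as.matrix(pct[-1])
rownames(m) <- pct$GENE_SYMBOL

str(pct)  # the three cancer columns should now be numeric
```

Keeping the numeric copy separate from mydf is only a convenience; the answer's in-place assignment does the same conversion.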

If we want 'GENE_SYMBOL' on the x-axis instead, transpose the matrix:

barplot(t(`row.names<-`(mydf[-1], mydf$GENE_SYMBOL)), beside=TRUE, 
       legend = TRUE, col = c("red", "green", "blue")) 

Alternatively, with ggplot2:

library(dplyr) 
library(tidyr) 
library(ggplot2) 
gather(mydf, Var, Val, -GENE_SYMBOL) %>% 
    mutate(Val = as.numeric(sub("[%]", "", Val))) %>% 
    ggplot(., aes(x= GENE_SYMBOL, y = Val)) + 
        geom_bar(aes(fill = Var), position = "dodge", stat="identity") 
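As a variation on the ggplot2 approach, here is a sketch that reshapes with `pivot_longer` (the tidyr function that supersedes `gather`) and prints the y-axis values with a percent sign. It assumes tidyr 1.0.0 or later and ggplot2 are installed; the column names `Cancer` and `Percent`, the objects `long` and `p`, and the axis titles are illustrative choices, not taken from the original answer.

```r
library(tidyr)
library(ggplot2)

# Same data as in the question, rebuilt so the sketch is self-contained.
mydf <- data.frame(
  GENE_SYMBOL = c("NPM1", "DNMT3A", "TET2", "IDH1", "IDH2"),
  AML = c("28.00%", "24.00%", "8.00%", "9.00%", "10.00%"),
  CLL = c("0.00%", "8.00%", "0.00%", "3.00%", "1.00%"),
  MDS = c("7.00%", "28.00%", "7.00%", "10.00%", "3.00%"),
  stringsAsFactors = FALSE
)

# Long format: one row per gene/cancer pair.
long <- pivot_longer(mydf, cols = -GENE_SYMBOL,
                     names_to = "Cancer", values_to = "Percent")

# Drop the literal "%" and convert to numeric.
long$Percent <- as.numeric(sub("%", "", long$Percent, fixed = TRUE))

# Dodged bars: genes on the x-axis, one bar per cancer within each gene.
p <- ggplot(long, aes(x = GENE_SYMBOL, y = Percent, fill = Cancer)) +
  geom_col(position = "dodge") +
  scale_y_continuous(labels = function(x) paste0(x, "%")) +
  labs(x = "Gene", y = "Percentage")

print(p)
```

`geom_col()` is shorthand for `geom_bar(stat = "identity")`, so the bars are the same as in the answer's version; only the reshaping call and the axis labels differ.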
