2017-09-25 62 views

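Stable color mapping for categorical variables with ggplot2 in Python (rpy2)

My problem is similar to the one addressed by the top answer to this question, but that answer did not solve it. In addition, I am using ggplot2 from Python through the rpy2 package, which makes things harder.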

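I have many different time series (with varying names) that I want to plot alongside a data series (which stays the same). I want the data series to have the same color in every plot, and I don't care about the colors of the other series. However, if ggplot2 is left to assign colors automatically, it assigns them in alphabetical order, so the mapping is unstable: the color of the data series depends on whether the other series names sort before or after "data". (See the code below.)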

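Note that the series names ('a_model', 'e_model' in the code example) are not all known in advance, so I can't simply create a manual color scale with every possible series name. Also, a plot may contain the data series and several other series. I only care about keeping the color of the data series constant.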

from rpy2 import robjects 
from rpy2.robjects.lib import grid 
from rpy2.robjects.packages import importr 
import rpy2.robjects.lib.ggplot2 as ggplot2 
from rpy2.robjects import pandas2ri 
import pandas as pd 
pandas2ri.activate()    

###Input data### 
plot_data={} 
plot_data.update({'a_model':[0.217,0.226,0.238,0.253,0.272,0.278,0.283,0.29,0.296,0.298]}) 
plot_data.update({'data':[0.255,0.226,0.241,0.19,0.264,0.302,0.291,0.26,0.218,0.221]}) 
plot_data.update({'mos_since_start':[1,2,3,4,5,6,7,8,9,10]}) 

###Plotting Function### 
def plot(plot_data, filename):
    # Build a wide DataFrame, then melt it to long format (one row per series and time point)
    df = pd.DataFrame(plot_data)
    fig = pd.melt(df, id_vars=['mos_since_start'])
    pp = ggplot2.ggplot(fig) + \
        ggplot2.aes_string(x='mos_since_start', y='value', group='variable',
                           colour='variable', shape='variable', linetype='variable') + \
        ggplot2.geom_line() + ggplot2.geom_point()
    robjects.r.ggsave(filename=filename, plot=pp, width=12, height=8)

###Plots### 
plot(plot_data,"./testplot.pdf") 
# Rename 'a_model' to 'e_model' so that it now sorts after 'data' alphabetically
plot_data.update({'e_model': plot_data.pop('a_model')})
plot(plot_data,"./testplot2.pdf") 

Answer


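This isn't written in Python, but it should show the options for making the data series the first value in the legend, which should then be consistent across plots.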

library(ggplot2) 
library(reshape2) 

df1 <- data.frame(
  a_model = c(0.217,0.226,0.238,0.253,0.272,0.278,0.283,0.29,0.296,0.298),
  e_model = c(0.217,0.226,0.238,0.253,0.272,0.278,0.283,0.29,0.296,0.298),
  data = c(0.255,0.226,0.241,0.19,0.264,0.302,0.291,0.26,0.218,0.221),
  b_model = c(0.217,0.226,0.238,0.253,0.272,0.278,0.283,0.29,0.296,0.298),
  mos_since_start = c(1,2,3,4,5,6,7,8,9,10))
dfm <- melt(df1, id.vars = "mos_since_start") 

ggplot(dfm,
       aes(x = mos_since_start,
           y = value,
           group = variable,
           colour = variable,
           shape = variable,
           linetype = variable)) +
  geom_line() +
  geom_point() +
  scale_shape_discrete(name = "legend",
                       breaks = union("data", dfm$variable)) +
  scale_colour_discrete(name = "legend",
                        breaks = union("data", dfm$variable)) +
  scale_linetype_discrete(name = "legend",
                          breaks = union("data", dfm$variable))
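Since the question is asked from Python, the ordering that `union("data", dfm$variable)` produces can also be sketched on the Python side. This is only an illustration: the helper name `breaks_with_first` is hypothetical and is not part of rpy2 or ggplot2.

```python
def breaks_with_first(names, first="data"):
    """Return the unique names in order of appearance, with `first` moved to the front.

    Mirrors what R's union("data", x) returns for a vector of series names.
    """
    ordered = [first]
    for name in names:
        if name not in ordered:
            ordered.append(name)
    return ordered


if __name__ == "__main__":
    # The melted 'variable' column repeats each series name once per time point.
    print(breaks_with_first(["a_model", "a_model", "data", "data", "e_model"]))
```

The resulting list puts "data" first regardless of how the other series names sort.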

A second approach, probably simpler for getting consistent colors, is to change the order of the factor levels of variable:

dfm$variable <- relevel(dfm$variable, "data") 

ggplot(dfm,
       aes(x = mos_since_start,
           y = value,
           group = variable,
           colour = variable,
           shape = variable,
           linetype = variable)) +
  geom_line() +
  geom_point()
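A different technique from the two above, shown only as a hedged sketch, is to build a manual color mapping at plot time from whatever series names happen to be present, pinning "data" to a fixed color. This gets around the names not being known in advance. The function name `colour_map` and the color values are assumptions chosen for illustration.

```python
from itertools import cycle

# Hypothetical colours; any colour strings that R accepts would do.
DATA_COLOUR = "#000000"
OTHER_COLOURS = ["#E69F00", "#56B4E9", "#009E73", "#D55E00"]


def colour_map(series_names, pinned="data",
               pinned_colour=DATA_COLOUR, palette=OTHER_COLOURS):
    """Map each series name to a colour, always giving `pinned` the same colour."""
    mapping = {pinned: pinned_colour}
    colours = cycle(palette)  # reuse the palette if there are many series
    for name in sorted(set(series_names) - {pinned}):
        mapping[name] = next(colours)
    return mapping


if __name__ == "__main__":
    print(colour_map(["a_model", "data"]))
    print(colour_map(["data", "e_model"]))
```

In both calls "data" maps to the same color, whatever the other series are called. With rpy2, such a mapping could presumably be converted to a named `robjects.StrVector` and passed as `values` to `ggplot2.scale_colour_manual`; that step is an assumption and is not shown or tested here.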