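R ggplot2: overlaying multiple geom_ribbon objects in a single plot using nested loops

2015-08-28

I have five line plots, and I would like to output a shaded area representing the region between the upper and lower boundaries they trace. I am writing an R script (see below) because I have multiple datasets and need to repeat this exercise.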

However, I can only print the geom_ribbon from the last i and j pair; I cannot seem to output each geom_ribbon into the list I created.

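I would appreciate any ideas on how to get all the geom_ribbon objects into the list. print(Z) (example below) prints only one plot. If possible, I would like to overlay all the geom_ribbon objects and print them as a single ggplot.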

Z <- list() 
allmaxi <- list(cahp_max_plot15cb$decade_maxa, cahp_max_plot15cb$decade_maxc,cahp_max_plot15cb$decade_maxd, cahp_max_plot15cb$decade_maxe, cahp_max_plot15cb$decade_maxf) 
allmaxj <- list(cahp_max_plot15cb$decade_maxa, cahp_max_plot15cb$decade_maxc,cahp_max_plot15cb$decade_maxd, cahp_max_plot15cb$decade_maxe, cahp_max_plot15cb$decade_maxf) 
for (i in allmaxi) { 
    for (j in allmaxj) { 
    l <- geom_ribbon(data=cahp_max_plot15cb,aes(x=decade,ymin=i, ymax=j)) 
    Z[[length(Z) + 1]] <- l 
    print(i) 
    print(j) 
    }  
} 
print(ggplot() + Z) 

Sample output (from print(i) and print(j) in the script) when one dataset (decade_maxa) is fed into the i list and four other datasets into the j list:

[1] 2010.811 1723.783 1961.088 1662.909 1587.191 1662.140 1665.415 1602.974 1807.453 1586.106
[11] 1580.880 1685.253 1653.178 1824.842

[1] 1390.260 1247.700 1263.578 1711.638 1228.326 1762.045 1260.147 1171.914 1697.987 1350.867
[11] 1434.525 1488.818 1610.513 1536.895

[1] 2010.811 1723.783 1961.088 1662.909 1587.191 1662.140 1665.415 1602.974 1807.453 1586.106
[11] 1580.880 1685.253 1653.178 1824.842

[1] 1120.2700 1094.3047 1196.8792 1227.9660 1236.9170 1266.0935 1127.1480 974.6948 947.3365
[10] 1244.3242 1254.2704 1082.3667 1286.9080 1126.1943

[1] 2010.811 1723.783 1961.088 1662.909 1587.191 1662.140 1665.415 1602.974 1807.453 1586.106
[11] 1580.880 1685.253 1653.178 1824.842

[1] 1396.695 1425.073 1382.941 1913.495 1401.754 1499.763 1600.656 1367.043 1413.390 1343.804
[11] 1431.790 1402.292 1329.192 1696.729

[1] 2010.811 1723.783 1961.088 1662.909 1587.191 1662.140 1665.415 1602.974 1807.453 1586.106
[11] 1580.880 1685.253 1653.178 1824.842

[1] 1718.874 1389.134 1501.574 1233.189 1262.480 1508.919 1291.467 1431.869 1505.102 1376.519
[11] 1441.181 1421.552 1326.547 1635.599

> print(ggplot() + Z)

This is what I am aiming for (image not preserved). Perhaps there is a better way to draw the bands?

Here is the output image obtained by incorporating the median, as suggested below:

median_g <- group_by(cahp_max_plot15cbm, decade)
median_gm <- mutate(median_g, median=median(value))
p2 <- ggplot(median_gm) +
    geom_ribbon(aes(x=decade, ymin=median, ymax=value, group=variable), alpha=0.40, fill="#3985ff") +
    geom_line(aes(x=decade, y=value, group=variable, color=variable), lwd=1) +
    geom_point(aes(x=decade, y=median))
p2

image

+0

It appears that the variable values are not being passed into the list (only the names i and j): > Z [[1]] mapping: x = decade, ymin = i, ymax = j geom_ribbon: na.rm = FALSE stat_identity: position_identity: (width = NULL, height = NULL) [[2]] mapping: x = decade, ymin = i, ymax = j geom_ribbon: na.rm = FALSE stat_identity: position_identity: (width = NULL, height = NULL) [[3]] mapping: x = decade, ymin = i, ymax = j geom_ribbon: na.rm = FALSE stat_identity: position_identity: (width = NULL, height = NULL) ... – 2015-08-28 07:39:47

+0

Why do you need the loop? A better approach might be to reshape your data: more readable, and the plot becomes simpler to customise. – Heroka
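The symptom in the first comment is consistent with how aes() works: it stores the expressions i and j rather than their values, and they are only looked up when the plot is drawn, after the loops have finished. A minimal sketch of one way around this, assuming a hypothetical toy data frame df with series a, b and c (not the asker's data), is to give every layer its own data frame. It also iterates over unordered pairs instead of the full i-by-j loop, which is a simplification on my part.

```r
library(ggplot2)

# Hypothetical stand-in for the asker's data: three series over five decades.
set.seed(1)
df <- data.frame(decade = 1:5,
                 a = runif(5, 1000, 2000),
                 b = runif(5, 1000, 2000),
                 c = runif(5, 1000, 2000))
series <- c("a", "b", "c")

# One ribbon per unordered pair of series. Each layer carries its own
# data frame, so nothing depends on loop variables at draw time.
pairs <- combn(series, 2, simplify = FALSE)
ribbons <- lapply(pairs, function(p) {
  d <- data.frame(decade = df$decade,
                  ymin = pmin(df[[p[1]]], df[[p[2]]]),
                  ymax = pmax(df[[p[1]]], df[[p[2]]]))
  geom_ribbon(data = d, aes(x = decade, ymin = ymin, ymax = ymax),
              fill = "#3985ff", alpha = 0.2)
})

# A list of layers can be added to a plot in one step.
p <- ggplot() + ribbons
```

Printing p should then draw all the ribbons on one set of axes; the fill colour and alpha here are arbitrary choices.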

Answers

0

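I found this question interesting and wanted to see whether I could get an answer with simulated (but similar) data, because I like making plots. For illustration I have included a first approach that does not quite work as intended.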

library(ggplot2) 
library(reshape2) 
library(plyr) 
set.seed(123) 
data <- data.frame(decade=1:10) 
n=nrow(data) 
data$maxa <- runif(n,1000,2000) 
data$maxb <- runif(n,1000,2000) 
data$maxc <- runif(n,1000,2000) 
data$maxd <- runif(n,1000,2000) 
data$maxe <- runif(n,1000,2000) 

First approach: calculate the minimum and maximum per decade, and use these to build the ribbon

data$min <- apply(data[,-1],MARGIN=1,FUN=min) 
data$max <- apply(data[,-1],MARGIN=1,FUN=max) 

#reshape 
data_long <- melt(data, id.vars=c("decade","min","max")) 

#plot 
p1 <- ggplot(data_long) + 
    geom_ribbon(aes(x=decade,ymin=min,ymax=max),fill="#FFCCCC", alpha=0.3) + 
    geom_line(aes(x=decade,y=value,group=variable,col=variable),size=1) 

p1 

enter image description here

Does not give the intended result: the ribbon is drawn between the peaks.
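A small numeric check may help show why per-decade extremes over-cover the area. The two series below are hypothetical and chosen so that they cross halfway between two decades; the ribbon edge is assumed to be a straight line between consecutive decades, as it is when geom_ribbon is drawn.

```r
# Two hypothetical series that cross between decade 1 and decade 2.
x <- c(1, 2)
a <- c(0, 10)
b <- c(10, 0)

# Per-decade extremes, as in the first approach.
ymin <- pmin(a, b)
ymax <- pmax(a, b)

# Ribbon edges halfway between the two decades (straight-line interpolation).
ribbon_top    <- approx(x, ymax, xout = 1.5)$y
ribbon_bottom <- approx(x, ymin, xout = 1.5)$y

# True envelope of the two lines at the same position.
a_mid <- approx(x, a, xout = 1.5)$y
b_mid <- approx(x, b, xout = 1.5)$y
true_top    <- max(a_mid, b_mid)
true_bottom <- min(a_mid, b_mid)
```

Here the ribbon spans 0 to 10 at x = 1.5, while both lines pass through 5 at that point, so the ribbon shades area that lies outside the lines wherever they cross between decades.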

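Second approach; should work for data that is not too extreme: find the median for each decade and use it as the ymin of the ribbon. The ymax is the value in the melted dataset.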

#find median 
data_long <- ddply(data_long,.(decade),transform, median=median(value)) 

#plot. Quick hex-color and no alpha because the ribbons overlap, and that becomes visible with alpha. 
p2 <- ggplot(data_long) + geom_ribbon(aes(x=decade, ymin=median,ymax=value,group=variable),fill="#FFCCCC")+ 
    geom_line(aes(x=decade,y=value,group=variable,col=variable),size=1) 

p2 

enter image description here

Works!
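If plyr is not available, the per-decade median can also be computed in base R with ave(). This is only a sketch on a hypothetical long-format data frame (the column names mirror the melted data above, but the values are simulated here).

```r
# Hypothetical long-format data: three series over four decades.
set.seed(123)
long <- data.frame(decade   = rep(1:4, times = 3),
                   variable = rep(c("maxa", "maxb", "maxc"), each = 4),
                   value    = runif(12, 1000, 2000))

# ave() applies FUN within each decade and returns a vector aligned with
# the rows of `long`, so it can be stored directly as a new column.
long$median <- ave(long$value, long$decade, FUN = median)
```

Each row now carries the median of its own decade, which is the quantity used as ymin in the ribbon above.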

+0

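This is a very nice, concise approach! I applied it to my data, as shown in the image above.* However, I noticed that if an alpha value is included (an addition to my original question) to show the overlap, there appear to be some shading artifacts close to the median values (the median points are plotted in the image above). I will explore further to see whether these can be eliminated. * Image: I inserted the image into my original question, since it seems I cannot post images in comments here. – Richard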

+0

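I also ran into the alpha-and-overlap issue (which is fairly expected if you think about it), so I picked a similar colour but kept alpha at 1. You can use a colour website/tool to find the code for the fill colour you want. – Heroka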
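The darker shading near the median can be put into numbers. Assuming the graphics device blends layers with ordinary "source-over" compositing (an assumption, although it is the usual behaviour), k overlapping ribbons that each use the same alpha have an effective opacity of 1 - (1 - alpha)^k. The helper name stacked_alpha is made up for this sketch.

```r
# Effective opacity of k stacked layers that each use the same alpha,
# under ordinary "source-over" compositing (assumed here).
stacked_alpha <- function(alpha, k) 1 - (1 - alpha)^k

# Opacity where 1, 2, ..., 5 ribbons with alpha = 0.40 overlap.
opacity <- stacked_alpha(0.40, 1:5)
round(opacity, 3)
```

Two overlapping ribbons at alpha 0.40 already give an opacity of 0.64, which is why regions covered by several ribbons look visibly darker than regions covered by one.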

1

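Here is a slightly over-engineered solution: find the intersections of all segments, add those abscissae to the mix, and find the minimum and maximum for each x.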
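For reference, the segment-segment intersection used in the code can be written out as follows. This is a restatement of the formulas as they appear in the ssi function, for a segment from (x_1, y_1) to (x_2, y_2) and a segment from (x_3, y_3) to (x_4, y_4); the symbols u_a, u_b and D correspond to ua, ub and denom in the code.

```latex
\begin{aligned}
D   &= (y_4 - y_3)(x_2 - x_1) - (x_4 - x_3)(y_2 - y_1) \\
u_a &= \frac{(x_4 - x_3)(y_1 - y_3) - (y_4 - y_3)(x_1 - x_3)}{D} \\
u_b &= \frac{(x_2 - x_1)(y_1 - y_3) - (y_2 - y_1)(x_1 - x_3)}{D} \\
x   &= x_1 + u_a (x_2 - x_1), \qquad y = y_1 + u_a (y_2 - y_1)
\end{aligned}
```

The segments are parallel when D is (numerically) zero, and the intersection point lies on both segments only when 0 <= u_a <= 1 and 0 <= u_b <= 1, which is the `inside` test in the code.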

# some segment-segment intersection code 
# http://paulbourke.net/geometry/pointlineplane/ 
ssi <- function(x1, x2, x3, x4, y1, y2, y3, y4){ 

    denom <- ((y4 - y3)*(x2 - x1) - (x4 - x3)*(y2 - y1)) 
    denom[abs(denom) < 1e-10] <- NA # parallel lines 

    ua <- ((x4 - x3)*(y1 - y3) - (y4 - y3)*(x1 - x3))/denom 
    ub <- ((x2 - x1)*(y1 - y3) - (y2 - y1)*(x1 - x3))/denom 

    x <- x1 + ua * (x2 - x1) 
    y <- y1 + ua * (y2 - y1) 
    inside <- (ua >= 0) & (ua <= 1) & (ub >= 0) & (ub <= 1) 
    data.frame(x = ifelse(inside, x, NA), 
      y = ifelse(inside, y, NA)) 

} 
# do it with two polylines (xy dataframes) 
ssi_polyline <- function(l1, l2){ 
    n1 <- nrow(l1) 
    n2 <- nrow(l2) 
    stopifnot(n1==n2) 
    x1 <- l1[-n1,1] ; y1 <- l1[-n1,2] 
    x2 <- l1[-1L,1] ; y2 <- l1[-1L,2] 
    x3 <- l2[-n2,1] ; y3 <- l2[-n2,2] 
    x4 <- l2[-1L,1] ; y4 <- l2[-1L,2] 
    ssi(x1, x2, x3, x4, y1, y2, y3, y4) 
} 
# testing the above 
d1 <- cbind(seq(1, 10), rnorm(10)) 
d2 <- cbind(seq(1, 10), rnorm(10)) 
plot(rbind(d1, d2), t="n") 
lines(d1) 
lines(d2, col=2) 
points(ssi_polyline(d1, d2)) 

# do it with all columns of a matrix (common xs assumed) 
# the general case (different xs) could be treated similarly 
# e.g by doing first a linear interpolation at all unique xs 
ssi_matrix <- function(x, m){ 
    # pairwise combinations 
    cn <- combn(ncol(m), 2) 
    test_pair <- function(i){
        l1 <- cbind(x, m[,cn[1,i]])
        l2 <- cbind(x, m[,cn[2,i]])
        pts <- ssi_polyline(l1, l2)
        pts[complete.cases(pts),]
    }
    ints <- lapply(seq_len(ncol(cn)), test_pair) 
    do.call(rbind, ints) 

} 
# testing this on a matrix 
m <- replicate(5, rnorm(10)) 
x <- seq_len(nrow(m)) 
matplot(x, m, t="l", lty=1) 
test <- ssi_matrix(x, m) 
points(test) 

# now, apply this to the dataset at hand 

library(ggplot2) 
library(reshape2) 
library(plyr) 
set.seed(123) 
data <- data.frame(decade=1:10) 
n=nrow(data) 
data$maxa <- runif(n,1000,2000) 
data$maxb <- runif(n,1000,2000) 
data$maxc <- runif(n,1000,2000) 
data$maxd <- runif(n,1000,2000) 
data$maxe <- runif(n,1000,2000) 

newpoints <- setNames(data.frame(ssi_matrix(data$decade, data[,-1L]), 
           "added"), c("decade", "value", "variable")) 
mdata <- melt(data, id=1L) 

interpolated <- ddply(mdata, "variable", function(d){ 
    xy <- approx(d$decade, d$value, xout=newpoints[,1]) 
    data.frame(decade = xy$x, value=xy$y, variable = "interpolated") 
}) 

all <- rbind(mdata, interpolated, newpoints) 

rib <- ddply(all, "decade", summarise, 
      ymin=min(value), ymax=max(value)) 

ggplot(mdata, aes(decade)) + 
    geom_ribbon(data = rib, aes(x=decade, ymin=ymin, ymax=ymax), 
       alpha=0.40,fill="#3985ff")+ 
    geom_line(aes(y=value, colour=variable)) 

enter image description here
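The effect of adding the crossing points can be checked on a toy case. The sketch below uses two hypothetical series that cross between two decades and solves for the crossing directly, rather than calling the ssi functions above, so that it runs on its own.

```r
# Two hypothetical series that cross between decade 1 and decade 2.
x <- c(1, 2)
a <- c(0, 10)
b <- c(10, 0)

# Crossing of the two segments: solve a1 + t*(a2 - a1) = b1 + t*(b2 - b1).
t <- (b[1] - a[1]) / ((a[2] - a[1]) - (b[2] - b[1]))
x_cross <- x[1] + t * (x[2] - x[1])

# Evaluate both series at the original abscissae plus the crossing,
# then take the extremes at every x.
xs  <- sort(c(x, x_cross))
a_i <- approx(x, a, xout = xs)$y
b_i <- approx(x, b, xout = xs)$y
rib <- data.frame(x = xs, ymin = pmin(a_i, b_i), ymax = pmax(a_i, b_i))
```

With the crossing at x = 1.5 included, the ribbon pinches to zero width there (ymin and ymax are both 5) instead of spanning the full range between the decade extremes.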