How can I color a line plot by grouping a variable in R?

I have produced a line plot that looks something like this:

Generated using ggplot2

I have a dataset of 50 countries and their GDP over the past 10 years.
Sample data:

Country variable value 
China Y2007 3.55218e+12 
USA  Y2007 1.45000e+13 
Japan Y2007 4.51526e+12 
UK  Y2007 3.06301e+12 
Russia Y2007 1.29971e+12 
Canada Y2007 1.46498e+12 
Germany Y2007 3.43995e+12 
India Y2007 1.20107e+12 
France Y2007 2.66311e+12 
SKorea Y2007 1.12268e+12

The code I used:

GDP_lineplot = ggplot(data=GDP_linechart, aes(x=variable,y=value)) + 
    geom_line() + 
    scale_y_continuous(name = "GDP(USD in Trillions)", 
        breaks = c(0.0e+00,5.0e+12,1.0e+13,1.5e+13), 
        labels = c(0,5,10,15)) + 
    scale_x_discrete(name = "Years", labels = c(2007,"",2009,"",2011,"",2013,"",2015))

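The idea is to make the line plot look something like the target image. How can I plot the colors?

I tried adding

group = Country, color = Country

but the output gives every country its own color.

How can I color the top 4 countries individually and the remaining countries as one group?

PS: I am still new to R.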

Source

2017-04-07 Kishan J

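'ggplot(data = GDP_linechart, aes(x = variable, y = value, color = Country)) + ...' should do it. – emilliman5

Yes! But since I have 50 countries, that assigns 50 different colors. I need distinct colors for the top 4 countries and grey for all the others (see https://i.stack.imgur.com/sAhZM.png). Thanks!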

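When subsets are plotted separately, the "Other" group is not included in the colour legend on the right. The alternative approach below manipulates factor levels and uses a custom colour scale to overcome this.

Prepare data

It is assumed that GDP_long contains the data in long format. This is in line with the data shown by the OP (GDP_linechart, but see the Data section below for the differences). To manipulate the factor levels, the forcats package is used (together with data.table).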

library(data.table) 
library(forcats) 
# coerce to data.table, reorder factor levels by the value in the last (= most recent) year
# (this relies on the rows of each country being sorted by Year)
setDT(GDP_long)[, Country := fct_reorder(Country, -value, .fun = last)] 
# create a new factor that collapses all countries except the top 4 into "Other" 
GDP_long[, top_country := fct_other(Country, keep = head(levels(Country), 4))]
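
For readers who would rather not load forcats, the same two steps (rank by the most recent year, then collapse everything outside the top 4) can be sketched in base R. The data frame gdp_toy and its values are invented for illustration; only the column names follow the answer.

```r
# Toy long-format data (values invented for illustration only)
gdp_toy <- data.frame(
  Country = rep(c("A", "B", "C", "D", "E", "F"), times = 2),
  Year    = rep(c(2014L, 2015L), each = 6),
  value   = c(5, 9, 1, 7, 3, 2,    # 2014
              6, 10, 1, 8, 4, 2),  # 2015
  stringsAsFactors = FALSE
)

# Rank countries by their value in the most recent year
latest <- gdp_toy[gdp_toy$Year == max(gdp_toy$Year), ]
top4   <- latest$Country[order(-latest$value)][1:4]

# Keep the top 4 as their own levels, collapse everything else to "Other"
gdp_toy$top_country <- factor(
  ifelse(gdp_toy$Country %in% top4, gdp_toy$Country, "Other"),
  levels = c(top4, "Other")
)

levels(gdp_toy$top_country)
# "B" "D" "A" "E" "Other"
```

Putting "Other" last in the level order matters because scale_colour_manual() matches an unnamed values vector to the levels by position, so the final colour ("grey" in the answer) lands on the pooled group.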

Create plot

library(ggplot2) 
ggplot(GDP_long, aes(Year, value/1e12, group = Country, colour = top_country)) + 
    geom_point() + geom_line(size = 1) + theme_bw() + ylab("GDP(USD in Trillions)") + 
    scale_colour_manual(name = "Country", 
         values = c("green3", "orange", "blue", "red", "grey"))

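The resulting chart is now quite similar to the expected result. The lines of the top 4 countries are drawn in distinct colours, while the other countries are drawn in grey and still appear in the colour legend on the right.

Note that the group aesthetic is still required so that one line is drawn per country, while colour is controlled by the levels of top_country.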
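
The split between group and colour can be checked on a small synthetic example. This is only a sketch: the data frame toy, the column highlight, and the choice of "A" and "B" as highlighted countries are assumptions made for illustration, not part of the answer.

```r
library(ggplot2)

set.seed(1)  # synthetic data, for illustration only
toy <- data.frame(
  Country = rep(LETTERS[1:6], each = 3),
  Year    = rep(2013:2015, times = 6),
  value   = runif(18),
  stringsAsFactors = FALSE
)

# Highlight two countries and pool the rest (hypothetical choice)
toy$highlight <- factor(
  ifelse(toy$Country %in% c("A", "B"), toy$Country, "Other"),
  levels = c("A", "B", "Other")
)

p <- ggplot(toy, aes(Year, value, group = Country, colour = highlight)) +
  geom_line() +
  scale_colour_manual(name = "Country",
                      values = c(A = "red", B = "blue", Other = "grey"))

# Inspect what ggplot2 will actually draw
built <- ggplot_build(p)$data[[1]]
length(unique(built$group))   # number of separate lines
length(unique(built$colour))  # number of distinct colours
```

If group = Country is dropped, ggplot2 falls back to grouping by the discrete variables in the mapping, which here is only highlight, so all "Other" countries would be joined into a single zigzag line.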

Data

The data set is too large to be reproduced here (even with dput()). The structure

str(GDP_long) 
'data.frame': 1763 obs. of 3 variables: 
$ Country: chr "Afghanistan" "Albania" "Algeria" "Andorra" ... 
$ Year : int 2007 2007 2007 2007 2007 2007 2007 2007 2007 2007 ... 
$ value : num 9.84e+09 1.07e+10 1.35e+11 4.01e+09 6.04e+10 ...

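is similar to the OP's data, except that the variable column has already been converted to an integer column Year. This gives a nicely formatted x-axis without additional effort.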
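
One possible way to get from the OP's variable column ("Y2007", ...) to an integer Year column is sketched below in base R. The answer does not show how its own conversion was done, so this is an assumption; the three rows are taken from the OP's sample data.

```r
# Three rows from the OP's sample data
GDP_long <- data.frame(
  Country  = c("China", "USA", "Japan"),
  variable = c("Y2007", "Y2007", "Y2007"),
  value    = c(3.55218e+12, 1.45000e+13, 4.51526e+12),
  stringsAsFactors = FALSE
)

# Strip the leading "Y" and convert to integer; sub() also accepts a factor,
# which it coerces to character first
GDP_long$Year <- as.integer(sub("^Y", "", GDP_long$variable))
GDP_long$variable <- NULL

str(GDP_long)

# The same conversion on labels from other years
as.integer(sub("^Y", "", c("Y2007", "Y2011", "Y2015")))
# 2007 2011 2015
```

With an integer Year, ggplot2 uses a continuous x-axis, so the hand-written scale_x_discrete() labels from the question are no longer needed.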

Source

2017-04-09 13:21:35 Uwe

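This is exactly the answer I was looking for. I was sceptical about my data because two of its variables are factors and I expected to get some errors, but it went through. A good lesson for me.

My apologies, I missed the part about coloring only a subset of the countries... In each geom_line call you can pass the subset of the data that fits your needs.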

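library(ggplot2)

set.seed(1)  # make the random example reproducible
# 10 countries (A-J) with 5 years each
df <- data.frame(Country = rep(LETTERS[1:10], each = 5), 
    Year = rep(2007:2011, times = 10), 
    value = rnorm(50)) 

ggplot(df) + 
    # countries E-J (rows 21-50): one grey line per country
    geom_line(data = df[21:50, ], aes(x = Year, y = value, group = Country), color = "#999999") + 
    # countries A-D (rows 1-20): one coloured line per country
    geom_line(data = df[1:20, ], aes(x = Year, y = value, color = Country))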

Source

2017-04-07 20:55:46 emilliman5

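Thanks for the response! You used geom_line(data = df[1:20, ] ...), which works where the countries are in alphabetical order, but in my data the countries are not arranged conveniently. The first column would have to contain 10 rows for China and 10 for USA ("10" because I have 10 years of data). Should I manipulate the data now? How would I do that? It looks complicated.

You can subset like this: 'GDP_linechart[GDP_linechart$Country %in% c("Japan", "China", "USA", "Germany"), ]' and 'GDP_linechart[!GDP_linechart$Country %in% c("Japan", "China", "USA", "Germany"), ]' – emilliman5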
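
Combining the name-based subsetting from this comment with the two-layer plot from the answer might look like the sketch below. The GDP values are random placeholders, the Year column is assumed to be an integer, and the names top and is_top are introduced here for illustration only.

```r
library(ggplot2)

set.seed(42)  # placeholder values, for illustration only
countries <- c("China", "USA", "Japan", "Germany", "UK", "France", "India")
GDP_linechart <- data.frame(
  Country = rep(countries, each = 3),
  Year    = rep(2013:2015, times = length(countries)),
  value   = runif(3 * length(countries)),
  stringsAsFactors = FALSE
)

# Select rows by country name, so the row order of the data does not matter
top    <- c("Japan", "China", "USA", "Germany")
is_top <- GDP_linechart$Country %in% top

p <- ggplot() +
  # all other countries: one grey line each, no legend entry
  geom_line(data = GDP_linechart[!is_top, ],
            aes(x = Year, y = value, group = Country), colour = "#999999") +
  # highlighted countries: one coloured line each
  geom_line(data = GDP_linechart[is_top, ],
            aes(x = Year, y = value, colour = Country))
```

Because the selection is a logical vector computed from Country, the data does not need to be sorted or rearranged first, which addresses the concern raised in the previous comment.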
