2017-11-04 14 views

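How to change the representation of a single value from a bar to a line?

I am working with a data set on the nutritional content of staple grains that I found on Wikipedia. I scraped the data table using the rvest package and created the graph shown below.

[Figure: Nutrient Content of Major Staple Foods]

It was pointed out to me that it might be better to represent the "Recommended Dietary Allowance" (RDA) with a vertical line rather than a bar.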

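1) How can I create a separate vertical line representing the "Recommended Dietary Allowance"?

The code used to create the graph is below. I was not sure whether I should also include the code used to scrape and wrangle the data. Please let me know if that would help.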

ggplot(grain.nut, aes(grain, nutrients, fill = grain)) + 
    facet_wrap(~ nutrient.component., scales = "free") + 
    geom_bar(stat = "identity", position = "dodge") + 
    coord_flip() + 
    labs(title = "Nutrient Content of Major Staple Foods per 100 gram Portion", 
     caption = "https://en.wikipedia.org/wiki/Staple_food#Nutritional_content") + 
    theme(plot.title = element_text(size = 30, face = "bold")) + 
    theme(axis.text.y = element_blank()) + 
    theme(axis.ticks.y = element_blank()) + 
    theme(panel.grid.major.y = element_blank()) + 
    theme(panel.grid.minor.y = element_blank()) + 
    theme(axis.title = element_blank()) + 
    theme(legend.position = c(0.80,0.05), legend.direction = "horizontal") + 
    theme(legend.title = element_blank()) + 
    theme(plot.caption = element_text(hjust = 0.84)) + 
    guides(fill=guide_legend(reverse=TRUE)) + 
    scale_fill_manual(values = c("#e70000", 
           "#204bcc", 
           "#68ca3b", 
           "#fe9bff", 
           "#518901", 
           "#de0890", 
           "#fcba4c", 
           "#292c7a", 
           "#e69067", 
           "#79b5ff", 
           "#68272d", 
           "#c9cb6c")) 

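I have tried using geom_vline as well as geom_hline. I think my problem is that I am trying to pick out the RDA value through levels(grain.nut$grain)[1], whose output is the label "Recommended Dietary Allowance" rather than a number.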

geom_vline(aes(xintercept = levels(grain.nut$grain)[1])) 

Any help would be greatly appreciated!

Answer

Here is an approach using geom_linerange/geom_pointrange.

First, the data:

library("rvest") 
library(tidyverse) 
url <- "https://en.wikipedia.org/wiki/Staple_food" 
nutrient <- url %>% 
    read_html() %>% 
    html_nodes(xpath='//*[@id="mw-content-text"]/div/table[2]') %>% 
    html_table() 

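To get the levels of the discrete scale in the correct order (z is not defined in the original answer; it is assumed here to be the long-format table that the plot below also builds):

z <- nutrient[[1]] %>% as_tibble() %>% gather(grain, value, 2:ncol(.)) 
lev <- levels(as.factor(z$grain))[c(1:4, 6:12, 5)] 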
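
The index vector c(1:4, 6:12, 5) depends on where "RDA" happens to sort among the column names. A name-based variant moves the reference level to the end regardless of its position. It is sketched here with a hypothetical handful of labels rather than the real table:

```r
# Hypothetical labels; the real ones come from the scraped table's column names.
grain <- c("Maize", "RDA", "Rice", "Wheat", "Cassava")

# as.factor() sorts the levels alphabetically.
lev_sorted <- levels(as.factor(grain))

# Keep every level except "RDA" in sorted order, then append "RDA" last.
lev_by_name <- c(setdiff(lev_sorted, "RDA"), "RDA")
print(lev_by_name)
```

The resulting vector can be passed to scale_x_discrete(limits = ...) in the same way as lev.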

The plot:

ggplot() + 
    geom_col(data = nutrient[[1]] %>% 
        as_tibble() %>% 
        gather(grain, value, 2:ncol(.)) %>% 
        filter(grain!="RDA") %>% 
        mutate(nutrient = `Nutrient component:`, 
          value = as.numeric(value)), aes(grain, value, fill = grain), position = "dodge")+ 
    geom_pointrange(data = nutrient[[1]] %>% 
        as_tibble() %>% 
        gather(grain, value, 2:ncol(.)) %>% 
        filter(grain=="RDA") %>% 
        mutate(nutrient = `Nutrient component:`, 
          value = as.numeric(value)), aes(x = grain, ymin = 0, ymax = value, y = value, color = grain), size = 0.3, show.legend = F)+ 
    facet_wrap(~ nutrient, scales = "free") + 
    scale_x_discrete(limits = lev) + 
    coord_flip() + 
    labs(title = "Nutrient Content of Major Staple Foods per 100 gram Portion", 
     caption = "https://en.wikipedia.org/wiki/Staple_food#Nutritional_content") + 
    theme(plot.title = element_text(size = 30, face = "bold")) + 
    theme(axis.text.y = element_blank()) + 
    theme(axis.ticks.y = element_blank()) + 
    theme(panel.grid.major.y = element_blank()) + 
    theme(panel.grid.minor.y = element_blank()) + 
    theme(axis.title = element_blank()) + 
    theme(legend.position = c(0.80,0.05), legend.direction = "horizontal") + 
    theme(legend.title = element_blank()) + 
    theme(plot.caption = element_text(hjust = 0.84)) + 
    guides(fill=guide_legend(reverse=TRUE)) + 
    scale_fill_manual(values = c("#e70000", 
           "#204bcc", 
           "#68ca3b", 
           "#fe9bff", 
           "#518901", 
           "#de0890", 
           "#fcba4c", 
           "#292c7a", 
           "#e69067", 
           "#79b5ff", 
           "#68272d", 
           "#c9cb6c")) 

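Basically, two layers are used with different data: geom_col with the data without RDA, and geom_pointrange with the RDA data only. The order in scale_x_discrete is changed to match the lev object.

If you do not like the point, use geom_linerange and omit y from the aes call.

Or did you mean this?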
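
As a minimal sketch of the geom_linerange variant: the numbers below are made up for illustration and are not taken from the Wikipedia table, and the data frame names bars and rda are hypothetical. The layer needs only x, ymin and ymax, so no y is mapped:

```r
library(ggplot2)

# Made-up values for illustration only; not taken from the Wikipedia table.
bars <- data.frame(grain = c("Maize", "Rice", "Wheat"),
                   value = c(9, 7, 13))
rda <- data.frame(grain = "RDA", value = 50)

p_linerange <- ggplot() +
  # Bars for the grains only.
  geom_col(data = bars, aes(grain, value, fill = grain)) +
  # A plain line from 0 to the RDA value; no point, so no y aesthetic.
  geom_linerange(data = rda, aes(x = grain, ymin = 0, ymax = value)) +
  # Put the RDA entry last on the discrete axis.
  scale_x_discrete(limits = c("Maize", "Rice", "Wheat", "RDA")) +
  coord_flip()
```

Printing p_linerange draws the plot. Because of coord_flip(), the line runs horizontally in the slot where the RDA bar used to be.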

ggplot() + 
    geom_col(data = nutrient[[1]] %>% 
      as_tibble() %>% 
      gather(grain, value, 2:ncol(.)) %>% 
      filter(grain!="RDA") %>% 
      mutate(nutrient = `Nutrient component:`, 
        value = as.numeric(value)), aes(grain, value, fill = grain), position = "dodge")+ 
    geom_hline(data = nutrient[[1]] %>% 
        as_tibble() %>% 
        gather(grain, value, 2:ncol(.)) %>% 
        filter(grain=="RDA") %>% 
        mutate(nutrient = `Nutrient component:`, 
          value = as.numeric(value)), aes(yintercept = value), show.legend = F)+ 
    facet_wrap(~ nutrient, scales = "free") + 
    coord_flip() + 
    labs(title = "Nutrient Content of Major Staple Foods per 100 gram Portion", 
     caption = "https://en.wikipedia.org/wiki/Staple_food#Nutritional_content") + 
    theme(plot.title = element_text(size = 30, face = "bold")) + 
    theme(axis.text.y = element_blank()) + 
    theme(axis.ticks.y = element_blank()) + 
    theme(panel.grid.major.y = element_blank()) + 
    theme(panel.grid.minor.y = element_blank()) + 
    theme(axis.title = element_blank()) + 
    theme(legend.position = c(0.80,0.05), legend.direction = "horizontal") + 
    theme(legend.title = element_blank()) + 
    theme(plot.caption = element_text(hjust = 0.84)) + 
    guides(fill=guide_legend(reverse=TRUE)) + 
    scale_fill_manual(values = c("#e70000", 
           "#204bcc", 
           "#68ca3b", 
           "#fe9bff", 
           "#518901", 
           "#de0890", 
           "#fcba4c", 
           "#292c7a", 
           "#e69067", 
           "#79b5ff", 
           "#68272d", 
           "#c9cb6c")) 
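
Two things make the geom_hline version work. coord_flip() swaps the axes, so a yintercept is drawn as a vertical line, and because the RDA data frame carries the faceting column, each panel gets its own line. A reduced sketch of that idea, with made-up numbers and hypothetical data frame names (bars, rda):

```r
library(ggplot2)

# Made-up values for illustration only; not taken from the Wikipedia table.
bars <- data.frame(
  nutrient = rep(c("Protein (g)", "Fiber (g)"), each = 3),
  grain    = rep(c("Maize", "Rice", "Wheat"), times = 2),
  value    = c(9, 7, 13, 7, 1, 12)
)
# One reference value per facet; the nutrient column matches the facet variable.
rda <- data.frame(nutrient = c("Protein (g)", "Fiber (g)"),
                  value    = c(50, 30))

p_hline <- ggplot(bars, aes(grain, value, fill = grain)) +
  geom_col() +
  # yintercept is mapped from the data, so each panel receives its own line.
  geom_hline(data = rda, aes(yintercept = value), linetype = "dashed") +
  facet_wrap(~ nutrient, scales = "free") +
  # After flipping, the "horizontal" line appears vertical.
  coord_flip()
```

With scales = "free", each panel's value axis stretches to include its reference line, so the bars can look short wherever the reference value is much larger than the data.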

The second plot is exactly the solution I was looking for. Thank you! – RunAmuck
