2017-04-05

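R dplyr group, ungroup, top_n and ggplot

I have an object with several columns, including city, state, year and number of murders. I use dplyr to group it by city and compute the top 10 cities by total murders over all years, like this: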

MurderNb_reshaped2 %>% 
    select(city, state, Year, Murders) %>% 
    group_by(city) %>% 
    summarise(total = sum(Murders)) %>% 
    top_n(10, total) %>% 
    ggplot(aes(x = Year, y = Murders, fill = "red")) + 
    geom_histogram(stat = "identity") + 
    facet_wrap(~city) 

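I would like to plot this for the top 10 cities only, but `x = Year` is not found because the data has been grouped by city. Can anyone explain how I can do this?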

Edit: the original source data is at https://interactive.guim.co.uk/2017/feb/09/gva-data/UCR-1985-2015.csv and here is my code:

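library(reshape2)  # assumed source of melt(); the post does not say which package was used
library(dplyr)
library(ggplot2)

Deaths <- read.csv("UCR-1985-2015.csv", stringsAsFactors = FALSE) 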
MurderRate <- Deaths[, -c(5:35)] 
MurderNb <- Deaths[, -c(36:66)] 
colnames(MurderNb) <- gsub("X", "", colnames(MurderNb)) 
colnames(MurderNb) <- gsub("_raw_murder_num", "", colnames(MurderNb)) 

MurderNb_reshaped <- melt(MurderNb, id = c("city", "Agency", "state", "state_short")) 
colnames(MurderNb_reshaped) <- c("city", "Agency", "state", "state_short", "Year", "Murders") 


MurderNb_reshaped2 <- MurderNb_reshaped 

MurderNb_reshaped2 %>% 
    select(city, state, Year, Murders) %>% 
    group_by(city) %>% 
    summarise(total = sum(Murders)) %>% 
    top_n(10, total) %>% 
    ggplot(aes(x = Year, y = Murders, fill = "red")) + 
    geom_bar(stat = "identity") + 
    facet_wrap(~city) 

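I think you want `geom_bar` rather than a histogram, since you have two dimensions (year + murders). If you need the year in your plot, you may also need to include it as a grouping variable.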

Thanks. Agreed on geom_bar, but won't including the year as a grouping variable prevent me from using top_n correctly? – Romain

Show us a small sample of your data, along with your code, to get a better answer.

Answers

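OK, there are a couple of small problems. After summarise(), the data has only one row per city, with the columns city and total, so Year and Murders no longer exist by the time ggplot() is called. This should do the trick: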

#this gives you the top cities 
topCities <- MurderNb_reshaped2 %>% 
    select(city, state, Year, Murders) %>% 
    group_by(city) %>% 
    summarise(total = sum(Murders)) %>% 
    top_n(10, total) 

#you then need to filter your original data to be only the data for the top cities 
MurderNb_reshaped2 <- filter(MurderNb_reshaped2, city %in% topCities$city) 

ggplot(data = MurderNb_reshaped2, aes(x = Year, y = Murders)) + 
    # a constant colour goes outside aes(); inside aes() "red" is treated as a data value
    geom_bar(stat = "identity", fill = "red") + 
    facet_wrap(~city) 
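
If it helps to see why `Year` disappears, here is a minimal sketch on made-up data. The `toy` data frame and the names `totals` and `top_rows` are hypothetical and are not part of the original post. It also shows one possible variant that keeps the yearly rows of the top cities with `semi_join()` instead of `filter()`. This is an alternative under those assumptions, not a replacement for the code above.

```r
library(dplyr)

# Toy data, invented for illustration: 3 cities, 2 years each
toy <- data.frame(
  city    = rep(c("A", "B", "C"), each = 2),
  Year    = rep(c("1985", "1986"), times = 3),
  Murders = c(10, 20, 1, 2, 5, 5),
  stringsAsFactors = FALSE
)

# summarise() returns one row per city, so Year and Murders are gone
totals <- toy %>%
  group_by(city) %>%
  summarise(total = sum(Murders))
names(totals)  # "city" "total"

# Keep the yearly rows of the top 2 cities only (top 10 in the real data)
top_rows <- toy %>%
  semi_join(top_n(totals, 2, total), by = "city")
```

`top_rows` still has the `Year` and `Murders` columns, so it can be passed to `ggplot()` in the same way as the filtered data above.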

Great, thanks a lot, it does exactly what I need! – Romain