2016-02-29 81 views
1

我有覆盖绘图的问题。R ggmap,覆盖绘图,点覆盖对方

情况看起来像这样:我有一些坐标和不同地名的数据,有些地方是在同一个地方 - 所以我有几个地名相同的坐标。如何绘制它们,让它们不会互相遮掩?我尝试过不同的形状,最好的选择是散布这些点,或者用少量的颜色绘制一个点。但我不知道如何去做。我会感谢任何帮助。

代码示例:

require(rgdal) 
require(ggmap) 
require(maptools) 
require (plyr) 

swd <- structure(list(nazwa = structure(c(8L, 8L, 9L, 7L, 7L, 7L, 3L, 
            5L, 6L, 4L, 2L, 2L, 1L), .Label = c("ODDZIAŁ CHIRURGII ONKOLOGICZNEJ", 
                     "ODDZIAŁ GINEKOLOGII ONKOLOGICZNEJ", "ODDZIAŁ ONKOLOGICZNY", 
                     "ODDZIAŁ ONKOLOGII I HEMATOLOGII DZIECIĘCEJ", "ODDZIAŁ ONKOLOGII KLINICZNEJ CHEMIOTERAPII", 
                     "ODDZIAŁ RADIOTERAPII", "PORADNIA CHIRURGII ONKOLOGICZNEJ", "PORADNIA ONKOLOGICZNA", 
                     "PORADNIA RADIOTERAPII"), class = "factor"), miasto = structure(c(8L, 
                                     8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L), .Label = c("DZIAŁDOWO", 
                                                    "ELBLĄG", "EŁK", "GIŻYCKO", "MRĄGOWO", "NOWE MIASTO LUBAWSKIE", 
                                                    "OLECKO", "OLSZTYN", "OSTRÓDA", "PISZ", "SZCZYTNO"), class = "factor"), 
       dom = structure(c(17L, 5L, 17L, 17L, 8L, 18L, 5L, 17L, 17L, 
           20L, 17L, 19L, 17L), .Label = c("BARANKI 24", "GNIEŹNIEŃSKA 2", 
                   "GOŁDAPSKA 1", "HENRYKA SIENKIEWICZA 4", "JAGIELLOŃSKA 78", 
                   "JANA III SOBIESKIEGO 3 C/44", "KONOPNICKIEJ 1", "KOPERNIKA 30", 
                   "KOŚCIUSZKI 30", "KRÓLEWIECKA 146", "KRÓLEWIECKA 146 146", 
                   "LEŚNA 1", "MICKIEWICZA 10", "MICKIEWICZA 14", "OSEDLE MAZURSKIE 33 A", 
                   "WARSZAWSKA 41", "WOJSKA POLSKIEGO 37", "ŻOŁNIERSKA 16B", 
                   "ŻOŁNIERSKA 18", "ŻOŁNIERSKA 18 A"), class = "factor"), Lat = c(53.794077, 
                                   53.80182, 53.794077, 53.794077, 53.7827025, 53.7688275, 53.80182, 
                                   53.794077, 53.794077, 53.7696245, 53.794077, 53.7698809, 
                                   53.794077), Long = c(20.483249, 20.508952, 20.483249, 20.483249, 
                                         20.4918876, 20.4903438, 20.508952, 20.483249, 20.483249, 
                                         20.4927874, 20.483249, 20.492049, 20.483249)), .Names = c("nazwa", 
                                                       "miasto", "dom", "Lat", "Long"), row.names = c(1L, 2L, 12L, 13L, 
                                                                   14L, 15L, 23L, 25L, 27L, 29L, 30L, 31L, 32L), class = "data.frame") 
polska <- get_googlemap(
    center =c('Olsztyn, Polska'), 
    zoom=12, 
    maptype="roadmap" , 
    scale = 2 
    ,color = "bw" 
) 
kontury<- ggmap(polska) 




punkty <- kontury+ geom_point(aes(x=Long, y=Lat, color=nazwa, shape=nazwa) 
           ,data=subset(swd,( nazwa=='ODDZIAŁ GINEKOLOGII ONKOLOGICZNEJ'| 
                nazwa=='PORADNIA CHIRURGII ONKOLOGICZNEJ'| 
                nazwa=='ODDZIAŁ ONKOLOGII KLINICZNEJ CHEMIOTERAPII'| 
                nazwa=='PORADNIA ONKOLOGICZNA'| 
                nazwa=='ODDZIAŁ RADIOTERAPII'& 
                miasto=="OLSZTYN")) 
           ,size=7 

)+ 

    guides(fill = guide_legend(ncol = 1)) + 
    theme(legend.position="right") + 
    scale_shape_manual(values = c(15,16,17,18,19,20), name="Symbol") 

print(punkty) 

OUTPUT

UPDATE 基础上的答案从菲利普我做了这样的事情:

require(rgdal) 
require(ggmap) 
require(maptools) 
require (plyr) 

swd <- structure(list(nazwa = structure(c(8L, 8L, 9L, 7L, 7L, 7L, 3L, 
              5L, 6L, 4L, 2L, 2L, 1L), .Label = c("ODDZIAŁ CHIRURGII ONKOLOGICZNEJ", 
                       "ODDZIAŁ GINEKOLOGII ONKOLOGICZNEJ", "ODDZIAŁ ONKOLOGICZNY", 
                       "ODDZIAŁ ONKOLOGII I HEMATOLOGII DZIECIĘCEJ", "ODDZIAŁ ONKOLOGII KLINICZNEJ CHEMIOTERAPII", 
                       "ODDZIAŁ RADIOTERAPII", "PORADNIA CHIRURGII ONKOLOGICZNEJ", "PORADNIA ONKOLOGICZNA", 
                       "PORADNIA RADIOTERAPII"), class = "factor"), miasto = structure(c(8L, 
                                       8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L), .Label = c("DZIAŁDOWO", 
                                                      "ELBLĄG", "EŁK", "GIŻYCKO", "MRĄGOWO", "NOWE MIASTO LUBAWSKIE", 
                                                      "OLECKO", "OLSZTYN", "OSTRÓDA", "PISZ", "SZCZYTNO"), class = "factor"), 
         dom = structure(c(17L, 5L, 17L, 17L, 8L, 18L, 5L, 17L, 17L, 
             20L, 17L, 19L, 17L), .Label = c("BARANKI 24", "GNIEŹNIEŃSKA 2", 
                     "GOŁDAPSKA 1", "HENRYKA SIENKIEWICZA 4", "JAGIELLOŃSKA 78", 
                     "JANA III SOBIESKIEGO 3 C/44", "KONOPNICKIEJ 1", "KOPERNIKA 30", 
                     "KOŚCIUSZKI 30", "KRÓLEWIECKA 146", "KRÓLEWIECKA 146 146", 
                     "LEŚNA 1", "MICKIEWICZA 10", "MICKIEWICZA 14", "OSEDLE MAZURSKIE 33 A", 
                     "WARSZAWSKA 41", "WOJSKA POLSKIEGO 37", "ŻOŁNIERSKA 16B", 
                     "ŻOŁNIERSKA 18", "ŻOŁNIERSKA 18 A"), class = "factor"), Lat = c(53.794077, 
                                     53.80182, 53.794077, 53.794077, 53.7827025, 53.7688275, 53.80182, 
                                     53.794077, 53.794077, 53.7696245, 53.794077, 53.7698809, 
                                     53.794077), Long = c(20.483249, 20.508952, 20.483249, 20.483249, 
                                          20.4918876, 20.4903438, 20.508952, 20.483249, 20.483249, 
                                          20.4927874, 20.483249, 20.492049, 20.483249)), .Names = c("nazwa", 
                                                         "miasto", "dom", "Lat", "Long"), row.names = c(1L, 2L, 12L, 13L, 
                                                                     14L, 15L, 23L, 25L, 27L, 29L, 30L, 31L, 32L), class = "data.frame") 

swd <- data.table(swd)   # idk rly why but it didnt want to work w/o this command 
setkey(swd,dom) 
swd <- swd[swd[,.N,keyby=dom],.(dom,is.unique=N==1,nazwa,miasto,Lat,Long)] 

olsztynOSM <- get_openstreetmap(bbox = c (left=20.4359, bottom = 53.7319, right= 20.5623, top= 53.81), scale = 40913, color = c('color')) 

moja.paleta <- brewer.pal(9, "Set1") 
swd$kolor <- moja.paleta[swd$nazwa] 

konturyOSM<- ggmap(olsztynOSM) 


punkty <- konturyOSM + geom_jitter(aes(x=Long,y=Lat,fill=nazwa), data = swd[!(is.unique)], width=0.006,height=0.006, size=7,pch=21) + 
    geom_point(aes(x=Long,y=Lat,fill=nazwa), data = swd[(is.unique)], size=7, pch=25)+ 

    scale_fill_manual(values=setNames(moja.paleta,levels(swd$nazwa)),name='Legenda') + 
    guides(fill = guide_legend(ncol = 1)) + 
    theme(legend.position="right") 

plot(punkty) 

输出

enter image description here

回答

3

尝试geom_jitter而不是geom_point。您可以指定widthheight来调整抖动量。

从文档:

width Amount of vertical and horizontal jitter. The jitter is added in both positive and negative directions, so the total spread is twice the value specified here. If omitted, defaults to 40% of the resolution of the data: this means the jitter values will occupy 80% of the implied bins. Categorical data is aligned on the integers, so a width or height of 0.5 will spread the data so it's not possible to see the distinction between the categories.

height Amount of vertical and horizontal jitter. The jitter is added in both positive and negative directions, so the total spread is twice the value specified here. If omitted, defaults to 40% of the resolution of the data: this means the jitter values will occupy 80% of the implied bins. Categorical data is aligned on the integers, so a width or height of 0.5 will spread the data so it's not possible to see the distinction between the categories.

为响应您的评论的后续问题:假设你有一个列中的某些数据(或列),可能会或可能不会跨越观察复制:

library(data.table) 
set.seed(123) 
x <- data.table(a=sample(1:5,10,replace=T)) 
setkey(x,a) 

> x 
    a 
1: 1 
2: 2 
3: 3 
4: 3 
5: 3 
6: 3 
7: 4 
8: 5 
9: 5 
10: 5 

现在,我们可以添加一列指示值是否是唯一的或者不:(编辑您的其他评论回答的问题:在data.table .N = count,所以x[,.N,keyby=a]将返回观察值的计数,按每次出现a分组。另外,因为我已经设置的x关键是a,并使用keybyx[,.N,keyby=a]本身就是一个data.table使用相同的密钥x,所以x[ x[,.N,keyby=a] ]data.table加入:它的内表加入额外的列N到在外面的列。然后.(a,is.unique=N==1)是一个标准的data.table操作来选择两列的列表,虽然我懒惰没有使用更多的括号。这也可以被解读为list(a=a,is.unique=(N==1))。请注意,理解这些命令的最好方法是将它们分解并在REPL中逐步执行它们,仔细查看输出,直到您了解每个命令的作用。)

pts <- x[x[,.N,keyby=a],.(a,is.unique=N==1)] 
> pts 
    a is.unique 
1: 1  TRUE 
2: 2  TRUE 
3: 3  FALSE 
4: 3  FALSE 
5: 3  FALSE 
6: 3  FALSE 
7: 4  TRUE 
8: 5  FALSE 
9: 5  FALSE 
10: 5  FALSE 

让我们添加一列刚刚列举的观测数据绘制:

pts[,b:=.I] 
> pts 
    a is.unique b 
1: 1  TRUE 1 
2: 2  TRUE 2 
3: 3  FALSE 3 
4: 3  FALSE 4 
5: 3  FALSE 5 
6: 3  FALSE 6 
7: 4  TRUE 7 
8: 5  FALSE 8 
9: 5  FALSE 9 
10: 5  FALSE 10 

现在我们可以做一个阴谋数据是否会被overplotted分离(注意不是字面上这个数据,因为在这里我把所有的x值不同,但我认为这是很容易反正可视化),我在评论所说:

ggplot(pts,aes(x=b,y=a)) + 
    geom_point(data=pts[(is.unique)],color="blue") + 
    geom_jitter(data=pts[!(is.unique)],color="red") 

Plot of ten points with default jitter

请注意只有唯一值(蓝色)精确地落在格点上。我们可以调整抖动说抖动点只在垂直方向,并小于默认:

ggplot(pts,aes(x=b,y=a)) + 
    geom_point(data=pts[(is.unique)],color="blue") + 
    geom_jitter(data=pts[!(is.unique)],color="red",width=0,height=.2) 

Plot of ten points with no horizontal jitter

顺便说一句,不请自来的文体挑剔:如果你给你的颜色/填充和形状缩放相同名字,他们会结合起来,你可以有一个更好看的传说。例如: -

ggplot(pts,aes(x=b,y=a,color=is.unique,shape=is.unique)) + 
    geom_point(data=pts[(is.unique)]) + 
    geom_jitter(data=pts[(!is.unique)]) + 
    scale_color_manual(values=c("red","blue"),name="Unique a?") + 
    scale_shape_manual(values=c(15,16),name="Unique a?") 

Plot like first plot but with combined color and shape legend

+0

但后来生病调整所有点不仅是一个与几个地名相同的坐标,和我有坐标单一名称更加分。在这个数据样本中,我只提出了这个问题,原始数据要大得多。或者我错了?我会尝试在第二,写我看到的。 – banshe

+0

如果这是一个大问题,你可以拆分你的数据(或者分成两个'data.frames',或者通过添加一列作为指标,并使用'data'参数到'geom_point'),然后做两个图层:一个用于geom_point的唯一数据,另一个用于geom_jitter(或者等同于width/height = 0)的已绘制数据。我添加了一个虚拟MWE,您应该能够修补以获得您的结果。 – Philip

+0

非常感谢你的努力,你的例子非常有用。我会试一试,我会尝试将你的想法添加到我的脚本中。谢谢! :) – banshe