2015-07-12

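How to plot nicely spaced data labels?

Labeling every data point in a plot can get unwieldy.

Randomly sampling a few labels can be disappointing too.

What would be a good way to pick a small set of nicely spaced data labels? That is, to randomly pick representatives whose labels do not overlap.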

# demo data 
set.seed(123) 
N <- 50 
x <- runif(N) 
y <- x + rnorm(N, 0, x) 
data <- data.frame(x, y, labels=state.name) 

# plot with labels 
plot(x,y) 
text(x,y,data$labels) 

# plot a few labels 
frame() 
few_labels <- data[sample(N, 10), ] 
plot(x,y) 
with(few_labels, text(x,y,labels)) 

The example is in R, but this is really a general problem. – dzeltzer

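Answers

One approach is clustering. Here is a solution using stats::hclust. We group the data points into clusters and then pick one random observation from each cluster.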

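few_labels <- function(df, coord=1:ncol(df), grp=5){ 

    require(dplyr) 
    # cluster the points hierarchically and cut the tree into grp clusters 
    df$cl <- cutree(hclust(dist(df[,coord])), grp) 
    # pick one random observation from each cluster 
    picked <- df %>% group_by(cl) %>% 
        do(sample_n(., 1)) 
    return(picked) 
} 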

# demo data 
set.seed(123) 
N <- 50 
x <- runif(N) 
y <- x + rnorm(N, 0, x) 
data <- data.frame(x, y, labels=state.name) 

# plot a few labels 
frame() 
# store the result under a new name so the function few_labels is not overwritten 
picked <- few_labels(data, coord=1:2, grp=12) 
plot(x,y) 
with(picked, text(x,y,labels)) 
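
For readers who would rather avoid the dplyr dependency, the same clustering idea can be sketched in base R. This is only an illustration, not part of the original answer: the name `few_labels_base` is hypothetical, and the demo data is repeated so the block runs on its own.

```r
# Hypothetical base-R variant of the clustering idea (no dplyr needed).
few_labels_base <- function(df, coord = 1:2, grp = 5) {
    # hierarchical clustering on the chosen coordinate columns
    cl <- cutree(hclust(dist(df[, coord])), k = grp)
    # for each cluster, pick one row index at random
    picked <- vapply(split(seq_len(nrow(df)), cl), function(idx) {
        idx[sample.int(length(idx), 1)]
    }, integer(1))
    df[picked, ]
}

# demo data, as in the question
set.seed(123)
N <- 50
x <- runif(N)
y <- x + rnorm(N, 0, x)
data <- data.frame(x, y, labels = state.name)

picked <- few_labels_base(data, coord = 1:2, grp = 12)
plot(x, y)
with(picked, text(x, y, labels))
```

Indexing with `sample.int(length(idx), 1)` rather than calling `sample(idx, 1)` avoids the case where a cluster holds a single point and `sample()` would treat that number as the upper end of a range.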

No library call? Isn't that dplyr? –

@MikeWise There is a 'require' in the function definition. – scoa

Of course. Missed it. Nice solution, by the way ... –

To plot all of the labels:

xlims=c(-1,2) 
plot(x,y,xlim=xlims) 
#text(x,y,data$labels,pos = 2,cex=0.7) 
library(plotrix) 
spread.labels(x,y,data$labels,cex=0.7,ony=NA) 

Another approach is to pick a point at random, discard every point close to it, and repeat until no points are left:

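radius <- .1 # radius of a ball containing the largest label 

# pairwise distances between all points 
d <- as.matrix(dist(data[, c("x","y")], upper=TRUE, diag=TRUE)) 
remaining <- 1:N 
spaced <- numeric() 
while(length(remaining)>0) { 
    # sample(n, 1) with a single number n draws from 1:n, hence the special case 
    p <- if (length(remaining)>1) sample(remaining, 1) else remaining 
    spaced <- c(spaced, p) # keep this point 
    # drop every point closer than 2*radius to p (including p itself) 
    remaining <- setdiff(remaining, which(d[p, ] < 2*radius)) 
} 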

frame() 
plot(x,y) 
spaced_labels <- data[spaced, ] 
with(spaced_labels, text(x,y,labels)) 
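
As a rough check on the approach above, the selection can be wrapped in a function and the spacing of the kept points verified afterwards. This is a sketch only: `pick_spaced` is a hypothetical helper name, and `radius = 0.1` is the same assumed label size as in the answer.

```r
# Hypothetical helper wrapping the greedy selection described above.
pick_spaced <- function(xy, radius) {
    d <- as.matrix(dist(xy))
    remaining <- seq_len(nrow(xy))
    spaced <- integer()
    while (length(remaining) > 0) {
        # index into remaining so a single leftover point is handled correctly
        p <- remaining[sample.int(length(remaining), 1)]
        spaced <- c(spaced, p)
        # remove p and every point closer than 2 * radius to it
        remaining <- setdiff(remaining, which(d[p, ] < 2 * radius))
    }
    spaced
}

# demo data, as in the question
set.seed(123)
N <- 50
x <- runif(N)
y <- x + rnorm(N, 0, x)
data <- data.frame(x, y, labels = state.name)

spaced <- pick_spaced(data[, c("x", "y")], radius = 0.1)
# smallest distance between any two kept points
min_gap <- min(dist(data[spaced, c("x", "y")]))
```

By construction every kept point is at least `2 * radius` away from all points kept before it, so `min_gap` should never fall below `2 * radius`. The radius is in data units, so it would need adjusting to the plot size and label font.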
