2014-11-05 79 views
2

Bubble chart of frequencies including NAs

I have a data frame that looks like this:

Data<- data.frame(item1=c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5, NA, 5, NA, NA), 
        item2=c(1, 2, 2, 4, 1, 1, 2, 3, 5, 5, NA, NA, NA, NA), 
        item3=c(1, 2, 2, 4, 1, 1, 2, 3, 5, 5, NA, NA, NA, NA), 
        item4=c(1, 2, 2, 4, 1, 1, 4, 3, 1, 5, NA, 3, NA, NA), 
        item5=c(1, 5, 2, 4, 2, 1, 2, 3, 5, 5, NA, NA, 1, NA)) 

and I have defined a function that extracts the frequencies of each column and plots them, without the NAs:

frequencies <- function(x,K=5) 
{ 
    p <- length(x) # items 
    n <- nrow(x) # observations 
    r <- 5 # number of possible values (1 to 5); must be a single number for rep(0, r) and 1:r
    myf <- function(y) # extract frequencies 
    { 
    y <- y[!is.na(y)] 
    y <- as.factor(y) 
    aux <- summary(y) 
    res <- rep(0, r) 
    res[1:r %in% names(aux)] <- aux 
    100 * res/sum(res) 
    } 

    freqs <- apply(x, 2, FUN = myf) # apply myf by columns 
    df2 <- expand.grid(vals = 1:r, item = 1:p) # all possible combinations 
    df2$freq <- as.numeric(freqs) # add frequencies 

    # graph 
    plot(df2$item,df2$vals,type="n",xlim=c(1,p),ylim=c(1,r),xaxt = "n", 
     xlab="", ylab="", ann=FALSE) 


    axis(1, labels=FALSE) 
    labs <- paste(names(x)) ##labels=c("v1", "v2", ...) 
    text(1:p, srt = 60, adj=0.5, pos=1, las=2, 
     labels = labs, xpd = TRUE, par("usr")[1], cex.main=0.8, offset=1) 



    points(df2$item,df2$vals,pch=22,col="black", bg="gray", cex=(df2$freq/n)*K) 
} 

I would like the NAs to be plotted as a "value" (on the y axis), so that my plot looks similar to this one (edited in an image editor, not in R): [image]

Thanks in advance,

Angulo

Answers

2

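Another possibility: `melt` your data to long format, then use `table` with `exclude = NULL` so that the `NA`s are counted as well. If you want the frequency to be proportional to the area of the squares rather than to their width, have a look at `scale_size_area`.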

library(reshape2) 
library(ggplot2) 

Data2 <- melt(Data) 
Data3 <- with(Data2, as.data.frame(table(variable, value, exclude = NULL))) 
Data3 <- Data3[!is.na(Data3$variable), ] 

ggplot(data = Data3, aes(x = variable, y = value, size = Freq)) + 
    geom_point(shape = 0) 

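As a variation on the same idea, here is a minimal sketch that does the counting in base R and adds `scale_size_area()`. It is not the code from the answer above: it assumes the values are the integers 1 to 5, and the names `long` and `counts` are only illustrative. The `NA`s are recoded to the string "NA" before counting, so they become an ordinary category on the y axis.

```r
library(ggplot2)

# Same data as in the question
Data <- data.frame(
  item1 = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5, NA, 5, NA, NA),
  item2 = c(1, 2, 2, 4, 1, 1, 2, 3, 5, 5, NA, NA, NA, NA),
  item3 = c(1, 2, 2, 4, 1, 1, 2, 3, 5, 5, NA, NA, NA, NA),
  item4 = c(1, 2, 2, 4, 1, 1, 4, 3, 1, 5, NA, 3, NA, NA),
  item5 = c(1, 5, 2, 4, 2, 1, 2, 3, 5, 5, NA, NA, 1, NA)
)

# Long format in base R: stack() returns the columns 'values' and 'ind'
long <- stack(Data)

# Recode NA as the string "NA" so that it is an ordinary factor level
v <- as.character(long$values)
v[is.na(v)] <- "NA"
long$value <- factor(v, levels = c(1:5, "NA"))   # assumes values are 1..5

# Count every item/value combination, then drop the empty cells
counts <- as.data.frame(table(variable = long$ind, value = long$value))
counts <- counts[counts$Freq > 0, ]

p <- ggplot(counts, aes(x = variable, y = value, size = Freq)) +
  geom_point(shape = 0) +   # shape 0 is an open square
  scale_size_area()         # symbol area proportional to Freq
print(p)
```

Recoding before counting is a design choice: it keeps the missing values out of the factor machinery, so no `NA` rows have to be filtered out afterwards.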

+0

Thanks, this is a nice solution for what I wanted. Do you know whether this kind of plot has a specific name? – 2014-11-05 10:31:26

+0

I think it is called a [**bubble chart/plot**](http://en.wikipedia.org/wiki/Bubble_chart) – Henrik 2014-11-05 10:47:25

+0

@AriadnaAngulo, I changed the title of your question from "frequency plot" to "bubble chart", which I think is a more common way to refer to this kind of plot. – Henrik 2014-11-05 12:00:43

1

Try something like this:

# Useful packages: 
library(plyr) 
library(ggplot2) 

# Loop over variables getting the counts of each value 
counts <- lapply(Data, count) 

# Combine the list of counts into a single data frame 
all_counts <- do.call(rbind, counts) 

# A bit of fixing. Make x into a factor, and get the variable name 
all_counts <- within(
    all_counts, 
    { 
    Value <- factor(x) 
    Variable <- rep(names(counts), vapply(counts, nrow, integer(1))) 
    } 
) 

# Remove NAs (it isn't very clear from the question whether you want NAs or not) 
all_counts <- subset(all_counts, !is.na(x)) 

# Draw the plot. sqrt is to scale area by freq rather than width by freq 
(p <- ggplot(all_counts, aes(Variable, Value, size = sqrt(freq))) + 
    geom_point(shape = 15) # shape 15 is a square. See ?points. 
) 
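
If the NAs should be kept, as the question asks, the NA-removal step can be skipped and the missing values counted as one more category. The following is a rough sketch in base graphics, closer to the function in the question than to the ggplot2 code above. It assumes the possible values are 1 to 5; `vals`, `freqs` and `grid` are illustrative names, and the divisor used for `cex` is an arbitrary scaling choice.

```r
# Same data as in the question
Data <- data.frame(
  item1 = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5, NA, 5, NA, NA),
  item2 = c(1, 2, 2, 4, 1, 1, 2, 3, 5, 5, NA, NA, NA, NA),
  item3 = c(1, 2, 2, 4, 1, 1, 2, 3, 5, 5, NA, NA, NA, NA),
  item4 = c(1, 2, 2, 4, 1, 1, 4, 3, 1, 5, NA, 3, NA, NA),
  item5 = c(1, 5, 2, 4, 2, 1, 2, 3, 5, 5, NA, NA, 1, NA)
)

vals <- 1:5   # possible values (assumption)

# Percentage of each value per column, with NA counted as a sixth category
freqs <- sapply(Data, function(y) {
  tab <- table(factor(y, levels = vals), useNA = "always")
  100 * as.numeric(tab) / length(y)
})
rownames(freqs) <- c(vals, "NA")

p <- ncol(Data)    # number of items
k <- nrow(freqs)   # five values plus the NA row

# One row per (value, item) cell; as.numeric() reads the matrix column by
# column, which matches the order produced by expand.grid()
grid <- expand.grid(val = seq_len(k), item = seq_len(p))
grid$freq <- as.numeric(freqs)
grid <- grid[grid$freq > 0, ]

plot(grid$item, grid$val, type = "n", xlim = c(0.5, p + 0.5),
     ylim = c(0.5, k + 0.5), xaxt = "n", yaxt = "n", xlab = "", ylab = "")
axis(1, at = seq_len(p), labels = names(Data))
axis(2, at = seq_len(k), labels = rownames(freqs), las = 1)

# sqrt() makes the symbol area, not its width, grow with the frequency
points(grid$item, grid$val, pch = 22, col = "black", bg = "gray",
       cex = sqrt(grid$freq) / 2)
```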
+0

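I can't run this code because of an error when plotting: Don't know how to automatically pick scale for object of type function. Defaulting to continuous Error in data.frame(x = function(x,y = NULL,na.rm = FALSE,use): arguments imply differing number of rows: 0, 25 – 2014-11-05 10:39:45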

+0

At a guess, you haven't converted the 'x' or 'y' axis variable to a factor – 2014-11-06 10:35:33
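
A possible explanation for the error quoted above: the signature in the message, `function(x, y = NULL, na.rm = FALSE, use)`, is that of `stats::var`. This is what one would expect when an aesthetic is mapped to the name `var` and the data frame has no column called `var`, so the name resolves to the function instead. The check below is only a sketch; the small `all_counts` data frame is a made-up stand-in, not the one built in the answer.

```r
# Hypothetical stand-in for the answer's 'all_counts' (values are made up)
all_counts <- data.frame(
  x        = c(1, 2, NA),
  freq     = c(2L, 1L, 3L),
  Variable = c("item1", "item1", "item1"),
  Value    = factor(c(1, 2, NA))
)

has_var_column  <- "var" %in% names(all_counts)   # is there a column 'var'?
var_is_function <- is.function(var)               # 'var' is stats::var

print(c(has_var_column = has_var_column, var_is_function = var_is_function))
```

Mapping the aesthetics to columns that do exist, for example `aes(Variable, Value, size = sqrt(freq))`, should avoid the error.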