2013-12-19

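How to summarize date data by group in R

I want to aggregate the sample data below into a new data frame with the following columns:

Population, Sample Size (N), Percent Complete (%)

Sample size is the count of all records for each population. I can get that with the table command or with tapply. Percent complete is the percentage of records that have an End_Date (any record without an End_Date is assumed to be incomplete). This is where I am lost!

Sample data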

sample <- structure(list(Population = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 
    2L, 2L, 3L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 
    1L, 2L, 2L, 3L, 3L, 3L, 3L, 1L, 1L, 3L, 3L, 3L, 3L), .Label = c("Glommen", 
    "Kaseberga", "Steninge"), class = "factor"), Start_Date = structure(c(16032, 
    16032, 16032, 16032, 16032, 16036, 16036, 16036, 16037, 16038, 
    16038, 16039, 16039, 16039, 16039, 16039, 16039, 16041, 16041, 
    16041, 16041, 16041, 16041, 16044, 16044, 16045, 16045, 16045, 
    16045, 16048, 16048, 16048, 16048, 16048, 16048), class = "Date"), 
     End_Date = structure(c(NA, 16037, NA, NA, 16036, 16043, 16040, 
     16041, 16042, 16042, 16042, 16043, 16043, 16043, 16043, 16043, 
     16043, 16045, 16045, 16045, 16045, 16045, NA, 16048, 16048, 
     16049, 16049, NA, NA, 16052, 16052, 16052, 16052, 16052, 
     16052), class = "Date")), .Names = c("Population", "Start_Date", 
    "End_Date"), row.names = c(NA, 35L), class = "data.frame") 

Answers

2

You can do this with split/apply/combine:

spl = split(sample, sample$Population) 
new.rows = lapply(spl, function(x) data.frame(Population=x$Population[1], 
               SampleSize=nrow(x), 
               PctComplete=sum(!is.na(x$End_Date))/nrow(x))) 
combined = do.call(rbind, new.rows) 
combined 

#           Population SampleSize PctComplete
# Glommen      Glommen         13   0.6923077
# Kaseberga  Kaseberga          7   1.0000000
# Steninge    Steninge         15   0.8666667

A word of caution: sample is the name of a base function, so you should choose a different name for your data frame.
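Since the question mentions tapply, here is a minimal sketch of the same summary built from two tapply calls. It is an illustration rather than part of the answer above: the data frame `records` and its six rows are made-up toy data (so the snippet runs on its own and does not reuse the name `sample`), and `summary_df` is a hypothetical name.

```r
# Toy data (hypothetical values, not the data from the question).
# Named `records` so it does not mask base::sample.
records <- data.frame(
  Population = factor(c("Glommen", "Glommen", "Kaseberga",
                        "Steninge", "Steninge", "Steninge")),
  End_Date = as.Date(c(NA, "2013-11-28", "2013-12-01",
                       "2013-12-03", NA, "2013-12-05"))
)

# Number of records in each population
n <- tapply(records$End_Date, records$Population, length)

# Percentage of records in each population with a non-missing End_Date
pct <- tapply(!is.na(records$End_Date), records$Population, mean) * 100

# Combine the two per-group vectors into one data frame
summary_df <- data.frame(
  Population = names(n),
  Sample_Size = as.vector(n),
  Percent_Completed = as.vector(pct)
)
summary_df
```

Taking the mean of a logical vector gives the proportion of TRUE values, which is why `mean(!is.na(...))` yields the completed share directly.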

+0

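Sorry about the data frame name. I was trying to keep it simple. I appreciate a solution that uses base functions. I have a more complicated problem, and your solution helped me figure it out.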

2

This is easy with the plyr package:

library(plyr) 
ddply(sample, .(Population), summarize, 
     Sample_Size = length(End_Date), 
     Percent_Completed = mean(!is.na(End_Date)) * 100) 

#   Population Sample_Size Percent_Completed
# 1    Glommen          13          69.23077
# 2  Kaseberga           7         100.00000
# 3   Steninge          15          86.66667
+0

This is a very nice solution. I only upvoted the split/apply/combine solution because I like learning R with the base packages. Thanks!