Count function

Suppose I have data of the following type:

df <- data.frame(student = c("S1", "S2", "S3", "S4", "S5", "S2", "S6", "S1", "S7", "S8"), 
       factor = c("A", "A", "A", "A", "A", "B", "B", "C", "C", "D"), 
       year = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2), 
       count1 = c(0, 1, 0, 0, 0, 1, 0, 0, 0, 0), 
       count2 = c(1, 0, 0, 0, 0, 0, 0, 1, 0, 0))

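I need a more efficient approach than the typical apply() function to analyze two columns, student and class (the factor column), within a given year. When a student keeps the same factor level throughout a given year, the function should return a count of zero. When a student has multiple factor levels within a year, the count should be incremented (i + 1) for each instance of the student at a separate factor level.

I would also like a separate count/function that analyzes students across multiple years. For example, a student who keeps the same factor level across years gets a count of zero. If a student is found with different factor levels in different years, the count is incremented (i + 1) for each such instance.

There are over 10K observations, so my attempts with the *apply family have been unproductive. That is, I have been able to count unique instances per student, but only the first unique instance, not all unique combinations of student (unique ID) and factor. Individuals can repeat within a year or across years.

The ideal output would look like this:

Student1, Factor.Count (within year), Factor.Count (between years)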

Source

2013-04-26 DV Hughes

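It is hard to understand the problem without sample data. Please add a reproducible example so the good people here can help you. See http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – 2013-04-26 01:39:52

Edited to show code that simulates the data. – 2013-04-26 01:48:29

Two comments. 1 (somewhat minor): 10K observations is nowhere near the size at which 'apply' becomes too expensive. 2 (somewhat major): it is not entirely clear what you want. Change your example data so that some students actually get a count of 0, and provide the desired result for the example. – 2013-04-26 02:19:12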

Here is a chain of commands that uses the interaction of factors to find a student's factor changes within the same year:

# Add up the occurrences of a student having multiple factors in the same year, 
# for each year 
in.each.year <- aggregate(factor~student:year, data=df, FUN=function(x) length(x)-1)[c(1,3)] 

# Total these up, for each student 
in.year <- aggregate(factor~student, data=in.each.year, FUN=sum) 

# The name was "factor". Set it to the desired name. 
names(in.year)[2] <- 'count1' 

# Find the occurrences of a student having multiple factors 
both <- aggregate(factor~student, data=df, FUN=function(x) length(x)-1) 
names(both)[2] <- 'both' 

# Combine with 'merge' 
m <- merge(in.year, both) 

# Subtract to find "count2" 
m$count2 <- m$both - m$count1 
m$both <- NULL 

m 
##   student count1 count2
## 1      S1      0      1
## 2      S2      1      0
## 3      S3      0      0
## 4      S4      0      0
## 5      S5      0      0
## 6      S6      0      0
## 7      S7      0      0
## 8      S8      0      0
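
As a cross-check of the table above, here is a minimal base-R sketch that reaches the same counts with tapply instead of aggregate and merge. It is not part of the original answer. The names per_cell, both, and check are made up for illustration. Like the answer, it counts rows, so it assumes each student/year/factor combination appears at most once in the data.

```r
# Rebuild the question's example data (only the columns the counts depend on).
df <- data.frame(
  student = c("S1", "S2", "S3", "S4", "S5", "S2", "S6", "S1", "S7", "S8"),
  factor  = c("A", "A", "A", "A", "A", "B", "B", "C", "C", "D"),
  year    = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2)
)

# Number of rows in every student-by-year cell; empty cells come back as NA.
per_cell <- tapply(df$factor, list(df$student, df$year), length)

# Within-year count: each occupied cell contributes (rows - 1).
count1 <- rowSums(per_cell - 1, na.rm = TRUE)

# All rows for a student, minus one (the "both" quantity in the answer).
both <- rowSums(per_cell, na.rm = TRUE) - 1

# Between-year count: whatever is left after removing the within-year part.
count2 <- both - count1

check <- data.frame(student = names(count1),
                    count1  = unname(count1),
                    count2  = unname(count2))
check
```

Under that assumption, count2 works out to the number of years in which a student appears, minus one.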

This can be merged with your original data frame (taken without the columns count1 and count2):

merge(df[c("student", "factor", "year")], m) 
##    student factor year count1 count2
## 1       S1      A    1      0      1
## 2       S1      C    2      0      1
## 3       S2      A    1      1      0
## 4       S2      B    1      1      0
## 5       S3      A    1      0      0
## 6       S4      A    1      0      0
## 7       S5      A    1      0      0
## 8       S6      B    1      0      0
## 9       S7      C    2      0      0
## 10      S8      D    2      0      0
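
If the original row order matters, the merge step can be avoided. The sketch below is a hedged alternative, not the answer's method: it uses ave to attach per-student values to every row in place. The names n_student and n_years are hypothetical, and it again assumes each student/year/factor combination occurs at most once.

```r
# Rebuild the question's example data without the count columns.
df <- data.frame(
  student = c("S1", "S2", "S3", "S4", "S5", "S2", "S6", "S1", "S7", "S8"),
  factor  = c("A", "A", "A", "A", "A", "B", "B", "C", "C", "D"),
  year    = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2)
)

# Total rows per student, repeated on each of that student's rows.
n_student <- ave(seq_along(df$student), df$student, FUN = length)

# Number of distinct years per student, repeated on each of that student's rows.
n_years <- ave(df$year, df$student, FUN = function(y) length(unique(y)))

# Extra rows beyond one per year are within-year changes;
# extra years beyond the first are between-year changes.
df$count1 <- n_student - n_years
df$count2 <- n_years - 1
df
```

On the example data this reproduces the count1 and count2 columns given in the question, in the original row order.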

Source

2013-04-26 03:00:54
