Group rows up to the current row in R data.table
I have a dataset that looks like this:
library(data.table)
set.seed(10)
n_rows <- 50
data <- data.table(id = 1:n_rows,
                   timestamp = Sys.Date() + as.difftime(1:n_rows, units = "days"),
                   subject = sample(letters[1:4], n_rows, replace = TRUE),
                   response = sample(3, n_rows, replace = TRUE)
)
head(data, 10)
    id  timestamp subject response
 1:  1 2016-05-17       c        2
 2:  2 2016-05-18       b        3
 3:  3 2016-05-19       b        1
 4:  4 2016-05-20       c        2
 5:  5 2016-05-21       a        1
 6:  6 2016-05-22       a        2
 7:  7 2016-05-23       b        2
 8:  8 2016-05-24       b        2
 9:  9 2016-05-25       c        2
10: 10 2016-05-26       b        2
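I need to do some group-by operations that, for each subject, count how many times each response value has occurred so far (up to and including the current row).
The group-by below produces the nth_test column.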
new_vars <- data[, .(id, timestamp, nth_test = 1:.N, response), by=.(subject)]
    subject id  timestamp nth_test response
 1:       c  1 2016-05-17        1        2
 2:       c  4 2016-05-20        2        2
 3:       c  9 2016-05-25        3        2
 4:       c 11 2016-05-27        4        1
 5:       c 12 2016-05-28        5        1
 6:       c 14 2016-05-30        6        2
 7:       c 22 2016-06-07        7        2
 8:       c 26 2016-06-11        8        2
 9:       c 31 2016-06-16        9        3
10:       c 36 2016-06-21       10        1
But I don't know how to produce the columns resp_1, resp_2 and resp_3 shown below.
    subject id  timestamp nth_test response resp_1 resp_2 resp_3
 1:       c  1 2016-05-17        1        2      0      1      0
 2:       c  4 2016-05-20        2        2      0      2      0
 3:       c  9 2016-05-25        3        2      0      3      0
 4:       c 11 2016-05-27        4        1      1      3      0
 5:       c 12 2016-05-28        5        1      2      3      0
 6:       c 14 2016-05-30        6        2      2      4      0
 7:       c 22 2016-06-07        7        2      2      5      0
 8:       c 26 2016-06-11        8        2      2      6      0
 9:       c 31 2016-06-16        9        3      2      6      1
10:       c 36 2016-06-21       10        1      3      6      1
Cheers
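How is your data sorted? These column values depend on the order of your data. You can do something like 'resp_i := cumsum(response == i)' – Psidom
@Psidom That is exactly what I needed, thank you. – efbbrown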
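For anyone landing here later, the cumsum suggestion from the comment can be sketched roughly as follows. This is only one way to write it, not necessarily what the asker ended up using. It assumes that "so far" means in timestamp order within each subject (hence the setorder call, which is my addition), and the loop variable `lvl` and helper name `col` are illustrative. The simulated values depend on your R version's random number generator, so the printed rows may differ from the tables in the question.

```r
library(data.table)

set.seed(10)
n_rows <- 50
data <- data.table(
  id = 1:n_rows,
  timestamp = Sys.Date() + as.difftime(1:n_rows, units = "days"),
  subject = sample(letters[1:4], n_rows, replace = TRUE),
  response = sample(3, n_rows, replace = TRUE)
)

# Assumption: the running counts should follow timestamp order within subject.
setorder(data, subject, timestamp)

# Running test number per subject (same as 1:.N in the question).
data[, nth_test := seq_len(.N), by = subject]

# One running-count column per response value: cumsum() over a logical
# vector counts how many TRUEs have been seen up to each row.
for (lvl in 1:3) {
  col <- paste0("resp_", lvl)  # illustrative helper name
  data[, (col) := cumsum(response == lvl), by = subject]
}

# Tiny check by hand, using the first five responses of subject "c"
# from the question (2, 2, 2, 1, 1):
toy <- cumsum(c(2, 2, 2, 1, 1) == 2)  # 1 2 3 3 3

head(data[subject == "c"], 10)
```

Because every response is exactly one of 1, 2 or 3, resp_1 + resp_2 + resp_3 should equal nth_test on every row, which is a quick sanity check on the result.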