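Calculating a team's lead at regular intervals from scoring data

I have scoring data from a series of hockey games, and I'm at the analysis stage. I'm trying to plot the home team's lead every 10 minutes in each game.
Here's where I've gotten so far. An example of my data set: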
library(tidyverse)
# Generate example data ordered by gameid and event_ts
game <- tibble(event_type = "goal", event_ts = runif(n = 1000, min = 0, max = 60),
               team = sample(c("home", "away"), size = 1000, replace = TRUE, prob = c(0.55, 0.45)),
               gameid = sample(100:300, size = 1000, replace = TRUE)) %>%
  arrange(gameid, event_ts)
I know I can use summarise to get the final score of each game. Here's a simple example that assumes both teams score at least one goal in every game:
game %>%
  group_by(gameid, team) %>%
  summarise(goals = n()) %>%
  spread(key = team, value = goals) %>%
  # spread() leaves NA (not NULL) where a team has no goals, so test with is.na()
  mutate(away = ifelse(is.na(away), 0, away))
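If the "both teams score at least once" assumption doesn't hold, one option is to fill the missing side with zero while reshaping. This is only a sketch: `toy` and `final_scores` are made-up names for illustration, and it assumes a tidyr version that provides `pivot_wider()` (1.0.0 or later) rather than the `spread()` call used above.

```r
library(tidyverse)

# Hypothetical toy data: game 2 is a shutout, so "away" never appears for it
toy <- tibble(
  gameid = c(1, 1, 1, 2, 2),
  team   = c("home", "away", "home", "home", "home")
)

final_scores <- toy %>%
  # one row per game and team, with the number of goals
  count(gameid, team, name = "goals") %>%
  # one column per team; a team with no goals in a game gets 0 instead of NA
  pivot_wider(names_from = team, values_from = goals, values_fill = 0L)
```

The same pipeline should work on `game` in place of `toy`, since it only relies on the `gameid` and `team` columns.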
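I'd like to calculate the home team's lead (positive or negative) at 10-minute intervals throughout each game. That means summing up all the goals scored up to each point in time. Here's an example of the structure I'm trying to get: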
finished_demo <- tibble(
  gameid = sort(rep_len(seq(100, 300, 1), 1206)),
  timestamp = rep(seq(10, 60, 10), 201),
  home_lead = round(runif(
    n = 1206, min = -5, max = 7
  ))
) %>% arrange(gameid, timestamp)
'library(tidyverse); game %>% mutate(event_ts = ceiling(event_ts / 10) * 10) %>% complete(event_ts, gameid, team) %>% group_by(gameid, team, event_ts) %>% summarize(score = coalesce(sum[…] %>% summarize(ts = list(event_ts), score = list(cumsum(score))) %>% unnest() %>% spread(team, score) %>% mutate(home_lead = home - away)' – alistaire
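For reference, here is one possible way to get the structure asked for above. It is a minimal sketch, not the commenter's exact code: score each goal as +1 (home) or -1 (away), bucket it into the 10-minute mark at or after it, fill in the buckets where nothing happened, and take a running sum per game. The names `delta`, `home_lead_by_interval` and `lead_plot`, and the fixed seed, are my own assumptions, and `.groups = "drop"` assumes dplyr 1.0.0 or later.

```r
library(tidyverse)

set.seed(42)  # assumption: fixed seed so the sketch is reproducible

# Same example data as in the question
game <- tibble(
  event_type = "goal",
  event_ts = runif(n = 1000, min = 0, max = 60),
  team = sample(c("home", "away"), size = 1000, replace = TRUE, prob = c(0.55, 0.45)),
  gameid = sample(100:300, size = 1000, replace = TRUE)
) %>%
  arrange(gameid, event_ts)

home_lead_by_interval <- game %>%
  mutate(
    # +1 for a home goal, -1 for an away goal
    delta = if_else(team == "home", 1L, -1L),
    # 10-minute mark at or after the goal (10, 20, ..., 60);
    # pmax() guards against a goal at exactly minute 0
    timestamp = pmax(ceiling(event_ts / 10) * 10, 10)
  ) %>%
  # net change in the home lead within each 10-minute bucket
  group_by(gameid, timestamp) %>%
  summarise(delta = sum(delta), .groups = "drop") %>%
  # add buckets in which nobody scored, with a net change of 0
  complete(gameid, timestamp = seq(10, 60, 10), fill = list(delta = 0L)) %>%
  # running total per game gives the lead at each mark
  group_by(gameid) %>%
  arrange(timestamp, .by_group = TRUE) %>%
  mutate(home_lead = cumsum(delta)) %>%
  ungroup() %>%
  select(gameid, timestamp, home_lead)

# Step plot of the lead for the first few games (built here, not printed)
lead_plot <- home_lead_by_interval %>%
  filter(gameid %in% head(unique(gameid), 4)) %>%
  ggplot(aes(timestamp, home_lead)) +
  geom_step() +
  facet_wrap(~gameid)
```

Note that a game with no goals at all never appears in `game`, so it will be missing from the result too; if that matters, the full list of game ids would have to be supplied to `complete()` explicitly.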