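Convert a data frame from sampling-unit format to incidence-frequency format (prepping a data frame for iNEXT incidence-frequency input)
I have a data frame like this: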
df<- data.frame(region = c("1","1","1","1","1","2","3","3","3"),
loc = c("104","104","104","105","105","106","107", "108", "109"),
interact = c("A_B","A_B", "B_C", "C_D", "A_B", "E_F", "E_F", "F_G", "A_B"))
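I want to build a data frame that does the following:
1) For each region subset, count the incidence frequency of a given interaction across loc levels. In the example above, region 1 has two loc values (104 and 105) that both have the interaction A_B, so the incidence frequency of A_B in region 1 is 2. Repeated interact levels within the same loc are not counted: A_B appears 3 times in region 1, but it occurs in only two unique loc values. The incidence frequency is the number of unique loc levels in which the interact level occurs.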
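2) The new data frame should list every interact level that occurs in any region and give its incidence frequency in each region. A 0 should therefore be entered for every interact level that does not occur in a region.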
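3) The first row needs to be the count of unique loc levels in each region. Region 1 has 2 loc levels (104, 105), region 2 has 1 (106), and region 3 has 3 (107-109).
The final output should look like this: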
output<- data.frame(interact = c("","A_B","B_C","C_D","E_F","F_G"),
region1 = c("2","2","1","1","0","0"),
region2 = c("1","0","0","0","1","0"),
region3 = c("3","1","0","0","1","1"))
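I am not sure where to start with this. Here is what I have so far, adapted from @akrun's answer to a similar question I posted at Convert from long to wide format counting frequency of eliminated factor level (Prepping dataframe for input into iNEXT Online), but it throws errors: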
library(tidyverse)
df %>%
group_by(region = paste0('region', region)) %>%
summarise(interact = "", V1 = n_distinct(loc)) %>%
spread(region, V1),
df %>%
group_by(region = paste0('region', region) & loc),
interact = as.character(interact)) %>%
summarise(V1 = length(unique((interact)) %>%
spread(region, V1, fill = 0))
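For reference, here is a minimal sketch of one way to build the table described in points 1) to 3). It is not @akrun's code; it assumes dplyr and tidyr (1.0.0 or later, for pivot_wider) are installed, and the names n_units, incidence, and result are made up for illustration. The idea is to drop repeated region/loc/interact rows first, so that counting rows afterwards counts unique loc values.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  region   = c("1", "1", "1", "1", "1", "2", "3", "3", "3"),
  loc      = c("104", "104", "104", "105", "105", "106", "107", "108", "109"),
  interact = c("A_B", "A_B", "B_C", "C_D", "A_B", "E_F", "E_F", "F_G", "A_B")
)

# First row: number of unique loc values (sampling units) in each region.
n_units <- df %>%
  group_by(region = paste0("region", region)) %>%
  summarise(interact = "", n = n_distinct(loc)) %>%
  pivot_wider(names_from = region, values_from = n)

# Remaining rows: number of unique loc values in which each interact level
# occurs, per region. Combinations that never occur are filled with 0.
incidence <- df %>%
  distinct(region, loc, interact) %>%   # repeats within a loc count once
  count(region = paste0("region", region), interact) %>%
  pivot_wider(names_from = region, values_from = n, values_fill = 0L) %>%
  arrange(interact)

# Stack the unit counts on top of the incidence frequencies.
result <- bind_rows(n_units, incidence)
print(result)
```

With the sample df, the region1 column comes out as 2 sampling units followed by A_B = 2, B_C = 1, C_D = 1, E_F = 0, F_G = 0. The counts are integers here; wrap the region columns in as.character() if strings are needed.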
What have you tried so far that didn't work? –
I have added to the OP to address your question. Thanks for your time. – Danielle