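Statistical mode of a categorical variable in R (using mlv)

I want to compute the most frequent value of a categorical variable. I tried the mlv function from the modeest package, but I get NAs.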
user <- c("A","B","A","A","B","A","B","B")
color <- c("blue","green","blue","blue","green","yellow","pink","blue")
df <- data.frame(user,color)
df$color <- as.factor(df$color)
library(plyr)
library(dplyr)
library(modeest)
summary <- ddply(df,.(user),summarise,mode=mlv(color,method="mlv")[['M']])
Warning messages:
1: In discrete(x, ...) : NAs introduced by coercion
2: In discrete(x, ...) : NAs introduced by coercion
summary
user mode
1 A NA
2 B NA
However, this is what I need:
user mode
A blue
B green
What am I doing wrong? I also tried other methods, as well as mlv(x=color). According to the modeest help page, it should work for factors.
I don't want to use table(), because I need a simple function that builds a summary table like the one in this question: How to get the mode of a group in summarize in R, but for a categorical column.
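For reference, here is a minimal sketch of the kind of simple function meant here. It does not use modeest; `stat_mode` is a hypothetical helper name, and the `unique`/`match`/`tabulate` idiom is borrowed from the usual answers to "mode" questions rather than from any package. It assumes that ties can be broken by taking whichever value appears first.

```r
# Hypothetical helper (not part of modeest): most frequent value of a vector or factor.
stat_mode <- function(x) {
  x <- as.character(x)                    # work on the labels, not the factor codes
  ux <- unique(x)                         # distinct values, in order of first appearance
  ux[which.max(tabulate(match(x, ux)))]   # on ties, the first value encountered wins
}

# Same data as in the question
user  <- c("A","B","A","A","B","A","B","B")
color <- c("blue","green","blue","blue","green","yellow","pink","blue")
df <- data.frame(user, color = as.factor(color))

# One row per user, using base R only
result <- aggregate(color ~ user, data = df, FUN = stat_mode)
names(result)[2] <- "mode"
result
```

The same helper should also work inside a grouped summary, for example `summarise(mode = stat_mode(color))` after `group_by(user)` in dplyr.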
Possibly also relevant: ["Is there a built-in function for finding the mode?"](https://stackoverflow.com/q/2547402/2204410) – Jaap