2016-09-23
1

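Counting the number of zeros in a "melted" data frame

Hi, I am learning R and trying to count how many zeros there are in melted data. Specifically, I want to know how many zeros fall under column b and how many under column c, and print both results. Here is an example: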

library(reshape) 
library(plyr) 
library(dplyr) 
id = c(1,2,3,4,5,6,7,8,9,10) 
b = c(0,0,5,6,3,7,2,8,1,8) 
c = c(0,4,9,87,0,87,0,4,5,0) 
test = data.frame(id,b,c) 
test_melt = melt(test, id.vars = "id") 
test_melt 

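I imagine I should write an if statement, something along the lines of if (test$value == 0) { print() }, but how do I tell R to count the zeros separately for each of the columns that were melted?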

Answers

2

With your data:

test_melt %>% 
    group_by(variable) %>% 
    summarize(zeroes = sum(value == 0)) 
# # A tibble: 2 x 2 
# variable zeroes 
#  <fctr> <int> 
# 1  b  2 
# 2  c  4 

Base R:

aggregate(test_melt$value, by = list(variable = test_melt$variable), 
      FUN = function(x) sum(x == 0)) 
# variable x 
# 1  b 2 
# 2  c 4 
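
Both versions above rely on the same idiom: `value == 0` yields a logical vector, and `sum()` counts its `TRUE` entries. A minimal base R sketch of that idea using `tapply()` follows. The data frame is rebuilt by hand here so the block runs without the reshape package; that construction and the names `b_vals`, `c_vals`, `is_zero` and `zero_counts` are assumptions for illustration, not part of the original answer.

```r
# Rebuild the melted data by hand (base R only), mirroring the long layout
# of the question's test_melt: one row per (id, variable) pair.
b_vals <- c(0, 0, 5, 6, 3, 7, 2, 8, 1, 8)
c_vals <- c(0, 4, 9, 87, 0, 87, 0, 4, 5, 0)
test_melt <- data.frame(
  id       = rep(1:10, times = 2),
  variable = factor(rep(c("b", "c"), each = 10)),
  value    = c(b_vals, c_vals)
)

# value == 0 is a logical vector; sum() treats TRUE as 1 and FALSE as 0,
# so summing it counts the zeros.
is_zero <- test_melt$value == 0

# tapply() splits is_zero by variable and sums each piece separately.
zero_counts <- tapply(is_zero, test_melt$variable, sum)
print(zero_counts)
```

With the example data this should report 2 zeros for b and 4 for c, matching the outputs above.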

...and, out of curiosity, a quick benchmark:

library(microbenchmark) 
microbenchmark(
    dplyr = group_by(test_melt, variable) %>% summarize(zeroes = sum(value == 0)), 
    base1 = aggregate(test_melt$value, by = list(variable = test_melt$variable), FUN = function(x) sum(x == 0)), 
    # @PankajKaundal's suggested "formula" notation reads easier 
    base2 = aggregate(value ~ variable, test_melt, function(x) sum(x == 0)) 
) 
# Unit: microseconds 
# expr  min  lq  mean median  uq  max neval 
# dplyr 916.421 986.985 1069.7000 1022.1760 1094.7460 2272.636 100 
# base1 647.658 682.302 783.2065 715.3045 765.9940 1905.411 100 
# base2 813.219 867.737 950.3247 897.0930 959.8175 2017.001 100 
+0

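If the first option does not work for you, then something else is wrong. It works fine with your data on my computer. Either way, glad to hear it is working. – r2evans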

0
sum(test_melt$value==0) 

This should do it.
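
Note that this one-liner returns a single total across all melted variables. A hedged sketch of how the same expression might be narrowed to one variable with logical subsetting, or split by variable with `table()`, is given here. The hand-built data frame and the names `total_zeros`, `zeros_in_b`, `zeros_in_c` and `zeros_by_variable` are assumptions introduced for illustration.

```r
# Hand-built stand-in for the question's test_melt (base R only).
b_vals <- c(0, 0, 5, 6, 3, 7, 2, 8, 1, 8)
c_vals <- c(0, 4, 9, 87, 0, 87, 0, 4, 5, 0)
test_melt <- data.frame(
  id       = rep(1:10, times = 2),
  variable = factor(rep(c("b", "c"), each = 10)),
  value    = c(b_vals, c_vals)
)

# Total number of zeros, regardless of variable.
total_zeros <- sum(test_melt$value == 0)

# Restrict to one variable first, then count its zeros.
zeros_in_b <- sum(test_melt$value[test_melt$variable == "b"] == 0)
zeros_in_c <- sum(test_melt$value[test_melt$variable == "c"] == 0)

# Or keep only the rows whose value is zero and tabulate their variable.
zeros_by_variable <- table(test_melt$variable[test_melt$value == 0])
print(zeros_by_variable)
```

Because `variable` is a factor, `table()` keeps a level in the result even if that variable has no zeros at all.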

+0

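It works, but it counts the total number of zeros, whereas I need to count how many zeros are in b and how many are in c. Should I use the unique() function? – marianess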

0

This might help. Is this what you are looking for?

> test_melt[4] <- 1
> test_melt2 <- aggregate(V4 ~ value + variable, test_melt, sum)
> test_melt2
   value variable V4
1      0        b  2
2      1        b  1
3      2        b  1
4      3        b  1
5      5        b  1
6      6        b  1
7      7        b  1
8      8        b  2
9      0        c  4
10     4        c  2
11     5        c  1
12     9        c  1
13    87        c  2

V4 is the count 
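
Since that table counts every distinct value, the zero counts are the rows where `value` is 0. A minimal sketch of pulling out just those rows with `subset()` follows. It rebuilds the data by hand and uses a helper column named `n` instead of `V4`; both are assumptions for illustration rather than part of the answer above.

```r
# Hand-built stand-in for the question's test_melt (base R only).
b_vals <- c(0, 0, 5, 6, 3, 7, 2, 8, 1, 8)
c_vals <- c(0, 4, 9, 87, 0, 87, 0, 4, 5, 0)
test_melt <- data.frame(
  id       = rep(1:10, times = 2),
  variable = factor(rep(c("b", "c"), each = 10)),
  value    = c(b_vals, c_vals)
)

# Helper column of ones (hypothetical name n), so that summing it per group
# gives the number of rows in that group.
test_melt$n <- 1

# Count the occurrences of every value within every variable.
counts <- aggregate(n ~ value + variable, data = test_melt, FUN = sum)

# Keep only the rows that describe zeros.
zeros_only <- subset(counts, value == 0)
print(zeros_only)
```

One caveat of this approach: a variable with no zeros simply has no row in `zeros_only`, rather than a row with a count of 0.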
+0

It doesn't count the zeros :( but it seems the aggregate() function is the key. – marianess

+0

V4 is the count of "0", and of every other number, for variables b and c. –