Convert row entries to columns in R

I have the following question.

Suppose I have a data frame:

name = c("John", "John","John","John","Mark","Mark","Mark","Mark","Dave", "Dave","Dave","Dave") 
color = c("red", "blue", "green", "yellow","red", "blue", "green", "yellow","red", "blue", "green", "yellow") 
value = c(1,2,1,3,5,5,3,2,4,6,7,8) 
df = data.frame(name, color, value) 
#View(df) 
df 
#  name color value 
# 1 John red  1 
# 2 John blue  2 
# 3 John green  1 
# 4 John yellow  3 
# 5 Mark red  5 
# 6 Mark blue  5 
# 7 Mark green  3 
# 8 Mark yellow  2 
# 9 Dave red  4 
# 10 Dave blue  6 
# 11 Dave green  7 
# 12 Dave yellow  8

I would like it to look like this:

# names red blue green yellow 
#1 John 1 2  1  3 
#2 Mark 5 5  3  2 
#3 Dave 4 6  7  8

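That is, the entries in the first column (name) become unique, the levels of the second column (color) become new columns, and the entries in these new columns come from the corresponding rows of the third column (value) in the original data frame.

I can do this with the following: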

library(dplyr)
df = df %>%
    group_by(name) %>%
    mutate(red = ifelse(color == "red", value, 0.0),
           blue = ifelse(color == "blue", value, 0.0),
           green = ifelse(color == "green", value, 0.0),
           yellow = ifelse(color == "yellow", value, 0.0)) %>%
    group_by(name) %>%
    summarise_each(funs(sum), red, blue, green, yellow)
df 
    name red blue green yellow 
1 Dave  4  6  7  8 
2 John  1  2  1  3 
3 Mark  5  5  3  2

However, this is not ideal when the color column has many levels. How should I go about this?

Thanks!

2016-08-03 chowching

Since the OP is using the dplyr family of packages, a good option is tidyr:

library(tidyr) 
spread(df, color, value) 
# name blue green red yellow 
#1 Dave 6  7 4  8 
#2 John 2  1 1  3 
#3 Mark 5  3 5  2
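In newer versions of tidyr (1.0.0 and later), spread() is superseded by pivot_wider(). The following is a minimal sketch of the same reshaping with that function, assuming a recent tidyr is installed; the data frame is rebuilt here only so the snippet runs on its own.

```r
library(tidyr)

# Same example data as in the question
df <- data.frame(
  name  = rep(c("John", "Mark", "Dave"), each = 4),
  color = rep(c("red", "blue", "green", "yellow"), times = 3),
  value = c(1, 2, 1, 3, 5, 5, 3, 2, 4, 6, 7, 8)
)

# One new column per distinct value of 'color', filled from 'value'
wide <- pivot_wider(df, names_from = color, values_from = value)
wide
```

This works for any number of levels in color, since the new columns are derived from the data rather than typed out by hand.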

If we need to use %>%:

library(dplyr) 
df %>% 
    spread(color, value)

To preserve the order, we can convert 'color' to a factor, with the levels specified as the unique values of 'color', and then do the spread:

df %>% 
    mutate(color = factor(color, levels = unique(color))) %>% 
    spread(color, value) 
# name red blue green yellow 
#1 Dave 4 6  7  8 
#2 John 1 2  1  3 
#3 Mark 5 5  3  2
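The same factor trick can be applied to 'name' if the rows should also keep their original order (John, Mark, Dave) as in the expected output. This is a sketch under that assumption, not something the question strictly asks for:

```r
library(dplyr)
library(tidyr)

# Same example data as in the question
df <- data.frame(
  name  = rep(c("John", "Mark", "Dave"), each = 4),
  color = rep(c("red", "blue", "green", "yellow"), times = 3),
  value = c(1, 2, 1, 3, 5, 5, 3, 2, 4, 6, 7, 8)
)

wide <- df %>%
  mutate(
    # Levels in order of first appearance, for both rows and columns
    name  = factor(name,  levels = unique(name)),
    color = factor(color, levels = unique(color))
  ) %>%
  spread(color, value)
wide
```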

Or we can use data.table for a faster dcast. Converting to a data.table and using data.table's dcast has an advantage: it is much faster than reshape2's dcast.

library(data.table) 
dcast(setDT(df), name~color, value.var="value") 
# name blue green red yellow 
#1: Dave 6  7 4  8 
#2: John 2  1 1  3 
#3: Mark 5  3 5  2
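If a name/color pair can occur more than once, dcast needs to be told how to combine the values. The sketch below assumes that summing is what is wanted (as in the OP's own attempt) and uses a small made-up data set with one duplicated pair:

```r
library(data.table)

# Hypothetical data: "John"/"red" appears twice
dt <- data.table(
  name  = c("John", "John", "John", "Mark", "Mark"),
  color = c("red", "red", "blue", "red", "blue"),
  value = c(1, 10, 2, 5, 5)
)

# fun.aggregate collapses duplicated name/color pairs into one cell
wide <- dcast(dt, name ~ color, value.var = "value", fun.aggregate = sum)
wide
```

Without fun.aggregate, dcast falls back to counting the rows in each cell when duplicates are present, which is rarely the intended result.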

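Note: in both solutions, we get the column names as in the expected output, with no ugly prefix or suffix attached (which, BTW, can be changed, but that takes another line of code).

If we need a base R option, one choice is tapply: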

with(df, tapply(value, list(name, color), FUN = I)) 
#  blue green red yellow 
#Dave 6  7 4  8 
#John 2  1 1  3 
#Mark 5  3 5  2

2016-08-03 04:32:53 akrun

That was fast. Thank you! – chowching

So, you want a cross-tabulation?

> xtabs(value~name+color, df) 
     color 
name blue green red yellow 
    Dave 6  7 4  8 
    John 2  1 1  3 
    Mark 5  3 5  2
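xtabs returns a table object rather than a data frame. If a data frame with a 'name' column is needed, one way (a sketch, using base R only) is to go through as.data.frame.matrix:

```r
# Same example data as in the question
df <- data.frame(
  name  = rep(c("John", "Mark", "Dave"), each = 4),
  color = rep(c("red", "blue", "green", "yellow"), times = 3),
  value = c(1, 2, 1, 3, 5, 5, 3, 2, 4, 6, 7, 8)
)

tab <- xtabs(value ~ name + color, data = df)

# Keep the 2-d layout instead of the long format that as.data.frame() gives
wide <- as.data.frame.matrix(tab)

# Move the row names into a regular column
wide <- data.frame(name = rownames(wide), wide, row.names = NULL)
wide
```

Note that xtabs sums the values in each cell, so this only reproduces the original values when every name/color pair occurs once.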

2016-08-03 04:45:29

You can use dcast from the reshape2 package:

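library(reshape2)
# value.var is given explicitly so dcast does not have to guess the value column
dcast(df, name ~ color, value.var = "value")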
# name blue green red yellow 
#1 Dave 6  7 4  8 
#2 John 2  1 1  3 
#3 Mark 5  3 5  2

Or else you can use reshape from base R:

reshape(df, idvar = "name", timevar = "color", direction = "wide")
# name value.red value.blue value.green value.yellow 
#1 John   1   2   1   3 
#5 Mark   5   5   3   2 
#9 Dave   4   6   7   8
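The "value." prefix and the leftover row names (1, 5, 9) can be cleaned up afterwards. A minimal sketch of that extra step:

```r
# Same example data as in the question
df <- data.frame(
  name  = rep(c("John", "Mark", "Dave"), each = 4),
  color = rep(c("red", "blue", "green", "yellow"), times = 3),
  value = c(1, 2, 1, 3, 5, 5, 3, 2, 4, 6, 7, 8)
)

wide <- reshape(df, idvar = "name", timevar = "color", direction = "wide")

# Strip the "value." prefix that reshape() adds to the new columns
names(wide) <- sub("^value\\.", "", names(wide))

# Reset the row names to 1, 2, 3
rownames(wide) <- NULL
wide
```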

2016-08-03 04:46:50
