2017-09-25

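New list column from other vector columns with dplyr and rowwise

I have the tibble below, and I would like to create a fourth column from it that combines A, B and C into a single vector per row. I understand that tidyr::unite() can do this by creating a new character vector, but I am looking to create a list column that holds the vectors.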

Right now rowwise works, but it does not keep the columns of the input tibble. Any suggestions for keeping columns A_Vector through C_Vector?

Here is the code:

library(tidyverse) 

My_Data <- tibble(A_Vector = rnorm(10), 
        B_Vector = rnorm(10), 
        C_Vector = rnorm(10)) %>% 
      rowwise() %>% 
      do(Port_Weights = matrix(c(.$A_Vector,.$B_Vector,.$C_Vector),3,1)) 

And the result:

Source: local data frame [10 x 1] 
Groups: <by row> 

# A tibble: 10 x 1 
    Port_Weights 
*  <list> 
1 <dbl [3 x 1]> 
2 <dbl [3 x 1]> 
3 <dbl [3 x 1]> 
4 <dbl [3 x 1]> 
5 <dbl [3 x 1]> 
6 <dbl [3 x 1]> 
7 <dbl [3 x 1]> 
8 <dbl [3 x 1]> 
9 <dbl [3 x 1]> 
10 <dbl [3 x 1]> 

This does not work:

My_Data <- tibble(A_Vector = rnorm(10), 
        B_Vector = rnorm(10), 
        C_Vector = rnorm(10)) %>% 
    mutate(Port_Weights = rowwise() %>% do(matrix(c(.$A_Vector,.$B_Vector,.$C_Vector),3,1))) 

The long-winded version, which is obviously not a sensible way to do it:

My_Data <- tibble(A_Vector = rnorm(10), 
        B_Vector = rnorm(10), 
        C_Vector = rnorm(10)) 

Data_Unite <- My_Data %>% 
    rowwise() %>% 
    do(Port_Weights = matrix(c(.$A_Vector,.$B_Vector,.$C_Vector),3,1)) 

My_Data <- as.tibble(cbind(My_Data,Data_Unite)) 

But it does deliver the result I am after:

# A tibble: 10 x 4 
     A_Vector B_Vector C_Vector Port_Weights 
*  <dbl>  <dbl>  <dbl>  <list> 
1 -1.23504457 -0.3750408 -0.4214122 <dbl [3 x 1]> 
2 -0.90678699 0.5261914 1.1191229 <dbl [3 x 1]> 
3 -0.62944085 0.5995529 0.2096462 <dbl [3 x 1]> 
4 2.06171633 1.5399094 2.2972950 <dbl [3 x 1]> 
5 0.08761555 0.1424207 -1.4758585 <dbl [3 x 1]> 
6 -1.07334432 -1.9112787 0.4820864 <dbl [3 x 1]> 
7 -0.18655423 -1.3698855 0.6672621 <dbl [3 x 1]> 
8 -0.97961789 -0.8194373 -0.4158516 <dbl [3 x 1]> 
9 0.68112936 -1.9864507 1.0193449 <dbl [3 x 1]> 
10 0.61455438 0.5885380 -1.0925312 <dbl [3 x 1]> 

Another interesting one from many moons ago: https://stackoverflow.com/q/42077507/496803. I happened to find it in my favorite questions list while looking for something else today. – thelatemail

Answers

Data

library(tidyverse) 

my_tibble <- tibble(A_Vector = rnorm(10), 
        B_Vector = rnorm(10), 
        C_Vector = rnorm(10)) 

When adding a column to a data frame, use mutate instead of do. Use Map to loop over the three vectors in parallel and build the matrix for each row:

my_tibble %>% 
    mutate(Port_Weights = Map(function(...) matrix(c(...), 3, 1), A_Vector, B_Vector, C_Vector)) 

# A tibble: 10 x 4 
#  A_Vector B_Vector C_Vector Port_Weights 
#   <dbl>  <dbl>  <dbl>  <list> 
# 1 0.62674726 -0.5432169 -1.66763618 <dbl [3 x 1]> 
# 2 -0.47346722 -0.4436020 -1.04892634 <dbl [3 x 1]> 
# 3 0.19059238 -1.6733052 2.79275828 <dbl [3 x 1]> 
# 4 -0.23501873 -1.1664704 -0.19324676 <dbl [3 x 1]> 
# 5 0.66552642 -1.3328070 -1.53575954 <dbl [3 x 1]> 
# 6 -0.41251920 -0.2056882 1.66537220 <dbl [3 x 1]> 
# 7 0.48396052 0.3968486 0.16110407 <dbl [3 x 1]> 
# 8 0.43035213 -0.6433268 1.61640228 <dbl [3 x 1]> 
# 9 0.06747126 -1.0146385 -0.47824193 <dbl [3 x 1]> 
#10 0.79916411 -1.2349901 -0.05151402 <dbl [3 x 1]> 
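
To see what Map is doing here, the following minimal sketch rebuilds the column on a smaller tibble and inspects one element. The name small_tibble, the three-row size and the seed are illustrative assumptions, not part of the original answer.

```r
library(tidyverse)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Hypothetical three-row version of the sample data
small_tibble <- tibble(A_Vector = rnorm(3),
                       B_Vector = rnorm(3),
                       C_Vector = rnorm(3))

# Map walks the three columns in parallel and calls the function once per row
with_weights <- small_tibble %>%
  mutate(Port_Weights = Map(function(...) matrix(c(...), 3, 1),
                            A_Vector, B_Vector, C_Vector))

# Each element of the list column is a 3 x 1 matrix built from one row
first_weights <- with_weights$Port_Weights[[1]]
dim(first_weights)

# The matrix entries match the first row of the three input columns
all.equal(as.numeric(first_weights),
          c(small_tibble$A_Vector[1],
            small_tibble$B_Vector[1],
            small_tibble$C_Vector[1]))
```

Because mutate only adds the new column, the three input columns stay in place, which is what the question asks for.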

If the elements do not have to be matrices:

my_tibble %>% mutate(Port_Weights = Map(c, A_Vector, B_Vector, C_Vector)) 

which is equivalent to (with data.table::transpose):

my_tibble %>% mutate(Port_Weights = data.table::transpose(as.list(.))) 

Or 'rowwise() %>% mutate(Port_Weights = list(matrix(c(A_Vector,B_Vector,C_Vector),3,1)))' – thelatemail

@thelatemail Nice option, if speed is not an issue. On the sample data, 'rowwise' seems to be slower, which is not surprising. – Psidom
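
For readers who want to try the rowwise suggestion from the comment, here is a minimal runnable sketch that places it next to the Map approach and checks that both produce the same list column. The object names by_row and by_map and the seed are assumptions made for this illustration; the closing ungroup() is an addition that the comment does not include.

```r
library(tidyverse)

set.seed(42)  # arbitrary seed for reproducibility

my_tibble <- tibble(A_Vector = rnorm(10),
                    B_Vector = rnorm(10),
                    C_Vector = rnorm(10))

# Row-by-row variant from the comment; list() wraps the matrix so that
# each row contributes exactly one list element
by_row <- my_tibble %>%
  rowwise() %>%
  mutate(Port_Weights = list(matrix(c(A_Vector, B_Vector, C_Vector), 3, 1))) %>%
  ungroup()  # drop the row-wise grouping before any further work

# Map-based variant from the answer
by_map <- my_tibble %>%
  mutate(Port_Weights = Map(function(...) matrix(c(...), 3, 1),
                            A_Vector, B_Vector, C_Vector))

# Both approaches should yield the same list of 3 x 1 matrices
all.equal(by_row$Port_Weights, by_map$Port_Weights)
```

To compare speed on your own data, each pipeline can be wrapped in system.time(); no timings are claimed here.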

Since you are using the tidyverse, you could also consider the pmap function from the purrr package, which is part of the tidyverse.

set.seed(123) 

library(tidyverse) 

My_Data <- tibble(A_Vector = rnorm(10), 
        B_Vector = rnorm(10), 
        C_Vector = rnorm(10)) 

My_Data2 <- My_Data %>% 
    mutate(Port_Weights = pmap(.l = list(A_Vector, B_Vector, C_Vector), 
          .f = function(x, y, z) matrix(c(x, y, z), 3, 1))) 

My_Data2 
# A tibble: 10 x 4 
     A_Vector B_Vector C_Vector Port_Weights 
     <dbl>  <dbl>  <dbl>  <list> 
1 -0.56047565 1.2240818 -1.0678237 <dbl [3 x 1]> 
2 -0.23017749 0.3598138 -0.2179749 <dbl [3 x 1]> 
3 1.55870831 0.4007715 -1.0260044 <dbl [3 x 1]> 
4 0.07050839 0.1106827 -0.7288912 <dbl [3 x 1]> 
5 0.12928774 -0.5558411 -0.6250393 <dbl [3 x 1]> 
6 1.71506499 1.7869131 -1.6866933 <dbl [3 x 1]> 
7 0.46091621 0.4978505 0.8377870 <dbl [3 x 1]> 
8 -1.26506123 -1.9666172 0.1533731 <dbl [3 x 1]> 
9 -0.68685285 0.7013559 -1.1381369 <dbl [3 x 1]> 
10 -0.44566197 -0.4727914 1.2538149 <dbl [3 x 1]>
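
As a follow-up sketch, the same pmap call can be written with purrr's formula shorthand, and the resulting list column can be read back with map_dbl. The names My_Data3 and Weight_Sum are hypothetical and used only for this illustration; summing the weights is just one example of consuming the column.

```r
library(tidyverse)

set.seed(123)

My_Data <- tibble(A_Vector = rnorm(10),
                  B_Vector = rnorm(10),
                  C_Vector = rnorm(10))

# Same idea as the pmap call in this answer, using the formula shorthand:
# ..1, ..2 and ..3 refer to the first, second and third input
My_Data3 <- My_Data %>%
  mutate(Port_Weights = pmap(list(A_Vector, B_Vector, C_Vector),
                             ~ matrix(c(..1, ..2, ..3), 3, 1)))

# Pull a single matrix back out of the list column
My_Data3$Port_Weights[[1]]

# Summarise each matrix into a plain numeric column
# (Weight_Sum is a hypothetical column name used only for this sketch)
My_Data3 <- My_Data3 %>%
  mutate(Weight_Sum = map_dbl(Port_Weights, sum))

# The per-row sums agree with adding the three source columns directly
all.equal(My_Data3$Weight_Sum,
          with(My_Data3, A_Vector + B_Vector + C_Vector))
```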