2017-10-12 82 views
-1

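Expanding a date value within a group when NAs follow

The data `dat` contains two general groups (`GenIndID`), each with multiple observations, some of which are NA in the `DLA` field. The `DLA` date is the same for all records within a group. How can I expand the `DLA` values so that the NA values are filled in with the corresponding date? I am working within dplyr, and I suspect there is a solution that I cannot find. These data are a small subset of a larger dataset with about 5k rows and about 500 individuals. Many thanks.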

dat <- structure(list(GenIndID = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("BHS_106", 
"BHS_164"), class = "factor"), IndID = structure(c(1L, 1L, 1L, 
1L, 2L, 2L, 2L, 2L, 3L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 7L, 
8L), .Label = c("BHS_106_A", "BHS_106_B", "BHS_106_C", "BHS_106_D", 
"BHS_164_A", "BHS_164_B", "BHS_164_C", "BHS_164_D"), class = "factor"), 
    DLA = structure(c(1507010400, 1507010400, 1507010400, 1507010400, 
    1507010400, 1507010400, 1507010400, 1507010400, NA, NA, 1499061600, 
    1499061600, 1499061600, 1499061600, 1499061600, 1499061600, 
    1499061600, NA, NA, NA), tzone = "", class = c("POSIXct", 
    "POSIXt"))), .Names = c("GenIndID", "IndID", "DLA"), row.names = c(411L, 
412L, 413L, 414L, 415L, 416L, 417L, 418L, 419L, 420L, 442L, 443L, 
444L, 445L, 446L, 447L, 448L, 449L, 450L, 451L), class = "data.frame") 

> dat 
    GenIndID  IndID  DLA 
411 BHS_106 BHS_106_A 2017-10-03 
412 BHS_106 BHS_106_A 2017-10-03 
413 BHS_106 BHS_106_A 2017-10-03 
414 BHS_106 BHS_106_A 2017-10-03 
415 BHS_106 BHS_106_B 2017-10-03 
416 BHS_106 BHS_106_B 2017-10-03 
417 BHS_106 BHS_106_B 2017-10-03 
418 BHS_106 BHS_106_B 2017-10-03 
419 BHS_106 BHS_106_C  <NA> 
420 BHS_106 BHS_106_D  <NA> 
442 BHS_164 BHS_164_A 2017-07-03 
443 BHS_164 BHS_164_A 2017-07-03 
444 BHS_164 BHS_164_A 2017-07-03 
445 BHS_164 BHS_164_A 2017-07-03 
446 BHS_164 BHS_164_A 2017-07-03 
447 BHS_164 BHS_164_A 2017-07-03 
448 BHS_164 BHS_164_A 2017-07-03 
449 BHS_164 BHS_164_B  <NA> 
450 BHS_164 BHS_164_C  <NA> 
451 BHS_164 BHS_164_D  <NA> 
+2

Possible duplicate of [How to replace NA with most recent non-NA by group?](https://stackoverflow.com/questions/39063253/how-to-replace-na-with-most-recent-non-na-by-group) or [Replace NA values with group value](https://stackoverflow.com/questions/23583739/)

+0

Yes, this is a duplicate. Apologies. How do I delete it? Because it has an answer, SO will not allow it (at least at my reputation level).

Answer

0

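After grouping by 'GenIndID', we need `fill` (from tidyr). Because the NAs sit at the bottom of each group, the default `.direction = 'down'` already does what is needed, so we do not have to specify it.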

library(dplyr)
library(tidyr)  # provides fill()

dat %>% 
    group_by(GenIndID) %>% 
    fill(DLA) 
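
The default downward fill only works when the known date comes before the NAs within each group. As a minimal sketch, assuming a small made-up data frame `toy` (hypothetical, not the asker's data) in which one group starts with an NA, `.direction = "downup"` covers both cases. A dplyr-only alternative that copies the group's first non-missing date to every row is shown alongside it:

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: one date per group, with NAs before and after it.
toy <- data.frame(
  GenIndID = c("g1", "g1", "g1", "g2", "g2", "g2"),
  DLA = as.POSIXct(c("2017-10-03", NA, NA, NA, "2017-07-03", NA), tz = "UTC"),
  stringsAsFactors = FALSE
)

# Fill down first, then up, so the leading NA in group g2 is also filled.
filled <- toy %>%
  group_by(GenIndID) %>%
  fill(DLA, .direction = "downup") %>%
  ungroup()

# dplyr-only alternative: broadcast the group's first non-missing date.
# This assumes each group really has a single date, as stated in the question.
filled2 <- toy %>%
  group_by(GenIndID) %>%
  mutate(DLA = DLA[!is.na(DLA)][1]) %>%
  ungroup()

print(filled)
```

Both approaches give the same result when each group has exactly one distinct date. If a group could hold several dates, `fill` carries the nearest one forward, while the `mutate` version overwrites every row with the first date, so the choice depends on the data.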