2017-10-21

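R: LOCF until the end of the month in an xts object

I have been struggling to find a good way to carry the last value observed within a month forward to the end of that month in my xts object.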

2010-02-26  4029.027 
2010-02-27  4029.027 
2010-02-28  4029.027 
2010-03-04  4029.027 
2010-03-05  4029.027 
2010-03-20  4029.027 
2010-03-26  4029.027 
2010-03-27  4029.027 
2010-03-28  4029.027 
2010-03-31  4029.027 
2010-04-02  4029.027 
2010-04-03  5956.582 
2010-04-04   NA 
2010-04-11   NA 
2010-04-24   NA 
2010-04-25   NA 
2010-04-28   NA 
2010-04-30   NA 
2010-05-01   NA 

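As you can see from my data above, I have NAs after 2010-04-03. Ideally I would like to carry 5956.582 forward until the end of the month, so that my data looks like this: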

2010-02-26  4029.027 
2010-02-27  4029.027 
2010-02-28  4029.027 
2010-03-04  4029.027 
2010-03-05  4029.027 
2010-03-20  4029.027 
2010-03-26  4029.027 
2010-03-27  4029.027 
2010-03-28  4029.027 
2010-03-31  4029.027 
2010-04-02  4029.027 
2010-04-03  5956.582 
2010-04-04  5956.582 
2010-04-11  5956.582 
2010-04-24  5956.582 
2010-04-25  5956.582 
2010-04-28  5956.582 
2010-04-30  5956.582 
2010-05-01   NA 

Before I start writing my own function to do this, I was wondering whether anyone knows of another way?

Thanks

ST

Answers

2

Use ave, as.yearmon and na.locf0 from the zoo package (which xts loads). This uses no packages beyond the xts/zoo you are already using.

library(xts) 
ave(x, as.yearmon(time(x)), FUN = na.locf0) 

giving:

   [,1] 
2010-02-26 4029.027 
2010-02-27 4029.027 
2010-02-28 4029.027 
2010-03-04 4029.027 
2010-03-05 4029.027 
2010-03-20 4029.027 
2010-03-26 4029.027 
2010-03-27 4029.027 
2010-03-28 4029.027 
2010-03-31 4029.027 
2010-04-02 4029.027 
2010-04-03 5956.582 
2010-04-04 5956.582 
2010-04-11 5956.582 
2010-04-24 5956.582 
2010-04-25 5956.582 
2010-04-28 5956.582 
2010-04-30 5956.582 
2010-05-01  NA 
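
As a small illustration of why this combination works, the sketch below uses a made-up vector v and made-up dates d (neither is from the question). It shows that na.locf0 always returns a vector of the same length as its input, which is what ave needs from FUN, and that as.yearmon gives one grouping key per calendar month.

```r
library(zoo)

# Hypothetical input, not the question's data
v <- c(NA, 1, NA, 2, NA)

kept    <- na.locf0(v)  # NA 1 1 2 2: same length as v, leading NA kept
trimmed <- na.locf(v)   # 1 1 2 2: the default na.rm = TRUE drops the leading NA

# Dates in the same calendar month share a yearmon key,
# so ave() fills each month separately and never across a month boundary
d   <- as.Date(c("2010-04-03", "2010-04-30", "2010-05-01"))
key <- as.yearmon(d)
key[1] == key[2]  # TRUE
key[2] == key[3]  # FALSE
```

Because the fill restarts in every group, the lone NA on 2010-05-01 has nothing earlier in its own month to carry forward and stays NA.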

Note:

The input x in reproducible form is:

Lines <- " 
2010-02-26  4029.027 
2010-02-27  4029.027 
2010-02-28  4029.027 
2010-03-04  4029.027 
2010-03-05  4029.027 
2010-03-20  4029.027 
2010-03-26  4029.027 
2010-03-27  4029.027 
2010-03-28  4029.027 
2010-03-31  4029.027 
2010-04-02  4029.027 
2010-04-03  5956.582 
2010-04-04   NA 
2010-04-11   NA 
2010-04-24   NA 
2010-04-25   NA 
2010-04-28   NA 
2010-04-30   NA 
2010-05-01   NA" 

library(xts) 

z <- read.zoo(text = Lines) 
x <- as.xts(z) 

This works a treat and is much easier to understand. Many thanks. – SyTrade


How would this be applied to an xts object with multiple columns? I tried "apply" and it complained. – SyTrade


'xx <- cbind(a = x, b = x); xx[] <- apply(xx, 2, function(x) ave(x, as.yearmon(time(xx)), FUN = na.locf0))' –
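
The multi-column idea from the comment above can also be written as an explicit loop, which may be easier to follow. This is only a sketch under assumptions: the two-column object xx and its values are made up for illustration, and rebuilding the result with xts() is one possible choice, not the only one.

```r
library(xts)  # attaches zoo as well

# Hypothetical two-column data, not the question's data
idx <- as.Date(c("2010-04-02", "2010-04-03", "2010-04-30", "2010-05-01"))
xx  <- xts(cbind(a = c(1, NA, NA, NA), b = c(NA, 2, NA, NA)), order.by = idx)

ym <- as.yearmon(index(xx))  # one grouping key per calendar month
m  <- coredata(xx)           # plain numeric matrix with columns a and b

# Fill each column month by month; na.locf0 preserves length,
# so every result fits back into its column
for (j in seq_len(ncol(m))) {
  m[, j] <- ave(m[, j], ym, FUN = na.locf0)
}

filled <- xts(m, order.by = index(xx))
```

Column a is filled through 2010-04-30 and is NA again on 2010-05-01. Column b keeps its leading NA on 2010-04-02 because there is no earlier value in that month to carry forward.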

0

Try this; it uses zoo::na.locf to fill in the NAs.

Your data

df <- read.table(text="2010-02-26  4029.027 
2010-02-27  4029.027 
2010-02-28  4029.027 
2010-03-04  4029.027 
2010-03-05  4029.027 
2010-03-20  4029.027 
2010-03-26  4029.027 
2010-03-27  4029.027 
2010-03-28  4029.027 
2010-03-31  4029.027 
2010-04-02  4029.027 
2010-04-03  5956.582 
2010-04-04   NA 
2010-04-11   NA 
2010-04-24   NA 
2010-04-25   NA 
2010-04-28   NA 
2010-04-30   NA 
2010-05-01   NA", header=FALSE) 

Solution

library(dplyr) 
library(zoo) 
library(lubridate) 

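Your May data is a problem, because it is a single NA observation for that month. That is why I use the condition if (!all(is.na(.x$V2))) to decide whether to run mutate(V2 = na.locf(V2, na.rm = FALSE)). The test is wrapped in all() because if() needs a single TRUE/FALSE value, and na.rm = FALSE keeps any leading NAs so that each month keeps its original length.

result <- df %>% 
      mutate(V1 = ymd(V1)) %>%  # convert to Date just in case 
      split(month(.$V1)) %>%   # split data by month (the sample covers a single year) 
      # map() comes from purrr, which the library() calls above do not attach 
      purrr::map(~ if (!all(is.na(.x$V2))) mutate(.x, V2 = na.locf(V2, na.rm = FALSE)) else .x) 
ans <- Reduce("rbind", result)  # recombine the monthly pieces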

      # V1  V2 
# 1 2010-02-26 4029.027 
# 2 2010-02-27 4029.027 
# 3 2010-02-28 4029.027 
# 4 2010-03-04 4029.027 
# 5 2010-03-05 4029.027 
# 6 2010-03-20 4029.027 
# 7 2010-03-26 4029.027 
# 8 2010-03-27 4029.027 
# 9 2010-03-28 4029.027 
# 10 2010-03-31 4029.027 
# 11 2010-04-02 4029.027 
# 12 2010-04-03 5956.582 
# 13 2010-04-04 5956.582 
# 14 2010-04-11 5956.582 
# 15 2010-04-24 5956.582 
# 16 2010-04-25 5956.582 
# 17 2010-04-28 5956.582 
# 18 2010-04-30 5956.582 
# 19 2010-05-01  NA
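
One caveat with splitting on month(): it groups by month number only, so April 2010 and April 2011 would land in the same group. A possible variant, sketched here with made-up two-year data (df2 and its values are hypothetical), groups on a year-month key inside dplyr instead of using split() and Reduce().

```r
library(dplyr)
library(zoo)

# Hypothetical data spanning two years, not the question's data
df2 <- data.frame(
  V1 = as.Date(c("2010-04-03", "2010-04-30", "2011-04-01", "2011-04-02")),
  V2 = c(5, NA, NA, 7)
)

ans2 <- df2 %>%
  group_by(ym = format(V1, "%Y-%m")) %>%        # key such as "2010-04"
  mutate(V2 = na.locf(V2, na.rm = FALSE)) %>%   # fill within each year-month only
  ungroup() %>%
  select(-ym) %>%
  as.data.frame()
```

Here the NA on 2011-04-01 stays NA, because the value 5 from April 2010 belongs to a different year-month group. With na.rm = FALSE an all-NA group is simply returned unchanged, so no special case is needed for it.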