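R - Eliminating duplicate values with a while loop

2016-11-30

I have a time-series data set whose initial observations come from monthly data. I converted the dates to daily and placed every value at the start of its month. Now I want to add one day to each duplicated value until there are no overlapping dates left in the data set. This step is essential for the subsequent analysis and plotting.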

Here is code that generates a data set similar to mine:

sample <- rbind("2007-01-01","2007-02-01","2007-03-01","2007-05-01", 
      "2007-06-01","2007-07-01","2007-09-01","2007-10-01", 
      "2007-11-01","2007-12-01","2008-01-01","2008-02-01", 
      "2008-03-01","2008-05-01","2008-06-01","2008-07-01", 
      "2008-09-01","2008-10-01","2008-11-01","2008-12-01", 
      "2009-02-01","2009-04-01","2009-05-01","2009-06-01", 
      "2009-07-01","2009-09-01","2009-10-01","2009-11-01", 
      "2009-12-01","2010-01-01","2010-02-01","2010-03-01", 
      "2010-04-01","2010-05-01","2010-05-01","2010-05-01", 
      "2010-05-01","2010-05-01","2010-06-01","2010-06-01", 
      "2010-06-01","2010-06-01","2010-07-01","2010-07-01", 
      "2010-07-01","2010-07-01","2010-07-01","2010-08-01", 
      "2010-08-01","2010-08-01","2010-08-01","2010-09-01", 
      "2010-09-01","2010-09-01","2010-09-01","2010-09-01", 
      "2010-10-01","2010-10-01","2010-10-01","2010-10-01", 
      "2010-10-01","2010-11-01","2010-11-01","2010-11-01", 
      "2010-11-01","2010-11-01","2010-12-01","2010-12-01", 
      "2010-12-01","2010-12-01","2010-12-01","2011-01-01", 
      "2011-01-01","2011-01-01","2011-01-01","2011-02-01", 
      "2011-02-01","2011-02-01","2011-02-01","2011-03-01", 
      "2011-03-01","2011-03-01","2011-03-01","2011-04-01", 
      "2011-04-01","2011-04-01","2011-04-01","2011-04-01", 
      "2011-05-01","2011-05-01","2011-05-01","2011-05-01", 
      "2011-05-01","2011-06-01","2011-06-01","2011-06-01", 
      "2011-06-01","2011-06-01","2011-07-01","2011-07-01", 
      "2011-07-01","2011-07-01","2011-08-01","2011-08-01", 
      "2011-08-01","2011-09-01","2011-09-01","2011-09-01", 
      "2011-09-01","2011-10-01","2011-10-01","2011-10-01", 
      "2011-10-01","2011-10-01","2011-11-01","2011-11-01", 
      "2011-11-01","2011-11-01","2011-11-01","2011-12-01", 
      "2011-12-01","2011-12-01","2011-12-01","2011-12-01", 
      "2012-01-01","2012-01-01","2012-01-01","2012-01-01", 
      "2012-01-01","2012-02-01","2012-02-01","2012-02-01", 
      "2012-02-01","2012-02-01","2012-03-01","2012-03-01", 
      "2012-03-01","2012-03-01","2012-03-01","2012-04-01", 
      "2012-04-01","2012-04-01","2012-04-01","2012-05-01", 
      "2012-05-01","2012-05-01","2012-05-01","2012-05-01", 
      "2012-06-01","2012-06-01","2012-06-01","2012-06-01", 
      "2012-06-01","2012-07-01","2012-07-01","2012-07-01", 
      "2012-07-01","2012-07-01","2012-08-01","2012-08-01", 
      "2012-08-01","2012-09-01","2012-09-01","2012-09-01", 
      "2012-09-01","2012-09-01","2012-10-01","2012-10-01", 
      "2012-10-01","2012-10-01","2012-10-01","2012-11-01", 
      "2012-11-01","2012-11-01","2012-11-01","2012-11-01", 
      "2012-12-01","2012-12-01","2012-12-01","2013-01-01", 
      "2013-01-01","2013-01-01","2013-01-01","2013-01-01", 
      "2013-02-01","2013-02-01","2013-02-01","2013-02-01", 
      "2013-02-01","2013-03-01","2013-03-01","2013-03-01", 
      "2013-03-01","2013-03-01","2013-04-01","2013-04-01", 
      "2013-04-01","2013-04-01","2013-04-01","2013-05-01", 
      "2013-05-01","2013-05-01","2013-05-01","2013-05-01", 
      "2013-06-01","2013-06-01","2013-06-01","2013-06-01", 
      "2013-07-01","2013-07-01","2013-07-01","2013-07-01", 
      "2013-08-01","2013-08-01","2013-08-01","2013-09-01", 
      "2013-09-01","2013-09-01","2013-09-01","2013-09-01", 
      "2013-10-01","2013-10-01","2013-10-01","2013-10-01", 
      "2013-10-01","2013-11-01","2013-11-01","2013-11-01", 
      "2013-11-01","2013-11-01","2013-12-01","2013-12-01", 
      "2013-12-01","2013-12-01","2013-12-01","2014-01-01", 
      "2014-01-01","2014-01-01","2014-01-01","2014-01-01", 
      "2014-02-01","2014-02-01","2014-02-01","2014-02-01", 
      "2014-02-01","2014-03-01","2014-03-01","2014-03-01", 
      "2014-03-01","2014-03-01","2014-05-01","2014-05-01", 
      "2014-05-01","2014-05-01","2014-05-01","2014-06-01", 
      "2014-06-01","2014-06-01","2014-07-01","2014-07-01", 
      "2014-07-01","2014-07-01","2014-08-01","2014-08-01", 
      "2014-09-01","2014-09-01","2014-09-01","2014-09-01", 
      "2014-09-01","2014-10-01","2014-10-01","2014-10-01", 
      "2014-10-01","2014-11-01","2014-11-01","2014-11-01", 
      "2014-11-01","2014-12-01","2014-12-01","2014-12-01", 
      "2015-01-01","2015-01-01","2015-01-01","2015-01-01", 
      "2015-02-01","2015-02-01","2015-02-01","2015-02-01", 
      "2015-03-01","2015-03-01","2015-03-01","2015-03-01", 
      "2015-04-01","2015-04-01","2015-04-01","2015-04-01", 
      "2015-05-01","2015-05-01","2015-06-01","2015-06-01", 
      "2015-06-01","2015-07-01","2015-07-01","2015-08-01", 
      "2015-08-01","2015-09-01","2015-09-01","2015-09-01", 
      "2015-10-01","2015-10-01","2015-11-01","2015-11-01", 
      "2015-12-01","2016-01-01","2016-01-01","2016-01-01", 
      "2016-01-01","2016-02-01","2016-02-01","2016-02-01", 
      "2016-02-01","2016-03-01","2016-04-01","2016-04-01", 
      "2016-04-01","2016-04-01","2016-05-01","2016-05-01", 
      "2016-06-01","2016-06-01","2016-06-01","2016-06-01", 
      "2016-07-01","2016-07-01","2016-07-01","2016-07-01", 
      "2016-08-01","2016-08-01","2016-08-01","2016-08-01", 
      "2016-08-01","2016-08-01","2016-08-01","2016-08-01", 
      "2016-08-01","2016-08-01","2016-09-01","2016-09-01", 
      "2016-09-01","2016-09-01","2016-10-01","2016-10-01", 
      "2016-10-01","2016-11-01","2016-11-01") 
sample <- as.data.frame(sample) 
sample$Value <- (1:355) 
colnames(sample)[1] <- c("Date") 
View(sample) 

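After reading up on this a bit, I concluded that what I need to do is run a while loop over the Date column and add one day to each value if it is a duplicate. Using the lubridate package, I wrote something like this: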

library(lubridate)  
while(sample$Date==sample$Date[-1]) {sample$Date <- sample$Date+days(1); print(sample$Date);} 

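However, the loop does not run and produces a lot of warnings. Do you have any idea how to fix this? I think it is a fairly simple problem; I am just new to loops.

Thanks!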

Please share a smaller data set and the expected output! –

Answer

We can do this with data.table. First, we set things up, including converting the dates from factor class to Date:

library(data.table) 
setDT(sample) 
sample[ , Date := as.Date(Date) ] 

Then we perform the transformation:

sample[ , Date := Date + (seq_len(.N) - 1L), by = Date ] 

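What we are doing here is splitting the data into subsets of matching Date values and adding a sequence vector to each one. For example, a subset with 4 matching dates gets c(0,1,2,3) days added to its date vector, so the first value stays unchanged and each subsequent value is incremented in the way you described.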
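
The same grouping idea can also be sketched in base R, for readers who do not use data.table. This is only a minimal sketch on a small made-up data frame (`df` and its values are illustrative assumptions, not the asker's data); it uses `ave()` to number the rows within each group of identical dates.

```r
# Small illustrative data set (hypothetical values, not the asker's data)
df <- data.frame(
  Date  = as.Date(c("2010-05-01", "2010-05-01", "2010-05-01",
                    "2010-06-01", "2010-06-01", "2010-07-01")),
  Value = 1:6
)

# Within each group of identical dates, number the rows 1, 2, 3, ...
# and subtract 1 so that the first row of each group gets an offset of 0.
# The dates are passed as character strings purely as a grouping key.
offset <- ave(seq_along(df$Date), as.character(df$Date), FUN = seq_along) - 1L

# Adding an integer to a Date shifts it by that many days
df$Date <- df$Date + offset

print(df)
```

As with the data.table version, this only guarantees unique dates as long as no group is large enough to run into the next distinct date.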

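Thank you! I wonder how I could do this not just by adding one day, but by distributing the overlapping values at equal distances from each other across the whole month? – eborbath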

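You're welcome! If this answered your question, please mark it as accepted... For an evenly spaced sequence, you will probably want 'seq.Date'. If you need help with 'seq.Date', you can ask a new question. – rosscova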
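
One possible way to approach the even spacing mentioned in the comments is sketched below. It is not taken from the thread: `spread_in_month` is a hypothetical helper name, and the choice to round the offsets down to whole days is an assumption. `seq()` on a Date (which dispatches to `seq.Date`) is used only to find the first day of the following month.

```r
# Hypothetical helper: spread n observations evenly over the month that
# starts at first_day (assumed to be a Date on the 1st of a month).
spread_in_month <- function(first_day, n) {
  # First day of the following month, found with seq.Date
  next_month <- seq(first_day, by = "month", length.out = 2)[2]
  days_in_month <- as.integer(next_month - first_day)
  # n whole-day offsets spaced evenly across the month, rounded down
  first_day + floor((seq_len(n) - 1) * days_in_month / n)
}

# Four observations in May 2010 (31 days)
may_dates <- spread_in_month(as.Date("2010-05-01"), 4)
print(may_dates)
```

The resulting dates are distinct only when n does not exceed the number of days in the month.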