2017-09-14 118 views
-2

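Creating a season variable in R

My season runs from October 1 to March 31 of the following year. How can I create a season dummy variable that shows when a person's exposure is in season and when it is out of season?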

df <- data.frame(ID= c(1:6), 
      Drug = c("A","C","A","A","B","A"), 
      Start = c("01/01/2009","07/10/2010","10/10/2009","03/01/2011","03/01/2012","04/12/2010"), 
      End=c("09/10/2009","04/20/2011","07/20/1010","01/01/2012","04/01/2013","09/30/2011")) 

My expected exposure output:

ID Drug  Start  End Season 
1 1 A 01/01/2009 09/10/2009  1 
2 1 A 01/01/2009 09/10/2009  0 
3 2 C 07/10/2010 04/20/2011  0 
4 2 C 07/10/2010 04/20/2011  1 
5 2 C 07/10/2010 04/20/2011  0 
6 3 A 10/10/2009 07/20/1010  1 
7 3 A 10/10/2009 07/20/1010  0 
8 3 A 10/10/2009 07/20/1010  1 
9 4 B 03/01/2011 01/01/2012  1 
10 4 B 03/01/2011 01/01/2012  0 
11 4 B 03/01/2011 01/01/2012  1 
12 5 A 03/01/2012 04/01/2013  1 
13 5 A 03/01/2012 04/01/2013  0 
14 5 A 03/01/2012 04/01/2013  1 
15 5 A 03/01/2012 04/01/2013  0 
16 6 A 04/12/2010 09/30/2011  0 

ID 1: she starts on 01/01 and ends on 09/10.

[01/01, 03/31] =1 

[03/31,09/10] = 0 

ID 2: she starts on 07/10/10 and ends on 04/20. I check:

[07/10, 10/01] = 0 

[10/01,03/31] = 1 

[03/31, 04/20] = 0 

ID 5: she starts on 03/01 and ends on 04/01.

[03/01, 03/31]= 1 

[03/31, 10/01] = 0 

[10/01, 03/31] = 1 

[03/31, 04/01] = 0 
+3

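I'm not clear on what you're asking. So patient 2 gets three rows with Season 0, 1, 0 because she started out of season, went through a season, and ended out of season? – lebelinoz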

+0

And patient 5 gets four rows because she went through four periods (two in season and two off season)? – lebelinoz

+0

She starts on 07/10/2010 and ends on 04/20/2011, so I check [07/10, 10/01] = 0, then [10/01, 03/31] = 1, [03/31, 04/20] = 0 – BIN

Answers

1

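I think I got ExposedIn and ExposedOut right with the code below (note: you need to add 'stringsAsFactors = FALSE' when you create your data frame). However, I didn't have enough time to work out the extra rows for the whole seasons covered in between. I would approach that by adding another column, built with date/time functions, that accounts for the entire treatment period.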

# Parse the month/day/year strings into Date objects 
df$Start <- as.Date(df$Start, format = '%m/%d/%Y') 
df$End <- as.Date(df$End, format = '%m/%d/%Y') 
# In season means October through March. POSIXlt months are 0-based 
# (Jan = 0), so Oct-Dec are 9-11 and Jan-Mar are 0-2. Testing the month 
# instead of a fixed day-of-year cutoff avoids the leap-year shift and 
# the off-by-one from $yday also being 0-based. 
in_season <- function(d) { 
  m <- as.POSIXlt(d)$mon 
  as.integer(m >= 9 | m <= 2) 
} 
df$ExposedIn <- in_season(df$Start) 
df$ExposedOut <- in_season(df$End) 

Hope this helps at least a little.
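
For the rows in between that this answer leaves open, one possible approach is to cut each treatment interval at every season boundary it crosses. The following is only a minimal base R sketch, not a tested solution for the full data set: `split_by_season` is a hypothetical helper name, and it assumes that April 1 is the first off-season day and October 1 the first in-season day (the question writes the closing boundary as 03/31). It uses IDs 1, 2 and 5 from the question, with the year for ID 1 taken from the data frame.

```r
# Hypothetical helper: split one [start, end] interval at every season
# boundary that falls inside it.
# Assumption: Oct 1 opens the season, Apr 1 is the first off-season day.
split_by_season <- function(start, end) {
  years <- as.integer(format(start, "%Y")):as.integer(format(end, "%Y"))
  # Candidate boundaries for every calendar year the interval touches
  bounds <- sort(as.Date(c(paste0(years, "-04-01"), paste0(years, "-10-01"))))
  # Keep only boundaries after the start day and no later than the end day
  bounds <- bounds[bounds > start & bounds <= end]
  seg_start <- c(start, bounds)
  seg_end <- c(bounds - 1, end)
  # A segment is in season when its first day falls in Oct-Mar
  m <- as.integer(format(seg_start, "%m"))
  data.frame(SegStart = seg_start, SegEnd = seg_end,
             Season = as.integer(m >= 10 | m <= 3))
}

# Three of the patients described in the question
df <- data.frame(
  ID = c(1, 2, 5),
  Start = as.Date(c("01/01/2009", "07/10/2010", "03/01/2012"), format = "%m/%d/%Y"),
  End = as.Date(c("09/10/2009", "04/20/2011", "04/01/2013"), format = "%m/%d/%Y")
)

# One output row per (patient, season segment)
pieces <- lapply(seq_len(nrow(df)), function(i) {
  cbind(ID = df$ID[i], split_by_season(df$Start[i], df$End[i]))
})
out <- do.call(rbind, pieces)
print(out)
```

With these assumptions, the Season column comes out as 1, 0 for ID 1, as 0, 1, 0 for ID 2, and as 1, 0, 1, 0 for ID 5, which is the pattern the question describes for those patients. The remaining columns (Drug, the original Start and End) could be carried along by merging `out` back onto the original data frame by ID.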