2016-11-25

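T-SQL: filling ranges in a time series

I have a dataset containing a time series in YYYYMM format, plus two columns that are basically true/false flags. Based on those flags, I want to add two extra columns that show the range each row currently falls in: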

Month   Default  Cure
201301  0        NULL
201302  0        NULL
201303  0        NULL
201304  1        NULL
201305  1        NULL
201306  1        NULL
201307  1        NULL
201308  NULL     0
201309  NULL     0
201310  NULL     1
201311  0        NULL
201312  0        NULL
201401  0        NULL
201402  0        NULL
201403  1        NULL
201404  1        NULL
201405  0        NULL
201406  0        NULL
201407  NULL     1
201408  NULL     0
201409  NULL     0
201410  NULL     0
201411  NULL     0
201412  NULL     0

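In this dataset you can see that the Default column is set to 1 for periods 201304, 201305, 201306 and 201307, and that the Cure column is set to 1 for period 201310.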

This basically means that the default is in effect from period 201304 through period 201310. Ultimately I want to produce the following result set:

Month   Default  Cure  DefaultPeriod  CurePeriod
201301  0        NULL  NULL           NULL
201302  0        NULL  NULL           NULL
201303  0        NULL  NULL           NULL
201304  1        NULL  201304         201310
201305  1        NULL  201304         201310
201306  1        NULL  201304         201310
201307  1        NULL  201304         201310
201308  NULL     0     201304         201310
201309  NULL     0     201304         201310
201310  NULL     1     201304         201310
201311  0        NULL  NULL           NULL
201312  0        NULL  NULL           NULL
201401  0        NULL  NULL           NULL
201402  0        NULL  NULL           NULL
201403  1        NULL  201403         201407
201404  1        NULL  201403         201407
201405  0        NULL  201403         201407
201406  0        NULL  201403         201407
201407  NULL     1     201403         201407
201408  NULL     0     NULL           NULL
201409  NULL     0     NULL           NULL
201410  NULL     0     NULL           NULL
201411  NULL     0     NULL           NULL
201412  NULL     0     NULL           NULL

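Multiple ranges can occur, but they cannot overlap. How would I go about achieving this? I have tried various min/max period joins of the table against itself, but I cannot seem to find a workable solution.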

You say you have '...two columns, basically true/false flags...', but there are also 'NULL' values. Are you treating 'NULL == 0 == FALSE'? If so, why not fill the 'NULL' fields with '0'? – Tony

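At '201308', the '0' value in 'Cure' keeps the range going, but at '201311' the '0' value in 'Default' does *not* keep the range going. Why not? At '201405' the '0' value in 'Default' does keep the range going, after all... – AakashM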

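Basically, from the description or the example it is not at all clear a) what counts as true and what counts as false, and b) what rule determines what constitutes a range – AakashM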

Answers

1

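This one was a real thinker :)

Basically, I partition the data at the "Cure" months (c1) and number each group (c2). Then I look up the first Default month and the last Cure month of each group (c3, c4), and finally apply some logic to blank out the rows that come before the default starts.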

declare @t table 
(
    [Month] varchar(6), 
    [Default] bit, 
    [Cure] bit 
); 

insert into @t values('201301', 0,  NULL); 
insert into @t values('201302', 0,  NULL); 
insert into @t values('201303', 0,  NULL); 
insert into @t values('201304', 1,  NULL); 
insert into @t values('201305', 1,  NULL); 
insert into @t values('201306', 1,  NULL); 
insert into @t values('201307', 1,  NULL); 
insert into @t values('201308', NULL,  0); 
insert into @t values('201309', NULL,  0); 
insert into @t values('201310', NULL,  1); 
insert into @t values('201311', 0,  NULL); 
insert into @t values('201312', 0,  NULL); 
insert into @t values('201401', 0,  NULL); 
insert into @t values('201402', 0,  NULL); 
insert into @t values('201403', 1,  NULL); 
insert into @t values('201404', 1,  NULL); 
insert into @t values('201405', 0,  NULL); 
insert into @t values('201406', 0,  NULL); 
insert into @t values('201407', NULL,  1); 
insert into @t values('201408', NULL,  0); 
insert into @t values('201409', NULL,  0); 
insert into @t values('201410', NULL,  0); 
insert into @t values('201411', NULL,  0); 
insert into @t values('201412', NULL,  0); 


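-- c1: group markers: the first month, plus every month where Cure = 1
with c1 as 
(
    select min([Month]) [Month], 1 x from @t 
    union all 
    select [Month], 1 from @t 
    where Cure = 1 
), 
-- c2: running count of the markers on earlier rows only, so a Cure = 1 row
-- stays in the group it closes and the following row opens the next group
c2 as 
(
    select t.[Month], [Default], [Cure], 
        sum(x) over (order by t.[Month] rows between unbounded preceding and 1 preceding) grp 
    from @t t 
    left outer join c1 on c1.[Month] = t.[Month] 
), 
-- c3: first Default = 1 month in each group
c3 as 
(
    select grp, min([Month]) [Month] 
    from c2 
    where [Default] = 1 
    group by grp 
), 
-- c4: last Cure = 1 month in each group
c4 as 
(
    select grp, max([Month]) [Month] 
    from c2 
    where [Cure] = 1 
    group by grp 
) 
-- rows that precede the first default of their group get NULL in both columns
select c2.[Month], c2.[Default], c2.[Cure], 
    case when c2.[Month] >= c3.[Month] then c3.[Month] else null end as DefaultPeriod, 
    case when c2.[Month] >= c3.[Month] then c4.[Month] else null end as CurePeriod 
from c2 
left outer join c3 on c2.grp = c3.grp 
left outer join c4 on c2.grp = c4.grp 
order by c2.[Month];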
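
As an alternative sketch (not the answerer's query), the same grouping idea can probably be written with window aggregates only, assuming SQL Server 2012 or later for the `rows between` frame. The CTE names `g` and `r` and the helper columns `DefaultStart` and `CureEnd` are made up for this illustration. Here the group number is simply the count of `Cure = 1` rows on earlier rows, with `isnull` covering the first row, whose frame is empty.

```sql
-- Same sample data as in the question, loaded with multi-row VALUES.
declare @t table ([Month] varchar(6), [Default] bit, [Cure] bit);

insert into @t values
    ('201301', 0, NULL), ('201302', 0, NULL), ('201303', 0, NULL),
    ('201304', 1, NULL), ('201305', 1, NULL), ('201306', 1, NULL),
    ('201307', 1, NULL), ('201308', NULL, 0), ('201309', NULL, 0),
    ('201310', NULL, 1), ('201311', 0, NULL), ('201312', 0, NULL),
    ('201401', 0, NULL), ('201402', 0, NULL), ('201403', 1, NULL),
    ('201404', 1, NULL), ('201405', 0, NULL), ('201406', 0, NULL),
    ('201407', NULL, 1), ('201408', NULL, 0), ('201409', NULL, 0),
    ('201410', NULL, 0), ('201411', NULL, 0), ('201412', NULL, 0);

with g as
(
    -- grp = number of Cure = 1 rows strictly before the current row,
    -- so a cure month stays in the group it closes (hypothetical CTE name)
    select [Month], [Default], [Cure],
        isnull(sum(case when [Cure] = 1 then 1 else 0 end)
                   over (order by [Month]
                         rows between unbounded preceding and 1 preceding), 0) as grp
    from @t
),
r as
(
    -- first default month and last cure month of each group
    -- (DefaultStart and CureEnd are illustrative column names)
    select [Month], [Default], [Cure],
        min(case when [Default] = 1 then [Month] end) over (partition by grp) as DefaultStart,
        max(case when [Cure] = 1 then [Month] end) over (partition by grp) as CureEnd
    from g
)
-- rows before the group's first default keep NULL in both output columns
select [Month], [Default], [Cure],
    case when [Month] >= DefaultStart then DefaultStart end as DefaultPeriod,
    case when [Month] >= DefaultStart then CureEnd end as CurePeriod
from r
order by [Month];
```

On the sample data this should give 201304/201310 for months 201304 through 201310 and 201403/201407 for months 201403 through 201407, with NULL elsewhere. A default that has not been cured yet would show its DefaultPeriod with a NULL CurePeriod. Whether that is the desired behaviour depends on the range rules the comments ask about.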