2011-03-06 46 views
8

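Eliminating and reducing overlapping date ranges

I have a set of date ranges that contains both partially and fully overlapping dates, like this: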

UserID StartDate EndDate 
====== ========== ========== 
1  2011-01-01 2011-01-02 <- A 
1  2011-01-01 2011-01-10 <- A 
1  2011-01-08 2011-02-15 <- A 
1  2011-02-20 2011-03-10 <- B 
2  2011-01-01 2011-01-20 <- C 
2  2011-01-15 2011-01-25 <- C 

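Using T-SQL, I would like to create a new data set, per user, that eliminates the overlapping data, extending ranges and removing redundant rows where needed, resulting in something like this: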

UserID StartDate EndDate 
====== ========== ========== 
1  2011-01-01 2011-02-15 ('A', three rows combined, extending the range) 
1  2011-02-20 2011-03-10 ('B', no change, no overlaps here) 
2  2011-01-01 2011-01-25 ('C', two rows combined) 

Cursors are fine, but if I can do this without them, that would be even better.

+0

Which version of SQL Server, 2005+? – RichardTheKiwi 2011-03-06 21:35:39

+0

Yes, SQL Server 2005+. – 2011-03-06 21:52:31

Answers

12

For SQL Server 2005+

-- sample table with data 
declare @t table(UserID int, StartDate datetime, EndDate datetime) 
insert @t select 
1, '20110101', '20110102' union all select 
1, '20110101', '20110110' union all select 
1, '20110108', '20110215' union all select 
1, '20110220', '20110310' union all select 
2, '20110101', '20110120' union all select 
2, '20110115', '20110125' 

-- your query starts below 

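-- collapse each run of consecutive days back into one range per user 
select UserID, Min(NewStartDate) StartDate, MAX(enddate) EndDate 
from 
(
    -- expand every range into one row per day it covers 
    select *, 
     NewStartDate = t.startdate+v.number, 
     -- day minus its dense rank is constant within a run of consecutive days 
     NewStartDateGroup = 
      dateadd(d, 
        1- DENSE_RANK() over (partition by UserID order by t.startdate+v.number), 
        t.startdate+v.number) 
    from @t t 
    -- type 'P' rows of spt_values supply the integers 0, 1, 2, ... 
    inner join master..spt_values v 
     on v.type='P' and v.number <= DATEDIFF(d, startdate, EndDate) 
) X 
group by UserID, NewStartDateGroup 
order by UserID, StartDate 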

Notes:

  1. Replace @t with your table name
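
For anyone wondering why grouping on NewStartDateGroup works, here is a small sketch of the idea in isolation. The table @d and its dates are made up for illustration and are not part of the question's data. Subtracting a day's dense rank from the day itself gives the same value for every day in an unbroken run, and a different value after each gap. DENSE_RANK() is used because the same day can appear more than once when ranges overlap, and duplicates must share a rank without leaving holes in the numbering.

```sql
-- hypothetical sample: a run of days (with one duplicate), a gap, then another run
declare @d table(dt datetime)
insert @d select
'20110101' union all select
'20110102' union all select
'20110102' union all select   -- duplicate day, as overlapping ranges would produce
'20110103' union all select
'20110107' union all select
'20110108'

select dt,
       rnk = DENSE_RANK() over (order by dt),
       -- same expression shape as NewStartDateGroup in the answer
       grp = dateadd(d, 1 - DENSE_RANK() over (order by dt), dt)
from @d
order by dt
```

The four rows from 2011-01-01 to 2011-01-03 get ranks 1, 2, 2, 3 and all land on grp = 2011-01-01, while 2011-01-07 and 2011-01-08 get ranks 4 and 5 and both land on grp = 2011-01-04. Grouping by grp therefore separates the two runs, and MIN of the day with MAX of the end date gives each merged range.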
+0

Good stuff! Works like a charm! I will have to look into **DENSE_RANK()**, which is new to me. Thanks! – 2011-03-06 21:57:40

+0

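It is worth mentioning that this will not work for date spans longer than the number of rows returned by 'master..spt_values'. In that case you can cross join the table to itself to provide a larger window size. – 2016-07-22 11:28:48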
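
A possible sketch of that cross join, under some assumptions: the sample rows below are invented to include a span of more than 2048 days, the alias `n` and the cap of 100000 (roughly 270 years of days) are arbitrary choices, and the type 'P' rows of spt_values are assumed to number at least a few hundred. ROW_NUMBER() is used to build the 0-based sequence so the result does not depend on the exact contents of spt_values.

```sql
-- hypothetical sample: the first span is longer than 2048 days
declare @t table(UserID int, StartDate datetime, EndDate datetime)
insert @t select
1, '20000101', '20101231' union all select
1, '20100601', '20110215'

select UserID, Min(NewStartDate) StartDate, MAX(EndDate) EndDate
from
(
    select t.UserID, t.EndDate,
     NewStartDate = t.StartDate+v.n,
     NewStartDateGroup =
      dateadd(d,
        1- DENSE_RANK() over (partition by t.UserID order by t.StartDate+v.n),
        t.StartDate+v.n)
    from @t t
    inner join
    (
        -- integers 0..99999 from spt_values cross joined to itself
        select top (100000)
         n = cast(ROW_NUMBER() over (order by a.number, b.number) - 1 as int)
        from master..spt_values a
        cross join master..spt_values b
        where a.type='P' and b.type='P'
        order by a.number, b.number
    ) v
     on v.n <= DATEDIFF(d, t.StartDate, t.EndDate)
) X
group by UserID, NewStartDateGroup
order by UserID, StartDate
```

With this sample the two rows should merge into a single range for user 1, from 2000-01-01 to 2011-02-15. A permanent numbers table would serve the same purpose and is likely cheaper if the query runs often.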

+0

Some DENSE_RANK() documentation: https://docs.microsoft.com/en-us/sql/t-sql/functions/dense-rank-transact-sql – Westy92 2017-07-19 22:12:50