2014-09-26 74 views
1

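SQL query for comparing rows against each other within a single table

I'm trying to work with data in a table that looks somewhat incomplete to me, and I can't figure out how to approach the problem, or even how to frame the question to find out whether what I want is possible in SQL. Here is a made-up sample of the data I'm working with (entered in CSV format because this text field doesn't support tables):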

Date,Time,Traveler,Source,Destination,Travel Status 
9/20/2014,1:00pm,James,Station A,Station B,Scheduled 
9/20/2014,1:10pm,James,Station A,Station B,Traveling 
9/20/2014,1:40pm,James,,Station B,Arrived 
9/20/2014,1:00pm,Ann,Station B,Station A,Scheduled 
9/20/2014,1:10pm,Ann,Station B,Station A,Traveling 
9/20/2014,1:40pm,Ann,,Station A,Arrived 
9/20/2014,1:00pm,Karl,Station A,Station B,Scheduled 
9/20/2014,1:10pm,Karl,Station A,Station B,Traveling 
9/20/2014,1:40pm,Karl,,Station B,Arrived 
9/20/2014,1:00pm,Joyce,Station B,Station A,Scheduled 
9/20/2014,1:10pm,Joyce,Station B,Station A,Traveling 
9/20/2014,1:40pm,Joyce,,Station A,Arrived 
9/20/2014,1:00pm,Kelly,Station B,Station B,Scheduled 
9/20/2014,1:10pm,Kelly,Station B,Station B,Traveling 
9/20/2014,1:40pm,Kelly,,Station B,Arrived 
9/20/2014,1:00pm,Sam,Station A,Station A,Scheduled 
9/20/2014,1:10pm,Sam,Station A,Station A,Traveling 
9/20/2014,1:40pm,Sam,,Station A,Arrived 

I want to see how many arrivals of each "type" we have: how many arrivals are of type A->A, how many of type B->B, and how many of types A->B and B->A.

If the data looked like this:

Date,Time,Traveler,Source,Destination,Travel Status 
9/20/2014,1:00pm,James,Station A,Station B,Scheduled 
9/20/2014,1:10pm,James,Station A,Station B,Traveling 
9/20/2014,1:40pm,James,Station A,Station B,Arrived 
9/20/2014,1:00pm,Ann,Station B,Station A,Scheduled 
9/20/2014,1:10pm,Ann,Station B,Station A,Traveling 
9/20/2014,1:40pm,Ann,Station B,Station A,Arrived 

this simple query would do the job for each type of arrival, e.g. for type A->B:

SELECT COUNT(*) FROM TRAVEL_TBL WHERE
"Travel Status" = 'Arrived' AND Source = 'Station A'
AND Destination = 'Station B';

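However, since the Source field is missing from the records that hold the "Arrived" entries, how can I write a query that finds the counts? I think the only way is to go through each traveler's records in chronological order, track when a trip was scheduled and whether the traveler arrived, and increment the count on that basis. Is this possible in SQL, or can the logic only be done by writing an application in Java, PHP, or some other host language?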

Answers

2

One solution, which works in MS SQL 2012+, is to use the LAG() function to access the previous row:

SELECT COUNT(*) AS "Count A-B"
FROM (
    SELECT
     Date, Time, Traveler,
     -- Fill a missing Source from the same traveler's previous row that day.
     -- Ordering by Time assumes the column sorts chronologically.
     CASE
      WHEN Source IS NULL THEN LAG(Source,1) OVER (PARTITION BY Date, Traveler ORDER BY Time)
      ELSE Source
     END AS Source,
     Destination,
     [Travel Status]
    FROM TRAVEL_TBL
) derived_table
WHERE [Travel Status] = 'Arrived' AND Source = 'Station A' AND Destination = 'Station B';
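
To try the LAG() idea without a SQL Server instance, here is a minimal sketch that runs the same kind of query through Python's built-in sqlite3 module and counts every route type in one pass with GROUP BY. Several things here are assumptions for illustration, not part of the original schema: the snake_case column names (trip_date, trip_time, travel_status), the ISO date and 24-hour time strings, and the missing Source being stored as NULL rather than an empty string. It also assumes the bundled SQLite is version 3.25 or later, which is when window functions were added.

```python
import sqlite3

# (traveler, source, destination) for the six trips in the question's sample data.
TRIPS = [
    ("James", "Station A", "Station B"),
    ("Ann", "Station B", "Station A"),
    ("Karl", "Station A", "Station B"),
    ("Joyce", "Station B", "Station A"),
    ("Kelly", "Station B", "Station B"),
    ("Sam", "Station A", "Station A"),
]


def trip_rows(traveler, source, destination):
    # Three status rows per trip; the 'Arrived' row has no source (NULL),
    # mirroring the sample data. Column layout is an assumption.
    return [
        ("2014-09-20", "13:00", traveler, source, destination, "Scheduled"),
        ("2014-09-20", "13:10", traveler, source, destination, "Traveling"),
        ("2014-09-20", "13:40", traveler, None, destination, "Arrived"),
    ]


# Fill the missing source from the traveler's previous row, then count
# arrivals per (source, destination) pair instead of one pair at a time.
QUERY = """
SELECT source, destination, COUNT(*) AS arrivals
FROM (
    SELECT
        COALESCE(
            source,
            LAG(source) OVER (PARTITION BY trip_date, traveler
                              ORDER BY trip_time)
        ) AS source,
        destination,
        travel_status
    FROM travel_tbl
) AS filled
WHERE travel_status = 'Arrived'
GROUP BY source, destination
ORDER BY source, destination
"""


def arrival_counts():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE travel_tbl (trip_date TEXT, trip_time TEXT, "
        "traveler TEXT, source TEXT, destination TEXT, travel_status TEXT)"
    )
    for trip in TRIPS:
        conn.executemany(
            "INSERT INTO travel_tbl VALUES (?, ?, ?, ?, ?, ?)", trip_rows(*trip)
        )
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for source, destination, arrivals in arrival_counts():
        print(f"{source} -> {destination}: {arrivals}")
```

Grouping by the filled-in source and the destination avoids running one query per route type. If the real table stores an empty string instead of NULL, the COALESCE would need a NULLIF(source, '') around the column.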

Or use ROW_NUMBER() in a self-joined CTE. This version should be more portable, since ROW_NUMBER() is available in most major databases:

;WITH cte AS (
    SELECT 
     Date, Time, Traveler, 
     ROW_NUMBER() OVER (ORDER BY Traveler, Date, Time) rn, 
     Source, 
     Destination, 
     [Travel Status] 
    FROM TRAVEL_TBL 
) 

SELECT COUNT(*) AS "Count A-B" 
FROM (
    SELECT 
     c.Date, c.Time, c.Traveler, 
     CASE 
      WHEN c.Source IS NULL THEN c2.source 
      ELSE c.Source 
     END AS Source, 
     c.Destination, 
     c.[Travel Status] 
    FROM cte c 
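    -- Join each row to the row just before it, but only for the same traveler,
    -- so a missing Source is never filled from another traveler's trip.
    LEFT JOIN cte c2 ON c.rn = c2.rn + 1 AND c.Traveler = c2.Traveler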
) derived_table 
WHERE [Travel Status] = 'Arrived' AND Source = 'Station A' AND Destination = 'Station B'; 
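
For comparison, the question also asks whether this needs host-language logic. The procedural approach it describes (walk each traveler's rows in time order, remember the last known source, and count on arrival) could look roughly like the sketch below. It is an illustration only: the tuple layout, the 24-hour time strings, and the function name count_arrival_types are assumptions, and it is not needed if one of the queries above works for your database.

```python
from collections import Counter


def trip_rows(traveler, source, destination):
    # Same shape as the question's sample: the 'Arrived' row lacks a source.
    return [
        ("2014-09-20", "13:00", traveler, source, destination, "Scheduled"),
        ("2014-09-20", "13:10", traveler, source, destination, "Traveling"),
        ("2014-09-20", "13:40", traveler, None, destination, "Arrived"),
    ]


SAMPLE = (
    trip_rows("James", "Station A", "Station B")
    + trip_rows("Ann", "Station B", "Station A")
    + trip_rows("Kelly", "Station B", "Station B")
    + trip_rows("Sam", "Station A", "Station A")
)


def count_arrival_types(rows):
    """Count arrivals per (source, destination), carrying the source forward.

    rows: iterable of (date, time, traveler, source, destination, status).
    """
    last_source = {}  # (date, traveler) -> most recent non-empty source
    counts = Counter()
    # Sort by date, traveler, then time so each traveler's rows are
    # visited in chronological order.
    for date, _time, traveler, source, destination, status in sorted(
        rows, key=lambda r: (r[0], r[2], r[1])
    ):
        key = (date, traveler)
        if source:
            last_source[key] = source
        if status == "Arrived":
            counts[(source or last_source.get(key), destination)] += 1
    return counts


if __name__ == "__main__":
    for (source, destination), n in sorted(count_arrival_types(SAMPLE).items()):
        print(f"{source} -> {destination}: {n}")
```

This is the same carry-forward idea that LAG() and the self-join express declaratively. The SQL versions keep the work in the database, which usually matters once the table is large.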
+0

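Thank you. I have some work to do to understand these SQL concepts, since they are new to me, but you've given me a starting point, so now I know this is doable and how to approach it. – rocklandcitizen 2014-10-01 16:28:12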

+1

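Following up on this. The query worked perfectly. I'm using it on DB2 and had to make a few adjustments because the actual data differs slightly from this example, but I got what I needed. Thanks for this solution and for helping me dig into CTEs, derived queries, and SQL functions. – rocklandcitizen 2014-10-27 20:30:23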