2017-02-09 110 views
-1

PostgreSQL - Select only 1 row per ID

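I'm working on a travel engine website and writing a complex query that matches visitors' search queries to their bookings, based on IP address, destination, and date, so that I can calculate conversion rates later.

Problem

I need multiple conversion rates based on a parameter (in this case utm_source, which I extract from the RequestUrl stored in the searches table). The problem is that some users run multiple searches from different places. Sometimes we get utm_source in the request and sometimes we don't... and of course each booking must be matched only once. See the screenshot of the query results below for a better idea: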

[screenshot of the query results]

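Look at rows 3 and 4: they have the same booking ID and the same values in the other columns, but a different "Value". I need to select only one of them, not both. Basically, if there is more than one row, I need to pick the one that is not 'N/A'.

My query: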

SELECT DISTINCT "B"."Id" AS "BookingId", "PQ"."IPAddress", "PQ"."To", "PQ"."SearchDate", "PQ"."Value" 
FROM 
(
    SELECT DISTINCT "IPAddress", "To", "CreatedAt"::date AS "SearchDate", COALESCE(SUBSTRING("RequestUrl", 'utm_source=([^&]*)'), 'N/A') AS "Value" 
    FROM dbo."PackageQueries" 
    WHERE "SiteId" = '<The ID>' 
    AND "CreatedAt" >= '<Start Date>' 
    AND "CreatedAt" < '<End Date>' 
) AS "PQ" 
INNER JOIN dbo."Bookings" AS "B" 
    ON "PQ"."IPAddress" = "B"."IPAddress" 
    AND "B"."To" = "PQ"."To" 
    AND "B"."BookingDate"::date = "PQ"."SearchDate" 
WHERE "B"."SiteId" = '<The ID>' 
AND "B"."BookingStatus" = 2 
AND "B"."BookingDate" >= '<Start Date>' 
AND "B"."BookingDate" < '<End Date>' 
ORDER BY "B"."Id", "PQ"."IPAddress", "PQ"."To"; 
+1

http://stackoverflow.com/questions/tagged/postgresql+greatest-n-per-group –

+0

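@a_horse_with_no_name, thanks for the link.. not so much for the downvote. :-D This is a little more complicated than those cases. For one thing, I can't simply order by some available integer or date/time value, so I don't think it deserved a downvote, but that's fine. I found a solution and will post my own answer in a moment... – Matt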

+0

I didn't downvote –

Answer

0

I found a solution, based on what I found here: Return rows that are max of one column in Postgresql and here: Postgres CASE in ORDER BY using an alias

My solution is as follows:

SELECT "BookingId", "IPAddress", "To", "SearchDate", "Value" 
FROM 
(
    SELECT DISTINCT 
     "B"."Id" AS "BookingId", 
     "PQ"."IPAddress", 
     "PQ"."To", 
     "PQ"."SearchDate", 
     "PQ"."Value", 
     RANK() OVER 
     (
      PARTITION BY "B"."Id" 
      ORDER BY 
      CASE 
       WHEN "PQ"."Value" = 'N/A' THEN 1 
       ELSE 0 
      END 
     ) AS "RowNumber" 
    FROM 
    (
     SELECT DISTINCT "IPAddress", "To", "CreatedAt"::date AS "SearchDate", COALESCE(SUBSTRING("RequestUrl", 'utm_source=([^&]*)'), 'N/A') AS "Value" 
     FROM dbo."PackageQueries" 
     WHERE "SiteId" = '<Site ID>' 
     AND "CreatedAt" >= '<Start Date>' 
     AND "CreatedAt" < '<End Date>' 
    ) AS "PQ" 
    INNER JOIN dbo."Bookings" AS "B" 
     ON "PQ"."IPAddress" = "B"."IPAddress" 
     AND "B"."To" = "PQ"."To" 
     AND "B"."BookingDate"::date = "PQ"."SearchDate" 
    WHERE "B"."SiteId" = '<Site ID>' 
    AND "B"."BookingStatus" = 2 
    AND "B"."BookingDate" >= '<Start Date>' 
    AND "B"."BookingDate" < '<End Date>' 
) T 
WHERE "RowNumber" = 1 
ORDER BY "BookingId", "IPAddress", "To"; 

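A bit long-winded, but it does the trick nicely. I hope it helps someone.

Edit

That's not the end of the story: there are still some cases where I get more than one value. The answer was to modify the CASE statement so that it produces a unique number for each text value. That solution can be found here: PostgreSQL - Assign integer value to string in case statement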
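
For anyone wondering why the leftover duplicates appear: RANK() gives the same rank to rows that tie in the ORDER BY, so two different non-'N/A' sources for one booking both get rank 1. Below is a minimal sketch of one way around that, not the exact fix from the linked answer. It assumes ROW_NUMBER() with the value itself as a tie-breaker is acceptable. The "joined" CTE and its rows are made-up sample data standing in for the joined PackageQueries/Bookings result, and picking the alphabetically first source is an arbitrary choice.

```sql
-- Hypothetical sample rows standing in for the joined result (not real data).
WITH joined ("BookingId", "Value") AS (
    VALUES
        (1, 'google'),
        (1, 'N/A'),
        (2, 'N/A'),
        (3, 'google'),
        (3, 'newsletter')
)
SELECT "BookingId", "Value"
FROM (
    SELECT
        "BookingId",
        "Value",
        -- ROW_NUMBER() never repeats a number within a partition,
        -- so exactly one row per booking gets "RowNumber" = 1.
        ROW_NUMBER() OVER (
            PARTITION BY "BookingId"
            ORDER BY
                CASE WHEN "Value" = 'N/A' THEN 1 ELSE 0 END, -- real sources before 'N/A'
                "Value"                                      -- arbitrary tie-breaker (assumption)
        ) AS "RowNumber"
    FROM joined
) AS t
WHERE "RowNumber" = 1
ORDER BY "BookingId";
```

With this sample data, bookings 1 and 3 come back with 'google' and booking 2 with 'N/A', one row each. If a different source should win when there are several, the second ORDER BY term is the place to change.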