2017-12-27 322 views
0

The data in my table looks like this:

date, app, country, sales 
2017-01-01,XYZ,US,10000 
2017-01-01,XYZ,GB,2000 
2017-01-02,XYZ,US,30000 
2017-01-02,XYZ,GB,1000 

For each app, on a daily basis, I need to find the ratio of US sales to GB sales, so ideally the result would look like this:

date, app, ratio 
2017-01-01,XYZ,10000/2000 = 5 
2017-01-02,XYZ,30000/1000 = 30 

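I'm currently dumping everything to CSV and doing my calculations offline in Python, but I'd like to move everything to the SQL side. One approach is to pull each country into its own subquery, join the two, and then divide, like this: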

select d1_us.date, d1_us.app, d1_us.sales/d1_gb.sales from 
(select date, app, sales from table where date between '2017-01-01' and '2017-01-10' and country = 'US') as d1_us 
join 
(select date, app, sales from table where date between '2017-01-01' and '2017-01-10' and country = 'GB') as d1_gb 
on d1_us.app = d1_gb.app and d1_us.date = d1_gb.date 

Is there a less messy way to do this?

Answers

5

You can do this in a single query, without subqueries, by dividing one SUM(CASE WHEN) by another and grouping with GROUP BY.

SELECT DATE, 
     APP, 
     SUM(CASE WHEN COUNTRY = 'US' THEN SALES ELSE 0 END)/
     SUM(CASE WHEN COUNTRY = 'GB' THEN SALES END) AS RATIO  
FROM TABLE1 
GROUP BY DATE, APP; 

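Depending on whether GB sales can be zero, you may want to adjust the ELSE branch of the GB CASE, perhaps to ELSE 1, to avoid a divide-by-zero error. It really depends on how you want to handle the exceptions.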
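A minimal sketch of the conditional-aggregation query, run from Python against an in-memory SQLite database. SQLite, the table name `sales_by_country`, the extra 2017-01-03 row, and the NULLIF guard are assumptions added for illustration; they are not part of the answer above. NULLIF turns a zero GB total into NULL, so the ratio becomes NULL rather than a division by zero (engines such as PostgreSQL raise an error on division by zero, while SQLite returns NULL on its own).

```python
import sqlite3

# In-memory SQLite database seeded with the sample rows from the question.
# The table name `sales_by_country` is a hypothetical placeholder.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_by_country (date TEXT, app TEXT, country TEXT, sales REAL)"
)
conn.executemany(
    "INSERT INTO sales_by_country VALUES (?, ?, ?, ?)",
    [
        ("2017-01-01", "XYZ", "US", 10000),
        ("2017-01-01", "XYZ", "GB", 2000),
        ("2017-01-02", "XYZ", "US", 30000),
        ("2017-01-02", "XYZ", "GB", 1000),
        ("2017-01-03", "XYZ", "US", 500),  # illustrative extra row: no GB sales that day
    ],
)

# One pass over the table: each SUM(CASE WHEN ...) picks out one country.
# NULLIF(..., 0) is an added guard, not part of the original answer.
RATIO_SQL = """
SELECT date,
       app,
       SUM(CASE WHEN country = 'US' THEN sales ELSE 0 END)
       / NULLIF(SUM(CASE WHEN country = 'GB' THEN sales END), 0) AS ratio
FROM sales_by_country
GROUP BY date, app
ORDER BY date, app
"""

ratios = conn.execute(RATIO_SQL).fetchall()
for row in ratios:
    print(row)
```

With this data the first two days give ratios of 5.0 and 30.0, and the day without GB rows gives NULL (None in Python).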

+0

Don't you need an 'ELSE 0' for the second CASE? –

+3

@JuanCarlosOropeza . . . Not at all. That formulation is how you avoid dividing by zero. –

+1

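If there is no GB data, the current formula evaluates the denominator to NULL, which in turn makes the whole expression evaluate to NULL. No divide-by-zero error is thrown. – Vashi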

0

You can use a single grouped query with conditional aggregation:

SELECT date, app, 
     SUM(CASE WHEN country = 'US' THEN SALES ELSE 0 END)/
     SUM(CASE WHEN country = 'GB' THEN SALES END) AS ratio 
FROM your_table 
WHERE date between '2017-01-01' AND '2017-01-10' 
GROUP BY date, app; 

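However, this gives you zero if there are no US records and NULL if there are no GB records. If you need to return different values for those cases, you can wrap the division in another CASE WHEN. For example, to return -1 and -2 respectively, you could use: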

SELECT date, app, 
     -- COUNT skips only NULLs, so the inner CASE must not have an ELSE branch 
     CASE WHEN COUNT(CASE WHEN country = 'US' THEN 1 END) = 0 THEN -1 
      WHEN COUNT(CASE WHEN country = 'GB' THEN 1 END) = 0 THEN -2 
      ELSE SUM(CASE WHEN country = 'US' THEN SALES ELSE 0 END)/
       SUM(CASE WHEN country = 'GB' THEN SALES END) 
      END AS ratio 
FROM your_table 
WHERE date between '2017-01-01' AND '2017-01-10' 
GROUP BY date, app; 
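A sketch of the sentinel approach, runnable from Python against an in-memory SQLite database. SQLite and the sample rows with a missing country are assumptions for illustration. Note that COUNT skips only NULLs, so the inner CASE here has no ELSE branch; with ELSE 0 every row would be counted and the -1 and -2 branches would never fire.

```python
import sqlite3

# In-memory SQLite database; the rows are illustrative and deliberately
# leave out one country on two of the three days.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_table (date TEXT, app TEXT, country TEXT, sales REAL)"
)
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?, ?)",
    [
        ("2017-01-01", "XYZ", "US", 10000),
        ("2017-01-01", "XYZ", "GB", 2000),
        ("2017-01-02", "XYZ", "GB", 1000),   # no US row on this day
        ("2017-01-03", "XYZ", "US", 30000),  # no GB row on this day
    ],
)

# -1 marks "no US records", -2 marks "no GB records"; otherwise the ratio.
SENTINEL_SQL = """
SELECT date, app,
       CASE WHEN COUNT(CASE WHEN country = 'US' THEN 1 END) = 0 THEN -1
            WHEN COUNT(CASE WHEN country = 'GB' THEN 1 END) = 0 THEN -2
            ELSE SUM(CASE WHEN country = 'US' THEN sales ELSE 0 END)
                 / SUM(CASE WHEN country = 'GB' THEN sales END)
       END AS ratio
FROM your_table
WHERE date BETWEEN '2017-01-01' AND '2017-01-10'
GROUP BY date, app
ORDER BY date
"""

rows = conn.execute(SENTINEL_SQL).fetchall()
for row in rows:
    print(row)
```

For these rows the query returns 5.0 for 2017-01-01, -1 for 2017-01-02 and -2 for 2017-01-03. A GB total of exactly zero (GB rows present, all with sales 0) is a separate case that this CASE does not cover.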
0
DROP TABLE IF EXISTS t; 
CREATE TABLE t (
    date DATE, 
    app VARCHAR(5), 
    country VARCHAR(5), 
    sales DECIMAL(10,2) 
); 

INSERT INTO t VALUES 
    ('2017-01-01','XYZ','US',10000), 
    ('2017-01-01','XYZ','GB',2000), 
    ('2017-01-02','XYZ','US',30000), 
    ('2017-01-02','XYZ','GB',1000); 


WITH q AS (
    SELECT 
     date, 
     app, 
     country, 
     SUM(sales) AS sales 
    FROM t 
    GROUP BY date, app, country 
) SELECT 
    q1.date, 
    q1.app, 
    q1.country || ' vs ' || NVL(q2.country,'-') AS ratio_between, 
    CASE WHEN q2.sales IS NULL OR q2.sales = 0 THEN 0 ELSE ROUND(q1.sales/q2.sales, 2) END AS ratio 
    FROM q AS q1 
    LEFT JOIN q AS q2 ON q2.date = q1.date AND 
        q2.app = q1.app AND 
        q2.country != q1.country 
    -- WHERE q1.country = 'US' 
    ORDER BY q1.date; 

Result for any country vs. any other country (WHERE q1.country = 'US' is commented out)

date,app,ratio_between,ratio 
2017-01-01,XYZ,GB vs US,0.20 
2017-01-01,XYZ,US vs GB,5.00 
2017-01-02,XYZ,GB vs US,0.03 
2017-01-02,XYZ,US vs GB,30.00 

Result for US vs. any other country (WHERE q1.country = 'US' uncommented)

date,app,ratio_between,ratio 
2017-01-01,XYZ,US vs GB,5.00 
2017-01-02,XYZ,US vs GB,30.00 

The trick is in the JOIN clause. The result of subquery q, which aggregates the data by date, app and country, is joined with itself, but only on date and app.

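This way, each date, app and country is "matched" with every other country on the same date and app. Adding q1.country != q2.country excludes the rows that pair a country with itself, which are marked with * below: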

date,app,country,sales,date,app,country,sales 
*2017-01-01,XYZ,GB,2000.00,2017-01-01,XYZ,GB,2000.00* 
2017-01-01,XYZ,GB,2000.00,2017-01-01,XYZ,US,10000.00 
2017-01-01,XYZ,US,10000.00,2017-01-01,XYZ,GB,2000.00 
*2017-01-01,XYZ,US,10000.00,2017-01-01,XYZ,US,10000.00* 
2017-01-02,XYZ,GB,1000.00,2017-01-02,XYZ,US,30000.00 
*2017-01-02,XYZ,GB,1000.00,2017-01-02,XYZ,GB,1000.00* 
*2017-01-02,XYZ,US,30000.00,2017-01-02,XYZ,US,30000.00* 
2017-01-02,XYZ,US,30000.00,2017-01-02,XYZ,GB,1000.00
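The self-join can be tried out with a short Python sketch against an in-memory SQLite database. SQLite is an assumption here, not the engine used in the answer: it has no NVL, so COALESCE is substituted, and a secondary sort on q1.country is added to make the row order deterministic.

```python
import sqlite3

# In-memory SQLite database with the same four rows as table t above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (date TEXT, app TEXT, country TEXT, sales REAL)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        ("2017-01-01", "XYZ", "US", 10000),
        ("2017-01-01", "XYZ", "GB", 2000),
        ("2017-01-02", "XYZ", "US", 30000),
        ("2017-01-02", "XYZ", "GB", 1000),
    ],
)

# q aggregates per (date, app, country); joining q to itself on date and app,
# with differing countries, pairs every country with every other country.
PAIRS_SQL = """
WITH q AS (
    SELECT date, app, country, SUM(sales) AS sales
    FROM t
    GROUP BY date, app, country
)
SELECT q1.date,
       q1.app,
       q1.country || ' vs ' || COALESCE(q2.country, '-') AS ratio_between,
       CASE WHEN q2.sales IS NULL OR q2.sales = 0 THEN 0
            ELSE ROUND(q1.sales / q2.sales, 2)
       END AS ratio
FROM q AS q1
LEFT JOIN q AS q2
       ON q2.date = q1.date
      AND q2.app = q1.app
      AND q2.country != q1.country
ORDER BY q1.date, q1.country
"""

pairs = conn.execute(PAIRS_SQL).fetchall()
for row in pairs:
    print(row)
```

Adding `WHERE q1.country = 'US'` before the ORDER BY keeps only the "US vs ..." rows, as in the second result listing.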