2016-06-08 45 views

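I've been trying to write a query against our company's very large database to pull back the maximum billing amounts for customers (A and B in this example). We want the max A/B for each customer over the past month, and the max A/B over the past year. Because of the large dataset and the negative values, the Oracle query runs very slowly.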

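One problem we've noticed in our billing database is the way it stores "cancelled" bills. It does this by adding a second, negative version of the original billing record to the billing table. Like this: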

(Image not available: sample rows from the billing table.)

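In this case, 41040 was the incorrect bill, so a negative version of that record was added. However, when I select the max value of this column, I still get back 41040 instead of the correct billing value of 50. The table doesn't appear to flag these incorrect bills in any way that would make them easy to filter out.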
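The effect is easy to reproduce on a tiny table. The sketch below uses Python's built-in sqlite3 module as a stand-in for Oracle, with three made-up rows (the wrong bill of 41040, its negative reversal, and the corrected bill of 50). The table layout is an assumption based on the column names used in the query further down.

```python
import sqlite3

# In-memory SQLite database standing in for the Oracle billing table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_usage ("
    " id INTEGER, ky_customer_no INTEGER, rev_mth INTEGER,"
    " rev_year INTEGER, qy_mth_billed_a INTEGER)"
)
# Hypothetical rows: wrong bill, its negative reversal, then the corrected bill.
conn.executemany(
    "INSERT INTO customer_usage VALUES (?, ?, ?, ?, ?)",
    [(1, 1, 1, 2016, 41040), (2, 1, 1, 2016, -41040), (3, 1, 1, 2016, 50)],
)

# A plain MAX still sees the cancelled 41040 row.
naive_max = conn.execute(
    "SELECT MAX(qy_mth_billed_a) FROM customer_usage WHERE ky_customer_no = 1"
).fetchone()[0]

# SUM nets the reversal out, but it is a monthly total, not a maximum.
net_total = conn.execute(
    "SELECT SUM(qy_mth_billed_a) FROM customer_usage WHERE ky_customer_no = 1"
).fetchone()[0]

print(naive_max, net_total)
```

On these rows the plain MAX returns the cancelled amount, while the sum of the three rows nets out to the corrected bill. That only helps when a month holds a single valid bill, which is why the reversal pairs have to be excluded before taking the maximum.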

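My current solution is to treat the row with the max value of the ID column as the correct bill. This assumes that the last bill entered for a month is the correct one.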

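This seems to bring back the correct data, but the query runs incredibly slowly on the large dataset, and I don't have write access to this table to add or view indexes. There are 98,007,807 rows and 1,596,491 distinct customers in total. Is there any way to optimize the query to improve performance?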

select mth.KY_CUSTOMER_NO,max(QY_MTH_BILLED_A) as QY_MTH_BILLED_A, max(QY_MTH_B) as QY_MTH_BILLING_B, max.MAX_BILLING_A, max.MAX_BILLING_B 
from (
    --Get the max A/B values for the past month 
    select m.* 
    from CUSTOMER_USAGE m 
    where rev_year = to_number(to_char(sysdate,'yyyy')) 
    and rev_mth in (to_number(to_char(add_months(sysdate, -1), 'mm')),to_number(to_char(sysdate,'mm'))) 
    and ID in (select max(ID) from CUSTOMER_USAGE where KY_CUSTOMER_NO = m.KY_CUSTOMER_NO group by rev_mth, rev_year) 
) mth join 
(
    --Get the max A/B values for the past year 
    select KY_CUSTOMER_NO, max(QY_MTH_B) as MAX_BILLING_B, max(QY_MTH_BILLED_A) as MAX_BILLING_A from CUSTOMER_USAGE m 
    where DT_ADDED > current_timestamp - 365 and ID in (select max(ID) from CUSTOMER_USAGE where KY_CUSTOMER_NO = m.KY_CUSTOMER_NO group by rev_mth, rev_year) 
    group by KY_CUSTOMER_NO 
) max on mth.KY_CUSTOMER_NO = max.KY_CUSTOMER_NO 
group by mth.KY_CUSTOMER_NO, max.MAX_BILLING_A, max.MAX_BILLING_B 

What indexes exist, and what is the current query plan?

Answers

1

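Analytic functions seem to be the solution.

I have left out the WHERE clauses because they are not needed for your sample data, but you should be able to add them back into the innermost inline view. You can also use EXTRACT(YEAR FROM SYSDATE) rather than converting the date to a string and then back to a number.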

Oracle Setup

CREATE TABLE customer_usage (id, ky_customer_no, rev_mth, rev_year, qy_mth_billed_a, qy_mth_billed_b) AS 
SELECT 1, 1, 1, 2016, 41040, 0 FROM DUAL UNION ALL 
SELECT 2, 1, 1, 2016, -41040, 0 FROM DUAL UNION ALL 
SELECT 3, 1, 1, 2016,  50, 0 FROM DUAL UNION ALL 
SELECT 4, 1, 1, 2016,  0, 0 FROM DUAL; 

Query

SELECT id,
       ky_customer_no,
       rev_mth,
       rev_year,
       qy_mth_billed_a,
       qy_mth_billed_b
FROM (
  SELECT c.*,
         ROW_NUMBER()
           OVER (PARTITION BY ky_customer_no, rev_year, rev_mth
                 ORDER BY total_mth_billed_a DESC) AS rn
  FROM (
    SELECT c.*,
           SUM(qy_mth_billed_a)
             OVER (PARTITION BY ky_customer_no, rev_year, rev_mth, ABS(qy_mth_billed_a)
                   ORDER BY id DESC) AS total_mth_billed_a
    FROM customer_usage c
  ) c
)
WHERE rn = 1;

Output

        ID KY_CUSTOMER_NO    REV_MTH   REV_YEAR QY_MTH_BILLED_A QY_MTH_BILLED_B
---------- -------------- ---------- ---------- --------------- ---------------
         3              1          1       2016              50               0
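To see why the row with 50 ranks first, it can help to look at the intermediate column on its own. The following is a minimal sketch that runs only the innermost inline view against the sample rows, using Python's sqlite3 module as a stand-in for Oracle (an assumption; it needs an SQLite build with window-function support, 3.25 or later).

```python
import sqlite3

# In-memory SQLite table mirroring the sample data (SQLite stands in for Oracle).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_usage ("
    " id INTEGER, ky_customer_no INTEGER, rev_mth INTEGER, rev_year INTEGER,"
    " qy_mth_billed_a INTEGER, qy_mth_billed_b INTEGER)"
)
conn.executemany(
    "INSERT INTO customer_usage VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 1, 1, 2016, 41040, 0),
        (2, 1, 1, 2016, -41040, 0),
        (3, 1, 1, 2016, 50, 0),
        (4, 1, 1, 2016, 0, 0),
    ],
)

# Inner view only: running total per (customer, year, month, ABS(amount)),
# accumulated from the newest id to the oldest.
running_totals = conn.execute(
    """
    SELECT id,
           qy_mth_billed_a,
           SUM(qy_mth_billed_a) OVER (
               PARTITION BY ky_customer_no, rev_year, rev_mth, ABS(qy_mth_billed_a)
               ORDER BY id DESC
           ) AS total_mth_billed_a
    FROM customer_usage
    ORDER BY id
    """
).fetchall()

for row in running_totals:
    print(row)
```

In the cancelled pair, the reversal row gets a running total of -41040 and the original bill nets out to 0, so neither can outrank the standalone bill of 50 when the outer ROW_NUMBER orders by the total in descending order.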
0

I tried another approach, reusing most of @MT0's setup.

CREATE TABLE customer_usage (id, ky_customer_no, rev_mth, rev_year, qy_mth_billed_a, qy_mth_billed_b) AS 
SELECT 1, 1, 1, 2016, 41040, 0 FROM DUAL UNION ALL 
SELECT 2, 1, 1, 2016, -41040, 0 FROM DUAL UNION ALL 
SELECT 3, 1, 1, 2016,  50, 0 FROM DUAL UNION ALL 
SELECT 4, 1, 1, 2016,  0, 0 FROM DUAL; 

Since we want to get rid of the rows whose ABS(value) is equal but whose signs differ, I tried this:

SELECT c.KY_CUSTOMER_NO, c.REV_MTH, c.REV_YEAR, max(qy_mth_billed_a) as qy_mth_billed_a , max(QY_MTH_BILLED_B) as qy_mth_billed_b 
    FROM (
    SELECT c.*, 
      max(qy_mth_billed_a) 
      OVER (PARTITION BY ky_customer_no, rev_year, rev_mth,ABS(qy_mth_billed_a)) AS max_mth_billed_a, 
      min(qy_mth_billed_a) 
      OVER (PARTITION BY ky_customer_no, rev_year, rev_mth,ABS(qy_mth_billed_a)) AS min_mth_billed_a  
    FROM customer_usage c 
) c where max_mth_billed_a+min_mth_billed_a!=0 
group by c.KY_CUSTOMER_NO, c.REV_MTH, c.REV_YEAR; 

The output is the same. Given the performance problems you are facing, I would try both approaches:

KY_CUSTOMER_NO REV_MTH REV_YEAR QY_MTH_BILLED_A QY_MTH_BILLED_B
             1       1     2016              50               0

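Edit: Actually, if you count the rows for each abs(value) and keep only the groups where that count is odd, I think it will run faster (it needs only one window function).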

SELECT c.KY_CUSTOMER_NO, c.REV_MTH, c.REV_YEAR, max(qy_mth_billed_a) as qy_mth_billed_a , max(QY_MTH_BILLED_B) as qy_mth_billed_b 
    FROM (
    SELECT c.*, 
      count(sign(qy_mth_billed_a)) 
      OVER (PARTITION BY ky_customer_no, rev_year, rev_mth,ABS(qy_mth_billed_a)) AS signo  
    FROM customer_usage c 
) c where mod(signo,2) =1 
group by c.KY_CUSTOMER_NO, c.REV_MTH, c.REV_YEAR
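As a rough check of the odd-count filter, the sketch below runs it on the sample rows using Python's sqlite3 module as a stand-in for Oracle (an assumption; it needs SQLite 3.25 or later for window functions). The helper name `surviving_ids` and the second data set are hypothetical, added only for illustration. `COUNT(qy_mth_billed_a)` is used in place of `count(sign(...))`; both count the non-null amounts in each group.

```python
import sqlite3


def surviving_ids(rows):
    """Return the ids kept by the odd-count filter for (id, amount) rows.

    Hypothetical helper: every row is placed in customer 1, month 1, year 2016.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer_usage ("
        " id INTEGER, ky_customer_no INTEGER, rev_mth INTEGER,"
        " rev_year INTEGER, qy_mth_billed_a INTEGER)"
    )
    conn.executemany(
        "INSERT INTO customer_usage VALUES (?, 1, 1, 2016, ?)", rows
    )
    kept = conn.execute(
        """
        SELECT id
        FROM (
            SELECT c.*,
                   COUNT(qy_mth_billed_a) OVER (
                       PARTITION BY ky_customer_no, rev_year, rev_mth,
                                    ABS(qy_mth_billed_a)
                   ) AS signo
            FROM customer_usage c
        )
        WHERE signo % 2 = 1
        ORDER BY id
        """
    ).fetchall()
    conn.close()
    return [row[0] for row in kept]


# Sample data from the answer: the 41040 / -41040 pair is dropped.
sample_kept = surviving_ids([(1, 41040), (2, -41040), (3, 50), (4, 0)])

# Hypothetical edge case: two genuine bills of the same amount in one month.
same_amount_kept = surviving_ids([(1, 50), (2, 50)])

print(sample_kept, same_amount_kept)
```

On the sample data the filter keeps ids 3 and 4, so the outer MAX returns 50. The second call suggests one thing to verify before relying on this version: because the window counts rows rather than distinct signs, two valid bills of equal amount in the same month would also form an even-sized group and be filtered out.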