2013-04-10 52 views

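Select top, grouping and summing others with table variables

I looked through several other questions trying to find an answer, but I couldn't. Here's the situation: I have a big table that keeps growing. When I say BIG, I mean that a query restricted to 6 hours of data covers about 10 million rows. We have months of data, so you can see how big it is.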

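With the size issue established, I want to run a very simple query: group by one column and total the occurrences for each value. On top of that, I want, for example, the 10 largest totals, plus the sum of all the other totals that are not in the top 10. I know there are ways to do this, but I want to do it without computing the aggregated table twice. To that end I used table variables. I am using SQL Server 2012.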

DECLARE @sumsTable TABLE(operationName varchar(200), operationAmount int) 
DECLARE @topTable TABLE(operationName varchar(200), operationAmount int) 
DECLARE @startTime DATETIME 
DECLARE @endTime DATETIME 
DECLARE @top INTEGER 

SET @top = 10 
SET @endTime = '03/11/2013' 
SET @startTime = '03/10/2013' 

--grouping by operationName and counting occurrences 
INSERT INTO @sumsTable 
SELECT operationName, COUNT(*) AS operationAmount 
FROM [f6f87bf0-33ab-4882-8674-2cb31e5e49c4] 
WHERE (TIMESTAMP >= @startTime) AND (TIMESTAMP <= @endTime) 
GROUP BY operationName 

--selecting top occurrences 
INSERT INTO @topTable 
SELECT TOP(@top) * FROM @sumsTable 
ORDER BY operationAmount DESC 

--Summing others and making union with top 
SELECT 'OTHER' AS operationName, SUM(operationAmount) as operationAmount FROM @sumsTable 
WHERE operationName NOT IN (SELECT operationName FROM @topTable) 
UNION 
SELECT * FROM @topTable 
ORDER BY operationAmount DESC 

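My question is whether this is a good way to do it, and whether there is a better, faster way. Am I committing any crimes here? Can I get rid of the table variables without computing all the totals more than once?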

Answers


You can do this without temporary tables, using the following SQL:

DECLARE @startTime DATETIME 
DECLARE @endTime DATETIME 
DECLARE @top INTEGER 

SET @top = 10 
SET @endTime = '03/11/2013' 
SET @startTime = '03/10/2013' 

select 
     (case when y.RowID > @top then 'OTHER' else y.operationName end) as operationName, 
     sum(y.operationAmount) as operationAmount 
from 
(
    select 
      row_number() over(order by count(*) desc) as RowID, 
      x.operationName, 
      count(*) AS operationAmount 
    from [f6f87bf0-33ab-4882-8674-2cb31e5e49c4] as x 
    where (TIMESTAMP >= @startTime) AND (TIMESTAMP <= @endTime) 
    group by x.operationName 
) 
as y 
group by (case when y.RowID > @top then 'OTHER' else y.operationName end) 
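
To see how the ranking and the CASE expression fold the tail into a single row, here is a minimal sketch on made-up data. The table variable `@ops`, its four operation names, and `@top = 2` are assumptions chosen only for illustration; the ORDER BY at the end is also an addition, included because without it the row order is not guaranteed.

```sql
-- Hypothetical sample data: 4 reads, 3 writes, 2 deletes, 1 list.
DECLARE @ops TABLE (operationName varchar(200));
INSERT INTO @ops (operationName) VALUES
    ('read'), ('read'), ('read'), ('read'),
    ('write'), ('write'), ('write'),
    ('delete'), ('delete'),
    ('list');

DECLARE @top int = 2;

SELECT
    CASE WHEN y.RowID > @top THEN 'OTHER' ELSE y.operationName END AS operationName,
    SUM(y.operationAmount) AS operationAmount
FROM (
    -- One pass over the data: count per operation and rank by that count.
    SELECT
        ROW_NUMBER() OVER (ORDER BY COUNT(*) DESC) AS RowID,
        x.operationName,
        COUNT(*) AS operationAmount
    FROM @ops AS x
    GROUP BY x.operationName
) AS y
GROUP BY CASE WHEN y.RowID > @top THEN 'OTHER' ELSE y.operationName END
ORDER BY
    MIN(CASE WHEN y.RowID > @top THEN 1 ELSE 0 END),  -- top rows first, 'OTHER' last
    SUM(y.operationAmount) DESC;

-- Result:
--   read   4
--   write  3
--   OTHER  3   (delete 2 + list 1)
```

Two caveats with this approach: if two operations tie at the cutoff, ROW_NUMBER picks one of them arbitrarily, and a real operation that happens to be named 'OTHER' would be merged into the catch-all row.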

With a CTE, you only need to total the original table once,

instead of

row_number() over(order by count(*) desc) as RowID, x.operationName, count(*) AS operationAmount 

which writes COUNT(*) twice:

DECLARE @startTime DATETIME 
DECLARE @endTime DATETIME 
DECLARE @top INTEGER 

SET @top = 10 
SET @endTime = '03/11/2013' 
SET @startTime = '03/10/2013' 

;WITH cte AS (-- total per operation 
    SELECT operationName, COUNT(*) AS operationAmount 
    FROM [f6f87bf0-33ab-4882-8674-2cb31e5e49c4] 
    WHERE (TIMESTAMP >= @startTime) AND (TIMESTAMP <= @endTime) 
    GROUP BY operationName 
), 
cte1 AS (-- rank the totals, largest first 
    SELECT operationName, operationAmount, ROW_NUMBER() OVER (ORDER BY operationAmount DESC) AS RN 
    FROM cte 
) -- keep the top @top rows and fold the rest into 'Others' 
SELECT (CASE WHEN RN <= @top THEN operationName ELSE 'Others' END) AS Name, SUM(operationAmount) AS operationAmount 
FROM cte1 
GROUP BY (CASE WHEN RN <= @top THEN operationName ELSE 'Others' END)
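
One detail in the date filter used throughout this thread may be worth checking: `TIMESTAMP <= @endTime` also matches rows stamped exactly at midnight of the end date, so back-to-back daily windows would count those rows twice. The sketch below shows the difference on made-up data; the table variable `@events` and its three rows are assumptions for illustration. It also writes the dates as `YYYYMMDD`, which SQL Server reads the same way regardless of language settings, unlike '03/11/2013'.

```sql
-- Hypothetical sample data; the last row sits exactly on the end boundary.
DECLARE @events TABLE (operationName varchar(200), [TIMESTAMP] datetime);
INSERT INTO @events (operationName, [TIMESTAMP]) VALUES
    ('read',  '20130310 00:00:00'),
    ('read',  '20130310 23:59:59'),
    ('write', '20130311 00:00:00');

DECLARE @startTime datetime = '20130310';
DECLARE @endTime   datetime = '20130311';

-- Inclusive end, as in the queries above: the boundary row is counted.
SELECT COUNT(*) AS inclusiveEnd
FROM @events
WHERE [TIMESTAMP] >= @startTime AND [TIMESTAMP] <= @endTime;   -- 3

-- Half-open interval [start, end): the boundary row belongs to the next window.
SELECT COUNT(*) AS halfOpen
FROM @events
WHERE [TIMESTAMP] >= @startTime AND [TIMESTAMP] < @endTime;    -- 2
```

Whether the boundary row should be included depends on what the report is meant to cover, so treat the half-open form as a suggestion rather than a required change.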