Select top N, group by, and sum the others using table variables

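I looked at several other questions trying to find an answer, but I couldn't. Here's the situation: I have a big table that keeps growing. When I say BIG, I mean that a query limited to 6 hours of data covers about 10 million rows. We have months of data, so you can see how big it is.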

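With the size problem established, I want to run a very simple query: group by one column and total the values of another. On top of that, I want, for example, the 10 largest totals plus the sum of all the other totals that are not in the top 10. I know there are ways to do this, but I want to do it without computing the totals table twice. For that, I used table variables. I am using SQL Server 2012.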

DECLARE @sumsTable TABLE(operationName varchar(200), operationAmount int) 
DECLARE @topTable TABLE(operationName varchar(200), operationAmount int) 
DECLARE @startTime DATETIME 
DECLARE @endTime DATETIME 
DECLARE @top INTEGER 

SET @top = 10 
SET @endTime = '03/11/2013' 
SET @startTime = '03/10/2013' 

--grouping by operationName and counting occurrences 
INSERT INTO @sumsTable 
SELECT operationName, COUNT(*) AS operationAmount 
FROM [f6f87bf0-33ab-4882-8674-2cb31e5e49c4] 
WHERE (TIMESTAMP >= @startTime) AND (TIMESTAMP <= @endTime) 
GROUP BY operationName 

--selecting top occurrences 
INSERT INTO @topTable 
SELECT TOP(@top) * FROM @sumsTable 
ORDER BY operationAmount DESC 

--Summing others and making union with top 
SELECT 'OTHER' AS operationName, SUM(operationAmount) as operationAmount FROM @sumsTable 
WHERE operationName NOT IN (SELECT operationName FROM @topTable) 
UNION 
SELECT * FROM @topTable 
ORDER BY operationAmount DESC

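My question is: is this a good way to do it, or is there a better, faster way? Am I committing any crimes here? Can I get rid of the table variables without computing all the totals more than once?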

Source

2013-04-10 Marcel Batista

You can do it without temporary tables, using the following SQL:

DECLARE @startTime DATETIME 
DECLARE @endTime DATETIME 
DECLARE @top INTEGER 

SET @top = 10 
SET @endTime = '03/11/2013' 
SET @startTime = '03/10/2013' 

select 
     (case when y.RowID > @top then 'OTHER' else y.operationName end) as operationName, 
     sum(y.operationAmount) as operationAmount 
from 
(
    select 
      row_number() over(order by count(*) desc) as RowID, 
      x.operationName, 
      count(*) AS operationAmount 
    from [f6f87bf0-33ab-4882-8674-2cb31e5e49c4] as x 
    where (TIMESTAMP >= @startTime) AND (TIMESTAMP <= @endTime) 
    group by x.operationName 
) 
as y 
group by (case when y.RowID > @top then 'OTHER' else y.operationName end)

Source

2013-04-11 00:57:26 outcoldman

You only need to aggregate the original table once, instead of

row_number() over(order by count(*) desc) as RowID, x.operationName, count(*) AS operationAmount

which uses COUNT(*) twice.

DECLARE @startTime DATETIME 
DECLARE @endTime DATETIME 
DECLARE @top INTEGER 

SET @top = 10 
SET @endTime = '03/11/2013' 
SET @startTime = '03/10/2013' 

;WITH cte AS ( -- get the count for each operation 
    SELECT operationName, COUNT(*) AS operationAmount 
    FROM [f6f87bf0-33ab-4882-8674-2cb31e5e49c4] 
    WHERE (TIMESTAMP >= @startTime) AND (TIMESTAMP <= @endTime) 
    GROUP BY operationName 
), 
cte1 AS ( -- rank the totals, largest first 
    SELECT operationName, operationAmount, ROW_NUMBER() OVER (ORDER BY operationAmount DESC) AS RN 
    FROM cte 
) -- keep the top @top operations and fold the rest into 'Others' 
SELECT (CASE WHEN RN <= @top THEN operationName ELSE 'Others' END) AS operationName, SUM(operationAmount) AS operationAmount 
FROM cte1 
GROUP BY (CASE WHEN RN <= @top THEN operationName ELSE 'Others' END)
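
To try the pattern without touching the production table, here is a minimal sketch of the same rank-then-fold idea run against a tiny in-memory table. The table variable @ops, its sample rows, the CTE names, and the cut-off of 2 are all made up for illustration; only the overall shape follows the query above.

```sql
-- Hypothetical sample data standing in for the real log table.
DECLARE @ops TABLE (operationName varchar(200));
INSERT INTO @ops (operationName) VALUES
    ('read'), ('read'), ('read'), ('read'),
    ('write'), ('write'), ('write'),
    ('delete'),
    ('rename');

DECLARE @top int;
SET @top = 2;  -- assumed cut-off: keep the 2 most frequent operations

WITH counted AS (   -- single aggregation: number of rows per operation
    SELECT operationName, COUNT(*) AS operationAmount
    FROM @ops
    GROUP BY operationName
),
ranked AS (         -- rank the per-operation counts, largest first
    SELECT operationName, operationAmount,
           ROW_NUMBER() OVER (ORDER BY operationAmount DESC) AS rn
    FROM counted
)
SELECT CASE WHEN rn <= @top THEN operationName ELSE 'Others' END AS operationName,
       SUM(operationAmount) AS operationAmount
FROM ranked
GROUP BY CASE WHEN rn <= @top THEN operationName ELSE 'Others' END
ORDER BY SUM(operationAmount) DESC;
-- Result for the sample rows above:
--   read    4
--   write   3
--   Others  2
```

Keep in mind that ROW_NUMBER() breaks ties arbitrarily, so when two operations share the same count at the cut-off, which one ends up in the top group is not deterministic unless a tie-breaker (for example operationName) is added to the window's ORDER BY.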

Source

2013-04-11 04:25:54
