2016-09-28 72 views
0

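How to avoid SUM counting rows multiple times when using JOIN

I'm using MS SQL Server with its sample database "AdventureWorks 2014", and I've run into a problem combining JOIN and SUM. Here are the two tables I use: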

PurchaseOrderHeader: 
PurchaseOrderID VendorID OrderDate    TotalDue 
1    1580  2011-04-16 00:00:00.000 222.1492 
2    1496  2011-04-16 00:00:00.000 300.6721 
3    1494  2011-04-16 00:00:00.000 9776.2665 
4    1650  2011-04-16 00:00:00.000 189.0395 
5    1654  2011-04-30 00:00:00.000 22539.0165 
6    1664  2011-04-30 00:00:00.000 16164.0229 
7    1678  2011-04-30 00:00:00.000 64847.5328 

PurchaseOrderDetail: 
PurchaseOrderID PurchaseOrderDetailID OrderQty ProductID 
1    1      4   1 
2    2      3   359 
2    3      3   360 
3    4      550   530 
4    5      3   4 
5    6      550   512 
6    7      550   513 
7    8      550   317 
7    9      550   318 
7    10      550   319 

Here is the SQL script:

CREATE TABLE PurchaseOrderHeader(
    PurchaseOrderID INTEGER NOT NULL PRIMARY KEY 
    ,VendorID  INTEGER NOT NULL 
    ,OrderDate  VARCHAR(23) NOT NULL 
    ,TotalDue  NUMERIC(10,4) NOT NULL 
); 
INSERT INTO PurchaseOrderHeader(PurchaseOrderID,VendorID,OrderDate,TotalDue) VALUES (1,1580,'2011-04-16 00:00:00.000',222.1492); 
INSERT INTO PurchaseOrderHeader(PurchaseOrderID,VendorID,OrderDate,TotalDue) VALUES (2,1496,'2011-04-16 00:00:00.000',300.6721); 
INSERT INTO PurchaseOrderHeader(PurchaseOrderID,VendorID,OrderDate,TotalDue) VALUES (3,1494,'2011-04-16 00:00:00.000',9776.2665); 
INSERT INTO PurchaseOrderHeader(PurchaseOrderID,VendorID,OrderDate,TotalDue) VALUES (4,1650,'2011-04-16 00:00:00.000',189.0395); 
INSERT INTO PurchaseOrderHeader(PurchaseOrderID,VendorID,OrderDate,TotalDue) VALUES (5,1654,'2011-04-30 00:00:00.000',22539.0165); 
INSERT INTO PurchaseOrderHeader(PurchaseOrderID,VendorID,OrderDate,TotalDue) VALUES (6,1664,'2011-04-30 00:00:00.000',16164.0229); 
INSERT INTO PurchaseOrderHeader(PurchaseOrderID,VendorID,OrderDate,TotalDue) VALUES (7,1678,'2011-04-30 00:00:00.000',64847.5328); 


CREATE TABLE PurchaseOrderDetail(
    PurchaseOrderID  INTEGER NOT NULL 
    ,PurchaseOrderDetailID INTEGER NOT NULL PRIMARY KEY 
    ,OrderQty    INTEGER NOT NULL 
    ,ProductID    INTEGER NOT NULL 
); 
INSERT INTO PurchaseOrderDetail(PurchaseOrderID,PurchaseOrderDetailID,OrderQty,ProductID) VALUES (1,1,4,1); 
INSERT INTO PurchaseOrderDetail(PurchaseOrderID,PurchaseOrderDetailID,OrderQty,ProductID) VALUES (2,2,3,359); 
INSERT INTO PurchaseOrderDetail(PurchaseOrderID,PurchaseOrderDetailID,OrderQty,ProductID) VALUES (2,3,3,360); 
INSERT INTO PurchaseOrderDetail(PurchaseOrderID,PurchaseOrderDetailID,OrderQty,ProductID) VALUES (3,4,550,530); 
INSERT INTO PurchaseOrderDetail(PurchaseOrderID,PurchaseOrderDetailID,OrderQty,ProductID) VALUES (4,5,3,4); 
INSERT INTO PurchaseOrderDetail(PurchaseOrderID,PurchaseOrderDetailID,OrderQty,ProductID) VALUES (5,6,550,512); 
INSERT INTO PurchaseOrderDetail(PurchaseOrderID,PurchaseOrderDetailID,OrderQty,ProductID) VALUES (6,7,550,513); 
INSERT INTO PurchaseOrderDetail(PurchaseOrderID,PurchaseOrderDetailID,OrderQty,ProductID) VALUES (7,8,550,317); 
INSERT INTO PurchaseOrderDetail(PurchaseOrderID,PurchaseOrderDetailID,OrderQty,ProductID) VALUES (7,9,550,318); 
INSERT INTO PurchaseOrderDetail(PurchaseOrderID,PurchaseOrderDetailID,OrderQty,ProductID) VALUES (7,10,550,319); 

Here is my code:

select PurchaseOrderHeader.VendorID, 
SUM(CASE WHEN Datename(year,PurchaseOrderHeader.OrderDate) = 2011 THEN PurchaseOrderHeader.TotalDue else 0 END) as "TotalPay IN 2011", 
SUM(CASE WHEN Datename(year,PurchaseOrderHeader.OrderDate) = 2011 THEN PurchaseOrderDetail.OrderQty else 0 END) as "TotalOrder IN 2011" 
from PurchaseOrderHeader 
left join PurchaseOrderDetail on PurchaseOrderHeader.PurchaseOrderID = PurchaseOrderDetail.PurchaseOrderID 
group by PurchaseOrderHeader.VendorID 
order by VendorID 

Here is what I get:

VendorID TotalPay IN 2011 TotalOrder IN 2011 
1494  9776.2665   550 
1496  601.3442   6 
1580  222.1492   4 
1650  189.0395   3 
1654  22539.0165   550 
1664  16164.0229   550 
1678  194542.5984   1650 

And here is what I expected:

VendorID TotalPay IN 2011 TotalOrder IN 2011 
1494  9776.2665   550 
1496  300.6721   6 
1580  222.1492   4 
1650  189.0395   3 
1654  22539.0165   550 
1664  16164.0229   550 
1678  64847.5328   1650 

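This code joins the two tables on PurchaseOrderID and sums TotalDue grouped by VendorID. The problem is that, once I join, several rows in PurchaseOrderDetail can reference a single row in PurchaseOrderHeader. In this sample, vendors 1496 and 1678 each have two or three detail rows referencing one header row, so that row's TotalDue is added two or three times. How can I avoid adding it multiple times? Thanks!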
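To see the duplication directly, it can help to look at the joined rows before any aggregation. A minimal sketch against the sample tables above, restricted to the two affected vendors (the aliases h and d are only for brevity):

```sql
-- List the raw joined rows: each header row repeats once per matching detail row.
SELECT h.PurchaseOrderID,
       h.VendorID,
       h.TotalDue,
       d.PurchaseOrderDetailID,
       d.OrderQty
FROM PurchaseOrderHeader h
LEFT JOIN PurchaseOrderDetail d
       ON d.PurchaseOrderID = h.PurchaseOrderID
WHERE h.VendorID IN (1496, 1678)
ORDER BY h.PurchaseOrderID, d.PurchaseOrderDetailID;
```

With the sample data this should return five rows: the TotalDue of order 2 appears twice and the TotalDue of order 7 three times, which is what inflates the sums to 601.3442 and 194542.5984.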

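Tried it – ad4s

You need to give us more detail about the grouping by PurchaseOrderID. What you are saying isn't clear. http://spaghettidba.com/2015/04/24/how-to-post-at-sql-question-on-a-public-forum/ –

Thanks, I will edit the question later so that it meets the standard. @Sean Lange –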

Answers

0
select h.VendorID, 
     SUM(CASE WHEN Datename(year,h.OrderDate) = 2011 THEN h.TotalDue else 0 END) as "TotalPay IN 2011", 
     SUM(CASE WHEN Datename(year,h.OrderDate) = 2011 THEN d.OrderQty else 0 END) as "TotalOrder IN 2011" 
from PurchaseOrderHeader h 
left join (
     select t.PurchaseOrderID, 
       sum(t.OrderQty) as OrderQty 
     from PurchaseOrderDetail t 
     group by t.PurchaseOrderID 
     ) d on d.PurchaseOrderID = h.PurchaseOrderID 
group by h.VendorID 
order by VendorID 
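
As a sketch of why this works: the derived table d collapses PurchaseOrderDetail to one row per PurchaseOrderID before the join, so each header row matches at most one row and TotalDue is no longer repeated. Running the inner query on its own against the sample data shows this (the DetailRows column is added here only for illustration):

```sql
-- One output row per order: quantities are pre-summed before any join happens.
select t.PurchaseOrderID,
       sum(t.OrderQty) as OrderQty,
       count(*)        as DetailRows
from PurchaseOrderDetail t
group by t.PurchaseOrderID
order by t.PurchaseOrderID;
```

With the sample rows this should give one line per order, for example order 2 with OrderQty 6 from 2 detail rows and order 7 with OrderQty 1650 from 3 detail rows.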
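
While this code may help solve the problem, it does not explain _why_ and/or _how_ it answers the question. Providing this additional context would significantly improve its long-term educational value. Please [edit] your answer to add an explanation, including what limitations and assumptions apply. –

Thanks! This is the correct answer –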

1

You can divide your SUM by COUNT. Something like this.

select PurchaseOrderHeader.VendorID, 
SUM(CASE WHEN Datename(year,PurchaseOrderHeader.OrderDate) = 2011 THEN PurchaseOrderHeader.TotalDue else 0 END)/COUNT(*) as "TotalPay IN 2011", 
SUM(CASE WHEN Datename(year,PurchaseOrderHeader.OrderDate) = 2011 THEN PurchaseOrderDetail.OrderQty else 0 END)/COUNT(*) as "TotalOrder IN 2011" 
from Purchasing.PurchaseOrderHeader 
left join Purchasing.PurchaseOrderDetail on PurchaseOrderHeader.PurchaseOrderID = PurchaseOrderDetail.PurchaseOrderID 
group by PurchaseOrderHeader.VendorID 
order by VendorID 
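
I had thought about that, but COUNT may not work the way I expect, since it will count all the times rather than the exact times. –

What do you mean by exact times? That makes no sense to me at all. What does time have to do with it? –

Sorry I didn't make that clear; I have edited the tables in my question. As I understand it, `count(*)` counts the number of times a VendorID appears. But this table has multiple orders with different PurchaseOrderIDs and the same VendorID, so a VendorID can appear many times with different OrderDates and PurchaseOrderIDs. If count(*) is used, it counts every row for that VendorID, including rows from unrelated PurchaseOrderIDs, which is not what we want. @Sean Lange –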
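
The concern can be checked with a small made-up case. The sketch below uses hypothetical data (vendor 9001 with two orders, one with two detail rows and one with a single detail row), not rows from AdventureWorks:

```sql
-- Hypothetical data: vendor 9001 has two orders; order 1 has two detail rows.
WITH h AS (
    SELECT * FROM (VALUES (1, 9001, 100.00),
                          (2, 9001,  50.00)) AS v(PurchaseOrderID, VendorID, TotalDue)
),
d AS (
    SELECT * FROM (VALUES (1, 1, 5),
                          (1, 2, 5),
                          (2, 3, 5)) AS v(PurchaseOrderID, PurchaseOrderDetailID, OrderQty)
)
SELECT h.VendorID,
       SUM(h.TotalDue)            AS JoinedSum,          -- order 1 is counted twice
       COUNT(*)                   AS JoinedRows,         -- all joined rows for the vendor
       SUM(h.TotalDue) / COUNT(*) AS SumDividedByCount   -- not the vendor's real total
FROM h
LEFT JOIN d ON d.PurchaseOrderID = h.PurchaseOrderID
GROUP BY h.VendorID;
```

Here the join produces three rows, so SUM(TotalDue) is 250 and COUNT(*) is 3, giving roughly 83.33 instead of the intended 150. Dividing by COUNT(*) only cancels the duplication when a vendor has a single order.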

0

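The default way to avoid double counting is to use SUM(DISTINCT expr).

That does not always work well, because you don't want to sum distinct values; you want to sum distinct rows, even when those rows share the same value.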
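
A small made-up case illustrates the difference. The sketch below assumes hypothetical data (vendor 9001 with two separate orders that happen to have the same TotalDue), not rows from AdventureWorks:

```sql
-- Hypothetical data: two different orders with an identical TotalDue of 100.
WITH h AS (
    SELECT * FROM (VALUES (1, 9001, 100.00),
                          (2, 9001, 100.00)) AS v(PurchaseOrderID, VendorID, TotalDue)
),
d AS (
    SELECT * FROM (VALUES (1, 1, 5),
                          (1, 2, 5),
                          (2, 3, 5)) AS v(PurchaseOrderID, PurchaseOrderDetailID, OrderQty)
)
SELECT h.VendorID,
       SUM(h.TotalDue)          AS PlainSum,     -- order 1 is counted twice
       SUM(DISTINCT h.TotalDue) AS DistinctSum   -- equal amounts collapse into one
FROM h
LEFT JOIN d ON d.PurchaseOrderID = h.PurchaseOrderID
GROUP BY h.VendorID;
```

The plain sum comes out as 300 and the distinct sum as 100, while the vendor's real total is 200. SUM(DISTINCT) removes the join duplicates but also merges the two genuine orders.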

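The fix is to use a subquery that aggregates the details per order ID, and then join to that result. You then have a single total per order ID to join to the order header: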

SELECT PurchaseOrderHeader.VendorID, 
    SUM(PurchaseOrderHeader.TotalDue) AS "TotalPay IN 2011", 
    SUM(POD.Qty) AS "TotalOrder IN 2011" 
FROM PurchaseOrderHeader 
LEFT JOIN (
    SELECT PurchaseOrderDetail.PurchaseOrderID, SUM(OrderQty) AS Qty 
    FROM PurchaseOrderDetail 
    GROUP BY PurchaseOrderDetail.PurchaseOrderID 
    ) AS POD on PurchaseOrderHeader.PurchaseOrderID = POD.PurchaseOrderID 
WHERE Datename(year,PurchaseOrderHeader.OrderDate) = 2011 
GROUP BY PurchaseOrderHeader.VendorID 
ORDER BY VendorID 

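I also took the liberty of moving the CASE WHEN condition out of the SUM() calls and into the WHERE clause of the query. In this case it should give the same result with shorter code. Note that, with the filter in WHERE, a vendor with no orders in 2011 drops out of the result instead of appearing with totals of 0.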

0

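Lots of good answers, but I think they miss the part where a vendor may have multiple purchase orders, which throws off how TotalOrder is calculated. (Try an example with multiple vendors with multiple orders, each order having multiple details.) Don't forget to check for possible NULL values!

Here I use a subquery to calculate TotalOrder for each vendor for the year, and then join it back to the list of all vendors. (Table aliases added as well, for easy identification.)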

-- TotalPay is summed from the header alone; TotalOrder comes from a per-vendor subquery
select
    hd.VendorID
    ,sum(case
        when year(hd.OrderDate) = 2011 then hd.TotalDue
        else 0
     end)                                 as "TotalPay IN 2011"
    ,isnull(subQuery.TotalOrderIn2011, 0) as "TotalOrder IN 2011"
from PurchaseOrderHeader hd
    left join ( -- Calculate volume by vendor for 2011
        select
            hd.VendorID
            ,sum(dt.OrderQty) as TotalOrderIn2011
        from PurchaseOrderHeader hd
            inner join PurchaseOrderDetail dt
                on hd.PurchaseOrderID = dt.PurchaseOrderID
        where year(hd.OrderDate) = 2011
        group by hd.VendorID
    ) subQuery
        on subQuery.VendorID = hd.VendorID
-- subQuery has one row per vendor, so adding its total to GROUP BY does not split any group
group by hd.VendorID, subQuery.TotalOrderIn2011
order by hd.VendorID