How to improve SQL Server performance for a Hash Match outer join

I am new to performance problems, so I am not sure what my approach should be.

This is the query, which takes more than 7 minutes to run.

INSERT INTO SubscriberToEncounterMapping(PatientEncounterID, InsuranceSubscriberID) 
    SELECT 
     PV.PatientVisitId AS PatientEncounterID, 
     InsSub.InsuranceSubscriberID 
    FROM 
     DB1.dbo.PatientVisit PV 
    JOIN 
     DB1.dbo.PatientVisitInsurance PVI ON PV.PatientVisitId = PVI.PatientVisitId 
    JOIN 
     DB1.dbo.PatientInsurance PatIns on PatIns.PatientInsuranceId = PVI.PatientInsuranceId 
    JOIN 
     DB1.dbo.PatientProfile PP On PP.PatientProfileId = PatIns.PatientProfileId 
    LEFT OUTER JOIN 
     DB1.dbo.Guarantor G ON PatIns.PatientProfileId = G.PatientProfileId 
    JOIN 
     Warehouse.dbo.InsuranceSubscriber InsSub ON InsSub.InsuranceCarriersID = PatIns.InsuranceCarriersId 
         AND InsSub.OrderForClaims = PatIns.OrderForClaims 
         AND ((InsSub.GuarantorID = G.GuarantorId) OR (InsSub.GuarantorID IS NULL AND G.GuarantorId IS NULL)) 
    JOIN 
     Warehouse.dbo.Encounter E ON E.PatientEncounterID = PV.PatientVisitId

The execution plan shows a Hash Match (Right Outer Join) operator that accounts for 89% of the query cost.

There is no right outer join in the query, so I don't understand where the problem is.

How can I make the query more efficient?

Here are the Hash Match details:

[Screenshot of the Hash Match operator properties; image not preserved]


2016-11-09 Gloria Santin

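First: apart from the table aliased `InsSub`, I don't see your `SELECT` list using most of the tables you join. Do you *really* need to join all of them just to get these two pieces of information?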

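Can you show the details of the Hash Match? What is the probe, and what is the output? It isn't clear from the screenshot. My guess is that this predicate is causing your problem: `(InsSub.GuarantorID = G.GuarantorId) OR (InsSub.GuarantorID IS NULL AND G.GuarantorId IS NULL)`. You might want to consider using two queries and combining the results. When you have an `OR` predicate like this, it often leads to a suboptimal plan, and two separate queries can make better use of indexes. – GarethD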

@GarethD Maybe use `EXISTS` in the `WHERE` clause instead of putting those two predicates in the join? – dfundako
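A different technique from the ones suggested in these comments, shown only as a sketch: the "equal, or both NULL" test can be written as a single predicate with `EXISTS (SELECT ... INTERSECT SELECT ...)`, because `INTERSECT` treats two NULLs as matching. The query below reuses only the tables and aliases from the question. Whether it gives a better plan here is untested and would have to be checked against the actual execution plan.

```sql
-- Sketch only: null-safe comparison of the two guarantor columns.
-- INTERSECT treats NULLs as equal, so the EXISTS is true when both
-- values are equal or when both are NULL (same logic as the OR predicate).
SELECT PV.PatientVisitId AS PatientEncounterID,
       InsSub.InsuranceSubscriberID
FROM DB1.dbo.PatientVisit PV
JOIN DB1.dbo.PatientVisitInsurance PVI
    ON PV.PatientVisitId = PVI.PatientVisitId
JOIN DB1.dbo.PatientInsurance PatIns
    ON PatIns.PatientInsuranceId = PVI.PatientInsuranceId
JOIN DB1.dbo.PatientProfile PP
    ON PP.PatientProfileId = PatIns.PatientProfileId
LEFT OUTER JOIN DB1.dbo.Guarantor G
    ON PatIns.PatientProfileId = G.PatientProfileId
JOIN Warehouse.dbo.InsuranceSubscriber InsSub
    ON InsSub.InsuranceCarriersID = PatIns.InsuranceCarriersId
   AND InsSub.OrderForClaims = PatIns.OrderForClaims
   AND EXISTS (SELECT InsSub.GuarantorID INTERSECT SELECT G.GuarantorId)
JOIN Warehouse.dbo.Encounter E
    ON E.PatientEncounterID = PV.PatientVisitId;
```

Because the predicate is logically the same as the original `OR`, this form should return the same rows as the original query, which makes it a convenient baseline when comparing rewrites.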

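To elaborate on my comment, you could try splitting it into two queries: the first matching on `GuarantorID`, and the second matching when it is NULL in both `InsuranceSubscriber` and `Guarantor`, or when the record is missing from `Guarantor` entirely: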

INSERT INTO SubscriberToEncounterMapping(PatientEncounterID, InsuranceSubscriberID) 
SELECT PV.PatientVisitId AS PatientEncounterID, InsSub.InsuranceSubscriberID 
FROM DB1.dbo.PatientVisit PV 
     JOIN DB1.dbo.PatientVisitInsurance PVI 
      ON PV.PatientVisitId = PVI.PatientVisitId 
     JOIN DB1.dbo.PatientInsurance PatIns 
      ON PatIns.PatientInsuranceId = PVI.PatientInsuranceId 
     JOIN DB1.dbo.PatientProfile PP 
      ON PP.PatientProfileId = PatIns.PatientProfileId 
     JOIN DB1.dbo.Guarantor G 
      ON PatIns.PatientProfileId = G.PatientProfileId 
     JOIN Warehouse.dbo.InsuranceSubscriber InsSub 
      ON InsSub.InsuranceCarriersID = PatIns.InsuranceCarriersId 
      AND InsSub.OrderForClaims = PatIns.OrderForClaims 
      AND InsSub.GuarantorID = G.GuarantorId 
     JOIN Warehouse.dbo.Encounter E 
      ON E.PatientEncounterID = PV.PatientVisitId 
UNION ALL 
SELECT PV.PatientVisitId AS PatientEncounterID, InsSub.InsuranceSubscriberID 
FROM DB1.dbo.PatientVisit PV 
     JOIN DB1.dbo.PatientVisitInsurance PVI 
      ON PV.PatientVisitId = PVI.PatientVisitId 
     JOIN DB1.dbo.PatientInsurance PatIns 
      ON PatIns.PatientInsuranceId = PVI.PatientInsuranceId 
     JOIN DB1.dbo.PatientProfile PP 
      ON PP.PatientProfileId = PatIns.PatientProfileId 
     JOIN Warehouse.dbo.InsuranceSubscriber InsSub 
      ON InsSub.InsuranceCarriersID = PatIns.InsuranceCarriersId 
      AND InsSub.OrderForClaims = PatIns.OrderForClaims 
      AND InsSub.GuarantorID IS NULL 
     JOIN Warehouse.dbo.Encounter E 
      ON E.PatientEncounterID = PV.PatientVisitId 
WHERE NOT EXISTS 
     ( SELECT 1 
      FROM DB1.dbo.Guarantor G 
      WHERE PatIns.PatientProfileId = G.PatientProfileId 
      AND  InsSub.GuarantorID IS NOT NULL 
     );


2016-11-09 17:26:54 GarethD

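This is definitely a lot faster! But it returns different records than the original query, so I will have to hold off for now. This is definitely the way to go, though!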
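One possible cause of the difference, offered as a guess rather than a confirmed diagnosis: in the second branch the join already requires `InsSub.GuarantorID IS NULL`, so the `InsSub.GuarantorID IS NOT NULL` test inside the `NOT EXISTS` subquery can never be true, and the subquery never filters anything out. The sketch below assumes the intent described in the answer (the profile has no `Guarantor` row with a non-NULL `GuarantorId`) and tests `G.GuarantorId` instead. Only the second branch is shown; the first branch would stay as written.

```sql
-- Sketch of the second UNION ALL branch only.
-- Assumption: the branch should keep subscribers with no guarantor whose
-- patient profile has no Guarantor row carrying a non-NULL GuarantorId.
SELECT PV.PatientVisitId AS PatientEncounterID, InsSub.InsuranceSubscriberID
FROM DB1.dbo.PatientVisit PV
     JOIN DB1.dbo.PatientVisitInsurance PVI
      ON PV.PatientVisitId = PVI.PatientVisitId
     JOIN DB1.dbo.PatientInsurance PatIns
      ON PatIns.PatientInsuranceId = PVI.PatientInsuranceId
     JOIN DB1.dbo.PatientProfile PP
      ON PP.PatientProfileId = PatIns.PatientProfileId
     JOIN Warehouse.dbo.InsuranceSubscriber InsSub
      ON InsSub.InsuranceCarriersID = PatIns.InsuranceCarriersId
      AND InsSub.OrderForClaims = PatIns.OrderForClaims
      AND InsSub.GuarantorID IS NULL
     JOIN Warehouse.dbo.Encounter E
      ON E.PatientEncounterID = PV.PatientVisitId
WHERE NOT EXISTS
     ( SELECT 1
       FROM DB1.dbo.Guarantor G
       WHERE G.PatientProfileId = PatIns.PatientProfileId
       AND G.GuarantorId IS NOT NULL   -- changed from InsSub.GuarantorID
     );
```

Row counts can still differ from the original when one profile has several `Guarantor` rows, since the original `LEFT OUTER JOIN` can repeat a row once per guarantor while `NOT EXISTS` cannot. Comparing the two outputs with `EXCEPT` in both directions shows which distinct rows differ, though `EXCEPT` removes duplicates and so will not reveal differences in how many times a row appears.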


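I would reorder the joins based on each join's ability to reduce the number of records returned. Whichever joins cut the number of records the most should come first, since that improves efficiency. Then perform the outer join. Also, table locking can always be an issue, so add `WITH (NOLOCK)` so that the reads are not held up by locked records.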

Maybe something like this will work with a little tweaking.

INSERT INTO SubscriberToEncounterMapping (
    PatientEncounterID 
    , InsuranceSubscriberID 
    ) 
SELECT PV.PatientVisitId AS PatientEncounterID 
    , InsSub.InsuranceSubscriberID 
FROM DB1.dbo.PatientVisit PV WITH (NOLOCK) 
INNER JOIN Warehouse.dbo.Encounter E WITH (NOLOCK) 
    ON E.PatientEncounterID = PV.PatientVisitId 
INNER JOIN DB1.dbo.PatientVisitInsurance PVI WITH (NOLOCK) 
    ON PV.PatientVisitId = PVI.PatientVisitId 
INNER JOIN DB1.dbo.PatientInsurance PatIns WITH (NOLOCK) 
    ON PatIns.PatientInsuranceId = PVI.PatientInsuranceId 
INNER JOIN DB1.dbo.PatientProfile PP WITH (NOLOCK) 
    ON PP.PatientProfileId = PatIns.PatientProfileId 
INNER JOIN Warehouse.dbo.InsuranceSubscriber InsSub WITH (NOLOCK) 
    ON InsSub.InsuranceCarriersID = PatIns.InsuranceCarriersId 
     AND InsSub.OrderForClaims = PatIns.OrderForClaims 
LEFT JOIN DB1.dbo.Guarantor G WITH (NOLOCK) 
    ON PatIns.PatientProfileId = G.PatientProfileId 
     AND (
      (InsSub.GuarantorID = G.GuarantorId) 
      OR (
       InsSub.GuarantorID IS NULL 
       AND G.GuarantorId IS NULL 
       ) 
      )


2016-11-09 17:12:56 KH1229

How does adding `NOLOCK` affect the hash join operator in the execution plan? – dfundako

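The order in which joins are written has no bearing on the order in which they are executed (unless you use `OPTION (FORCE ORDER)`), so this makes no difference. You may also want to read [Bad habits: Putting NOLOCK everywhere](https://blogs.sentryone.com/aaronbertrand/bad-habits-nolock-everywhere/). It is not a magic performance fix and should be used sparingly, generally only by people who understand and accept the risks. – GarethD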

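I have found that join order does matter. If you leave it to the optimizer to do its job, it may or may not optimize the joins by itself. Agreed that `NOLOCK` may not be needed or ideal, but if something is locked, the query will run faster because it does not wait on the lock. If it doesn't help, remove the hints. The hash match will always be there, but reducing the size of the record sets going into the operation should help. – KH1229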
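The disagreement about join order in these comments can be checked directly rather than argued. As a rough sketch, the query hint mentioned above can be appended to the question's `SELECT` so the optimizer keeps the joins in the order written. Running the same statement with and without the hint, and with the joins rearranged, shows whether the written order changes the plan at all. This is meant as an experiment, not as a recommended fix.

```sql
-- Experiment only: OPTION (FORCE ORDER) asks the optimizer to preserve
-- the join order as written, so different orderings can be compared.
SELECT PV.PatientVisitId AS PatientEncounterID,
       InsSub.InsuranceSubscriberID
FROM DB1.dbo.PatientVisit PV
JOIN DB1.dbo.PatientVisitInsurance PVI
    ON PV.PatientVisitId = PVI.PatientVisitId
JOIN DB1.dbo.PatientInsurance PatIns
    ON PatIns.PatientInsuranceId = PVI.PatientInsuranceId
JOIN DB1.dbo.PatientProfile PP
    ON PP.PatientProfileId = PatIns.PatientProfileId
LEFT OUTER JOIN DB1.dbo.Guarantor G
    ON PatIns.PatientProfileId = G.PatientProfileId
JOIN Warehouse.dbo.InsuranceSubscriber InsSub
    ON InsSub.InsuranceCarriersID = PatIns.InsuranceCarriersId
   AND InsSub.OrderForClaims = PatIns.OrderForClaims
   AND ((InsSub.GuarantorID = G.GuarantorId)
        OR (InsSub.GuarantorID IS NULL AND G.GuarantorId IS NULL))
JOIN Warehouse.dbo.Encounter E
    ON E.PatientEncounterID = PV.PatientVisitId
OPTION (FORCE ORDER);
```

If the plan without the hint is the same regardless of how the joins are written, the optimizer is already choosing the order itself, and rewriting the join sequence by hand is unlikely to help.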
