2011-04-26 77 views
0

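Duplicate records by hierarchy?

I have a table called 1 Main Contacts ("Main") and several other tables. They all have a column named Contact_ID (or Referral_ID) that ties them together. One problem with this setup is that a record in "Main" can be linked to multiple records in the Referral table, so when I run a query I get duplicate records for a contact, one for each "Contact_Source" value it has in the Referral table.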

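I created a view so that when I run queries from a website I can select from the view, which means I am pulling data that is tied to the correct case. I also run this query by each contact's "Contact_Source". I posted two earlier questions (here and here) about avoiding duplicate records, and I have that part working.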

Here is the final code I ended up with to remove the duplicate records:

ALTER VIEW dbo.v_angelview AS 

WITH q AS 
     (
     SELECT dbo.[1_MAIN - Contacts].Contact_ID, dbo.[1_MAIN - Contacts].Date_entered_into_Database, dbo.[1_MAIN - Contacts].Date_of_Initial_Contact, 
         dbo.[1_MAIN - Contacts].[Company_ Name], dbo.[1_MAIN - Contacts].Key_Contact_Title, dbo.[1_MAIN - Contacts].Key_Contact_First_Name, 
         dbo.[1_MAIN - Contacts].Key_Contact_Middle, dbo.[1_MAIN - Contacts].Key_Contact_Last_Name, dbo.[1_MAIN - Contacts].Key_Credential, 
         dbo.[1_MAIN - Contacts].Key_Contact_Occupation, dbo.[1_MAIN - Contacts].Key_Degree_1, dbo.[1_MAIN - Contacts].Key_Degree_2, 
         dbo.[1_MAIN - Contacts].Key_Degree_3, dbo.[1_MAIN - Contacts].Date_of_Highest_Degree, dbo.[1_MAIN - Contacts].Work_Setting, 
         dbo.[1_MAIN - Contacts].Website_Address, dbo.[1_MAIN - Contacts].Email_1_Key_Contact, dbo.[1_MAIN - Contacts].Email_2, 
         dbo.[1_MAIN - Contacts].Email_3, dbo.[1_MAIN - Contacts].Day_Time_Phone_Number, dbo.[1_MAIN - Contacts].Extension, 
         dbo.[1_MAIN - Contacts].Mobile_Phone_Number, dbo.[1_MAIN - Contacts].Bus_Fax_Number, dbo.[1_MAIN - Contacts].Home_Phone_Number, 
         dbo.[1_MAIN - Contacts].Home_Fax_Number, dbo.[1_MAIN - Contacts].Mailing_Street_1, dbo.[1_MAIN - Contacts].Mailing_Street_2, 
         dbo.[1_MAIN - Contacts].Mailing_City, dbo.[1_MAIN - Contacts].Mailing_State, dbo.[1_MAIN - Contacts].[Mailing_Zip/Postal], 
         dbo.[1_MAIN - Contacts].Mailing_Country, dbo.[1_MAIN - Contacts].[Bad_Address?], dbo.[1_MAIN - Contacts].[PROV/REG?], 
         dbo.[1_MAIN - Contacts].status_flag, dbo.[1_MAIN - Contacts].status_flag AS status_flag2, dbo.Providers.Referral_Source, dbo.Referral.Contact_Source, 
         dbo.Resource_Center.cert_start_date, dbo.Resource_Center.cert_exp_date, dbo.prov_training_records.Contact_ID AS Expr2, 
         dbo.prov_training_records.date_reg_email_sent, dbo.Resource_Center.access, dbo.Providers.Contact_ID AS Expr1, 
       ROW_NUMBER() OVER (PARTITION BY dbo.[1_MAIN - Contacts].Contact_ID ORDER BY dbo.[1_MAIN - Contacts].Contact_ID) AS rn 
     FROM dbo.[1_MAIN - Contacts] 
     INNER JOIN 
       dbo.Referral 
     ON  dbo.[1_MAIN - Contacts].Contact_ID = dbo.Referral.Referral_ID 
     INNER JOIN 
       dbo.prov_training_records 
     ON  dbo.[1_MAIN - Contacts].Contact_ID = dbo.prov_training_records.Contact_ID 
     LEFT OUTER JOIN 
       dbo.Resource_Center 
     ON  dbo.[1_MAIN - Contacts].Contact_ID = dbo.Resource_Center.Contact_ID 
     FULL OUTER JOIN 
       dbo.Providers 
     ON  dbo.[1_MAIN - Contacts].Contact_ID = dbo.Providers.Contact_ID 
WHERE (dbo.[1_MAIN - Contacts].Mailing_State = N'AL') OR 
         (dbo.[1_MAIN - Contacts].Mailing_State = N'FL') OR 
         (dbo.[1_MAIN - Contacts].Mailing_State = N'GA') OR 
         (dbo.[1_MAIN - Contacts].Mailing_State = N'KY') OR 
         (dbo.[1_MAIN - Contacts].Mailing_State = N'MS') OR 
         (dbo.[1_MAIN - Contacts].Mailing_State = N'NC') OR 
         (dbo.[1_MAIN - Contacts].Mailing_State = N'SC') OR 
         (dbo.[1_MAIN - Contacts].Mailing_State = N'TN') OR 
         (dbo.[1_MAIN - Contacts].Mailing_State = N'PR') OR 
         (dbo.[1_MAIN - Contacts].Mailing_State = N'CO') OR 
         (dbo.[1_MAIN - Contacts].Mailing_State = N'MT') OR 
         (dbo.[1_MAIN - Contacts].Mailing_State = N'ND') OR 
         (dbo.[1_MAIN - Contacts].Mailing_State = N'SD') OR 
         (dbo.[1_MAIN - Contacts].Mailing_State = N'UT') OR 
         (dbo.[1_MAIN - Contacts].Mailing_State = N'WY') OR 
         (dbo.[1_MAIN - Contacts].Mailing_State = N'AR') OR 
         (dbo.[1_MAIN - Contacts].Mailing_State = N'LA') OR 
         (dbo.[1_MAIN - Contacts].Mailing_State = N'NM') OR 
         (dbo.[1_MAIN - Contacts].Mailing_State = N'OK') OR 
         (dbo.[1_MAIN - Contacts].Mailing_State = N'TX') OR 
         (dbo.[1_MAIN - Contacts].Mailing_State = N'AZ') OR 
         (dbo.[1_MAIN - Contacts].Mailing_State = N'CA') OR 
         (dbo.[1_MAIN - Contacts].Mailing_State = N'HI') OR 
         (dbo.[1_MAIN - Contacts].Mailing_State = N'ID') OR 
         (dbo.[1_MAIN - Contacts].Mailing_State = N'NV') 
     ) 
SELECT * 
FROM q 
WHERE rn = 1 

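The problem I am facing now is that I need to set a priority that decides which of the duplicate records I keep and which get removed.

Here is the hierarchy, from highest precedence to lowest: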

  1. Contact_Source = 'PROVIDER'
  2. Contact_Source LIKE 'RG_%'
  3. Contact_Source LIKE 'IN_%'
  4. Contact_Source LIKE 'LD_%'

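A record may be associated with one or all of these. So, for example, if a record has both PROVIDER and RG_Train, I want to keep the PROVIDER record, and so on down the list. Again, all records have a Contact_ID, which is how I can tell that there are duplicates.

Is there a way to modify my existing SQL to do this, or do I need a new approach? If so, how do I remove the duplicate records according to my priority list?

I am using SQL Server 2005.

Thanks in advance!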

Answers

1

Try...

rn = ROW_NUMBER() 
OVER 
(
    PARTITION BY dbo.[1_MAIN - Contacts].Contact_ID 
    ORDER BY 
    CASE 
    WHEN Contact_Source = 'PROVIDER' 
    THEN 1 

    WHEN Contact_Source LIKE 'RG_%' 
    THEN 2 

    WHEN Contact_Source LIKE 'IN_%' 
    THEN 3 

    WHEN Contact_Source LIKE 'LD_%' 
    THEN 4 

    ELSE 5 
    END 
) 
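To see how the CASE ordering picks the surviving row, here is a minimal sketch on made-up data. The `src` CTE and its rows are hypothetical stand-ins for the joined result, not the real tables; only the CASE expression comes from the snippet above. It uses nothing beyond CTEs, UNION ALL, and ROW_NUMBER(), so it should run on SQL Server 2005.

```sql
-- Hypothetical sample rows standing in for the joined result (not the real tables).
WITH src AS (
    SELECT 1 AS Contact_ID, 'LD_Web' AS Contact_Source
    UNION ALL SELECT 1, 'PROVIDER'
    UNION ALL SELECT 1, 'RG_Train'
    UNION ALL SELECT 2, 'IN_Call'
    UNION ALL SELECT 2, 'LD_Web'
    UNION ALL SELECT 3, 'OTHER'
),
ranked AS (
    SELECT Contact_ID,
           Contact_Source,
           -- Number the rows of each contact, best priority first.
           ROW_NUMBER() OVER (
               PARTITION BY Contact_ID
               ORDER BY CASE
                            WHEN Contact_Source = 'PROVIDER' THEN 1
                            WHEN Contact_Source LIKE 'RG_%'  THEN 2
                            WHEN Contact_Source LIKE 'IN_%'  THEN 3
                            WHEN Contact_Source LIKE 'LD_%'  THEN 4
                            ELSE 5  -- anything outside the hierarchy sorts last
                        END
           ) AS rn
    FROM src
)
-- Keep only the highest-priority row per contact.
SELECT Contact_ID, Contact_Source
FROM ranked
WHERE rn = 1
ORDER BY Contact_ID;
```

With these rows, contact 1 keeps PROVIDER, contact 2 keeps IN_Call, and contact 3 keeps OTHER through the ELSE branch. Two caveats: in LIKE the underscore is itself a single-character wildcard, so 'RG_%' also matches a value such as 'RGX1' (use 'RG[_]%' to match a literal underscore), and if two rows of one contact land in the same priority bucket, which of them gets rn = 1 is not deterministic unless a tie-breaker column is added to the ORDER BY.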
+0

Works great, thanks! – UpHelix 2011-04-26 16:28:18

0

This doesn't take advantage of your use of the ROW_NUMBER function, but I believe it will work:

with q as (
    -- <query> 
), 
cp as (
    select 1 as Precedence, 'PROVIDER' ContactSourcePattern 
    union all select 2, 'RG_%' 
    union all select 3, 'IN_%' 
    union all select 4, 'LD_%' 
) 
select q.* 
from q 
inner join cp on q.Contact_Source like cp.ContactSourcePattern 
where 
    -- filter out duplicate records with the same `Contact_ID` that have a lower precedence than other records 
    not exists (
     select 1 
     from q as q2 inner join cp as cp2 on q2.Contact_Source like cp2.ContactSourcePattern 
     where 
      q2.Contact_ID = q.Contact_ID -- q2 is a duplicate of q if `Contact_ID` matches 
      and cp2.Precedence < cp.Precedence -- q2/cp2 is higher precedence than q/cp if `Precedence` is a smaller number 
    )
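For comparison, here is the same query filled in with made-up data so it can be run on its own. The rows inside `q` are hypothetical and only stand in for the `-- <query>` placeholder; the `cp` table and the NOT EXISTS filter are as written above.

```sql
-- Hypothetical sample rows in place of the real query.
WITH q AS (
    SELECT 1 AS Contact_ID, 'LD_Web' AS Contact_Source
    UNION ALL SELECT 1, 'PROVIDER'
    UNION ALL SELECT 2, 'IN_Call'
    UNION ALL SELECT 2, 'LD_Web'
    UNION ALL SELECT 3, 'OTHER'
),
-- Precedence lookup: a smaller number means higher priority.
cp AS (
    SELECT 1 AS Precedence, 'PROVIDER' AS ContactSourcePattern
    UNION ALL SELECT 2, 'RG_%'
    UNION ALL SELECT 3, 'IN_%'
    UNION ALL SELECT 4, 'LD_%'
)
SELECT q.Contact_ID, q.Contact_Source, cp.Precedence
FROM q
INNER JOIN cp ON q.Contact_Source LIKE cp.ContactSourcePattern
WHERE NOT EXISTS (
    -- Drop a row when the same contact has another row with better precedence.
    SELECT 1
    FROM q AS q2
    INNER JOIN cp AS cp2 ON q2.Contact_Source LIKE cp2.ContactSourcePattern
    WHERE q2.Contact_ID = q.Contact_ID
      AND cp2.Precedence < cp.Precedence
)
ORDER BY q.Contact_ID;
```

On this data the query returns contact 1 with PROVIDER and contact 2 with IN_Call. Contact 3 does not come back at all, because 'OTHER' matches none of the patterns and the inner join to `cp` discards it; the ROW_NUMBER version keeps such rows through its ELSE branch. Also note that if a contact has two rows with the same precedence, both survive, since the filter only removes rows that are strictly lower in the hierarchy.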