2016-09-19 106 views
0

Delete duplicate rows using a subquery

I am using SQL Server 2014 with the AdventureWorks2012 sample database provided by Microsoft.

I am trying to delete duplicate rows using the subquery below (option #2):

/*** Option #2: SUBQUERY ***/

--SELECT * FROM 
DELETE SQLPractice.[dbo].[CURRENCY] 
WHERE EXISTS (SELECT * 
       FROM 
        (SELECT 
         NAME, 
         ROW_NUMBER() OVER (PARTITION BY NAME ORDER BY NAME) AS Flag 
        FROM 
         SQLPractice.[dbo].[CURRENCY]) AS T 
       WHERE Flag > 1) 
GO 

But it deletes all rows from the table.

The other option (a CTE), however, does delete only the duplicate rows.

/*** Option #3: CTE ***/ 
;WITH RepFlag AS 
(
    SELECT 
     NAME, 
     ROW_NUMBER() OVER (PARTITION BY NAME ORDER BY NAME) AS Flag 
    FROM 
     SQLPractice.[dbo].[CURRENCY] 
) 
--SELECT * FROM RepFlag 
DELETE RepFlag 
WHERE Flag > 1 

SELECT * 
FROM SQLPractice.[dbo].[CURRENCY] 

Please use the code below to create your own test table.

/*** REMOVING DUPLICATE ROWS OPTION ***/ 
-- Creating a table 
SELECT TOP 0 * 
INTO [dbo].[CURRENCY] 
FROM AdventureWorks2012.Sales.Currency 
WHERE NAME LIKE '%A'; 

-- inserting duplicate rows 
INSERT [dbo].[CURRENCY] 
SELECT * FROM AdventureWorks2012.Sales.Currency 
WHERE NAME LIKE '%A'; 

/***** SELECTING COUNT OF DUPLICATED ROWS *****/ 

/*** Option #1: "GROUP BY" with "HAVING" ***/ 
SELECT 
    NAME, COUNT(*) AS Qty 
FROM 
    SQLPractice.[dbo].[CURRENCY] 
GROUP BY 
    NAME 
HAVING 
    COUNT(*) >1 
GO 

Answers

1

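Option #2 deletes every row because the subquery inside EXISTS is not correlated with the outer DELETE: it returns rows whenever the table contains any duplicate at all, so the condition is true for every row of the table. There must be some relationship between the subquery in EXISTS and the parent query, so that the subquery produces a different result for each row. One option for deleting the duplicated rows with a subquery, when the table has an identity column: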

DELETE from SQLPractice.[dbo].[CURRENCY] 
where identityCol not in (select min(identityCol) FROM SQLPractice.[dbo].[CURRENCY] GROUP BY NAME) 
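
To see why the uncorrelated EXISTS is all-or-nothing, here is a minimal sketch. The temp table `#Demo` and its three sample values are assumptions made up for illustration; they are not the asker's table. The query previews the outcome with a SELECT instead of deleting anything.

```sql
-- Hypothetical temp table, used only for illustration.
CREATE TABLE #Demo (Name varchar(50) NOT NULL);
INSERT INTO #Demo (Name) VALUES ('Afghani'), ('Afghani'), ('Kwanza');

-- The subquery never references the outer row d, so it yields the same
-- answer for every row of #Demo: "some duplicate exists somewhere".
SELECT d.Name,
       CASE WHEN EXISTS (SELECT 1
                         FROM (SELECT Name,
                                      ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Name) AS Flag
                               FROM #Demo) AS T
                         WHERE T.Flag > 1)
            THEN 'would be deleted'
            ELSE 'kept'
       END AS Outcome
FROM #Demo AS d;
-- All three rows, including the unique 'Kwanza', come back as 'would be deleted'.

DROP TABLE #Demo;
```

Previewing a DELETE condition this way is a cheap check that the predicate actually depends on the row being tested.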
+0

Yes, thank you very much. I had thought about that. I just wanted to know how to get around it without changing the table definition. – enigma6205

+0

You can use a CTE –

+0

I can, and I do, as shown in the code above. I just wanted to explore different options. – enigma6205

0

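The WITH statement deletes only the duplicate rows because it first collects all the duplicate records and then performs the delete on them.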

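In the subquery, however, you have not specified a condition that identifies which records to delete. It should be written as follows: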

DELETE SQLPractice.[dbo].[CURRENCY] 
WHERE EXISTS 
(
    SELECT * FROM 
    (
     SELECT 
     NAME, 
     ID, 
     ROW_NUMBER() OVER (PARTITION BY NAME ORDER BY NAME) AS Flag 
     FROM SQLPractice.[dbo].[CURRENCY] 
    ) AS T 
    WHERE Flag > 1 AND T.ID=[CURRENCY].ID 
) 
+0

Won't this also delete all rows of a currency that occurs more than once? –

+0

It deletes the duplicate entries, so each currency is left with one record –

+0

For example, if "US dollar" appears twice, the inner query will return a row for both of those rows, so both will be deleted –

1

One possible approach:

DELETE tt 
FROM [your table] tt 
    INNER JOIN 
    -- lowest key per duplicated name; that row is kept
    (SELECT NAME, MIN(PK) AS MIN_KEY 
    FROM [your table] 
    GROUP BY NAME 
    HAVING COUNT(*) > 1) dup ON dup.NAME = tt.NAME AND tt.PK <> dup.MIN_KEY 
+0

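Thank you for your solution, Anton, but it won't work because my table has no primary key. Essentially you are proposing something similar to Akshey's solution. – enigma6205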

+1

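If you have no PK, you can use a cursor or a "WHILE loop + temp table". For each duplicated name, execute "DELETE TOP (xxx) ...", where xxx is "[number of rows with the current name] - 1". You can also use SET ROWCOUNT instead of DELETE TOP – Anton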
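
The DELETE TOP idea from this comment could be sketched roughly as follows. The temp table `#Currency`, its two columns, and the sample values are assumptions for illustration; the sketch also treats rows with the same Name as duplicates, as the question does, and uses a plain WHILE loop rather than a cursor.

```sql
-- Hypothetical stand-in for a table that has no key.
CREATE TABLE #Currency (CurrencyCode nchar(3) NOT NULL, Name nvarchar(50) NOT NULL);
INSERT INTO #Currency (CurrencyCode, Name)
VALUES (N'AFA', N'Afghani'), (N'AFA', N'Afghani'),
       (N'AFA', N'Afghani'), (N'AOA', N'Kwanza');

DECLARE @Name nvarchar(50), @Extra int;

WHILE 1 = 1
BEGIN
    -- Reset first: an assignment SELECT that finds no row leaves the old value.
    SET @Name = NULL;

    -- Pick one name that still has duplicates and count its surplus rows.
    SELECT TOP (1) @Name = Name, @Extra = COUNT(*) - 1
    FROM #Currency
    GROUP BY Name
    HAVING COUNT(*) > 1;

    IF @Name IS NULL BREAK;   -- no duplicated names remain

    -- Remove all but one copy. Which copies go is arbitrary, which is
    -- acceptable here because the copies are identical.
    DELETE TOP (@Extra) FROM #Currency WHERE Name = @Name;
END;

SELECT CurrencyCode, Name FROM #Currency;   -- one row per name remains

DROP TABLE #Currency;
```

DELETE TOP accepts a variable only when it is wrapped in parentheses, as in `TOP (@Extra)`.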

+1

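Alternatively, you can copy the distinct rows (only those that are duplicated) into a temp table, delete all the duplicates, and re-insert the data from the temp table. – Anton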
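
A minimal sketch of this temp-table approach, under stated assumptions: `#Currency` and `#Keep` are hypothetical temp tables, the three columns mirror Sales.Currency, and every column is NOT NULL so that the equality join matches all copies.

```sql
-- Hypothetical stand-in for the keyless table.
CREATE TABLE #Currency (
    CurrencyCode nchar(3)     NOT NULL,
    Name         nvarchar(50) NOT NULL,
    ModifiedDate datetime     NOT NULL
);
INSERT INTO #Currency (CurrencyCode, Name, ModifiedDate)
VALUES (N'AFA', N'Afghani', '20020601'), (N'AFA', N'Afghani', '20020601'),
       (N'AOA', N'Kwanza',  '20020601');

-- 1. Save one copy of every row that occurs more than once.
SELECT CurrencyCode, Name, ModifiedDate
INTO #Keep
FROM #Currency
GROUP BY CurrencyCode, Name, ModifiedDate
HAVING COUNT(*) > 1;

BEGIN TRANSACTION;

-- 2. Delete every copy of those rows from the original table.
DELETE c
FROM #Currency AS c
INNER JOIN #Keep AS k
    ON  k.CurrencyCode = c.CurrencyCode
    AND k.Name         = c.Name
    AND k.ModifiedDate = c.ModifiedDate;

-- 3. Put a single copy of each back.
INSERT INTO #Currency (CurrencyCode, Name, ModifiedDate)
SELECT CurrencyCode, Name, ModifiedDate
FROM #Keep;

COMMIT TRANSACTION;

SELECT CurrencyCode, Name, ModifiedDate FROM #Currency;   -- two rows remain

DROP TABLE #Keep;
DROP TABLE #Currency;
```

Wrapping steps 2 and 3 in one transaction keeps the table from being left without the duplicated rows if the re-insert fails.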

0

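You can try this query, which is meant to delete only the duplicated records. I based it on duplicated values of currency; it deletes all the duplicated values.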

delete from test where currency in(select currency from test group by currency having count(*) >1)

+0

Thanks, but this deletes all of the rows, so it won't work. – enigma6205

1

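In your example, ROW_NUMBER() will not help you solve the problem, because the duplicated rows are identical even in the currency code, which is the primary key (candidate key) column. Since you simply inserted the same rows into the target table, the ModifiedDate column is identical as well.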

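For the sample case, you can apply the solutions described in the tutorial "delete duplicate rows where no primary key exists".

You can test and see that, for example, the DELETE command below deletes all rows in the table: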

delete [dbo].[CURRENCY] 
from [dbo].[CURRENCY] 
inner join (
    select ROW_NUMBER() over (partition by CurrencyCode order by ModifiedDate) rn, CurrencyCode, ModifiedDate from [dbo].[CURRENCY] 
) dublicates 
    on dublicates.CurrencyCode = [dbo].[CURRENCY].CurrencyCode and 
     dublicates.ModifiedDate = [dbo].[CURRENCY].ModifiedDate 
where dublicates.rn > 1 

The tutorial mentions a cursor method. If you want to use it to remove the duplicates, you can use the following:

DECLARE @Count int 
DECLARE @CurrencyCode varchar(10) 
DECLARE @ModifiedDate datetime 

DECLARE dublicate_cursor CURSOR FAST_FORWARD FOR 
SELECT CurrencyCode, ModifiedDate, Count(*) - 1 
FROM CURRENCY 
GROUP BY CurrencyCode, ModifiedDate 
HAVING Count(*) > 1 

OPEN dublicate_cursor 

FETCH NEXT FROM dublicate_cursor INTO @CurrencyCode, @ModifiedDate, @Count 

WHILE @@FETCH_STATUS = 0 
BEGIN 

SET ROWCOUNT @Count 
DELETE FROM CURRENCY WHERE CurrencyCode = @CurrencyCode AND ModifiedDate = @ModifiedDate 
SET ROWCOUNT 0 

FETCH NEXT FROM dublicate_cursor INTO @CurrencyCode, @ModifiedDate, @Count 
END 

CLOSE dublicate_cursor 
DEALLOCATE dublicate_cursor 
1

To do this with a subquery, use the following approach.

DELETE t 
FROM (SELECT NAME,ROW_NUMBER() OVER (PARTITION BY NAME ORDER BY NAME) AS Flag 
       FROM SQLPractice.[dbo].[CURRENCY] 
      ) t 
WHERE t.Flag > 1 
GO 

You can also use a common table expression (CTE) for this purpose.

;WITH cte_1 
AS (SELECT NAME,ROW_NUMBER() OVER (PARTITION BY NAME ORDER BY NAME) AS Flag 
       FROM SQLPractice.[dbo].[CURRENCY] 
      ) 
DELETE FROM cte_1 
WHERE Flag > 1