2016-04-29

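Skip duplicates when importing data into SQL Server

I upload data from an XML file into a SQL Server database. When I import the same file again, every row is duplicated.

I tried using DISTINCT to remove the duplicate rows afterwards, but each new import still duplicates the rows.

How can I skip duplicates when importing data into a SQL Server database using the DISTINCT approach?

My table: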

Create table HallSeat 
(
    HallGroupID int, 
    ShowSeatID int, 
    Color nvarchar(15), 
    Price int, 
    SeatRow int,  
    SeatNumber int, 
    IsReserved bit 
) 

The SQL DISTINCT statements:

SELECT DISTINCT * 
INTO tempdb.dbo.tmpTable 
FROM HallSeat 

DELETE FROM HallSeat 

INSERT INTO HallSeat 
    SELECT * 
    FROM tempdb.dbo.tmpTable 

DROP TABLE tempdb.dbo.tmpTable 

Please provide a sample of the data you are importing. –


I have uploaded the XML file I am importing: http://s000.tinyupload.com/index.php?file_id=00087989931748177566 – Paulius

Answer


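You can do this with the T-SQL MERGE statement. It matches the imported rowset against your HallSeat table. If a row does not exist, it is inserted. If a row does exist and differs, it can be updated.

(You probably do not want the delete operation, but I have included it for completeness.)

See Books Online > MERGE (Transact-SQL): https://msdn.microsoft.com/en-GB/library/bb510625.aspx

To demonstrate, first create two tables.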

CREATE TABLE dbo.HallSeat 
(
    HallGroupID int NOT NULL, 
    ShowSeatID int NOT NULL, 
    Color nvarchar(15) NOT NULL, 
    Price int NOT NULL, 
    SeatRow int NOT NULL, 
    SeatNumber int NOT NULL, 
    IsReserved bit NOT NULL, 
    CONSTRAINT PK_HallSeat PRIMARY KEY CLUSTERED (HallGroupID, ShowSeatID) 
); 

CREATE TABLE dbo.ImportHallSeat 
(
    HallGroupID int NOT NULL, 
    ShowSeatID int NOT NULL, 
    Color nvarchar(15) NOT NULL, 
    Price int NOT NULL, 
    SeatRow int NOT NULL, 
    SeatNumber int NOT NULL, 
    IsReserved bit NOT NULL, 
    CONSTRAINT PK_ImportHallSeat PRIMARY KEY CLUSTERED (HallGroupID, ShowSeatID) 
); 

Then import the XML data file into the ImportHallSeat table:

-- Read the XML data file to be imported 
DECLARE @xml xml; 
SELECT @xml = x.a 
    FROM OPENROWSET(BULK 'F:\Work\Data.xml', SINGLE_BLOB) AS x(a); 

TRUNCATE TABLE dbo.ImportHallSeat; 

INSERT INTO dbo.ImportHallSeat(HallGroupID, ShowSeatID, Color, Price, SeatRow, SeatNumber, IsReserved) 
    SELECT T.C.value('HallGroupID[1]', 'int') AS 'HallGroupID', 
      T.C.value('ShowSeatID[1]', 'int') AS 'ShowSeatID', 
      T.C.value('Color[1]', 'nvarchar(15)') AS 'Color', 
      T.C.value('Price[1]', 'money') AS 'Price', 
      T.C.value('SeatRow[1]', 'int') AS 'SeatRow', 
      T.C.value('SeatNumber[1]', 'int') AS 'SeatNumber', 
      T.C.value('IsReserved[1]', 'bit') AS 'IsReserved' 
     FROM @xml.nodes(N'/Filharmonija/Hall/HallGroup/HallSeat') as T(C); 
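To try the shredding query without a file on disk, an inline document can stand in for Data.xml. This is only a sketch: the element layout is inferred from the XPath expressions above, the variable name @sample is hypothetical, and all values are invented for illustration.

```sql
-- Hypothetical sample document. Element names follow the XPath used above;
-- the values are made up and do not come from the real Data.xml.
DECLARE @sample xml = N'
<Filharmonija>
  <Hall>
    <HallGroup>
      <HallSeat>
        <HallGroupID>1</HallGroupID>
        <ShowSeatID>100</ShowSeatID>
        <Color>Red</Color>
        <Price>20</Price>
        <SeatRow>1</SeatRow>
        <SeatNumber>5</SeatNumber>
        <IsReserved>0</IsReserved>
      </HallSeat>
    </HallGroup>
  </Hall>
</Filharmonija>';

-- Shred each HallSeat element into one relational row.
SELECT T.C.value('HallGroupID[1]', 'int')    AS HallGroupID,
       T.C.value('ShowSeatID[1]', 'int')     AS ShowSeatID,
       T.C.value('Color[1]', 'nvarchar(15)') AS Color,
       T.C.value('Price[1]', 'int')          AS Price,
       T.C.value('SeatRow[1]', 'int')        AS SeatRow,
       T.C.value('SeatNumber[1]', 'int')     AS SeatNumber,
       T.C.value('IsReserved[1]', 'bit')     AS IsReserved
FROM @sample.nodes(N'/Filharmonija/Hall/HallGroup/HallSeat') AS T(C);
```

If the real file stores these fields as attributes rather than child elements, the paths would need to change accordingly (for example '@Color' instead of 'Color[1]').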

We can then update the HallSeat table from the imported data:

MERGE 
    INTO dbo.HallSeat AS H 
    USING dbo.ImportHallSeat AS I 
    ON I.HallGroupID = H.HallGroupID AND I.ShowSeatID = H.ShowSeatID 
    -- Update when either compared column differs (OR, so a change to only one column is not missed)
    WHEN MATCHED AND (H.Color <> I.Color OR H.Price <> I.Price) 
     THEN UPDATE SET H.Color = I.Color, H.Price = I.Price 
    -- Insert rows that exist only in the import table
    WHEN NOT MATCHED BY TARGET 
     THEN INSERT (HallGroupID, ShowSeatID, Color, Price, SeatRow, SeatNumber, IsReserved) 
      VALUES (I.HallGroupID, I.ShowSeatID, I.Color, I.Price, I.SeatRow, I.SeatNumber, I.IsReserved) 
    -- Delete rows that are missing from the import table (optional)
    WHEN NOT MATCHED BY SOURCE 
     THEN DELETE; 
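If the goal is only to skip rows that already exist, without updating or deleting anything, an insert-only statement may be enough. The following is a minimal sketch of that alternative, assuming the dbo.HallSeat and dbo.ImportHallSeat tables created earlier and assuming that (HallGroupID, ShowSeatID) uniquely identifies a seat:

```sql
-- Insert only the staged rows whose key is not yet present in the target.
-- Existing rows in dbo.HallSeat are left unchanged and nothing is deleted.
INSERT INTO dbo.HallSeat
    (HallGroupID, ShowSeatID, Color, Price, SeatRow, SeatNumber, IsReserved)
SELECT I.HallGroupID, I.ShowSeatID, I.Color, I.Price,
       I.SeatRow, I.SeatNumber, I.IsReserved
FROM dbo.ImportHallSeat AS I
WHERE NOT EXISTS
(
    SELECT 1
    FROM dbo.HallSeat AS H
    WHERE H.HallGroupID = I.HallGroupID
      AND H.ShowSeatID  = I.ShowSeatID
);
```

Running this twice against the same import data should leave the second run inserting zero rows, which is the "skip duplicates" behaviour asked for in the question.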

Show the data that has been loaded into the HallSeat table:

SELECT * 
    FROM dbo.HallSeat; 
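To confirm that no key occurs more than once, a grouped count can help. This is a sketch that assumes (HallGroupID, ShowSeatID) is the intended key; with the primary key above in place it should always return nothing, so it is mostly useful against a table without a key, such as the one in the question.

```sql
-- List every key that appears more than once.
-- An empty result means the table holds no duplicate seats.
SELECT HallGroupID, ShowSeatID, COUNT(*) AS DuplicateCount
FROM dbo.HallSeat
GROUP BY HallGroupID, ShowSeatID
HAVING COUNT(*) > 1;
```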



Thank you very much, richard345! This is the perfect answer. – Paulius