2017-04-14

LEFT JOIN causes duplicate results

I have put together a simplified version of my problem here.

Scenario

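  • My application has users, files, and folders.
  • Users can create private files that only they can see, and shared files that all users can see.
  • Users can create private folders to organize their private files and shared files. Folder assignment is optional. If a user has not assigned a file to a folder, the file is shown in an "Uncategorized" section.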

Schema

-- ------------------------------------- 
-- User 
-- ------------------------------------- 

CREATE TABLE [User] (
    [Id] VARCHAR(50) NOT NULL 
); 

INSERT INTO [User] 
    VALUES ('user_1'); 
INSERT INTO [User] 
    VALUES ('user_2'); 

-- ------------------------------------- 
-- Folder 
-- ------------------------------------- 

CREATE TABLE [Folder] (
    [Id] VARCHAR(50) NOT NULL, 
    [UserId] VARCHAR(50) NOT NULL 
); 

-- Each user has a private folder 
INSERT INTO [Folder] 
    VALUES ('user1_folder', 'user_1'); 
INSERT INTO [Folder] 
    VALUES ('user2_folder', 'user_2'); 

-- ------------------------------------- 
-- File 
-- ------------------------------------- 

CREATE TABLE [File] (
    [Id] VARCHAR(50) NOT NULL, 
    [UserId] VARCHAR(50) NULL 
); 

-- Private files 
INSERT INTO [File] 
    VALUES ('user1_file1', 'user_1'); 
INSERT INTO [File] 
    VALUES ('user1_file2', 'user_1'); 

INSERT INTO [File] 
    VALUES ('user2_file1', 'user_2'); 
INSERT INTO [File] 
    VALUES ('user2_file2', 'user_2'); 

-- Shared files 
INSERT INTO [File] 
    VALUES ('shared_file1', NULL); 
INSERT INTO [File] 
    VALUES ('shared_file2', NULL); 
INSERT INTO [File] 
    VALUES ('shared_file3', NULL); 
-- UPDATE: new case 
INSERT INTO [File] 
    VALUES ('shared_file4', NULL); 

-- ------------------------------------- 
-- FolderFile Association 
-- ------------------------------------- 

CREATE TABLE [FolderFile] (
    [FolderId] VARCHAR(50) NOT NULL, 
    [FileId] VARCHAR(50) NOT NULL 
); 

-- User 1 puts some files in his private folders 
INSERT INTO [FolderFile] 
    VALUES ('user1_folder', 'user1_file'); 
INSERT INTO [FolderFile] 
    VALUES ('user1_folder', 'shared_file1'); 
INSERT INTO [FolderFile] 
    VALUES ('user1_folder', 'shared_file2'); 

-- User 2 puts some files in his private folders 
INSERT INTO [FolderFile] 
    VALUES ('user2_folder', 'user2_file'); 
INSERT INTO [FolderFile] 
    VALUES ('user2_folder', 'shared_file1'); 
-- UPDATE: new case 
INSERT INTO [FolderFile] 
    VALUES ('user2_folder', 'shared_file4'); 

Desired result

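I want to see all private and shared files for a given @UserId (user_1 in this case), along with that user's associated private folder (if any). Note that a folder is optional for a user's files.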

Attempted query #1

DECLARE @UserId VARCHAR(50) = 'user_1' 

SELECT 
    F.[Id] AS [FileId], 
    F.[UserId] AS [FileUserId], 
    FO.[Id] AS [FolderId] 
FROM 
    [File] AS F 
LEFT JOIN 
    [FolderFile] FOF ON FOF.[FileId] = F.[Id] 
LEFT JOIN 
    [Folder] FO ON FO.[Id] = FOF.[FolderId] 
WHERE 
    F.[UserId] IS NULL 
    OR F.[UserId] = @UserId 

Result #1

FileId        FileUserId  FolderId
=========================================
user1_file1   user_1      NULL
user1_file2   user_1      NULL
shared_file1  NULL        user1_folder
shared_file1  NULL        user2_folder  <== bad result
shared_file2  NULL        user1_folder
shared_file3  NULL        NULL
shared_file4  NULL        user2_folder  <== bad result

Attempted query #2

Add another condition to the ON clause of the Folder JOIN.

DECLARE @UserId VARCHAR(50) = 'user_1' 

SELECT 
    F.[Id] AS [FileId], 
    F.[UserId] AS [FileUserId], 
    FO.[Id] AS [FolderId] 
FROM 
    [File] AS F 
LEFT JOIN 
    [FolderFile] FOF ON FOF.[FileId] = F.[Id] 
LEFT JOIN 
    [Folder] FO ON FO.[Id] = FOF.[FolderId] AND FO.[UserId] = @UserId -- Add another condition here on UserId 
WHERE 
    F.[UserId] IS NULL 
    OR F.[UserId] = @UserId 

Result #2

FileId        FileUserId  FolderId
=========================================
user1_file1   user_1      NULL
user1_file2   user_1      NULL
shared_file1  NULL        user1_folder
shared_file1  NULL        NULL          <== bad result
shared_file2  NULL        user1_folder
shared_file3  NULL        NULL
shared_file4  NULL        NULL

Analysis

As you can see above, the association with user_2's folder causes an extra row to be returned for user_1. I do not want this row to be included.

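If the FolderFile table had a UserId, I think I could restrict the join with a condition on it, but it does not. The UserId is implied through the associated Folder. The LEFT JOIN on the association propagates NULL, and the row then passes the condition below it.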

I have run out of ideas, but it is probably something obvious :)

Update #1

I have added a new case with shared_file4, which is in a folder for user_2 but not for user_1. It should be included in the results for both users.

INSERT INTO [File] 
    VALUES ('shared_file4', NULL); 

INSERT INTO [FolderFile] 
    VALUES ('user2_folder', 'shared_file4'); 
+0

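If the only way you represent _shared_ files is by assigning NULL as the file's owner, and there is no representation of a _shared_ or _private_ folder, then it will be hard to tell that a shared file is private within some folder. It might be better to use a 'bit' to indicate the shared/private status of each file and folder, which would also let you track ownership independently of sharing by assigning a 'UserId' to every file and folder. – HABO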

+0

Setting the table design aside, can you explain why there is 'INSERT INTO [FolderFile] VALUES ('user2_folder', 'shared_file1');'? –

+0

@PhamX.Bach Because folder assignments are private. The query should only return results for 'user_1'. – kspearrin

Answers

0

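Although many of the answers given were able to return the result set I needed, they did not produce very good query plans. I ultimately decided that the best way to get the result set I wanted with the best performance was to denormalize the FolderFile table and add a UserId column. With this column available, I can use standard joins similar to my original query attempts, filtering on the user at the FolderFile LEFT JOIN.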

CREATE TABLE [FolderFile] (
    [FolderId] VARCHAR(50) NOT NULL, 
    [FileId] VARCHAR(50) NOT NULL, 
    [UserId] VARCHAR(50) NOT NULL 
); 

SELECT 
    F.[Id] AS [FileId], 
    F.[UserId] AS [FileUserId], 
    FO.[Id] AS [FolderId] 
FROM 
    [File] AS F 
LEFT JOIN 
    [FolderFile] FOF ON FOF.[FileId] = F.[Id] AND FOF.[UserId] = @UserId 
LEFT JOIN 
    [Folder] FO ON FO.[Id] = FOF.[FolderId] 
WHERE 
    F.[UserId] IS NULL 
    OR F.[UserId] = @UserId 
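
How the new UserId column gets its values is not shown above. As a rough sketch, assuming the owner is simply copied from the parent Folder row, the column could be populated and then used like this. The table variables and the name @OldFolderFile are stand-ins invented for illustration; they are not part of the original schema.

```sql
-- Sketch only: table variables stand in for the real tables.
DECLARE @UserId VARCHAR(50) = 'user_1';

DECLARE @Folder TABLE (Id VARCHAR(50) NOT NULL, UserId VARCHAR(50) NOT NULL);
DECLARE @File TABLE (Id VARCHAR(50) NOT NULL, UserId VARCHAR(50) NULL);
-- @OldFolderFile (hypothetical name) is the association before denormalizing.
DECLARE @OldFolderFile TABLE (FolderId VARCHAR(50) NOT NULL, FileId VARCHAR(50) NOT NULL);
DECLARE @FolderFile TABLE (
    FolderId VARCHAR(50) NOT NULL,
    FileId VARCHAR(50) NOT NULL,
    UserId VARCHAR(50) NOT NULL
);

INSERT INTO @Folder VALUES ('user1_folder', 'user_1'), ('user2_folder', 'user_2');
INSERT INTO @File VALUES
    ('user1_file1', 'user_1'), ('shared_file1', NULL),
    ('shared_file3', NULL), ('shared_file4', NULL);
INSERT INTO @OldFolderFile VALUES
    ('user1_folder', 'shared_file1'),
    ('user2_folder', 'shared_file1'),
    ('user2_folder', 'shared_file4');

-- Assumption: the owner of an association row is the owner of its folder.
INSERT INTO @FolderFile (FolderId, FileId, UserId)
SELECT O.FolderId, O.FileId, FO.UserId
FROM @OldFolderFile AS O
INNER JOIN @Folder AS FO ON FO.Id = O.FolderId;

-- Filtering in the ON clause keeps other users' associations out of the
-- join without dropping the file row itself.
SELECT F.Id AS FileId, F.UserId AS FileUserId, FOF.FolderId
FROM @File AS F
LEFT JOIN @FolderFile AS FOF
    ON FOF.FileId = F.Id AND FOF.UserId = @UserId
WHERE F.UserId IS NULL OR F.UserId = @UserId
ORDER BY F.Id;
```

With this sample data each visible file should come back once, with a folder only for shared_file1. That holds only as long as a user places a given file in at most one of their folders; the sketch reads FolderId straight from the association row, so the second join to Folder is not needed just to get the id.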
0

The FolderFile table shows that "shared_file1" exists in both user1_folder and user2_folder. Is that correct?

(Sorry, I do not have enough points to add a comment.)

+0

Yes, that is correct, and it is what causes the problem with my current query. The query should only return results for 'user_1'. – kspearrin

+0

Why do you want to return "shared_file1" for 'user_1'? If I returned it only for 'user_2', that would still be correct, wouldn't it? – TriV

0

Please try the following...

DECLARE @UserId VARCHAR(50) = 'user_1' 

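-- FILE is a reserved keyword in T-SQL, so the table name is bracketed.
SELECT [File].Id AS FileId,
       [File].UserId AS FileUserId,
       Folder.Id AS FolderId
FROM [File]
LEFT JOIN FolderFile ON FolderFile.FileId = [File].Id
LEFT JOIN Folder ON Folder.Id = FolderFile.FolderId
                AND Folder.UserId = @UserId
WHERE ([File].UserId IS NULL OR
       [File].UserId = @UserId)
  -- Column aliases cannot be referenced in WHERE; use the source columns.
  AND ([File].UserId IS NOT NULL OR
       Folder.Id IS NOT NULL)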

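Your second attempted query was close; you just need to add a clause that excludes rows where both fields are NULL (equivalently, one that includes only rows where at least one field is not NULL).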

If you have any questions or comments, please feel free to post a comment accordingly.

Further reading

https://www.w3schools.com/sql/sql_null_values.asp

+0

This does not include 'shared_file_3', which is not in a folder. – kspearrin

0

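The extra row is introduced by the first LEFT JOIN, to FolderFile, not by the LEFT JOIN to Folder, so adding an extra join condition on the Folder table will not eliminate that row.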

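However, you can filter the rows in the WHERE clause. Since you want rows for shared files that are not assigned to any folder, such as shared_file3, or rows for shared files linked to a folder belonging to @UserId, just add the following filter to query 1.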

DECLARE @UserId VARCHAR(50) = 'user_1' 

SELECT 
    F.[Id] AS [FileId], 
    F.[UserId] AS [FileUserId], 
    FO.[Id] AS [FolderId] 
FROM 
    [File] AS F 
LEFT JOIN 
    [FolderFile] FOF ON FOF.[FileId] = F.[Id] 
LEFT JOIN 
    [Folder] FO ON FO.[Id] = FOF.[FolderId] 
WHERE (F.[UserId] IS NULL OR F.[UserId] = @UserId) 
    AND (FO.UserId IS NULL OR FO.UserId = @UserId) 
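
The claim at the start of this answer, that the duplicate comes from the FolderFile join, can be checked by counting the rows each file produces after only the first LEFT JOIN. This is a minimal sketch; the table variables are stand-ins for the question's tables and hold just two of its files.

```sql
DECLARE @File TABLE (Id VARCHAR(50) NOT NULL, UserId VARCHAR(50) NULL);
DECLARE @FolderFile TABLE (FolderId VARCHAR(50) NOT NULL, FileId VARCHAR(50) NOT NULL);

INSERT INTO @File VALUES ('shared_file1', NULL), ('shared_file3', NULL);
INSERT INTO @FolderFile VALUES
    ('user1_folder', 'shared_file1'),
    ('user2_folder', 'shared_file1');

-- Only the first LEFT JOIN; the Folder table is not involved at all.
SELECT F.Id AS FileId, COUNT(*) AS RowsAfterFirstJoin
FROM @File AS F
LEFT JOIN @FolderFile AS FOF ON FOF.FileId = F.Id
GROUP BY F.Id
ORDER BY F.Id;
```

shared_file1 already yields two rows at this point and shared_file3 yields one. A condition on the later Folder join can therefore only turn the second row's folder into NULL; it cannot remove the row.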

Update

If you only want to include private folders that belong to the user, while still including all shared files, then the following should do it.

DECLARE @UserId VARCHAR(50) = 'user_1' 

SELECT 
    F.[Id] AS [FileId], 
    F.[UserId] AS [FileUserId], 
    FO.[Id] AS [FolderId] 
FROM 
    [File] AS F 
LEFT JOIN 
    [FolderFile] FOF ON FOF.[FileId] = F.[Id] 
LEFT JOIN 
    [Folder] FO ON FO.[Id] = FOF.[FolderId] AND FO.UserId = @UserId 
WHERE (F.[UserId] IS NULL OR F.[UserId] = @UserId) 
    AND (FO.UserId IS NULL OR FO.UserId = @UserId) 
+0

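This one looks like it returns the correct results! Let me run it through a few more conditions to verify it fully, and if all goes well I will accept. – kspearrin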

+0

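I found a condition that still causes problems with this query. I updated the question with 'INSERT INTO [File] VALUES ('shared_file4', NULL); INSERT INTO [FolderFile] VALUES ('user2_folder', 'shared_file4');'. This adds a new 'shared_file4' that is in a private folder of 'user_2'. 'shared_file4' should be returned for 'user_1', but it is not. – kspearrin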

+0

@kspearrin That should handle the extra case. – tep

0

You can try this:

DECLARE @UserId VARCHAR(50) = 'user_1' 

SELECT 
    F.[Id] AS [FileId], 
    F.[UserId] AS [FileUserId], 
    FO.[Id] AS [FolderId] 
FROM 
    [File] AS F 
LEFT JOIN 
    [FolderFile] FOF ON FOF.[FileId] = F.[Id] 
LEFT JOIN 
    [Folder] FO ON FO.[Id] = FOF.[FolderId] 
WHERE 
    FO.[UserId] = @UserId 
    OR F.[UserId] = @UserId; 

Edit: for shared_file_3, which has a NULL user id and is not in any folder: if, in your design, it should appear among the shared files of all users, then you should use:

DECLARE @UserId VARCHAR(50) = 'user_1' 

SELECT 
    F.[Id] AS [FileId], 
    F.[UserId] AS [FileUserId], 
    FO.[Id] AS [FolderId] 
FROM 
    [File] AS F 
LEFT JOIN 
    [FolderFile] FOF ON FOF.[FileId] = F.[Id] 
LEFT JOIN 
    [Folder] FO ON FO.[Id] = FOF.[FolderId] 
WHERE 
    FO.[UserId] = @UserId 
    OR F.[UserId] = @UserId 
    OR (FO.[UserId] IS NULL AND F.[UserId] IS NULL); 
+0

This does not include 'shared_file_3', which is not in a folder. – kspearrin

+0

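@kspearrin If, in your database design, a file with a NULL owner that is not in any folder is a shared file for ALL users, then you can use the second query in my edited answer. –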

+0

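Your second query now correctly includes 'shared_file_3'. However, I found a condition that still causes problems with this query. I updated the question with 'INSERT INTO [File] VALUES ('shared_file4', NULL); INSERT INTO [FolderFile] VALUES ('user2_folder', 'shared_file4');'. This adds a new 'shared_file_4' that is in a private folder of 'user_2'. 'shared_file_4' should be returned for 'user_1', but it is not. – kspearrin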

1

There are already some good answers using LEFT JOIN. I decided to play with CTEs to see whether I could write a very expressive answer. Enjoy:

DECLARE @UserId VARCHAR(50) = 'user_1' 

;WITH 
PrivateFile (FileId) AS 
(
    SELECT Id FROM [File] 
    WHERE UserId = @UserId 
), 
SharedFile (FileId) AS 
(
    SELECT Id FROM [File] 
    WHERE UserId is null 
), 
AnyFile ([FileId]) AS 
(
    SELECT FileId FROM PrivateFile 
    UNION 
    SELECT FileId FROM SharedFile 
), 
PrivateFolder (FolderId) AS 
(
    SELECT Id FROM [Folder] 
    WHERE UserId = @UserId 
), 
AssociatedFolder ([FileId], [FolderId]) AS 
(
    SELECT ff.FileId, ff.FolderId 
    FROM [FolderFile] ff 
    JOIN PrivateFolder pf ON ff.FolderId = pf.FolderId 
) 
SELECT f.[FileId], @UserId as UserId, fo.[FolderId] 
FROM AnyFile as f 
    LEFT JOIN AssociatedFolder as fo ON f.[FileId] = fo.[FileId] 
1

This gives the desired answer for the given data.

-- Sample data. 
declare @Users as Table (UserId VarChar(50) not NULL); 
insert into @Users (UserId) values 
    ('user_1'), ('user_2'); 

declare @Folders as Table (FolderId VarChar(50) not NULL, UserId VarChar(50) not NULL); 
insert into @Folders (FolderId, UserId) values 
    ('user1_folder', 'user_1'), ('user2_folder', 'user_2'); 

declare @Files as Table (FileId VarChar(50) not NULL, UserId VarChar(50) NULL); 
insert into @Files (FileId, UserId) values 
    -- Private files. 
    ('user1_file1', 'user_1'), ('user1_file2', 'user_1'), 
    ('user2_file1', 'user_2'), ('user2_file2', 'user_2'), 
    -- Shared files. 
    ('shared_file1', NULL), ('shared_file2', NULL), ('shared_file3', NULL), ('shared_file4', NULL); 

declare @FileFolders as Table (FolderId VarChar(50) not NULL, FileId VarChar(50) not NULL); 
insert into @FileFolders (FolderId, FileId) values 
    -- User 1 puts some files in his private folders. 
    ('user1_folder', 'user1_file'), ('user1_folder', 'shared_file1'), ('user1_folder', 'shared_file2'), 
    -- User 2 puts some files in his private folders. 
    ('user2_folder', 'user2_file'), ('user2_folder', 'shared_file1'), ('user2_folder', 'shared_file4'); 

select * from @Users; 
select * from @Files; 
select * from @Folders; 
select * from @FileFolders; 

-- Query the data. 
declare @UserId as VarChar(50) = 'user_1'; 

with 
    -- Any file with a UserId of NULL is shared. 
    -- If it is in the given user's folders then pick up the folder. 
    SharedFiles as (
    select Fi.FileId, Max(Fi.UserId) as UserId, Max(Fo.FolderId) as FolderId 
     from @Files as Fi left outer join 
     @FileFolders as FF on FF.FileId = Fi.FileId left outer join 
     @Folders as Fo on Fo.FolderId = FF.FolderId and (Fo.UserId = @UserId or FF.FileId is NULL) 
     where Fi.UserId is NULL 
     group by Fi.FileId), 
    -- Any file with a non-NULL UserId is private. 
    -- Find all of the given user's files. 
    PrivateFiles as (
    select Fi.FileId, Fi.UserId, Fo.FolderId 
     from @Files as Fi left outer join 
     @FileFolders as FF on FF.FileId = Fi.FileId left outer join 
     @Folders as Fo on Fo.FolderId = FF.FolderId and Fo.UserId = @UserId 
     where Fi.UserId = @UserId) 
    select FileId, UserId, FolderId 
    from PrivateFiles 
    union all 
    select FileId, UserId, FolderId 
    from SharedFiles; 
1

I changed your query #2 a little, using ROW_NUMBER:

;WITH temp AS 
(
    SELECT 
    F.[Id] AS [FileId], 
    F.[UserId] AS [FileUserId], 
    FO.[Id] AS [FolderId], 
    row_number() OVER(PARTITION BY F.Id ORDER BY FO.Id DESC) AS Rn 
    -- if folder id not null (it means that folder belongs to @UserId) 
    --> it will be the first priority -- Rownumber = 1 
    FROM 
    [File] AS F 
    LEFT JOIN 
    [FolderFile] FOF ON FOF.[FileId] = F.[Id] 
    LEFT JOIN 
    [Folder] FO ON FO.[Id] = FOF.[FolderId] AND FO.[UserId] = @UserId 
    WHERE 
    F.[UserId] IS NULL 
    OR F.[UserId] = @UserId 
) 
SELECT t.FileId, t.FileUserId, t.FolderId FROM temp t 
WHERE rn = 1 
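
The ORDER BY FO.Id DESC in the window relies on SQL Server sorting NULL as the lowest value, so within each file a non-NULL folder id is numbered first. A small sketch of just that behaviour, using a made-up table variable @Joined that mimics the joined rows:

```sql
-- @Joined (hypothetical) mimics the rows produced by the two LEFT JOINs.
DECLARE @Joined TABLE (FileId VARCHAR(50) NOT NULL, FolderId VARCHAR(50) NULL);
INSERT INTO @Joined VALUES
    ('shared_file1', NULL),           -- association with another user's folder
    ('shared_file1', 'user1_folder'), -- association with @UserId's folder
    ('shared_file4', NULL);

-- NULL sorts lowest in SQL Server, so DESC puts the non-NULL folder at Rn = 1.
SELECT FileId, FolderId,
    ROW_NUMBER() OVER (PARTITION BY FileId ORDER BY FolderId DESC) AS Rn
FROM @Joined;
```

Keeping only Rn = 1 then retains the user1_folder row for shared_file1 and the single NULL row for shared_file4. If a user could place the same file in several of their own folders, this would keep just one of those folders.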
1

Using outer apply():

declare @UserId varchar(50) = 'user_1'; 
select 
    FileId = F.Id 
    , FileUserId = F.UserId 
    , FolderId = x.Id 
from [File] as F 
    outer apply (
    select top 1 
     Id = case when fo.UserId = @UserId then fo.Id else null end 
    from [FolderFile] fof 
     left join [Folder] fo 
     on fo.Id = fof.FolderId 
    where fof.FileId = f.id 
    order by case when fo.UserId = @UserId then 0 else 1 end 
    ) as x 
where (f.UserId = @UserId or f.UserId is null); 

rextester demo: http://rextester.com/YEAMZ12650

Returns:

+--------------+------------+--------------+
| FileId       | FileUserId | FolderId     |
+--------------+------------+--------------+
| user1_file1  | user_1     | NULL         |
| user1_file2  | user_1     | NULL         |
| shared_file1 | NULL       | user1_folder |
| shared_file2 | NULL       | user1_folder |
| shared_file3 | NULL       | NULL         |
| shared_file4 | NULL       | NULL         |
+--------------+------------+--------------+