2011-05-09

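Identifying duplicate combinations in an n-field data table on Oracle

I managed to write an SQL query that updates duplicate rows (rows holding the same combination of values) to null for a 2-field table. However, I am stuck on tables with more than 2 fields.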

My solution for 2 fields is:

Insert test data into the combinations table:

create table combinations as 
select 1 col1, 2 col2 from dual --row1 
union all 
select 2, 1 from dual --row2 
union all 
select 1, 3 from dual --row3 
union all 
select 1,4 from dual; --row4

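In the combinations table, row1 and row2 are duplicates, because the order of the elements does not matter.

Update the duplicate combinations to null for 2 fields (this sets row2 to null):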

update combinations 
set col1=null, col2=null 
where rowid IN(
select x.rid from (
    select 
     rowid rid, 
     col1, 
     col2, 
     row_number() over (partition by least(col1,col2), greatest(col1,col2) 
           order by rownum) duplicate_row 
    from combinations) x 
where duplicate_row > 1); 

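My code above relies on the least() and greatest() functions, which is why it works so neatly. Any ideas on how to adapt this code to a 3-field table?

Insert test data into the combinations2 table (3 fields):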
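
To make the role of least() and greatest() concrete, here is a minimal sketch that only previews the order-independent key the update partitions on. The aliases lo and hi are illustrative names, not part of the original query.

```sql
-- Preview the key used by "partition by least(col1,col2), greatest(col1,col2)".
-- Rows (1,2) and (2,1) both map to lo = 1, hi = 2, so they land in the
-- same partition and the second one gets duplicate_row = 2.
select col1,
       col2,
       least(col1, col2)    as lo,  -- smaller of the two values
       greatest(col1, col2) as hi   -- larger of the two values
from combinations;
```

With two columns the smallest and largest value describe the whole row, which is why this trick does not carry over directly to three or more columns.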

create table combinations2 as 
select 1 col1, 2 col2, 3 col3 from dual --row1 
union all 
select 2, 1, 3 from dual --row2 
union all 
select 1, 3, 2 from dual; --row3

In the 3-field combinations2 table, row1, row2 and row3 are all equal. My goal is to update row2 and row3 to null.


This looks like the same question: http://stackoverflow.com/questions/5924118/sql-and-unique-n-coulmn-combinations – 2011-05-09 14:51:47

Answers

update combinations2 
set col1 = NULL 
    , col2 = NULL 
    , col3 = NULL 
where rowid in (
      select r 
      from 
       (
       -- STEP 4 
       select r, row_number() over(partition by colls order by colls) duplicate_row 
       from 
        (
        -- STEP 3 
        select r, c1 || '_' || c2 || '_' || c3 colls 
        from 
         (
         -- STEP 2 
         select r 
           , max(case when rn = 1 then val else null end) c1 
           , max(case when rn = 2 then val else null end) c2 
           , max(case when rn = 3 then val else null end) c3 
         from 
          (
          -- STEP 1 
          select r 
            , val 
            , row_number() over(partition by r order by val) rn 
          from 
           (
            select rowid as r, col1 as val 
            from combinations2 
           union all 
            select rowid, col2 
            from combinations2 
           union all 
            select rowid, col3 
            from combinations2 
           ) 
          ) 
         group by r 
         ) 
        ) 
       ) 
      where duplicate_row > 1 
      ) 
; 
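  • Step 1: sort the values within each row
  • Step 2: rebuild the rows from the sorted values
  • Step 3: concatenate the columns into one string
  • Step 4: find the duplicates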
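
Steps 1 to 3 can be inspected on their own before running the update. The sketch below is not the answer's code: it swaps the max(case ...) pivot for LISTAGG, which assumes Oracle 11g Release 2 or later. It produces the same kind of sorted key string per row.

```sql
-- Build one order-independent key per row (steps 1-3 in a single query).
-- Assumes Oracle 11gR2+ for LISTAGG; r and colls follow the answer's naming.
select r,
       listagg(val, '_') within group (order by val) as colls
from (
      -- unpivot: one (rowid, value) pair per column
      select rowid as r, col1 as val from combinations2
      union all
      select rowid, col2 from combinations2
      union all
      select rowid, col3 from combinations2
     )
group by r;
```

On the three test rows every key comes out as 1_2_3, which is what step 4 then numbers within each partition. Adding a column only needs one more union all branch here, whereas the pivot version also needs an extra max(case ...) line and an extra term in the concatenation.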

Nice, but it needs a small modification: it does not work if you add more values, for example 1,4,2 and 4,2,1 – mcha 2011-05-09 15:01:58


In step 4 it should be row_number() over(partition by colls order by colls) – mcha 2011-05-09 15:08:52


Thanks for the report. I somehow lost it while formatting. I have edited my answer. It should work fine now. – schurik 2011-05-09 15:54:22
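
mcha's case can be checked independently by adding the two extra rows to a freshly created combinations2 table (before running the update) and counting rows per order-independent key. The sketch below is a cross-check, not the answer's method: it assumes exactly three non-null numeric columns, so the middle value can be derived as the sum minus the least and greatest values. The aliases lo, mid, hi and row_count are illustrative.

```sql
-- Extra test rows from mcha's comment
insert into combinations2 values (1, 4, 2);
insert into combinations2 values (4, 2, 1);

-- Count rows per sorted triple (lo, mid, hi).
-- Assumption: three non-null numeric columns only.
select lo, mid, hi, count(*) as row_count
from (
      select least(col1, col2, col3) as lo,
             col1 + col2 + col3
               - least(col1, col2, col3)
               - greatest(col1, col2, col3) as mid,
             greatest(col1, col2, col3) as hi
      from combinations2
     )
group by lo, mid, hi
order by lo, mid, hi;
-- Expected on the five test rows: (1,2,3) with row_count 3
--                                 (1,2,4) with row_count 2
```

After the update, all but one row in each of these groups should be null, so three rows in total are cleared.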