2017-08-26
0

Group and count duplicate values stored as comma-separated lists

SITE_NAME | CATEGORY | 
---------------------- 
SITE1 | CAR, TRAVEL 
SITE2 | TRAVEL 
SITE3 | SPORT, GAME 
SITE4 | GAME 
SITE5 | CAR 
SITE6 | TRAVEL 
SITE7 | GAME 

I want to count the duplicate values in this table, so I use this:

SELECT category, COUNT(*) FROM table_db GROUP BY category HAVING COUNT(*) >= 1 

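This works for grouping rows whose 'category' values are identical, but it treats 'CAR, TRAVEL' as a different value from 'CAR'. I want the items inside such a list to be counted as duplicates too.

The query returns these groups: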

CAR, TRAVEL 
TRAVEL 
SPORT, GAME 
CAR 
GAME 

I want it to look like this:

CAR 
TRAVEL 
SPORT 
GAME 
+6

You really should change the design of your original table. Never store multiple values in a single column!

+2

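This schema blatantly violates the [Zero, One or Infinity Rule](http://en.wikipedia.org/wiki/Zero_one_infinity_rule) of [database normalization](http://en.wikipedia.org/wiki/Database_normalization). If you restructured it into a proper normal form, this would be trivial. – tadman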
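To illustrate what the comment above means by a proper normal form, here is a minimal sketch. The table name `site_category` and its column sizes are hypothetical and not part of the original schema; the rows are the sample data from the question, one row per site and category pair. With that layout the plain `GROUP BY` from the question is enough.

```sql
-- Hypothetical junction table: one row per (site, category) pair.
CREATE TABLE site_category (
    site_name VARCHAR(50) NOT NULL,
    category  VARCHAR(50) NOT NULL,
    PRIMARY KEY (site_name, category)
);

-- Sample data from the question, with each list split into rows.
INSERT INTO site_category (site_name, category) VALUES
    ('SITE1', 'CAR'), ('SITE1', 'TRAVEL'),
    ('SITE2', 'TRAVEL'),
    ('SITE3', 'SPORT'), ('SITE3', 'GAME'),
    ('SITE4', 'GAME'),
    ('SITE5', 'CAR'),
    ('SITE6', 'TRAVEL'),
    ('SITE7', 'GAME');

-- Counting categories no longer needs any string splitting.
SELECT category, COUNT(*) AS total
FROM site_category
GROUP BY category
ORDER BY category;
```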

+0

I can't redesign the database. I meant to explain that in the post.

Answer

0

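While I completely agree with the other comments about the database design, if for some reason you are stuck with it, you will need to create a split function. Something like this (PostgreSQL):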

CREATE FUNCTION public.fnsplit(
    IN stringlist character varying, 
    IN delimit character varying) 
    RETURNS TABLE(items character varying) AS 
$BODY$ 
declare remainderlist character varying; 
declare front character varying; 
declare delimitpos integer; 
begin 
    drop table if exists tmptbl; 
    create temp table tmptbl(items character varying); 
    remainderlist := $1; 
    delimitpos := strpos(remainderlist, $2); 
    while delimitpos > 0 loop 
     front := trim(both from(left(remainderlist, delimitpos -1))); 
     -- skip the whole delimiter, not just its first character
     remainderlist := substr(remainderlist, delimitpos + length($2)); 
     if length(front) > 0 then 
      insert into tmptbl values (front); 
     end if; 
     delimitpos := strpos(remainderlist, $2); 
    end loop; 
    --insert last value 
    remainderlist := trim(both from remainderlist); 
    if length(remainderlist) > 0 then 
     insert into tmptbl values (remainderlist); 
    end if; 
    return query 
     select * from tmptbl; 
     return; 
end; 
$BODY$ 
    LANGUAGE plpgsql VOLATILE 
    COST 100 
    ROWS 1000; 

You can then use it in your select like this:

SELECT category, COUNT (*) FROM 
(SELECT fnsplit(category, ', ') as category FROM table_db) d 
group by category having count(*) >= 1; 

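I can't stress enough, though, that this should be a last resort!

Edit

It was pointed out that the OP wants MySQL. That is a bit trickier, because MySQL does not allow functions to return tables, so you have to use a temporary table and a stored procedure instead. The routine now looks like this: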
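As an alternative to a hand-written split function, PostgreSQL's built-in `string_to_array` and `unnest` can do the same job. The following is only a sketch: it assumes PostgreSQL 9.3 or later (for `LATERAL`) and the `table_db` table from the question; the aliases `t`, `s`, `item`, `single_category` and `total` are made up for illustration.

```sql
-- Split each comma-separated list into rows, trim the pieces, then count them.
SELECT trim(s.item) AS single_category,
       COUNT(*)     AS total
FROM table_db AS t
CROSS JOIN LATERAL unnest(string_to_array(t.category, ',')) AS s(item)
GROUP BY trim(s.item)
ORDER BY single_category;
```

Splitting on a bare comma and trimming afterwards tolerates lists written with or without a space after the comma.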

DELIMITER $$ 
CREATE PROCEDURE fnsplit(
    stringlist varchar(2000), 
    delimit varchar(20) 
) 
BEGIN 

declare remainderlist varchar(2000); 
declare front varchar(2000); 
declare delimitpos integer; 

    SET remainderlist = stringlist; 
    SET delimitpos = position(delimit in remainderlist); 
    while delimitpos > 0 do 
     SET front = trim(both from(left(remainderlist, delimitpos -1))); 
     -- skip the whole delimiter, not just its first character
     SET remainderlist = substr(remainderlist, delimitpos + char_length(delimit)); 
     if length(front) > 0 then 
      -- same spelling as in CREATE TEMPORARY TABLE; table names can be case-sensitive
      insert into tbltmpsplit values (front); 
     end if; 
     SET delimitpos = position(delimit in remainderlist); 
    end while; 
    -- insert last value
    SET remainderlist = trim(both from remainderlist); 
    if length(remainderlist) > 0 then 
     insert into tbltmpsplit values (remainderlist); 
    end if; 

END$$ 
DELIMITER ; 

Now you can call it like this:

SET @allcategories = (SELECT GROUP_CONCAT(category separator ', ') FROM table_db); 

drop table if exists tbltmpsplit; 
create temporary table tbltmpsplit(items varchar(2000)); 

call fnsplit(@allcategories, ', '); 

SELECT *, Count(*) FROM tbltmpsplit GROUP BY items having count(*) >= 1; 

drop table if exists tbltmpsplit; 

This returns:

CAR 2 
GAME 3 
SPORT 1 
TRAVEL 3
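If a stored procedure is not an option, a single MySQL query built on `SUBSTRING_INDEX` can produce the same counts. This is a sketch under stated assumptions, not a drop-in replacement: the inline numbers list `n` is hypothetical and supports at most four items per row (extend it if your lists are longer), and the aliases `item` and `total` are made up. It also avoids passing every row through `GROUP_CONCAT`, whose result is truncated at `group_concat_max_len` (1024 by default).

```sql
-- For each row, pick the n-th comma-separated item for n = 1 .. number of items.
SELECT TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(t.category, ',', n.n), ',', -1)) AS item,
       COUNT(*) AS total
FROM table_db AS t
JOIN (SELECT 1 AS n UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4) AS n
  -- number of items in the list = number of commas + 1
  ON n.n <= 1 + CHAR_LENGTH(t.category) - CHAR_LENGTH(REPLACE(t.category, ',', ''))
GROUP BY item
ORDER BY item;
```

The output alias is deliberately not called `category`: in `GROUP BY`, MySQL resolves a bare name against the table's columns before the select list, so reusing the column name would group by the unsplit lists.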