2017-04-27 118 views
4

Query to remove duplicates in SQL

I have a table called distance. It has 4 columns: id, start_from, end_to, distance.

I have some duplicate records. They are duplicates in this sense:

start_from | end_to | distance 
Chennai  Bangalore  350 
Bangalore  Chennai  350 
Chennai  Hyderabad  500 
Hyderabad  Chennai  510 

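In the table above, Chennai to Bangalore and Bangalore to Chennai both have the same distance, so I need a SELECT query that removes one of those records.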

I want output like:

start_from | end_to | distance 
Chennai  Bangalore  350 
Chennai  Hyderabad  500 
Hyderabad  Chennai  510 
+1

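Please share the exact expected output. Field values may repeat, but depending on the requirement we need to either rewrite the query or redesign the table. –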

+0

@SaurabhJhunjhunwala I have added the desired output. I can't change the table. – shiva

+0

Why do you want **Chennai** **Bangalore** rather than **Bangalore** **Chennai**? And what would you want if **Chennai** to **Hyderabad** were also 350? – Blank

Answers

2

If there is no difference between Chennai to Bangalore and Bangalore to Chennai, you can try this:

select 
    max(`start_from`) as `start_from`, 
    min(`end_to`) as `end_to`, 
    `distance` 
from yourtable 
group by 
    case when `start_from` > `end_to` then `end_to` else `start_from` end, 
    case when `start_from` > `end_to` then `start_from` else `end_to` end, 
    `distance` 

Here is a rextester demo. It also works even if Chennai to Hyderabad is 350: demo

If you want Bangalore to Chennai to be the row that remains, you can swap max and min:

select 
    min(`start_from`) as `start_from`, 
    max(`end_to`) as `end_to`, 
    `distance` 
from yourtable 
group by 
    case when `start_from` > `end_to` then `end_to` else `start_from` end, 
    case when `start_from` > `end_to` then `start_from` else `end_to` end, 
    `distance` 

demo

case when is compatible with most databases.
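
The first query can be tried without a MySQL server. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for MySQL (the backticks are dropped, and the in-memory table is only for illustration). It loads the four sample rows into yourtable and runs the same GROUP BY with case when; an ORDER BY is added only to make the printed order stable.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourtable ("
    "id INTEGER PRIMARY KEY, start_from TEXT, end_to TEXT, distance INTEGER)"
)
conn.executemany(
    "INSERT INTO yourtable (start_from, end_to, distance) VALUES (?, ?, ?)",
    [
        ("Chennai", "Bangalore", 350),
        ("Bangalore", "Chennai", 350),
        ("Chennai", "Hyderabad", 500),
        ("Hyderabad", "Chennai", 510),
    ],
)

# Group rows by the unordered pair of cities plus the distance, so that
# A -> B and B -> A with the same distance fall into the same group.
query = """
SELECT max(start_from), min(end_to), distance
FROM yourtable
GROUP BY
    CASE WHEN start_from > end_to THEN end_to ELSE start_from END,
    CASE WHEN start_from > end_to THEN start_from ELSE end_to END,
    distance
ORDER BY distance
"""
dedup_rows = conn.execute(query).fetchall()
for row in dedup_rows:
    print(row)
```

On the sample data this keeps Chennai to Bangalore and leaves the two Hyderabad rows as they were, since each of those forms a group of one.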

+0

Yes, you're right. It looks better with case when – shiva

0

Ordering the two fields by value inside the query helps to get unique rows:

select distinct 
    case when start_from > end_to then end_to  else start_from end as _start, 
    case when start_from > end_to then start_from else end_to  end as _end, 
    distance 
from distance; 

After testing, I got:

+-----------+-----------+----------+
| _start    | _end      | distance |
+-----------+-----------+----------+
| Bangalore | Chennai   |      350 |
| Chennai   | Hyderabad |      500 |
| Chennai   | Hyderabad |      510 |
+-----------+-----------+----------+
+0

But the last row should be 'Hyderabad, Chennai, 510' – Wanderer

+0

Yes, it's a good one: order the fields by value, then filter for unique records. But as @Ullas mentioned, *** start_from *** gets changed – shiva
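
The effect discussed in these comments can be reproduced with a small sketch, again assuming Python's built-in sqlite3 module as a stand-in for MySQL. It runs the select distinct query on the four sample rows and checks whether the original direction of the 510 row survives; the ORDER BY is added only for stable output.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE distance ("
    "id INTEGER PRIMARY KEY, start_from TEXT, end_to TEXT, distance INTEGER)"
)
conn.executemany(
    "INSERT INTO distance (start_from, end_to, distance) VALUES (?, ?, ?)",
    [
        ("Chennai", "Bangalore", 350),
        ("Bangalore", "Chennai", 350),
        ("Chennai", "Hyderabad", 500),
        ("Hyderabad", "Chennai", 510),
    ],
)

# Every row is rewritten so the alphabetically smaller city comes first,
# then DISTINCT collapses rows that became identical.
query = """
SELECT DISTINCT
    CASE WHEN start_from > end_to THEN end_to ELSE start_from END AS _start,
    CASE WHEN start_from > end_to THEN start_from ELSE end_to END AS _end,
    distance
FROM distance
ORDER BY distance
"""
normalized_rows = conn.execute(query).fetchall()
for row in normalized_rows:
    print(row)

# The stored row Hyderabad -> Chennai (510) comes back reversed.
direction_kept = ("Hyderabad", "Chennai", 510) in normalized_rows
print("original direction of the 510 row kept:", direction_kept)
```

The duplicates are removed, but every returned row is in alphabetical order, so the output no longer matches the stored start_from and end_to for rows such as Hyderabad to Chennai.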

2

You can use the following query to find the duplicates:

SELECT LEAST(start_from, end_to) AS start_from, 
     GREATEST(start_from, end_to) AS end_to, 
     distance 
FROM mytable 
GROUP BY LEAST(start_from, end_to), GREATEST(start_from, end_to), distance 
HAVING COUNT(*) > 1 

Output:

start_from, end_to, distance 
-------------------------------- 
Bangalore, Chennai, 350 

Now you can use the query above as a derived table to filter out the duplicates:

SELECT t1.* 
FROM mytable AS t1 
LEFT JOIN (
    SELECT LEAST(start_from, end_to) AS start_from, 
      GREATEST(start_from, end_to) AS end_to, 
      distance 
    FROM mytable 
    GROUP BY LEAST(start_from, end_to), GREATEST(start_from, end_to), distance 
    HAVING COUNT(*) > 1 
) AS t2 ON t1.start_from = t2.start_from AND 
      t1.end_to = t2.end_to AND 
      t1.distance = t2.distance  
WHERE t2.start_from IS NULL 

The WHERE clause predicate, t2.start_from IS NULL, filters out the duplicate records.

Output:

start_from end_to  distance 
-------------------------------- 
Chennai  Bangalore 350 
Chennai  Hyderabad 500 
Hyderabad Chennai 510 
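
The derived-table approach can be checked with a short sketch, assuming Python's built-in sqlite3 module in place of MySQL. SQLite has no LEAST or GREATEST, so its multi-argument scalar min() and max() are swapped in here; the aliases lo and hi are names chosen for this illustration and do not appear in the answer.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable ("
    "id INTEGER PRIMARY KEY, start_from TEXT, end_to TEXT, distance INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable (start_from, end_to, distance) VALUES (?, ?, ?)",
    [
        ("Chennai", "Bangalore", 350),
        ("Bangalore", "Chennai", 350),
        ("Chennai", "Hyderabad", 500),
        ("Hyderabad", "Chennai", 510),
    ],
)

# t2 lists each duplicated pair once, in alphabetical order (lo, hi).
# The LEFT JOIN then drops the stored row that matches that ordering.
query = """
SELECT t1.start_from, t1.end_to, t1.distance
FROM mytable AS t1
LEFT JOIN (
    SELECT min(start_from, end_to) AS lo,   -- stands in for LEAST
           max(start_from, end_to) AS hi,   -- stands in for GREATEST
           distance
    FROM mytable
    GROUP BY min(start_from, end_to), max(start_from, end_to), distance
    HAVING COUNT(*) > 1
) AS t2 ON t1.start_from = t2.lo
       AND t1.end_to = t2.hi
       AND t1.distance = t2.distance
WHERE t2.lo IS NULL
ORDER BY t1.id
"""
filtered_rows = conn.execute(query).fetchall()
for row in filtered_rows:
    print(row)
```

Note that this version always removes the row whose cities are stored in alphabetical order (Bangalore to Chennai here) and keeps the other direction.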
0

Suppose your table looks like

id start_from    end_to     distance 
0 Chennai     Bangalore    350 
1 Bangalore    Chennai     350 
2 Chennai     Hyderabad    500 
3 Hyderabad    Chennai     510 

Then you can use a query that compares by id.

Select 
    O.start_from, 
    O.end_to, 
    O.distance 
From 
    distance O 
Left Join 
    distance P 
On 
    1 = 1 
    and O.start_from = P.end_to 
    and O.end_to = P.start_from 
Where 
    P.id is null                  -- no reverse route exists, so keep the row 
    or O.distance <> P.distance 
    or (O.distance = P.distance and O.id < P.id) 
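
The same id comparison can also be written with NOT EXISTS, which keeps routes that have no reverse row at all. This is a sketch of an alternative formulation, not the query from the answer; it assumes Python's built-in sqlite3 module as a stand-in for MySQL, and the Chennai to Mumbai row with its distance value is a hypothetical addition used only to show a route without a reverse entry.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE distance ("
    "id INTEGER PRIMARY KEY, start_from TEXT, end_to TEXT, distance INTEGER)"
)
conn.executemany(
    "INSERT INTO distance (id, start_from, end_to, distance) VALUES (?, ?, ?, ?)",
    [
        (0, "Chennai", "Bangalore", 350),
        (1, "Bangalore", "Chennai", 350),
        (2, "Chennai", "Hyderabad", 500),
        (3, "Hyderabad", "Chennai", 510),
        (4, "Chennai", "Mumbai", 1330),  # hypothetical row with no reverse route
    ],
)

# Keep a row unless a reverse route with the same distance and a smaller id exists.
query = """
SELECT o.start_from, o.end_to, o.distance
FROM distance AS o
WHERE NOT EXISTS (
    SELECT 1
    FROM distance AS p
    WHERE p.start_from = o.end_to
      AND p.end_to = o.start_from
      AND p.distance = o.distance
      AND p.id < o.id
)
ORDER BY o.id
"""
kept_rows = conn.execute(query).fetchall()
for row in kept_rows:
    print(row)
```

Of each duplicated pair, the row with the smaller id survives, so start_from and end_to are returned exactly as stored.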
+0

'CASE' vs. 'JOIN'. **CASE** is better, so it's best to use ***'CASE'*** – shiva