2017-08-28 73 views
1

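MySQL join and delete only the first occurrence

I have to delete 40 million rows from a MySQL table.

I have to find all rows whose output is "STATIC OUTPUT", delete those rows, and also delete the next row for the same host and service whose output is anything other than "STATIC OUTPUT".

Sample data:

-> id, host, service, output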

1,"127.0.0.1","service1","STATIC OUTPUT" 
2,"127.0.0.2","service5","RANDOM OUTPUT X0" 
3,"127.0.0.2","service5","STATIC OUTPUT" 
4,"127.0.0.3","service1","RANDOM OUTPUT X1" 
5,"127.0.0.3","service10","RANDOM OUTPUT X2" 
6,"127.0.0.2","service5","RANDOM OUTPUT X3" 
7,"127.0.0.1","service2","RANDOM OUTPUT X4" 
8,"127.0.0.1","service1","RANDOM OUTPUT X5" 
9,"127.0.0.2","service4","RANDOM OUTPUT X6" 
10,"127.0.0.3","service10","RANDOM OUTPUT X7" 
11,"127.0.0.1","service1","RANDOM OUTPUT X7" 
12,"127.0.0.1","service1","RANDOM OUTPUT X8" 
13,"127.0.0.1","service1","RANDOM OUTPUT X9" 
14,"127.0.0.2","service5","RANDOM OUTPUT X10" 
15,"127.0.0.1","service1","STATIC OUTPUT" 
16,"127.0.0.1","service1","RANDOM OUTPUT X11" 
17,"127.0.0.1","service1","RANDOM OUTPUT X12"  
... 
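
To reproduce the problem, the sample rows above can be loaded with something like the following. This is only a sketch: the column types are assumptions, since the question gives nothing but the column names, and the real table may differ.

```sql
-- Column types are assumed for illustration; only the names come from the question.
CREATE TABLE data (
    id      INT UNSIGNED NOT NULL PRIMARY KEY,
    host    VARCHAR(64)  NOT NULL,
    service VARCHAR(64)  NOT NULL,
    output  VARCHAR(255) NOT NULL
);

-- The 17 sample rows listed in the question.
INSERT INTO data (id, host, service, output) VALUES
    (1,  '127.0.0.1', 'service1',  'STATIC OUTPUT'),
    (2,  '127.0.0.2', 'service5',  'RANDOM OUTPUT X0'),
    (3,  '127.0.0.2', 'service5',  'STATIC OUTPUT'),
    (4,  '127.0.0.3', 'service1',  'RANDOM OUTPUT X1'),
    (5,  '127.0.0.3', 'service10', 'RANDOM OUTPUT X2'),
    (6,  '127.0.0.2', 'service5',  'RANDOM OUTPUT X3'),
    (7,  '127.0.0.1', 'service2',  'RANDOM OUTPUT X4'),
    (8,  '127.0.0.1', 'service1',  'RANDOM OUTPUT X5'),
    (9,  '127.0.0.2', 'service4',  'RANDOM OUTPUT X6'),
    (10, '127.0.0.3', 'service10', 'RANDOM OUTPUT X7'),
    (11, '127.0.0.1', 'service1',  'RANDOM OUTPUT X7'),
    (12, '127.0.0.1', 'service1',  'RANDOM OUTPUT X8'),
    (13, '127.0.0.1', 'service1',  'RANDOM OUTPUT X9'),
    (14, '127.0.0.2', 'service5',  'RANDOM OUTPUT X10'),
    (15, '127.0.0.1', 'service1',  'STATIC OUTPUT'),
    (16, '127.0.0.1', 'service1',  'RANDOM OUTPUT X11'),
    (17, '127.0.0.1', 'service1',  'RANDOM OUTPUT X12');
```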

Example: when we find

1,"127.0.0.1","service1","STATIC OUTPUT" 

we should delete the rows with id 1 and 8,

8,"127.0.0.1","service1","RANDOM OUTPUT X5" 

and when we find

3,"127.0.0.2","service5","STATIC OUTPUT" 

we should delete the rows with id 3 and 6,

6,"127.0.0.2","service5","RANDOM OUTPUT X3" 

I wrote something like this (a SELECT, as a test query before translating it into the DELETE statement):

SELECT * FROM data r1 INNER JOIN (SELECT id, host, service 
FROM data 
WHERE output = 'STATIC OUTPUT') r2 ON 
     r1.id>r2.id AND r1.service=r2.service 
     AND r1.host=r2.host 
     AND r1.output<>'STATIC OUTPUT' 
GROUP BY r1.host, r1.service 

But I think this is the wrong approach.

MySQL 5.1.73

+0

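Do you have the option of restructuring the table? The way you store and work with the data doesn't feel right to me. You have a history of data in there. Why don't you make host/service unique? – DanFromGermany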

+0

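I can't change the structure of the table. I'm an application/system administrator, not an application developer. The query will be used only once. Because of a system error we have a lot of unconnected data. – Dream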

+0

If you don't structure your database properly, you will always end up with bad data from time to time ;-) – DanFromGermany

Answers

1

Correction

Now this should do it:

SELECT MIN(sp.id) AS id FROM 
  (SELECT hs.id, hs.host, hs.service, hs.output, so.id AS soid 
   FROM data hs 
   INNER JOIN 
     (SELECT id, host, service, output FROM data 
      WHERE output = 'STATIC OUTPUT') so 
   ON so.host = hs.host AND so.service = hs.service 
      AND hs.id > so.id 
   WHERE hs.output <> 'STATIC OUTPUT') sp 
GROUP BY host, service, soid 
UNION 
SELECT id FROM data WHERE output = 'STATIC OUTPUT';
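
The query above only lists the ids. One possible way to turn it into the actual delete on MySQL 5.1 is sketched below. The staging table `ids_to_delete` is a hypothetical name introduced here, and its `id` type is assumed to match `data.id`. Staging the ids first avoids selecting from `data` inside a subquery of the same DELETE statement, which MySQL rejects, and it lets you review the rows before removing anything.

```sql
-- Hypothetical staging table; the name and the column type are assumptions.
CREATE TABLE ids_to_delete (
    id INT UNSIGNED NOT NULL PRIMARY KEY
);

-- 1) Every 'STATIC OUTPUT' row.
INSERT INTO ids_to_delete (id)
SELECT id FROM data WHERE output = 'STATIC OUTPUT';

-- 2) For each 'STATIC OUTPUT' row, the first later row with the same
--    host and service whose output is something else.
--    IGNORE skips an id that two 'STATIC OUTPUT' rows share as their next row.
INSERT IGNORE INTO ids_to_delete (id)
SELECT MIN(hs.id)
FROM data so
INNER JOIN data hs
        ON hs.host = so.host
       AND hs.service = so.service
       AND hs.id > so.id
WHERE so.output = 'STATIC OUTPUT'
  AND hs.output <> 'STATIC OUTPUT'
GROUP BY so.id;

-- Review what would be removed before running the DELETE.
SELECT d.*
FROM data d
INNER JOIN ids_to_delete x ON x.id = d.id
ORDER BY d.id;

-- Multi-table DELETE: removes only the rows of `data` matched by the join.
DELETE d
FROM data d
INNER JOIN ids_to_delete x ON x.id = d.id;
```

On the 17 sample rows from the question this selects ids 1, 3, 6, 8, 15 and 16. With around 40 million rows, a composite index on (host, service, id) would probably help the self-join, and deleting in batches may be gentler on the server; both are suggestions rather than anything tested against the real table.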
+0

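Something is wrong. On the test table I get multiple records with the same host and service. Maybe I didn't show enough data. The output value of each subsequent row for the same host and service is different. I get all of them, not only the first row after "STATIC OUTPUT" for the same host and service. – Dream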

+0

Can you provide more test data? –

+0

I added 4 more rows. With your query I also get 11-14. – Dream

-1

You can use the LIMIT clause to select the first match. https://dev.mysql.com/doc/refman/5.7/en/select.html

SELECT * FROM data LIMIT 1;

Edit:

This example finds the ids you need to delete:

 
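    -- Pair each 'STATIC OUTPUT' row (id1) with only the first later row (id2) 
    -- of the same host and service that has a different output. 
    -- Without MIN() and GROUP BY, every later row would be returned. 
    CREATE OR REPLACE VIEW v AS 
    SELECT r1.id AS id1, MIN(r2.id) AS id2 FROM data r1 
    INNER JOIN data r2 ON r1.host = r2.host AND r1.service = r2.service 
    WHERE LOWER(r1.output) LIKE 'static output' AND r1.id < r2.id 
      AND LOWER(r2.output) <> 'static output' 
    GROUP BY r1.id; 
 
    SELECT DISTINCT id1 FROM v 
    UNION 
    SELECT DISTINCT id2 FROM v; 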

Output:

ID:1 3 6 8

+0

LIMIT is not useful here, just use the answer I gave – kenfire

+0

The output (rows to delete) should be: 1 3 6 8. Pairs -> 1,8 and 3,6. There is no "STATIC OUTPUT" row with the same IP and service before 2. – Dream

+0

Now, using a view, you can do this more easily (with the desired output) – kenfire