2017-02-17 66 views
0

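How to select the latest member of each group in MariaDB?

I have 3 tables:

  1. account - account information
  2. machine - machine information
  3. account_machine - maps an account to a machine

Each account is handled by one machine as of a given date. Over time an account can migrate to a different machine, but on any given day it is handled by only one machine. If an account is no longer active, the corresponding machine_id is 0. Given a date, I want to find all active accounts, so I came up with this query: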

SELECT account.id 
FROM account JOIN account_machine m 
ON m.account_id=account.id && m.machine_id && m.machine_id= 
(SELECT machine_id 
FROM account_machine 
WHERE account_id=account.id && date<=20170215 
ORDER BY date DESC LIMIT 1) 
GROUP BY account.id; 

This works fine with MySQL, but not with MariaDB.

MariaDB [db]> select * from account_machine; 
+------------+------------+------------+ 
| date  | account_id | machine_id | 
+------------+------------+------------+ 
| 2013-01-01 |   1 |   1 | 
| 2013-01-01 |   8 |   1 | 
| 2013-01-01 |   2 |   2 | 
| 2013-01-01 |   3 |   2 | 
| 2013-01-01 |   4 |   3 | 
| 2013-01-01 |   12 |   3 | 
| 2016-04-01 |   24 |   3 | 
| 2013-01-01 |   5 |   5 | 
| 2013-01-01 |   6 |   8 | 
| 2013-01-01 |   7 |   6 | 
| 2014-01-01 |   9 |   6 | 
| 2013-01-01 |   10 |   4 | 
| 2014-07-01 |   11 |   10 | 
| 2014-01-01 |   13 |   7 | 
| 2014-01-01 |   14 |   7 | 
| 2014-07-01 |   15 |   11 | 
| 2014-07-01 |   16 |   14 | 
| 2014-07-01 |   17 |   12 | 
| 2015-01-01 |   18 |   13 | 
| 2015-01-01 |   19 |   13 | 
| 2015-04-01 |   20 |   13 | 
| 2015-04-01 |   21 |   7 | 
| 2015-04-01 |   22 |   13 | 
| 2016-04-01 |   23 |   15 | 
| 2016-05-01 |   25 |   9 | 
| 2016-05-19 |   26 |   4 | 
| 2014-08-06 |   1 |   0 | 
| 2016-01-15 |   12 |   0 | 
| 2015-11-04 |   19 |   12 | 
| 2016-05-23 |   10 |   0 | 
| 2016-05-26 |   2 |   18 | 
| 2016-05-27 |   13 |   16 | 
| 2016-06-02 |   27 |   3 | 
| 2016-06-02 |   4 |   0 | 
| 2016-06-08 |   28 |   17 | 
| 2016-06-21 |   29 |   19 | 
| 2016-07-11 |   30 |   20 | 
| 2016-08-15 |   13 |   0 | 
| 2016-08-19 |   2 |   18 | 
| 2016-08-25 |   31 |   21 | 
| 2016-09-08 |   32 |   20 | 
| 2016-11-30 |   19 |   12 | 
| 2016-11-30 |   22 |   13 | 
| 2017-01-20 |   33 |   15 | 
+------------+------------+------------+ 

MariaDB [db]> select account.id from account join account_machine m on m.account_id=account.id && m.machine_id && m.machine_id=(select a.machine_id from account_machine a where a.account_id=account.id && a.date<=20170215 order by a.date desc limit 1) group by account.id; 
+----+ 
| id | 
+----+ 
| 23 | 
| 33 | 
+----+ 

mysql> select account.id from account join account_machine m on m.account_id=account.id && m.machine_id && m.machine_id=(select a.machine_id from account_machine a where a.account_id=account.id && a.date<=20170215 order by a.date desc limit 1) group by account.id; 
+----+ 
| id | 
+----+ 
| 2 | 
| 3 | 
| 5 | 
| 6 | 
| 7 | 
| 8 | 
| 9 | 
| 11 | 
| 14 | 
| 15 | 
| 16 | 
| 17 | 
| 18 | 
| 19 | 
| 20 | 
| 21 | 
| 22 | 
| 23 | 
| 24 | 
| 25 | 
| 26 | 
| 27 | 
| 28 | 
| 29 | 
| 30 | 
| 31 | 
| 32 | 
| 33 | 
+----+ 

P.S. Here are the 3 tables, so you can reproduce this:

CREATE TABLE `account` (
    `id` smallint(5) unsigned NOT NULL AUTO_INCREMENT, 
    PRIMARY KEY (`id`) 
) ENGINE=MyISAM; 
INSERT INTO `account` VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15),(16),(17),(18),(19),(20),(21),(22),(23),(24),(25),(26),(27),(28),(29),(30),(31),(32),(33); 

CREATE TABLE `account_machine` (
    `date` date NOT NULL, 
    `account_id` smallint(5) unsigned NOT NULL, 
    `machine_id` smallint(5) unsigned NOT NULL, 
    PRIMARY KEY (`date`,`account_id`) 
) ENGINE=MyISAM; 
INSERT INTO `account_machine` VALUES ('2013-01-01',1,1),('2013-01-01',8,1),('2013-01-01',2,2),('2013-01-01',3,2),('2013-01-01',4,3),('2013-01-01',12,3),('2016-04-01',24,3),('2013-01-01',5,5),('2013-01-01',6,8),('2013-01-01',7,6),('2014-01-01',9,6),('2013-01-01',10,4),('2014-07-01',11,10),('2014-01-01',13,7),('2014-01-01',14,7),('2014-07-01',15,11),('2014-07-01',16,14),('2014-07-01',17,12),('2015-01-01',18,13),('2015-01-01',19,13),('2015-04-01',20,13),('2015-04-01',21,7),('2015-04-01',22,13),('2016-04-01',23,15),('2016-05-01',25,9),('2016-05-19',26,4),('2014-08-06',1,0),('2016-01-15',12,0),('2015-11-04',19,12),('2016-05-23',10,0),('2016-05-26',2,18),('2016-05-27',13,16),('2016-06-02',27,3),('2016-06-02',4,0),('2016-06-08',28,17),('2016-06-21',29,19),('2016-07-11',30,20),('2016-08-15',13,0),('2016-08-19',2,18),('2016-08-25',31,21),('2016-09-08',32,20),('2016-11-30',19,12),('2016-11-30',22,13),('2017-01-20',33,15); 

CREATE TABLE `machine` (
    `id` smallint(5) unsigned NOT NULL AUTO_INCREMENT, 
    PRIMARY KEY (`id`) 
) ENGINE=MyISAM; 
INSERT INTO `machine` VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15),(16),(17),(18),(19),(20),(21),(22); 
+0

Do the tables contain identical data in both environments? –

+0

Incidentally, 'and m.machine_id' is always true. If you bothered to format your queries, that would be obvious to you too – Strawberry

+0

See http://meta.stackoverflow.com/questions/333952/why-should-i-provide-an-mcve-for-what-seems-to-me-to-be-a-very-simple-sql-query – Strawberry

Answers

1

How about something like this?

SELECT am1.account_id AS id 
FROM account_machine am1 
JOIN (
    SELECT account_id, MAX(date) AS date 
    FROM account_machine 
    GROUP BY account_id 
    ) am2 
ON am1.account_id = am2.account_id 
AND am1.date = am2.date 
AND am1.machine_id != 0 
ORDER BY am1.account_id; 

+----+ 
| id | 
+----+ 
| 2 | 
| 3 | 
| 5 | 
| 6 | 
| 7 | 
| 8 | 
| 9 | 
| 11 | 
| 14 | 
| 15 | 
| 16 | 
| 17 | 
| 18 | 
| 19 | 
| 20 | 
| 21 | 
| 22 | 
| 23 | 
| 24 | 
| 25 | 
| 26 | 
| 27 | 
| 28 | 
| 29 | 
| 30 | 
| 31 | 
| 32 | 
| 33 | 
+----+ 
28 rows in set (0.00 sec) 

I would be curious to see the output of EXPLAIN EXTENDED / SHOW WARNINGS from both MySQL and MariaDB. That will show you how the query optimizer rewrote the query. For example:

[email protected] [stack]> EXPLAIN EXTENDED SELECT am1.account_id AS id 
    -> FROM account_machine am1 
    -> JOIN (
    ->  SELECT account_id, MAX(date) AS date 
    ->  FROM account_machine 
    ->  GROUP BY account_id 
    ->) am2 
    -> ON am1.account_id = am2.account_id 
    -> AND am1.date = am2.date 
    -> AND am1.machine_id != 0 
    -> ORDER BY am1.account_id\G 
*************************** 1. row *************************** 
      id: 1 
    select_type: PRIMARY 
     table: <derived2> 
     type: ALL 
possible_keys: NULL 
      key: NULL 
     key_len: NULL 
      ref: NULL 
     rows: 44 
    filtered: 100.00 
     Extra: Using where; Using temporary; Using filesort 
*************************** 2. row *************************** 
      id: 1 
    select_type: PRIMARY 
     table: am1 
     type: eq_ref 
possible_keys: PRIMARY 
      key: PRIMARY 
     key_len: 5 
      ref: am2.date,am2.account_id 
     rows: 1 
    filtered: 100.00 
     Extra: Using where 
*************************** 3. row *************************** 
      id: 2 
    select_type: DERIVED 
     table: account_machine 
     type: index 
possible_keys: NULL 
      key: PRIMARY 
     key_len: 5 
      ref: NULL 
     rows: 44 
    filtered: 100.00 
     Extra: Using index; Using temporary; Using filesort 
3 rows in set, 1 warning (0.00 sec) 

[email protected] [stack]> SHOW WARNINGS\G 
*************************** 1. row *************************** 
    Level: Note 
    Code: 1003 
Message: select `stack`.`am1`.`account_id` AS `id` from 
`stack`.`account_machine` `am1` join (select 
`stack`.`account_machine`.`account_id` AS 
`account_id`,max(`stack`.`account_machine`.`date`) AS `date` from 
`stack`.`account_machine` group by 
`stack`.`account_machine`.`account_id`) `am2` where 
((`stack`.`am1`.`account_id` = `am2`.`account_id`) and 
(`stack`.`am1`.`date` = `am2`.`date`) and (`stack`.`am1`.`machine_id` 
<> 0)) order by `stack`.`am1`.`account_id` 
1 row in set (0.00 sec) 

Clearly not a high-performance query without indexes, but fine for a limited data set.
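One caveat: the query above picks each account's latest row overall, whereas the question asks for the state as of a given date. A minimal sketch of a date-bounded variant follows, assuming the cutoff 2017-02-15 from the question. The alias latest_date is a name introduced here for illustration, and the filter on machine_id is moved to WHERE purely for readability.

```sql
SELECT am1.account_id AS id
FROM account_machine AS am1
JOIN (
    -- latest mapping date per account on or before the cutoff
    SELECT account_id, MAX(`date`) AS latest_date
    FROM account_machine
    WHERE `date` <= '2017-02-15'
    GROUP BY account_id
) AS am2
  ON am1.account_id = am2.account_id
 AND am1.`date` = am2.latest_date
-- keep only accounts whose latest mapping is to a real machine
WHERE am1.machine_id <> 0
ORDER BY am1.account_id;
```

On the sample data in the question every row is dated on or before 2017-02-15, so this should return the same 28 accounts as the query above. With an earlier cutoff, accounts that were deactivated later would still be reported as active.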

0

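I suspect there is a flaw in the query design: what if the subquery comes back for an account_id with machine_id = 0? After that, it won't look any further.

When using JOIN...ON, it is good form to put only the join conditions in the ON clause and no filtering; filtering belongs in WHERE.

It looks like this would be simpler and faster: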

SELECT account_id 
    FROM account_machine AS m 
    WHERE machine_id != 0 
     AND date <= 20170215 
     AND EXISTS (
     SELECT * 
      FROM account 
      WHERE id = m.account_id 
       ) 
    ORDER BY date DESC 
    LIMIT 1; 

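Perhaps the EXISTS() test is redundant and can be removed?

INDEX(date) may help performance.

(No, I have not figured out why the two servers might differ. See whether my version works.)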
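Because ORDER BY ... LIMIT 1 returns a single row for the whole table, a per-account form needs a different shape. The following is only a sketch, not the answerer's query: it keeps the filtering in WHERE and uses NOT EXISTS to keep just each account's latest mapping as of the cutoff. The alias later is introduced here for illustration, and the cutoff 2017-02-15 is taken from the question.

```sql
SELECT m.account_id
FROM account_machine AS m
WHERE m.machine_id <> 0
  AND m.`date` <= '2017-02-15'
  -- no newer mapping exists for this account up to the cutoff
  AND NOT EXISTS (
      SELECT 1
      FROM account_machine AS later
      WHERE later.account_id = m.account_id
        AND later.`date` > m.`date`
        AND later.`date` <= '2017-02-15'
  )
ORDER BY m.account_id;
```

Since the primary key is (date, account_id), each account has at most one row per date, so each active account appears once. An additional index leading with account_id, for example on (account_id, date), would likely help the correlated lookup on larger tables, although that has not been measured here.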

+0

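Your query is clearly wrong - I want to return all active accounts, but you limit it to 1. Your guess about 'machine_id = 0' is also wrong. I tried removing that clause, but I still get only 2 rows in MariaDB. – zhao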