2013-10-01 31 views
0

这是我在StackOverflow上:)提高DATETIME分组查询性能

伸手求援的第一次尝试,我有以下表格:

CREATE TABLE `tinfinite_visits` (
    `visit_id` int(255) NOT NULL AUTO_INCREMENT, 
    `identity_id` int(255) NOT NULL, 
    `ip` varchar(39) NOT NULL, 
    `loggedin` enum('0','1') NOT NULL DEFAULT '0', 
    `url` longtext NOT NULL, 
    `realurl` longtext NOT NULL, 
    `referrer` longtext NOT NULL, 
    `method` enum('GET','POST','HEAD','OPTIONS','PUT','DELETE','TRACE','CONNECT','PATCH') NOT NULL, 
    `client` longtext NOT NULL, 
    `referring` longtext NOT NULL, 
    `timestart` datetime NOT NULL, 
    `timeend` datetime NOT NULL, 
    PRIMARY KEY (`visit_id`), 
    KEY `timestart` (`timestart`), 
    KEY `identity_id` (`identity_id`) 
) ENGINE=MyISAM DEFAULT CHARSET=utf8; 

在某个时候,我需要得到一些数据此表生成折线图。我目前使用5个不同的查询,5个不同的时间间隔(总计,年,月,周,日)。

该表目前有大约200.000行,但会变得更大(甚至数千万记录)。

尽管我的查询完美适用于此目的,但我试图找到更好的性能方式。

因此,如果可能的话,我将非常感谢关于如何提高查询性能的任何提示/建议,最好甚至将所有5个查询合并为1。是

我目前使用的查询,他们的解释,以及它们的执行时间(周边200.000行)如下:

日查询:

SELECT COUNT(DISTINCT(`identity_id`)) AS visits, DATE_FORMAT(CONVERT_TZ(timestart, '-5:00', '+3:00'), '%l%p') AS unit 
    FROM tinfinite_visits 
    WHERE `timestart` >= DATE_SUB(NOW(), INTERVAL 24 HOUR) 
    GROUP BY unit 
    ORDER BY `timestart` ASC 

说明:

+----+-------------+----------------------+-------+---------------+-----------+---------+-----+-------+------------------------------+ 
| id | select_type | table    | type | possible_keys | key  | key_len | ref | rows | Extra      | 
+----+-------------+----------------------+-------+---------------+-----------+---------+-----+-------+------------------------------+ 
| 1 | SIMPLE  | tinfinite_visits  | range | timestart  | timestart | 8  |  | 11113 | Using where; Using temporary | 
+----+-------------+----------------------+-------+---------------+-----------+---------+-----+-------+------------------------------+ 

时间:0.011280059814453

周查询:

SELECT COUNT(DISTINCT(`identity_id`)) AS visits, DATE_FORMAT(CONVERT_TZ(timestart, '-5:00', '+3:00'), '%a') AS unit 
    FROM tinfinite_visits 
    WHERE `timestart` >= DATE_SUB(NOW(), INTERVAL 7 DAY) 
    GROUP BY unit 
    ORDER BY `timestart` ASC 

解释:

+----+-------------+------------------+------+---------------+-----+---------+-----+--------+----------------------------------------------+ 
| id | select_type | table   | type | possible_keys | key | key_len | ref | rows | Extra          | 
+----+-------------+------------------+------+---------------+-----+---------+-----+--------+----------------------------------------------+ 
| 1 | SIMPLE  | tinfinite_visits | ALL | timestart  |  |   |  | 205897 | Using where; Using temporary; Using filesort | 
+----+-------------+------------------+------+---------------+-----+---------+-----+--------+----------------------------------------------+ 

时间:0.13543295860291

月查询:

SELECT COUNT(DISTINCT(`identity_id`)) AS visits, DATE_FORMAT(CONVERT_TZ(timestart, '-5:00', '+3:00'), '%d') AS unit 
    FROM tinfinite_visits 
    WHERE `timestart` >= DATE_SUB(NOW(), INTERVAL 28 DAY) 
    GROUP BY unit 
    ORDER BY `timestart` ASC 

解释:

+----+-------------+------------------+------+---------------+-----+---------+-----+--------+----------------------------------------------+ 
| id | select_type | table   | type | possible_keys | key | key_len | ref | rows | Extra          | 
+----+-------------+------------------+------+---------------+-----+---------+-----+--------+----------------------------------------------+ 
| 1 | SIMPLE  | tinfinite_visits | ALL | timestart  |  |   |  | 205897 | Using where; Using temporary; Using filesort | 
+----+-------------+------------------+------+---------------+-----+---------+-----+--------+----------------------------------------------+ 

时间:0.21460795402527

年份查询:

SELECT COUNT(DISTINCT(`identity_id`)) AS visits, DATE_FORMAT(`timestart`, '%b') AS unit 
    FROM tinfinite_visits 
    WHERE `timestart` >= DATE_SUB(NOW(), INTERVAL 1 YEAR) 
    GROUP BY unit 
    ORDER BY `timestart` ASC 

解释:

+----+-------------+------------------+------+---------------+-----+---------+-----+--------+----------------------------------------------+ 
| id | select_type | table   | type | possible_keys | key | key_len | ref | rows | Extra          | 
+----+-------------+------------------+------+---------------+-----+---------+-----+--------+----------------------------------------------+ 
| 1 | SIMPLE  | tinfinite_visits | ALL | timestart  |  |   |  | 205897 | Using where; Using temporary; Using filesort | 
+----+-------------+------------------+------+---------------+-----+---------+-----+--------+----------------------------------------------+ 

时间:0。50977802276611

总体查询:

SELECT COUNT(DISTINCT(`identity_id`)) AS visits, DATE_FORMAT(`timestart`, '%b') AS unit 
    FROM tinfinite_visits 
    WHERE `timestart` >= DATE_SUB(NOW(), INTERVAL 100 YEAR) 
    GROUP BY unit 
    ORDER BY `timestart` ASC 

解释:

+----+-------------+------------------+------+---------------+-----+---------+-----+--------+----------------------------------------------+ 
| id | select_type | table   | type | possible_keys | key | key_len | ref | rows | Extra          | 
+----+-------------+------------------+------+---------------+-----+---------+-----+--------+----------------------------------------------+ 
| 1 | SIMPLE  | tinfinite_visits | ALL | timestart  |  |   |  | 205897 | Using where; Using temporary; Using filesort | 
+----+-------------+------------------+------+---------------+-----+---------+-----+--------+----------------------------------------------+ 

时间:0.52196192741394

非常感谢你,非常感谢!

+0

不成熟的优化总是会引起麻烦。你认为会有多少条记录?什么显示'EXPLAIN'现在? –

+0

@AlmaDoMundo请参阅最新的问题。谢谢。 –

+1

解释将随着时间而改变。现在它看起来最佳。它使用何处进行过滤,然后存储分组值,然后对这些值进行排序。 – AdrianBR

回答