2017-08-10 103 views
1

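MySQL ISAM search optimization

I have a table with 1,019,502 records and a specific query that takes 1.6 seconds to run. I would like to reduce the run time as much as possible.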

The table is InnoDB on MySQL 5.7 (on Ubuntu):

mysql> describe summary_data; 
+--------------+------------------+------+-----+---------+-------+ 
| Field  | Type    | Null | Key | Default | Extra | 
+--------------+------------------+------+-----+---------+-------+ 
| propId  | int(10) unsigned | NO | PRI | NULL |  | 
| elemType  | varchar(50)  | NO | PRI | NULL |  | 
| sku   | varchar(100)  | NO | PRI | NULL |  | 
| family  | varchar(100)  | NO | PRI | NULL |  | 
| subcategory | varchar(100)  | NO | PRI | NULL |  | 
| category  | varchar(100)  | NO | PRI | NULL |  | 
| details  | varchar(255)  | YES |  | NULL |  | 
| merchSales | float(12,2)  | YES |  | NULL |  | 
| orders  | int(10) unsigned | YES |  | NULL |  | 
| quantity  | int(10) unsigned | YES |  | NULL |  | 
| margin  | float(12,2)  | YES |  | NULL |  | 
| grossSales | float(12,2)  | YES |  | NULL |  | 
| discount  | float(12,2)  | YES |  | NULL |  | 
| shipping  | float(12,2)  | YES |  | NULL |  | 
| tax   | float(12,2)  | YES |  | NULL |  | 
| createDate | datetime   | YES |  | NULL |  | 
| date   | date    | NO | PRI | NULL |  | 
| dateType  | varchar(10)  | NO | PRI | NULL |  | 
+--------------+------------------+------+-----+---------+-------+ 

The query is as follows:

SET @propId = 1, 
@from = '2016-01-01', 
@to = '2016-12-31', 
@elemType = 'sku', 
@sku = NULL, 
@family = NULL, 
@subcategory = NULL, 
@category = NULL; 

SELECT SUM(ifnull(merchSales,0)+ifnull(discount,0)) as totalSales 
,SUM(ifnull(merchSales,0)) as merchSales 
,SUM(ifnull(orders,0)) as orders 
,SUM(ifnull(quantity,0)) as quantity 
,sum(ifnull(grossSales,0)) as grossSales 
,sum(ifnull(discount,0))*(-1) as discount 
,sum(ifnull(shipping,0)) as shipping 
,elemType 
,sku 
,family 
,category 
,subcategory 
,details 
,SUM(ifnull(margin,0)) as margin 
,sum(ifnull(margin,0))/sum(ifnull(merchSales,0))*100 as marginPerc 
,SUM(ifnull(grossSales,0))/SUM(ifnull(orders,0)) as avgOrderVal 
,sum(ifnull(merchSales,0)+ifnull(discount,0))/sum(ifnull(margin,0))*100 as marginPercTotal 
FROM summary_data 
WHERE propId = @propId 
AND dateType = 'day' 
AND elemType = @elemType 
AND (@sku IS NULL OR sku = @sku) 
AND (@family IS NULL OR family = @family) 
AND (@subcategory IS NULL OR subcategory = @subcategory) 
AND (@category IS NULL OR category = @category) 
GROUP BY category,subcategory,family,sku 
ORDER BY merchSales DESC; 

The indexes used by the query:

mysql> show indexes from summary_data; 
+--------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ 
| Table  | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | 
+--------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ 
| summary_data |   0 | PRIMARY |   1 | propId  | A   |   218 |  NULL | NULL |  | BTREE  |   |    | 
| summary_data |   0 | PRIMARY |   2 | elemType | A   |  1529 |  NULL | NULL |  | BTREE  |   |    | 
| summary_data |   0 | PRIMARY |   3 | category | A   |  5528 |  NULL | NULL |  | BTREE  |   |    | 
| summary_data |   0 | PRIMARY |   4 | subcategory | A   |  11198 |  NULL | NULL |  | BTREE  |   |    | 
| summary_data |   0 | PRIMARY |   5 | family  | A   |  15678 |  NULL | NULL |  | BTREE  |   |    | 
| summary_data |   0 | PRIMARY |   6 | sku   | A   |  17470 |  NULL | NULL |  | BTREE  |   |    | 
| summary_data |   0 | PRIMARY |   7 | dateType | A   |  17470 |  NULL | NULL |  | BTREE  |   |    | 
| summary_data |   0 | PRIMARY |   8 | date  | A   |  985490 |  NULL | NULL |  | BTREE  |   |    | 

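The query touches about 115,000 of the 1,019,502 records. The result contains 2,106 aggregated rows.

Any advice would be greatly appreciated!

*****EDIT*****

Adding the EXPLAIN output: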

+----+-------------+--------------+------------+------+----------------------------------+---------+---------+-------------+--------+----------+----------------------------------------------+ 
| id | select_type | table  | partitions | type | possible_keys     | key  | key_len | ref   | rows | filtered | Extra          | 
+----+-------------+--------------+------------+------+----------------------------------+---------+---------+-------------+--------+----------+----------------------------------------------+ 
| 1 | SIMPLE  | summary_data | NULL  | ref | PRIMARY,propId_4,propId_5,propId | PRIMARY | 156  | const,const | 492745 | 10.00 | Using where; Using temporary; Using filesort | 
+----+-------------+--------------+------------+------+----------------------------------+---------+---------+-------------+--------+----------+----------------------------------------------+ 
+0

I may be wrong, since I haven't done any research, but I believe the sum of NULL values defaults to 0. – jhpratt

+0

What does the EXPLAIN output say? Please add it (as text) to your question. (I suggest you always do this for optimization questions.) –

+0

Following on from @jhpratt's comment: SUM() ignores NULLs, so you can avoid running IFNULL() on every row by reversing the order of the functions: IFNULL(SUM(column),0) –
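The reordering suggested in the comment above could look like the following for a few of the question's aggregates. This is a minimal sketch, not a measured improvement: whether it is noticeably faster on this table is an assumption, and because the columns are FLOAT the totals may differ from the original in the last decimal places. Summing `merchSales` and `discount` separately gives the same total as summing the per-row expression, since a NULL in either column contributes 0 in both forms.

```sql
-- Sketch: apply IFNULL once per group instead of once per row.
-- SUM() skips NULL inputs and returns NULL only when every input
-- in the group is NULL, so the outer IFNULL covers that one case.
SELECT IFNULL(SUM(merchSales), 0) + IFNULL(SUM(discount), 0) AS totalSales,
       IFNULL(SUM(merchSales), 0) AS merchSales,
       IFNULL(SUM(orders), 0)     AS orders,
       IFNULL(SUM(quantity), 0)   AS quantity,
       category,
       subcategory,
       family,
       sku
FROM summary_data
WHERE propId = @propId
  AND dateType = 'day'
  AND elemType = @elemType
GROUP BY category, subcategory, family, sku
ORDER BY merchSales DESC;
```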

Answer

0

The only constant parts of your where clause are:

WHERE propId = @propId 
AND dateType = 'day' 
AND elemType = @elemType 

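So there MIGHT be some advantage in declaring a non-unique composite index on those 3 fields: propId, elemType, dateType (nb: I'm not sure what order these columns should take in such an index; it may need some experimentation). I would check the explain plan after defining such an index, but make sure these variables stay NULL for the trial run, like this: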

@sku = NULL, 
@family = NULL, 
@subcategory = NULL, 
@category = NULL 

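Then try making any one of those 4 variables non-null. What effect does that have on your explain plan? You may find that you need a separate non-unique index on each of those columns to support the variability of the where clause.

That is, the explain plan will change as you change the variables.

However: at 1.6 seconds over more than 1 million rows, you are entering the realm of diminishing returns.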
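The experiment described in this answer could be run roughly as follows. This is a sketch under stated assumptions: the index name `idx_prop_elem_datetype` and the value `'some-category'` are hypothetical, the column order is only a starting guess, and the SELECT is cut down to one aggregate to keep the plans easy to compare.

```sql
-- Hypothetical index name; the column order is a first guess to experiment with.
ALTER TABLE summary_data
  ADD INDEX idx_prop_elem_datetype (propId, elemType, dateType);

-- Trial run 1: all optional filters left NULL.
SET @propId = 1, @elemType = 'sku',
    @sku = NULL, @family = NULL, @subcategory = NULL, @category = NULL;

EXPLAIN
SELECT SUM(merchSales) AS merchSales, category, subcategory, family, sku
FROM summary_data
WHERE propId = @propId
  AND dateType = 'day'
  AND elemType = @elemType
  AND (@sku IS NULL OR sku = @sku)
  AND (@family IS NULL OR family = @family)
  AND (@subcategory IS NULL OR subcategory = @subcategory)
  AND (@category IS NULL OR category = @category)
GROUP BY category, subcategory, family, sku;

-- Trial run 2: set one optional filter (hypothetical value) and compare the plan.
SET @category = 'some-category';

EXPLAIN
SELECT SUM(merchSales) AS merchSales, category, subcategory, family, sku
FROM summary_data
WHERE propId = @propId
  AND dateType = 'day'
  AND elemType = @elemType
  AND (@sku IS NULL OR sku = @sku)
  AND (@family IS NULL OR family = @family)
  AND (@subcategory IS NULL OR subcategory = @subcategory)
  AND (@category IS NULL OR category = @category)
GROUP BY category, subcategory, family, sku;

-- Remove the trial index again if it does not help.
ALTER TABLE summary_data DROP INDEX idx_prop_elem_datetype;
```

Comparing the `key`, `rows` and `Extra` columns between the two EXPLAIN outputs shows whether the optimizer picks the new index and whether the temporary table and filesort remain.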

+0

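TY @Used_By_Already. I'll play with your suggestions today. I'd also like to add one more constant to my where clause: the date range. There will always be a date > and a date < condition in the where clause. If that changes your suggestion, please let me know. – mwex501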

+0

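I tried multiple iterations of composite index variations, including your specific suggestion, and all of them increased the search time. I did find that adjusting the PRIMARY KEY to put the dateType/date columns before cat/subcat/family/sku AND including the selected columns had a significant impact. This primary key cut the search time to 0.8 seconds: ADD PRIMARY KEY (propId, elemType, dateType, date, category, subcategory, family, sku, merchSales, grossSales, quantity, orders, shipping, tax, details). Still looking for improvements - thanks for your help! – mwex501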

+0

That's good news, and it looks to me like you have got the hang of reading and making use of the EXPLAIN output. Well done. –
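The primary key reordering reported in the comments above, combined with the date range the asker plans to add, could be sketched as below. It deliberately differs from the key in the comment: it keeps only the eight original key columns, on the untested assumption that the gain comes from the column order, since InnoDB stores each row inside the primary key (clustered) index anyway. The inclusive `>=` / `<=` bounds on `@from` and `@to` are also an assumption. Rebuilding the primary key copies the whole table, so try it on a copy first.

```sql
-- Sketch: same eight key columns as the original PRIMARY KEY, reordered so
-- the equality filters (propId, elemType, dateType) come first and the
-- range filter (date) follows them. Uniqueness is unchanged.
ALTER TABLE summary_data
  DROP PRIMARY KEY,
  ADD PRIMARY KEY (propId, elemType, dateType, date,
                   category, subcategory, family, sku);

SET @propId = 1, @elemType = 'sku',
    @from = '2016-01-01', @to = '2016-12-31';

-- Reduced version of the question's query with the date range applied.
EXPLAIN
SELECT SUM(merchSales) AS merchSales, category, subcategory, family, sku
FROM summary_data
WHERE propId = @propId
  AND elemType = @elemType
  AND dateType = 'day'
  AND date >= @from   -- assumed inclusive bounds
  AND date <= @to
GROUP BY category, subcategory, family, sku
ORDER BY merchSales DESC;
```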