2010-01-13 136 views
2

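How can I optimize this SQL SELECT query?

I am able to run this query, but it takes 25 seconds. That is too long! How can I optimize it? The values are passed in as variables from outside the statement, as in (:startDate, INTERVAL 1 MONTH).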

SELECT COUNT(DISTINCT u1.User_ID) AS total 
FROM UserClicks u1 
INNER JOIN (SELECT DISTINCT User_ID 
       FROM UserClicks 
      WHERE (Date BETWEEN DATE_SUB(:startDate, INTERVAL 1 MONTH) AND :startDate)) u2 
      ON u1.User_ID = u2.User_ID 
WHERE (u1.Date BETWEEN :startDate AND :endDate) 

This is running on a MySQL database.

+0

Isn't your UserClicks.User_ID field unique and indexed? That should let you get rid of both DISTINCT parts of the query. Either way, I think @Parrots's answer below covers it. – JMD 2010-01-13 18:46:02

+0

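@andrew: Is this really what you want, the people who clicked both in the month before the start date and between the start date and the end date? (See Quassnoi's comment below.) – Hogan 2010-01-13 19:01:17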

Answers

2
SELECT COUNT(*) AS total 
FROM (
     SELECT DISTINCT User_ID 
     FROM UserClicks 
     WHERE Date BETWEEN DATE_SUB(:startDate, INTERVAL 1 MONTH) AND :startDate 
     ) u1 
WHERE EXISTS 
     (
     SELECT NULL 
     FROM UserClicks u2 
     WHERE u2.User_ID = u1.User_ID 
       AND u2.Date BETWEEN :startDate AND :endDate 
     ) 
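
One way to check that this rewrite counts the same users as the query in the question is to run both on a tiny data set. The following is only a sketch: it uses Python's sqlite3 module instead of MySQL, so `date(:startDate, '-1 month')` stands in for `DATE_SUB(:startDate, INTERVAL 1 MONTH)`, and the sample rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE UserClicks (User_ID INTEGER, Date TEXT)")
conn.executemany(
    "INSERT INTO UserClicks (User_ID, Date) VALUES (?, ?)",
    [
        (1, "2009-12-15"), (1, "2009-12-16"),  # user 1: previous month ...
        (1, "2010-01-10"), (1, "2010-01-11"),  # ... and the target range
        (2, "2010-01-05"),                     # user 2: target range only
        (3, "2009-12-20"),                     # user 3: previous month only
    ],
)
params = {"startDate": "2010-01-01", "endDate": "2010-01-31"}

# Query from the question (DATE_SUB replaced by SQLite's date()).
original = """
SELECT COUNT(DISTINCT u1.User_ID) AS total
FROM UserClicks u1
INNER JOIN (SELECT DISTINCT User_ID
            FROM UserClicks
            WHERE Date BETWEEN date(:startDate, '-1 month') AND :startDate) u2
        ON u1.User_ID = u2.User_ID
WHERE u1.Date BETWEEN :startDate AND :endDate
"""

# EXISTS rewrite from this answer.
rewritten = """
SELECT COUNT(*) AS total
FROM (SELECT DISTINCT User_ID
      FROM UserClicks
      WHERE Date BETWEEN date(:startDate, '-1 month') AND :startDate) u1
WHERE EXISTS (SELECT NULL
              FROM UserClicks u2
              WHERE u2.User_ID = u1.User_ID
                AND u2.Date BETWEEN :startDate AND :endDate)
"""

original_total = conn.execute(original, params).fetchone()[0]
rewritten_total = conn.execute(rewritten, params).fetchone()[0]
print(original_total, rewritten_total)
```

On this sample only user 1 has clicks in both windows, so both queries should report the same total. Agreement on a toy table says nothing about speed; timing has to be measured on the real MySQL data.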

Create a composite index on (User_ID, Date):

CREATE INDEX ix_userclicks_user_date ON UserClicks (User_ID, Date) 
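
As a rough illustration of what the composite index is for, the sketch below creates the same index in SQLite (via Python's sqlite3, not MySQL) and asks the planner how it would run the correlated lookup from the EXISTS subquery. SQLite's planner is a different engine from MySQL's, so treat this only as a demonstration of the idea; on MySQL the equivalent check would be to run EXPLAIN on the real query. The index is a plain (non-unique) one, so repeated clicks by the same user on the same date are still accepted.

```python
import sqlite3

# SQLite stand-in for the MySQL table (assumption); sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE UserClicks (User_ID INTEGER, Date TEXT)")
conn.execute(
    "CREATE INDEX ix_userclicks_user_date ON UserClicks (User_ID, Date)"
)

# The index is not UNIQUE, so duplicate (User_ID, Date) pairs are allowed.
conn.executemany(
    "INSERT INTO UserClicks (User_ID, Date) VALUES (?, ?)",
    [(1, "2010-01-10"), (1, "2010-01-10"), (2, "2010-01-12")],
)
row_count = conn.execute("SELECT COUNT(*) FROM UserClicks").fetchone()[0]

# Ask SQLite how it would execute the per-user date-range lookup.
plan_rows = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT NULL FROM UserClicks u2 "
    "WHERE u2.User_ID = :userId "
    "AND u2.Date BETWEEN :startDate AND :endDate",
    {"userId": 1, "startDate": "2010-01-01", "endDate": "2010-01-31"},
).fetchall()

# The last column of each plan row holds the human-readable detail text.
plan_text = " ".join(str(row[-1]) for row in plan_rows)
uses_index = "ix_userclicks_user_date" in plan_text
print(row_count, uses_index)
print(plan_text)
```

The column order matters: with User_ID first, the equality on User_ID narrows the search to one user and the Date range is then scanned within that user's entries.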

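The single-statement variant, with both date ranges in one WHERE clause:

SELECT COUNT(DISTINCT UserClicks.User_ID) AS total 
FROM UserClicks 
WHERE (UserClicks.Date BETWEEN :startDate AND :endDate) 
AND (UserClicks.Date BETWEEN DATE_SUB(:startDate, INTERVAL 1 MONTH) AND :startDate) 

Adding an index on the Date column may also help. If you have few users but many clicks, and you have a Users table, you can use the Users table instead of DISTINCT: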

SELECT COUNT(*) 
FROM Users u 
WHERE EXISTS 
     (
     -- the user clicked in the month before :startDate
     SELECT NULL 
     FROM UserClicks uc1 
     WHERE uc1.User_ID = u.Id 
       AND uc1.Date BETWEEN DATE_SUB(:startDate, INTERVAL 1 MONTH) AND :startDate 
     ) 
     AND EXISTS 
     (
     -- the user also clicked between :startDate and :endDate
     SELECT NULL 
     FROM UserClicks uc2 
     WHERE uc2.User_ID = u.Id 
       AND uc2.Date BETWEEN :startDate AND :endDate 
     ) 
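
The Users-based form can be tried out the same way. This sketch again uses Python's sqlite3 in place of MySQL, with `date(:startDate, '-1 month')` standing in for `DATE_SUB`. The `Users` table with an `Id` primary key is an assumption taken from the query above, and all rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical Users table; only the Id column is needed here.
conn.execute("CREATE TABLE Users (Id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE UserClicks (User_ID INTEGER, Date TEXT)")
conn.execute(
    "CREATE INDEX ix_userclicks_user_date ON UserClicks (User_ID, Date)"
)
conn.executemany("INSERT INTO Users (Id) VALUES (?)", [(1,), (2,), (3,)])
conn.executemany(
    "INSERT INTO UserClicks (User_ID, Date) VALUES (?, ?)",
    [
        (1, "2009-12-15"), (1, "2010-01-10"),  # user 1: both windows
        (2, "2010-01-05"),                     # user 2: target range only
        (3, "2009-12-20"),                     # user 3: previous month only
    ],
)
params = {"startDate": "2010-01-01", "endDate": "2010-01-31"}

# One row per user is tested with two index lookups, so no DISTINCT is needed.
users_query = """
SELECT COUNT(*)
FROM Users u
WHERE EXISTS (SELECT NULL
              FROM UserClicks uc1
              WHERE uc1.User_ID = u.Id
                AND uc1.Date BETWEEN date(:startDate, '-1 month') AND :startDate)
  AND EXISTS (SELECT NULL
              FROM UserClicks uc2
              WHERE uc2.User_ID = u.Id
                AND uc2.Date BETWEEN :startDate AND :endDate)
"""
users_total = conn.execute(users_query, params).fetchone()[0]
print(users_total)
```

Because Users holds exactly one row per user, each user is counted at most once, which is why this form can skip the DISTINCT step.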
+0

After creating the composite index, what do I need to change? – Andrew 2010-01-13 19:31:51

+0

Also... does the composite index need to be unique? (Sorry if this is a dumb question.) – Andrew 2010-01-13 19:33:25

+0

The composite index will help the queries run faster (especially the second one). – Quassnoi 2010-01-13 19:33:38

0

Have you tried moving the DATE_SUB up? Do you have an index on UserClicks.Date?

0

Why not use a single SELECT statement instead of running a nested pair of selects? Right now you are essentially running two queries. Try this:

ALTER TABLE `UserClicks` ADD INDEX ( `Date`); 
+0

This will not return the same results as the original query. – Quassnoi 2010-01-13 18:46:05

+0

What do you mean by adding an index? Can you show me? – Andrew 2010-01-13 18:46:11

+0

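@Quassnoi How do the query results differ? I'm having a hard time seeing the difference. The nested one is basically saying "get all the people between the start date and the end date" and "now get all the people between the start date and +1 month". How is that different from just an AND operation? – Parrots 2010-01-13 18:48:38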
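
A small experiment makes the difference concrete. In the single-SELECT form, both BETWEEN conditions apply to the same row, so a row only qualifies if its Date lies in both ranges at once, which means exactly on :startDate. The nested form only requires that the same user has some row in each range. The sketch below uses Python's sqlite3 rather than MySQL (`date(:startDate, '-1 month')` replaces `DATE_SUB`), with invented sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE UserClicks (User_ID INTEGER, Date TEXT)")
conn.executemany(
    "INSERT INTO UserClicks (User_ID, Date) VALUES (?, ?)",
    [
        (1, "2009-12-15"), (1, "2010-01-10"),  # two rows, one per window
        (2, "2010-01-05"),                     # target range only
        (3, "2009-12-20"),                     # previous month only
        (4, "2010-01-01"),                     # one row exactly on :startDate
    ],
)
params = {"startDate": "2010-01-01", "endDate": "2010-01-31"}

# Nested form from the question: conditions apply to different rows.
nested = """
SELECT COUNT(DISTINCT u1.User_ID)
FROM UserClicks u1
INNER JOIN (SELECT DISTINCT User_ID
            FROM UserClicks
            WHERE Date BETWEEN date(:startDate, '-1 month') AND :startDate) u2
        ON u1.User_ID = u2.User_ID
WHERE u1.Date BETWEEN :startDate AND :endDate
"""

# Single-SELECT form: both conditions must hold for the same row.
single = """
SELECT COUNT(DISTINCT User_ID)
FROM UserClicks
WHERE Date BETWEEN :startDate AND :endDate
  AND Date BETWEEN date(:startDate, '-1 month') AND :startDate
"""

nested_total = conn.execute(nested, params).fetchone()[0]
single_total = conn.execute(single, params).fetchone()[0]
print(nested_total, single_total)
```

On this sample the nested query counts users 1 and 4, while the single-SELECT form counts only user 4, whose one click falls exactly on the start date.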

0

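MySQL tends to ignore indexes when processing subqueries, so it has to process all rows. How about a self-join instead? This is just off the top of my head, so it may not be quite right, but it should at least point you in the right direction.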

SELECT COUNT(DISTINCT u1.User_ID) AS total 
FROM UserClicks AS u1 
JOIN UserClicks AS u2 USING (User_ID) 
WHERE u1.Date BETWEEN :startDate AND :endDate 
AND u2.Date BETWEEN DATE_SUB(:startDate, INTERVAL 1 MONTH) AND :startDate
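
To sanity-check the self-join, the sketch below runs it on made-up rows with Python's sqlite3 instead of MySQL (`date(:startDate, '-1 month')` replaces `DATE_SUB`). It also counts the joined rows before DISTINCT is applied, since a self-join pairs every matching click in one window with every matching click in the other.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE UserClicks (User_ID INTEGER, Date TEXT)")
conn.executemany(
    "INSERT INTO UserClicks (User_ID, Date) VALUES (?, ?)",
    [
        (1, "2009-12-15"), (1, "2009-12-16"),  # user 1: two clicks last month
        (1, "2010-01-10"), (1, "2010-01-11"),  # user 1: two clicks in range
        (2, "2010-01-05"),                     # user 2: target range only
        (3, "2009-12-20"),                     # user 3: previous month only
    ],
)
params = {"startDate": "2010-01-01", "endDate": "2010-01-31"}

# Shared FROM/WHERE part of the self-join.
join_body = """
FROM UserClicks AS u1
JOIN UserClicks AS u2 USING (User_ID)
WHERE u1.Date BETWEEN :startDate AND :endDate
  AND u2.Date BETWEEN date(:startDate, '-1 month') AND :startDate
"""

# Distinct users found by the self-join.
self_join_total = conn.execute(
    "SELECT COUNT(DISTINCT u1.User_ID) " + join_body, params
).fetchone()[0]

# Rows the join produces before DISTINCT collapses them.
joined_rows = conn.execute(
    "SELECT COUNT(*) " + join_body, params
).fetchone()[0]

print(self_join_total, joined_rows)
```

Here user 1 alone produces 2 x 2 = 4 joined rows that DISTINCT then reduces to one user. With many clicks per user that intermediate row count can grow quickly, so it is worth comparing this form against the EXISTS version on real data.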