2016-12-02
0

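MySQL - How to optimize a query with multiple dependent subqueries so that they are independent?

Given the following query, how can I optimize it so that the subqueries are no longer dependent?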

SELECT DISTINCT 
    inst.id, inst.name, inst.state, inst.farm_status, 
    (SELECT COUNT(inst_note.id) 
     FROM project_institution_note AS inst_note 
     WHERE inst_note.institution_id = inst.id) AS inst_note_count, 
    (SELECT COUNT(c.id) FROM project_catalog AS c 
     WHERE c.institution_id = inst.id 
     AND c.status = 0 
     AND c.catalog_type BETWEEN 0 AND 1) AS ug_count, 
    (SELECT COUNT(c.id) FROM project_catalog AS c 
     WHERE c.institution_id = inst.id 
     AND c.status = 0 
     AND c.catalog_type BETWEEN 1 AND 2) AS grad_count, 
    (SELECT COUNT(c.id) FROM project_catalog AS c 
     WHERE c.institution_id = inst.id 
     AND c.status = 0 AND c.catalog_type >= 3) AS alt_count, 
    (SELECT COUNT(c.id) FROM project_catalog_note AS cn 
     INNER JOIN farmtool_catalog AS c 
     ON c.id = cn.catalog_id 
     WHERE c.institution_id = inst.id) AS catalog_note_count, 
    (SELECT inst_note.text FROM project_institution_note AS inst_note 
     LEFT JOIN project_institution AS inst 
     ON inst_note.institution_id = inst.id 
     WHERE inst_note.institution_id = inst.id 
     ORDER BY inst_note.date DESC 
     LIMIT 1) AS latest_note 
FROM project_institution AS inst 
LEFT JOIN project_institution_note AS inst_note 
ON inst.id = inst_note.institution_id 
LEFT JOIN project_catalog AS c 
ON inst.id = c.institution_id 
WHERE LOWER(inst.state) = "me"; 

I tried to refactor the first subquery into an INNER JOIN like this:

INNER JOIN (SELECT COUNT(inst_note.id) 
     FROM project_institution_note AS inst_note 
     GROUP BY inst_note.institution_id) inst_note_count 
     ON inst_note.institution_id = inst.id 

I placed it after the last LEFT JOIN, but the query then returned an empty result.

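I am particularly interested in optimizing the second and third subqueries, which compute ug_count and grad_count. The only difference between them is that the first requires catalog_type to be between 0 and 1, and the second requires it to be between 1 and 2.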

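At the moment this query runs fine under low usage. Still, it is clearly very inefficient, so I would like to optimize it as much as possible.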

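I would use temporary tables. It keeps things clear and avoids all those subqueries. – Alee

Do you mean views? – Jason

No, a view is different from a temporary table. http://stackoverflow.com/questions/16897323/what-to-use-view-or-temporary-table. I will try to post an answer with some examples. – Alee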

Answers

0

This should hopefully get you part of the way there.

SELECT 
    inst.id, inst.name, inst.state, inst.farm_status, 
    COUNT(DISTINCT inst_note.id) AS inst_note_count, 
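    -- COUNT(DISTINCT ...) rather than SUM(...): the joins below repeat each
    -- catalog row once per matching note, so a plain SUM would over-count.
    COUNT(DISTINCT CASE WHEN c.status = 0 AND c.catalog_type BETWEEN 0 AND 1 THEN c.id END) AS ug_count, 
    COUNT(DISTINCT CASE WHEN c.status = 0 AND c.catalog_type BETWEEN 1 AND 2 THEN c.id END) AS grad_count, 
    COUNT(DISTINCT CASE WHEN c.status = 0 AND c.catalog_type >= 3 THEN c.id END) AS alt_count, 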
    COUNT(DISTINCT cn.id) AS catalog_note_count, 
    -- Correlate on the outer inst.id. Re-joining project_institution AS inst here
    -- would shadow the outer alias and return the same note for every institution.
    (SELECT n.text FROM project_institution_note AS n 
     WHERE n.institution_id = inst.id 
     ORDER BY n.date DESC 
     LIMIT 1) AS latest_note 
FROM project_institution AS inst 
LEFT JOIN project_institution_note AS inst_note ON inst.id = inst_note.institution_id 
LEFT JOIN project_catalog AS c ON inst.id = c.institution_id 
LEFT JOIN farmtool_catalog AS fc ON fc.institution_id = inst.id 
LEFT JOIN project_catalog_note AS cn ON fc.id = cn.catalog_id 
WHERE LOWER(inst.state) = "me" 
GROUP BY inst.id, inst.name, inst.state, inst.farm_status; 

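I am not sure whether you can do anything about fetching the latest note. In SQL Server I would use a CTE and window functions, but MySQL did not support those before version 8.0. Anyway, I hope this is useful.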
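For reference, the CTE-plus-window-function idea mentioned here can be sketched as follows. This is only an illustration under assumptions: SQLite (3.25 or later, via Python's sqlite3 module) stands in for the database, the sample rows are made up, and only the project_institution_note columns that appear in the question are modelled. MySQL accepts the same WITH / ROW_NUMBER() syntax from version 8.0 onward.

```python
import sqlite3

# SQLite stands in for the real database; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE project_institution_note (id INTEGER PRIMARY KEY, institution_id INTEGER,
                                       text TEXT, date TEXT);
INSERT INTO project_institution_note VALUES
    (1, 1, 'old note',  '2016-01-01'),
    (2, 1, 'new note',  '2016-06-01'),
    (3, 2, 'only note', '2016-03-01');
""")

# Number each institution's notes from newest to oldest, then keep number 1.
latest = conn.execute("""
WITH ranked AS (
    SELECT institution_id, text,
           ROW_NUMBER() OVER (PARTITION BY institution_id ORDER BY date DESC) AS rn
    FROM project_institution_note
)
SELECT institution_id, text AS latest_note
FROM ranked
WHERE rn = 1
ORDER BY institution_id
""").fetchall()

print(latest)
```

The ranked result has one row per institution, so it can be LEFT JOINed to project_institution on institution_id instead of running a subquery per output row.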

0

Perhaps the biggest performance killer is

WHERE LOWER(inst.state) = "me"; 

Give state a COLLATION that is one of the ..._ci collations (it may already have one), then switch to simply

WHERE inst.state = "me"; 

And be sure to have

INDEX(state) 
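To illustrate why wrapping the column in a function matters, here is a small sketch. It runs against SQLite rather than MySQL, so treat it as an analogy only: COLLATE NOCASE plays the role of a ..._ci collation, idx_state is a made-up index name, and the sample rows are hypothetical. In MySQL the equivalent check would be to run EXPLAIN on both forms of the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- COLLATE NOCASE is SQLite's rough analogue of a MySQL ..._ci collation.
CREATE TABLE project_institution (id INTEGER PRIMARY KEY, name TEXT,
                                  state TEXT COLLATE NOCASE);
CREATE INDEX idx_state ON project_institution (state);
INSERT INTO project_institution VALUES (1, 'A', 'ME'), (2, 'B', 'me'), (3, 'C', 'NH');
""")

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row describes the access path.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Wrapping the column in LOWER() hides it from the index: full table scan.
wrapped = plan("SELECT id, name FROM project_institution WHERE LOWER(state) = 'me'")
# Comparing the bare column lets the optimizer search the index instead.
bare = plan("SELECT id, name FROM project_institution WHERE state = 'me'")
print(wrapped)
print(bare)

# The case-insensitive collation still matches both 'ME' and 'me'.
rows = conn.execute(
    "SELECT id FROM project_institution WHERE state = 'me' ORDER BY id").fetchall()
print(rows)
```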

Also, don't do this:

JOIN ... ON inst_note.institution_id = inst.id 
     WHERE inst_note.institution_id = inst.id 

The ON and the WHERE are redundant; they do exactly the same thing. Since (I assume) that condition is how the tables are related, keep the ON. But...

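It is really a LEFT JOIN..., which means "keep the row even when the 'right' table has no matching row", with that table's columns filled with NULL. But then the WHERE, which compares against those columns, would fail. So... make it a plain JOIN and get rid of the WHERE clause.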

In general, don't use LEFT unless you need it.

For the original attempt, this (and some other 'composite' indexes) would be beneficial:

project_catalog: INDEX(institution_id, status, catalog_type) 

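A potential performance problem with JOIN + GROUP BY is "explode-implode": the joins explode the number of rows, and the GROUP BY then has to implode them again. To avoid it, an improvement on the suggestion in the previous answer is to have a derived table compute the various SUMs and then JOIN that back to inst, now without the GROUP BY. This avoids having the four or so columns in the GROUP BY (and perhaps brings some other benefits).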
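A minimal sketch of that derived-table idea follows. It is an illustration under assumptions, not a drop-in replacement: SQLite (via Python's sqlite3 module) stands in for MySQL, the sample data is made up, only a few of the question's columns are modelled, and the aliases n, nc and cc are invented for the example. The first query shows the explode-implode effect; the second aggregates each child table on its own before joining.

```python
import sqlite3

# SQLite stands in for MySQL; both SELECTs use syntax that MySQL also accepts.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE project_institution (id INTEGER PRIMARY KEY, name TEXT, state TEXT);
CREATE TABLE project_institution_note (id INTEGER PRIMARY KEY, institution_id INTEGER,
                                       text TEXT);
CREATE TABLE project_catalog (id INTEGER PRIMARY KEY, institution_id INTEGER,
                              status INTEGER, catalog_type INTEGER);

-- Hypothetical sample data: one institution with 2 notes and 3 catalogs.
INSERT INTO project_institution VALUES (1, 'Example College', 'me');
INSERT INTO project_institution_note VALUES (1, 1, 'first'), (2, 1, 'second');
INSERT INTO project_catalog VALUES (1, 1, 0, 0), (2, 1, 0, 2), (3, 1, 0, 3);
""")

# Join everything, then GROUP BY: each catalog row is repeated once per note,
# so the SUM counts the single matching catalog twice.
exploded = conn.execute("""
SELECT inst.id,
       SUM(CASE WHEN c.status = 0 AND c.catalog_type BETWEEN 0 AND 1
                THEN 1 ELSE 0 END) AS ug_count
FROM project_institution AS inst
LEFT JOIN project_institution_note AS n ON n.institution_id = inst.id
LEFT JOIN project_catalog AS c ON c.institution_id = inst.id
WHERE inst.state = 'me'
GROUP BY inst.id
""").fetchall()

# Aggregate each child table in its own derived table, then join the
# one-row-per-institution results back to inst. No outer GROUP BY is needed.
derived = conn.execute("""
SELECT inst.id, inst.name,
       COALESCE(nc.inst_note_count, 0) AS inst_note_count,
       COALESCE(cc.ug_count, 0)        AS ug_count,
       COALESCE(cc.grad_count, 0)      AS grad_count,
       COALESCE(cc.alt_count, 0)       AS alt_count
FROM project_institution AS inst
LEFT JOIN (SELECT institution_id, COUNT(*) AS inst_note_count
           FROM project_institution_note
           GROUP BY institution_id) AS nc ON nc.institution_id = inst.id
LEFT JOIN (SELECT institution_id,
                  SUM(CASE WHEN catalog_type BETWEEN 0 AND 1 THEN 1 ELSE 0 END) AS ug_count,
                  SUM(CASE WHEN catalog_type BETWEEN 1 AND 2 THEN 1 ELSE 0 END) AS grad_count,
                  SUM(CASE WHEN catalog_type >= 3 THEN 1 ELSE 0 END) AS alt_count
           FROM project_catalog
           WHERE status = 0
           GROUP BY institution_id) AS cc ON cc.institution_id = inst.id
WHERE inst.state = 'me'
""").fetchall()

print(exploded)
print(derived)
```

LEFT JOIN is kept in the second query on purpose: an institution with no notes or no catalogs should still appear, with its counts turned into 0 by COALESCE. The composite index on project_catalog suggested above is what would let the cc derived table be computed cheaply.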