2010-06-24

PostgreSQL: exists vs left join

I have heard many times that Postgres handles EXISTS queries faster than a LEFT JOIN: http://archives.postgresql.org/pgsql-performance/2002-12/msg00185.php

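That certainly holds for a single aggregation.

But in our case there is more than one, and the same query built with EXISTS makes Postgres hang forever: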

explain 
SELECT count(DISTINCT "groups".id) AS count_all 
FROM "groups" 
WHERE (exists(
    select * from products p where groups.id = p.group_id AND exists(
     select * from products_categories pc where p.id = pc.product_id AND pc.category_id in (2,3))) AND groups.id != 3) 

Result:

Aggregate (cost=26413436.66..26413436.67 rows=1 width=4) 
    -> Seq Scan on groups (cost=0.00..26413403.84 rows=13126 width=4) 
     Filter: ((id <> 3) AND (subplan)) 
     SubPlan 
      -> Index Scan using index_products_on_group_id on products p (cost=0.00..1006.13 rows=1 width=1483) 
       Index Cond: ($1 = group_id) 
       Filter: (subplan) 
       SubPlan 
        -> Seq Scan on products_categories pc (cost=0.00..498.49 rows=1 width=8) 
         Filter: ((category_id = ANY ('{2,3}'::integer[])) AND ($0 = product_id)) 

Is this the root cause of the incredibly long execution time? Is it some kind of configuration problem?

Thanks, Bogdan.

Is there an index on groups.id? To me it looks like there isn't. Also, can you tell us what you are trying to accomplish? Maybe we can help you optimize your query. – EarthMind 2010-06-24 14:39:38

Answers

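Well, for each row in "groups", PostgreSQL is doing a full scan of products_categories, which is not good. It is not necessarily a configuration problem, but perhaps the query could be expressed without nesting subqueries like this?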

SELECT count(DISTINCT "groups".id) AS count_all 
FROM "groups" 
WHERE exists(
    select 1 from products p 
      join products_categories pc on pc.product_id = p.id 
    where groups.id = p.group_id 
      and pc.category_id in (2,3) 
    ) and groups.id <> 3 

Also, is there an index on products_categories(product_id)?
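
The plan in the question shows a Seq Scan on products_categories filtered on product_id, which suggests there is no usable index on that column. A minimal sketch of adding one and re-checking the plan, assuming no such index exists yet (the index name here is hypothetical, chosen to match the naming of index_products_on_group_id):

```sql
-- Hypothetical index name; assumes products_categories has no index on product_id.
CREATE INDEX index_products_categories_on_product_id
    ON products_categories (product_id);

-- Refresh planner statistics for the table before re-checking the plan.
ANALYZE products_categories;

-- Re-run EXPLAIN on the flattened query to see whether the inner
-- Seq Scan on products_categories has been replaced by an index scan.
EXPLAIN
SELECT count(DISTINCT "groups".id) AS count_all
FROM "groups"
WHERE exists(
    select 1 from products p
      join products_categories pc on pc.product_id = p.id
    where groups.id = p.group_id
      and pc.category_id in (2,3)
    ) and groups.id <> 3;
```

Whether the planner actually uses the index depends on table sizes and statistics, so compare the new EXPLAIN output with the original before drawing conclusions.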