How to optimize a query with thousands of IDs

2013-05-01 60 views 2 likes

Here are three consecutive queries with their benchmark timings:

ids = @Company.projects.submitted.uniq.collect(&:person_id) 
    1.370000 0.060000 1.430000 ( 3.763946) 

@persons = Person.where("id IN (?)", ids) 
    0.030000 0.000000 0.030000 ( 0.332878) 

@emails = @persons.collect(&:email).reject(&:blank?) 
    16.550000 1.640000 18.190000 (128.002465)

ids contains almost 10,000 IDs, and when the last query runs I see:

SELECT "persons".* FROM "persons" WHERE (id in (121,142,173,178...14202)) 
(*1000s ->) User Load (13.0ms) SELECT "users".* FROM "users" WHERE "users"."roleable_type" = 'Person' AND "users"."roleable_id" = 121 LIMIT 1 

Indexes on User: 
add_index "users", ["roleable_id", "roleable_type"], :name => "index_users_on_roleable_id_and_roleable_type" 
add_index "users", ["roleable_type", "roleable_id"], :name => "index_users_on_roleable_type_and_roleable_id"

How can I fix what is happening here?

Source

2013-05-01 sscirrus

Answer

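The second query, as you have it, doesn't actually hit the database. It builds an ActiveRecord::Relation (a lazy query) that isn't triggered until the third line runs, which is why the time shows up there. You can confirm this by adding .all to the end of the second query (on Rails 4 and later, where .all itself returns a relation, use .load or .to_a instead).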
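To illustrate the deferral, here is a minimal plain-Ruby sketch. LazyQuery is a hypothetical stand-in written for this example, not ActiveRecord's actual implementation; it only mimics the behaviour of building a query object cheaply and running it on first enumeration.

```ruby
# Hypothetical stand-in for a lazy relation; this is NOT ActiveRecord itself.
class LazyQuery
  include Enumerable

  attr_reader :executions

  def initialize(&loader)
    @loader = loader   # block that simulates hitting the database
    @records = nil     # nothing has been loaded yet
    @executions = 0    # how many times the simulated query has run
  end

  # True once the simulated query has run.
  def loaded?
    !@records.nil?
  end

  # Enumerating forces the load, much as collect does on a relation.
  def each(&block)
    load_records.each(&block)
  end

  private

  def load_records
    return @records if loaded?
    @executions += 1
    @records = @loader.call
  end
end

# Building the query object runs nothing.
persons = LazyQuery.new { [{ email: "a@example.com" }, { email: "" }] }
puts persons.loaded?

# The first enumeration is what triggers the (simulated) query.
emails = persons.map { |person| person[:email] }.reject(&:empty?)
puts persons.loaded?
p emails
```

In the question's benchmark, this would explain why the second line looks almost free while the third line appears to carry the whole cost.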

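To solve the performance problem, you want to get rid of the IN() with a literal list, because that can really hurt database performance for large lists: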

@persons = Person.joins(:projects).merge(@Company.projects.submitted)

You can also do this with a subquery (though it is less efficient than the JOIN):

subquery = @Company.projects.submitted.select("projects.person_id").to_sql 
@persons = Person.where("id IN (#{subquery})")
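As a rough illustration of the difference, the plain-Ruby sketch below builds the two statement shapes as strings and compares their sizes. The company_id column, the status value, and the exact SQL that the submitted scope generates are assumptions made for this example; only the persons and projects table names and the person_id column come from the code above.

```ruby
# Simulated list of person IDs, roughly the size described in the question.
ids = (1..10_000).to_a

# Literal-list form: every ID is fetched into Ruby first and then written
# back into the statement text.
literal_sql = "SELECT persons.* FROM persons WHERE id IN (#{ids.join(',')})"

# Subquery form: the database derives the IDs itself.
# company_id and status are hypothetical column names for illustration.
subquery_sql = "SELECT persons.* FROM persons WHERE id IN " \
  "(SELECT projects.person_id FROM projects " \
  "WHERE projects.company_id = 1 AND projects.status = 'submitted')"

puts "literal list: #{literal_sql.bytesize} bytes"
puts "subquery:     #{subquery_sql.bytesize} bytes"
```

The subquery statement stays the same size however many projects match, and it avoids the separate round trip that loads the IDs into Ruby.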

If you only want to produce @emails and don't really need the collection of persons, you can make this slightly more efficient like so:

@emails = Person.joins(:projects).merge(@Company.projects.submitted).
        where("LENGTH(persons.email) > 0").pluck(:email)
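One caveat worth checking: the SQL filter is not an exact replacement for reject(&:blank?). The plain-Ruby sketch below approximates both filters on made-up sample values (blank? comes from ActiveSupport, so strip.empty? stands in for it here) and shows that a whitespace-only email would pass the LENGTH check.

```ruby
# Made-up sample values, as they might come back from the persons table.
emails = ["a@example.com", "", nil, "   ", "b@example.com"]

# Ruby-side filter, approximating reject(&:blank?) without ActiveSupport.
ruby_side = emails.reject { |email| email.nil? || email.strip.empty? }

# SQL-side filter, approximating WHERE LENGTH(persons.email) > 0:
# NULL fails the comparison, but a whitespace-only string has length 3.
sql_side = emails.reject { |email| email.nil? || email.length.zero? }

p ruby_side
p sql_side
```

If whitespace-only emails can occur in your data, trimming inside the SQL condition or applying one more reject in Ruby afterwards would keep the results the same.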

Source

2013-05-01 18:41:54 PinnyM
