Postgres query slow with Index Only Scan

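I have a table call_logs that contains an id, device_id, timestamp, and working flag, along with some other fields. I am trying to write a query that returns the last call for each device and whether it was working. Currently my query is: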

SELECT DISTINCT ON (device_id) c.device_id, c.timestamp, c.working, c.id 
FROM call_logs c 
ORDER BY c.device_id, c.timestamp desc;

It returns the information I want. But my production server has now grown rather large, and I have about 6,000,000 records in the table.

I have added an index to the table:

CREATE INDEX cl_device_timestamp 
ON public.call_logs USING btree 
(device_id, timestamp DESC, id, working) 
TABLESPACE pg_default;

But I am getting what I consider to be very slow times. Here is an EXPLAIN ANALYSE of the query:

EXPLAIN ANALYSE SELECT DISTINCT ON (device_id) c.device_id, c.timestamp, c.working, c.id 
                FROM call_logs c 
                 ORDER BY c.device_id, c.timestamp desc; 
    QUERY PLAN 
----------------------------------------------------------------------------------------------------------------------------------------------------------------- 
Unique (cost=0.56..363803.37 rows=120 width=25) (actual time=0.069..2171.201 rows=124 loops=1) 
    -> Index Only Scan using cl_device_timestamp on call_logs c (cost=0.56..347982.87 rows=6328197 width=25) (actual time=0.067..1594.953 rows=6331024 loops=1) 
     Heap Fetches: 8051 
Planning time: 0.184 ms 
Execution time: 2171.281 ms 
(5 rows)

I only have 124 unique device_ids. I would not have thought this would be a slow process with the index. Any ideas what is going wrong, or why it is so slow?


2017-08-09 user1434177

What is the execution time if you remove `DISTINCT`? If you only want the last call, can't you add `LIMIT 1` and drop the `DISTINCT`? –

Try to avoid DISTINCT, see: https://dba.stackexchange.com/questions/93158/how-to-speed-up-select-distinct – Tisp

But LIMIT 1 only gives me 1 device; I need 1 row for each device – user1434177

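Your index is on four columns, not one. You cannot estimate the size and efficiency of a composite index from the data distribution of just one of its four columns.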

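Next: the fact that you have only 124 distinct devices does not mean a faster index. On the contrary, fewer distinct values split the tree into fewer parts, so each part is bigger. For example, a bigserial column over a million rows has a million distinct values, and looking up an exact id is very fast. An index on a boolean column has only two values (three, counting NULL), so a scan takes much longer.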
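
To see how selective each column actually is before reasoning about the index, the planner's statistics can be inspected. This is only a minimal sketch, assuming the table lives in the public schema (as in the CREATE INDEX statement in the question) and has been analyzed recently:

```sql
-- n_distinct > 0 is an estimated number of distinct values;
-- n_distinct < 0 is the negated fraction of rows that are distinct
-- (-1 means every row has a different value).
SELECT attname, n_distinct
FROM pg_stats
WHERE schemaname = 'public'
  AND tablename  = 'call_logs'
  AND attname IN ('device_id', 'id');
```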

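Last point: two seconds is slow, indeed. But considering that you scan 6 million rows and compare timestamps, 2 seconds is completely acceptable, I would say.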

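You can sacrifice some OLTP speed and create a trigger that saves the last timestamp of a data change for each device into a separate table. Selecting these pre-aggregated values from that short table will then take a few microseconds for the 124 devices.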
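
A rough sketch of that trigger idea follows. Nothing here is from the original post beyond the call_logs column names: the table device_last_call, the function track_last_call, the trigger name, and the column types are all hypothetical, and `ON CONFLICT` assumes PostgreSQL 9.5 or later. It only handles inserts; updates and deletes on call_logs would need extra handling.

```sql
-- Hypothetical summary table: one row per device holding its latest call.
-- Column types are assumptions; match them to call_logs.
CREATE TABLE device_last_call (
    device_id   bigint PRIMARY KEY,
    "timestamp" timestamp NOT NULL,
    call_id     bigint NOT NULL,
    working     boolean
);

CREATE FUNCTION track_last_call() RETURNS trigger AS $$
BEGIN
    -- Insert the first call for a device, or overwrite the stored row
    -- when the new call is at least as recent as the stored one.
    INSERT INTO device_last_call (device_id, "timestamp", call_id, working)
    VALUES (NEW.device_id, NEW."timestamp", NEW.id, NEW.working)
    ON CONFLICT (device_id) DO UPDATE
        SET "timestamp" = EXCLUDED."timestamp",
            call_id     = EXCLUDED.call_id,
            working     = EXCLUDED.working
        WHERE device_last_call."timestamp" <= EXCLUDED."timestamp";
    RETURN NULL;  -- the return value is ignored for AFTER ROW triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER call_logs_track_last_call
AFTER INSERT ON call_logs
FOR EACH ROW EXECUTE PROCEDURE track_last_call();
```

Reading the latest call per device is then a plain `SELECT * FROM device_last_call;`, at the cost of one extra write per insert into call_logs.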


2017-08-09 09:27:33

I ended up doing this:

SELECT DISTINCT d.id, c.timestamp, c.id, c.working 
FROM devices d 
INNER JOIN call_logs c 
  ON d.id = c.device_id 
 AND c.timestamp = (SELECT max(t.timestamp) FROM call_logs t WHERE t.device_id = d.id)

which ended up much better:

Unique (cost=607.92..608.06 rows=11 width=25) (actual time=3.291..3.344 rows=117 loops=1) 
    -> Sort (cost=607.92..607.95 rows=11 width=25) (actual time=3.289..3.310 rows=117 loops=1) 
     Sort Key: d.id, c."timestamp", c.id, c.working 
     Sort Method: quicksort Memory: 34kB 
     -> Nested Loop (cost=1.05..607.73 rows=11 width=25) (actual time=0.057..3.162 rows=117 loops=1) 
       -> Seq Scan on devices d (cost=0.00..4.18 rows=118 width=8) (actual time=0.006..0.029 rows=119 loops=1) 
       -> Index Only Scan using cl_device_timestamp on call_logs c (cost=1.05..5.10 rows=1 width=25) (actual time=0.007..0.007 rows=1 loops=119) 
        Index Cond: ((device_id = d.id) AND ("timestamp" = (SubPlan 2))) 
        Heap Fetches: 110 
        SubPlan 2 
         -> Result (cost=0.48..0.49 rows=1 width=8) (actual time=0.018..0.018 rows=1 loops=119) 
          InitPlan 1 (returns $1) 
           -> Limit (cost=0.43..0.48 rows=1 width=8) (actual time=0.017..0.017 rows=1 loops=119) 
            -> Index Only Scan Backward using test1 on call_logs t (cost=0.43..2674.01 rows=52483 width=8) (actual time=0.017..0.017 rows=1 loops=119) 
              Index Cond: ((device_id = d.id) AND ("timestamp" IS NOT NULL)) 
              Heap Fetches: 110 
Planning time: 0.645 ms 
Execution time: 3.461 ms 
(18 rows)
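
For comparison, the same "one index probe per device" idea can also be written with a LATERAL subquery. This is not from the original post, just a sketch under a few assumptions: PostgreSQL 9.3 or later, a devices table with one row per device (as in the query above), and a timestamp column that is never NULL (with the default ordering, `DESC` would sort NULLs first).

```sql
-- For each device, fetch only its most recent call_logs row.
-- The ORDER BY matches the (device_id, timestamp DESC, ...) index,
-- so each probe can stop after the first index entry.
SELECT d.id AS device_id, c."timestamp", c.id, c.working
FROM devices d
CROSS JOIN LATERAL (
    SELECT cl."timestamp", cl.id, cl.working
    FROM call_logs cl
    WHERE cl.device_id = d.id
    ORDER BY cl."timestamp" DESC
    LIMIT 1
) c;
```

Unlike the max() join, this returns exactly one row per device even if two calls share the same latest timestamp, so no DISTINCT is needed. Devices with no calls are omitted, as they are with the inner join.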


2017-08-10 07:25:14 user1434177
