Postgres: Surprising performance on updates using a cursor

Consider the following two Python code samples, which achieve the same result but with a significant and surprising performance difference.

import time

import psycopg2

conn = psycopg2.connect("dbname=mydatabase user=postgres")
cur = conn.cursor('cursor_unique_name')  # named (server-side) cursor
cur2 = conn.cursor()                     # ordinary cursor used for the updates

# time.perf_counter() replaces time.clock(), which was removed in Python 3.8
start_time = time.perf_counter()
cur.execute("SELECT * FROM test for update;")
print("Finished: SELECT * FROM test for update;: " + str(time.perf_counter() - start_time))
for i in range(100000):
    cur.fetchone()  # advance the named cursor by one row
    cur2.execute("update test set num = num + 1 where current of cursor_unique_name;")
print("Finished: update starting commit: " + str(time.perf_counter() - start_time))
conn.commit()
print("Finished: update : " + str(time.perf_counter() - start_time))

cur2.close()
conn.close()

And:

import time

import psycopg2

conn = psycopg2.connect("dbname=mydatabase user=postgres")
cur2 = conn.cursor()

# time.perf_counter() replaces time.clock(), which was removed in Python 3.8
start_time = time.perf_counter()
for i in range(100000):
    # pass the id as a query parameter instead of concatenating it into the SQL
    cur2.execute("update test set num = num + 1 where id = %s;", (i,))
print("Finished: update starting commit: " + str(time.perf_counter() - start_time))
conn.commit()
print("Finished: update : " + str(time.perf_counter() - start_time))

cur2.close()
conn.close()

The CREATE statement for the table test is:

CREATE TABLE test (id serial PRIMARY KEY, num integer, data varchar);

The table contains 100,000 rows, and VACUUM ANALYZE test; has been run.

I got the following results over several attempts.

First code sample:

Finished: SELECT * FROM test for update;: 0.00609304950429 
Finished: update starting commit: 37.3272754429 
Finished: update : 37.4449708474

Second code sample:

Finished: update starting commit: 24.574401185 
Finished committing: 24.7331461431

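This is very surprising to me. I would have expected exactly the opposite: according to this answer, an update using a cursor should be significantly faster.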

Source

2011-01-23 David

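I don't think the test is balanced: your first code fetches data from the cursor and then updates, whereas the second blindly updates by ID without fetching any data. I assume the first code sequence translates into a FETCH command followed by an UPDATE, so that is two client/server round trips per row instead of one.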
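To make the round-trip argument concrete, the sketch below lists the statements each sample would send under the assumption stated above: the initial execute() on the named cursor becomes a DECLARE, and every fetchone() becomes its own FETCH. The function names and the exact SQL text are illustrative only; they are not captured from psycopg2 or from a server log.

```python
def cursor_update_statements(n_rows, cursor_name="cursor_unique_name"):
    """Statements the first sample is assumed to send (hypothetical model)."""
    stmts = [f"DECLARE {cursor_name} CURSOR FOR SELECT * FROM test FOR UPDATE"]
    for _ in range(n_rows):
        # one round trip to advance the cursor ...
        stmts.append(f"FETCH FORWARD 1 FROM {cursor_name}")
        # ... and a second one to update the row it now points at
        stmts.append(
            f"UPDATE test SET num = num + 1 WHERE CURRENT OF {cursor_name}"
        )
    return stmts


def pkey_update_statements(n_rows):
    """Statements the second sample sends: one UPDATE per id."""
    return [f"UPDATE test SET num = num + 1 WHERE id = {i}" for i in range(n_rows)]


if __name__ == "__main__":
    n = 100000
    print(len(cursor_update_statements(n)))  # 200001
    print(len(pkey_update_statements(n)))    # 100000
```

If that assumption holds, the first sample issues about twice as many statements for the same number of row updates, which could account for much of the measured gap even if WHERE CURRENT OF is no slower per row.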

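(Also, the first code starts by locking every row in the table, which pulls the entire table into the buffer cache. Thinking about it, though, I doubt this actually affects performance, but you didn't mention it.)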

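Also, tbh, I think that for a simple table there won't be much difference between updating by ctid (which I assume is how WHERE CURRENT OF works) and updating through the primary key. The pkey update costs an extra index lookup, but unless the index is huge that is not much of a degradation.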

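For updating 100,000 rows like this, I suspect that most of the time is spent generating the new tuples and inserting them into or appending them to the table, rather than locating the previous tuple to mark it as deleted.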
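One way to probe this guess would be to time a single set-based UPDATE over the whole table: it writes the same 100,000 new tuples but needs only one statement, so per-row lookups and round trips drop out of the measurement. The helper below is a minimal sketch, assuming a DB-API connection such as the psycopg2 one from the question; the function name is hypothetical and no timings are claimed here.

```python
import time


def time_single_statement_update(conn):
    """Time one UPDATE over every row of test, then the commit.

    Returns (seconds until the UPDATE returned, seconds until commit returned).
    """
    cur = conn.cursor()
    start = time.perf_counter()
    cur.execute("UPDATE test SET num = num + 1;")
    updated = time.perf_counter() - start
    conn.commit()
    committed = time.perf_counter() - start
    cur.close()
    return updated, committed


# Assumed usage against the database from the question:
#   import psycopg2
#   conn = psycopg2.connect("dbname=mydatabase user=postgres")
#   print(time_single_statement_update(conn))
#   conn.close()
```

If this runs in a small fraction of the 24 to 37 seconds reported above, the per-statement overhead would dominate rather than tuple creation; if it takes a comparable time, the suspicion in the paragraph above would be supported.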

Source

2011-01-23 21:33:28 araqnid
