PostGIS inserts become slow after some time

I have a location table with the following structure:

CREATE TABLE location
(
  id BIGINT,
  location GEOMETRY,
  CONSTRAINT location_pkey PRIMARY KEY (id, location),
  CONSTRAINT enforce_dims_geom CHECK (st_ndims(location) = 2),
  CONSTRAINT enforce_geotype_geom CHECK (geometrytype(location) = 'POINT'::TEXT OR location IS NULL),
  CONSTRAINT enforce_srid_geom CHECK (st_srid(location) = 4326)
)
WITH (
  OIDS=FALSE
);

CREATE INDEX location_geom_gist ON location
USING GIST (location);

I run the following function to insert the data:

def insert_location_data(msisdn, lat, lon):
    if not (lat and lon):
        return
    query = "INSERT INTO location (id, location) VALUES ('%s', ST_GeomFromText('POINT(%s %s)', 4326))"%(str(id), str(lat), str(lon))
    try:
        cur = get_cursor()
        cur.execute(query)
        conn.commit()
    except:
        tb = traceback.format_exc()
        Logger.get_logger().error("Error while inserting location in sql: %s", str(tb))
        return False
    return True

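I run this code block 10 million times in a loop, but after about one million inserts the insert speed drops drastically. When I restart the script the speed returns to normal, but it drops again after roughly another million rows, and the same trend continues. I can't figure out why. Any help would be appreciated.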

Source

2016-02-05 Arijit Basu

Use extended (multi-row) INSERT statements. Build the query in the loop, then send it to PostgreSQL in one go.
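As a rough illustration of the multi-row INSERT suggested in this comment, the sketch below builds one statement for many rows. The helper name `build_multirow_insert` and the sample coordinates are hypothetical, and it assumes a DB-API driver with `%s` placeholders (psycopg2, for instance), which the question does not confirm.

```python
def build_multirow_insert(rows):
    """Build one INSERT covering many (id, lon, lat) rows.

    Returns (sql, params) meant for cursor.execute(sql, params) on a
    driver that uses %s placeholders (an assumption, e.g. psycopg2).
    """
    if not rows:
        raise ValueError("rows must not be empty")
    # One placeholder group per row; the values never enter the SQL text.
    group = "(%s, ST_SetSRID(ST_MakePoint(%s, %s), 4326))"
    sql = (
        "INSERT INTO location (id, location) VALUES "
        + ", ".join([group] * len(rows))
    )
    # Flatten [(id, lon, lat), ...] into a single parameter list.
    params = [value for row in rows for value in row]
    return sql, params


# Hypothetical sample rows in (id, lon, lat) order.
sql, params = build_multirow_insert([(1, 77.59, 12.97), (2, 72.88, 19.08)])
```

The resulting pair would be passed to `cur.execute(sql, params)`, followed by a single commit. Very large batches are probably best split into chunks of a few thousand rows.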

Here is a guide to bulk-loading data into Postgres: http://www.postgresql.org/docs/current/interactive/populate.html
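The linked guide recommends COPY for large loads. A minimal sketch of that approach follows, assuming the psycopg2 driver (its `copy_expert` cursor method) and that the geometry column accepts EWKT text such as `SRID=4326;POINT(lon lat)` as input. The function names are hypothetical.

```python
import io


def rows_to_copy_buffer(rows):
    """Serialize (id, lon, lat) rows into COPY text format (tab-separated)."""
    buf = io.StringIO()
    for row_id, lon, lat in rows:
        # EWKT carries the SRID, so enforce_srid_geom can be satisfied.
        buf.write(f"{row_id}\tSRID=4326;POINT({lon} {lat})\n")
    buf.seek(0)
    return buf


def copy_locations(cur, rows):
    """Stream the rows to the server in a single COPY command.

    `cur` is assumed to be a psycopg2 cursor.
    """
    cur.copy_expert(
        "COPY location (id, location) FROM STDIN",
        rows_to_copy_buffer(rows),
    )
```

The caller would still need to commit once after `copy_locations` returns.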

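But why is this happening? The queries are fast at first, but the speed drops over time. Database size is not the problem, because restarting the script brings the speed back to its maximum.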

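Here are a few tips.

Beware of str(id): it will always return the string '<built-in function id>', because id is not a variable defined in the question's code; it is the built-in id() function.
The correct axis order for PostGIS is (X Y), i.e. (lon lat).
There are more efficient ways to insert points.
Don't format a string to insert; pass the values as query parameters instead.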
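The first two tips can be checked without a database. In the sketch below, both function names and the sample values are made up for illustration. The first function reproduces the string formatting from the question, and the second only shows the parameter order a corrected call would use.

```python
def buggy_values_clause(msisdn, lat, lon):
    # `id` is not a parameter of this function, so the name resolves to
    # Python's built-in id() function rather than to the subscriber id.
    return "VALUES ('%s', 'POINT(%s %s)')" % (str(id), str(lat), str(lon))


def corrected_params(msisdn, lat, lon):
    # Pass the real identifier, and put X (lon) before Y (lat).
    return (msisdn, lon, lat)


# Hypothetical sample values.
print(buggy_values_clause(919800000000, 12.97, 77.59))
print(corrected_params(919800000000, 12.97, 77.59))
```

With the buggy version every row gets the same text in place of the identifier, and latitude is written where PostGIS expects X (longitude).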

Here is how to insert one point:

cur.execute(
    "INSERT INTO location (id, location) " 
    "VALUES (%s, ST_SetSRID(ST_MakePoint(%s, %s), 4326))", 
    (msisdn, lon, lat))

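Also look at executemany if you want to insert more records at a time. For that you would prepare a list of parameter tuples to insert into the table (i.e. [(msisdn, lon, lat), (msisdn, lon, lat), ..., (msisdn, lon, lat)]).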
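A minimal sketch of the executemany approach follows, assuming a DB-API connection such as the one psycopg2 provides. The helper name `insert_in_batches` and the batch size of 1000 are arbitrary choices for illustration.

```python
from itertools import islice

INSERT_SQL = (
    "INSERT INTO location (id, location) "
    "VALUES (%s, ST_SetSRID(ST_MakePoint(%s, %s), 4326))"
)


def insert_in_batches(conn, rows, batch_size=1000):
    """Insert (msisdn, lon, lat) tuples, committing once per batch.

    `conn` is assumed to be a DB-API connection; `rows` may be any
    iterable, including a generator. Returns the number of rows sent.
    """
    it = iter(rows)
    cur = conn.cursor()
    total = 0
    while True:
        # Take the next chunk without loading every row into memory.
        batch = list(islice(it, batch_size))
        if not batch:
            break
        cur.executemany(INSERT_SQL, batch)
        conn.commit()  # one commit per batch, not one per row
        total += len(batch)
    return total
```

In psycopg2, executemany is documented as not being faster than calling execute in a loop, so most of the saving here likely comes from committing per batch rather than per row.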

Source

2016-02-08 19:59:35

Can you tell me why the inserts become slow after a while?

@ArijitBasu I can't reproduce this behavior; I only get O(n) (i.e. linear) timing.
