Python SQL query execution time

I have very little experience with Python and SQL. I have been teaching myself both in order to complete my master's thesis.

I just wrote a small script to benchmark about 50 identically structured databases, as follows:

import thesis,pyodbc 

# SQL Server settings 
drvr = '{SQL Server Native Client 10.0}' 
host = 'host_directory' 
user = 'username' 
pswd = 'password' 
table = 'tBufferAux' # Found (by inspection) to be the table containing relevant data 
column = 'Data' 

# Establish a connection to SQL Server 
cnxn = pyodbc.connect(driver=drvr, server=host, uid=user, pwd=pswd) # Setup connection 

endRow = 'SELECT TOP 1 ' + column + ' FROM [' # Query template for ending row 
with open(thesis.db_metadata_path(),'w') as file:
    for db in thesis.db_list():
        # Prepare queries
        countRows_query = 'SELECT COUNT(*) FROM [' + db + '].dbo.' + table
        firstRow_query = endRow + db + '].dbo.' + table + ' ORDER BY ' + column + ' ASC'
        lastRow_query = endRow + db + '].dbo.' + table + ' ORDER BY ' + column + ' DESC'
        # Execute queries
        N_rows = cnxn.cursor().execute(countRows_query).fetchone()[0]
        first_row = cnxn.cursor().execute(firstRow_query).fetchone()
        last_row = cnxn.cursor().execute(lastRow_query).fetchone()
        # Save output to text file
        file.write(db + ' ' + str(N_rows) + ' ' + str(first_row.Data) + ' ' + str(last_row.Data) + '\n')

# Close session 
cnxn.cursor().close() 
cnxn.close()

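I was surprised to find that this simple program takes nearly 10 seconds to run, so I am wondering whether this is normal or whether some part of my code is slowing down execution. (Keep in mind that the for loop runs only 56 times.)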

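Note that the functions from the (custom) thesis module have very little impact, since all of them are just variable assignments (except thesis.db_list(), which is a quick read of a .txt file).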

Edit: This is the output .txt file generated by the program. The second column is the number of records in that table for each database.

2015-02-08 POliveira

As a side note, your column name is hard-coded in 'first_row/last_row.Data'. Use 'getattr(first_row, column)' to avoid this. – 2015-02-08 01:48:50
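A minimal sketch of the getattr suggestion. To keep it runnable without a database, a namedtuple stands in for the row that pyodbc returns from fetchone(); the value 17 is made up for illustration.

```python
from collections import namedtuple

# Stand-in for a pyodbc row; the real script gets this from fetchone().
Row = namedtuple("Row", ["Data"])

column = "Data"            # same setting as in the question's script
first_row = Row(Data=17)   # hypothetical value, for illustration only

# The attribute name now comes from the variable, so changing `column`
# in the settings is enough; nothing else in the script mentions "Data".
value = getattr(first_row, column)
print(value)
```

Since each of these queries selects a single column, indexing the row by position (first_row[0]) should work as well, the same way the script already reads the COUNT(*) result.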

Is the column you are ordering by indexed? If not, finding the first and last rows may take a long time. – 2015-02-08 05:33:03
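To illustrate why the index matters, here is a small sketch that uses SQLite from the standard library as a stand-in, since it needs no server. This is an assumption for demonstration only: SQL Server has its own planner and its own tools for inspecting plans, and SQLite writes LIMIT 1 where SQL Server writes TOP 1. The index name idx_data and the table contents are made up.

```python
import sqlite3

def query_plan(conn, sql):
    """Return the 'detail' strings of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tBufferAux (Data INTEGER)")
conn.executemany("INSERT INTO tBufferAux VALUES (?)",
                 [(i,) for i in range(1000)])

sql = "SELECT Data FROM tBufferAux ORDER BY Data ASC LIMIT 1"

# Without an index the engine has to read every row and sort them.
plan_before = query_plan(conn, sql)

# With an index on the ordering column it can read the first entry directly.
conn.execute("CREATE INDEX idx_data ON tBufferAux (Data)")
plan_after = query_plan(conn, sql)

print("before:", plan_before)
print("after: ", plan_after)
conn.close()
```

The exact wording of the plan differs between SQLite versions, but the plan after creating the index should mention idx_data while the one before cannot.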

@ivan_pozdeev Good point! I had not noticed that. Thanks. – POliveira 2015-02-08 11:30:37

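timeit is good for measuring and comparing the performance of single statements and code blocks (note that IPython has a built-in command that makes this easier).
Profilers break the measurement down by each called function (more useful for larger amounts of code).
Note that a standalone program (even more so one written in an interpreted language) has startup (and shutdown) overhead.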
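As a small illustration of the first point, the sketch below times a single function with timeit. The build_query helper is hypothetical; it only mirrors the string concatenation from the question so that there is something to measure.

```python
import timeit

def build_query(db, table="tBufferAux"):
    # Hypothetical helper mirroring the string building in the question.
    return "SELECT COUNT(*) FROM [" + db + "].dbo." + table

runs = 100_000
# timeit calls the function `runs` times and returns the total in seconds.
elapsed = timeit.timeit(lambda: build_query("some_db"), number=runs)
print(f"{elapsed / runs * 1e6:.3f} us per call")
```

In IPython the same measurement can be made with the %timeit magic, for example %timeit build_query("some_db").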

Taken together, 10 seconds does not look like very much for a program that accesses databases.

As a test, I wrapped your program in a profiler like this:

def main():
    <your program>

if __name__=='__main__':
    import cProfile
    cProfile.run('main()')

And ran it from Cygwin's bash like this:

T1=`date +%T,%N`; /c/Python27/python.exe ./t.py; echo $T1; date +%T,%N

The resulting table shows connect as the single biggest time hog (my machine is a very fast i7 3.9GHz/8GB with a local MSSQL and an SSD as the system disk):

 7200 function calls (7012 primitive calls) in 0.058 seconds 

ncalls tottime percall cumtime percall filename:lineno(function) 
<...> 
    1 0.003 0.003 0.058 0.058 t.py:1(main) 
<...> 
    1 0.043 0.043 0.043 0.043 {pyodbc.connect} 
<...>
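For a longer program, sorting the report makes the dominant call easier to spot. The following is a self-contained sketch rather than the exact setup used for the numbers above: slow_step is a made-up function whose sleep stands in for an expensive call such as connect.

```python
import cProfile
import io
import pstats
import time

def slow_step():
    time.sleep(0.05)   # stands in for an expensive call such as connect

def main():
    slow_step()
    sum(range(1000))

# Profile only the call to main().
profiler = cProfile.Profile()
profiler.enable()
main()
profiler.disable()

# Sort by cumulative time so the most expensive call chains come first,
# and show only the top five entries.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```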

And the date commands show that the program itself ran for around 300 ms, which gives about 250 ms of total overhead:

<...>:39,782700900 
<...>:40,072717400

(By leaving python out of the command line, I confirmed that the overhead of the other commands is negligible, about 7us.)
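The interpreter's startup and shutdown cost can also be estimated from Python itself. This is a rough sketch, assuming that launching a child interpreter that does nothing is a fair proxy; the figure also includes the cost of spawning the process and will vary between machines.

```python
import subprocess
import sys
import time

# Start a second interpreter that executes a single no-op statement,
# and measure the wall-clock time until it has exited.
start = time.perf_counter()
subprocess.run([sys.executable, "-c", "pass"], check=True)
elapsed = time.perf_counter() - start

print(f"interpreter start + exit: {elapsed * 1000:.0f} ms")
```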

2015-02-08 01:12:02

So the connect call has a big impact on the final duration, right? And so there is nothing particularly wrong with my code, right? Thanks – POliveira 2015-02-08 11:33:32

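Yes, there is nothing wrong with your code. Besides connect and startup, your case may have other bottlenecks, such as network or file I/O. I tested with an order of magnitude less data than you have. – 2015-02-08 11:49:32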