2016-12-15 78 views
0

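Python, MySQL regression, SQL error or wrong condition?

I have a bucket folder containing CSV files named in the form yy-mm-dd.CSV. Each file has a few header lines that I can ignore, apart from the date at the end of the second line, followed by 151 lines of timestamp:power (kW). Here is a snippet: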

    sep=;
    Version CSV|Tool SunnyBeam11|Linebreaks CR/LF|Delimiter semicolon|Decimalpoint point|Precision 3|Language en-UK|TZO=0|DST|2012.06.21 

    ;SN: removed 
    ;SB removed 
    ;2120138796 
    Time;Power 
    HH:mm;kW 
    00:10;0.000 
    00:20;0.000 
    00:30;0.000 
    00:40;0.000 
    00:50;0.000 
    01:00;0.000 
    01:10;0.000 
    01:20;0.000 
    01:30;0.000 
    01:40;0.000 
    01:50;0.000 
    02:00;0.000 
    02:10;0.000 
    02:20;0.000 
    02:30;0.000 
    02:40;0.000 
    02:50;0.000 
    03:00;0.000 
    03:10;0.000 
    03:20;0.000 
    03:30;0.000 
    03:40;0.000 
    03:50;0.000 
    04:00;0.000 
    04:10;0.000 
    04:20;0.000 
    04:30;0.000 
    04:40;0.000 
    04:50;0.006 
    05:00;0.024 
    05:10;0.006 
    05:20;0.000 
    05:30;0.030 
    05:40;0.036 
    05:50;0.042 
    06:00;0.042 
    06:10;0.042 
    06:20;0.048 
    06:30;0.060 
    06:40;0.114 
    06:50;0.132 
    07:00;0.150 

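I parse these files, first checking that the filename has this format, because the bucket folder holds other files that I don't want to parse, and I grab the date from line two of each file and store it. I connect to the database, then process the remaining lines, combining the stored date with the timestamp on each line after line 9 (or thereabouts). I also grab the second value on each line (the power, in kW). The aim is to insert the combined datetime value and its associated power value into the connected MySQL database. When the last line has been read, the file is moved to a subfolder named 'parsed'. All of this works as expected, except that every line read ends up in the except branch of the try/except block (line 107), the one that reports that it could not append to the DB. I have checked that the stored database credentials work by logging in to MySQL (actually MariaDB on OpenSuse LEAP 4.2), which succeeds, and I have printed the connection variable. Both lead me to believe that I am in fact connecting correctly for each file. I would have trimmed parts of my Python script to make it shorter, but I am not a particularly advanced Python coder and I didn't want to risk leaving out a crucial part: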

    #!/usr/bin/python

    from os import listdir 
    from datetime import datetime 
    import MySQLdb 
    import shutil 
    import syslog 
    #from sys import argv 


    def is_dated_csv(filename):
        """
        Return True if filename matches format YY-MM-DD.csv, otherwise False.
        """
        date_format = '%y-%m-%d.csv'

        try:
            date = datetime.strptime(filename, date_format)
            return True
        except ValueError:
            # filename did not match pattern
            syslog.syslog('SunnyData file ' + filename + ' did NOT match')
            #print filename + ' did NOT match'
            pass
        #'return' terminates a function
        return False


    def parse_for_date(filename):
        """
        Read file for the date - from line 2 field 10
        """
        currentFile = open(filename,'r')
        l1 = currentFile.readline() #ignore first line read
        date_line = currentFile.readline() #read second line
        dateLineArray = date_line.split("|")
        day_in_question = dateLineArray[-1] #save the last element (date)
        currentFile.close()
        return day_in_question


    def normalise_date_to_UTF(day_in_question):
        """
        Rather weirdly, some days use YYYY.MM.DD format & others use DD/MM/YYYY
        This function normalises either to UTC with a blank time (midnight)
        """
        if '.' in day_in_question: #it's YYYY.MM.DD
            dateArray = day_in_question.split(".")
            dt = (dateArray[0] + dateArray[1] + dateArray[2].rstrip() + '000000')
        elif '/' in day_in_question: #it's DD/MM/YYYY
            dateArray = day_in_question.split("/")
            dt = (dateArray[2].rstrip() + dateArray[1] + dateArray[0] + '000000')
        theDate = datetime.strptime(dt,'%Y%m%d%H%M%S')
        return theDate #A datetime object


    def parse_power_values(filename, theDate):
        currentFile = open(filename,'r')
        for i, line in enumerate(currentFile):
            if i <= 7:
                doingSomething = True
                print 'header' + str(i) + '/ ' + line.rstrip()
            elif ((i > 7) and (i <= 151)):
                lineParts = line.split(';')
                theTime = lineParts[0].split(':')
                theHour = theTime[0]
                theMin = theTime[1]
                timestamp = theDate.replace(hour=int(theHour),minute=int(theMin))
                power = lineParts[1].rstrip()
                if power == '-.---':
                    power = 0.000
                if (float(power) > 0):
                    print str(i) + '/ ' + str(timestamp) + ' power = ' + power + 'kWh'
                    append_to_database(timestamp,power)
                else:
                    print str(i) + '/ '
            elif i > 151:
                print str(timestamp) + ' DONE!'
                print '----------------------'
                break
        currentFile.close()

    def append_to_database(timestampval,powerval):
        host="localhost", # host
        user="removed", # username
        #passwd="******"
        passwd="removed"
        database_name = 'SunnyData'
        table_name = 'DTP'
        timestamp_column = 'DT'
        power_column = 'PWR'
        #sqlInsert = ("INSERT INTO %s (%s,%s) VALUES('%s','%s')" % (table_name, timestamp_column, power_column, timestampval.strftime('%Y-%m-%d %H:%M:%S'), powerval))
        #sqlCheck = ("SELECT TOP 1 %s.%s FROM %s WHERE %s.%s = %s;" % (table_name, timestamp_column, table_name, table_name, timestamp_column, timestampval.strftime('%Y-%m-%d %H:%M:%S')))
        sqlInsert = ("INSERT INTO %s (%s,%s) VALUES('%s','%s')", (table_name, timestamp_column, power_column, timestampval.strftime('%Y-%m-%d %H:%M:%S'), powerval))
        sqlCheck = ("SELECT TOP 1 %s.%s FROM %s WHERE %s.%s = %s;", (table_name, timestamp_column, table_name, table_name, timestamp_column, timestampval.strftime('%Y-%m-%d %H:%M:%S')))
        cur = SD.cursor()
        try:
            #cur.execute(sqlCheck)
            # Aim here is to see if the datetime for the file has an existing entry in the database_name
            # If it does, do nothing, otherwise add the values to the database
            cur.execute(sqlCheck)
            if cur.fetchone() == "None":
                cur.execute(sqlInsert)
                print ""
            SD.commit()
        except:
            print 'DB append failed!'
            syslog.syslog('SunnyData DB append failed')
            SD.rollback()

    # Main start of program
    path = '/home/greg/currentGenerated/SBEAM/'
    destination = path + '/parsed'
    syslog.syslog('parsing SunnyData CSVs started')
    for filename in listdir(path):
        print filename
        if is_dated_csv(filename):
            #connect and disconnect once per CSV file - wasteful to reconnect for every line in def append_to_database(...)
            SD = MySQLdb.connect(host="localhost", user="root",passwd="removed", db = 'SunnyData')
            print SD
            print filename + ' matched'
            day_in_question = parse_for_date(filename)
            print 'the date is ' + day_in_question
            theDate = normalise_date_to_UTF(day_in_question)
            parse_power_values(filename, theDate)
            SD.close()
            shutil.move(path + '/' + filename, destination)
            syslog.syslog('SunnyData file' + path + '/' + filename + 'parsed & moved to ' + destination)
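
For readers who only want the parsing idea from the script above, here is a compact sketch of its two core steps: reading the date from the header line (in either YYYY.MM.DD or DD/MM/YYYY form) and combining it with each HH:mm;kW row. It is written for Python 3, and the helper names `parse_header_date` and `parse_reading` are hypothetical, not part of the original script.

```python
from datetime import datetime


def parse_header_date(header_line):
    """Return the date found in the last '|'-separated field of the header line."""
    raw = header_line.split("|")[-1].strip()
    # The files are said to use either YYYY.MM.DD or DD/MM/YYYY, so try both.
    for date_format in ("%Y.%m.%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(raw, date_format)
        except ValueError:
            continue
    raise ValueError("unrecognised header date: %r" % raw)


def parse_reading(day, line):
    """Combine the file's date with one 'HH:mm;kW' row; return (datetime, power)."""
    time_part, power_part = line.strip().split(";")
    hour, minute = (int(piece) for piece in time_part.split(":"))
    # The original script treats the placeholder '-.---' as zero power.
    power = 0.0 if power_part == "-.---" else float(power_part)
    return day.replace(hour=hour, minute=minute), power


header = "Version CSV|Tool SunnyBeam11|Linebreaks CR/LF|Delimiter semicolon|Decimalpoint point|Precision 3|Language en-UK|TZO=0|DST|2012.06.21"
day = parse_header_date(header)
print(parse_reading(day, "04:50;0.006"))
```

Trying each known format with `strptime` avoids rebuilding the date string by hand, and an unexpected format fails loudly instead of leaving a variable unset.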

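It used to work, but it has been a long time since I last checked, and there have been a lot of updates since then. I worry that a regression may have changed something underneath my code. I just don't know how to get to the bottom of it.

Apologies that this is not a very clear and specific question, but if you can help me sort it out, it might still serve as a good example for others?

Thanks

Greg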

+1

Consider catching the exception itself: `except Exception as e: print(e)` instead of `except: print 'DB append failed!'`, because then you will get the actual MySQL/MariaDB or Python error message. – Parfait
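
A minimal sketch of that suggestion, in Python 3 syntax. The helper `try_append` and the stand-in `strict_execute` are hypothetical names used only for illustration. `strict_execute` merely imitates a cursor's `execute` method that insists on a string, so the example runs without a database.

```python
import logging

logger = logging.getLogger("sunnydata")


def try_append(execute, sql):
    """Call execute(sql); on failure return (and log) the real error text."""
    try:
        execute(sql)
        return None
    except Exception as e:
        # Report the exception type and message instead of a fixed string.
        message = "DB append failed: %s: %s" % (type(e).__name__, e)
        logger.error(message)
        return message


def strict_execute(sql):
    # Hypothetical stand-in for cursor.execute: it only accepts a string.
    if not isinstance(sql, str):
        raise TypeError("argument 1 must be string, not %s" % type(sql).__name__)


print(try_append(strict_execute, ("SELECT 1", ())))
```

A bare `except:` hides both the exception type and its message, and those are usually the quickest clue to what went wrong.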

+0

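I added your suggestion and now have much better insight. It turns out to be a TypeError: 'argument 1 must be string or read-only buffer, not tuple'. Now I just need to read up on how to deal with it :o) – Greg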
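
A likely reading of that TypeError: in the script, `sqlInsert` and `sqlCheck` are built with a comma between the SQL text and the values rather than the `%` operator, so each is a 2-tuple of (query, parameters), and the whole tuple is handed to `cur.execute`. The sketch below reproduces this with the standard-library `sqlite3` module standing in for MySQLdb so that it runs without a server. That substitution is an assumption for illustration only. Note that `sqlite3` uses `?` placeholders where MySQLdb uses `%s`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE DTP (DT TEXT, PWR REAL)")

# A comma (not %) follows the SQL text, so this is a tuple: (query, parameters).
sql_insert = ("INSERT INTO DTP (DT, PWR) VALUES (?, ?)",
              ("2012-06-21 04:50:00", 0.006))


def rejects_statement(cursor, statement):
    """Return True if cursor.execute(statement) raises TypeError."""
    try:
        cursor.execute(statement)
    except TypeError:
        return True
    return False


# Passing the tuple as a single argument fails: execute wants a string first.
tuple_rejected = rejects_statement(cur, sql_insert)

# Unpacking the tuple passes the query and its parameters as two arguments.
cur.execute(*sql_insert)
conn.commit()
rows = cur.execute("SELECT DT, PWR FROM DTP").fetchall()
print(tuple_rejected, rows)
```

Passing the values as a second argument also lets the driver do the quoting, so the placeholders should not be wrapped in quotes inside the SQL text.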

Answers

0

There is no SELECT TOP ... syntax in MySQL/MariaDB, so your script must be failing when it tries to execute sqlCheck.

It should be SELECT %s.%s FROM %s WHERE %s.%s = %s LIMIT 1
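
To illustrate, here is a hedged sketch of the check-then-insert step using LIMIT 1. It again uses `sqlite3` as a stand-in for MySQLdb (an assumption made so the example is runnable; with MySQLdb the placeholders would be `%s`). The function name `insert_if_missing` is hypothetical. Two details beyond the LIMIT clause: table and column names cannot be bound as parameters, so they are formatted into the SQL text from trusted constants, and `fetchone()` returns the object `None` when there is no row, so a comparison with the string "None" would never be true.

```python
import sqlite3

TABLE_NAME = "DTP"
TIMESTAMP_COLUMN = "DT"
POWER_COLUMN = "PWR"


def insert_if_missing(conn, timestamp_text, power):
    """Insert one reading unless its timestamp is already stored.

    Returns True if a row was inserted, False if it already existed.
    """
    cur = conn.cursor()
    # Identifiers come from trusted constants; only the value is a parameter.
    check_sql = "SELECT {col} FROM {tbl} WHERE {col} = ? LIMIT 1".format(
        col=TIMESTAMP_COLUMN, tbl=TABLE_NAME)
    cur.execute(check_sql, (timestamp_text,))
    if cur.fetchone() is not None:
        return False
    insert_sql = "INSERT INTO {tbl} ({ts}, {pw}) VALUES (?, ?)".format(
        tbl=TABLE_NAME, ts=TIMESTAMP_COLUMN, pw=POWER_COLUMN)
    cur.execute(insert_sql, (timestamp_text, float(power)))
    conn.commit()
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DTP (DT TEXT, PWR REAL)")
first = insert_if_missing(conn, "2012-06-21 04:50:00", "0.006")
second = insert_if_missing(conn, "2012-06-21 04:50:00", "0.006")
row_count = conn.execute("SELECT COUNT(*) FROM DTP").fetchone()[0]
print(first, second, row_count)
```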

+0

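I changed my version to yours, although it made no difference. But I'm sure you are right, so I will leave the edit in place until I can supply the SQL as a _string_ rather than a _tuple_ (who knew!?) – Greg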