2017-05-30 55 views

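Memory error when retrieving data from Songkick

I have built a scraper that uses the Songkick API to retrieve concert data. However, retrieving all the data for these artists takes a long time. After about 15 hours the script was still running, but the JSON file was no longer changing. I interrupted the script and checked whether I could access my data with TinyDB. Unfortunately, I got the following error. Does anyone know why this happens?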

Error:

('cannot fetch url', 'http://api.songkick.com/api/3.0/artists/8689004/gigography.json?apikey=###########&min_date=2015-04-25&max_date=2017-03-01') 
8961344 


Traceback (most recent call last): 
    File "C:\Users\rmlj\Dropbox\Data\concerts.py", line 42, in <module> 
    load_events() 
    File "C:\Users\rmlj\Dropbox\Data\concerts.py", line 27, in load_events 
    print(artist) 
    File "C:\Python27\lib\idlelib\PyShell.py", line 1356, in write 
    return self.shell.write(s, self.tags) 
KeyboardInterrupt 

>>> mydat = db.all() 

Traceback (most recent call last): 
    File "<pyshell#0>", line 1, in <module> 
    mydat = db.all() 
    File "C:\Python27\lib\site-packages\tinydb\database.py", line 304, in all 
    return list(itervalues(self._read())) 
    File "C:\Python27\lib\site-packages\tinydb\database.py", line 277, in _read 
    return self._storage.read() 
    File "C:\Python27\lib\site-packages\tinydb\database.py", line 31, in read 
    raw_data = (self._storage.read() or {})[self._table_name] 
    File "C:\Python27\lib\site-packages\tinydb\storages.py", line 105, in read 
    return json.load(self._handle) 
    File "C:\Python27\lib\json\__init__.py", line 287, in load 
    return loads(fp.read(), 
MemoryError 
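
The last frames of the traceback show `json.load` calling `fp.read()`, which pulls the whole file into memory before parsing it. A quick look at the file size and at whether the interpreter is a 32-bit build may help judge whether that alone could explain the `MemoryError`. The following is only a diagnostic sketch under assumptions: `describe_environment` is a hypothetical helper name, and `events.json` is the path used by the script.

```python
import os
import sys


def describe_environment(path):
    """Return the size of `path` in MiB and whether Python is a 64-bit build."""
    size_mib = os.path.getsize(path) / (1024.0 * 1024.0)
    # sys.maxsize is 2**31 - 1 on 32-bit builds and 2**63 - 1 on 64-bit builds.
    is_64bit = sys.maxsize > 2 ** 32
    return {"size_mib": size_mib, "is_64bit": is_64bit}


if __name__ == "__main__":
    # Assumption: events.json sits in the current working directory.
    if os.path.exists("events.json"):
        print(describe_environment("events.json"))
```

A 32-bit process can typically address only about 2 GB, and the parsed Python objects can take several times the size of the file on disk, so a file of a few hundred megabytes may already be too large to load in one go.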

Below you can find my script:

import urllib2
import requests
import json
import csv
import codecs

from tinydb import TinyDB, Query

db = TinyDB('events.json')


def load_events():
    MIN_DATE = "2015-04-25"
    MAX_DATE = "2017-03-01"
    API_KEY = "###############"
    # artistid.txt holds one Songkick artist id per line
    with open('artistid.txt', 'r') as f:
        for a in f:
            artist = a.strip()
            print(artist)
            url_base = 'http://api.songkick.com/api/3.0/artists/{}/gigography.json?apikey={}&min_date={}&max_date={}'
            url = url_base.format(artist, API_KEY, MIN_DATE, MAX_DATE)
            # url = u'http://api.songkick.com/api/3.0/search/artists.json?query='+artist+'&apikey=WBmvXDarTCEfqq7h'
            try:
                r = requests.get(url)
                resp = r.json()
                # only read the event list when the artist has entries
                if resp['resultsPage']['totalEntries']:
                    results = resp['resultsPage']['results']['event']
                    for x in results:
                        print(x)
                        db.insert(x)
            except Exception:
                # a bare `except:` would also swallow KeyboardInterrupt
                print('cannot fetch url', url)


load_events()
db.close()
print("End of script")
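
One way to avoid re-reading a single, ever-growing JSON document is to append each event as one line of JSON (the JSON Lines convention) and to read the file back line by line. This is a sketch of a swapped-in storage approach, not TinyDB's API: `extract_events`, `append_events`, `iter_events` and the file name `events.jsonl` are hypothetical names, and the response layout is assumed to match the `resultsPage` keys used in the script.

```python
import json


def extract_events(resp):
    """Return the event list from a decoded gigography response, or [] if absent."""
    page = resp.get("resultsPage", {})
    if not page.get("totalEntries"):
        return []
    # Assumption: "results" may be empty or lack "event" for some artists.
    return page.get("results", {}).get("event", [])


def append_events(path, events):
    """Append each event as one JSON document per line; return how many were written."""
    count = 0
    with open(path, "a") as handle:
        for event in events:
            # json.dumps escapes newlines, so every event stays on a single line.
            handle.write(json.dumps(event) + "\n")
            count += 1
    return count


def iter_events(path):
    """Yield events one at a time so the whole file is never held in memory."""
    with open(path, "r") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)
```

In the script, the loop around `db.insert(x)` could then become `append_events('events.jsonl', extract_events(resp))`, and later analysis could loop over `iter_events('events.jsonl')` instead of calling `db.all()`. The trade-off is that TinyDB's query features are no longer available on that file.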

, but it is visible in the first line of your error. – alxwrd

Answers
