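Freebase + Google API query returns an error when the query results exceed a certain number

2013-02-26

I am trying to query Freebase for all US counties and their geolocation (longitude and latitude). Sometimes the query works, but on other attempts it returns the following: <HttpError 503 when requesting returned "Backend Error">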

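I tried changing the result limit of the query, and found that the limit at which the query breaks varies: sometimes it works with "limit": 2900, and sometimes it returns the error above with "limit": 1200. Here is the code I have written so far: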

 

    import json

    import matplotlib.pyplot as plt  # required by plt.scatter below
    from apiclient import discovery
    from apiclient import model
    from pandas import DataFrame

    from CREDENTIALS import FREEBASE_KEY

    DEVELOPER_KEY = FREEBASE_KEY

    model.JsonModel.alt_param = ""
    freebase = discovery.build('freebase', 'v1', developerKey=DEVELOPER_KEY)

    # MQL query asking for the FIPS code and geolocation of each topic.
    # The "limit" value at which the request starts failing varies between runs.
    query_json = """
    [{
      "id": null,
      "name": null,
      "/location/us_county/fips_6_4_code": [],
      "/location/location/geolocation": {
        "latitude": null,
        "longitude": null
      },
      "limit": 3050
    }]"""

    query = json.loads(query_json)

    response = json.loads(freebase.mqlread(query=json.dumps(query)).execute())

    # Flatten each nested result into one row per county.
    results = []
    for result in response['result']:
        geolocation = result['/location/location/geolocation']
        results.append({'id': result['id'],
                        'name': result['name'],
                        'latitude': float(geolocation['latitude']),
                        'longitude': float(geolocation['longitude']),
                        'fips': result['/location/us_county/fips_6_4_code'],
                        })

    states = DataFrame(results)
    plt.scatter(states["longitude"], states["latitude"])

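This does not appear to be a quota problem. Others have reported a similar issue on the Freebase mailing list: http://lists.freebase.com/pipermail/freebase-discuss/2011-December/007710.html However, that thread concerns a different type of data, so their solution does not seem to apply to what I am working on.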


[Edit] I used a cursor to iterate through the data, and it works well. Here is the final code I used:

 

    import json

    import matplotlib.pyplot as plt  # required by plt.scatter below
    from apiclient import discovery
    from apiclient import model
    from pandas import DataFrame

    from CREDENTIALS import FREEBASE_KEY

    DEVELOPER_KEY = FREEBASE_KEY

    model.JsonModel.alt_param = ""
    freebase = discovery.build('freebase', 'v1', developerKey=DEVELOPER_KEY)

    # No "limit" clause: each request returns one page of results plus a cursor.
    query = [{
        "id": None,
        "name": None,
        "type": "/location/us_county",
        "/location/location/geolocation": {
            "latitude": None,
            "longitude": None
        }
    }]

    results = []
    count = 0

    def do_query(cursor=""):
        """Fetch one page, append its rows to results, and return the next cursor."""
        response = json.loads(freebase.mqlread(query=json.dumps(query), cursor=cursor).execute())
        for result in response['result']:
            geolocation = result['/location/location/geolocation']
            results.append({'id': result['id'],
                            'name': result['name'],
                            'latitude': geolocation['latitude'],
                            'longitude': geolocation['longitude'],
                            })
        return response.get("cursor")

    # Keep requesting pages until the returned cursor is falsy.
    cursor = do_query()
    while cursor:
        cursor = do_query(cursor)
        # Check how many iterations this loop has gone through.
        # print(count)
        count += 1

    # Plug results into a pandas DataFrame and plot.
    states = DataFrame(results)
    plt.scatter(states["longitude"], states["latitude"])
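
The row-building step can also be pulled out of `do_query` so that it can be checked without calling the API. The following is a minimal sketch, assuming each result has the shape requested by the query above. `flatten_county` is a hypothetical helper name, and the sample record is made-up data for illustration, not actual Freebase output.

```python
GEO_KEY = '/location/location/geolocation'


def flatten_county(result):
    """Turn one nested MQL result into a flat dict usable as a DataFrame row."""
    geo = result[GEO_KEY]
    return {
        'id': result['id'],
        'name': result['name'],
        # Convert to float so the values can be plotted directly.
        'latitude': float(geo['latitude']),
        'longitude': float(geo['longitude']),
    }


# Made-up record with the same shape as the query; values are illustrative only.
sample = {
    'id': '/en/example_county',
    'name': 'Example County',
    GEO_KEY: {'latitude': 40.5, 'longitude': -75.25},
}

row = flatten_county(sample)
print(row)
```

Inside `do_query`, the body of the loop would then reduce to `results.append(flatten_county(result))`.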

Answers

2

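It is a relatively simple query, but to put things in perspective, the default limit is 100, which is far lower than what you are asking for. I would suggest using a lower limit and a cursor to page through the results (and filing a bug report, because it should not return a generic "Backend Error" but some kind of MQL-specific error).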

0

Here is some sample code showing how to iterate through the results with a cursor:

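    cursor = ''  # an empty cursor requests the first page
    while cursor != False:
        response = json.loads(freebase.mqlread(query=json.dumps(query), cursor=cursor).execute())
        for county in response['result']:
            print(county['name'])
        # The cursor comes back as False once the last page has been returned.
        cursor = response['cursor']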

Just leave the limit clause out of the query, and it will iterate through the entire list of counties in batches of 100 results.
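
The cursor loop can be factored into a generator that does not depend on the HTTP client, which makes the paging logic easy to try offline. This is only a sketch under stated assumptions: `iter_results`, `fetch_page`, and `fake_fetch` are hypothetical names, not part of the Freebase client, and `fake_fetch` stands in for the `freebase.mqlread(...).execute()` call. It assumes each response carries a `result` list and a `cursor` that is `False` on the last page.

```python
def iter_results(fetch_page):
    """Yield every result, following cursors until no further page is offered.

    fetch_page is a hypothetical callable: it takes a cursor string and
    returns a dict with a 'result' list and a 'cursor' value.
    """
    cursor = ''  # an empty cursor asks for the first page
    while cursor is not False:
        response = fetch_page(cursor)
        for item in response['result']:
            yield item
        # A missing or falsy cursor is treated as "no more pages".
        cursor = response.get('cursor') or False


# Offline stand-in for the real request: 250 fake counties served in
# batches of 100, mimicking the default page size mentioned above.
FAKE_ROWS = [{'name': 'County %d' % i} for i in range(250)]


def fake_fetch(cursor, page_size=100):
    start = int(cursor) if cursor else 0
    end = start + page_size
    next_cursor = str(end) if end < len(FAKE_ROWS) else False
    return {'result': FAKE_ROWS[start:end], 'cursor': next_cursor}


counties = list(iter_results(fake_fetch))
print(len(counties))
```

With the real client, `fetch_page` would wrap the `mqlread` call and pass the cursor through unchanged.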

+0

Thanks, everyone. I used a cursor, and voilà! Everything works as it should. – 2013-02-27 18:57:33