2016-05-04 36 views

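python: geocoding with a DataFrame using multiprocessing

I want to return the zip codes as a column in a DataFrame. This code works, but it does not create a new column in the DataFrame gps.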

import geocoder 
import multiprocessing as mp 
import pandas as pd 

google_key = 'key' 

def reverse_gecode(coordinates): 
    return geocoder.google(coordinates, key = google_key, method = 'reverse').postal 

if __name__ == '__main__':    
    gps = pd.DataFrame({'lat': [27.950575, 40.6936488], 
         'lon': [-82.4571776, -89.5889864]}) # dataframe method 
    gps['gps'] = list(zip(gps.lat, gps.lon)) # list() so this also works on Python 3 
    x = list(gps['gps']) 
    # multiprocessing 
    pool = mp.Pool(processes = (mp.cpu_count() - 1)) 
    result_latlong = pool.map(reverse_gecode, x) 
    pool.close() 
    pool.join() 

I have already tried:

  1. gps['zip_code'] = gps.apply(lambda x: pool.map(reverse_gecode, list(x[2])), axis = 1)
  2. gps['zip_code'] = gps.apply(lambda x: pool.map(reverse_gecode, x[2]), axis = 1)
  3. gps['zip_code'] = gps.apply(lambda x: pool.map(reverse_gecode, [x[0], x[1]]), axis = 1)

But I can't get any of them to work. The error I keep getting is:

ValueError: ('Unknown location: 27.950575', u'occurred at index 0')

Answers


Try this:

import geocoder 
import multiprocessing as mp 
import pandas as pd 

def reverse_gecode(coordinates): 
    return geocoder.google(coordinates, method = 'reverse').postal 

if __name__ == '__main__':    
    gps = pd.DataFrame({'lat': [27.950575, 40.6936488], 
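         'lon': [-82.4571776, -89.5889864]}) # dataframe method 
    # build a list of (lat, lon) string tuples, one per row
    coords = gps[['lat','lon']].astype(str).apply(lambda row: (row['lat'], row['lon']), axis=1).tolist() 
    # multiprocessing: keep at least one worker even on a single-core machine
    pool = mp.Pool(processes = max(1, mp.cpu_count() - 1)) 
    # pool.map returns results in input order, so they line up with the rows
    gps['zip_code'] = pool.map(reverse_gecode, coords) 
    print(gps) 
    pool.close() 
    pool.join() 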

PS: I removed key=google_key from the geocoder.google() call because it did not work for me.

Output:

         lat        lon zip_code
0  27.950575 -82.457178    33602
1  40.693649 -89.588986    61603
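
As for why the three attempts in the question raise 'Unknown location: 27.950575': pool.map calls the function once for each element of its second argument. Handing it a single (lat, lon) tuple therefore produces one call per float instead of one call per pair. Below is a minimal sketch of that behaviour. It assumes a hypothetical stand-in function, fake_reverse_geocode, that makes no network request, and it uses multiprocessing.dummy.Pool (a thread-backed pool with the same map interface) so that it runs anywhere without pickling concerns.

```python
from multiprocessing.dummy import Pool  # thread-backed pool, same map() interface as mp.Pool


def fake_reverse_geocode(coordinates):
    """Hypothetical stand-in for reverse_gecode: no network, just reports its argument."""
    return "called with {!r}".format(coordinates)


coords = [(27.950575, -82.4571776), (40.6936488, -89.5889864)]

pool = Pool(2)
# Mapping over the list of tuples: one call per (lat, lon) pair.
per_pair = pool.map(fake_reverse_geocode, coords)
# Mapping over a single tuple: one call per float, so each call sees
# only a latitude or only a longitude.
per_float = pool.map(fake_reverse_geocode, coords[0])
pool.close()
pool.join()

print(per_pair)
print(per_float)
```

This is why the mapping should be done once over the whole list of coordinate tuples, with the result assigned to the column, rather than inside gps.apply row by row.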

@dustin, I have updated my answer - please check. – MaxU


@dustin, did you try running my code 'as is'? – MaxU


I have just checked it - it works with Python 3.5.1 and 2.7.11 (pandas: 0.18.0) - which versions do you have? – MaxU
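
To compare environments, something like the following sketch could be used to print the interpreter and package versions. report_versions is a hypothetical helper name, not part of pandas or geocoder, and it assumes each package exposes a __version__ attribute (it falls back to "unknown" if not).

```python
import importlib
import platform


def report_versions(packages=("pandas", "geocoder")):
    """Return a dict mapping 'python' and each package name to a version string."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            module = importlib.import_module(name)
            # Assumption: the package exposes __version__; otherwise report "unknown".
            versions[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            versions[name] = "not installed"
    return versions


print(report_versions())
```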