2011-01-08 73 views
8

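Python: convert JSON (returned by URL) to a list

I'm requesting YouTube search terms for use with jQuery's autocomplete, but I'm having a hard time converting the URL's response into the proper format.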

In my (Django/Python) view I do:

data2 = urllib2.urlopen('http://suggestqueries.google.com/complete/search?hl=en&ds=yt&client=youtube&hjson=t&jsonp=window.yt.www.suggest.handleResponse&q=jum&cp=3') 

(I hard-coded the search term, 'jum', for simplicity.)

If I do data2.read() I get what I believe is JSON. (Copy-pasting the URL into a browser returns the same thing.)

window.yt.www.suggest.handleResponse(["jum",[["jumpstyle","","0"],["jump","","1"],["jump around","","2"],["jump on it","","3"],["jumper","","4"],["jump around house of pain","","5"],["jumper third eye blind","","6"],["jumbafund","","7"],["jump then fall taylor swift","","8"],["jumpstyle music","","9"]],"","","","","",{}]) 

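I need to return this in a format that jQuery's autocomplete can read. I know it will work if I can get it into a list, e.g. mylist = ['jumpstyle', 'jump', 'jump around', ...]

and then convert it back to JSON before returning: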

json.dumps(mylist) 

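(This works if I define mylist directly as above.)

But I can't get the data returned by the URL into either a simple list (which I then convert back to JSON) or some form of JSON that I can return directly for autocomplete to use.

I've tried, among other things,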

j2 = json.loads(data2) 

j2 = json.loads(data2.read()) 

Hope someone can help!

Answers

13

Remove the &jsonp=window.yt.www.suggest.handleResponse part:

import json 
import urllib2 

data = urllib2.urlopen('http://suggestqueries.google.com/complete/search?hl=en&ds=yt&client=youtube&hjson=t&q=jum&cp=3') 

response = json.load(data)
# response[1] holds [term, '', rank] triples; keep only the term.
terms = [entry[0] for entry in response[1]]
result = json.dumps(terms)
0

It's not JSON, it's JavaScript. If you want to use it as JSON you have to strip the JavaScript part:

j2 = json.loads(data2.read()[37:-1]) 

However, you can change the URL (remove the "jsonp=window.yt.www.suggest.handleResponse" part) to get pure JSON output:

>>> data2 = urllib2.urlopen('http://suggestqueries.google.com/complete/search?hl=en&ds=yt&client=youtube&hjson=t&q=jum&cp=3') 
>>> json.loads(data2.read()) 
[u'jum', [[u'jumpstyle', '', u'0'], [u'jump', '', u'1'], [u'jump around', '', u'2'], [u'jump on it', '', u'3'], [u'jumper', '', u'4'], [u'jump around house of pain', '', u'5'], [u'jumper third eye blind', '', u'6'], [u'jumbafund', '', u'7'], [u'jump then fall taylor swift', '', u'8'], [u'jumpstyle music', '', u'9']], '', '', '', '', '', {}] 
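The question asks for a flat list of terms. Given the structure printed above, where index 1 holds [term, '', rank] triples, one way to get there is sketched below. The variable names are illustrative, and the data is a shortened copy of that output so no network call is needed:

```python
import json

# Shortened copy of the parsed response shown above.
parsed = ['jum', [['jumpstyle', '', '0'], ['jump', '', '1'],
                  ['jump around', '', '2']], '', '', '', '', '', {}]

# Index 1 holds the suggestion triples; keep the first field of each.
mylist = [entry[0] for entry in parsed[1]]

# Serialize the flat list for the autocomplete widget.
payload = json.dumps(mylist)
print(payload)  # ["jumpstyle", "jump", "jump around"]
```

The resulting string is what the view would return in place of the hand-written mylist from the question.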
0

The output from the page is not properly JSON-encoded data. You need to remove the JS function call that wraps it.

Do this:

import urllib2 
import re 
import json 

data2 = urllib2.urlopen('http://suggestqueries.google.com/complete/search?' +  
    'hl=en&ds=yt&client=youtube&hjson=t&jsonp=window.yt.' + 
    'www.suggest.handleResponse&q=jum&cp=3') 

data = re.compile('^[^\(]+\(|\)$').sub('', data2.read()) 
parsedData = json.loads(data) 

parsedData is now a Python list.
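If a regular expression feels heavy for this, the same wrapper removal can probably be done by cutting at the first '(' and the last ')'. This is only a sketch: strip_jsonp is a hypothetical helper name, the sample is a shortened copy of the response from the question, and it assumes the response has already been read into a text string.

```python
import json


def strip_jsonp(text):
    """Return the payload between the first '(' and the last ')'.

    Hypothetical helper, not part of any library. Expects a text string,
    so bytes read from a socket would need decoding first.
    """
    start = text.index('(') + 1
    end = text.rindex(')')
    return text[start:end]


# Shortened copy of the wrapped response quoted in the question.
sample = ('window.yt.www.suggest.handleResponse('
          '["jum",[["jumpstyle","","0"],["jump","","1"]],"","","","","",{}]) ')

parsed = json.loads(strip_jsonp(sample))
print(parsed[0])  # jum
```

Unlike a fixed slice offset, this does not depend on the length of the callback name, though it would still misbehave on a response that is not wrapped at all.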

3

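You're making a JSON-P request, which automatically wraps the JSON in a JavaScript callback function, and that is in fact what you asked for :)

Strip the JSON-P parameter from your request and you'll get plain JSON back directly, without having to do anything extra in Python at all.

This should be your request: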

http://suggestqueries.google.com/complete/search?hl=en&ds=yt&client=youtube&hjson=t&q=jum&cp=3 

This returns:

["jum",[["jumpstyle","","0"],["jump","","1"],["jump around","","2"],["jump on it","","3"],["jumper","","4"],["jump around house of pain","","5"],["jumper third eye blind","","6"],["jumbafund","","7"],["jump then fall taylor swift","","8"],["jumpstyle music","","9"]],"","","","","",{}] 
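Since the question hard-codes the search term, here is a rough sketch of how the plain-JSON request URL might be assembled for an arbitrary term. build_suggest_url is a hypothetical helper, and setting cp to the length of the term is only a guess based on the q=jum&cp=3 pair in the original URL; the thread does not say what cp means.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2

BASE_URL = 'http://suggestqueries.google.com/complete/search'


def build_suggest_url(term):
    """Hypothetical helper: build the request URL without the jsonp parameter."""
    params = [
        ('hl', 'en'),
        ('ds', 'yt'),
        ('client', 'youtube'),
        ('hjson', 't'),
        ('q', term),
        # Assumption: cp tracks the length of the term, as in q=jum&cp=3.
        ('cp', len(term)),
    ]
    # urlencode escapes the term, so spaces and '&' in user input are safe.
    return BASE_URL + '?' + urlencode(params)


print(build_suggest_url('jum'))
```

For 'jum' this reproduces the URL given above; passing the term through urlencode also keeps user input from breaking the query string.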
+0

Ah... how do I get it to look like, e.g., ["jumpstyle", "jump", "jump around" ...]? I'm not sure how to manipulate this data: what is it? A list, a string, a JSON object? I tried json.loads(returned_by_url), but got an error. – dkgirl 2011-01-08 14:41:41

+0

Xavier seems to have covered it :) – 2011-01-08 16:14:11