2017-09-16 81 views
0

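TypeError: 'NoneType' object is not subscriptable, web scraping in Python

This code searches a web page for a movie and prints the first title from the search results.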

from urllib.request import urlopen
import urllib
from bs4 import BeautifulSoup
import requests
import pprint

def infopelicula(nombrepelicula):
    my_url = 'http://www.imdb.com/find?ref_=nv_sr_fn&q=' + nombrepelicula + '&s=tt'
    rprincipal = requests.get(my_url)
    soup = BeautifulSoup(rprincipal.content, 'html.parser')
    title = soup.findAll("td", class_="result_text")
    for name in title:
        titulo = name.parent.find("a", href=True)
        print (name.text)[0]  # line 14: this is the line that raises the TypeError

It works, but an error is raised when the title is printed. Here is an example:

>>> infopelicula("Harry Potter Chamber")
Harry Potter and the Chamber of Secrets (2002)
Traceback (most recent call last):
  File "<pyshell#49>", line 1, in <module>
    infopelicula("Harry Potter Chamber")
  File "xxxx", line 14, in infopelicula
    print (name.text)[0]
TypeError: 'NoneType' object is not subscriptable

Answers

0

How about this:

import requests
from bs4 import BeautifulSoup

def infopelicula():
    my_url = 'http://www.imdb.com/find?ref_=nv_sr_fn&q="Harry Potter Chamber"&s=tt'
    soup = BeautifulSoup(requests.get(my_url).text, 'lxml')  # needs the lxml package
    # Each search hit sits in a <td class="result_text"> cell.
    for name in soup.find_all("td", class_="result_text"):
        # First <a> that has text; string= is the current spelling of text=
        title = name.find_all("a", string=True)[0]
        print(title.text)

infopelicula()

Partial output:

Harry Potter and the Sorcerer's Stone 
Harry Potter and the Goblet of Fire 
Harry Potter and the Half-Blood Prince 
Harry Potter and the Deathly Hallows: Part 2 

To get only the first title:

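import requests
from bs4 import BeautifulSoup

def infopelicula():
    my_url = 'http://www.imdb.com/find?ref_=nv_sr_fn&q="Harry Potter Chamber"&s=tt'
    soup = BeautifulSoup(requests.get(my_url).text, 'lxml')  # needs the lxml package
    # [:1] keeps only the first result cell (or none if the search is empty).
    for name in soup.find_all("td", class_="result_text")[:1]:
        # First <a> that has text; string= is the current spelling of text=
        title = name.find_all("a", string=True)[0]
        print(title.text)

infopelicula()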

Output:

Harry Potter and the Chamber of Secrets 
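
The function above hard-codes the search term, while the question passed it in as an argument. A minimal sketch of building the same URL from a parameter, assuming the query keys shown in the question (`ref_`, `q`, `s`) are all that is needed; `build_search_url` and `BASE_URL` are names made up for this illustration:

```python
from urllib.parse import urlencode

BASE_URL = "http://www.imdb.com/find"  # endpoint taken from the question


def build_search_url(movie_name):
    """Return the title-search URL for movie_name with the query URL-encoded."""
    # Same query keys as in the question; urlencode escapes spaces and quotes.
    params = {"ref_": "nv_sr_fn", "q": movie_name, "s": "tt"}
    return BASE_URL + "?" + urlencode(params)


print(build_search_url("Harry Potter Chamber"))
```

The resulting string can be passed to requests.get() in place of my_url.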
2

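In Python 3.5, print is a function that returns None, which (as the error clearly says) cannot be subscripted. The statement print (name.text)[0] first prints name.text and then tries to take element [0] of the returned None.

Maybe you meant print(name.text[0])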
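
The failure can be reproduced without any scraping. A minimal sketch, assuming Python 3; the helper name `subscript_print_result` is made up for this illustration:

```python
import contextlib
import io


def subscript_print_result(text):
    """Mimic `print (text)[0]`: index the return value of print()."""
    # Swallow the printed text so that only the error is of interest here.
    with contextlib.redirect_stdout(io.StringIO()):
        result = print(text)  # print() always returns None
    try:
        return result[0]  # equivalent to (print(text))[0]
    except TypeError as exc:
        return str(exc)


message = subscript_print_result("Harry Potter and the Chamber of Secrets (2002)")
print(message)
```

The returned message is the same TypeError text as in the traceback from the question.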

+0

I think 'name.text' makes more sense. 'name.text[0]' would print the first letter of the name. – MrE

+0

I tried both: print(name.text[0]) prints nothing for me, and name.text prints all the titles, but I only want the first one. –

+0

Now I used name.text[1], and it prints the first letter of each title :/ –
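
The comments above mix up two kinds of indexing: indexing the text of one result picks a single character, while indexing the list of results picks a whole title. A sketch with made-up strings standing in for the name.text values; the leading space is an assumption that would explain why [0] appeared to print nothing and [1] printed a letter:

```python
# Hypothetical stand-ins for the name.text values of the result cells.
result_texts = [
    " Harry Potter and the Chamber of Secrets (2002) ",
    " Harry Potter and the Sorcerer's Stone (2001) ",
]

# Indexing each string yields one character per result.
char_zero = [text[0] for text in result_texts]  # the leading spaces
char_one = [text[1] for text in result_texts]   # the first letters

# Indexing the list yields the whole first result; strip() trims the padding.
first_title = result_texts[0].strip() if result_texts else None

print(char_zero, char_one, first_title)
```

In the scraper itself the equivalent would be to take element [0] of the list returned by findAll (after checking that it is not empty) rather than indexing name.text inside the loop.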