2017-05-08
url = "https://technet.microsoft.com/en-us/library/hh135098(v=exchg.150).aspx" 
r = requests.get(url) 
soup = BeautifulSoup(r.content, 'lxml') 
table = soup.find_all('table', attrs={"responsive": "true"})[0] 
for rows in table.find_all('tr')[1]: 
    item = [] 
    for val in rows.find('td'): 
     item.append(val.text.strip()) 
     print (item) 

Python web scraping: TypeError: 'int' object is not iterable

Traceback (most recent call last): 
    File "<stdin>", line 3, in <module> 
TypeError: 'int' object is not iterable

Line 4 refers to for val in rows.find('td'):

What does rows.find('td') return? – Mel

Line 4 or line 3? for val in rows.find('td'): is line 7 in the code above. If you include the *full* traceback, it will actually show you the offending line. – SiHa

Answer

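In for val in rows.find('td'):, when no td is found in rows, find returns -1, an int object, which you are then trying to loop over; hence the error. A -1 is what the string method find gives back, which suggests rows is a text node here: table.find_all('tr')[1] selects a single row, so the outer loop walks that row's children (including any whitespace between cells) rather than the table's rows.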
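
To see where the int comes from, here is a minimal sketch using only plain Python strings. It assumes, as the traceback suggests, that the object the inner loop calls find on is a piece of text rather than a tag; the name text_node is made up for illustration.

```python
# Hypothetical stand-in for a whitespace text node sitting between two <td> tags.
text_node = "\n"

# On a string, find() is the ordinary str method: it returns an index,
# or -1 when the substring is absent.
result = text_node.find("td")
print(result)  # -1

# Looping over that int reproduces the error from the traceback.
try:
    for val in result:
        pass
except TypeError as exc:
    message = str(exc)
    print(message)  # 'int' object is not iterable
```

Calling find_all on each row of the table avoids this, because the loop then only ever sees tags.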

The correct approach:

>>> for rows in table.find_all('tr'):
...     item = []
...     for val in rows.find_all('td'):
...         item.append(val.text.strip())
...     print(item)
...
[] 
['Exchange Server 2016 CU5', 'March 21, 2017', '15.01.0845.034'] 
['Exchange Server 2016 CU4', 'December 13, 2016', '15.01.0669.032'] 
['Exchange Server 2016 CU3', 'September 20, 2016', '15.01.0544.027'] 
['Exchange Server 2016 CU2', 'June 21, 2016', '15.01.0466.034'] 
['Exchange Server 2016CU1', 'March 15, 2016', '15.01.0396.030'] 
['Exchange Server 2016 RTM', 'October 1, 2015', '15.01.0225.042'] 
['Exchange Server 2016 Preview', 'July 22, 2015', '15.01.0225.016'] 
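
The same loop can be tried without hitting the network. The sketch below assumes beautifulsoup4 is installed and uses Python's built-in html.parser instead of lxml. The sample HTML, its header cells, and the function name extract_rows are invented for illustration; the two data rows are copied from the output above.

```python
from bs4 import BeautifulSoup

# Made-up sample markup; the real page's structure is assumed, not verified.
SAMPLE_HTML = """
<table responsive="true">
  <tr><th>Product name</th><th>Release date</th><th>Build number</th></tr>
  <tr><td>Exchange Server 2016 CU5</td><td>March 21, 2017</td><td>15.01.0845.034</td></tr>
  <tr><td>Exchange Server 2016 CU4</td><td>December 13, 2016</td><td>15.01.0669.032</td></tr>
</table>
"""


def extract_rows(html):
    """Return one list of stripped cell texts per <tr> that contains <td> cells."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", attrs={"responsive": "true"})
    rows = []
    for tr in table.find_all("tr"):
        # find_all always returns a list of tags (possibly empty), so it is safe to loop over.
        cells = [td.text.strip() for td in tr.find_all("td")]
        if cells:  # skip the header row, which has only <th> cells
            rows.append(cells)
    return rows


rows = extract_rows(SAMPLE_HTML)
for row in rows:
    print(row)
```

Skipping rows with no td cells drops the empty list that the header row produces in the session above.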

I can be such a dummy sometimes. Thank you, sir!
