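Print all occurrences of certain document elements of a webpage

I am scraping this particular webpage, https://www.zomato.com/srijata, for all the "restaurant reviews" posted by the user "Sri" (not the self-comments she left on her own reviews).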
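import urllib2
from bs4 import BeautifulSoup  # assumes the bs4 package (Beautiful Soup 4)
# Fetch the page once and keep a local copy so the site is not hit repeatedly
zomato_ind = urllib2.urlopen('https://www.zomato.com/srijata')
zomato_info = zomato_ind.read()
with open('zomato_info.html', 'w') as f:
    f.write(zomato_info)
soup = BeautifulSoup(open('zomato_info.html'), 'html.parser')
# find() returns only the first matching element
soup.find('div', 'mtop0 rev-text').text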
This prints her first restaurant review, i.e. "Sri reviewed Big Straw - Chew on this", as:
u'Rated This is situated right in the heart of the city. The items on the menu are alright and I really had to compromise for bubble tea. The tapioca was not fresh. But the latte and the soda pop my friends tried was good. Another issue which I faced was mosquitos... They almost had me.. Lol..'
I also tried another selector:
My question is:
How do I print the next restaurant review? I tried findNextSiblings and similar methods, but none of them seemed to work.
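
One approach that might help here is to collect every matching element with find_all instead of walking siblings. The sketch below is written for Python 3 with bs4 and runs against a small hypothetical HTML snippet, not the real Zomato page, so the wrapper class "res-review" and the review texts are assumptions made for illustration. If each review's div sits inside its own container, as assumed here, the review divs are not siblings of each other, which may be why findNextSiblings returned nothing.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in markup; the real page structure may differ.
SAMPLE_HTML = """
<div class="res-review">
  <div class="mtop0 rev-text">Rated First review text.</div>
</div>
<div class="res-review">
  <div class="mtop0 rev-text">Rated Second review text.</div>
</div>
"""


def extract_reviews(html):
    """Return the text of every div that carries the 'rev-text' class."""
    soup = BeautifulSoup(html, "html.parser")
    # class_="rev-text" matches any div whose class list includes rev-text,
    # so it also matches divs written as class="mtop0 rev-text".
    blocks = soup.find_all("div", class_="rev-text")
    return [block.get_text(strip=True) for block in blocks]


reviews = extract_reviews(SAMPLE_HTML)
for review in reviews:
    print(review)
```

With the saved file from the question, the same function could be called as extract_reviews(open('zomato_info.html').read()), assuming the reviews are present in the static HTML rather than loaded later by JavaScript.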
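Why save the HTML to a file and then read that file into the soup object? – 2014-10-01 12:22:02
It is a precaution I took to avoid hitting the site repeatedly, so as to respect its safeguards against scraping! – shalini 2014-10-02 05:41:56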
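
The caching idea from the comment above could be sketched roughly as follows, in Python 3. The helper name fetch_cached and its fetch parameter are hypothetical, introduced only for illustration; the demo uses a stub instead of a real download so that it never touches the site.

```python
import os
import tempfile
from urllib.request import urlopen


def fetch_cached(url, cache_path, fetch=None):
    """Return the page bytes, downloading only when no cached copy exists."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return f.read()
    if fetch is None:
        # Default downloader; only reached when there is no cached file.
        fetch = lambda u: urlopen(u).read()
    data = fetch(url)
    with open(cache_path, "wb") as f:
        f.write(data)
    return data


# Demo with a stub downloader that records how often it is called.
calls = []


def fake_fetch(url):
    calls.append(url)
    return b"<html>cached page</html>"


cache_file = os.path.join(tempfile.mkdtemp(), "zomato_info.html")
first = fetch_cached("https://www.zomato.com/srijata", cache_file, fetch=fake_fetch)
second = fetch_cached("https://www.zomato.com/srijata", cache_file, fetch=fake_fetch)
```

The second call reads from disk, so the stub is invoked only once. The returned bytes can be passed straight to BeautifulSoup for parsing.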