Getting a product name by scraping

Below is my code. From the URL I want to get the product name "RENU FRESH LENS SOLUTION 120 ML", which sits inside a p tag. I only need this name.

import requests 
import lxml 
from bs4 import BeautifulSoup 

url = "http://www.lenskart.com/renu-fresh-lens-solution-100-ml.html" 

source = requests.get(url) 
data = source.content 
soup = BeautifulSoup(data, "lxml") 

pn = soup.find_all("div", {"class":"prcdt-overview"})[0].text 
print(pn)

2016-12-26 Nitin

OK. Do you have a question as well? – DeepSpace

What problem are you facing? – saeleko

I want to get only the product name "RENU FRESH LENS SOLUTION 120 ML", not the whole content – Nitin

import requests 
from bs4 import BeautifulSoup 

url = "http://www.lenskart.com/renu-fresh-lens-solution-100-ml.html" 

source = requests.get(url) 
# No intermediate `data` variable is needed: pass source.content straight to BeautifulSoup()
soup = BeautifulSoup(source.content, "lxml")

find() version:

pn = soup.find('div', class_="prcdt-overview").p.text

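You don't need to import lxml yourself; BeautifulSoup loads it for you when you pass "lxml" as the parser name (the package still has to be installed).
If you only need the first tag from find_all(), use find() instead; it returns the first tag that find_all() would match.
You can chain tag.tag.find()/find_all() to reach a tag step by step.
tag.tag_name is shorthand for tag.find('tag_name').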
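
To illustrate the notes above, here is a minimal sketch showing that find(), find_all()[0] and the attribute shorthand reach the same tag. The HTML string is hypothetical markup standing in for the product page (the real page may be structured differently), and "html.parser" is used only so the sketch runs without lxml installed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the product page; the real page may differ.
html = """
<div class="prcdt-overview">
  <p>RENU FRESH LENS SOLUTION 120 ML</p>
  <p>Other overview text</p>
</div>
"""

# "html.parser" ships with Python, so no extra parser package is required here.
soup = BeautifulSoup(html, "html.parser")

overview = soup.find("div", class_="prcdt-overview")

first_via_find = overview.find("p")             # first matching tag, or None
first_via_find_all = overview.find_all("p")[0]  # same tag, taken from the full list
first_via_attribute = overview.p                # shorthand for overview.find("p")

print(first_via_find.text)
```

All three names refer to the first p tag, so which form to use is mostly a matter of readability.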

CSS selector version, which is shorter:

soup.select_one(".prcdt-overview p").text

select_one() returns the first tag that select() would match, just as find() relates to find_all().
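
A small sketch of that relationship, again on hypothetical markup rather than the real page, and assuming "html.parser" as the parser:

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the real product page may be structured differently.
html = """
<div class="prcdt-overview">
  <p>RENU FRESH LENS SOLUTION 120 ML</p>
  <p>Other overview text</p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

all_paragraphs = soup.select(".prcdt-overview p")       # list of every match
first_paragraph = soup.select_one(".prcdt-overview p")  # first match only

# When nothing matches, select_one() returns None and select() returns an empty list.
missing = soup.select_one(".no-such-class p")

print(first_paragraph.text)
```

Because select_one() can return None, it is worth checking the result before reading .text on a page whose layout you do not control.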

2016-12-26 10:35:21

Try this:

pn = soup.select(".prcdt-overview h1[itemprop=name] p")[0].text

or

pn = soup.select(".prcdt-overview")[0].select("h1[itemprop=name]>p")[0].text
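
The two selectors differ only in the combinator: a space matches any descendant, while > matches direct children only. The sketch below shows this on hypothetical markup that assumes the p sits directly inside an h1 with itemprop="name"; the real page may not look like this. It uses "html.parser", which keeps the nesting exactly as written.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; it assumes the name is a <p> directly inside the <h1>.
html = """
<div class="prcdt-overview">
  <h1 itemprop="name"><p>RENU FRESH LENS SOLUTION 120 ML</p></h1>
  <div class="desc"><p>Other overview text</p></div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Descendant combinator (space): p anywhere below the h1.
descendant = soup.select(".prcdt-overview h1[itemprop=name] p")

# Child combinator (>): p must be a direct child of the h1.
child = soup.select(".prcdt-overview")[0].select("h1[itemprop=name]>p")

# No p is a direct child of the outer div here, so this list is empty.
direct_children = soup.select(".prcdt-overview > p")

print(descendant[0].text)
```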

There are other ways as well; try these.

Hope this helps.

2016-12-26 09:04:42 SarathSprakash

Please read through the documentation, it is very straightforward: https://www.crummy.com/software/BeautifulSoup/bs4/doc/ – SarathSprakash

A more verbose way:

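# Match on the class attribute, as in the question (a "title" attribute filter would find nothing)
pn = soup.find_all("div", {"class": "prcdt-overview"})[0]
divTitle = pn.find("div", {"class": "title"})
pText = divTitle.find("p").text
print(pText)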
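
Each step of this chain fails with an exception if a tag is missing, because find() returns None when nothing matches. A defensive variant could look like the sketch below; product_name is a hypothetical helper, and the div.title nesting is taken from the snippet above rather than verified against the live page.

```python
from bs4 import BeautifulSoup


def product_name(html):
    """Return the product name text, or None if an expected tag is missing.

    Hypothetical helper: the div.prcdt-overview > div.title > p layout is an
    assumption carried over from the snippet above.
    """
    soup = BeautifulSoup(html, "html.parser")

    overview = soup.find("div", class_="prcdt-overview")
    if overview is None:
        return None

    title = overview.find("div", class_="title")
    if title is None:
        return None

    paragraph = title.find("p")
    if paragraph is None:
        return None

    # get_text(strip=True) trims surrounding whitespace from the extracted text.
    return paragraph.get_text(strip=True)


sample = """
<div class="prcdt-overview">
  <div class="title"><p> RENU FRESH LENS SOLUTION 120 ML </p></div>
</div>
"""

print(product_name(sample))
print(product_name("<html><body></body></html>"))
```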

2016-12-26 09:22:20
