2017-08-24

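I cannot extract the body of a tag using a web crawler in Python

This is my code in Python. I can extract the href attribute, but not the content inside the body of the tag. Should I use get(), "contents", or some other method to get at the "body"?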

import requests
from bs4 import BeautifulSoup

def web():
    url = 'https://www.phoenixmarketcity.com/mumbai/brands'
    source = requests.get(url)
    plain = source.text
    soup = BeautifulSoup(plain, "html.parser")
    for link in soup.findAll('a'):
        # Prints None: get() looks up an attribute, and <a> has no 'body' attribute
        href = link.get('body')
        print(href)

web()

'link.getText()' – eLRuLL

Answer

I think this is what you want to do:

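from bs4 import BeautifulSoup
import requests

def web():
    url = 'https://www.phoenixmarketcity.com/mumbai/brands'
    source = requests.get(url)
    plain = source.text
    soup = BeautifulSoup(plain, "html.parser")
    tags = soup('a')  # calling the soup is shorthand for soup.find_all('a')
    for link in tags:
        href = link.get('href')  # attribute value, or None if the tag has no href
        print(href)

# Call the function at module level; indented inside the function body it would never run
web()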
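
The question asks for the text between the opening and closing tag rather than an attribute, which is what the getText() comment points at. Here is a minimal sketch of that, assuming the goal is to print each link's href next to its visible text. SAMPLE_HTML and extract_links are hypothetical names made up for illustration, and the markup is invented so the example runs without fetching the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the downloaded page.
SAMPLE_HTML = """
<ul>
  <li><a href="/brands/alpha">Alpha Store</a></li>
  <li><a href="/brands/beta"><span>Beta</span> Outlet</a></li>
  <li><a>No link here</a></li>
</ul>
"""

def extract_links(html):
    """Return (href, text) pairs for every <a> tag in the markup."""
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    for link in soup.find_all("a"):
        # get() reads an attribute; it returns None when the attribute is missing.
        href = link.get("href")
        # get_text() collects the text inside the tag, including nested tags.
        # The " " separator keeps words from nested tags from running together.
        text = link.get_text(" ", strip=True)
        pairs.append((href, text))
    return pairs

links = extract_links(SAMPLE_HTML)
for href, text in links:
    print(href, text)
```

To try this on the real site, pass requests.get(url).text to extract_links in place of SAMPLE_HTML. getText() is an older alias for get_text(), so either spelling should work.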