How do I get all the links from a specific web page using Python?

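I want to pull all the URLs from the following web page using Python: https://yeezysupply.com/pages/all. I tried some other suggestions I found, but they don't seem to work with this particular website. I end up not finding any URLs at all.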

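import urllib.request
import lxml.html

# Fetch the raw HTML (in Python 3, urlopen lives in urllib.request)
connection = urllib.request.urlopen('https://yeezysupply.com/pages/all')

dom = lxml.html.fromstring(connection.read())

# Print the href of every anchor found in the static HTML
for link in dom.xpath('//a/@href'):
    print(link)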

2017-06-06 Josh Bijari

There are no links in the page source; they are inserted with JavaScript after the page has loaded in the browser.
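To illustrate that point, here is a minimal sketch using only the standard library's `html.parser`. The two sample pages and the names `LinkCollector` and `extract_links` are made up for illustration and are not taken from the site. It shows that an anchor which exists only inside a script string is never seen by a static HTML parser, which would explain an empty result if the links really are injected by JavaScript.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags present in static HTML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Only real <a> start tags reach this method; text inside
        # <script> is treated as raw data and is not parsed as tags.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html_text):
    """Return the list of hrefs a static parser can see in html_text."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


# Hypothetical page: the anchor only exists inside a script string,
# so it appears in the DOM only after a browser runs the script.
SCRIPTED_PAGE = """
<html><body>
<div id="products"></div>
<script>
document.getElementById("products").innerHTML =
    '<a href="/products/example">Example</a>';
</script>
</body></html>
"""

# Hypothetical page: the anchor is part of the static markup.
STATIC_PAGE = '<html><body><a href="/pages/all">All</a></body></html>'

print(extract_links(SCRIPTED_PAGE))  # no links visible to the parser
print(extract_links(STATIC_PAGE))    # the static anchor is found
```

If the links turn out to be script-generated, the usual options are to drive a real browser or to find the data request the script makes; which one fits depends on the site.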

2017-06-06 00:32:50

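Perhaps you could make use of modules designed specifically for this. Here's a quick and dirty script that gets the relevant links on the page:

#!/usr/bin/python3

import requests, bs4

res = requests.get('https://yeezysupply.com/pages/all')

soup = bs4.BeautifulSoup(res.text, 'html.parser')
# href=True skips anchors without an href, which would otherwise raise KeyError
links = soup.find_all('a', href=True)

for link in links:
    print(link['href'])

It produces output like this: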

/pages/jewelry 
/pages/clothing 
/pages/footwear 
/pages/all 
/cart 
/products/womens-boucle-dress-bleach/?back=%2Fpages%2Fall 
/products/double-sleeve-sweatshirt-bleach/?back=%2Fpages%2Fall 
/products/boxy-fit-zip-up-hoodie-light-sand/?back=%2Fpages%2Fall 
/products/womens-boucle-skirt-cream/?back=%2Fpages%2Fall 
etc...

Is this what you're looking for? Requests and Beautiful Soup are amazing scraping tools.
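The paths in the output above are relative to the site root. If full URLs are needed, one possible follow-up step is to resolve each path against the page address with `urllib.parse.urljoin`. This is a sketch only: the helper name `to_absolute` and the de-duplication step are my own additions, not part of the script above.

```python
from urllib.parse import urljoin

# Address of the page the links were scraped from.
BASE_URL = "https://yeezysupply.com/pages/all"


def to_absolute(base_url, hrefs):
    """Resolve each href against base_url, dropping duplicates in order."""
    seen = set()
    result = []
    for href in hrefs:
        absolute = urljoin(base_url, href)
        if absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result


# A few paths taken from the sample output, with one repeated on purpose.
sample = ["/pages/jewelry", "/cart", "/pages/jewelry"]
for url in to_absolute(BASE_URL, sample):
    print(url)
```

In the earlier script, the same helper could be fed `[link['href'] for link in links]` instead of the hard-coded sample.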

2017-06-06 00:45:01 Nalaurien

Yes, thank you, this is exactly what I was looking for.
