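Set a variable equal to the line where the keywords are found

I can't get the code below to print the link in which the keywords are found:
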
import time
from random import randint
from time import sleep

import requests
from bs4 import BeautifulSoup

# Fetch the top-level sitemap and pick the first sub-sitemap that lists products.
a = requests.get('http://properlbc.com/sitemap.xml')
#time.sleep(1)
scrape = BeautifulSoup(a.text, 'lxml')
linkz = scrape.find_all('loc')
for linke in linkz:
    if "products" in linke.text:
        sitemap = str(linke.text)
        break

while True:
    # sleep(randint(4,6))
    keyword1 = "properlbc"
    keyword2 = "products"
    keyword3 = "bb1296"
    r = requests.get(sitemap)
    # time.sleep(1)
    soup = BeautifulSoup(r.text, 'lxml')
    links = soup.find_all('loc')
    for link in links:
        while (keyword1 in link.text and keyword2 in link.text and keyword3 in link.text):
            continue
    # At this point `link` is whatever the for loop visited last.
    print("LINK SCRAPED")
    print(str(link.text) + "link scraped")
    break

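The code loops successfully until a link containing the keywords is found, but it does not print that specific link: it prints something else rather than "https://properlbc.com/collections/new-arrival/products/bb1296".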
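
One possible way to get the behaviour described above is to store the matching URL in a variable at the moment it is found and stop searching, rather than printing the loop variable after the loop has finished. The sketch below is only an illustration under a few assumptions: `find_matching_link` and `SAMPLE_SITEMAP` are hypothetical names, the sample XML is made-up data standing in for the downloaded sitemap, and the standard-library `xml.etree.ElementTree` parser is swapped in for BeautifulSoup so the example runs without a network connection or third-party packages.

```python
import xml.etree.ElementTree as ET


def find_matching_link(sitemap_xml, keywords):
    """Return the first <loc> URL that contains every keyword, or None."""
    root = ET.fromstring(sitemap_xml)
    for element in root.iter():
        # Sitemap tags are namespaced ("{namespace}loc"), so strip the prefix.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name != "loc":
            continue
        url = (element.text or "").strip()
        if all(keyword in url for keyword in keywords):
            # Return immediately so the match is the value that is kept.
            return url
    return None


# Hypothetical sample data; the real sitemap would come from requests.get(...).text
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://properlbc.com/products/aa0001</loc></url>
  <url><loc>https://properlbc.com/collections/new-arrival/products/bb1296</loc></url>
  <url><loc>https://properlbc.com/products/zz9999</loc></url>
</urlset>"""

found = find_matching_link(SAMPLE_SITEMAP, ["properlbc", "products", "bb1296"])
if found is not None:
    print("LINK SCRAPED")
    print(found)
```

The same idea should carry over to the BeautifulSoup version in the question: inside `for link in links:`, test the three keywords with an `if`, assign `link.text` to a variable, and `break`. If no link matches, the outer `while True:` loop can sleep and fetch the sitemap again.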