2017-05-07 88 views
3

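Trouble extracting flight price content with Selenium

I am trying to extract some flight price information from https://www.google.com/flights/explore, but the screenshot I get is blank. Can anyone see what is wrong?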

from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from bs4 import BeautifulSoup

url = "https://www.google.com/flights/explore/#explore;f=JFK,EWR,LGA;t=r-Mexico-0x84043a3b88685353%253A0xed64b4be6b099811;li=3;lx=12;d=2017-05-13"
# Copy the default PhantomJS capabilities and override the user agent.
# my_agent is my user-agent string; it is not defined in this snippet.
dcap = dict(DesiredCapabilities.PHANTOMJS)
dcap["phantomjs.page.settings.userAgent"] = my_agent
# Create the driver once, with the custom capabilities.
driver = webdriver.PhantomJS(desired_capabilities=dcap, service_args=['--ignore-ssl-errors=true'])
driver.implicitly_wait(20)  # only applies to element lookups
driver.get(url)

driver.save_screenshot(r'flight_explorer.png')
+1

Add an explicit wait before taking the screenshot. – kushal

+0

Hi kushal, I tried the explicit waits shown at http://selenium-python.readthedocs.io/waits.html#explicit-waits, but it still does not work. I am not sure whether I did it right. –
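The explicit wait suggested in the comment above is, at its core, a polling loop: check a condition, sleep briefly, and give up after a deadline. The sketch below shows that idea in plain Python so it can be run without a browser. It is not Selenium's actual implementation, and the names `wait_until`, `WaitTimeout`, and `results_loaded` are hypothetical. In Selenium, `WebDriverWait(driver, timeout).until(condition)` plays this role. By contrast, `implicitly_wait` only applies to element lookups, so it is unlikely to delay a screenshot taken right after `driver.get(url)` when no element is looked up in between.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never becomes truthy (hypothetical name)."""


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises WaitTimeout once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        value = condition()
        if value:
            return value
        if time.monotonic() >= deadline:
            raise WaitTimeout("condition not met within %.1f s" % timeout)
        time.sleep(poll_interval)


# Stand-in for a page whose results only appear on the third check.
checks = {"count": 0}


def results_loaded():
    checks["count"] += 1
    return checks["count"] >= 3


wait_until(results_loaded, timeout=2.0, poll_interval=0.01)
print(checks["count"])  # number of checks made before the condition held
```

With a real driver, the condition would be something that inspects the page, for example a check that the results element is displayed, and the screenshot would be taken only after the wait returns.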

Answers

3

Try this:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

url = "https://www.google.com/flights/explore/#explore;f=JFK,EWR,LGA;t=r-Mexico-0x84043a3b88685353%253A0xed64b4be6b099811;li=3;lx=12;d=2017-05-13"
driver = webdriver.PhantomJS()
driver.get(url)
# Wait up to 10 seconds for an image inside the results container to become visible.
WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH, "//div[@elt='results']//img[@class='LJTSM3-v-a']")))
print(driver.current_url)
driver.save_screenshot(r'flight_explorer.png')

The result I got: [screenshot]

+0

It works! Thank you. –