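Python - Selenium: searching AngularJS elements with a loop over find_elements_by()
I am scraping real estate data. Selenium has done a great job on sites generated with JavaScript: you find the elements with
driver.find_elements_by...
retrieve all the relevant information, and loop over the tags. On this site, however, the listings are generated by AngularJS. I tried the same approach: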
for article in driver.find_elements_by_css_selector("div.property.ng-scope"):
    # do something
I figured out that I have to make my webdriver (PhantomJS) click the link that leads to the individual listing's page:
linkbase = article.find_element_by_css_selector("div.info.clear.ng-scope")
link = linkbase.find_element_by_tag_name('a')
link.click()
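The webdriver then simply points to that page, where I can get all the information I want for one listing.
As soon as one pass through the loop finishes, I get the following error: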
> Message: {"errorMessage":"Element does not exist in cache","request":{"headers":
{"Accept":"application/json","Accept-Encoding":"identity","Connection":"close","
Content-Length":"142","Content-Type":"application/json;charset=UTF-8","Host":"12
7.0.0.1:56577","User-Agent":"Python-urllib/3.4"},"httpVersion":"1.1","method":"P
OST","post":"{\"sessionId\": \"f9ec2c10-dfd9-11e5-9d4c-3bbe8f5bf7c0\", \"using\"
: \"css selector\", \"id\": \":wdc:1456856343349\", \"value\": \"div.info.clear.
ng-scope\"}","url":"/element","urlParsed":{"anchor":"","query":"","file":"elemen
t","directory":"/","path":"/element","relative":"/element","port":"","host":"","
password":"","user":"","userInfo":"","authority":"","protocol":"","source":"/ele
ment","queryKey":{},"chunks":["element"]},"urlOriginal":"/session/f9ec2c10-dfd9-
11e5-9d4c-3bbe8f5bf7c0/element/:wdc:1456856343349/element"}}
The element on the page that contains the link is:
<a ng-href="/detail/prodej/dum/rodinny/jemnice-jemnice-/3800125532" ng-click="beforeOpen(i.iterator, i.regionTip)" class="title" href="/detail/prodej/dum/rodinny/jemnice-jemnice-/3800125532">
<span class="name ng-binding"> ... </a>
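This is just the title text of each listing. I did set the user agent following this answer, even though it does not show up in the error. I also wait for the surrounding element to load beforehand: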
wait = WebDriverWait(driver, getSearchResults_CZ.waiting)
wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "div.content")))
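What I want is to parse all these property elements, save each listing's link to a list, and then loop over that list, opening each link with driver.get(). I know that clicking a link changes the driver's URL, but I thought that once the list of articles had been built by find_elements_by, it would serve as a stable reference point. Accessing the link by searching for the 'a' tag and calling get_attribute('href') does not work in this case with the AngularJS framework. What am I not seeing?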
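Edit: As answered, get_attribute without .click() is the right way to go. My original mistake was in the CSS selector: I had been using "div [class^='property']" and got completely different links. It must have matched another element that I had not noticed before.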
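A minimal sketch of the collect-then-visit approach described above: read every href first, store the plain strings, and only then navigate with driver.get(), so no element reference is reused after the page changes. The function names collect_listing_links and visit_listings, the handle_page callback, and the combined selector "div.property.ng-scope a.title" are assumptions for illustration (the selector is pieced together from the class names quoted in this post and may need adjusting). It uses the find_elements_by_css_selector call from the question, which belongs to the older Selenium API, and expects an already configured driver to be passed in.

```python
def collect_listing_links(driver, selector="div.property.ng-scope a.title"):
    """Return the href of every listing link on the current results page.

    The default selector is an assumption based on the markup quoted in
    the question; adjust it to the real page.
    """
    links = []
    for anchor in driver.find_elements_by_css_selector(selector):
        href = anchor.get_attribute("href")
        # Keep plain strings only: unlike element references, they stay
        # valid after the driver navigates away from the results page.
        if href and href not in links:
            links.append(href)
    return links


def visit_listings(driver, links, handle_page):
    """Open each saved URL with driver.get() and call handle_page(driver).

    handle_page is a hypothetical callback that extracts whatever is
    needed from a single listing page and returns it.
    """
    results = []
    for url in links:
        driver.get(url)
        results.append(handle_page(driver))
    return results
```

Usage would look like links = collect_listing_links(driver) followed by visit_listings(driver, links, some_parser), where some_parser is whatever function extracts the fields from one listing page.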
As it turned out for me... not clicking is the right way to go. Otherwise Selenium loses the web objects it is supposed to loop over. – Thanados