Selenium scraping returns empty strings after the first few elements

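I am using Selenium in Python to scrape a website. The XPath finds the 20 elements that contain the search results. However, text is only available for the first 6 elements; the rest come back as empty strings. This is true for every page of results.

XPath used: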

results = driver.find_elements_by_xpath("//li[contains(@class, 'search-result search-result__occluded-item ember-view')]")

The XPath finds 20 elements in Chrome.

Text inside the results:

[tt.text for tt in results]

Anonymized output:

['Abcddwedwada', 
'Asefdasdfaca', 
'Asdaafcascac', 
'Asdadaacjkhi', 
'Sfskjfbsfvbkd', 
'Fjsbfksjnsvas', 
'', 
'', 
'', 
'', 
'', 
'', 
'', 
'', 
'', 
'', 
'', 
'', 
'', 
'']

I also tried extracting the ids of the 20 elements and using driver.find_element_by_id, but I still get empty strings after the first 6 elements.

Source

2017-03-03 mrbot

Can you share the page link? – Andersson

https://www.linkedin.com/search/results/people/?keywords=Python&origin=SUGGESTION&suggestedEntities=SKILL – mrbot

Try this:

[str(tt.text) for tt in results if str(tt.text) !='']

[tt.text for tt in results if len(tt.text) > 0]
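The two list comprehensions above drop the empty entries but do not show which results failed to load. As a small sketch (the helper name `split_by_text` is hypothetical, not part of Selenium), the texts can be split into the non-empty values and the indexes of the empty ones, which makes it easier to see where the content stops:

```python
def split_by_text(texts):
    """Return (non_empty_texts, indexes_of_empty_entries)."""
    # Treat whitespace-only strings as empty as well.
    non_empty = [t for t in texts if t.strip()]
    empty_indexes = [i for i, t in enumerate(texts) if not t.strip()]
    return non_empty, empty_indexes
```

With the list from the question this would be called as `found, missing = split_by_text([tt.text for tt in results])`.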

Source

2017-03-03 06:52:46

This will only filter out the empty-string results – mrbot

@mrbot What is the type of the empty string ''? unicode or string? –

The type of the empty string is 'str' – mrbot

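I assume the reason for this result is the following: when you open the page there are 20 items (<li> elements inside a <ul>), but the content of only 6 of them is displayed. The content of the other 14 entries is generated dynamically from XHR requests as you scroll down.

So you may need to scroll down to the last element in the list: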

from selenium.webdriver.support.ui import WebDriverWait as wait 

wait(driver, 10).until(lambda x: len(driver.find_elements_by_xpath("//li[contains(@class, 'search-result search-result__occluded-item ember-view') and not(text()='')]")) == 20) 
results = driver.find_elements_by_xpath("//li[contains(@class, 'search-result search-result__occluded-item ember-view')]") 
results[-1].location_once_scrolled_into_view 
[tt.text for tt in results]

Try it and let me know the result.
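A minimal sketch of a per-element variant of the same idea, assuming the missing text is rendered only once each item has been scrolled into view (an assumption about the page, not something confirmed in this thread). `collect_texts` is a hypothetical helper; it relies only on `driver.execute_script` and the element's `.text`, and it polls with `time.sleep` instead of `WebDriverWait` to stay self-contained:

```python
import time


def collect_texts(driver, elements, timeout=10.0, poll=0.25):
    """Scroll each element into view, then wait until its text is non-empty.

    Elements whose text is still empty after `timeout` seconds are
    recorded as '' so the result keeps the same length as `elements`.
    """
    texts = []
    for element in elements:
        # Assumption: bringing the item into the viewport triggers rendering.
        driver.execute_script("arguments[0].scrollIntoView(true);", element)
        deadline = time.monotonic() + timeout
        text = element.text
        while not text and time.monotonic() < deadline:
            time.sleep(poll)
            text = element.text
        texts.append(text)
    return texts
```

With the `results` list from the question this would be called as `texts = collect_texts(driver, results)`. Scrolling item by item, rather than jumping straight to the last one, gives every entry a chance to enter the viewport.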

Source

2017-03-03 08:48:56 Andersson

It didn't work. I had already thought of this and tried: 'driver.execute_script("window.scrollTo(0, Y);")' – mrbot

Is there any point in using 'pyvirtualdisplay'? – mrbot

All 20 elements return 'True' for 'is_displayed()' – mrbot
