2016-11-17 89 views
1

HtmlUnit does not get the content after a click

I am parsing a website with HtmlUnit. The process is basically:

WebClient client = new WebClient(BrowserVersion.CHROME); 
client.waitForBackgroundJavaScript(5 * 1000); 
HtmlPage page = client.getPage("http://www.exapmle.com"); //here it waits to run js code. 

HtmlUnorderedList ul = (HtmlUnorderedList) page.getByXPath("//ul[contains(@class, 'class-name')]").get(0); 
HtmlListItem li = (HtmlListItem) ul.getChildNodes().get(1); // I want to click li and get result page. But it takes a little time to execute. 

li.click(); 

client.waitForBackgroundJavaScript(5 * 1000); //At here it does not do what I want. 

Afterwards, when I inspect the page, I see that its content has not changed.

What can I do to get the correct result page?

Thanks.

Answers

0

You could try polling until a JavaScript condition becomes true:

import java.util.concurrent.TimeUnit;

int attempts = 20;
int pollMillis = 500;
boolean success = false;
for (int i = 0; i < attempts && !success; i++) {
    TimeUnit.MILLISECONDS.sleep(pollMillis);
    // someJavascriptCondition is a placeholder: substitute your own check here
    if (someJavascriptCondition) {
        success = true;
    }
}
if (!success) {
    throw new RuntimeException(String.format("Condition not met after %d millis", attempts * pollMillis));
}

A similar technique is discussed here
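To make the polling idea concrete without depending on any particular page, here is a minimal sketch that takes the condition as a `BooleanSupplier`. The class name `Poller` and the method `pollUntil` are made up for illustration and are not part of HtmlUnit; the condition in `main` is just a stand-in counter.

```java
import java.util.function.BooleanSupplier;

/** Generic polling helper; an illustrative sketch, not HtmlUnit API. */
final class Poller {

    /**
     * Returns true as soon as the condition holds, or false once all
     * attempts are used up or the thread is interrupted.
     */
    static boolean pollUntil(BooleanSupplier condition, int attempts, long pollMillis) {
        for (int i = 0; i < attempts; i++) {
            if (condition.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return condition.getAsBoolean(); // one last check after the final sleep
    }

    public static void main(String[] args) {
        // Stand-in condition that becomes true on the third check. With HtmlUnit
        // you would inspect the page here instead (assumption: your own check).
        int[] checks = {0};
        boolean met = pollUntil(() -> ++checks[0] >= 3, 20, 50);
        System.out.println("Condition met: " + met);
    }
}
```

Passing the condition in as a lambda keeps the loop reusable: the same helper works whether you are waiting for an element to appear, a spinner to vanish, or some text to change.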

+0

I don't have such a JavaScript condition :/ – xxlali

+0

Of course you can. Check that a spinner image has stopped, that a div has been updated, and so on. –
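If there is no obvious flag to test, one option is to treat "the content changed" as the condition: take a snapshot before the click and poll until it differs. The sketch below is only an illustration under that assumption. `ChangeWatcher` and `waitForChange` are hypothetical names, and the snapshot supplier in `main` is a fake; with HtmlUnit the supplier might be something like `page::asXml` or the text of one specific div.

```java
import java.util.Objects;
import java.util.function.Supplier;

/** Waits until a snapshot differs from a baseline; illustrative sketch only. */
final class ChangeWatcher {

    /** Returns true once snapshot.get() no longer equals baseline, false on timeout. */
    static boolean waitForChange(Supplier<String> snapshot, String baseline,
                                 int attempts, long pollMillis) {
        for (int i = 0; i < attempts; i++) {
            if (!Objects.equals(baseline, snapshot.get())) {
                return true;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String before = "<div id=\"result\"></div>";
        // Fake snapshot source: returns the old markup twice, then new markup.
        int[] reads = {0};
        Supplier<String> current =
                () -> ++reads[0] < 3 ? before : "<div id=\"result\">loaded</div>";
        System.out.println("Changed: " + waitForChange(current, before, 20, 50));
    }
}
```

Comparing a narrow snapshot (one div) rather than the whole page is usually less noisy, since unrelated scripts can also touch the DOM.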

0
WebClient client = new WebClient();
HtmlPage page = client.getPage("http://www.exapmle.com");
client.waitForBackgroundJavaScript(5 * 1000);
Thread.sleep(10 * 1000); // wait 10 seconds for the initial scripts to finish
HtmlUnorderedList ul = (HtmlUnorderedList) page.getByXPath("//ul[contains(@class, 'class-name')]").get(0);
HtmlListItem li = (HtmlListItem) ul.getChildNodes().get(1); // the item to click; its result takes a little time to load

li.click();

client.waitForBackgroundJavaScript(5 * 1000);
Thread.sleep(10 * 1000); // wait another 10 seconds after the click

Using Thread.sleep() in addition to waitForBackgroundJavaScript works for me!

+0

No, it doesn't work :/ – xxlali

0

You can use JavaScriptJobManager to check how many JavaScript jobs have not yet completed. Try the following code after calling click().

JavaScriptJobManager manager = page.getEnclosingWindow().getJobManager(); 
while (manager.getJobCount() > 0) { 
    System.out.println("Jobs remaining: " + manager.getJobCount());
    Thread.sleep(1000); 
} 

You may want to add another way to end the while loop in case the JavaScript jobs never complete. Personally, I started terminating jobs manually:

JavaScriptJob job = manager.getEarliestJob(); 
System.out.println("Stopping job: " + job.getId()); 
manager.stopJob(job.getId()); 
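Another way to end the loop is a plain deadline. The sketch below is written against an `IntSupplier` so that it runs on its own; `BoundedJobWait` and `waitForJobs` are illustrative names, not HtmlUnit API. In real code the supplier would presumably be `manager::getJobCount`, which is an assumption based on the loop shown earlier.

```java
import java.util.function.IntSupplier;

/** Waits for a job counter to reach zero, giving up after a timeout. */
final class BoundedJobWait {

    /** Returns true if jobCount reported 0 before timeoutMillis elapsed. */
    static boolean waitForJobs(IntSupplier jobCount, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (jobCount.getAsInt() > 0) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out with jobs still pending
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Fake counter that reports 3, 2, 1, 0 on successive calls.
        // Assumed real usage: waitForJobs(manager::getJobCount, 10_000, 500)
        int[] remaining = {3};
        boolean finished = waitForJobs(() -> remaining[0]--, 2_000, 10);
        System.out.println("All jobs finished: " + finished);
    }
}
```

When this returns false, you can decide whether to stop the remaining jobs as shown above or simply carry on with whatever the page contains at that point.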

Hope this helps.