2013-03-19

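Getting JavaScript elements with HtmlUnit

I'm trying to use HtmlUnit to fetch the JavaScript-rendered elements on a web page (https://www.coursera.org/courses), but it only loads the static HTML. How can I get the information that is displayed inside the JavaScript containers?

Thanks!

My current code: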

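 import java.io.IOException;
 import javax.swing.JOptionPane;
 import com.gargoylesoftware.htmlunit.BrowserVersion;
 import com.gargoylesoftware.htmlunit.WebClient;
 import com.gargoylesoftware.htmlunit.html.HtmlPage;

 public String DownloadPage(String str){
    final WebClient webClient = new WebClient(BrowserVersion.FIREFOX_3_6);
    webClient.getOptions().setTimeout(20000);
    webClient.getOptions().setJavaScriptEnabled(true);
    webClient.getOptions().setThrowExceptionOnScriptError(false);

    try{
     // Load the page once; assigning the same HTML page to an XmlPage
     // would fail with a ClassCastException.
     HtmlPage page = webClient.getPage(str);
     // The return value is the number of background JavaScript jobs
     // still pending when the wait ends, not the number executed.
     int n = webClient.waitForBackgroundJavaScript(100000);

     System.out.println("JavaScript jobs still pending: " + n);
     String xml = page.asXml();
     System.out.println("OUTPUT: " + xml);
     return xml;
    }
    catch(IOException e){
     JOptionPane.showMessageDialog(null, "error");
     return "";
    }
    finally{
     // Close the windows exactly once, whether or not loading succeeded.
     webClient.closeAllWindows();
    }
 }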

Answer

Use

String theContent1 = webClient.getPage(theURL).getWebResponse().getContentAsString(); 

instead of

String theContent2 = webClient.getPage(theURL); 

theContent1 should contain the actual source of the page, including its JavaScript (if there is any).
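
One caveat: getContentAsString() returns the response body as the server sent it, so anything the scripts insert afterwards would not be in that string. If the goal is the rendered content, one option is to keep checking the page until the expected content shows up instead of relying on a single fixed wait. Below is a minimal sketch of that polling idea in plain Java (assuming Java 8 or later). The class name `PollUntil` and the method `waitFor` are hypothetical helpers made up for illustration and are not part of HtmlUnit; the condition in `main` is only a stand-in for a real check on the page.

```java
import java.util.function.BooleanSupplier;

public class PollUntil {

    /**
     * Repeatedly evaluates the condition until it holds or the timeout elapses.
     * Returns true if the condition became true in time, false otherwise.
     * The condition is always checked at least once.
     */
    public static boolean waitFor(BooleanSupplier condition,
                                  long timeoutMillis,
                                  long intervalMillis) throws InterruptedException {
        final long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            // Compare via subtraction so the check stays valid if nanoTime wraps.
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            Thread.sleep(intervalMillis);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for "the script has filled the container":
        // this condition becomes true roughly 300 ms after start.
        final long start = System.currentTimeMillis();
        boolean appeared = waitFor(
                () -> System.currentTimeMillis() - start >= 300, 2000, 50);
        System.out.println("content appeared: " + appeared);
    }
}
```

With HtmlUnit, the condition could for example test whether page.asXml() contains some text you expect the script to produce, with a short webClient.waitForBackgroundJavaScript(...) call between checks. Which marker to look for depends on the page, so that part is left to you.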