2016-03-08

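How do I handle unusual HTML page content?

I successfully logged in at http://www.aogc2.state.ar.us:8080/DWClient/Login.aspx and got the page content, which matches what view-source shows in the browser. Good. Then I requested View2 at http://www.aogc2.state.ar.us:8080/DWClient/View2.aspx, and that works too. Then I tried pressing "Well Files", and the printed page content is: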

 
    1|#||4|97|updatePanel|DWC_DWMUP| 
    |0|hiddenField|__EVENTTARGET||0|hiddenField|__EVENTARGUMENT||1844|hiddenField|__VIEWSTATE|/wEPDwULLTEyMzU3MDQ2NzQPZBYCAgMPZBYGZg9kFgIFBE1lbnUPZBYCZg9kFgJmD2QWAgIFD2QWAmYPZBYGAgcPEGRkFgBkAgkPZBYIAgMPD2QWAh4MYXV0b2NvbXBsZXRlBQNvZmZkAgcPD2QWAh8ABQNvZmZkAgsPDxYCHgtOYXZpZ2F0ZVVybAUkL0RXQ2xpZW50L0ZvcmdvdFBhc3N3b3JkLmFzcHg/dj0xNTg5ZGQCDQ8PFgIfAQUkL0RXQ2xpZW50L0NoYW5nZVBhc3N3b3JkLmFzcHg/dj0xNTg5ZGQCDw8UKwACFCsAAhQrAAIPFgIeBFNraW4FEURXQWpheENvbnRleHRNZW51ZGRkZGQCAQ9kFggCAw8WAh4HVmlzaWJsZWdkAgUPFgIfA2dkAgcPFgIfA2dkAgkPFgIfA2cWAmYPZBYCZg9kFgICAQ8PFgIeF0VuYWJsZUFqYXhTa2luUmVuZGVyaW5naGRkAgMPDxYEHwIFDkRXQWpheFNwbGl0dGVyHwRoZBYGZg8PFhweDk9yaWdpbmFsSGVpZ2h0GwAAAAAAAHlAAQAAAB4ITWluV2lkdGgCFB4NT3JpZ2luYWxXaWR0aBweCE1heFdpZHRoApBOHgZMb2NrZWRoHgxFeHBhbmRlZFNpemUbAAAAAAAAAAABAAAAHwRoHgVXaWR0aBsAAAAAALyiQAEAAAAeCkNvbnRlbnRVcmxlHglDb2xsYXBzZWRoHglNYXhIZWlnaHQCkE4eEkNvbGxhcHNlZERpcmVjdGlvbgIBHgZIZWlnaHQbAAAAAACIgUABAAAAHglNaW5IZWlnaHQCFGQWAgIDDw8WAh8DaGRkAgEPFCsAAg8WAh8EaGRkZAICDw8WHB8FHB8GAhQfBxwfCAKQTh8KGwAAAAAAAGRAAQAAAB8EaB8LGwAAAAAAvKJAAQAAAB8MZR8JZx8NZx8OApBOHw8CAR8QGwAAAAAAAAAAAQAAAB8RAiFkFgICAQ8PFggfEBsAAAAAAAAAAAEAAAAfCxsAAAAAALyiQAEAAAAfAgUORFdBamF4U3BsaXR0ZXIfBGhkFgJmDw8WHB8FHB8GAhQfDWgfCAKQTh8KGwAAAAAAAAAAAQAAAB8EaB8LGwAAAAAAvKJAAQAAAB8MZR8JaB8OApBOHwcbAAAAAAC8okABAAAAHw8CAR8QGwAAAAAAAGRAAQAAAB8RAiFkZBgGBRNXJE1lbnUkTG8kTXVsdGlWaWV3Dw9kAgFkBRJXJFN0JFN0QyRNdWx0aVZpZXcPD2QCAWQFEFckUiRSbCRNdWx0aVZpZXcPD2QCAWQFEVckSSRJbnMkTXVsdGlWaWV3Dw9kAgFkBR5fX0NvbnRyb2xzUmVxdWlyZVBvc3RCYWNrS2V5X18WEQUZVyRNZW51JExvJEltYWdlQnV0dG9uSG9tZQUPVyRNZW51JExvJFZpZXcxBQ9XJE1lbnUkTG8kVmlldzIFD1ckTWVudSRMbyRWaWV3MwUUVyRNZW51JExvJE5vcm1hbE1lbnUFE1ckTWVudSRMbyRMYXJnZU1lbnUFGVckU2UkU2VDJGxvZ2ljQ29udGV4dE1lbnUFFFckUiRSbCREV01lbnVDZW50cmFsBRREV0MkRmF2b3VyaXRlc1dpbmRvdwUPRFdDJExpbmtzV2luZG93BRdEV0MkTm90aWZpY2F0aW9uVG9vbFRpcAUBUwUCUDEFA1JTMQUDV0JQBQVEdW1teQUJRHVtbXlQYW5lBRJXJFNlJFNlQyRNdWx0aVZpZXcPD2QCAWRp7I11pnzuZDTn09YDxJ6Cp/HOTXS/eo4HZpH7M8GzzA==|564|hiddenField|__EVENTVALIDATION|/wEdABha5mfgDZSbLzZUyUmX1Hx
cOjJG2/OXQfO53LaLPTz+JKFbxsBT2H8rPbJozRdJwiAgKGohx7LcryBfxGS+hF2E4NlePdrVjBf/TPB5os3NdFIlICQJHHDKGPuD8UVyFvooNofeUTbg7nk9AH14WLqQyPKBpDYvU5rSctiCYJhpRPg2WkkrhV0MIyWtu9xnPvNiC4AVC7l3nkSJ4INPIB4hnzjsrTlJSSRzjrQ6bke9bUH+N4R/gDuZ/KfX+AOQGo/02VXeZ9PaIemoEvx+U13v8QrR/2ZOW/prD69FB8B4l86dZj6xFRFNJ0+l8RQrF3lsl+3Cx//bTJLETxQ5erW+AupPWcrY4v9U5sOeCcNNlbxNM22455lBVj/AfKTS4gk8x0uSRBr1tRfQw2LR1xi8zFB2K3kM0zEwKsPh+eFwiSn00CkX7UOoQEfARl4AxVdv8ByYecGT0TjnARANUdIIgtna+c+VOToEt8OWul45cFjL1lu0d13QEk4fESJ3YGoWqa8caeJbQztMaLH9+HYOWZNp4F70iXVwjq2ZUpxgXg==|0|asyncPostBackControlIDs|||0|postBackControlIDs|||228|updatePanelIDs||tDWC$DWMUP,,tDWC$DWNUP,,tW$Menu$Lo$UPMenu,,tW$Menu$Lo$UPMenuCmd,,tW$Se$SeC$SearchUpdatePanel,,tW$R$Rl$DWPanelResultLists,,tW$R$Rl$CommandAjaxHelpPanel,,tW$St$StC$StoreUpdatePanel,,tW$I$Ins$InfoUpdatePanel,,tW$Vi$Vp$CommandPanel,|0|childUpdatePanelIDs|||10|panelsToRefreshIDs||DWC$DWMUP,|2|asyncPostBackTimeout||90|35|formAction||View2.aspx?DWSubSession=4783&v=1589|36|pageTitle||DocuWare Public Web Client - Pubuser|62|scriptStartupBlock|ScriptContentNoTags|if(typeof ($telerik)!='undefined'){$telerik.registerSkins();};|47|scriptStartupBlock|ScriptContentNoTags|Telerik.Web.UI.RadSplitter._preInitialize("S");|76|scriptStartupBlock|ScriptContentNoTags|Telerik.Web.UI.RadPane._preInitialize("P1", "S", "", "RS1", 0, 0, "False");|73|scriptStartupBlock|ScriptContentNoTags|Telerik.Web.UI.RadSplitBar._preInitialize("RS1", "S", "P1", "WBP", 1, 0);|76|scriptStartupBlock|ScriptContentNoTags|Telerik.Web.UI.RadPane._preInitialize("WBP", "S", "RS1", "", 2, 1, "True");|51|scriptStartupBlock|ScriptContentNoTags|Telerik.Web.UI.RadSplitter._preInitialize("Dummy");|83|scriptStartupBlock|ScriptContentNoTags|Telerik.Web.UI.RadPane._preInitialize("DummyPane", "Dummy", "", "", 0, 0, "True");| 

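Why is this content different from what view-source shows in the browser? I sent the same request as Google Chrome. Then I tried pressing "Search", and the printed content is: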

 
    1|#||4|97|updatePanel|DWC_DWMUP| 
    |0|hiddenField|__EVENTTARGET||0|hiddenField|__EVENTARGUMENT||1844|hiddenField|__VIEWSTATE|/wEPDwULLTEyMzU3MDQ2NzQPZBYCAgMPZBYGZg9kFgIFBE1lbnUPZBYCZg9kFgJmD2QWAgIFD2QWAmYPZBYGAgcPEGRkFgBkAgkPZBYIAgMPD2QWAh4MYXV0b2NvbXBsZXRlBQNvZmZkAgcPD2QWAh8ABQNvZmZkAgsPDxYCHgtOYXZpZ2F0ZVVybAUkL0RXQ2xpZW50L0ZvcmdvdFBhc3N3b3JkLmFzcHg/dj0xNTg5ZGQCDQ8PFgIfAQUkL0RXQ2xpZW50L0NoYW5nZVBhc3N3b3JkLmFzcHg/dj0xNTg5ZGQCDw8UKwACFCsAAhQrAAIPFgIeBFNraW4FEURXQWpheENvbnRleHRNZW51ZGRkZGQCAQ9kFggCAw8WAh4HVmlzaWJsZWdkAgUPFgIfA2dkAgcPFgIfA2dkAgkPFgIfA2cWAmYPZBYCZg9kFgICAQ8PFgIeF0VuYWJsZUFqYXhTa2luUmVuZGVyaW5naGRkAgMPDxYEHwIFDkRXQWpheFNwbGl0dGVyHwRoZBYGZg8PFhweDk9yaWdpbmFsSGVpZ2h0GwAAAAAAAHlAAQAAAB4ITWluV2lkdGgCFB4JQ29sbGFwc2VkaB4ITWF4V2lkdGgCkE4eBkxvY2tlZGgeDEV4cGFuZGVkU2l6ZRsAAAAAAAAAAAEAAAAfBGgeBVdpZHRoGwAAAAAAvKJAAQAAAB4KQ29udGVudFVybGUeCU1heEhlaWdodAKQTh4NT3JpZ2luYWxXaWR0aBweEkNvbGxhcHNlZERpcmVjdGlvbgIBHgZIZWlnaHQbAAAAAACIgUABAAAAHglNaW5IZWlnaHQCFGQWAgIDDw8WAh8DaGRkAgEPFCsAAg8WAh8EaGRkZAICDw8WHB8FHB8GAhQfB2cfEBsAAAAAAAAAAAEAAAAfCAKQTh8KGwAAAAAAAGRAAQAAAB8EaB8LGwAAAAAAvKJAAQAAAB8MZR8NApBOHw4cHw8CAR8JZx8RAiFkFgICAQ8PFggfEBsAAAAAAAAAAAEAAAAfCxsAAAAAALyiQAEAAAAfAgUORFdBamF4U3BsaXR0ZXIfBGhkFgJmDw8WHB8FHB8GAhQfB2gfEBsAAAAAAABkQAEAAAAfCAKQTh8EaB8LGwAAAAAAvKJAAQAAAB8MZR8NApBOHw4bAAAAAAC8okABAAAAHwobAAAAAAAAAAABAAAAHw8CAR8JaB8RAiFkZBgGBRNXJE1lbnUkTG8kTXVsdGlWaWV3Dw9kAgFkBRJXJFN0JFN0QyRNdWx0aVZpZXcPD2QCAWQFEFckUiRSbCRNdWx0aVZpZXcPD2QCAWQFEVckSSRJbnMkTXVsdGlWaWV3Dw9kAgFkBR5fX0NvbnRyb2xzUmVxdWlyZVBvc3RCYWNrS2V5X18WEQUZVyRNZW51JExvJEltYWdlQnV0dG9uSG9tZQUPVyRNZW51JExvJFZpZXcxBQ9XJE1lbnUkTG8kVmlldzIFD1ckTWVudSRMbyRWaWV3MwUUVyRNZW51JExvJE5vcm1hbE1lbnUFE1ckTWVudSRMbyRMYXJnZU1lbnUFGVckU2UkU2VDJGxvZ2ljQ29udGV4dE1lbnUFFFckUiRSbCREV01lbnVDZW50cmFsBRREV0MkRmF2b3VyaXRlc1dpbmRvdwUPRFdDJExpbmtzV2luZG93BRdEV0MkTm90aWZpY2F0aW9uVG9vbFRpcAUBUwUCUDEFA1JTMQUDV0JQBQVEdW1teQUJRHVtbXlQYW5lBRJXJFNlJFNlQyRNdWx0aVZpZXcPD2QCAWSj322M9rw1FX77ZKZbFteidPsHiXp3olXMiGZQgb11aQ==|564|hiddenField|__EVENTVALIDATION|/wEdABhRVbuGY+FX+li0tWxz+8C
lOjJG2/OXQfO53LaLPTz+JKFbxsBT2H8rPbJozRdJwiAgKGohx7LcryBfxGS+hF2E4NlePdrVjBf/TPB5os3NdFIlICQJHHDKGPuD8UVyFvooNofeUTbg7nk9AH14WLqQyPKBpDYvU5rSctiCYJhpRPg2WkkrhV0MIyWtu9xnPvNiC4AVC7l3nkSJ4INPIB4hnzjsrTlJSSRzjrQ6bke9bUH+N4R/gDuZ/KfX+AOQGo/02VXeZ9PaIemoEvx+U13v8QrR/2ZOW/prD69FB8B4l86dZj6xFRFNJ0+l8RQrF3lsl+3Cx//bTJLETxQ5erW+AupPWcrY4v9U5sOeCcNNlbxNM22455lBVj/AfKTS4gk8x0uSRBr1tRfQw2LR1xi8zFB2K3kM0zEwKsPh+eFwiSn00CkX7UOoQEfARl4AxVdv8ByYecGT0TjnARANUdIIgtna+c+VOToEt8OWul45cFjL1lu0d13QEk4fESJ3YGo60E+IUI1ezNJlZi1JIK0s/iLe31JBQcSxeBxW80kwAg==|0|asyncPostBackControlIDs|||0|postBackControlIDs|||228|updatePanelIDs||tDWC$DWMUP,,tDWC$DWNUP,,tW$Menu$Lo$UPMenu,,tW$Menu$Lo$UPMenuCmd,,tW$Se$SeC$SearchUpdatePanel,,tW$R$Rl$DWPanelResultLists,,tW$R$Rl$CommandAjaxHelpPanel,,tW$St$StC$StoreUpdatePanel,,tW$I$Ins$InfoUpdatePanel,,tW$Vi$Vp$CommandPanel,|0|childUpdatePanelIDs|||10|panelsToRefreshIDs||DWC$DWMUP,|2|asyncPostBackTimeout||90|35|formAction||View2.aspx?DWSubSession=2849&v=1589|36|pageTitle||DocuWare Public Web Client - Pubuser|62|scriptStartupBlock|ScriptContentNoTags|if(typeof ($telerik)!='undefined'){$telerik.registerSkins();};|47|scriptStartupBlock|ScriptContentNoTags|Telerik.Web.UI.RadSplitter._preInitialize("S");|76|scriptStartupBlock|ScriptContentNoTags|Telerik.Web.UI.RadPane._preInitialize("P1", "S", "", "RS1", 0, 0, "False");|73|scriptStartupBlock|ScriptContentNoTags|Telerik.Web.UI.RadSplitBar._preInitialize("RS1", "S", "P1", "WBP", 1, 0);|76|scriptStartupBlock|ScriptContentNoTags|Telerik.Web.UI.RadPane._preInitialize("WBP", "S", "RS1", "", 2, 1, "True");|51|scriptStartupBlock|ScriptContentNoTags|Telerik.Web.UI.RadSplitter._preInitialize("Dummy");|83|scriptStartupBlock|ScriptContentNoTags|Telerik.Web.UI.RadPane._preInitialize("DummyPane", "Dummy", "", "", 0, 0, "True");| 

So what does this mean, and how can I extract data from it? Many thanks! Here is my code:

package com.company; 

import org.jsoup.Connection; 
import org.jsoup.Jsoup; 
import org.jsoup.helper.HttpConnection; 
import org.jsoup.nodes.Document; 
import org.jsoup.nodes.Element; 

import java.io.IOException; 
import java.net.*; 
import java.util.ArrayList; 
import java.util.List; 

public class Parser { 
    private String viewState; 
    private String eventValidation; 
    private String subSession; 
    String url = "http://www.aogc2.state.ar.us:8080/DWClient/"; 
    List<HttpCookie> cookies; // cookies captured from the CookieManager after login
    public void start() throws Exception{ 
     login(); 
     chooseWells(); 
     searchForm(); 
    } 


    public void login() throws IOException { 
     CookieManager manager = new CookieManager(); 
     manager.setCookiePolicy(CookiePolicy.ACCEPT_ALL); 
     CookieHandler.setDefault(manager); 
     Connection connection = HttpConnection.connect(url + "Login.aspx"); 
     updateViewState(connection); 
     try { 
      Connection.Response res = connection 
        .data("DWC$DWMessages", "") 
        .data("__VIEWSTATE", viewState) 
        .data("__EVENTVALIDATION", eventValidation) 
        .data("DWC_NotificationToolTip_ClientState", "") 
        .data("LoginWebPart$LoginTypes", "Guest") 
        .data("LoginWebPart$TextBoxUserName", "") 
        .data("LoginWebPart$TextBoxPassword", "") 
        .data("LoginWebPart$ButtonLogin", "Login") 
        .data("LoginWebPart_LanguageContextMenu_ClientState", "") 
        .userAgent("Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.155 Safari/537.36") 
        .method(Connection.Method.POST) 
        .timeout(6000) 
        .execute(); 
      CookieStore cookieJar = manager.getCookieStore(); 
      cookies = cookieJar.getCookies(); 
      Element element = res.parse().select("form").first(); 
      subSession = element.attr("action").substring(10); 
      updateViewState(getView2()); 

     } catch (IOException ex) { 
      ex.printStackTrace(); 
      System.exit(1); 
     } 
    } 

    private void updateViewState(Connection connection) throws IOException { 
     Element element = connection.get().select("form").first(); 
     Element el = element.getElementById("__VIEWSTATE"); 
     viewState = el.attr("value"); 
     Element el1 = element.getElementById("__EVENTVALIDATION"); 
     eventValidation = el1.attr("value"); 
    } 

    private void updateViewStateFromPartial(String html) throws IOException {
     viewState = extractHiddenField(html, "__VIEWSTATE", viewState);
     eventValidation = extractHiddenField(html, "__EVENTVALIDATION", eventValidation);
    }

    // Returns the value that follows "name|" in a partial-postback response,
    // or the fallback when the field is missing or unterminated.
    private static String extractHiddenField(String html, String name, String fallback) {
     String token = name + "|";
     int idx = html.indexOf(token);
     if (idx < 0) {
      return fallback;
     }
     int startIdx = idx + token.length();
     int endIdx = html.indexOf("|", startIdx);
     return endIdx < 0 ? fallback : html.substring(startIdx, endIdx);
    }


    private Connection getView2() { 
     url=url+"View2.aspx"+subSession; 
     return Jsoup.connect(url); 
    } 

    public void chooseWells() throws IOException { 
     Connection.Response response = Jsoup.connect(url) 
       // jsoup URL-encodes form keys and values itself, so pass them raw
       // ("$", "|", ",") rather than pre-encoded ("%24", "%7C", "%2C").
       .data("DWC$SM", "W$Menu$Lo$UPMenuCmd|W_Menu_Lo_ClickedMenuCmd")
       .data("__WPPS", "s")
       .data("__EVENTTARGET", "W_Menu_Lo_ClickedMenuCmd")
       .data("__EVENTARGUMENT", "")
       .data("__VIEWSTATE", viewState)
       .data("__EVENTVALIDATION", eventValidation)
       .data("DWC_FavouritesWindow_ClientState", "")
       .data("DWC_LinksWindow_ClientState", "")
       .data("DWC$DWMessages", "")
       .data("DWC_NotificationToolTip_ClientState", "")
       .data("W_Menu_Lo_NormalMenu_ClientState", "")
       .data("W_Menu_Lo_LargeMenu_ClientState", "{\"logEntries\":[],\"selectedItemIndex\":\"4\"}")
       .data("W$Menu$Lo$CPE_ClientState", "false")
       .data("W$Menu$Lo$FavMenuCmd", "")
       .data("W$Menu$Lo$ClickedMenuCmd", "1457348832725,db19b928-1c2d-4a0f-9963-95cc8d87bae9")
       .data("W$Menu$Lo$ClientCommand", "")
       .data("W$Se$SeC$HiddenSearchUpdateField", "none")
       .data("W_Se_SeC_logicContextMenu_ClientState", "")
       .data("W$St$StC$HiddenStoreUpdateField", "")
       .data("storeDialogClicked", "")
       .data("W$R$Rl$CommandResultHiddenField", "")
       .data("W$R$Rl$infoDlgMode", "Off")
       .data("W$R$Rl$DblClickCommandHiddenField", "")
       .data("W$R$Rl$CommandHiddenField", "")
       .data("W$R$Rl$ClientCommands", "")
       .data("W_R_Rl_DWMenuCentral_ClientState", "")
       .data("W$I$Ins$LastSelectedTab", "UserIndexes")
       .data("W$I$Ins$infoDialogMode", "Off")
       .data("W$I$Ins$HiddenInfoUpdateField", "")
       .data("W$Vi$Vp$Commands", "")
       .data("W$Vi$Vp$AnnotationTool", "")
       .data("P1_ClientState", "{\"_originalWidth\":\"\",\"_originalHeight\":\"400px\",\"_collapsedDirection\":1,\"_scrollLeft\":0,\"_scrollTop\":0,\"_expandedSize\":0,\"width\":2398,\"height\":561,\"collapsed\":false,\"contentUrl\":\"\",\"minWidth\":20,\"maxWidth\":10000,\"minHeight\":20,\"maxHeight\":10000,\"locked\":false}") 
       .data("RS1_ClientState", "") 
       .data("DummyPane_ClientState", "{\"_originalWidth\":\"398px\",\"_originalHeight\":\"\",\"_collapsedDirection\":1,\"_scrollLeft\":0,\"_scrollTop\":0,\"_expandedSize\":0,\"width\":2398,\"height\":160,\"collapsed\":false,\"contentUrl\":\"\",\"minWidth\":20,\"maxWidth\":10000,\"minHeight\":33,\"maxHeight\":10000,\"locked\":false}") 
       .data("Dummy_ClientState", "") 
       .data("WBP_ClientState", "{\"_originalWidth\":\"\",\"_originalHeight\":\"\",\"_collapsedDirection\":1,\"_scrollLeft\":0,\"_scrollTop\":0,\"_expandedSize\":160,\"width\":2398,\"height\":0,\"collapsed\":true,\"contentUrl\":\"\",\"minWidth\":20,\"maxWidth\":10000,\"minHeight\":33,\"maxHeight\":10000,\"locked\":true}") 
       .data("S_ClientState", "") 
       .data("__ASYNCPOST", "true") 
       .header("DWASPSession", String.valueOf(cookies.get(2)).substring(10)) 
       .header("DWSubSession", subSession.substring(14,18)) 
       .header("X-MicrosoftAjax", "Delta=true") 
       .header("X-Requested-With", "XMLHttpRequest") 
       .timeout(5000) 
       .userAgent("Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.155 Safari/537.36") 
       .method(Connection.Method.POST) 
       .execute(); 
     // The partial-postback response is pipe-delimited text, not an HTML page,
     // so read the raw body instead of parsing it as a document.
     String body = response.body();
     System.out.println(body);
     updateViewStateFromPartial(body);
    } 

    public void searchForm() throws IOException { 
     Connection.Response response = Jsoup.connect(url) 
       // jsoup URL-encodes form keys and values itself, so pass them raw
       // ("$", "|") rather than pre-encoded ("%24", "%7C").
       .data("DWC$SM", "W$Se$SeC$SearchUpdatePanel|W$Se$SeC$P$1$1CC$ssd$btnCtrl")
       .data("__WPPS", "s")
       .data("DWC_FavouritesWindow_ClientState", "")
       .data("DWC_LinksWindow_ClientState", "")
       .data("DWC$DWMessages", "")
       .data("DWC_NotificationToolTip_ClientState", "")
       .data("W_Menu_Lo_NormalMenu_ClientState", "")
       .data("W_Menu_Lo_LargeMenu_ClientState", "{\"logEntries\":[],\"selectedItemIndex\":\"4\"}")
       .data("W$Menu$Lo$CPE_ClientState", "false")
       .data("W$Menu$Lo$FavMenuCmd", "")
       .data("W$Menu$Lo$ClickedMenuCmd", "")
       .data("W$Menu$Lo$ClientCommand", "")
       .data("W$Se$SeC$HiddenSearchUpdateField", "none")
       .data("W$Se$SeC$P$1$P", "0")
       .data("W$Se$SeC$P$1$Co", "0")
       .data("W$Se$SeC$P$1$Ma", "0")
       .data("W$Se$SeC$P$1$Cl", "0")
       .data("W_Se_SeC_P_1_1CC_ssd_slm_ClientState", "")
       .data("W$Se$SeC$P$1$1CC$ssd$IF$F0$T", "")
       .data("W$Se$SeC$P$1$1CC$ssd$IF$F1$T", "")
       .data("W$Se$SeC$P$1$1CC$ssd$IF$F2$T", "")
       .data("W$Se$SeC$P$1$1CC$ssd$IF$F3$T", "")
       .data("W$Se$SeC$P$1$1CC$ssd$IF$F4$T", "")
       .data("W$Se$SeC$P$1$1CC$ssd$IF$F5$T", "")
       .data("W$Se$SeC$P$1$1CC$ssd$IF$F6$T", "")
       .data("W$Se$SeC$P$1$1CC$ssd$IF$F7$T", "")
       .data("W$Se$SeC$P$1$1CC$ssd$IF$F8$T", "")
       .data("W$Se$SeC$P$1$1CC$ssd$IF$F9$T", "")
       .data("W$Se$SeC$P$1$1CC$ssd$IF$F10$D1", "")
       .data("W$Se$SeC$P$1$1CC$ssd$IF$F10$D2", "")
       .data("W$Se$SeC$P$1$1CC$ssd$IF$F11$T", "")
       .data("W$Se$SeC$P$1$1CC$ssd$IF$F12$D1", "")
       .data("W$Se$SeC$P$1$1CC$ssd$IF$F12$D2", "")
       .data("W$Se$SeC$P$1$1CC$SearchMod", "And")
       .data("W$Se$SeC$P$1$1CC$SC", "")
       .data("W$Se$SeC$P$1$CPE_ClientState", "false")
       .data("W_Se_SeC_logicContextMenu_ClientState", "")
       .data("W$St$StC$HiddenStoreUpdateField", "")
       .data("storeDialogClicked", "")
       .data("W$R$Rl$CommandResultHiddenField", "")
       .data("W$R$Rl$infoDlgMode", "Off")
       .data("W$R$Rl$DblClickCommandHiddenField", "")
       .data("W$R$Rl$CommandHiddenField", "")
       .data("W$R$Rl$ClientCommands", "")
       .data("W_R_Rl_DWMenuCentral_ClientState", "")
       .data("W$I$Ins$LastSelectedTab", "UserIndexes")
       .data("W$I$Ins$infoDialogMode", "Off")
       .data("W$I$Ins$HiddenInfoUpdateField", "")
       .data("W$Vi$Vp$Commands", "")
       .data("W$Vi$Vp$AnnotationTool", "")
       .data("P1_ClientState", "{\"_originalWidth\":\"\",\"_originalHeight\":\"400px\",\"_collapsedDirection\":1,\"_scrollLeft\":0,\"_scrollTop\":0,\"_expandedSize\":0,\"width\":2398,\"height\":561,\"collapsed\":false,\"contentUrl\":\"\",\"minWidth\":20,\"maxWidth\":10000,\"minHeight\":20,\"maxHeight\":10000,\"locked\":false}") 
       .data("RS1_ClientState", "") 
       .data("DummyPane_ClientState", "{\"_originalWidth\":\"398px\",\"_originalHeight\":\"\",\"_collapsedDirection\":1,\"_scrollLeft\":0,\"_scrollTop\":0,\"_expandedSize\":0,\"width\":2398,\"height\":160,\"collapsed\":false,\"contentUrl\":\"\",\"minWidth\":20,\"maxWidth\":10000,\"minHeight\":33,\"maxHeight\":10000,\"locked\":false}") 
       .data("Dummy_ClientState", "") 
       .data("WBP_ClientState", "{\"_originalWidth\":\"\",\"_originalHeight\":\"\",\"_collapsedDirection\":1,\"_scrollLeft\":0,\"_scrollTop\":0,\"_expandedSize\":160,\"width\":2398,\"height\":0,\"collapsed\":true,\"contentUrl\":\"\",\"minWidth\":20,\"maxWidth\":10000,\"minHeight\":33,\"maxHeight\":10000,\"locked\":true}") 
       .data("S_ClientState", "") 
       .data("__ASYNCPOST", "true") 
       // Pass the control name raw; jsoup encodes "$" as "%24" when it builds the request body.
       .data("__EVENTTARGET", "W$Se$SeC$P$1$1CC$ssd$btnCtrl")
       .data("__EVENTARGUMENT", "")
       .data("__VIEWSTATE", viewState)
       .data("__EVENTVALIDATION", eventValidation)


       .header("DWASPSession", String.valueOf(cookies.get(2)).substring(10)) 
       .header("DWSubSession", subSession.substring(14,18)) 
       .header("X-MicrosoftAjax", "Delta=true") 
       .header("X-Requested-With", "XMLHttpRequest") 
       .userAgent("Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.155 Safari/537.36") 
       .timeout(10000) 
       .ignoreContentType(true) 
       .method(Connection.Method.POST) 
       .execute(); 

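     // Read the raw pipe-delimited body once; round-tripping it through the
     // HTML parser would alter the text that getTabularHtml searches.
     String body = response.body();
     System.out.println(body);
     getTabularResults(body);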

    } 

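    private List<DocMetaData> getTabularResults(String raw){
     List<DocMetaData> docs=new ArrayList<>();
     String html=getTabularHtml(raw);
     Document document=Jsoup.parse(html);
     Element table=document.getElementById("W_R_Rl_P_1_1CC_ctl01_ctl00");
     if (table == null) {
      // The results table was not part of this response.
      return docs;
     }
     // "tbody/tr" is not valid CSS selector syntax; use the child combinator.
     for (Element row: table.select("tbody > tr")) {
      List<Element> columns=row.select("td");
      if (columns.size() < 2) {
       continue; // skip header or spacer rows
      }
      DocMetaData doc=new DocMetaData();
      // text() returns the cell's text; String.valueOf(...) would return its outer HTML.
      doc.setLeastName(columns.get(1).text());
      docs.add(doc);
     }
     return docs;

    }
    private String getTabularHtml(String raw){
     String startToken="W_R_Rl_DWPanelResultLists|";
     int tokenIdx=raw.indexOf(startToken);
     if (tokenIdx < 0) {
      return ""; // panel not present in this response
     }
     int startIdx=tokenIdx+startToken.length();
     int endIdx=raw.indexOf("|0|",startIdx);
     String result=endIdx < 0 ? raw.substring(startIdx) : raw.substring(startIdx,endIdx);
     System.out.println(result);
     return result;
    }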
} 

Answers


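You will never get a 1:1 match with how a browser interprets HTML. Parsing is a best-effort process; browsers are insanely complex so that they stay compatible with decades of broken HTML :)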

Even if you manage to mimic Chrome's behavior, it probably won't match Firefox, Opera, or even the next version of Chrome.

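As for your second question (it is usually best to ask only one question at a time), just use Jsoup and try to extract the content you need; don't worry too much about what Chrome shows. Perhaps the answers to this question help?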
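
The text printed in the question looks like an ASP.NET AJAX partial-postback ("delta") response, which a server returns when the request carries `X-MicrosoftAjax: Delta=true` and `__ASYNCPOST=true`: a sequence of `length|type|id|content|` entries instead of a full HTML page. Below is a minimal sketch of splitting such a body into entries using only the standard library. The names `DeltaEntry`, `DeltaParser`, `parse` and `find` are illustrative assumptions, not part of Jsoup or ASP.NET, and the sample string in `main` is made up.

```java
import java.util.ArrayList;
import java.util.List;

/** One entry of a partial-postback ("delta") response. Hypothetical helper type. */
final class DeltaEntry {
    final String type;    // e.g. "updatePanel", "hiddenField", "pageTitle"
    final String id;      // e.g. "__VIEWSTATE" or an UpdatePanel client id
    final String content; // exactly the declared number of characters

    DeltaEntry(String type, String id, String content) {
        this.type = type;
        this.id = id;
        this.content = content;
    }
}

final class DeltaParser {
    /** Splits a body of the form length|type|id|content| (repeated) into entries. */
    static List<DeltaEntry> parse(String body) {
        List<DeltaEntry> entries = new ArrayList<>();
        int pos = 0;
        while (pos < body.length()) {
            int lenEnd = body.indexOf('|', pos);
            int typeEnd = lenEnd < 0 ? -1 : body.indexOf('|', lenEnd + 1);
            int idEnd = typeEnd < 0 ? -1 : body.indexOf('|', typeEnd + 1);
            if (idEnd < 0) {
                break; // only trailing whitespace or an incomplete header remains
            }
            // The declared length tells us where the content ends, even if it contains '|'.
            int length = Integer.parseInt(body.substring(pos, lenEnd));
            int contentStart = idEnd + 1;
            int contentEnd = contentStart + length;
            if (contentEnd >= body.length() || body.charAt(contentEnd) != '|') {
                throw new IllegalArgumentException("Malformed entry at offset " + pos);
            }
            entries.add(new DeltaEntry(
                    body.substring(lenEnd + 1, typeEnd),
                    body.substring(typeEnd + 1, idEnd),
                    body.substring(contentStart, contentEnd)));
            pos = contentEnd + 1; // skip the '|' that terminates the content
        }
        return entries;
    }

    /** Returns the content of the first entry with the given type and id, or null. */
    static String find(List<DeltaEntry> entries, String type, String id) {
        for (DeltaEntry e : entries) {
            if (e.type.equals(type) && e.id.equals(id)) {
                return e.content;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Made-up sample: a version header, one panel whose HTML contains '|', one hidden field.
        String sample = "1|#||4|10|updatePanel|Panel1|<p>a|b</p>|3|hiddenField|__VIEWSTATE|abc|";
        for (DeltaEntry e : parse(sample)) {
            System.out.println(e.type + " [" + e.id + "] = " + e.content);
        }
    }
}
```

The content of an `updatePanel` entry is an HTML fragment, so it can be passed to `Jsoup.parse(...)` and queried with selectors, while the `hiddenField` entries carry the `__VIEWSTATE` and `__EVENTVALIDATION` values to send with the next request. Reading by the declared length rather than searching for the next `|` matters because panel HTML may itself contain `|`.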

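Eh, no. Look at the printed content: there are no elements and no tags, just text, without the information I need. (Thank you, though. Maybe something else?)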
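
One thing that may be worth ruling out, offered as an assumption rather than a diagnosis: if field names copied from the browser's network panel are already percent-encoded (for example `W%24Menu%24Lo%24CPE_ClientState`) and the HTTP client URL-encodes form data again when it builds the request body, the server receives a name it does not recognize. The sketch below uses only `java.net.URLEncoder` to show the effect; `FormEncodingDemo` and `encode` are illustrative names.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class FormEncodingDemo {
    /** Encodes a field name the way an application/x-www-form-urlencoded body does. */
    static String encode(String name) {
        try {
            return URLEncoder.encode(name, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        // Raw name: "$" is encoded once, which is what the server expects to decode.
        System.out.println(encode("W$Menu$Lo$CPE_ClientState"));
        // prints W%24Menu%24Lo%24CPE_ClientState

        // Pre-encoded name: "%" itself is encoded, so the server decodes it back to "W%24Menu...".
        System.out.println(encode("W%24Menu%24Lo%24CPE_ClientState"));
        // prints W%2524Menu%2524Lo%2524CPE_ClientState
    }
}
```

If that applies, passing the raw control names (with `$`, `|` and `,`) and letting the client do the encoding avoids the double encoding.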

Sorry about the "printed text".

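Could you try to create a minimal working example of your problem? Your code (and the off-SO example HTML) is a bit hard to digest at the moment, at least for me :)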
