2012-08-15 57 views
0


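I'd like to extract the code of a contentPlaceHolder using any HTML parser. The problem is that I need a URL, but because it is a master page I don't have one. How can I get the contentPlaceHolder's code programmatically?

There is actually a select tag where you choose an option, and when you choose one it loads a contentPlaceHolder. I want to extract the code from that contentPlaceHolder.

Note: I did not build the website.

Here are some pictures to explain it better:

This is the main page: (image)

This is the content (when you press the red mark): (image)

I hope this is clear enough to understand... Thanks!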

+0

It would be easier to help you if you attached the URL of the page. – Jens 2012-08-15 12:52:36

+0

http://blich.co.il/timetable-shahaf It's in Hebrew, so you won't understand it xD – 2012-08-15 12:54:13

+0

Well, luckily the HTML isn't written in Hebrew =) - For starters, the page contains an IFRAME, which holds the stuff you're after. – Jens 2012-08-15 13:03:44

Answers

1

First, this requires JSoup.

try { 
    // Regexp pattern used to strip the links 
    Pattern p = Pattern.compile("\'([^\']*)\'"); 

    // First, let's find the IFRAME from the main page 
    Document doc = Jsoup.connect("http://blich.co.il/timetable-shahaf").get(); 
    Elements iframe = doc.select("iframe"); 
    if (!iframe.isEmpty()) { 
     String src = iframe.get(0).absUrl("src"); 
     if (!TextUtils.isEmpty(src)) { 
      // Now we need to fetch the contents of the IFRAME 
      doc = Jsoup.connect(src).get(); 

      // This is where we manipulate the <select ..> statement. There's only 
      // one on this page, so this will be done quick and dirty 
      Elements selects = doc.select("select.HeaderClasses"); 
      Elements options = selects.select("option"); 
      if (!options.isEmpty()) { 
       // There are a lot of options here. I don't know what they mean, so let's
       // just pick one at random and go with that. Your code should probably let
       // the user choose from a dialog or something.
       Collections.shuffle(options); 
       Element option = options.get(0); 

       String name=selects.get(0).attr("name"); 
       if (!TextUtils.isEmpty(name)) { 
        doc = Jsoup.connect(src) 
          .data("__EVENTTARGET", name) 
          .data("__EVENTARGUMENT", "") 
          .data(name, option.attr("value")) // Add random option value 
          .data("__VIEWSTATE", 
            doc.select("input#__VIEWSTATE").attr("value")) 
          .data("__LASTFOCUS", "") 
          .post(); 
       } 
      } 
      // All the relevant links are stored in a td with the class "HeaderCell" 
      Elements links = doc.select("td.HeaderCell a"); 
      for (Element link : links) {      
       // These are all links to a silly JavaScript method, __doPostBack(..)
       // function __doPostBack(eventTarget, eventArgument) { 
       // if (!theForm.onsubmit || (theForm.onsubmit() != false)) { 
       //  theForm.__EVENTTARGET.value = eventTarget; 
       //  theForm.__EVENTARGUMENT.value = eventArgument; 
       //  theForm.submit(); 
       // } 
       // } 
       // The important bits appear to be eventTarget and eventArgument at least, 
       // but none of the links define an eventArgument in any case - so we just 
       // need "eventTarget". 

       // Naïve splitting, take the first quoted string 
       Matcher m = p.matcher(link.attr("href")); 
       if (m.find()) { 
        String eventTarget = m.group(1); 
        // The eventTarget you're looking for ends with 'ChangesTable' 
        if (eventTarget != null && eventTarget.endsWith("ChangesTable")) { 
         // Now we need to do a POST :-D - this API requires us to retain
         // __VIEWSTATE, so we need to post that too.
         doc = Jsoup.connect(src) 
           .data("__EVENTTARGET", eventTarget) 
           .data("__EVENTARGUMENT", "") 
           .data("__VIEWSTATE", 
             doc.select("input#__VIEWSTATE").attr("value")) 
           .data("__LASTFOCUS", "") 
           .post(); 


         // All the lesson information is stored in a div with the class 
         // TTLesson, so let's select those 
         Elements lessons = doc.select("div.TTLesson"); 
         if (lessons.isEmpty()) { 
          Log.w(TAG, "Unable to list any lessons"); 
         } else { 
          for (Element lesson : lessons) { 
           // This is where knowledge of Hebrew would come in handy -
           // but this will list all lessons. You should be able 
           // to figure out how to find the one you want. 
           System.out.println(lesson); 
          } 
         } 
        } 
       } 
      } 
     } else { 
      Log.w(TAG, "Unable to find iframe src"); 
     } 
    } else { 
     Log.w(TAG, "Unable to find iframe"); 
    } 
} catch (IOException e) { 
    Log.w(TAG, "Error reading timetable", e); 
} 

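This will list all the lessons on the page you're after. Since I don't know enough Hebrew to tell what any of the cells contain, I'll leave finding the right lesson to you.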
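
The regular-expression step from the code above can be tried in isolation. This is only a minimal sketch using nothing but the JDK; the class name `PostBackTarget` and the sample href are hypothetical, made up to resemble an ASP.NET postback link rather than copied from the site.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PostBackTarget {
    // Same idea as the pattern in the answer: capture the first single-quoted string.
    private static final Pattern FIRST_QUOTED = Pattern.compile("'([^']*)'");

    /** Returns the first quoted argument of a __doPostBack(...) href, if there is one. */
    public static Optional<String> eventTarget(String href) {
        if (href == null) {
            return Optional.empty();
        }
        Matcher m = FIRST_QUOTED.matcher(href);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical href, invented for illustration only.
        String href = "javascript:__doPostBack('TimeTableView1$btnChangesTable','')";
        String target = eventTarget(href).orElse("(none)");
        System.out.println(target);
        System.out.println(target.endsWith("ChangesTable"));
    }
}
```

Returning an `Optional` keeps the "no quoted string found" case explicit, so the caller can skip links that are not postbacks instead of checking for null.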

Edit: The example now selects an option in the <select> and refreshes the page.
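
If a specific entry is wanted instead of a random one, the option can be looked up by its visible label before building the same form fields the code above posts. A rough sketch using only the JDK, under stated assumptions: the class `PostBackForm`, both method names, and all sample labels and values are hypothetical, and the label-to-value map is assumed to have been filled from the `<option>` elements beforehand.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PostBackForm {
    /** Returns the value of the option whose label matches (ignoring surrounding whitespace), or null. */
    public static String valueForLabel(Map<String, String> valuesByLabel, String wanted) {
        for (Map.Entry<String, String> e : valuesByLabel.entrySet()) {
            if (e.getKey().trim().equals(wanted.trim())) {
                return e.getValue();
            }
        }
        return null;
    }

    /** Builds the fields the answer's code posts when the select changes. */
    public static Map<String, String> selectFields(String selectName, String optionValue, String viewState) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("__EVENTTARGET", selectName);
        fields.put("__EVENTARGUMENT", "");
        fields.put(selectName, optionValue);
        fields.put("__VIEWSTATE", viewState);
        fields.put("__LASTFOCUS", "");
        return fields;
    }

    public static void main(String[] args) {
        // Hypothetical data: in real use these would come from the parsed page.
        Map<String, String> options = new LinkedHashMap<>();
        options.put("Class A", "11");
        options.put("Class B", "12");

        String value = valueForLabel(options, "Class B");
        System.out.println(selectFields("ctl00$Classes", value, "viewstate-from-page"));
    }
}
```

The resulting map can be handed to jsoup's `Connection.data(Map<String, String>)` in place of the chained `.data(key, value)` calls.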

+0

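This is great, but it works for one specific class (school class). When you're on the site, there is a drop-down menu with many classes. I need to get the information for a specific class. At first I thought it was impossible to get the timetable of any class at all, so how did you tell it which class to choose? – 2012-08-15 14:08:36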

+1

'