2011-11-17
0

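How do I get the web session from a cookie?

I am trying to scrape a web page, but in order to POST the data I need a web session ID such as:

web_session = HQJ3G1GPAAHRZGFR

How can I get that ID?

My code so far is: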

Private Sub test() 

    Dim postData As String = "web_session=HQJ3G1GPAAHRZGFR&intext=O&term_code=201210&search_type=A&keyword=&kw_scope=all&kw_opt=all&subj_code=BIO&crse_numb=205&campus=*&instructor=*&instr_session=*&attr_type=*&mon=on&tue=on&wed=on&thu=on&fri=on&sat=on&sun=on&avail_flag=on" '/BANPROD/pkgyc_yccsweb.P_Results 
    Dim tempCookie As New CookieContainer 
    Dim encoding As New UTF8Encoding 
    Dim byteData As Byte() = encoding.GetBytes(postData) 

    System.Net.ServicePointManager.SecurityProtocol = Net.SecurityProtocolType.Ssl3 
    Try 

     tempCookie.GetCookies(New Uri("https://taylor.yc.edu/BANPROD/pkgyc_yccsweb.P_Results")) 
     'postData="web_session=" & tempCookie. 

     Dim postReq As HttpWebRequest = DirectCast(WebRequest.Create("https://taylor.yc.edu/BANPROD/pkgyc_yccsweb.P_Results"), HttpWebRequest) 
     postReq.Method = "POST" 
     postReq.KeepAlive = True 
     postReq.CookieContainer = tempCookie 
     postReq.ContentType = "application/x-www-form-urlencoded" 


     postReq.UserAgent = "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.0.3705; Media Center PC 4.0; .NET CLR 3.0.04506.648; .NET CLR 3.5.21022; .NET4.0C; .NET4.0E; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)" 
     postReq.ContentLength = byteData.Length 
     Dim postreqstream As Stream = postReq.GetRequestStream 
     postreqstream.Write(byteData, 0, byteData.Length) 
     postreqstream.Close() 
     Dim postresponse As HttpWebResponse 
     postresponse = DirectCast(postReq.GetResponse, HttpWebResponse) 
     tempCookie.Add(postresponse.Cookies) 

     Dim postresreader As New StreamReader(postresponse.GetResponseStream) 
     Dim thepage As String = postresreader.ReadToEnd 
     MsgBox(thepage) 
    Catch ex As WebException 
     MsgBox(ex.Status.ToString & vbNewLine & ex.Message.ToString) 
    End Try 

End Sub 

Answers

2

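The problem is that tempCookie.GetCookies() doesn't do what you think it does. It makes no request of its own; it only filters the cookies already stored in the CookieContainer down to those that apply to the supplied URL, so on a freshly created container it returns an empty collection. What you need to do instead is first make a request to a page that gives you the session token, and then make the actual request for your data. So request the P_Search page first, then attach that same CookieContainer to a second request and POST to P_Results.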
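If you want to stay with HttpWebRequest, the two-step flow could look roughly like the sketch below. It is only a sketch under a few assumptions: it targets the .NET Framework of the time (hence the SSLv3 line carried over from the question), it reuses the shorter form data from the WebClient example, and the module name SessionSketch is made up. Whether the server really delivers the token as a cookie named web_session is also an assumption, so the loop simply prints whatever cookies were set.

```vbnet
Imports System.IO
Imports System.Net
Imports System.Text

Module SessionSketch

    Sub Main()
        'Carried over from the question: the server required SSLv3 at the time
        ServicePointManager.SecurityProtocol = SecurityProtocolType.Ssl3

        Dim searchUri As New Uri("https://taylor.yc.edu/BANPROD/pkgyc_yccsweb.P_Search")
        Dim resultsUri As New Uri("https://taylor.yc.edu/BANPROD/pkgyc_yccsweb.P_Results")

        'One container shared by both requests
        Dim cookies As New CookieContainer()

        'Step 1: GET the search page so the server can start a session
        Dim getReq = DirectCast(WebRequest.Create(searchUri), HttpWebRequest)
        getReq.CookieContainer = cookies
        Using getResp = DirectCast(getReq.GetResponse(), HttpWebResponse)
            'Nothing to read; cookies sent by the server are stored in the container
        End Using

        'GetCookies only filters what is already stored, so it is useful now
        For Each c As Cookie In cookies.GetCookies(resultsUri)
            Console.WriteLine(c.Name & " = " & c.Value)
        Next

        'Step 2: POST the form data with the same container attached
        Dim byteData = Encoding.UTF8.GetBytes("term_code=201130&search_type=K&keyword=math")
        Dim postReq = DirectCast(WebRequest.Create(resultsUri), HttpWebRequest)
        postReq.Method = "POST"
        postReq.ContentType = "application/x-www-form-urlencoded"
        postReq.CookieContainer = cookies
        postReq.ContentLength = byteData.Length
        Using reqStream = postReq.GetRequestStream()
            reqStream.Write(byteData, 0, byteData.Length)
        End Using

        Using postResp = DirectCast(postReq.GetResponse(), HttpWebResponse)
            Using reader As New StreamReader(postResp.GetResponseStream())
                Console.WriteLine(reader.ReadToEnd())
            End Using
        End Using
    End Sub

End Module
```

If the loop prints no cookie that looks like a session ID, the token is probably embedded in the P_Search HTML (for example as a hidden form field) and would have to be parsed out of that response instead.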

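However, instead of the HttpWebRequest object, let me point you to the WebClient class and my post here about extending it to support cookies. You'll find that it simplifies your code considerably. Below is a complete VB2010 WinForms application that shows this. If you still want to use the HttpWebRequest object, this should at least give you an idea of what else needs to be done: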

Option Strict On 
Option Explicit On 

Imports System.Net 

Public Class Form1 

    Private Sub Form1_Load(sender As System.Object, e As System.EventArgs) Handles MyBase.Load
        'Create our webclient
        Using WC As New CookieAwareWebClient()
            'Set SSLv3
            System.Net.ServicePointManager.SecurityProtocol = Net.SecurityProtocolType.Ssl3
            'Create a session, ignore what is returned
            WC.DownloadString("https://taylor.yc.edu/BANPROD/pkgyc_yccsweb.P_Search")
            'POST our actual data and get the results
            Dim S = WC.UploadString("https://taylor.yc.edu/BANPROD/pkgyc_yccsweb.P_Results", "POST", "term_code=201130&search_type=K&keyword=math")
            Trace.WriteLine(S)
        End Using
    End Sub
End Class 

Public Class CookieAwareWebClient 
    Inherits WebClient 

    Private cc As New CookieContainer() 
    Private lastPage As String 

    Protected Overrides Function GetWebRequest(ByVal address As System.Uri) As System.Net.WebRequest
        Dim R = MyBase.GetWebRequest(address)
        If TypeOf R Is HttpWebRequest Then
            With DirectCast(R, HttpWebRequest)
                .CookieContainer = cc
                If lastPage IsNot Nothing Then
                    .Referer = lastPage
                End If
            End With
        End If
        lastPage = address.ToString()
        Return R
    End Function
End Class 

This is great. I tried to figure this out and was never able to. Thanks for your help! – Jon49


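I need to support this type of functionality in a concurrent environment. I know WebClient doesn't support concurrent I/O, but is there a way to give multiple web requests a single 'CookieContainer' so that they all use one session? I can explain the logic further if needed. – Terry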


Could you use 'SyncLock'? http://stackoverflow.com/a/396248/231316 –
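One way to approach the concurrent case might be to give each worker its own WebClient while handing all of them the same CookieContainer. The sketch below is an assumption-laden illustration, not a tested solution: SharedCookieWebClient and ConcurrentSketch are hypothetical names, the keyword list is invented, and it is not known whether this server tolerates parallel requests within one session. It does not use SyncLock itself; the documentation does not guarantee thread safety for CookieContainer instance members, so wrapping each request in a SyncLock on a shared object would be the conservative fallback, at the cost of serializing the requests.

```vbnet
Imports System.Net
Imports System.Threading.Tasks

'Hypothetical variant of CookieAwareWebClient that takes its container from outside
Public Class SharedCookieWebClient
    Inherits WebClient

    Private ReadOnly cc As CookieContainer

    Public Sub New(ByVal sharedCookies As CookieContainer)
        cc = sharedCookies
    End Sub

    Protected Overrides Function GetWebRequest(ByVal address As Uri) As WebRequest
        Dim R = MyBase.GetWebRequest(address)
        Dim http = TryCast(R, HttpWebRequest)
        If http IsNot Nothing Then
            'Every client created with the same container shares one session
            http.CookieContainer = cc
        End If
        Return R
    End Function
End Class

Module ConcurrentSketch

    Sub Main()
        'Carried over from the answer: the server required SSLv3 at the time
        ServicePointManager.SecurityProtocol = SecurityProtocolType.Ssl3

        Dim cookies As New CookieContainer()

        'Establish the session once, before any parallel work starts
        Using wc As New SharedCookieWebClient(cookies)
            wc.DownloadString("https://taylor.yc.edu/BANPROD/pkgyc_yccsweb.P_Search")
        End Using

        'Invented example keywords; one WebClient per worker, one shared container
        Dim keywords = {"math", "bio"}
        Parallel.ForEach(keywords,
            Sub(k As String)
                Using wc As New SharedCookieWebClient(cookies)
                    Dim s = wc.UploadString("https://taylor.yc.edu/BANPROD/pkgyc_yccsweb.P_Results", "POST", "term_code=201130&search_type=K&keyword=" & k)
                    Console.WriteLine(k & ": " & s.Length & " characters")
                End Using
            End Sub)
    End Sub

End Module
```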