2011-12-15 62 views

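VB.NET: Need help parsing XML loaded in a RichTextBox (loaded from a URL)

I have two rich text boxes and two buttons on my form. The first button fetches the HTML from a URL and converts it to XML, which is then displayed in rich text box 1.

The second button is supposed to take the XML from rich text box 1 and parse it to find all the input elements by their id.

My problem is that my parser does nothing. My guess is that I am not actually getting the XML from the first rich text box.

Is the best way to grab the XML from the rich text box to load it into memory and then parse it to get all the id attributes?

Here is my code. Thanks for any help.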

Imports mshtml 
Imports System.Text 
Imports System.Net 
Imports System.Xml 
Imports System.IO 
Imports System.Xml.XPath 

Public Class Scraper 

    Private Sub Scraper_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load 
    End Sub 

    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click 
     ' Note: This example uses two Chilkat products: Chilkat HTTP 
     ' and Chilkat HTML-to-XML. The "Chilkat Bundle" can be licensed 
     ' at a price that is less than purchasing each product individually. 
     ' The "Chilkat Bundle" provides licenses to all existing Chilkat components. Also, new-version upgrades are always free. 

     Dim http As New Chilkat.Http() 

     ' Any string argument automatically begins the 30-day trial. 
     Dim success As Boolean 
     success = http.UnlockComponent("30-day trial") 
     If (success <> True) Then 
      TextBox1.Text = TextBox1.Text & http.LastErrorText & vbCrLf 
      Exit Sub 
     End If 

     Dim html As String 
     html = http.QuickGetStr("http://www.quiltingboard.com/register.php") 
     If (html = vbNullString) Then 
      TextBox1.Text = TextBox1.Text & http.LastErrorText & vbCrLf 
      Exit Sub 
     End If 

     Dim htmlToXml As New Chilkat.HtmlToXml() 

     ' Any string argument automatically begins the 30-day trial. 
     success = htmlToXml.UnlockComponent("30-day trial") 
     If (success <> True) Then 
      TextBox1.Text = TextBox1.Text & htmlToXml.LastErrorText & vbCrLf 
      Exit Sub 
     End If 

     ' Indicate the charset of the output XML we'll want. 
     htmlToXml.XmlCharset = "utf-8" 

     ' Set the HTML: 
     htmlToXml.Html = html 

     ' Convert to XML: 
     Dim xml As String 
     xml = htmlToXml.ToXml() 

     ' Save the XML to a file. 
     ' Make sure your charset here matches the charset 
     ' used for the XmlCharset property. 
     htmlToXml.WriteStringToFile(xml, "out.xml", "utf-8") 

     RichTextBox1.Text = xml 
    End Sub 

    Private Sub LoopThroughXmlDoc(ByVal nodeList As XmlNodeList) 
     For Each elem As XmlElement In nodeList 
      If elem.HasChildNodes Then 
       LoopThroughXmlDoc(elem.ChildNodes) 
      Else 
       '' Extract the information 
       If elem.HasAttribute("id") Then 
        'elem.Attributes("AssetID").Value.ToString() 
       ElseIf elem.HasAttribute("name") Then 
        'elem.Attributes("AttributeID").Value.ToString() 
       End If 
      End If 
     Next 
    End Sub 

    Private Sub Button2_Click_1(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button2.Click 
     Dim doc As XmlDocument = New XmlDocument 
     doc.Load("xmlFile.xml") 
     Dim nodeList As XmlNodeList = doc.GetElementsByTagName("input") 
     LoopThroughXmlDoc(nodeList) 
    End Sub 
End Class 

Answer


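The second button is not getting the XML from the RichTextBox; it is trying to load it from xmlFile.xml.

That is not the file that Button1 saves, which is out.xml.

If the user is allowed to change the XML in the RichTextBox, the solution is to change the code in Button2 to read the text from the RichTextBox.

Otherwise, the solution is to change the name of the file read in Button2 to out.xml.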
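
As a rough illustration of the first option, here is a minimal sketch that parses an XML string directly instead of reading a file. The module name InputIdExtractor and the function GetInputIds are hypothetical names made up for this example; the only framework calls used are XmlDocument.LoadXml, GetElementsByTagName, HasAttribute and GetAttribute. It assumes the text in the RichTextBox is well-formed XML, since LoadXml throws an XmlException otherwise. Because GetElementsByTagName already returns every matching descendant, the sketch does not recurse through child nodes.

```vb
Imports System.Collections.Generic
Imports System.Xml

' Hypothetical helper for illustration; not part of the original code.
Module InputIdExtractor

    ' Returns the id (or, if there is no id, the name) of every <input> element
    ' found in the given XML text. Inputs with neither attribute are skipped.
    Public Function GetInputIds(ByVal xmlText As String) As List(Of String)
        Dim ids As New List(Of String)()

        Dim doc As New XmlDocument()
        ' LoadXml parses a string; Load expects a file path, stream or reader.
        doc.LoadXml(xmlText)

        For Each node As XmlNode In doc.GetElementsByTagName("input")
            Dim elem As XmlElement = TryCast(node, XmlElement)
            If elem Is Nothing Then Continue For

            If elem.HasAttribute("id") Then
                ids.Add(elem.GetAttribute("id"))
            ElseIf elem.HasAttribute("name") Then
                ids.Add(elem.GetAttribute("name"))
            End If
        Next

        Return ids
    End Function

End Module
```

In Button2_Click this would be called as GetInputIds(RichTextBox1.Text), with the result shown or stored however the form needs. For the second option, the only change to the original handler would be to pass "out.xml" to doc.Load instead of "xmlFile.xml".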


Thanks, Competent_tech – user1096419 2012-07-15 20:58:54