2017-12-18 373 views

I have a question about how to scrape data from this web page (VBA dynamic web page scraping in Excel):

http://tvc4.forexpros.com/init.php?family_prefix=tvc4&carrier=64694b96ed4909e815f1d10605ae4e83&time=1513525898&domain_ID=70&lang_ID=70&timezone_ID=31&pair_ID=171&interval=86400&refresh=4&session=session&client=1&user=200743128&width=650&height=750&init_page=instrument&m_pids=&watchlist=&site=https://au.investing.com&version=1.11.2

The content appears to be held in an iframe, and a bunch of JavaScript renders what appears on the screen.

When I try to collect the elements in the span, div, or tr tags held under the iframe, I cannot seem to get the data inside them.

My goal is the innerText held inside the element with class="pane-legend-item-value pane-legend-line main".

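Obviously, the innerText shown on screen changes depending on where the cursor is at that particular moment. So what I tried to do was set up an IE window that already has the page loaded, with the cursor placed in the right position at the end of the chart (to give me the last data point); you can then move the cursor off the screen. I then wrote some simple code to grab that IE window and tried GetElements, but at this point I cannot get any data.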

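Here is my code so far. It is very rough, because I have kept editing it as I read about more options, but I have not had any success yet :( ... Any ideas or help would be much appreciated! (A screenshot is also at the bottom.)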

Sub InvestingCom() 

    Dim IE As InternetExplorer 
    Dim htmldoc As MSHTML.IHTMLDocument 'Document object 
    Dim eleColth As MSHTML.IHTMLElementCollection 'Element collection for th tags 
    Dim eleColtr As MSHTML.IHTMLElementCollection 'Element collection for tr tags 
    Dim eleColtd As MSHTML.IHTMLElementCollection 'Element collection for td tags 
    Dim eleRow As MSHTML.IHTMLElement 'Row elements 
    Dim eleCol As MSHTML.IHTMLElement 'Column elements 
    Dim elehr As MSHTML.IHTMLElement 'Header Element 
    Dim iframeDoc As MSHTML.HTMLDocument 
    Dim frame As HTMLIFrame 
    Dim ieURL As String 'URL 

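    'Take control of an already open IE window 
    Dim objShell As Object, x As Long 
    Dim my_url As String, my_title As String 

    Set objShell = CreateObject("Shell.Application") 
    For x = 0 To objShell.Windows.Count - 1 
     my_title = "" 
     On Error Resume Next 'some shell windows may not expose an HTML document 
     my_url = objShell.Windows(x).document.Location 
     my_title = objShell.Windows(x).document.Title 
     On Error GoTo 0 

     'NOTE: "*" & "*" matches every title; put a fragment of the desired page title between the wildcards 
     If my_title Like "*" & "*" Then 'compare to find if the desired web page is already open 
      Set IE = objShell.Windows(x) 
      Exit For 
     End If 
    Next 

    If IE Is Nothing Then Exit Sub 'no matching window was found 

    'Extract data 
    Set htmldoc = IE.document 'Document webpage 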

    ' I have tried span, tr, td etc tags and various other options 
    ' I have never actually tried collecting an HTMLFrame but googled it however was unsuccessful 
End Sub 

Screenshot: the already open IE window that Excel can find and talk to, with Excel and the VB editor open on the other screen, and the data I would like to scrape.

Answers


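It's true: I had a hard time dealing with the two nested iframes on that page to collect the required content. But anyway, I finally got it working. Run the code below to get the content you asked for: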

Sub forexpros() 
    Dim IE As New InternetExplorer, html As HTMLDocument 
    Dim frm As Object, frmano As Object, post As Object 

    With IE 
     .Visible = True 
     .navigate "http://tvc4.forexpros.com/init.php?family_prefix=tvc4&carrier=64694b96ed4909e815f1d10605ae4e83&time=1513525898&domain_ID=70&lang_ID=70&timezone_ID=31&pair_ID=171&interval=86400&refresh=4&session=session&client=1&user=200743128&width=650&height=750&init_page=instrument&m_pids=&watchlist=&site=https://au.investing.com&version=1.11.2" 
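     Do While .Busy Or .readyState <> READYSTATE_COMPLETE: DoEvents: Loop 'DoEvents keeps Excel responsive while waiting 
     Application.Wait (Now + TimeValue("0:00:05")) 'give the page scripts time to build the first iframe 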
     Set frm = .document.getElementsByClassName("abs") ''this is the first iframe 
     .navigate frm(0).src 
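     Do While .Busy Or .readyState <> READYSTATE_COMPLETE: DoEvents: Loop 'wait for the iframe page itself to load 
     Application.Wait (Now + TimeValue("0:00:05")) 'give the page scripts time to build the second iframe 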
     Set html = .document 
    End With 

    Set frmano = html.getElementsByTagName("iframe")(0).contentWindow.document ''this is the second iframe 

    For Each post In frmano.getElementsByClassName("pane-legend-item-value pane-legend-line main") 
     Debug.Print post.innerText 
    Next post 
    IE.Quit 
End Sub
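
The fixed five-second `Application.Wait` calls above may be too short on a slow connection and longer than needed on a fast one. As a rough sketch (not part of the original answer), one alternative is to poll the inner document until the legend element shows up or a timeout passes. The helper name `WaitForLegend` and the 15-second default timeout are assumptions made for illustration, and the sketch assumes the inner iframe document has already been obtained as in the code above.

```vba
' Hypothetical helper (not from the original answer): polls innerDoc until at
' least one element with the legend class exists, or timeoutSeconds elapse.
' Returns the matching element collection, or Nothing on timeout.
Function WaitForLegend(innerDoc As Object, Optional timeoutSeconds As Double = 15) As Object
    Dim deadline As Date
    Dim items As Object

    deadline = Now + timeoutSeconds / 86400#    ' convert seconds to a fraction of a day
    Do
        Set items = innerDoc.getElementsByClassName("pane-legend-item-value pane-legend-line main")
        If items.Length > 0 Then
            Set WaitForLegend = items
            Exit Function
        End If
        DoEvents                                  ' let pending events run between polls
        Application.Wait Now + TimeValue("0:00:01")
    Loop While Now < deadline
    ' Falling through leaves the return value as Nothing (timeout)
End Function
```

In `forexpros`, the `For Each` loop could then iterate over the result of `WaitForLegend(frmano)` after checking that it is not `Nothing`, in place of relying on the second fixed wait.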