VBA: Scraping information from an HTML table

Question

I'm trying to pull information from an HTML table. I want to add each element within the table to a collection. This is what I have so far.

Dim htmlTable As Object
Dim coll2 As Collection

Set coll2 = New Collection
Set IE = New InternetExplorerMedium

With IE
    '.AddressBar = False
    '.MenuBar = False
    .Navigate ("PASSWORDED SITE")
    .Visible = True
End With

Set htmlTable = IE.Document.getElementById("ctl00_ContentPlaceHolder1_gvExtract")
Set tableItem = IE.Document.getElementsByTagName("td")

With coll2
    For Each tableItem In htmlTable.innerHTML
        .Add tableItem
    Next
End With

I have a problem with the line For Each tableItem In htmlTable.innerHTML. I tried different variations, such as htmlTable.innerText, each throwing different errors.

This is the content of the table:

Delete  Download                         Extract Date         User Id Owner
Delete  Work Order Inquiry - Work Order  06/20/2017 07:50:37  MBMAYO
Delete  Work Order Inquiry - Work Order  06/20/2017 07:39:29  MBMAYO
Delete  Work Order Inquiry - Work Order  06/20/2017 07:23:54  MBMAYO
Delete  Work Order Inquiry - Work Order  06/20/2017 07:16:16  MBMAYO
Delete  Work Order Inquiry - Work Order  06/20/2017 07:14:06  MBMAYO

The goal is to store each table element as an item in a collection and then retrieve the date, for example 06/20/2017 07:50:37, from it. This table grows, so I think an array is out of the question?

Edit from comment:

I have been trying to call this function, but I'm getting an "object does not support this method" error:

Public Function htmlCell(id As String) As String
    htmlCell = IE.getElementById("ctl00_ContentPlaceHolder1_gvExtract").getElementsByTagName("td")(id).innerHTML
End Function

Answer 1:

What you probably need is something like this. HTH

' Early binding: requires a reference to the Microsoft HTML Object Library (MSHTML)
Dim htmlTable As MSHTML.htmlTable
Dim htmlTableCells As MSHTML.IHTMLElementCollection
Dim htmlTableCell As MSHTML.htmlTableCell
Dim htmlAnchor As MSHTML.HTMLAnchorElement

Set htmlTable = ie.document.getElementById("ctl00_ContentPlaceHolder1_gvExtract")
' Only the td elements inside the target table
Set htmlTableCells = htmlTable.getElementsByTagName("td")

With coll2
    For Each htmlTableCell In htmlTableCells
        ' If the cell wraps its text in a link, store the link's content instead
        If VBA.TypeName(htmlTableCell.FirstChild) = "HTMLAnchorElement" Then
            Set htmlAnchor = htmlTableCell.FirstChild
            .Add htmlAnchor.innerHTML
        Else
            .Add htmlTableCell.innerHTML
        End If
    Next
End With

Result

Dim el
For Each el In coll2
    Debug.Print el
Next el

Output:

Delete
Work Order Inquiry - Work Order
06/20/2017 07:50:37
MBMAYO
Delete
Work Order Inquiry - Work Order
06/20/2017 07:39:29
MBMAYO
Delete
Work Order Inquiry - Work Order
06/20/2017 07:23:54
MBMAYO
Delete
Work Order Inquiry - Work Order
06/20/2017 07:16:16
MBMAYO
Delete
Work Order Inquiry - Work Order
06/20/2017 07:14:06
MBMAYO
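
To get at the date for each row rather than a flat list of cells, one option is to walk the table's rows and read a single cell from each. The following is a minimal sketch, not part of the original answer: CollectColumn is a hypothetical helper, and it assumes the first row is the header and that the date sits in the third cell (zero-based index 2), as the output above suggests.

```vba
' Hypothetical helper: returns the text of one column from every data row.
' htmlTable is the table element; colIndex is zero-based.
Public Function CollectColumn(ByVal htmlTable As Object, ByVal colIndex As Long) As Collection
    Dim result As Collection
    Dim rowCells As Object
    Dim i As Long

    Set result = New Collection

    ' Start at 1 on the assumption that row 0 is the header row
    For i = 1 To htmlTable.Rows.Length - 1
        Set rowCells = htmlTable.Rows.Item(i).Cells
        ' Skip rows that have fewer cells than expected
        If colIndex < rowCells.Length Then
            result.Add Trim$(rowCells.Item(colIndex).innerText)
        End If
    Next i

    Set CollectColumn = result
End Function
```

It could be called as Set dates = CollectColumn(ie.document.getElementById("ctl00_ContentPlaceHolder1_gvExtract"), 2). Because a Collection grows as items are added, the number of rows does not need to be known in advance. Converting the stored strings with CDate depends on the machine's regional settings, so that step is left out here.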

Answer 2:

I would try something like this:

Dim htmlTable As Object
' Declared As Object: getElementsByTagName returns an HTML element collection,
' which cannot be assigned to a VBA Collection variable
Dim collTD As Object
Dim oNode As Object

' Set IE ...

Set htmlTable = IE.Document.getElementById("ctl00_ContentPlaceHolder1_gvExtract")

' You only want the td's inside htmlTable !
Set collTD = htmlTable.getElementsByTagName("td")

For Each oNode In collTD
    Debug.Print oNode.InnerHTML
    ' Stop -> use Watch window to drill down into oNode subitems
Next oNode

and go from there.
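
For the htmlCell function in the question's edit, the "object does not support this method" error is consistent with calling getElementById on the IE object itself; that method belongs to IE.Document. A minimal sketch of a corrected version follows. It assumes IE is a module-level InternetExplorer object whose page has finished loading, and it takes a numeric cell index (the cellIndex parameter is a change from the original String argument, since a string would be treated as an element name or id rather than a position).

```vba
' Sketch only: assumes IE is a module-level InternetExplorer object
' and that the page has finished loading.
Public Function htmlCell(ByVal cellIndex As Long) As String
    Dim tbl As Object
    Dim tdCells As Object

    ' getElementById is a method of the document, not of the IE object
    Set tbl = IE.Document.getElementById("ctl00_ContentPlaceHolder1_gvExtract")
    If tbl Is Nothing Then Exit Function

    Set tdCells = tbl.getElementsByTagName("td")

    ' Return an empty string when the index is out of range
    If cellIndex >= 0 And cellIndex < tdCells.Length Then
        htmlCell = tdCells.Item(cellIndex).innerHTML
    End If
End Function
```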

Answer 3:

I think it should be something like this.

Sub Scrape_HTML()
    Dim ie As Object, Links As Object, lnk As Object
    Dim RowCount As Long

    Set ie = CreateObject("InternetExplorer.Application")

    With ie
        .Visible = True
        .navigate "your_URL_here"

        ' Wait for the page to fully load; you can't do anything if the page is not fully loaded
        Do While .Busy Or .readyState <> 4
            DoEvents
        Loop
    End With

    Set Links = ie.document.getElementsByTagName("tr")
    RowCount = 1

    ' Scrape out the innerText of each 'tr' element.
    With Sheets("DataSheet")
        For Each lnk In Links
            .Range("A" & RowCount) = lnk.innerText
            RowCount = RowCount + 1
        Next
    End With
End Sub

Source: https://stackoverflow.com/questions/44653360/vba-scraping-information-from-html-table
