I am working on an Excel VBA project to scrape some specific information from a website. The view of this data on the website is as such:
ukVGe.png
What I am looking to do is extract text based on two criteria: Name and post date. For example, I have the name Kaelan and the post date of 11/16/2016. I want to extract the amount of $365.
This is the HTML code:
< div class="familyLedgerAmountCategory" id="id_4541278" >
< table >
< tr >
< td class="tdCategoryRow" >
< div class="cmFloatLeft divExpandToggle expanded" id="divCategoryToggle_id_4541278" >< /div >
< div class="cmFloatLeft" id="divCategoryLabel_id_4541278" style="width: 430px;" >
Kaelan
< /div >< span style="margin-left: 5px;" >$ 465.00< /span >
< /td >
< /tr >
< tbody >
< tr class="trListTableBody LedgerExisting" id="CamperFamilyLedgerRowControl_14816465" >
< td class="tdCamperFamilyLedgerTableColumnDescription tdBorderTop" id="tdCamperFamilyLedgerTableColumnDescription_CamperFamilyLedgerRowControl_14816465" >
< div class="divListTableBodyCell" id="tdColumnDescriptionCell" >
< table class="tblListTableBodyCell" >
< tr >
< td >
< div class="divListTableBodyLabel" >
< a class="aColumnDescriptionCell" id="aColumnDescriptionCell_CamperFamilyLedgerRowControl_14816465" name="aColumnDescriptionCell_CamperFamilyLedgerRowControl_14816465" target="_self" title="Click to view details">2017 Super Early Bird Teen Camp - Tuition< /a >
< /div >
< /td >
< /tr >
< /table >
< /div >
< /td >
< td class="tdCamperFamilyLedgerTableColumnPostDate tdBorderTop" id="tdCamperFamilyLedgerTableColumnPostDate_CamperFamilyLedgerRowControl_14816465" >
< div class="divListTableBodyCell" id="tdColumnPostDateCell" >
< table class="tblListTableBodyCell" >
< tr >
< td >
< div class="divListTableBodyLabel" >
11/16/2016
< /div >
< /td >
< /tr >
< /table >
< /div >
< /td >
< td class="tdCamperFamilyLedgerTableColumnEffective tdBorderTop" id="tdCamperFamilyLedgerTableColumnEffective_CamperFamilyLedgerRowControl_14816465" >
< div class="divListTableBodyCell" id="tdColumnEffectiveCell" >
< table class="tblListTableBodyCell" >
< tr >
< td >
< div class="divListTableBodyLabel" >
11/15/2016
< /div >
< /td >
< /tr >
< /table >
< /div >
< /td >
< td class="tdCamperFamilyLedgerTableColumnQty tdBorderTop" id="tdCamperFamilyLedgerTableColumnQty_CamperFamilyLedgerRowControl_14816465" >
< div class="divListTableBodyCell" id="tdColumnQtyCell" >
< table class="tblListTableBodyCell" >
< tr >
< td >
< div class="divListTableBodyLabel" >
1
< /div >
< /td >
< /tr >
< /table >
< /div >
< /td >
< td class="tdCamperFamilyLedgerTableColumnAmount tdBorderTop" id="tdCamperFamilyLedgerTableColumnAmount_CamperFamilyLedgerRowControl_14816465" >
< div class="divListTableBodyCell" id="tdColumnAmountCell" >
< table class="tblListTableBodyCell" >
< tr >
< td >
< div class="divListTableBodyLabel" >
$ 365.00
< /div >
< /td >
< /tr >
< /table >
< /div >
< /td >
< td class="tdCamperFamilyLedgerTableColumnAction tdBorderTop" id="tdCamperFamilyLedgerTableColumnAction_CamperFamilyLedgerRowControl_14816465" >< /td >
< /tr >
< /tbody >
< /table >
< /div >
This is the VBA code for my attempt.
Function GetParent(el, tagParent)
Dim rv As Object
Set rv = el
Do While Not rv.parentElement Is Nothing
Set rv = rv.parentElement
If UCase(rv.tagName) = UCase(tagParent) Then
Set GetParent = rv
Exit Function
End If
Loop
Set GetParent = Nothing
End Function
Sub Test()
Dim ie As Object
Dim oElement As Object
Dim wsTarget As Worksheet
Dim i As Integer
Dim NewWB As Workbook
Set NewWB = ActiveWorkbook
Set wsTarget = NewWB.Sheets(1)
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = True
ie.navigate website..
Wait 6
ie.document.All.Item("txtUserName").Value = "User"
ie.document.All.Item("pswdPassword").Value = "Pass"
Wait 1
ie.document.getElementById("btnLogin").Click
Wait 5
ie.navigate website....
Wait 7
Set doc = ie.document
Set els = doc.getElementsByClassName("cmFloatLeft")
i = 1
For Each oElement In els
'Debug.Print oElement.innerText
If Trim(oElement.innerText) = "Kaelan" Then
Set tbl = GetParent(oElement, "table") '<< find the parent table
If Not tbl Is Nothing Then
'loop over the parent table
m = 1
n = 1
For Each rw In tbl.Rows
wsTarget.Range("F" & m) = rw.innerText
m = m + 1
For Each cl In rw.Cells
wsTarget.Range("D" & n) = cl.innerText
n = n + 1
Next cl
Next rw
End If
End If
Next
So the above code will return the header descriptions of the table such as "Description, Post Date, Effective, Qty, Amount." How can I use this information where I am to extract information from a table below? Since I believe the actual amounts and dates are part of a table below the table where I have extracted information from.
Bookmarks