+ Reply to Thread
Results 1 to 21 of 21

Script loop - get data from html table

  1. #1
    Registered User
    Join Date
    09-30-2012
    Location
    Stockholm
    MS-Off Ver
    Excel 2010
    Posts
    13

    Script loop - get data from html table

    Hi!

    I finally gave up and joined the forum after having read loads of threads and tutorials. I just can't figure out how to make this script. I'd be very grateful if someone helped me with this!

    What I want is a script that gets information from a html table. I found a tutorial on how to do that, so far so good. The problem I have is that I have thousands of url's, which all goes to the same site, but to different search results. Something like this:

    www.somesite.com/searchresult1
    www.somesite.com/searchresult2
    www.somesite.com/searchresult3 and so on....

    Therefor I have to get the script to read each url from a list, one at a time, and save each result. I.E, the script has to be a loop.

    The table is divided in 2 columns and several rows. The first column always contain the same text because they are titles. Something like this:

    Age 30
    Gender Male
    Marital status Married
    Height 6"4
    Weight 194 lbs
    Location Arizona
    I'm only interested in the second column (green text). At least, it's the only text I'm going to include in the final result. The first column will only be column headers.

    The final result should contain the values from half the table (green text), from all 5000 url's. Only three url's in this example of the final result:


    Age Gender Marital status Height Weight Location
    30 Male Married 6"4 194 lbs Arizona
    57 Female Not married 5"4 153 lbs New York
    11 Male Not married 4"8 131 lbs Ohio

    Anyone know how I could do this!?

  2. #2
    Forum Expert mrice's Avatar
    Join Date
    06-22-2004
    Location
    Surrey, England
    MS-Off Ver
    Excel 2013
    Posts
    4,967

    Re: Script loop - get data from html table

    http://stackoverflow.com/questions/2...with-excel-vba

    The above link has a mechanism for extracting the HTML from a given link. This could be incorporated into a loop. If you need more help, I would suggest providing a file containing the actual HTML as an example.
    Martin

  3. #3
    Registered User
    Join Date
    09-30-2012
    Location
    Stockholm
    MS-Off Ver
    Excel 2010
    Posts
    13

    Re: Script loop - get data from html table

    Quote Originally Posted by mrice View Post
    http://stackoverflow.com/questions/2...with-excel-vba

    The above link has a mechanism for extracting the HTML from a given link. This could be incorporated into a loop. If you need more help, I would suggest providing a file containing the actual HTML as an example.
    Thanks for answering! I think getting the data is the easy part though.

    I've tried the code below, which was posted as an answer to someone elses problem. It gets the content from the table and puts it in Excel like the table above (with green text). The problem I have is to have the script to read many url's and save everything as in the last table above (with no green text).

    Please Login or Register  to view this content.
    Last edited by Rawland Hustle; 09-30-2012 at 04:41 PM.

  4. #4
    Registered User
    Join Date
    09-30-2012
    Location
    Stockholm
    MS-Off Ver
    Excel 2010
    Posts
    13

    Re: Script loop - get data from html table

    Anyone?

  5. #5
    Registered User
    Join Date
    09-30-2012
    Location
    Stockholm
    MS-Off Ver
    Excel 2010
    Posts
    13

    Re: Script loop - get data from html table

    If it's any help, here's the element ID's:

    Index: 246
    Inner Text: Just two words of text that I can't show
    OuterHTML: <TD class=cant-show-this>Same two words as in Inner Text</TD>

    All the elements (<td>'s) I'm interested in looks the same, except for text and index number. Index numbers for the elements are: 246, 249, 252, 255, 258, 261, 264, 267, 270, 273 and 276.

    Hope that helps?

  6. #6
    Registered User
    Join Date
    09-30-2012
    Location
    Stockholm
    MS-Off Ver
    Excel 2010
    Posts
    13

    Re: Script loop - get data from html table

    This is exactly what I want: http://stackoverflow.com/questions/6...e-result-using

    I just have to change it so it fits to my needs. What values do I have to change?

    I'm so close now, please help me! :D

  7. #7
    Forum Expert mrice's Avatar
    Join Date
    06-22-2004
    Location
    Surrey, England
    MS-Off Ver
    Excel 2013
    Posts
    4,967

    Re: Script loop - get data from html table

    It would really help if you were able to provide some examples URLs (assuming that these are public) so that respondents can test what they propose.

  8. #8
    Registered User
    Join Date
    09-30-2012
    Location
    Stockholm
    MS-Off Ver
    Excel 2010
    Posts
    13

    Re: Script loop - get data from html table

    Quote Originally Posted by mrice View Post
    It would really help if you were able to provide some examples URLs (assuming that these are public) so that respondents can test what they propose.
    I would do that in a heartbeat if it wasn't for the fact that I can't. You'd have to be logged in with my e-ID, as an admin, to access those url's.

    I'm not asking for someone to do the job for me, "just" point out wich parameters in this script that I need to change. Which part of it defines which elements it should look in? I want to give it the <td>'s it should look in. I know the ID's to those <td>'s.

  9. #9
    Valued Forum Contributor
    Join Date
    05-21-2009
    Location
    Great Britain
    MS-Off Ver
    Excel 2003
    Posts
    550

    Re: Script loop - get data from html table

    Quote Originally Posted by Rawland Hustle View Post
    I'm not asking for someone to do the job for me, "just" point out wich parameters in this script that I need to change. Which part of it defines which elements it should look in? I want to give it the <td>'s it should look in. I know the ID's to those <td>'s.
    The URL in ActiveSheet.QueryTables.Add and the table number(s) 20 in .WebTables = "20". The code uses a web query, which can retrieve data from specific HTML tables; a web query doesn't use <td> IDs, so you may not be able to retrieve the specific data values, but only the table(s) in which they reside, which may include other unwanted data.

    Web queries are not the fastest way of retrieving web data, but they are the simplest. The easiest way to get started is by using the macro recorder with you creating a manual web query on its own sheet. This retrieves the table(s) in which the data resides, and then copy and transpose the specific values you want to your main sheet. Loop through the URLs, changing the qt.Connection value in the single web query, similar to the way shown in the code.
    Post responsibly. Search for excelforum.com

  10. #10
    Registered User
    Join Date
    09-30-2012
    Location
    Stockholm
    MS-Off Ver
    Excel 2010
    Posts
    13

    Re: Script loop - get data from html table

    Quote Originally Posted by Chippy View Post
    The URL in ActiveSheet.QueryTables.Add and the table number(s) 20 in .WebTables = "20". The code uses a web query, which can retrieve data from specific HTML tables; a web query doesn't use <td> IDs, so you may not be able to retrieve the specific data values, but only the table(s) in which they reside, which may include other unwanted data.

    Web queries are not the fastest way of retrieving web data, but they are the simplest. The easiest way to get started is by using the macro recorder with you creating a manual web query on its own sheet. This retrieves the table(s) in which the data resides, and then copy and transpose the specific values you want to your main sheet. Loop through the URLs, changing the qt.Connection value in the single web query, similar to the way shown in the code.
    Dude, I love you! I changed the existing script and it works!

    As always, there are some adjusting to do :P For som url's, a certain cell <td> in the table <table> has two rows or more, instead of just one. That messes up the copy/paste/transpose to the second sheet in Excel, beause the text I want to copy from cell A2 is actually in A3 or A4. Can I make the script smart enough to get the value from the cell, that is to right of the cell with a specific value? I.E., instead of copying value from "A2", always copy from "the cell to the right of the cell "marital status"".

  11. #11
    Valued Forum Contributor
    Join Date
    05-21-2009
    Location
    Great Britain
    MS-Off Ver
    Excel 2003
    Posts
    550

    Re: Script loop - get data from html table

    Use Find with the macro recorder and modify the code generated. I think there is an example in the VBA Find help. Use foundCell.Offset(0,1).Value to get the cell value one column to the right of the found cell.

  12. #12
    Registered User
    Join Date
    09-30-2012
    Location
    Stockholm
    MS-Off Ver
    Excel 2010
    Posts
    13

    Re: Script loop - get data from html table

    Quote Originally Posted by Chippy View Post
    Use Find with the macro recorder and modify the code generated. I think there is an example in the VBA Find help. Use foundCell.Offset(0,1).Value to get the cell value one column to the right of the found cell.
    I'm sorry, but I don't understand what you mean How do I use "Find" with the macro recorder? Where is "VBA Find help"?
    Let's say the text in a cell is "Marital status", and I want the value from the cell to the right of it. How do write that? This is the part that I have to change:
    Please Login or Register  to view this content.

  13. #13
    Registered User
    Join Date
    09-30-2012
    Location
    Stockholm
    MS-Off Ver
    Excel 2010
    Posts
    13

    Re: Script loop - get data from html table

    I changed the code (marked red below) but that didn't work. What am I doing wrong?

    Please Login or Register  to view this content.

  14. #14
    Valued Forum Contributor
    Join Date
    05-21-2009
    Location
    Great Britain
    MS-Off Ver
    Excel 2003
    Posts
    550

    Re: Script loop - get data from html table

    Quote Originally Posted by Rawland Hustle View Post
    I'm sorry, but I don't understand what you mean How do I use "Find" with the macro recorder?
    In the main Excel GUI: Tools - Macro - Record New Macro - OK to start the macro recorder. Then Edit - Find - enter string to find and set other options as required. Stop the macro recorder. Go into VB editor to view the code generated. Search for excel macro recorder if you need more help.

    Where is "VBA Find help"?
    In the VB editor: Help - Microsoft VBA Help - enter 'find' in search box. Or more simply, type in the word 'Find' somewhere in the code, put cursor on the Find keyword and press F1 key.

    The above is for Excel 2003 and should be similar for other versions.

  15. #15
    Valued Forum Contributor
    Join Date
    05-21-2009
    Location
    Great Britain
    MS-Off Ver
    Excel 2003
    Posts
    550

    Re: Script loop - get data from html table

    Quote Originally Posted by Rawland Hustle View Post
    I changed the code (marked red below) but that didn't work. What am I doing wrong?
    In what way didn't it work? You haven't said. Any error messages? As already mentioned, use the macro recorder whilst doing a manual Find in the Excel GUI and incorporate the code generated (which will need editing to remove Select, Activate) and assign the range returned by Find to a range variable (as you do for the code in red).

    The code in red is using the code name for the sheet (Sheet1.Range), which may not be correct for your specific situation. And is your web query on Sheet1 (code name). Use sheet names instead like Sheets("Sheet1").Range to avoid confusion with code names. Also, shouldn't the Find be done inside the loop after the Refresh, if the web query worked?

    Also, if your range Sheet2.Range("A2:A29") contains complete URLs, changing the query table connection parameter for each URL in the loop should just be:
    Please Login or Register  to view this content.
    Remember, the code you linked to is specific to the particular request posted so probably won't work exactly for your situation without modification. I would also comment out the On Error lines whilst testing the code so that errors are not suppressed - in this case it should only be used to trap possible errors from the web query.

  16. #16
    Registered User
    Join Date
    09-30-2012
    Location
    Stockholm
    MS-Off Ver
    Excel 2010
    Posts
    13

    Re: Script loop - get data from html table

    Thanks for your replies!

    Quote Originally Posted by Chippy View Post
    In what way didn't it work? You haven't said. Any error messages?
    Sorry. When I ran the scripts, the cells got blank. The same cells recieved values before I started fiddeling with the script.

    Quote Originally Posted by Chippy View Post
    As already mentioned, use the macro recorder whilst doing a manual Find in the Excel GUI and incorporate the code generated (which will need editing to remove Select, Activate) and assign the range returned by Find to a range variable (as you do for the code in red).
    This is what I get if I record when I select the range I want to look within, then click find and enter "Marital status":

    Please Login or Register  to view this content.
    Quote Originally Posted by Chippy View Post
    The code in red is using the code name for the sheet (Sheet1.Range), which may not be correct for your specific situation. And is your web query on Sheet1 (code name).
    I thought the actual sheets had to be named "Sheet1" and "Sheet2" I downloaded the whole .xls in this answer so everything works fine.

    Quote Originally Posted by Chippy View Post
    Also, shouldn't the Find be done inside the loop after the Refresh, if the web query worked?
    I have no idea :P I've never worked with VBA before.

    Quote Originally Posted by Chippy View Post
    Also, if your range Sheet2.Range("A2:A29") contains complete URLs, changing the query table connection parameter for each URL in the loop should just be:
    Please Login or Register  to view this content.
    Remember, the code you linked to is specific to the particular request posted so probably won't work exactly for your situation without modification. I would also comment out the On Error lines whilst testing the code so that errors are not suppressed - in this case it should only be used to trap possible errors from the web query.
    To clarify: the web query works fine. Everything with the script works fine, except for this one tiny little thing. When the web query results are copy/pasted and transposed to the other sheet, I can't use "A2" as a reference because it doesn't match sometimes. I have to change "A2" to "The cell to the right of the cell with the text "Marital status" in it" I just don't know how to write that.

  17. #17
    Valued Forum Contributor
    Join Date
    05-21-2009
    Location
    Great Britain
    MS-Off Ver
    Excel 2003
    Posts
    550

    Re: Script loop - get data from html table

    Quote Originally Posted by Rawland Hustle View Post
    Everything with the script works fine, except for this one tiny little thing. When the web query results are copy/pasted and transposed to the other sheet, I can't use "A2" as a reference because it doesn't match sometimes. I have to change "A2" to "The cell to the right of the cell with the text "Marital status" in it" I just don't know how to write that.
    See if you can incorporate/adapt this snippet into your code (I've used MsgBox for debugging):
    Please Login or Register  to view this content.
    It assumes the web query retrieves data to the sheet named "Sheet1", which is not necessarily the same as the CodeName for that sheet. You should put this code inside the If Err.Number = 0 Then statement because you want to find the data for each web query retrieval.

  18. #18
    Registered User
    Join Date
    09-30-2012
    Location
    Stockholm
    MS-Off Ver
    Excel 2010
    Posts
    13

    Re: Script loop - get data from html table

    Quote Originally Posted by Chippy View Post
    See if you can incorporate/adapt this snippet into your code
    I'm sorry, but I don't know exactly where to put that snippet. It gives me "Not found" no matter where I put it (that's what she said). Should it replace some part?

    Below is the whole code. Could you please give me the correct, full version of it?

    Please Login or Register  to view this content.
    Remember, everything with this code works fine, except for the specific problem we've been talking about for the last few posts. The red part is what needs fixing.
    Last edited by Rawland Hustle; 10-01-2012 at 05:06 PM. Reason: Wrong code first. Pasted the correct one.

  19. #19
    Valued Forum Contributor
    Join Date
    05-21-2009
    Location
    Great Britain
    MS-Off Ver
    Excel 2003
    Posts
    550

    Re: Script loop - get data from html table

    Try this - I've added the Find call in the correct place (it replaces the line with the red bit). As previously mentioned, I've commented out the On Error lines in case errors occur with the code (you want to see what they are and write code to handle potential errors instead of suppressing whilst testing).
    Please Login or Register  to view this content.

  20. #20
    Registered User
    Join Date
    09-30-2012
    Location
    Stockholm
    MS-Off Ver
    Excel 2010
    Posts
    13

    Re: Script loop - get data from html table

    VIOLA! It works! :D

    You're awesome. I really appreciate the help you've given me! I really do.

    Now I'm going to run this script on 4800 urls. When I tried 160 urls Excel kinda froze for a while. Should/could I change the speed of the script so it doesn't overload?

    Again, thanks alot!

  21. #21
    Valued Forum Contributor
    Join Date
    05-21-2009
    Location
    Great Britain
    MS-Off Ver
    Excel 2003
    Posts
    550

    Re: Script loop - get data from html table

    Quote Originally Posted by Rawland Hustle View Post
    Now I'm going to run this script on 4800 urls. When I tried 160 urls Excel kinda froze for a while. Should/could I change the speed of the script so it doesn't overload?
    Putting DoEvents after the qt.Refresh will allow Excel to respond and update itself.

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1