Closed Thread
Results 1 to 17 of 17

Scraping Data from Web Into Excel Using VBA

  1. #1
    Registered User
    Join Date
    10-17-2003
    Posts
    32

    Scraping Data from Web Into Excel Using VBA

    I have a list of several hundred URLs in Excel. The URLs are all from the same site and have the same format and structure, but the specific data on the page varies. There are 3 specific pieces of information from each resulting web page that I would like to capture in an Excel worksheet. I am hopeful that there is a way to do this using VB, but I have not been able to get it to work.

    I have successfully used a Web Query for one of the URLs from my list, identified the appropriate table (table 9), imported the data from that URL into Excel, identified the 3 fields I would like to copy (A2, A3, B5) and pasted that data into a new "results" worksheet. But now I need an automated solution that will look up the next URL in the list, run the Web Query and copy the contents of the three cells into the next row of the results worksheet, then move on to the next URL, etc.

    Does anyone have any pointers how to do this? Or perhaps a better way to accomplish this than using VB? Thanks for your help!

  2. #2
    Valued Forum Contributor mudraker's Avatar
    Join Date
    11-10-2003
    Location
    Melbourne, Australia
    Posts
    3,983
    It can be done.

    The best way to get the results your require is to record a macro as you setup your webquery and get your data.

    Post a copy of the code here and also provide details of were the list of URL's is kept and onto what sheet rows, cells your require the data to be placed

  3. #3
    Forum Contributor
    Join Date
    07-13-2006
    Posts
    400
    i just did pretty much what mudraker suggested

    assumes your URL's are in Sheet1 column A
    assume you have a sheet named Data you want the data posted to
    it posts cells B2,B3,C5 to the first top left cells of sheet Data

    i just tested it with google.com so you'll have to update the webimport code with whatever you find from going to your webaddress

    Please Login or Register  to view this content.
    hth
    --Mark

    Quantum materiae materietur marmota monax si marmota monax materiam possit materiari?

  4. #4
    Registered User
    Join Date
    10-17-2003
    Posts
    32
    Thanks so much for both of your replies. MDubbelboer, I tried your code, but it kept throwing "out of range" and other errors. I was able to patch some of them (fixing worksheet names, etc.), but still couldn't get it to work. I have a feeling the trouble lies with my inadequate description of my situation, so let me give you some additional information.

    My workbook has 3 worksheets: URLs, Results and Scrape. In the sheet URLs, I have a list of URLs starting in Cell A1 and continuing down to cell A408. Here is an example of one of the URLs (from cell A1):

    http://www.webflyer.com/travel/milem...&ticket_price=

    The sheet Scrape is there to host the Web Query data (I wasn't sure if I needed a separate sheet for this or not). So if I switch to the Scrape sheet and manually run a Web Query on the above URL and import Table 9, cell A2 will contain the first city name ("O'Hare Intl Arpt, Chicago, IL (ORD)"), cell A3 will contain the second city name ("Dyce Airport, Aberdeen, GB (ABZ)") and cell B5 will contain the miles between those cities ("3710 miles").

    In the sheet Results, cell A1 is labeled City1, B1 is labeled City2 and C1 is labeled Miles. I would like to paste the results from the Scrape sheet into cells A2 through C2 respectively, then move on to the URL in cell A2 of the URLs sheet and paste the results into cells A3 through C3, etc., until the sheet Results contains 409 lines of data (1 row of headers + 408 rows of data).

    Does that change the VB code required, or should the initial code you gave me work? Thanks again for your help.

  5. #5
    Valued Forum Contributor mudraker's Avatar
    Join Date
    11-10-2003
    Location
    Melbourne, Australia
    Posts
    3,983
    try this


    Sub WebData()

    Dim wSU As Worksheet
    Dim wSR As Worksheet
    Dim wSS As Worksheet

    Dim iForRow As Integer
    Dim iLastRow As Integer
    Dim sURL As String

    Set wSU = ThisWorkbook.Sheets("URL's")
    Set wSR = ThisWorkbook.Sheets("Results")
    Set wSS = ThisWorkbook.Sheets("Scrape")

    iLastRow = Cells(Rows.Count, "a").End(xlUp).Row
    For iForRow = 1 To iLastRow Step 1
    sURL$ = wSU.Cells(iForRow, "a").Value
    With ActiveSheet.QueryTables.Add(Connection:= _
    "URL;http://" & sURL, Destination:=Range("A1"))
    .Name = _
    "getmileage.php?city=ord&city=ABZ&city=&city=&city=&bonus=" _
    & "0&bonus_use_min=0&class_bonus=0&class_bonus_use_min=0&promo_bonus=" _
    & "0&promo_bonus_use_min=0&min=0&min_type=m&ticket_price="
    .FieldNames = True
    .RowNumbers = False
    .FillAdjacentFormulas = False
    .PreserveFormatting = True
    .RefreshOnFileOpen = False
    .BackgroundQuery = True
    .RefreshStyle = xlInsertDeleteCells
    .SavePassword = False
    .SaveData = True
    .AdjustColumnWidth = True
    .RefreshPeriod = 0
    .WebSelectionType = xlSpecifiedTables
    .WebFormatting = xlWebFormattingNone
    .WebTables = "9"
    .WebPreFormattedTextToColumns = True
    .WebConsecutiveDelimitersAsOne = True
    .WebSingleBlockTextImport = False
    .WebDisableDateRecognition = False
    .WebDisableRedirections = False
    .Refresh BackgroundQuery:=False
    End With
    wSR.Cells(iForRow + 1, "a").Value = wSS.Range("a2").Value
    wSR.Cells(iForRow + 1, "b").Value = wSS.Range("a3").Value
    wSR.Cells(iForRow + 1, "c").Value = wSS.Range("b5").Value
    Next iForRow
    End Sub

  6. #6
    Registered User
    Join Date
    10-17-2003
    Posts
    32
    Okay, getting closer. I tried that script. I had to make 2 changes (changed "URL's" to URLs" and took out "http://" after "URL;"). Revised script is below.

    I selected cell A1 in the Scrape sheet and ran the script. It retrieved the first 15 sets of results, filling them across the Scrape sheet (so the first set was in A1:C15, second set in D1:F15, etc.). It also copied the first 15 results into the Results sheet perfectly. But then it stopped, and I can't get it to run any more.

    Any ideas? Thanks!

    REVISED VBA CODE:

    Sub WebData()

    Dim wSU As Worksheet
    Dim wSR As Worksheet
    Dim wSS As Worksheet

    Dim iForRow As Integer
    Dim iLastRow As Integer
    Dim sURL As String

    Set wSU = ThisWorkbook.Sheets("URLs")
    Set wSR = ThisWorkbook.Sheets("Results")
    Set wSS = ThisWorkbook.Sheets("Scrape")

    iLastRow = Cells(Rows.Count, "a").End(xlUp).Row
    For iForRow = 1 To iLastRow Step 1
    sURL$ = wSU.Cells(iForRow, "a").Value
    With ActiveSheet.QueryTables.Add(Connection:= _
    "URL;" & sURL, Destination:=Range("A1"))
    .Name = _
    "getmileage.php?city=ord&city=ABZ&city=&city=&city =&bonus=" _
    & "0&bonus_use_min=0&class_bonus=0&class_bonus_use_m in=0&promo_bonus=" _
    & "0&promo_bonus_use_min=0&min=0&min_type=m&ticket_p rice="
    .FieldNames = True
    .RowNumbers = False
    .FillAdjacentFormulas = False
    .PreserveFormatting = True
    .RefreshOnFileOpen = False
    .BackgroundQuery = True
    .RefreshStyle = xlInsertDeleteCells
    .SavePassword = False
    .SaveData = True
    .AdjustColumnWidth = True
    .RefreshPeriod = 0
    .WebSelectionType = xlSpecifiedTables
    .WebFormatting = xlWebFormattingNone
    .WebTables = "9"
    .WebPreFormattedTextToColumns = True
    .WebConsecutiveDelimitersAsOne = True
    .WebSingleBlockTextImport = False
    .WebDisableDateRecognition = False
    .WebDisableRedirections = False
    .Refresh BackgroundQuery:=False
    End With
    wSR.Cells(iForRow + 1, "a").Value = wSS.Range("a2").Value
    wSR.Cells(iForRow + 1, "b").Value = wSS.Range("a3").Value
    wSR.Cells(iForRow + 1, "c").Value = wSS.Range("b5").Value
    Next iForRow
    End Sub

  7. #7
    Valued Forum Contributor mudraker's Avatar
    Join Date
    11-10-2003
    Location
    Melbourne, Australia
    Posts
    3,983
    Try this one

    I have made several changes

    Turned off scren updating whilst macro is running
    changed activesheet to wSS for scrape sheet and add wSS to the With comand
    Changed .Refreshstyle from insert to overwrite - I think this is the main cause of your problem
    I originally only tested it on 1 run this time i tried it for 20 times

    Added message box at end to let user now process finished


    Clear your Scrape sheet then try running with this code

    Sub WebData()

    Dim wSU As Worksheet
    Dim wSR As Worksheet
    Dim wSS As Worksheet

    Dim iForRow As Integer
    Dim iLastRow As Integer
    Dim sURL As String

    Set wSU = ThisWorkbook.Sheets("URLs")
    Set wSR = ThisWorkbook.Sheets("Results")
    Set wSS = ThisWorkbook.Sheets("Scrape")

    Application.ScreenUpdating = False
    iLastRow = Cells(Rows.Count, "a").End(xlUp).Row
    For iForRow = 1 To iLastRow Step 1
    sURL$ = wSU.Cells(iForRow, "a").Value
    With wSS.QueryTables.Add(Connection:= _
    "URL;" & sURL, Destination:=wSS.Range("A1"))
    .Name = _
    "getmileage.php?city=ord&city=ABZ&city=&city=&city =&bonus=" _
    & "0&bonus_use_min=0&class_bonus=0&class_bonus_use_m in=0&promo_bonus=" _
    & "0&promo_bonus_use_min=0&min=0&min_type=m&ticket_p rice="
    .FieldNames = True
    .RowNumbers = False
    .FillAdjacentFormulas = False
    .PreserveFormatting = True
    .RefreshOnFileOpen = False
    .BackgroundQuery = True
    .RefreshStyle = xlOverwriteCells
    .SavePassword = False
    .SaveData = True
    .AdjustColumnWidth = True
    .RefreshPeriod = 0
    .WebSelectionType = xlSpecifiedTables
    .WebFormatting = xlWebFormattingNone
    .WebTables = "9"
    .WebPreFormattedTextToColumns = True
    .WebConsecutiveDelimitersAsOne = True
    .WebSingleBlockTextImport = False
    .WebDisableDateRecognition = False
    .WebDisableRedirections = False
    .Refresh BackgroundQuery:=False
    End With
    wSR.Cells(iForRow + 1, "a").Value = wSS.Range("a2").Value
    wSR.Cells(iForRow + 1, "b").Value = wSS.Range("a3").Value
    wSR.Cells(iForRow + 1, "c").Value = wSS.Range("b5").Value
    Next iForRow
    Application.ScreenUpdating = True
    MsgBox "Process Completed"
    End Sub

  8. #8
    Registered User
    Join Date
    10-17-2003
    Posts
    32
    Cleared the Scrape sheet, selected cell A1 and tried the new code. It ran okay for the first URL (grabbed data from web, placed table 9 in Scrape, copied correct data to Results), but then gave me the "Process Completed" notification and stopped running.

    What next? Sorry this is turning out to be such a hassle for you, but thanks again for your help.

  9. #9
    Valued Forum Contributor mudraker's Avatar
    Join Date
    11-10-2003
    Location
    Melbourne, Australia
    Posts
    3,983
    Its not a hassel. I should check my code a little more before posting it.

    When getting the lastrow I did not specify which sheet to check so it was getting the last used row of the activesheet.

    I have amended the code to make sure it looks at the url sheet.

    Sub WebData()

    Dim wSU As Worksheet
    Dim wSR As Worksheet
    Dim wSS As Worksheet

    Dim iForRow As Integer
    Dim iLastRow As Integer
    Dim sURL As String

    Set wSU = ThisWorkbook.Sheets("URLs")
    Set wSR = ThisWorkbook.Sheets("Results")
    Set wSS = ThisWorkbook.Sheets("Scrape")

    Application.ScreenUpdating = False
    iLastRow = wSU.Cells(wSU.Rows.Count, "a").End(xlUp).Row
    For iForRow = 1 To iLastRow Step 1
    sURL$ = wSU.Cells(iForRow, "a").Value
    With wSS.QueryTables.Add(Connection:= _
    "URL;" & sURL, Destination:=wSS.Range("A1"))
    .Name = _
    "getmileage.php?city=ord&city=ABZ&city=&city=&city =&bonus=" _
    & "0&bonus_use_min=0&class_bonus=0&class_bonus_use_m in=0&promo_bonus=" _
    & "0&promo_bonus_use_min=0&min=0&min_type=m&ticket_p rice="
    .FieldNames = True
    .RowNumbers = False
    .FillAdjacentFormulas = False
    .PreserveFormatting = True
    .RefreshOnFileOpen = False
    .BackgroundQuery = True
    .RefreshStyle = xlOverwriteCells
    .SavePassword = False
    .SaveData = True
    .AdjustColumnWidth = True
    .RefreshPeriod = 0
    .WebSelectionType = xlSpecifiedTables
    .WebFormatting = xlWebFormattingNone
    .WebTables = "9"
    .WebPreFormattedTextToColumns = True
    .WebConsecutiveDelimitersAsOne = True
    .WebSingleBlockTextImport = False
    .WebDisableDateRecognition = False
    .WebDisableRedirections = False
    .Refresh BackgroundQuery:=False
    End With
    wSR.Cells(iForRow + 1, "a").Value = wSS.Range("a2").Value
    wSR.Cells(iForRow + 1, "b").Value = wSS.Range("a3").Value
    wSR.Cells(iForRow + 1, "c").Value = wSS.Range("b5").Value
    Next iForRow
    Application.ScreenUpdating = True
    MsgBox "Process Completed"
    End Sub

  10. #10
    Registered User
    Join Date
    10-17-2003
    Posts
    32
    SUCCESS! I have been trying to figure out a way to get this data into a local database for quite a while. Thanks so much for your help! EAG

  11. #11
    Registered User
    Join Date
    01-31-2006
    Posts
    17

    scraping - with login?

    This has been a most informative thread. I would like to add a thought to it. Can this code be modified to allow a login to a secure site? Of course, I have the password and user ID.

  12. #12
    Registered User
    Join Date
    05-26-2004
    Posts
    7
    Can anyone help tell me what the NAME part of the code does?

    Cheers Ezy

  13. #13
    Forum Contributor
    Join Date
    04-06-2013
    Location
    india
    MS-Off Ver
    Excel 2007
    Posts
    193

    Re: Scraping Data from Web Into Excel Using VBA

    finally get a msgbox completed
    but nothing happend in the Result sheet
    means still completed blank page

  14. #14
    Registered User
    Join Date
    08-13-2012
    Location
    London, England
    MS-Off Ver
    Excel 2003
    Posts
    1

    Re: Scraping Data from Web Into Excel Using VBA

    Hi

    I'm trying to get the following data, particullary the 49.88 EUR price, but I can't with all I have learnt above. Any idea?

    Sin título.jpg

  15. #15
    Valued Forum Contributor mudraker's Avatar
    Join Date
    11-10-2003
    Location
    Melbourne, Australia
    Posts
    3,983

    Re: Scraping Data from Web Into Excel Using VBA

    munu

    Please read forum rules

    2. Don't post a question in the thread of another member -- start your own. If you feel it's particularly relevant, provide a link to the other thread. You may include up to 3 links to other URLs in a single post, no more, so only link to the relevant pages

    HTML Code: 
    Please Read Forum Rules Before Posting
    Wrap VBA code by selecting the code and clicking the # icon or Read This
    How To Cross Post politely

    Top Excel links for beginners to Experts

    If you are pleased with a member's answer then use the Scales icon to rate it
    If my reply has assisted or failed to assist you I welcome your Feedback.

  16. #16
    Registered User
    Join Date
    01-31-2015
    Location
    california
    MS-Off Ver
    2013
    Posts
    1

    Re: Scraping Data from Web Into Excel Using VBA

    Hi Guys, I am trying to organize some of my clients listings in a spreadsheet.. In Excel I have this list
    http://ventura.craigslist.org/reo/4866324189.html
    http://visalia.craigslist.org/reo/4834244809.html
    http://visalia.craigslist.org/reo/4835909395.html

    And I used the code above so I can extract the Data of these 3 urls into the spreadsheet but I keep getting OUT OF RANGE ERRORS... any ideas? I am on outlook 2013 Thank you!

  17. #17
    Forum Moderator zbor's Avatar
    Join Date
    02-10-2009
    Location
    Croatia
    MS-Off Ver
    365 ProPlus
    Posts
    15,576

    Re: Scraping Data from Web Into Excel Using VBA

    Unfortunately your post does not comply with Rule 2 of our Forum RULES. Do not post a question in the thread of another member -- start your own thread.

    If you feel an existing thread is particularly relevant to your need, provide a link to the other thread in your new thread.

    Old threads are often only monitored by the original participants. New threads not only open you up to all possible participants again, they typically get faster response, too.

Closed Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1