+ Reply to Thread
Results 1 to 7 of 7

VBA: Website Scraper

  1. #1
    Forum Contributor
    Join Date
    08-25-2015
    Location
    London
    MS-Off Ver
    MS 365
    Posts
    213

    Question VBA: Website Scraper

    Hi all,

    Another day, another exciting VBA project to solve ;-)

    In essence, I'm trying to scrape the HTML source code from multiple (intranet) sites. I found the below VBA script that helps with two things:

    a) it can scrape the HTML source code from one target intranet site
    b) it can extract all the links on that target site

    What I would like to get to is:

    # ie.Navigate (see script below) to point to one specific URL that sits in a dropdown in Lookups!$C$3
    # If the dropdown states "All" (and not a specific URL) I would want it to run through all URLs in a pre-defined list in Lookups!$H$6:$H$100 one by one
    # Results should be written to:
    ## Lookups!$B$299 downwards: Link name (e.g. in a href="https://www.w3schools.com">Visit W3Schools</a this would be "Visit W3Schools")
    ## Lookups!$C$299 downwards: URL (e.g. in a href="https://www.w3schools.com">Visit W3Schools</a this would be "https://www.w3schools.com")
    ## Lookups!$D$299 downwards: the HTML source code of that page (e.g. source code of "https://www.w3schools.com")
    ## Lookups!$E$299 downwards: the URL that I am scraping (so what ie.Navigate points to; either the URL in Lookups!$C$3 or one of the URLs in list Lookups!$H$6:$H$100)

    This is the source code I am trying to adapt - sadly not getting any further with this :-( Only managed to scrape the HTML source code and the URLs on the site with this...

    Please Login or Register  to view this content.
    Would love to share an example file or the actual URL, but it's internal (URL also isn't accessible outside my network, but above script worked so far).

    Hope anyone can help or point me in the right direction?
    Thank you!

  2. #2
    Forum Contributor
    Join Date
    08-25-2015
    Location
    London
    MS-Off Ver
    MS 365
    Posts
    213

    Question Re: VBA: Website Scraper

    Hi all,

    Invested a few hours (with my limited VBA knowledge) to tweak the code myself and was able to get some bits to work...

    Please Login or Register  to view this content.
    1. I had to use InternetExplorerMedium as I kept on getting an error (trying to scrape from the intranet)
    2. I added IE.navigate Sheets("Lookups").Range("C3").Value which now pulls the URL from a cell
    3. For some weird reason it kept on overwriting content depending on which sheet I was on. Added Application.Goto Sheets("Sheet1").Range("A1"), True to make sure it always lands on the sheet I want the scraped data to go to.

    I still haven't managed to get the following to work:
    1. If the dropdown in Lookups!$c$3 states "All" (and not a specific URL) I would want it to scrape from all URLs in a pre-defined list in Lookups!$H$6:$H$100 one by one
    > Tried a few basic "if then" rules, but didn't know how to scrape a range of URLs (one after the other), rather than just one
    2. Results should be written to:
    2a) Lookups!$B$299 downwards: Link name (e.g. in a href="https://www.w3schools.com">Visit W3Schools</a this would be "Visit W3Schools")
    2b) Lookups!$C$299 downwards: URL (e.g. in a href="https://www.w3schools.com">Visit W3Schools</a this would be "https://www.w3schools.com")
    2c) Lookups!$D$299 downwards: the HTML source code of that page (e.g. source code of "https://www.w3schools.com")
    2d) Lookups!$E$299 downwards: the URL that I am scraping (so what ie.Navigate points to; either the URL in Lookups!$C$3 or one of the URLs in list Lookups!$H$6:$H$100)
    > I assume this could be related to Set ElementCol = html.getElementsByTagName("a")? So instead of the URL (a) pull e.g. the link name? Not sure how I would define that attribute though...

    That definitely exceeds my knowledge :-( Hope anyone has an idea and it's not a lot to tweak?

    Appreciate any help!
    Thanks!

  3. #3
    Forum Contributor
    Join Date
    08-25-2015
    Location
    London
    MS-Off Ver
    MS 365
    Posts
    213

    Re: VBA: Website Scraper

    Hi all,

    Invested some time in this over the last few weeks and made some progress (with my very basic VBA knowledge). Not everything is working yet, but hoping for help...any suggestion is welcome!
    Here's my progress:

    Please Login or Register  to view this content.
    As you can see, this is a combination of two scripts now:
    1) Sub WebsiteScraper() - provides link names and links from one specific website. Still outstanding is scraping the same information from a range of URLs (this is what's meant to go after the if statement)
    2) Private Sub HTMLScraper() - is supposed to then extract the source code from the list of URLs I got through script 1. This is a code I found that works for 1 URL (code for reference below), however, couldn't manage to get it to work for a range of URLs yet...so that bit isn't working at the moment. I also noticed that Excel can only hold somewhat 32k characters in one cell so doesn't import the full source code, sadly. Is there any code I can add that would automatically splits the content across several cells (ideally horizontally / across columns in same row) if it exceeds 32k characters?

    Code to scrape HTML source code from 1 website:
    Please Login or Register  to view this content.
    Hope anyone can help...losing my mind over this haha.

    Thanks so much!

  4. #4
    Forum Expert
    Join Date
    11-24-2013
    Location
    Paris, France
    MS-Off Ver
    Excel 2003 / 2010
    Posts
    9,831

    Arrow Re: VBA: Website Scraper


    Hi,

    you can export the html code to a text file rather than using any worksheet cell …

  5. #5
    Forum Contributor
    Join Date
    08-25-2015
    Location
    London
    MS-Off Ver
    MS 365
    Posts
    213

    Re: VBA: Website Scraper

    Hey Marc,

    Thanks for the suggestion, but would need it in Excel as I need to do a search (on a large scale) across the scraped content. Already got the formulae ready for this, just basically need the full source code from multiple pages...

    It's a tricky one!

  6. #6
    Forum Expert
    Join Date
    11-24-2013
    Location
    Paris, France
    MS-Off Ver
    Excel 2003 / 2010
    Posts
    9,831

    Arrow Re: VBA: Website Scraper


    Ok then do not ever store the html code within any cell but just directly play with it under VBA with its basics text functions

    like in this thread and the other many samples in fact whatever the Excel forum …
    Last edited by Marc L; 02-22-2022 at 04:03 PM.

  7. #7
    Forum Contributor
    Join Date
    08-25-2015
    Location
    London
    MS-Off Ver
    MS 365
    Posts
    213

    Re: VBA: Website Scraper

    Thanks, Marc! What I'm trying to do here is actually super complex...the above is really just one part of the full project. I basically need to scrape hundreds of websites, extract the HTML source code and then search for hundreds of content strings within that. The tool is then meant to show me which source code had what content and where (so would need to visually see the source code). All of this is already working and I was able to solve via formulae, it's just the scraping part I'm stuck with as VBA would be required :-(

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. Web scraper - Parameter value
    By Jollyfrog in forum Excel Programming / VBA / Macros
    Replies: 1
    Last Post: 02-28-2022, 05:01 PM
  2. web scraper won't run
    By bd3161 in forum Excel Programming / VBA / Macros
    Replies: 0
    Last Post: 12-17-2017, 06:49 AM
  3. Web Scraper Function Vessels data
    By gerarddevries in forum Excel Programming / VBA / Macros
    Replies: 8
    Last Post: 07-31-2015, 06:35 AM
  4. Data Scraper
    By MotionEyes in forum Excel General
    Replies: 3
    Last Post: 05-24-2015, 02:41 PM
  5. Scraper working with IE9 but not IE10 (Mid VBA?)
    By mattmc419 in forum Excel Programming / VBA / Macros
    Replies: 2
    Last Post: 10-16-2013, 10:54 PM
  6. [SOLVED] How to collect data with web scraper ?
    By Carly Fiorina in forum Excel General
    Replies: 7
    Last Post: 03-13-2013, 05:04 AM
  7. Help modifying someone else's VBA screen scraper
    By DublinMeUp in forum Excel Programming / VBA / Macros
    Replies: 1
    Last Post: 08-03-2012, 06:08 AM

Tags for this Thread

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1