+ Reply to Thread
Results 1 to 23 of 23

Looking for Assistance with Parsing HTML for Strings

  1. #1
    Registered User
    Join Date
    08-22-2014
    Location
    Australia
    MS-Off Ver
    2013
    Posts
    81

    Looking for Assistance with Parsing HTML for Strings

    Hi everyone

    Im looking for someone that would be willing to collaborate and help me build a Sub Function, that would search some HTML of a website and return very particular results from the page. Specifically I need to search a company website, and "collect" the specs for computers that are found on that website.

    The results then need to be placed into an excel spreadsheet.

    I have no previous experience in Parsing HTML text and getting a string/result from it, thats why im asking for such help.

    If you think you up for the challenge, or maybe this kind of thing is easy, then please message back with your interest.

    example website: http://www.harveynorman.com.au/homep...ptop-blue.html

    would like to gather the specs that appear on the specs tab of the page, and if possible the Price of the item that appears there.

    Would preferably like it to be a module function that i could use for various other projects.

    Any help would be greatly appreciated, and i would go into more detail with someone who would be willing to help / give up the time to assist and teach.

    Cheers.

    Andrew.

  2. #2
    Forum Guru Kyle123's Avatar
    Join Date
    03-10-2010
    Location
    Leeds
    MS-Off Ver
    365 Win 11
    Posts
    7,238

    Re: Looking for Assistance with Parsing HTML for Strings

    Just to be clear, are you wanting to parse information fust from http://www.harveynorman.com.au/ or different websites also?

  3. #3
    Registered User
    Join Date
    08-22-2014
    Location
    Australia
    MS-Off Ver
    2013
    Posts
    81

    Re: Looking for Assistance with Parsing HTML for Strings

    Just Harvey Norman, that I assume would make it a lot easier.

  4. #4
    Forum Guru Kyle123's Avatar
    Join Date
    03-10-2010
    Location
    Leeds
    MS-Off Ver
    365 Win 11
    Posts
    7,238

    Re: Looking for Assistance with Parsing HTML for Strings

    It would be impossible otherwise

    How's this as a starter for 10:
    Please Login or Register  to view this content.

  5. #5
    Registered User
    Join Date
    08-22-2014
    Location
    Australia
    MS-Off Ver
    2013
    Posts
    81

    Re: Looking for Assistance with Parsing HTML for Strings

    Hi Kyle 123

    i get a error at

    With oRequest
    .Open "GET", URL, False
    .send <<THE ERROR HAPPENS ON THIS LINE
    ohtml.body.innerhtml = .responsetext
    End With

    The error message states that the operation has timed out.

    Im not sure it makes any difference but im working with Office 2007.

    Also i know i gave you the link to a specific web address, ideally what we would like to happen is, a string variable is loaded with a "Product Code" for a computer from the website.

    The address that then searches the Harvey Norman website http://www.harveynorman.com.au/catalogsearch/result/?q= [INSERT PRODUCT CODE STRING HERE]

    for example: if the string variable was loaded with: "F550DP-XX008H"

    and the address was: http://www.harveynorman.com.au/catal...=F550DP-XX008H

    Then the landing page result would be: http://www.harveynorman.com.au/compu...8h-laptop.html

    Can we preform the Search/Query of the website first, to see if that item is on the website, and if it is, then parse the page for the specs for that computer.

    And if that product code / computer is not found on the website, a msgbox would state that its not available.

    I know this is probably challenging but it would be of a lot of help.


    Cheers

    Andrew

  6. #6
    Forum Guru Kyle123's Avatar
    Join Date
    03-10-2010
    Location
    Leeds
    MS-Off Ver
    365 Win 11
    Posts
    7,238

    Re: Looking for Assistance with Parsing HTML for Strings

    It may time out because it doesn't exist anymore? I can't access it either now - have you tried with other urls?

  7. #7
    Registered User
    Join Date
    08-22-2014
    Location
    Australia
    MS-Off Ver
    2013
    Posts
    81

    Re: Looking for Assistance with Parsing HTML for Strings

    i just tried this one

    http://www.harveynorman.com.au/compu...8h-laptop.html

    and that didn't seem to work, even tho i can navigate to it with a browser.

  8. #8
    Registered User
    Join Date
    08-22-2014
    Location
    Australia
    MS-Off Ver
    2013
    Posts
    81

    Re: Looking for Assistance with Parsing HTML for Strings

    Is there any specific References that i would need to add into the project?

  9. #9
    Valued Forum Contributor
    Join Date
    08-29-2012
    Location
    In lockdown
    MS-Off Ver
    Excel 2010 (2003 to 2016 but 2010 for choice)
    Posts
    1,766

    Re: Looking for Assistance with Parsing HTML for Strings

    Quote Originally Posted by Andrew Andromeda View Post
    ...If you think you up for the challenge, or maybe this kind of thing is easy, then please message back with your interest...Any help would be greatly appreciated, and i would go into more detail with someone who would be willing to help / give up the time to assist and teach...
    Yes I could help out provided it isn't urgent (there may be 2-3 days delay between replies).

    Also be warned that I usually use an inefficient method to return web data. I automate the IE browser - it's slower so this would be a drawback if you were intending to process hundreds of pages but it is the easiest to learn.

    See attached demo.
    Attached Files Attached Files
    *******************************************************

    HELP WANTED! (Links to Forum threads)
    Trying to create reusable code for Custom Events at Workbook (not Application) level

    *******************************************************

  10. #10
    Registered User
    Join Date
    08-22-2014
    Location
    Australia
    MS-Off Ver
    2013
    Posts
    81

    Re: Looking for Assistance with Parsing HTML for Strings

    Quote Originally Posted by mc84excel View Post
    Yes I could help out provided it isn't urgent (there may be 2-3 days delay between replies).

    Also be warned that I usually use an inefficient method to return web data. I automate the IE browser - it's slower so this would be a drawback if you were intending to process hundreds of pages but it is the easiest to learn.

    See attached demo.
    so far your method works, and its rather fast.

    any way you can work on what i mentioned in Post #5 - that would make you truly a champ!

    cheers.

  11. #11
    Registered User
    Join Date
    08-22-2014
    Location
    Australia
    MS-Off Ver
    2013
    Posts
    81

    Re: Looking for Assistance with Parsing HTML for Strings

    I modified the URL to be the one that searches the website first and then gets a result. And it worked perfectly actually.

    I would just like to be able to pull the Price from the webpage (If you could comment where this occurs) that would be handy.

    And i noticed that it would display a message if the element we were looking for could not be found.

    Would it be possible to return focus to excel, as the IE page is remaining on top. and you don't notice that the msgbox has been displayed, or when a result is found, you don't notice that it has finished as IE remains in focus.

    When i get a chance i will use the debug mode to see if i can insert code to return focus to excel / my userform. i assume i will just need to use a Application.activate request to do this. but if you know of a easy method this would be great.

    Cheers

    Andrew

  12. #12
    Valued Forum Contributor
    Join Date
    08-29-2012
    Location
    In lockdown
    MS-Off Ver
    Excel 2010 (2003 to 2016 but 2010 for choice)
    Posts
    1,766

    Re: Looking for Assistance with Parsing HTML for Strings

    Quote Originally Posted by Andrew Andromeda View Post
    so far your method works, and its rather fast.

    any way you can work on what i mentioned in Post #5 - that would make you truly a champ!

    cheers.
    Er which? 1. Debugging the winHTTP method? or 2. running a batch of searches using the websites search function?

    If 1. - I highly recommend Kyle123

    If 2. - Probably possible. I've written code to do batch searches on various websites before. However it could get messy if the search results don't keep to a consistent standard (e.g. the page doesn't return the same table fields for each laptop). I would need you to send me a list of model names to test (As many as you like. I'd need at least 5 search values for testing)

  13. #13
    Valued Forum Contributor
    Join Date
    08-29-2012
    Location
    In lockdown
    MS-Off Ver
    Excel 2010 (2003 to 2016 but 2010 for choice)
    Posts
    1,766

    Cool Re: Looking for Assistance with Parsing HTML for Strings

    Quote Originally Posted by Andrew Andromeda View Post
    I modified the URL to be the one that searches the website first and then gets a result. And it worked perfectly actually.

    I would just like to be able to pull the Price from the webpage (If you could comment where this occurs) that would be handy.

    And i noticed that it would display a message if the element we were looking for could not be found.

    Would it be possible to return focus to excel, as the IE page is remaining on top. and you don't notice that the msgbox has been displayed, or when a result is found, you don't notice that it has finished as IE remains in focus.

    When i get a chance i will use the debug mode to see if i can insert code to return focus to excel / my userform. i assume i will just need to use a Application.activate request to do this. but if you know of a easy method this would be great.

    Cheers

    Andrew
    Hi Andrew, thats a lot of requests at once!
    1. Price = See attached. Note I only tested it on the page you asked for. It will fail if each page uses a different price ID
    2. Search not found =I would need a few search values for testing. I may not be able to get started on this until mid next week though.
    3. Focus to Excel = You could quit IE. See attached. Or you could use AppActivate but it sometimes errors for me. A better solution would be to set focus through an API on the Excel window.

  14. #14
    Registered User
    Join Date
    08-22-2014
    Location
    Australia
    MS-Off Ver
    2013
    Posts
    81

    Re: Looking for Assistance with Parsing HTML for Strings

    All good buddy. Iv been working on your code all day.

    Iv modified it to do exactly what i want. managed to work out how it was working

    Is it possible to HIDE Internet Explorer while its being parsed?

    I noticed in one of the modules there is:
    With rIE
    .visible = Not blnInvisible

    would this allow me to hide it if it was changed? and if so what would make it hidden/invisible?

    Cheers

    Andrew.

  15. #15
    Registered User
    Join Date
    08-22-2014
    Location
    Australia
    MS-Off Ver
    2013
    Posts
    81

    Re: Looking for Assistance with Parsing HTML for Strings

    Im going to see if i can implement getting the price from the website today.

    will let you know how i go.

  16. #16
    Valued Forum Contributor
    Join Date
    08-29-2012
    Location
    In lockdown
    MS-Off Ver
    Excel 2010 (2003 to 2016 but 2010 for choice)
    Posts
    1,766

    Re: Looking for Assistance with Parsing HTML for Strings

    Quote Originally Posted by Andrew Andromeda View Post
    All good buddy. Iv been working on your code all day.

    Iv modified it to do exactly what i want. managed to work out how it was working
    Heh. I told you that automating IE is the easiest method to learn when trying to scrape data from the net.


    Quote Originally Posted by Andrew Andromeda View Post
    Is it possible to HIDE Internet Explorer while its being parsed?
    You can hide IE while automating it but if you were going to go down that route, you may as well go the whole way IMO - i.e. avoid using the IE automation method and switch to using the xmlHTTP method

    Generally whenever I use IE automation method I prefer to leave it visible. I haven't found it to save much in speed by hiding it and leaving it visible has the advantage that if something goes wrong (and it will eventually - webpages don't remain the same forever), you can see where it has failed.
    (Also visible has the somewhat dubious benefit of impressing non-VBA users. "Look! No hands!!" )


    Quote Originally Posted by Andrew Andromeda View Post
    would this allow me to hide it if it was changed? and if so what would make it hidden/invisible?
    Did you try passing the boolean as True?

  17. #17
    Forum Expert
    Join Date
    03-28-2012
    Location
    TBA
    MS-Off Ver
    Office 365
    Posts
    12,454

    Re: Looking for Assistance with Parsing HTML for Strings

    Change this line

    Please Login or Register  to view this content.
    INTO

    Please Login or Register  to view this content.

  18. #18
    Registered User
    Join Date
    08-22-2014
    Location
    Australia
    MS-Off Ver
    2013
    Posts
    81

    Re: Looking for Assistance with Parsing HTML for Strings

    Hi mc84Excel

    i just wanted to give you a big THANK YOU, today i have been working again on your code for getting the ONLINE price from the website and i have managed to get it to work.

    this is what i went with in the end.

    Please Login or Register  to view this content.
    i know my solution is probably a little crude but it does seem to work.

    many thanks for your code / help

    regards

    Andrew

  19. #19
    Registered User
    Join Date
    08-22-2014
    Location
    Australia
    MS-Off Ver
    2013
    Posts
    81

    Re: Looking for Assistance with Parsing HTML for Strings

    Quote Originally Posted by mc84excel View Post
    Heh. I told you that automating IE is the easiest method to learn when trying to scrape data from the net.



    You can hide IE while automating it but if you were going to go down that route, you may as well go the whole way IMO - i.e. avoid using the IE automation method and switch to using the xmlHTTP method

    Generally whenever I use IE automation method I prefer to leave it visible. I haven't found it to save much in speed by hiding it and leaving it visible has the advantage that if something goes wrong (and it will eventually - webpages don't remain the same forever), you can see where it has failed.
    (Also visible has the somewhat dubious benefit of impressing non-VBA users. "Look! No hands!!" )



    Did you try passing the boolean as True?

    I did managed to work out hiding IE, that was also easy.

    I need it hidden, because it would open IE every time you selected an item on a list.

    The program works by having a list of product codes, you select one to see if its on sale. i just wanted to pull the online price, with that "look no hands" effect, and also to keep the computer screen less cluttered. i set the mouse to fmMousePointerAppStarting, so that the user knows something is happen in the background.

    I continually surprise myself at how easy visual basic is to learn and interpret. but then thats probably why its called " Basic" :P



    i'm assuing that if i went down the xmlHTTP path that would be a lot harder to get the result i want / learn?

    regards

  20. #20
    Valued Forum Contributor
    Join Date
    08-29-2012
    Location
    In lockdown
    MS-Off Ver
    Excel 2010 (2003 to 2016 but 2010 for choice)
    Posts
    1,766

    Re: Looking for Assistance with Parsing HTML for Strings

    Quote Originally Posted by Andrew Andromeda View Post
    i have been working again on your code
    No worries.

    A few comments/explanations:
    1. As you probably noticed from my commenting out, I nearly ran with "product-view-price"
    2. Why are you worried about the string being 168? Could you try setting the string variable to say 40 len (
      Please Login or Register  to view this content.
      )
    3. I can explain the Len 2. The minimum value I am expecting is a currency symbol and at least one numeral. If Len was 1 or 0 then something had gone wrong. This would prevent an error when it came to the line to remove the currency symbol
      Please Login or Register  to view this content.
      (BTW I shoud have done Len > 1 not Len > 2 on the offchance that Harvey Norman sells something for less than $10! )
    4. Are you certain that the price will always be 25 characters from the Left? Could you use Instr to locate the first $ symbol in the string instead?
    Last edited by mc84excel; 02-01-2015 at 10:22 PM.

  21. #21
    Valued Forum Contributor
    Join Date
    08-29-2012
    Location
    In lockdown
    MS-Off Ver
    Excel 2010 (2003 to 2016 but 2010 for choice)
    Posts
    1,766

    Re: Looking for Assistance with Parsing HTML for Strings

    Quote Originally Posted by Andrew Andromeda View Post
    i'm assuing that if i went down the xmlHTTP path that would be a lot harder to get the result i want / learn?
    I only know a little of using the xmlHTTP method so I'm not the person to ask. As I understand it - anything that you can do through IE automation can be done through xmlHTTP. It is more difficult to learn and write IMO. It does have the advantage that it doesn't use the browser at all.

    If you are doing a large batch of searches and are interested in obtaining the quickest possible completion speed, then xmlHTTP can be used with the swarm method. (Google multithreaded VBA Daniel Ferry)

  22. #22
    Valued Forum Contributor
    Join Date
    08-29-2012
    Location
    In lockdown
    MS-Off Ver
    Excel 2010 (2003 to 2016 but 2010 for choice)
    Posts
    1,766

    Re: Looking for Assistance with Parsing HTML for Strings

    Quote Originally Posted by Andrew Andromeda View Post
    many thanks for your code / help
    If you send me about 5-6 URLs, I will see if I can improve pulling the price from the "product-view-price" element.

  23. #23
    Valued Forum Contributor
    Join Date
    08-29-2012
    Location
    In lockdown
    MS-Off Ver
    Excel 2010 (2003 to 2016 but 2010 for choice)
    Posts
    1,766

    Cool Re: Looking for Assistance with Parsing HTML for Strings

    See attached.
    • Price now determined by Instr on element outer text (rather than relying on number of characters)
    • Loop demo on batch of URLs
    • Output Wbk now includes field headers

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. [SOLVED] Parsing HTML element into excel
    By chococ in forum Excel Programming / VBA / Macros
    Replies: 3
    Last Post: 02-11-2014, 05:20 PM
  2. Time Window Parsing Assistance
    By kingsolo in forum Excel General
    Replies: 6
    Last Post: 01-25-2011, 01:46 AM
  3. Parsing an html address ?
    By mariom in forum Excel Programming / VBA / Macros
    Replies: 9
    Last Post: 05-24-2010, 07:53 AM
  4. Excel & HTML Parsing
    By Porphyria in forum Excel General
    Replies: 4
    Last Post: 06-23-2006, 07:25 PM
  5. [SOLVED] HTML parsing with VBA
    By [email protected] in forum Excel Programming / VBA / Macros
    Replies: 3
    Last Post: 04-17-2006, 05:10 PM

Tags for this Thread

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1