Web scraping problem

  1. #1
    Forum Contributor colddeck84's Avatar
    Join Date
    06-18-2016
    Location
    bergen, norway
    MS-Off Ver
    2016
    Posts
    254

    Web scraping problem

    Hi guys, I have run into a little problem: one of the sites I usually scrape info from has just had a complete overhaul. They now use JavaScript to generate their content.
    So I had to rebuild my script. I'm almost finished, but I'm missing one element from the site and I cannot figure out how to scrape it. The HTML code is here:

    Please Login or Register  to view this content.
    As of now there are 243 elements like this. I need only this data-league-name node.
    I have tried something like this:

    Please Login or Register  to view this content.
    But this does not return everything from the above code, only this:
    Please Login or Register  to view this content.
    I'm not sure how I should phrase my code to get the league from this site :/ But I know it should be possible, as all of this is stored in the .DocumentElement.innerHTML

    Any help with this would be much appreciated.

    Frederik

  2. #2
    Forum Expert
    Join Date
    03-28-2012
    Location
    TBA
    MS-Off Ver
    Office 365
    Posts
    12,454

    Re: Web scraping problem

    getElementsByClassName returns a collection (array-like). If you know the index position, you can use the index number to get a specific element, or you could loop through the collection to get all elements.

    X = HTMLDoc.getElementsByClassName("match_line score_row other_match o_true")(0).innerText
    should normally return the text of the first element in the collection.
    Last edited by AB33; 10-07-2017 at 03:57 PM.

  3. #3
    Valued Forum Contributor Sean Thomas's Avatar
    Join Date
    03-25-2012
    Location
    HerneBay, Kent, UK
    MS-Off Ver
    Excel 2007,2016
    Posts
    971

    Re: Web scraping problem

    Is there more than one element with that class name? You have clearly extracted the data from a different element.
    You may need to collect all the elements with that class name and then run through them to check for id="1326133" or data-home-team="GUINEA".
    Regards
    Sean


  4. #4
    Forum Contributor colddeck84's Avatar

    Hi and good morning, thanks for your replies.
    "Are there more than one elements with that class name? as you clearly have extracted the data from a different element"
    - Good observation. I was too lazy to search through the HTML doc to find the match, but I just did it. Here is my code that extracts the data:

    Please Login or Register  to view this content.
    Here is what Set HTML_League = HTMLDoc.getElementsByClassName("match_line score_row other_match o_true") returns,

    exactly like this for row number 1 today:

    Please Login or Register  to view this content.
    Searching through the HTML doc I found it here:
    Please Login or Register  to view this content.
    I guess there are normally 200-300 elements from "match_line score_row other_match o_true". Hope this gives a better understanding of my problem. Is this not the same element that I have extracted?
    Last edited by colddeck84; 10-08-2017 at 03:51 AM.

  5. #5
    Forum Expert

    Have you tried using the element's id? An id is usually unique per page, so looking it up returns a single element.

  6. #6
    Forum Contributor colddeck84's Avatar

    No, I have not tried that. Do you mean like this:

    id="1426760"

    I can try this now, but how would I figure out the id without checking manually?

  7. #7
    Forum Contributor colddeck84's Avatar

    Here is how it looks in the explorer console:

    Please Login or Register  to view this content.

  8. #8
    Forum Expert

    X = HTMLDoc.GetElementByID("1426760").innertext

  9. #9
    Valued Forum Contributor Sean Thomas's Avatar

    As you have already got a collection of all the elements with the class name
    Set HTML_League = HTMLDoc.getElementsByClassName("match_line score_row other_match o_true")
    you then need to loop through these elements to find the one that has the id you want.

    You could do something like this:

    Please Login or Register  to view this content.
    Last edited by Sean Thomas; 10-08-2017 at 08:36 AM.

  10. #10
    Forum Contributor colddeck84's Avatar

    Quote Originally Posted by Sean Thomas View Post
    As you have already got a collection of all the elements with the classname
    Set HTML_League = HTMLDoc.getElementsByClassName("match_line score_row other_match o_true")
    Then you need to loop through these elements to find the one that has the id that you want

    I'll try your suggestion. But as I stated in my first post, when I try to search this element I get only the long string I attached in the first post. I'll try both of your suggestions.

    I found some info here, and I guess this guy has the same problem:

    https://stackoverflow.com/questions/...ement-with-vba

    But I still don't quite understand how to handle this.

    Edit: I've tried this now:

    Please Login or Register  to view this content.
    But this produces exactly the same result as:
    Please Login or Register  to view this content.
    Last edited by colddeck84; 10-08-2017 at 09:12 AM.

  11. #11
    Forum Contributor colddeck84's Avatar

    Quote Originally Posted by Sean Thomas View Post
    As you have already got a collection of all the elements with the classname
    Set HTML_League = HTMLDoc.getElementsByClassName("match_line score_row other_match o_true")
    Then you need to loop through these elements to find the one that has the id that you want

    you could do something like this

    Please Login or Register  to view this content.
    The problem is that when I extract this element with HTMLDoc.getElementsByClassName("match_line score_row other_match o_true"),
    the data-league-name node is not found in it, hmmm...

  12. #12
    Forum Expert

    data-note data-country-name="WORLD (FIFA)"
    is not inner text; it is most likely an attribute.
    You could try innerHTML instead of innerText and see what it returns.

    X = HTMLDoc.getElementById("1426760").innerHTML
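
    A rough sketch of the attribute-versus-text distinction, written in Python rather than the thread's VBA (Python is the language recommended further down the thread). It uses only the standard library's html.parser. The sample markup is abridged from the div quoted in post #15; the inner span is a made-up stand-in for whatever the real row contains. In VBA the analogous step would presumably be a getAttribute call on the element, which was not tested here.

```python
from html.parser import HTMLParser

# Abridged from the div quoted in this thread; the <span> is an invented
# stand-in for the row's real child elements.
SAMPLE = (
    '<div id="1326133" class="match_line score_row live_match o_true" '
    'data-home-team="GUINEA" data-away-team="TUNISIA" '
    'data-league-name="WORLD CUP RUSSIA 2018" '
    'data-country-name="WORLD (FIFA)"><span>19:00</span></div>'
)


class AttrVsText(HTMLParser):
    """Records the attributes of one element (by id) and all text in the markup."""

    def __init__(self, element_id):
        super().__init__()
        self.element_id = element_id
        self.attributes = {}
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if attr_map.get("id") == self.element_id:
            self.attributes = attr_map

    def handle_data(self, data):
        # Text between tags: roughly what innerText would show.
        self.text_parts.append(data)


parser = AttrVsText("1326133")
parser.feed(SAMPLE)
inner_text = "".join(parser.text_parts)
league = parser.attributes.get("data-league-name")

print(inner_text)  # 19:00
print(league)      # WORLD CUP RUSSIA 2018
```

    The league name never shows up in the text, only in the attribute map, which matches what the original poster saw when reading innerText.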

  13. #13
    Forum Contributor colddeck84's Avatar

    Quote Originally Posted by AB33 View Post
    data-note data-country-name="WORLD (FIFA)"
    Is not an inner text, but rather most likely is an attribute.
    You could try innerHTML not innertext and see what it returns.

    X = HTMLDoc.getElementById("1426760").innerHTML
    Yes, that is what I also figured; however, with my skillset it's not so easy. I'll try. I still need to figure out how to get all of these attributes without searching by id, as I have no idea how to get the ID number.

  14. #14
    Forum Contributor colddeck84's Avatar

    Edit: I think we are getting closer to the solution; however, it did not return the desired results. Here is what it returned:
    Please Login or Register  to view this content.
    - three lines were removed, as the site would not allow me to post them (reference to JavaScript)
    Last edited by colddeck84; 10-08-2017 at 10:06 AM.

  15. #15
    Forum Expert

    <div id="1326133" class="match_line score_row live_match o_true" data-statustype="live" data-ko="19:00" data-home-team="GUINEA" data-away-team="TUNISIA" data-league-sort="52" data-correction="-7" data-matchday="2017-10-07" data-game-status="2 HF" data-league-code="39384" data-league-name="WORLD CUP RUSSIA 2018" data-note data-country-name="WORLD (FIFA)" data-league-round="MATCHDAY 5" data-league-short="WCQ" data-home-id="10121" data-away-id="20360">…</div>

    class="match_line score_row live_match o_true"
    is the class attribute, with the value "match_line score_row live_match o_true".
    Since a class can be applied to many elements, looking it up returns a collection, while an id such as id="1326133" is unique to each element.
    Since you are using IE, you can also read the attribute value.

  16. #16
    Forum Contributor colddeck84's Avatar

    Quote Originally Posted by AB33 View Post
    <div id="1326133" class="match_line score_row live_match o_true" data-statustype="live" data-ko="19:00" data-home-team="GUINEA" data-away-team="TUNISIA" data-league-sort="52" data-correction="-7" data-matchday="2017-10-07" data-game-status="2 HF" data-league-code="39384" data-league-name="WORLD CUP RUSSIA 2018" data-note data-country-name="WORLD (FIFA)" data-league-round="MATCHDAY 5" data-league-short="WCQ" data-home-id="10121" data-away-id="20360">…</div>

    class="match_line score_row live_match o_true"
    is the class CSS with "match_line score_row live_match o_true" attribute.
    Since class can be applied to many elements, it is considered as a collection, while id id="1326133" is unique to each element.
    Since you are using IE, you can also find the attribute value.
    I'm not sure I understand what you mean..?

  17. #17
    Forum Expert

    I mean you cannot create your own ID or class. I thought you mentioned you do not know how to get an ID.
    In the above div element, you have both an ID and a class you can use to access the div element.

    Is it a secure website? If not, could you post the URL?

  18. #18
    Forum Contributor colddeck84's Avatar

    Thanks for your time helping me with this.

    "I thought you mentioned you do not know how to get an ID."
    - What I meant is that I need to extract all 300+ rows of league data; doing it by ID number seems cumbersome.
    It's not a secure website. The URL is
    http://www.xscores.com/soccer

    Edit: and here is my code to pull it:

    Please Login or Register  to view this content.
    Last edited by colddeck84; 10-08-2017 at 10:33 AM.

  19. #19
    Forum Expert

    Ok, let me have a go. Which data do you want to extract?

  20. #20
    Forum Contributor colddeck84's Avatar

    Thanks so much. It's really only the league name I need (not the short name), as I have a handle on all the others.

    "Data-league-name"

  21. #21
    Forum Expert

    Let's say

    <div id="1484267" class="match_line score_row live_match o_true" data-statustype="live" data-ko="14:00" data-home-team="YEOVIL (W)" data-away-team="SUNDERLAND (W)" data-league-sort="901" data-correction="0" data-matchday="2017-10-08" data-game-status="2 HF" data-league-code="42784" data-league-name="WOMENS SUPER LEAGUE" data-note="Venue: Huish Park. Turf: Natural. Capacity: 9,565." data-country-name="ENGLAND" data-league-round="3" data-league-short="WSL" data-home-id="50043" data-away-id="50027">

    Is it right you want to extract

    data-league-name="WOMENS SUPER LEAGUE"

    "WOMENS SUPER LEAGUE"
    from
    data-league-name?

  22. #22
    Forum Contributor colddeck84's Avatar

    That is correct, but I would need a way to extract all the leagues.

    So far I have all countries, Team A, Team B and starting times; only the league is missing. I was able to extract the short league name, but that does not work with my overall setup.

  23. #23
    Forum Expert

    You are now confusing me!
    There are 17 "match_line score_row live_match o_true" elements,
    and I assume all the names are found within these divs.
    Which one have you already extracted?

  24. #24
    Forum Contributor colddeck84's Avatar

    When I hit refresh now I received 326 rows of data.

    In col B I have: HTML_Time = HTMLDoc.getElementsByClassName("score_ko score_cell")
    Col C: HTML_Country = HTMLDoc.getElementsByClassName("tooltip_flag")
    Col D: Should be league, this is what I need help with
    Col E: HTML_TeamA = HTMLDoc.getElementsByClassName("score_home_txt score_cell wrap")
    Col F: HTML_TeamB = HTMLDoc.getElementsByClassName("score_away_txt score_cell wrap")

    Basically I want to extract it so that it looks quite similar to their site.

  25. #25
    Forum Expert

    No, I do not think I understand what you are trying to do.
    For example, if I extract all the div elements with

    Set HTML_TeamA = HTMLDoc.getElementsByClassName("match_line score_row live_match o_true")

    I get, for one of them:

    "
    16:00
    FIN
    SHOW GAMES FROM ITALY
    B
    PERUGIA
    4
    1
    1
    PRO VERCELLI
    17
    2
    0
    1
    1
    5
    "

    SHOW GAMES FROM ITALY is found in it.
    Is SHOW GAMES FROM ITALY not what you are after?

  26. #26
    Forum Contributor colddeck84's Avatar

    I guess they differentiate between live matches and upcoming matches. When you said there are 17 elements of "match_line score_row live_match o_true", I guess that is all the games that are currently live. There should be an element like "match_line score_row live_match o_false" or something.

    See my screenshot so it's clear.

    Edit:
    Yes, I was correct. Elements of upcoming matches are called:
    "match_line score_row other_match e_true"

    And for the sample you provided, I am after only the league name, which is "DRUGA CFL"
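
    Because live and upcoming rows carry different full class strings, one way around it is to match on the class tokens they share instead of the whole string. A small Python sketch of that idea follows, assuming "match_line" and "score_row" appear on every row type (the thread only shows a few variants). The sample markup is made up from class strings and league names quoted in this thread.

```python
from html.parser import HTMLParser

# Made-up sample: one live row, one upcoming row, and one unrelated cell.
SAMPLE = """
<div class="match_line score_row live_match o_true" data-league-name="WORLD CUP RUSSIA 2018"></div>
<div class="match_line score_row other_match e_true" data-league-name="WOMENS SUPER LEAGUE"></div>
<div class="score_ko score_cell">19:00</div>
"""

# Class tokens assumed to be shared by every match row.
REQUIRED_CLASSES = {"match_line", "score_row"}


class LeagueCollector(HTMLParser):
    """Collects data-league-name from any element carrying all required class tokens."""

    def __init__(self):
        super().__init__()
        self.leagues = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        tokens = set((attr_map.get("class") or "").split())
        if REQUIRED_CLASSES <= tokens and "data-league-name" in attr_map:
            self.leagues.append(attr_map["data-league-name"])


collector = LeagueCollector()
collector.feed(SAMPLE)
leagues = collector.leagues
print(leagues)
```

    Both rows are picked up regardless of the live_match / other_match part, while the time cell is ignored.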
    Attached Images
    Last edited by colddeck84; 10-08-2017 at 11:24 AM.

  27. #27
    Forum Expert

    Er, I think I have come across this before. I think the data are generated by JavaScript. Although the data appear to be in the table, they are not. Your best bet is to get an API if the site owner has one. I suspect the owner is blocking web scraping. They let you use IE as it is not stable and reliable.
    Sorry, I cannot be more helpful.

  28. #28
    Forum Contributor colddeck84's Avatar

    Unfortunately I guarantee they will not provide an API. Yes, all of this is generated by JavaScript (which I kinda hate).
    But all of the league names are stored in the HTML doc, so there should be some way to extract them?

    Look in the HTML doc, all the data I need is there.

    Maybe there is a way to search the HTML doc for the league name. I have it saved as a .txt file.
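
    Searching the saved text directly can be sketched with a regular expression. This is Python, not the thread's VBA, and it assumes the attribute is always written with double quotes as in the markup quoted in this thread. The string below is a stand-in for the contents of the saved .txt file.

```python
import html
import re

# Stand-in for the saved page source; in practice this would be read from
# the .txt file, e.g. with pathlib.Path(...).read_text().
saved_html = (
    '<div id="1326133" class="match_line score_row live_match o_true" '
    'data-league-name="WORLD CUP RUSSIA 2018" data-league-short="WCQ"></div>\n'
    '<div id="1484267" class="match_line score_row live_match o_true" '
    'data-league-name="WOMENS SUPER LEAGUE" data-league-short="WSL"></div>\n'
)

# Capture everything between the quotes that follow data-league-name=
LEAGUE_PATTERN = re.compile(r'data-league-name="([^"]*)"')

# html.unescape turns entities such as &amp; back into plain characters.
leagues = [html.unescape(value) for value in LEAGUE_PATTERN.findall(saved_html)]
print(leagues)
```

    A plain text search like this is brittle if the site changes its quoting or attribute names, but it needs no browser and no DOM.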
    Last edited by colddeck84; 10-08-2017 at 11:35 AM.

  29. #29
    Valued Forum Contributor Sean Thomas's Avatar

    You can try this.
    Once you have found the class names, you can then search through the inner HTML to find the data, using the class names of that data source.

    Please Login or Register  to view this content.
    This will get just the innerHTML:
    Please Login or Register  to view this content.
    Last edited by Sean Thomas; 10-09-2017 at 03:15 AM. Reason: changed headers

  30. #30
    Forum Contributor colddeck84's Avatar

    Wow, thanks buddy. This looks interesting; I can see that you spent some time putting this together. Can you confirm this is working? I see that you use an HTTP request for this, but when I tried with an HTTP request the JavaScript part of the page did not get generated?

    I'm off to work now but will check this out when I'm back. This looks really promising :)

    Frederik

  31. #31
    Valued Forum Contributor Sean Thomas's Avatar

    Yes, it's all tried and tested.
    I have checked why it only returns a few rows of data; the main reason is that it is only looking for the live matches: "match_line score_row live_match e_true"
    Let me know if you need anything changed and I will see what I can do.

  32. #32
    Forum Contributor colddeck84's Avatar

    Thanks for your help with this. I'm a little interested in how you were able to get the generated JavaScript content without IE. Maybe this would work on other sites with JS also?

    I'm currently away from the system but I'll test as soon as I'm back home

    Frederik

  33. #33
    Forum Guru Kyle123's Avatar
    Join Date
    03-10-2010
    Location
    Leeds
    MS-Off Ver
    365 Win 11
    Posts
    7,239

    Re: Web scraping problem

    You never need to use Internet Explorer, since you never really need to execute JavaScript. You just read the data from the same source as the JavaScript does. In this case the data is in the HTML, in HTML attributes; Sean's code simply reads the values of those attributes.
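
    Reading the attributes straight from the HTML source can be sketched as below. This is Python using only the standard library, not Sean's VBA (which is not visible in this thread), and the markup is abridged from the two divs quoted earlier. The column order follows the poster's sheet layout (time, country, league, home, away); mapping data-ko to the start time is an assumption.

```python
from html.parser import HTMLParser

# Abridged from the two divs quoted earlier in this thread.
SAMPLE = (
    '<div id="1326133" class="match_line score_row live_match o_true" '
    'data-ko="19:00" data-home-team="GUINEA" data-away-team="TUNISIA" '
    'data-league-name="WORLD CUP RUSSIA 2018" data-note '
    'data-country-name="WORLD (FIFA)"></div>'
    '<div id="1484267" class="match_line score_row live_match o_true" '
    'data-ko="14:00" data-home-team="YEOVIL (W)" data-away-team="SUNDERLAND (W)" '
    'data-league-name="WOMENS SUPER LEAGUE" data-country-name="ENGLAND"></div>'
)

# Attribute names in the order of the target columns.
COLUMNS = ("data-ko", "data-country-name", "data-league-name",
           "data-home-team", "data-away-team")


class MatchRowParser(HTMLParser):
    """Builds one tuple per match div from its data-* attributes."""

    def __init__(self):
        super().__init__()
        self.rows = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        class_tokens = (attr_map.get("class") or "").split()
        if tag == "div" and "match_line" in class_tokens:
            # Missing attributes come back as None rather than raising.
            self.rows.append(tuple(attr_map.get(name) for name in COLUMNS))


row_parser = MatchRowParser()
row_parser.feed(SAMPLE)
rows = row_parser.rows
for row in rows:
    print(row)
```

    No JavaScript runs at any point; the parser only walks the markup that the server sent.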

  34. #34
    Valued Forum Contributor Sean Thomas's Avatar

    Quote Originally Posted by colddeck84 View Post
    Thanks for your help with this, I'm little intressted in how you were able to get the generated javascipt content without ie.. Maybe this would work on other sites with js also?

    I'm currently away from the system but I'll test as soon as I'm back home

    Frederik
    As Kyle suggests, this works differently in that it returns the HTML source code. Once you have that information you can then extract what you need from it.
    It gets a little bit more difficult as the data is laid out differently.
    In this case I found that the information you want is in the innerHTML of the elements with that class name, but to get that data I had to load it into a new HTML document and then get the data from that.
    You have many attributes on those elements, but due to the id term used I couldn't extract them directly, unlike tag.id.

  35. #35
    Forum Contributor colddeck84's Avatar

    OK, so I just came home from work. I've been looking forward to testing this all day.

    I've tested:

    GetWebData()

    which returned:
    score_away_txt score_cell wrap
    score_home_txt score_cell wrap
    score_ko score_cell
    score_league_flag score_cell
    score_league_txt score_cell
    scoreh_ht score_cell centerTXT
    scorea_ht score_cell centerTXT
    scoreh_ft score_cell centerTXT
    scorea_ft score_cell centerTXT

    Also tested

    GetinnerHTML()
    But that did not return anything ?

    Maybe I'm doing something wrong?

    Frederik

  36. #36
    Valued Forum Contributor Sean Thomas's Avatar

    I have just run the code and it currently returns 2 results.

    How have you used this code?
    Just open a new workbook, add a new module, paste the code into the module as it is, run the code.

    There are no references required so all should work ok

  37. #37
    Forum Contributor colddeck84's Avatar

    I'll try in a new workbook. I have actually found a solution to my problem at this point, by saving the HTML doc and just searching through the txt file for the league names. But I'm really interested in your way of tackling this problem, especially after hearing it's possible to get JavaScript content using your technique. I really thought I would need a Selenium wrapper for this? I have a few tricky sites I really want to scrape the content of, but those sites are more or less 100% JavaScript and Flash generated, and IE cannot handle these types of site. When I have tried downloading the HTML doc from these sites, it comes out almost empty...

  38. #38
    Forum Expert

    Yes, you are better off with third-party applications, particularly if you do lots of data crawling and scraping.
    Python has many more utilities and functions for getting data from dynamic sites. Selenium, for example, has official Python bindings.

  39. #39
    Forum Contributor colddeck84's Avatar

    I was thinking about taking all of my work out of excel , but that's a gigantic task...

  40. #40
    Forum Expert

    Yes, that is why many people are stuck with Excel, but VBA has limited functionality. Many third-party tools can export data to CSV and Excel formats. In some respects it works much better and faster to open a CSV file. You are also not constrained by Excel's limit of roughly 1 million rows.
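
    As an illustration of the CSV route, here is a minimal Python sketch using the standard csv module. The rows reuse match data quoted earlier in this thread; the header names are made up for the example. It writes to an in-memory buffer so it can be run anywhere, and the same writer calls would work on a file opened with newline="".

```python
import csv
import io

# Match data quoted earlier in this thread; header names are illustrative only.
HEADER = ("time", "country", "league", "home", "away")
rows = [
    ("19:00", "WORLD (FIFA)", "WORLD CUP RUSSIA 2018", "GUINEA", "TUNISIA"),
    ("14:00", "ENGLAND", "WOMENS SUPER LEAGUE", "YEOVIL (W)", "SUNDERLAND (W)"),
]


def rows_to_csv(data_rows):
    """Return the header plus data rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(HEADER)
    writer.writerows(data_rows)
    return buffer.getvalue()


csv_text = rows_to_csv(rows)
print(csv_text)
```

    Excel opens the resulting file directly, so the rest of a workbook-based setup can stay as it is while the scraping moves elsewhere.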

  41. #41
    Forum Contributor colddeck84's Avatar

    Yes, you are right. I have started doing some research into which language to choose should I move my project out of Excel, but I'm stuck at the selection, lol. It was recommended to me to try Visual Basic, as it is quite similar to Excel VBA, but I've also heard quite a lot about Python. I have zero skills with either of them, but I think I'm a fast learner and I spend quite some time on something when I'm interested in it. In your opinion, which language is the way to go, or does it even matter?

  42. #42
    Forum Guru Kyle123's Avatar

    Python will be less of a culture shock and you'll find more tutorials out there on web scraping

  43. #43
    Forum Expert

    Kyle is "the Jack of all trades" and he is the best person to give you the right advice, but let me throw in my two cents. Yes, VB syntax is similar to VBA. In fact, VBA is an offshoot of VB. VB6 has long been out of support. We now have VB.NET. You need a compiler to run VB.NET code, and you can use Visual Studio for that. If I were you, I would go for Python. There are many things going Python's way: it is popular for big data, data science and machine learning.
    Last edited by AB33; 10-09-2017 at 12:36 PM.

  44. #44
    Forum Contributor colddeck84's Avatar

    OK guys, thanks a lot for all the info. This thread has been very informative for me. I think I will look into Python.

    cheers

  45. #45
    Forum Guru Kyle123's Avatar

    Whilst VBA is an offshoot of VB6, VB.NET is a completely different animal. Despite the rather confusing naming, it's actually nothing like VB6; it's primarily object orientated for starters, which VB6 never was. Most of the classic VB developers I know migrated to VB.NET, hated it and moved over to C#. The issue is that it sort of smells like VB6 but is completely different under the skin, so much so that they got frustrated and found it easier to get their heads around a completely different syntax (C#). C# and VB.NET compile to the same intermediate language anyway.

    To learn VB.NET you need to learn OOP; for Python you don't. It's a scripting language that lets you write in the same procedural manner you'll be using in Excel, and it's easier to write scripts that focus on getting things done. It's also open source and widely used (much more so than a Microsoft stack, which is predominantly corporate focused), so there are loads of tutorials and 3rd party libraries to help you. Python is also widely used for web scraping, so there are plenty of resources out there.

    Personally, I don't like python, but it's a great getting started language that's easier to get started with and doesn't force you to learn architecture and OOP concepts first (which you need to really with something like VB, C#, Java etc). I'm with AB33 on this one, Python would be a good choice, but I don't think that Excel is particularly bad either.

  46. #46
    Forum Contributor colddeck84's Avatar

    OK then, after hearing all of this I will definitely go for Python. Can I build a complete Windows program with it? Where can I download a starter pack?

    And you are right, Excel is not bad at all, but it would be sweet to get all of my work out of it.

  47. #47
    Forum Expert

    Go to the Python website and download the installer from the official site. There are two versions of Python: Python 2 and Python 3. Both include an IDE called IDLE, which in my opinion is absolute crap. There are many free Python IDEs. I personally prefer Visual Studio; IMO it is the best IDE ever made (so far). There is a free Community edition of VS. Yes, you can build any type of application with Python, but I am not the person to advise, as I am not a Python chap.
    Last edited by AB33; 10-09-2017 at 01:14 PM.

  48. #48
    Valued Forum Contributor Sean Thomas's Avatar

    Glad to help.
    Yes, this method does work with JavaScript-generated pages, whereas IE doesn't always return any information.
    I have played about with web scraping to the point of knowing how to do it. I wouldn't say I'm an expert but I get by. Not that I do much of it.
    Personally I have learnt VBA, HTML and SQL, and this keeps me more than busy.
    I'm currently looking into SAS, and our company is also pushing Python, which I aim to get some training in. They are giving it to us for free!
