+ Reply to Thread
Results 1 to 7 of 7

Very weird problem with Data Scrapping VBA (works randomly)

  1. #1
    Forum Contributor
    Join Date
    08-28-2015
    Location
    Montreal, Canada
    MS-Off Ver
    2013 on PC, 2011 on MAC
    Posts
    159

    Very weird problem with Data Scrapping VBA (works randomly)

    Hey guys,

    I didn't know if I should post this in the General or VBA forum (mods feel free to move this topic if needed), as I'm not exactly sure what the problem is, but here it is. I'll try to keep it short.

    For the past 3 years or so, my partner and I have been using a VBA to collect market data from a website. We've had a few hickups over the years caused by some changes made to the html codes that the procedure goes through to collect said data, and recently had to slow down our procedure to avoid getting timed-out by the website, but nothing we couldn't handle. Until a few weeks ago...

    For some unknown reason, our different procedures (which are all similar, just refers to different filters of which titles to refresh) started to act very weird. We'd run it over and over again, with no result. Then, out of nowhere, one would work. And from that point on, every single procedure does as well. Until we close the workbook, and we have to start again. Like the procedure has to "warm-up" before it actually becomes succesful.

    Even stranger than that...my partner and I each have our own version of the exact same documents (identical files), each one saved on our own harddrive. We can both start running the same procedure, at the very same time, and one of us would be successful (typically the one that had the procedure "warmed-up") while the other one won't. Sometimes it's him, sometimes it's me. Completely random.

    Additional things to know: I run the home version of Excel (Version 1811, built 11029.20079), while my partner uses Office 365 (latest version I assume). Depending on who's succesful doing the refresh after the market closes, we're saving it on our harddrive, and then copy/paste it on our dropbox...so that the next morning, the other one would have to overwrite his version with the one on dropbox to start the day with refreshed data.

    So I guess my questions are:
    • Are there any known issues between Home Office and Office 365 that could cause such an issue ?
    • To the best of your knowledge, does it seem to be an Excel related problem, or maybe an issue with some data scrapping protection on the website we're using ?

    I'm kinda leaning towards an issue with our Excels, as there wouldn't be any reasons for the procedure to start working after a while (and for many hours straight) if this was caused by the website itself. Plus, it wouldn't make sense for the procedures to work on one computer, and not on the other if this was the case.

    But I can't figure it out.
    Any help is appreciated.

    I'm leaving a copy of one of the xlsm file we're using (we have different version, depending on what we're monitoring to avoid refreshing things we don't need) in case you want to take a look at the procedure itself. It can be found in Module Refresh_Index.

    Thanks.
    Attached Files Attached Files

  2. #2
    Forum Expert
    Join Date
    03-28-2012
    Location
    TBA
    MS-Off Ver
    Office 365
    Posts
    12,454

    Re: Very weird problem with Data Scrapping VBA (works randomly)

    I couple of things to be aware of. You are at the mercy of the host to collect data from any host site. They own the data and have the right to deny you access. There are some sites which do block web scrapping on purpose, while others will require you sign to thier T & C. Even if the host gives you an API, it has usually stringe attached to it: moderate use and daily restrcitions.

    Your code relies heavily on the format of HTML and as you know HTML can change at any time at the when of the owner.
    I suggest you should look for commercial scrapper if you want reliability and durability.

  3. #3
    Forum Contributor
    Join Date
    08-28-2015
    Location
    Montreal, Canada
    MS-Off Ver
    2013 on PC, 2011 on MAC
    Posts
    159

    Re: Very weird problem with Data Scrapping VBA (works randomly)

    Quote Originally Posted by AB33 View Post
    I couple of things to be aware of. You are at the mercy of the host to collect data from any host site. They own the data and have the right to deny you access. There are some sites which do block web scrapping on purpose, while others will require you sign to thier T & C. Even if the host gives you an API, it has usually stringe attached to it: moderate use and daily restrcitions.

    Your code relies heavily on the format of HTML and as you know HTML can change at any time at the when of the owner.
    I suggest you should look for commercial scrapper if you want reliability and durability.
    Yeah, we are aware of that issue. Changes to the html code happened in the past, and we had to adjust. We also understand that the host site have every right to do allow or deny us access to the data. Which is why we already slowed down our procedure (by activating the auto calculation function midway through on most of our procedure) to avoid high density connection. Plus, we also tweaked all of our refresh procedure to avoid refreshing data we weren't monitoring on that day, and target only the ones we actually needed, to avoid being too heavy on the site. So we are in fact very dependent of all that.

    But that doesn't seem to be the issue here (unless I'm missing something). As I said, both my partner and I can run the same the procedure, at the very same time, and only one of us would be succesful. We're on two separate computers, two separate xlsm files, two separate internet connections...two different results. Sometimes it's him, sometimes it's me. And once a procedure works, say on my version...then they all work. Even the ones that failed before, now work like a charm. For hours straight. Until we close Excel.

    It really feels like I'm starting a lawn mower...I'm pulling the cord, and pulling and pulling. Nothing happens. Then, once the engine starts, I'm okay for the entire day.

  4. #4
    Forum Guru Norie's Avatar
    Join Date
    02-02-2005
    Location
    Stirling, Scotland
    MS-Off Ver
    Microsoft Office 365
    Posts
    19,643

    Re: Very weird problem with Data Scrapping VBA (works randomly)

    Have you checked what is actually being returned by your requests?
    If posting code please use code tags, see here.

  5. #5
    Forum Contributor
    Join Date
    08-28-2015
    Location
    Montreal, Canada
    MS-Off Ver
    2013 on PC, 2011 on MAC
    Posts
    159

    Re: Very weird problem with Data Scrapping VBA (works randomly)

    Quote Originally Posted by Norie View Post
    Have you checked what is actually being returned by your requests?
    Can't say I have. Unfortunately, the procedure itself has been written by someone else, who I'm not in contact with anymore...and that kind of programming is way above my skills. I recognize a few things here and there, so I'm able to tweak it to do different things I need (mostly applying different filters of things to be refreshed), but that's about it. All I know is that it usually paste the collected data in the INVESTING table, columns J:W...and now, everything is blank until one of the procedure starts working, and then we're good to go on every single documents we have.

  6. #6
    Forum Expert
    Join Date
    03-28-2012
    Location
    TBA
    MS-Off Ver
    Office 365
    Posts
    12,454

    Re: Very weird problem with Data Scrapping VBA (works randomly)

    You said

    "We're on two separate computers, two separate xlsm files, two separate internet connections...two different results"

    This probably explains why the site is behaving erratically. It has to do with your IPS. I have similar experience and I could not explain the reason, but highly likley to be that the site somehow recoginse your IPS since you are hammering thier site with traffic. To prove it, use a different IPS and see if this makes any diffreence.

  7. #7
    Forum Contributor
    Join Date
    08-28-2015
    Location
    Montreal, Canada
    MS-Off Ver
    2013 on PC, 2011 on MAC
    Posts
    159

    Re: Very weird problem with Data Scrapping VBA (works randomly)

    Quote Originally Posted by AB33 View Post
    You said

    "We're on two separate computers, two separate xlsm files, two separate internet connections...two different results"

    This probably explains why the site is behaving erratically. It has to do with your IPS. I have similar experience and I could not explain the reason, but highly likley to be that the site somehow recoginse your IPS since you are hammering thier site with traffic. To prove it, use a different IPS and see if this makes any diffreence.
    That was my first assumption as well...but I don't understand why one of our IPS would be blocked at one point, and then unblocked once a procedure finally end-up working, and we're clear for the rest of the day, doing a bunch of different refresh.

    Plus, in our largest .xlsm file (the one that contains all the titles in our database), it seems like one of our REFRESH often seem to have the capacity to unlock everything. Our MAIN REFRESH, which goes through all the different exchange rate, commodities, world index and Canadian and American sectors. For some reason, it seems like when nothing works, and we try that specific REFRESH, it skips the first few lines (exchange rate, commodities, north american and european index)...and ALWAYS get in at the precise same place : IBEX (5H). Once it reaches that specific line, the procedure gets all the following data (asian index and sectors) leaving us with roughly 100 lines missing. We then run a "MISSING REFRESH" right after, and everything's filled. And then, no matter which REFRESH we try after, no matter in which .xlsm document they are...we get the data needed, no problem, for the rest of the day.

    I'll try switching IPS later on today (once the market close) nonetheless, just in case. But seems to me like it would be one heck of a coincidence that a IPS block would be lifted on the same line that frequently.

    But thanks for your input. I'm not trying to be argumentative here, I'm just puzzled by the whole situation.

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. Macro works but stops randomly with "code execution has been interrupted"
    By terrypin in forum Excel Programming / VBA / Macros
    Replies: 8
    Last Post: 03-06-2017, 05:54 AM
  2. Scrapping column data from a different unopenned workbook.
    By Ghosted in forum Excel Formulas & Functions
    Replies: 0
    Last Post: 05-08-2015, 12:21 PM
  3. Weird Problem
    By gundam13 in forum Excel General
    Replies: 2
    Last Post: 08-10-2007, 12:52 PM
  4. Weird problem
    By Aussie_Striker in forum Excel Programming / VBA / Macros
    Replies: 6
    Last Post: 01-02-2007, 10:07 PM
  5. Weird Problem
    By Aussie_Striker in forum Excel Programming / VBA / Macros
    Replies: 0
    Last Post: 10-17-2006, 04:37 AM
  6. [SOLVED] why cut and insert cells only works randomly?
    By tony in forum Excel General
    Replies: 1
    Last Post: 12-04-2005, 12:15 AM
  7. Weird Problem
    By Patrick in forum Excel Formulas & Functions
    Replies: 4
    Last Post: 03-27-2005, 10:06 PM

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1