+ Reply to Thread
Results 1 to 5 of 5

Using URLDownloadToFile function from VBA

  1. #1
    Registered User
    Join Date
    04-18-2007
    Posts
    5

    Using URLDownloadToFile function from VBA

    Hi.

    I'm writing some Excel VBA code that grabs the source code of a webpage, manipulates it and displays data in Excel format. Only I can't get it working properly.

    The website I am grabbing data from is the Racing Post website (racingpost.co.uk), which requires users to be logged in before they can access subpages. At the minute, for the utility I am writing, I am specifying that the user must be logged in to the Racing Post website via a normal web browser. This seemed reasonable enough to me and I have written code on the assumption that the user is logged in OK.

    I thought I had all this working yesterday but today it is not. I have since realised that yesterday I was playing with an embedded web browser in the Excel workbook (add as a control object) and logging in to the Racing Post site that way, and everything was fine. Today I have not logged in through the embedded browser, only through the default web browser, and it's now working.

    If I open the subpages of the website (having previously logged in) using the ShellExecute command to open them then all seems fine. But if I try the URLDownloadToFile on the same pages I get the website error pages telling me I have to be logged in.

    Can anyone offer any help or advice?

    I can navigate round the site fine in the web browser. My ShellExecute commands work to open the pages. But my URLDownloadToFile commands don't work.

  2. #2
    Forum Moderator Leith Ross's Avatar
    Join Date
    01-15-2005
    Location
    San Francisco, Ca
    MS-Off Ver
    2000, 2003, & 2010
    Posts
    23,258
    Hello Mathare,

    When posting questions about code, you should include code examples, data layout, your operating system, programs you are using, and list any error messages. When asking about API code problems you must supply the code. API call are low level programming functions that are influenced by many factors. If you aren't an API programmer, you can get into trouble very easily. The best way to get an answer to your question would be to post the workbook as an attachment to your post.

    Sincerely,
    Leith Ross

  3. #3
    Registered User
    Join Date
    04-18-2007
    Posts
    5
    Quote Originally Posted by Leith Ross
    Hello Mathare,

    When posting questions about code, you should include code examples, data layout, your operating system, programs you are using, and list any error messages. When asking about API code problems you must supply the code. API call are low level programming functions that are influenced by many factors. If you aren't an API programmer, you can get into trouble very easily. The best way to get an answer to your question would be to post the workbook as an attachment to your post.

    Sincerely,
    Leith Ross
    Aye, sorry about that LR. It was late and I was getting desperate for an answer. And to be honest I wasn't sure if it was the sort of problem that would get a "Ah, this is an easy one, you just need to..." response or a "What the hell are you doing? I don't understand" response. I also didn't want to post any/many links to the pages I am trying to access as they are date-specific and I don't know what the policy on external links is round here. Better safe than banned I figured.

    Anyway, I have cut down my workbook to hopefully just the essentials as:
    a) I didn't want to cloud the issue, and
    b) the code I am writing is a bit hacky in places and I am quite protective of it as I have put a lot of effort in to it and as a software engineer it's not up to my usual standard so I am both protective of what I am doing and embarrassed about some of the hacks I have resorted to.

    But I have attached a copy of the workbook showing the basics of what I am trying to acheive. I am going to add links to this post as necessary, if a moderator feels they are inappropriate then they can be removed but they will help understand what I am trying to do.

    A bit of background, and forgive me if I am teaching my grandmother to suck eggs here but I guessed this was an American forum when it set my default time zone to GMT -5hrs. I'm UK-based, as is the Racing Post (RP) website. It's a site containing the racecards for UK horse racing meetings, and it's that information I am looking to convert to Excel format for various reasons. The site uses cookies to control access to the racecards and results stored in their databases.

    I have written a workbook that can use an embedded web browser (added from the Control Toolbox) to navigate round the Racing Post website (www.racingpost.co.uk). However, I am trying to get away from this and to use the user's default browser (Avant in my case, which builds on the IE engine). I would like the workbook to be as platform-independent as possible.

    The basic premise is this:
    1. User logs in to RP website
    2. User uses my workbook to download the racecards and convert them to Excel format
    3. User then does with the Excel racecards as he pleases
    The workbook I have attached covers the first step and a half; logging on to the RP website and downloads the racecards to HTML files. The conversion and display in Excel format has been stripped out for the reasons given above.

    Here's what happens under various circumstances:
    1) User has no logged in to RP website via either the embedded browser or standard web browser
    On clicking the "Download RP racecards" button (to the right of the browser on the Racecards sheet) he gets an error querying whether he is logged in. My code has correctly determined the user has yet to log in to the website so cannot access the required data.
    2) User has logged in to the RP website via the embedded browser
    On clicking the "Download RP racecards" button the racecards are downloaded as HTML and contain all the required information.
    3) User has logged in to the RP website via a standard web browser
    On clicking the "Download RP racecards" the HTML files are created but each contains a copy of the webpage informing the user that he needs to be logged in to access the racecard information.

    It's case 3 that I am trying to address as that's the route I want to go down for users of this workbook. The user has logged into the site and were he to enter the URLs I am trying to download into the address bar of his browser he would see the desired pages.

    I have tested this with the VBA, calling a ShellExecute command to open the file (see ExtractRacecards() routine)

    Please Login or Register  to view this content.
    where YearVal, MonthVal and DateVal are date elements obtained from the worksheet (cell Z9 of Racecards sheet)

    This works when the user has logged in to the RP website via a web browser. But the following call to DownloadFile (which instantiates Microsoft's URLDownloadToFile function) doesn't work correctly, downloading the webpage showing that I need to be logged in to access the data.

    Please Login or Register  to view this content.
    Anyone who can shed any light on this will have my greatest thanks. I'm no API programmer and this is my first foray into this sort of venture and I am finding it rather frustrating. I can't see how/why the embedded browser and the standard broswer can behave differently.

    Thanks in advance

    NB You will need to register for the Racing Post website to be able to log in and access the racecards. Registration is quick, easy and free. Note also that I have absolutely no motive for getting anyone signed up to the RP website other than them being able to help me with my problem. I have no affiliate IDs with them or any such nonsense.
    Attached Files Attached Files

  4. #4
    Forum Moderator Leith Ross's Avatar
    Join Date
    01-15-2005
    Location
    San Francisco, Ca
    MS-Off Ver
    2000, 2003, & 2010
    Posts
    23,258
    Hello Mathare,

    There 2 important issues to be aware of when using this API. First the API works only with static web pages not dynamic ones, and second, If the default browser is Internet Explorer this can you several more problems.

    I haven't examined your code yet as I been busy with some unexpected issues today, but wanted you know I am still working on this with you. I'll post back after I have looked it over.

    Thanks,
    Leith Ross

  5. #5
    Registered User
    Join Date
    04-18-2007
    Posts
    5
    Quote Originally Posted by Leith Ross
    Hello Mathare,

    There 2 important issues to be aware of when using this API. First the API works only with static web pages not dynamic ones, and second, If the default browser is Internet Explorer this can you several more problems.
    Static webpages are fine, that's what I have. The URL has some date-related aspects to it but the page itself is static. Why am I not surprised IE can cause me issues though

    I haven't examined your code yet as I been busy with some unexpected issues today, but wanted you know I am still working on this with you. I'll post back after I have looked it over.

    Thanks,
    Leith Ross
    You're a star, thanks

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1