+ Reply to Thread
Results 1 to 14 of 14

Advice for Web data extraction

  1. #1
    Registered User
    Join Date
    08-18-2012
    Location
    United Kingdom
    MS-Off Ver
    Excel 2003
    Posts
    35

    Question Advice for Web data extraction

    Hello Gents!

    I've just joined the big family of this forum, nice to meet everyone here!

    I am writing a Macro code which could extract historical stock data of Hong Kong according to the date entered by a user.

    This is the data of the page I wish to extract from:
    http://finance.yahoo.com/q/hp?s=3888...torical+Prices

    Suppose I am only interested in the historical data of this page, how can I direct my program to obtain a certain value, let's say the "Close Price" of "Jun 11, 2012" A part of the html code which includes the data I need looks like this:

    HTML Code: 
    I have written some very basic web data extraction code before (shown below) but I have no idea for this one.

    My previous code:
    Please Login or Register  to view this content.
    Any help would be VERY VERY MUCH appreciated!!!!

    Many thanks,

    Gordon

  2. #2
    Valued Forum Contributor
    Join Date
    05-21-2009
    Location
    Great Britain
    MS-Off Ver
    Excel 2003
    Posts
    550

    Re: Advice for Web data extraction

    There's no need to use IE automation for this, as Excel can load the data directly. On the link you gave, set the start and end dates to 11 June 2012, click Get Prices and copy the Download to Spreadsheet link at the bottom of the page. In Excel, use File - Open, paste the spreadsheet link as the file name and Excel retrieves the data. Recorded as a macro, this gives:
    Please Login or Register  to view this content.
    It's then one line of VBA to copy the Close price from E2 in table.csv to your workbook.
    Post responsibly. Search for excelforum.com

  3. #3
    Registered User
    Join Date
    08-18-2012
    Location
    United Kingdom
    MS-Off Ver
    Excel 2003
    Posts
    35

    Re: Advice for Web data extraction

    Hello Chippy, thanks for your help. However, I really want to make this process automated, instead I have to manually check the link of every date each time. And I would think it'd be the best if it does not involve and file downloaded to my computer.

    My ideal solution is a code which could identify the data I need from the html and format the part of the data to extract only the number requires.
    Last edited by Cutter; 08-19-2012 at 08:02 PM. Reason: Removed whole post quote

  4. #4
    Valued Forum Contributor
    Join Date
    05-21-2009
    Location
    Great Britain
    MS-Off Ver
    Excel 2003
    Posts
    550

    Re: Advice for Web data extraction

    You don't have to manually click the link; you can construct the URL as a string with VBA code according to the query string parameters, i.e. this part:

    ?s=3888.HK&a=05&b=11&c=2012&d=05&e=11&f=2012&g=d&ignore=.csv"

    a = start month (zero-based), b = start day, c = start year, then similarly for the d, e and f values for the end date.

    The file (table.csv) is only downloaded to your computer temporarily; you don't need to save it.

    Believe me, this really is the easiest way, if not the most efficient. If you don't like this method, I would use XMLhttp to extract the data. This is similar to using IE automation but doesn't have the overhead of invoking IE.

  5. #5
    Registered User
    Join Date
    08-18-2012
    Location
    United Kingdom
    MS-Off Ver
    Excel 2003
    Posts
    35

    Re: Advice for Web data extraction

    I get you chippy, a very efficient method indeed! Thanks, this solve all of my problems Do you have to close the csv file afterwards? or does it close itself?
    Just want to learn a bit more, in case if I need to extract some data from a long coding like html, what would you use? I am very not familiar with text selecting, editing and stuff. I don't even know what does "Trim(doc.getElementsByTagName("span")(22).innerText)" do.... I just copied it from other people's work
    Last edited by Cutter; 08-19-2012 at 08:01 PM. Reason: Removed whole post quote

  6. #6
    Forum Guru Kyle123's Avatar
    Join Date
    03-10-2010
    Location
    Leeds
    MS-Off Ver
    365 Win 11
    Posts
    7,238

    Re: Advice for Web data extraction

    Here's a different approach, this is my preference. It uses something called Yahoo Query Language (YQL) which allows you to return data from a wide range of sources (html, XML, Data Tables, JSON, etc) in an XML format using a SQL like syntax. It's a brilliant resource for things like web scraping and getting data in general since it also allows the use of XPath in the Where clause for use on HTML, it also allows you to query many sites in a single call which can be handy.

    Anyways, back on topic You can use YQL to access on-line datatables that have set up, in the example below I am using one of the Yahoo Finance tables called historicaldata, the syntax for the query would be:
    PHP Code: 
    select from yahoo.finance.historicaldata where symbol '3888.HK' and startDate '2012-04-11' and endDate '2012-04-11' 
    You can try it out here to see what it returns:
    http://developer.yahoo.com/yql/conso...72012-04-11%27

    We can then take the url at the bottom and adapt it to our needs, so something like:
    Please Login or Register  to view this content.
    It can then be called with something like:
    Please Login or Register  to view this content.
    I've only scratched the surface with what you can do with YQL, but if you are interested in scraping web data, it's well worth looking into.

  7. #7
    Registered User
    Join Date
    08-18-2012
    Location
    United Kingdom
    MS-Off Ver
    Excel 2003
    Posts
    35

    Re: Advice for Web data extraction

    Thanks for your reply Kyle! I am sure it does work but I have no idea what does most of the comments do. It would be much much helpful (I know i am asking too much!) if you could explain what does it do for each command. Or alternatively, do you know any web resources which tells you the meaning of each function or command of Macro?

    In addition, does anyone know if i have a string with something like

    nowrap align="right">Jun 11, 2012</td><td class="yfnc_tabledata1" align="right">3.54</td><td class="yfnc_tabledata1" align="right">3.63</td><td class="yfnc_tabledata1" align="right">3.51</td><td class="yfnc_tabledata1" align="right">3.63</td><td class="yfnc_tabledata1" align="right">1,807,000</td><td class="yfnc_tabledata1" align="right">3.63</td></tr><tr><td class="yfnc_tabledata1" nowrap align="right">

    How can I extract the data I need? Let's say i want "1,807,000" from the above text.

    Many thanks,

    Gordon
    Last edited by Cutter; 08-20-2012 at 09:04 AM. Reason: Removed whole post quote

  8. #8
    Forum Guru Kyle123's Avatar
    Join Date
    03-10-2010
    Location
    Leeds
    MS-Off Ver
    365 Win 11
    Posts
    7,238

    Re: Advice for Web data extraction

    Does the below help?

    Please Login or Register  to view this content.

  9. #9
    Valued Forum Contributor
    Join Date
    05-21-2009
    Location
    Great Britain
    MS-Off Ver
    Excel 2003
    Posts
    550

    Re: Advice for Web data extraction

    Quote Originally Posted by gordonf35 View Post
    I get you chippy, a very efficient method indeed! Thanks, this solve all of my problems Do you have to close the csv file afterwards? or does it close itself?
    You have to close it (with VBA). Here is the complete code which accepts a date from the user and retrieves the Close price and Volume into cells A1:B1 in your workbook.
    Please Login or Register  to view this content.
    It's not the most efficient or cleverest method, but it does the job and is simple to understand - you don't need to know anything about HTML tags, XML, string parsing, etc.

  10. #10
    Registered User
    Join Date
    08-18-2012
    Location
    United Kingdom
    MS-Off Ver
    Excel 2003
    Posts
    35

    Re: Advice for Web data extraction

    Quote Originally Posted by Kyle123 View Post
    Does the below help?

    Please Login or Register  to view this content.
    Many thanks, this is very helpful! I am sure it is detail and comprehensive enough! Just some questions if you don't mind. I was wondering what does (Open "GET"), (responsetext) means? And how do you define where to spilt, I can't identify the code Val(Split(.responseText, "<" & strProperty & ">")(1)) defining any splitting point. How or when did you define it?

    THanks,

    Gordon

    ---------- Post added at 02:28 PM ---------- Previous post was at 02:20 PM ----------

    Quote Originally Posted by Chippy View Post
    You have to close it (with VBA). Here is the complete code which accepts a date from the user and retrieves the Close price and Volume into cells A1:B1 in your workbook.
    Please Login or Register  to view this content.
    It's not the most efficient or cleverest method, but it does the job and is simple to understand - you don't need to know anything about HTML tags, XML, string parsing, etc.
    Hi Chippy, thanks for your understanding. Yes, i only need a code which works. How did you guys learn all these. I have read some books about VBA but they only tell very simple programming command. By the way, what kind of date does you method accept? mm-dd-yyyy or dd-mmm-yy or others? I am now writing a code to convert user input, 20-08-2012 to AUG 20, 2012...

    May be easy for you but it's soooooooooooooo difficult for me!
    Last edited by gordonf35; 08-20-2012 at 03:23 PM.

  11. #11
    Registered User
    Join Date
    08-18-2012
    Location
    United Kingdom
    MS-Off Ver
    Excel 2003
    Posts
    35

    Re: Advice for Web data extraction

    Hi Kyle, I had a look for your code and trying out YQL for hours, finally I understand what you have done (although still not very understand on what I've asked above)! YQL is a really really good tool for programmer! Very smart! However, what would you do when YQL doesn not support the web address?

  12. #12
    Valued Forum Contributor
    Join Date
    05-21-2009
    Location
    Great Britain
    MS-Off Ver
    Excel 2003
    Posts
    550

    Re: Advice for Web data extraction

    Quote Originally Posted by gordonf35 View Post
    By the way, what kind of date does you method accept? mm-dd-yyyy or dd-mmm-yy or others?
    Any string or number that Excel can interpret as a date, for example in the UK you could enter 11/06/12, 11/6/12, 11-6-2012, 11/Jun/12, 11-Jun-12, etc. In the USA you would enter month, day, year so that Excel correctly interprets the local date. For user input it is better to specify the required format (e.g. dd/mm/yy) in the prompt and validate the input accordingly.

    A better approach from a user interface POV is to display the calendar control on a userform and have the user select a date instead of typing it in.

  13. #13
    Forum Guru Kyle123's Avatar
    Join Date
    03-10-2010
    Location
    Leeds
    MS-Off Ver
    365 Win 11
    Posts
    7,238

    Re: Advice for Web data extraction

    Quote Originally Posted by gordonf35
    Just some questions if you don't mind. I was wondering what does (Open "GET"), (responsetext) means? And how do you define where to spilt, I can't identify the code Val(Split(.responseText, "<" & strProperty & ">")(1)) defining any splitting point. How or when did you define it?
    GET is the type of web request we want and responseText is the data that is returned from the request.

    I explained how the split worked, it is based on the input parameter, so if you pass close, it will split on close, if you pass open it will split on open etc. You might want to read up on variables to understand how that works. tbh getting data from the web isn't going to be the easiest place to learn VBA, it can get very complex very quickly. For what you are doing right now, Chippy's solution will be better for you since it is easier and more transparent, the concepts used in mine are somewhat more complex.

  14. #14
    Registered User
    Join Date
    08-18-2012
    Location
    United Kingdom
    MS-Off Ver
    Excel 2003
    Posts
    35

    Re: Advice for Web data extraction

    Thanks Kyle and Chippy, very much apprecieated! I have learnt a lot from both of you

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Tags for this Thread

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1