How to collect data with web scraper ?

  1. #1
    Carly Fiorina
    Guest

    How to collect data with web scraper ?

    Hi everyone,
    There is a large amount of data available only through websites. However, as many people have found out, trying to copy data into a usable database or spreadsheet directly out of a website can be a tiring process. Data entry from internet sources can quickly become cost prohibitive as the required hours add up. Clearly, an automated method for collating information from HTML-based sites can offer huge management cost savings. Web scrapers are programs that are able to aggregate information from the internet. They are capable of navigating the web, assessing the contents of a site, and then pulling data points and placing them into a structured, working database or spreadsheet.
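
    To make "pulling data points and placing them into a structured spreadsheet" concrete, here is a minimal sketch using only the Python standard library. It is not any particular product's method: the class `TableParser`, the function `table_to_csv` and the sample HTML are all made up for illustration, and real pages usually need more care (nested tables, merged cells, pages built by JavaScript).

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being read, or None
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_to_csv(html):
    """Return the table cells found in `html` as CSV text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Hypothetical sample page fragment, not taken from any real site.
SAMPLE_HTML = """
<table>
  <tr><th>Ticker</th><th>Price</th></tr>
  <tr><td>ABC</td><td>12.50</td></tr>
  <tr><td>XYZ</td><td>7.25</td></tr>
</table>
"""

if __name__ == "__main__":
    print(table_to_csv(SAMPLE_HTML))
```

    The CSV text can be saved to a file and opened directly in Excel.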

    Thanks
    Carly Fiorina
    Last edited by Carly Fiorina; 11-10-2010 at 10:37 AM.

  2. #2
    Forum Expert
    Join Date
    08-27-2008
    Location
    England
    MS-Off Ver
    2010
    Posts
    2,561

    Re: How to collect data with web scraper ?

    Rather than suggest the removal of this advert, I'd like to say that it's relatively easy to create internet data sources (I have a lovely one based on some Yahoo! Finance pages) and only slightly less easy to automate Internet Explorer from Excel to browse automatically and pull back data.

    Free here.
    CC


    If you feel really indebted please consider a donation to charity. My preferred charity is ActionAid but there are plenty of worthy alternatives.

  3. #3
    Registered User
    Join Date
    11-03-2010
    Location
    New York, New York
    MS-Off Ver
    Excel 2003
    Posts
    4

    Re: How to collect data with web scraper ?

    Cheeky Charlie,

    Could you point me in the direction of any resources to do that? I am very interested in web scraping (for free!)

  4. #4
    Forum Expert
    Join Date
    08-27-2008
    Location
    England
    MS-Off Ver
    2010
    Posts
    2,561

    Re: How to collect data with web scraper ?

    The first thing:
    Data -> Import External Data -> New web query

    The second thing:
    http://www.google.co.uk/search?hl=en...=&oq=&gs_rfai=

    I'm not about to create a generic web-scraping tool; if that's what you want, buy one. We tend to solve specific problems here.

  5. #5
    Registered User
    Join Date
    03-06-2013
    Location
    Sydney
    MS-Off Ver
    Excel 2010
    Posts
    1

    Re: How to collect data with web scraper ?

    Hi Charlie

    I came across this thread (and forum) by attempting to find answers to what I need.

    It might be more involved on the web development side rather than Excel proficiency but I need to:

    a) web scrape a government website (data in table updated daily)
    b) transfer multiple tables into worksheets
    c) auto re-stylise the header row of each table (including making it printable)
    d) auto-forward the worksheets to an email group (email group part is easy enough with Gmail)
    e) loop the above every weekday after 6.30pm local time
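
    Item (e) comes down to working out the next weekday run time. A rough sketch of that calculation, assuming plain local-time datetimes with no time-zone handling; the names `RUN_AT` and `next_run` are invented for this example, and in practice a system scheduler (Windows Task Scheduler, cron) may be a simpler way to trigger the job:

```python
from datetime import datetime, time, timedelta

RUN_AT = time(18, 30)  # 6.30pm local time, taken from item (e)


def next_run(now):
    """Return the next weekday datetime at RUN_AT that is strictly after `now`."""
    candidate = datetime.combine(now.date(), RUN_AT)
    if candidate <= now:
        # Today's slot has already passed, so start from tomorrow.
        candidate += timedelta(days=1)
    while candidate.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        candidate += timedelta(days=1)
    return candidate


if __name__ == "__main__":
    print(next_run(datetime.now()))
```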

    The other issue (which I doubt can be answered here) is that the URL changes daily and I don't know how to 'deconstruct' what it means, so the automated system I'm looking to build might not work. (For example, going from www.....com/Page1/28Feb to www.....com/Page1/1Mar, it is easy to work out what needs to be done next. But here, the URL string seems to be random.)

    If anything, I'm asking whether it's possible to auto-stylise data as it comes in from the same source (albeit updated daily) and whether it is possible to auto-forward that worksheet via email.
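
    On the auto-forwarding part, one possible shape is to build the message with the export attached and hand it to a mail server. This is only a sketch with Python's standard `email` package: the function name `build_report_email`, the subject line, the addresses and the file name are placeholders, and it attaches CSV text rather than a styled workbook.

```python
from email.message import EmailMessage


def build_report_email(sender, recipients, csv_text, filename="report.csv"):
    """Build (but do not send) an email with the CSV text attached."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = "Daily table export"  # placeholder subject
    msg.set_content("Today's tables are attached.")
    # Attaching a str creates a text/<subtype> part; here text/csv.
    msg.add_attachment(csv_text, subtype="csv", filename=filename)
    return msg


if __name__ == "__main__":
    # Placeholder addresses and data, for illustration only.
    message = build_report_email(
        "me@example.com",
        ["a@example.com", "b@example.com"],
        "Ticker,Price\nABC,12.50\n",
    )
    print(message["To"], message.get_content_type())
```

    Sending would then be a matter of passing the message to `smtplib.SMTP(...).send_message(msg)`; the server address and login details depend on the mail provider and are not shown here.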

    Ta,

    Andrew

  6. #6
    Forum Expert
    Join Date
    08-27-2008
    Location
    England
    MS-Off Ver
    2010
    Posts
    2,561

    Re: How to collect data with web scraper ?

    Hi Andrew,

    Welcome to the forum.

    Unfortunately your post does not comply with Rule 2 of our Forum RULES. Do not post a question in the thread of another member -- start your own thread.

    If you feel an existing thread is particularly relevant to your need, provide a link to the other thread in your new thread.

    Old threads are often only monitored by the original participants. New threads not only open you up to all possible participants again, they typically get faster response, too.


    ...

    That said... I think most of this stuff is doable, although I wouldn't try to tackle the whole lot without expecting a fee! If you break the bits up you will be able to start threads for each (in general I don't look at threads with too many requirements, the work-fruit relationship isn't satisfying).
    As an example, one way of following your dynamic URL may be to automate the access process (click this link, then this link) and pull the data that way, rather than attempt to reverse-engineer what is almost certainly a random number. But this is itself a chunky challenge before you add all the rest!
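
    To illustrate the "click this link" idea: rather than guessing the day's URL, a script can read a stable index page, find the link by its visible text, and follow whatever address it points at. A minimal sketch with the Python standard library follows; `LinkFinder`, `find_link`, the sample HTML and the example.gov address are all invented for illustration, and a site that builds its links with JavaScript would need browser automation instead.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkFinder(HTMLParser):
    """Record (text, href) for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, or None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def find_link(page_html, base_url, link_text):
    """Return the absolute URL of the first link whose text matches, else None."""
    finder = LinkFinder()
    finder.feed(page_html)
    finder.close()
    for text, href in finder.links:
        if text.lower() == link_text.lower():
            return urljoin(base_url, href)
    return None


# Hypothetical index page; the report link changes daily but its text does not.
INDEX_HTML = (
    '<ul><li><a href="/about">About</a></li>'
    '<li><a href="/reports/x9f3k2?id=77">Daily report</a></li></ul>'
)

if __name__ == "__main__":
    print(find_link(INDEX_HTML, "https://example.gov/home", "daily report"))
```

    In real use the index page would first be downloaded (for example with `urllib.request.urlopen`), and the URL returned by `find_link` would be fetched next to reach the day's table.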

    Best of luck. Feel free to PM me links to any threads you start.
    Last edited by Cheeky Charlie; 03-06-2013 at 01:00 PM. Reason: qualification

  7. #7
    Registered User
    Join Date
    03-12-2013
    Location
    Toronto, Canada
    MS-Off Ver
    Excel 2011
    Posts
    2

    Re: How to collect data with web scraper ?

    mju4t, cdafam: Depending on what datasets you're looking for, Quandl might be able to help. It's a website that aims to be a "generalized web scraper" -- it's scraped some 3 million numerical datasets from around the web, and made them accessible in a single format. There's an Excel Add-In as well -- so you can get any of those datasets directly into Excel, very easily. Hope this helps.

    --
    qatenary

    Disclosure: I'm on the Quandl team.

  8. #8
    Forum Expert
    Join Date
    08-27-2008
    Location
    England
    MS-Off Ver
    2010
    Posts
    2,561

    Re: How to collect data with web scraper ?

    Well pitched.
    I am now playing with this new shiny thing.

    I will recommend it as long as it stays free
