
Web Scraping Help

  1. #1
    Registered User
    Join Date
    10-14-2011
    Location
    chesapeake, va
    MS-Off Ver
    Excel 2007
    Posts
    59

    Web Scraping Help

    So I'm new to VBA, but I'm also completely hooked on it. I've learned quite a bit, and right now I'm trying to learn how to pull data from IE: web scraping.
    So, with my monster of a project that I've been posting about all over this site, I would like to populate a sheet from IMDb.

    Web queries are much too slow for this. I'm searching the site using a Variant taken from a cell value, then pulling specific values from the site into the next columns of the same row (again dependent on the cell value), then moving to the next row.

    I have no code for this, and anything pointing in the right direction will give me something to build on. I'm just not sure where to start.

    There are two locations I'm trying to read the source data from:
    The main site, with "tt1520211" being the variable:
    HTML Code: 
    and: the same as the main site, without "combined" and with "episodes?season=", the number "1" also being a variable,
    then matching Ep1 to a line in the workbook:
    HTML Code: 
    And the sheet that the code will read from:
    Series_V1.xlsx
    On the sheet, the variables will come from Sheets("Series"): range A3:A for the title, but most importantly B3:B for the "tt1520211".
    Then Sheets("The Walking Dead") etc. will also contain the "tt1520211", as well as the number variable and the Ep1 to match.

    The code I'm trying to scrape looks like this:
    PHP Code: 
    <div class="clear" itemscope itemtype="http://schema.org/TVSeason">
      <meta itemprop="numberofEpisodes" content="6"/>
      <div class="sort">
        <button data-direction="asc" class="small sort_direction btn sort_asc" title="Reverse the order">&nbsp;</button>
      </div>
      <h3 id="episode_top" itemprop="name">Season&nbsp;1</h3>
      <div class="list detail eplist">
        <div class="list_item odd">
          <div class="image">
            <a onclick="(new Image()).src='/rg/episodes/image-1/images/b.gif?link=/title/tt1589921/';"
               href="/title/tt1589921/"
               title="Days Gone Bye"
               itemprop="url">
              <div data-const="tt1589921" class="hover-over-image zero-z-index ">
                <img width="120" class="zero-z-index" alt="Days Gone Bye" src="http://ia.media-imdb.com/images/M/MV5BMTM5NDkxNDM0Nl5BMl5BanBnXkFtZTcwNDI3MDQwNA@@._V1_SX120_CR0,0,120,180_.jpg">
                <div>S1, Ep1</div>
              </div>
            </a>
          </div>
          <div class="info" itemprop="episodes" itemscope itemtype="http://schema.org/TVEpisode">
            <meta itemprop="episodeNumber" content="1"/>
            <div class="airdate">
              Oct. 31, 2010
            </div>
            <strong><a onclick="(new Image()).src='/rg/episodes/episode-1/images/b.gif?link=/title/tt1589921/';"
                       href="/title/tt1589921/"
                       title="Days Gone Bye"
                       itemprop="name">Days Gone Bye</a></strong>
            <div class="item_description" itemprop="description">Sheriff Deputy, Rick Grimes, wakes up in the hospital, after being shot, to find his town overrun by flesh-eating zombies. After making friends with survivor Morgan Jones and his son Duane, Rick sets out to find his wife and son.</div>
            <div class="popoverContainer">
              <div class="watchInfo">
                <label><span>Watch now</span></label>
                <div class="watchIcon"><img src="http://ia.media-imdb.com/images/G/01/imdb/images/video/provider_logos/amazon-1815097792._V149567120_.gif" /></div>
                <div class="watchIcon"><img src="http://ia.media-imdb.com/images/G/01/imdb/images/video/provider_logos/amazon-1815097792._V149567120_.gif" /></div>
              </div>
              <div class="htwPopover">
                <div class="popoverClose touch"></div>
                <h5>Watch now</h5>
                <div class="watchOptionList">
                  <div class="watchOption">
                    <a onclick="(new Image()).src='/rg/episodes-watch-now/on-demand-amazon/images/b.gif?link=/video/amazon/vi3316753177/offsite';"
                       href="/video/amazon/vi3316753177/offsite"
                       target="_blank">
                      <div class="watchIcon"><img src="http://ia.media-imdb.com/images/G/01/imdb/images/video/provider_logos/amazon-1815097792._V149567120_.gif" /></div>
                      Watch on Amazon
                    </a>
                  </div>
                  <div class="sep"></div>
                  <div class="watchOption">
                    <
    <div>S1, Ep1</div> for the matching, using ("C6")

    This would definitely get me started in the right direction. I would be able to make the loops and things of that nature, but I'm not at all clear on the IE interaction and on pointing to the correct items for the .getElementsByTagName type calls. Again, any help would be appreciated, and I will of course run with whatever gets me started.

    Thanks again for your time.

    itemprop="name">Days Gone Bye</a></strong> for Cell.Value("D6")

    <div class="airdate">
    Oct. 31, 2010
    </div> for Cell.Value("H6")

    <div class="item_description" itemprop="description">Sheriff Deputy, Rick Grimes, wakes up in the hospital, after being shot, to find his town overrun by flesh-eating zombies. After making friends with survivor Morgan Jones and his son Duane, Rick sets out to find his wife and son.</div>
    for ("E6")
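
    A rough sketch of how the elements listed above could be picked out, assuming the page HTML is already held in a string (how it gets fetched is a separate question). The names WriteEpisodes, FirstDiv, DivText and CleanText are made up for this illustration. It walks getElementsByTagName("div") and tests className instead of calling getElementsByClassName, since a late-bound "htmlfile" document may not expose that method. The "list_item*" pattern assumes every episode row starts its class with "list_item", as the "list_item odd" row above does. It fills rows in page order rather than matching against column C, so the matching loop is still left to do.

```vba
Option Explicit

' Strip the line breaks and padding that page layout leaves around text.
Private Function CleanText(ByVal s As String) As String
    s = Replace(Replace(s, vbCr, " "), vbLf, " ")
    CleanText = Trim$(s)
End Function

' First <div> below root whose class attribute equals cls, or Nothing.
Private Function FirstDiv(ByVal root As Object, ByVal cls As String) As Object
    Dim el As Object
    For Each el In root.getElementsByTagName("div")
        If el.className = cls Then
            Set FirstDiv = el
            Exit Function
        End If
    Next el
End Function

' Text of that <div>, or "" when no such element exists.
Private Function DivText(ByVal root As Object, ByVal cls As String) As String
    Dim el As Object
    Set el = FirstDiv(root, cls)
    If Not el Is Nothing Then DivText = CleanText(el.innerText)
End Function

' Hypothetical helper: writes one row per episode found in pageHtml, starting
' at firstRow, and returns the number of episodes written.
Public Function WriteEpisodes(ByVal pageHtml As String, ByVal ws As Worksheet, _
                              ByVal firstRow As Long) As Long
    Dim doc As Object, ep As Object, info As Object, links As Object
    Dim r As Long

    Set doc = CreateObject("htmlfile")      ' late-bound MSHTML document
    doc.body.innerHTML = pageHtml

    r = firstRow
    For Each ep In doc.getElementsByTagName("div")
        If ep.className Like "list_item*" Then          ' e.g. "list_item odd"
            ws.Cells(r, "C").Value = DivText(ep, "image")             ' S1, Ep1
            ws.Cells(r, "E").Value = DivText(ep, "item_description")  ' description
            ws.Cells(r, "H").Value = DivText(ep, "airdate")           ' air date
            Set info = FirstDiv(ep, "info")
            If Not info Is Nothing Then
                Set links = info.getElementsByTagName("a")
                ' In the markup above, the first link inside "info" is the episode title.
                If links.Length > 0 Then
                    ws.Cells(r, "D").Value = CleanText(links.Item(0).innerText)
                End If
            End If
            r = r + 1
        End If
    Next ep
    WriteEpisodes = r - firstRow
End Function
```

    It could be called as n = WriteEpisodes(html, Sheets("The Walking Dead"), 6), where html is whatever string holds the season page. The columns C, D, E and H follow the cell mapping in the post; everything else is an assumption.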

  2. #2
    Registered User
    Join Date
    10-14-2011
    Location
    chesapeake, va
    MS-Off Ver
    Excel 2007
    Posts
    59

    Re: Web Scraping Help

    So I've been searching the internet and found this code. I've been able to pull various data, but I'm still having a great deal of trouble with this area:

    Series_V1.xlsm (located in the Scraping module)

    Right now I'm just dumping code into Sheet1 to connect the dots, but if anyone is willing, I would still very much appreciate a clue.


  3. #3
    Forum Guru Norie's Avatar
    Join Date
    02-02-2005
    Location
    Stirling, Scotland
    MS-Off Ver
    Microsoft Office 365
    Posts
    19,643

    Re: Web Scraping Help

    Why have you changed to automating IE?

  4. #4
    Registered User
    Join Date
    10-14-2011
    Location
    chesapeake, va
    MS-Off Ver
    Excel 2007
    Posts
    59

    Re: Web Scraping Help

    To learn it, for lack of a better description. I'm actually building the same thing in parallel using the XML approach, as we did with the movies, but there are a lot of projects I can do if I learn IE automation. Though I'm finding out that it's a great deal slower and more difficult to write, at least for me.

  5. #5
    Forum Guru Kyle123's Avatar
    Join Date
    03-10-2010
    Location
    Leeds
    MS-Off Ver
    365 Win 11
    Posts
    7,238

    Re: Web Scraping Help

    Automating IE should only ever be a last resort; it's extremely slow, as you have found. Most of the time it isn't necessary, since you can run GET and POST requests directly from Excel using WinHTTP and XML objects. Once you understand how web servers send and receive information, it isn't too difficult to emulate the browser calls from Excel. Try using the developer tools in Chrome/IE to see what data is actually sent and received.
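
    As a minimal sketch of the GET side of that, assuming the WinHTTP 5.1 library that ships with Windows: HttpGet is a made-up name, the User-Agent value is only a placeholder, and the URL is whatever you see the browser request in the developer tools.

```vba
Option Explicit

' Fetch a page with a plain GET request and return the response body as text.
' The caller supplies the URL; none is assumed here.
Public Function HttpGet(ByVal url As String) As String
    Dim http As Object
    Set http = CreateObject("WinHttp.WinHttpRequest.5.1")   ' late bound, no reference needed

    http.Open "GET", url, False                  ' False = synchronous call
    http.setRequestHeader "User-Agent", "Mozilla/5.0"   ' placeholder value
    http.send

    If http.Status = 200 Then
        HttpGet = http.responseText
    Else
        Err.Raise vbObjectError + 513, "HttpGet", _
                  "HTTP " & http.Status & " " & http.statusText & " for " & url
    End If
End Function
```

    The returned string can then be searched or loaded into an HTML document object for parsing, with no browser window involved. A POST works the same way, with "POST" as the method and the form data passed to send.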

  6. #6
    Registered User
    Join Date
    10-14-2011
    Location
    chesapeake, va
    MS-Off Ver
    Excel 2007
    Posts
    59

    Re: Web Scraping Help

    So my frustration has hit its peak, and I will be putting the IE automation on the back burner for the time being.
    This is a link to the same project using the XML search rather than the IE approach, which is rather irritating:
    XML Series Search
