So I'm new to VBA, but I'm also completely hooked on it. I've learned Quite a bit, and right now I'm trying to learn how to pull Data from IE. Web Scraping.
so wit m monster of a project that I've been posting all over this site i would like to populate a sheet from IMBD
Web Queries are much too slow for this, I'm searching the site using a Variant from Cell value, then pulling specific values from the site to the same row next column again dependent on Cell Value, then moving to the next row.
I have no code for this, and anything pointing to the right direction will give me something to build off. I'm just not sure where to start.
There are two locations I'm trying to read the source Data from:
The Main Site: with "tt1520211" being the Variable
HTML Code:
and: Same as the Main Site, without "combined" and "episodes?season=" and the number "1" also being the Variable
then Matching Ep1 to a line on the workbook
HTML Code:
and the Sheet that the code will read from: Series_V1.xlsx
on the sheet the variables will come from Sheets("Series"), range being A3:A for hte title, but most importantly B3:B for the "tt1520211"
then sheets("The Walking Dead") etc will contain also the "tt1520211" as well as the number Variable, and the ,EP1 to match.
the code i'm trying to scrape looks like this:
PHP Code:
<div class="clear" itemscope itemtype="http://schema.org/TVSeason"> <meta itemprop="numberofEpisodes" content="6"/> <div class="sort"> <button data-direction="asc" class="small sort_direction btn sort_asc" title="Reverse the order"> </button> </div> <h3 id="episode_top" itemprop="name">Season 1</h3> <div class="list detail eplist"> <div class="list_item odd"> <div class="image"> <a onclick="(new Image()).src='/rg/episodes/image-1/images/b.gif?link=/title/tt1589921/';" href="/title/tt1589921/" title="Days Gone Bye" itemprop="url"> <div data-const="tt1589921" class="hover-over-image zero-z-index "> <img width="120" class="zero-z-index" alt="Days Gone Bye" src="http://ia.media-imdb.com/images/M/MV5BMTM5NDkxNDM0Nl5BMl5BanBnXkFtZTcwNDI3MDQwNA@@._V1_SX120_CR0,0,120,180_.jpg"> <div>S1, Ep1</div> </div> </a> </div> <div class="info" itemprop="episodes" itemscope itemtype="http://schema.org/TVEpisode"> <meta itemprop="episodeNumber" content="1"/> <div class="airdate"> Oct. 31, 2010 </div> <strong><a onclick="(new Image()).src='/rg/episodes/episode-1/images/b.gif?link=/title/tt1589921/';" href="/title/tt1589921/" title="Days Gone Bye" itemprop="name">Days Gone Bye</a></strong> <div class="item_description" itemprop="description">Sheriff Deputy, Rick Grimes, wakes up in the hospital, after being shot, to find his town overrun by flesh-eating zombies. After making friends with survivor Morgan Jones and his son Duane, Rick sets out to find his wife and son.</div> <div class="popoverContainer"> <div class="watchInfo"> <label><span>Watch now</span></label> <div class="watchIcon"><img src="http://ia.media-imdb.com/images/G/01/imdb/images/video/provider_logos/amazon-1815097792._V149567120_.gif" /></div> <div class="watchIcon"><img src="http://ia.media-imdb.com/images/G/01/imdb/images/video/provider_logos/amazon-1815097792._V149567120_.gif" /></div> </div> <div class="htwPopover"> <div class="popoverClose touch"></div> <h5>Watch now</h5> <div class="watchOptionList"> <div class="watchOption"> <a onclick="(new Image()).src='/rg/episodes-watch-now/on-demand-amazon/images/b.gif?link=/video/amazon/vi3316753177/offsite';" href="/video/amazon/vi3316753177/offsite" target="_blank"> <div class="watchIcon"><img src="http://ia.media-imdb.com/images/G/01/imdb/images/video/provider_logos/amazon-1815097792._V149567120_.gif" /></div> Watch on Amazon </a> </div> <div class="sep"></div> <div class="watchOption"> <a
<div>S1, Ep1</div> for the Matching using ("C6")
This would Definitely get me started in the right directcion, I would be able to make the loops, and things of such nature, but the IE interaction, and pointing to the correct items for the .getElementsbytagname type items, I'm not at all clear on. Again, any help would be apprciated, and i will of course run with what ever gets me started.
Thanks Again for the time
itemprop="name">Days Gone Bye</a></strong> for Cell.Value("D6")
<div class="airdate">
Oct. 31, 2010
</div> for Cell.Value("H6")
<div class="item_description" itemprop="description">Sheriff Deputy, Rick Grimes, wakes up in the hospital, after being shot, to find his town overrun by flesh-eating zombies. After making friends with survivor Morgan Jones and his son Duane, Rick sets out to find his wife and son.</div>
for ("E6")
to learn it, for lack of better discription. I'm actually building the same thing in paralell using the xml as we did with the movies, but theres alot of projects i can do if i learn the IE automation.. Though i'm finding out that its a great deal slower and difficult to write.. least for me
Automation of IE should always only be a last resort, it's extremely slow as you have found. In the majority of times it's not necessary since you can run GET and POST requests directly from Excel using WinHTTP and XML objects, once you understand how web-servers send and receive information, it isn't too difficult to emulate the browser calls from Excel, try using the developer tools in Chrome/IE to see what data is actually sent/received.
So My Frustration has hit it's peak, and will be putting the IE automation on the back burner for the time being..
this is a link to the same project using the XML search, rather than the IE approach.. which is Rather Irritating XML Series Search
Bookmarks