Text extraction from News Articles Into Excel

  1. #1
    Registered User
    Join Date
    06-12-2013
    Location
    London, England
    MS-Off Ver
    Excel 2007
    Posts
    24

    Text extraction from News Articles Into Excel

    Hello,
    I am running into the following challenge and would really appreciate some help with a VBA macro or process.
    I have been tasked with extracting the full text from hundreds of URLs and putting each article's text in its own cell. The only way I can see to do this is with VBA.
    Here is an example of what I have been trying to do, using this URL:
    [URL not shown]
    I extracted the full text using boilerpipe, which is found here:
    [URL not shown]
    I then recorded the following macro using the web import wizard in Excel:
    [macro code not shown]
    This imports the text into multiple cells in the same column. But to automate the import for a large number of URLs, and to keep the sheet readable, I need the full text of each article imported into a single cell.

    Any help/suggestion will be appreciated.

    Thanks!
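
    One way to get the single-cell result described above, as a minimal sketch: join the cells that the web import produced into one string and write that string to a single cell. The names JoinCells and MergeImportedText and the ranges A1:A200 and C1 are illustrative assumptions, not part of the recorded macro.

    ```vba
    Option Explicit

    ' Joins the non-empty cells of a range into one string,
    ' separated by line feeds, and caps it at Excel's per-cell limit.
    Function JoinCells(ByVal src As Range) As String
        Dim c As Range
        Dim piece As String
        Dim result As String

        For Each c In src.Cells
            If Not IsError(c.Value) Then
                piece = Trim$(CStr(c.Value))
                If Len(piece) > 0 Then
                    If Len(result) > 0 Then result = result & vbLf
                    result = result & piece
                End If
            End If
        Next c

        ' A cell holds at most 32,767 characters.
        JoinCells = Left$(result, 32767)
    End Function

    ' Assumed layout: the import landed in A1:A200 and the merged text goes to C1.
    Sub MergeImportedText()
        With ActiveSheet
            .Range("C1").Value = JoinCells(.Range("A1:A200"))
            .Range("C1").WrapText = True
        End With
    End Sub
    ```

    Text beyond the 32,767-character cell limit is cut off in this sketch, so very long articles would need to be split or stored elsewhere.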

  2. #2
    JasperD (Forum Expert)
    Join Date
    05-07-2013
    Location
    Netherlands
    MS-Off Ver
    Excel 2016
    Posts
    1,393

    Re: Text extraction from News Articles Into Excel

    Hi there mmtoure,

    Put your links in column A, starting at A1.
    Then run this code. It passes each link through boilerpipe and puts the text output in the cell next to it (column B).
    It worked fine on the single sample link you gave.
    I hope that's what you're looking for.

    [VBA code not shown]
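
    A minimal sketch of the approach described above, not JasperD's original macro: loop over the links in column A, request each one through a text-extraction service, and write the response into column B. The EXTRACT_URL value is a hypothetical placeholder for whatever extraction endpoint is used, and the names ExtractArticles, FetchText and UrlEncode are assumptions made for illustration.

    ```vba
    Option Explicit

    ' Hypothetical endpoint: replace with the extraction service you actually use.
    Private Const EXTRACT_URL As String = "https://example.com/extract?output=text&url="
    Private Const MAX_CELL_CHARS As Long = 32767   ' Excel's per-cell text limit

    Sub ExtractArticles()
        Dim ws As Worksheet
        Dim lastRow As Long, r As Long
        Dim link As String, body As String

        Set ws = ActiveSheet
        lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row

        For r = 1 To lastRow
            link = Trim$(CStr(ws.Cells(r, "A").Value))
            If Len(link) > 0 Then
                body = FetchText(EXTRACT_URL & UrlEncode(link))
                If Len(body) > 0 Then
                    ws.Cells(r, "B").Value = Left$(body, MAX_CELL_CHARS)
                Else
                    ws.Cells(r, "B").Value = "Request failed"
                End If
            End If
        Next r
    End Sub

    ' Synchronous GET; returns an empty string on any error or non-200 status.
    Private Function FetchText(ByVal url As String) As String
        Dim http As Object
        On Error GoTo Failed
        Set http = CreateObject("MSXML2.XMLHTTP")
        http.Open "GET", url, False
        http.send
        If http.Status = 200 Then FetchText = http.responseText
        Exit Function
    Failed:
        FetchText = ""
    End Function

    ' ASCII-only percent-encoding; non-ASCII characters would need UTF-8 handling.
    Private Function UrlEncode(ByVal s As String) As String
        Dim i As Long, code As Long
        Dim ch As String, result As String

        For i = 1 To Len(s)
            ch = Mid$(s, i, 1)
            code = AscW(ch)
            Select Case code
                Case 48 To 57, 65 To 90, 97 To 122, 45, 46, 95, 126
                    result = result & ch          ' unreserved: 0-9 A-Z a-z - . _ ~
                Case Else
                    result = result & "%" & Right$("0" & Hex$(code And &HFF), 2)
            End Select
        Next i
        UrlEncode = result
    End Function
    ```

    The hand-written UrlEncode is there because the poster is on Excel 2007, which lacks the ENCODEURL worksheet function added in Excel 2013. Requests run one at a time, so hundreds of links may take a while.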
    Please click the * below if this helps

  3. #3
    Registered User

    Re: Text extraction from News Articles Into Excel

    Hi JasperD,
    Thanks for the code. It worked nicely. Is there a way, though, to extract the text without using boilerpipe? Just wondering.

  4. #4
    JasperD (Forum Expert)

    Re: Text extraction from News Articles Into Excel

    You could just remove the boilerpipe add-in and extract the whole page, but then you'd have to strip all the formatting, etc. yourself, so I don't see the point.
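
    To illustrate what "removing the formatting yourself" would involve, here is a rough sketch that strips tags from raw HTML with a regular expression. StripHtml and DemoStripHtml are hypothetical names, and this is only an assumption about how one might approach it. Unlike boilerpipe, it keeps menus, ads and other page clutter, and it does not decode entities such as &amp;.

    ```vba
    Option Explicit

    ' Crude HTML-to-text conversion using the VBScript regular expression engine.
    Function StripHtml(ByVal html As String) As String
        Dim re As Object
        Set re = CreateObject("VBScript.RegExp")
        re.Global = True
        re.IgnoreCase = True

        ' Drop script and style blocks together with their contents.
        re.Pattern = "<(script|style)\b[^>]*>[\s\S]*?</\1\s*>"
        html = re.Replace(html, " ")

        ' Drop every remaining tag.
        re.Pattern = "<[^>]+>"
        html = re.Replace(html, " ")

        ' Collapse runs of whitespace into single spaces.
        re.Pattern = "\s+"
        html = re.Replace(html, " ")

        StripHtml = Trim$(html)
    End Function

    Sub DemoStripHtml()
        ' Prints: Hello world
        Debug.Print StripHtml("<p>Hello <b>world</b></p><script>x=1</script>")
    End Sub
    ```

    On a real news page the result would still mix the article with navigation and comment text, which is the cleanup work the reply above refers to.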

  5. #5
    Registered User

    Re: Text extraction from News Articles Into Excel

    Hey JasperD, I am having a problem extracting text from URLs obtained from Google Alerts. Maybe you could help. Here is the new thread I started:
    [link not shown]

  6. #6
    JasperD (Forum Expert)

    Re: Text extraction from News Articles Into Excel

    Norie's solution in that thread seems to do what you ask for just fine...
