Extracting/Exporting HTML Tables or PDF Tables into Excel

  1. #1

    Extracting/Exporting HTML Tables or PDF Tables into Excel

    Hi All,

    I have a quick programming/general Excel question. I am trying to
    extract financial tables from SEC filings (published as either PDF or
    HTML). Ideally I would like to be able to search an SEC filing for a
    specific table (e.g. a "Consolidated Income Statement") and then have
    a macro export that table into Excel without losing the formatting.
    If you have any idea how to go about this, and could provide some
    starter code, I would greatly appreciate it.

    Thanks

    Mohammed

    P.S. I know next to nothing about VB, so it would be helpful if you
    could explain what parameters I may need and what is going on in the
    code.


  2. #2
    Mo Money
    Guest

    Re: Extracting/Exporting HTML Tables or PDF Tables into Excel

    Hmm, I have no idea... but your question seems quite interesting.



  3. #3
    NickHK
    Guest

    Re: Extracting/Exporting HTML Tables or PDF Tables into Excel

    Mohammed,
    Reading the HTML files could be achieved with a web query. Look into
    Data>Get External Data>New Web Query and select the table to import.
    I've not looked into coding the PDF side, but getting data out of a
    PDF and into XL can be done manually:
    Open the PDF in Acrobat, NOT the Reader.
    Use the Select Table tool.
    Right-click and export or open in Excel, depending on your version of
    Acrobat.
    Or you can save the PDF as HTML, then web query that.
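
    The web query route can also be driven from a macro. The following is
    only a minimal sketch, not tested against a real filing: the URL is a
    placeholder and the table index "3" is an assumption (web queries
    number the tables on a page from 1, so you would need to find the
    right index for your page). WebFormatting is the setting that asks
    Excel to keep the page's formatting.

    ```vba
    Sub ImportWebTable()
        ' Hypothetical values: replace the URL and table index with your own.
        Const FILING_URL As String = "http://www.example.com/filing.htm"
        Dim qt As QueryTable

        Set qt = ActiveSheet.QueryTables.Add( _
            Connection:="URL;" & FILING_URL, _
            Destination:=ActiveSheet.Range("A1"))
        With qt
            .WebSelectionType = xlSpecifiedTables   ' import only the listed tables
            .WebTables = "3"                        ' assumed: third table on the page
            .WebFormatting = xlWebFormattingAll     ' keep the page's formatting
            .Refresh BackgroundQuery:=False         ' wait until the import is done
        End With
    End Sub
    ```

    Recording a macro while you set up the web query by hand should
    produce similar code with the correct table index filled in.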

    NickHK


  4. #4

    Re: Extracting/Exporting HTML Tables or PDF Tables into Excel

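    For HTML to Excel, you might consider using the following macro. It
    copies the text of every table on the page into the active sheet:
    ---------------------------------------------------------------
    Sub ImportHtmlTables()
        Dim objIE As Object, ws As Worksheet
        Dim theTables As Object, tbl As Object, tblRow As Object, cell As Object
        Dim sURL As String, RowNum As Long, ColNum As Long

        sURL = "http://www.ibm.com"
        Set ws = ActiveSheet
        On Error GoTo error_handler
        Set objIE = CreateObject("InternetExplorer.Application")
        With objIE
            .Navigate sURL
            ' Wait until the page has loaded (4 = READYSTATE_COMPLETE)
            Do While .Busy Or .ReadyState <> 4: DoEvents: Loop
            Set theTables = .Document.getElementsByTagName("table")
        End With
        RowNum = 1
        For Each tbl In theTables
            For Each tblRow In tbl.Rows
                ColNum = 1   ' start every table row in column A
                For Each cell In tblRow.Cells
                    ws.Cells(RowNum, ColNum).Value = cell.innerText
                    ColNum = ColNum + 1
                Next cell
                RowNum = RowNum + 1
            Next tblRow
        Next tbl
        objIE.Quit
        Set objIE = Nothing
        Exit Sub
    error_handler:
        MsgBox "Import failed: " & Err.Description
        On Error Resume Next   ' IE may already be gone
        objIE.Quit
        Set objIE = Nothing
    End Sub
    ---------------------------------------------------------------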
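
    To pick out one specific table (say the "Consolidated Income
    Statement") rather than every table on the page, one possible
    approach is to test each table's text for the title. This is only a
    sketch under assumptions: FindTableByText, CopyTableToSheet and
    DemoFindTable are names made up for illustration, the demo HTML is
    invented, and it only works when the title is inside the table
    itself. If the title sits in a paragraph above the table, this
    search will not find it.

    ```vba
    ' Returns the first <table> whose text contains searchText
    ' (case-insensitive), or Nothing if no table matches.
    Function FindTableByText(doc As Object, searchText As String) As Object
        Dim tbl As Object
        For Each tbl In doc.getElementsByTagName("table")
            If InStr(1, tbl.innerText, searchText, vbTextCompare) > 0 Then
                Set FindTableByText = tbl
                Exit Function
            End If
        Next tbl
        Set FindTableByText = Nothing
    End Function

    ' Writes the text of one table to the sheet, starting at topLeft.
    Sub CopyTableToSheet(tbl As Object, topLeft As Range)
        Dim tblRow As Object, cell As Object
        Dim r As Long, c As Long
        r = 0
        For Each tblRow In tbl.Rows
            c = 0
            For Each cell In tblRow.Cells
                topLeft.Offset(r, c).Value = cell.innerText
                c = c + 1
            Next cell
            r = r + 1
        Next tblRow
    End Sub

    Sub DemoFindTable()
        Dim doc As Object, tbl As Object
        ' "htmlfile" is an in-memory HTML document (Windows only).
        Set doc = CreateObject("htmlfile")
        ' Invented sample HTML, standing in for a downloaded filing.
        doc.body.innerHTML = _
            "<table><tr><td>Other table</td></tr></table>" & _
            "<table><tr><td>Consolidated Income Statement</td><td></td></tr>" & _
            "<tr><td>Revenue</td><td>100</td></tr></table>"
        Set tbl = FindTableByText(doc, "consolidated income statement")
        If tbl Is Nothing Then
            MsgBox "No matching table found."
        Else
            CopyTableToSheet tbl, ActiveSheet.Range("A1")
        End If
    End Sub
    ```

    With a live page you could pass objIE.Document in place of the demo
    document.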
    For PDF to Excel, there's no direct tool I could find, but you might
    try PDF->HTML->Excel.

    For PDF to HTML, you can use pdf2html, freely available on
    sourceforge.net
