+ Reply to Thread
Results 1 to 5 of 5

Import Tables from PDF into Excel

  1. #1
    Registered User
    Join Date
    03-12-2014
    Location
    Houston
    MS-Off Ver
    Excel 2010
    Posts
    52

    Import Tables from PDF into Excel

    I am trying to import multiple PDF tables into Excel but the table appear on different pages. In one pdf (2020.pdf), the table can appear on Page 1, in 2021.pdf the table appears on Page 3 and in 2022.pdf the table appears in page 5. In this case the table headers are the same. How can I use Excel VBA to find this table header and pull these tables into excel? I have tried Power Query but it did not work for me.

    I have attached a sample PDFs by putting the tables on random pages, and the excel "output.xls" sheet is result that I am looking for.

    Overall, I plan to read through a folder that has about 700 pdf files with these tables but these tables do no appear on specific pages.
    Thanks for the anticipated help.
    Attached Files Attached Files

  2. #2
    Registered User
    Join Date
    09-02-2012
    Location
    NYC
    MS-Off Ver
    Excel 2007
    Posts
    1

    Re: Import Tables from PDF into Excel

    Try

    let PullPDF = (variable)=> let
    Source = Pdf.Tables(File.Contents(variable), [Implementation="1.3"]),
    List = List.Union(List.Transform(Source[Data], each Table.ColumnNames(_))),
    #"Expanded Data" = Table.ExpandTableColumn(Source, "Data", List,List) ,
    Row=Table.RemoveFirstN(#"Expanded Data",List.PositionOf(#"Expanded Data"[Column2],"Volume")),
    #"Removed Other Columns" = Table.SelectColumns(Row,{"Column1", "Column2", "Column3", "Column4", "Column5", "Column6", "Column7"}),
    #"Promoted Headers" = Table.PromoteHeaders(#"Removed Other Columns", [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"A3 Unit", type text}, {"Volume", Int64.Type}, {"∆V", Int64.Type}, {"Method", type text}, {"Volume_1", Int64.Type}, {"∆V_2", Int64.Type}, {"Delta#(lf)Volume", Int64.Type}})
    in #"Changed Type",


    Source2 = Folder.Files("c:\temp5"),
    #"Filtered Rows" = Table.SelectRows(Source2, each ([Extension] = ".pdf")),
    #"Added Custom" = Table.AddColumn(#"Filtered Rows", "Data", each PullPDF([Folder Path]&[Name])),
    List = List.Union(List.Transform(#"Added Custom"[Data], each Table.ColumnNames(_))),
    #"Expanded Data" = Table.ExpandTableColumn(#"Added Custom", "Data", List,List)
    in #"Expanded Data"

  3. #3
    Registered User
    Join Date
    03-12-2014
    Location
    Houston
    MS-Off Ver
    Excel 2010
    Posts
    52

    Re: Import Tables from PDF into Excel

    How do I implement this? Do I put it in VBA?

  4. #4
    Forum Expert
    Join Date
    04-01-2013
    Location
    East Auckland
    MS-Off Ver
    Excel 365
    Posts
    1,343

    Re: Import Tables from PDF into Excel

    Horseyride is giving you a power query solution,
    A vba solution would likely require the acrobat library so maybe be is right that power query is going to be an easier solution.
    If you want something done right... find a forum and ask an online expert.

    Time flies like an arrow. Fruit flies like a banana.

  5. #5
    Registered User
    Join Date
    03-12-2014
    Location
    Houston
    MS-Off Ver
    Excel 2010
    Posts
    52

    Re: Import Tables from PDF into Excel

    Quote Originally Posted by horseyride View Post
    Try

    let PullPDF = (variable)=> let
    Source = Pdf.Tables(File.Contents(variable), [Implementation="1.3"]),
    List = List.Union(List.Transform(Source[Data], each Table.ColumnNames(_))),
    #"Expanded Data" = Table.ExpandTableColumn(Source, "Data", List,List) ,
    Row=Table.RemoveFirstN(#"Expanded Data",List.PositionOf(#"Expanded Data"[Column2],"Volume")),
    #"Removed Other Columns" = Table.SelectColumns(Row,{"Column1", "Column2", "Column3", "Column4", "Column5", "Column6", "Column7"}),
    #"Promoted Headers" = Table.PromoteHeaders(#"Removed Other Columns", [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"A3 Unit", type text}, {"Volume", Int64.Type}, {"∆V", Int64.Type}, {"Method", type text}, {"Volume_1", Int64.Type}, {"∆V_2", Int64.Type}, {"Delta#(lf)Volume", Int64.Type}})
    in #"Changed Type",


    Source2 = Folder.Files("c:\temp5"),
    #"Filtered Rows" = Table.SelectRows(Source2, each ([Extension] = ".pdf")),
    #"Added Custom" = Table.AddColumn(#"Filtered Rows", "Data", each PullPDF([Folder Path]&[Name])),
    List = List.Union(List.Transform(#"Added Custom"[Data], each Table.ColumnNames(_))),
    #"Expanded Data" = Table.ExpandTableColumn(#"Added Custom", "Data", List,List)
    in #"Expanded Data"
    WOW...I tried it in PowerQuery and it worked.

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. Import data to Word from Excel tables
    By 2pado in forum Excel General
    Replies: 0
    Last Post: 04-07-2023, 02:39 PM
  2. Import multiple excel tables at once.
    By kris01 in forum Word Formatting & General
    Replies: 2
    Last Post: 10-15-2020, 12:49 AM
  3. How to Import Data and Tables from Excel into a Word Doc
    By todd18us in forum Excel Programming / VBA / Macros
    Replies: 1
    Last Post: 07-23-2015, 10:00 AM
  4. Import tables from all Word documents within a folder into excel
    By Milk Snake in forum Excel Programming / VBA / Macros
    Replies: 1
    Last Post: 01-17-2014, 12:03 PM
  5. Import many excel files to many tables in one access database
    By boss1982 in forum Access Tables & Databases
    Replies: 3
    Last Post: 12-26-2012, 04:05 PM
  6. [SOLVED] Automating import of Word tables into Excel
    By Han in forum Excel General
    Replies: 0
    Last Post: 01-18-2006, 04:45 AM
  7. How do I import tables in pdf format into office excel 2003 ?
    By New Excel 2003 User in forum Excel General
    Replies: 1
    Last Post: 05-23-2005, 02:06 PM

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1