Hi,
I am tasked with converting hundreds of PDF pages from many files to Excel and creating a single database.
Can anyone tell me if Excel Power Query can help improve this process for me?
Attached is an example of what I am doing. The process is slow and error-prone.
Step No.1 => Sheet No.1 => I first convert each PDF file and this is the result.
(NB: A converted PDF file will have multiple pages converted to multiple sheets in Excel).
Step No.2 => Sheet No.2 => I use a combination of formulas and manual operations to normalize the data.
(I do this to every sheet generated by the PDF to Excel conversion program).
Step No.3 => Sheet No.3 => I create a Table for each sheet and then use Power Query to remove any empty rows and return a finished table.
(NB: The attached shows only a single-sheet example. Each PDF conversion will normally produce many sheets, so I also use
Power Query to "Append" all the sheets (tables) into one. Later I append the table(s) returned from Power Query into one large table.)
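To make Step No.3 concrete, here is a rough sketch of the logic in Python (not Power Query, and not what I actually run): drop the empty rows from each sheet, then append all the sheets into one table. The function names and sample data are made up for illustration only.

```python
from typing import List, Optional

Row = List[Optional[str]]
Table = List[Row]


def is_empty(row: Row) -> bool:
    # A row counts as empty when every cell is None or only whitespace.
    return all(cell is None or str(cell).strip() == "" for cell in row)


def drop_empty_rows(table: Table) -> Table:
    # Keep only the rows that contain at least one non-blank cell.
    return [row for row in table if not is_empty(row)]


def append_tables(tables: List[Table]) -> Table:
    # Clean each sheet, then stack the sheets in order into one table.
    combined: Table = []
    for table in tables:
        combined.extend(drop_empty_rows(table))
    return combined


# Hypothetical data standing in for two converted sheets.
sheet1 = [["A1", "10"], ["", None], ["A2", "20"]]
sheet2 = [[None, ""], ["B1", "30"]]

combined = append_tables([sheet1, sheet2])
print(combined)
```

This covers only the part that already works for me (empty-row removal and appending); it does not attempt the normalization in Step No.2.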
Step No.2 is the issue, and I would like to find out whether Power Query can do the work I am currently doing manually and with formulas.
This project is my first introduction to Power Query. Nothing I have found online so far provides examples similar to my situation.
Power Query seems very powerful and I am willing to learn how to use it. My concern is that it cannot do what I need for this task, which would leave me no
further ahead.
Regards,
Spiros