I am happy to join your group. I got into programming long ago, when most programmers came from non-programming backgrounds. I started 'programming' with Lotus 1-2-3 macros. I moved to dBase, then to an advanced form of dBase, Clipper. From Clipper I advanced to PowerBuilder, a client/server product. Currently I am a PL/SQL developer. I am not all that Access savvy, but I am willing to learn.
I have been assigned a yearly task. I copy hundreds of spreadsheets from CDs to a hard disk. I must reduce the spreadsheets to the 18 common, needed columns, then append them all into one long spreadsheet or Access table and export that to a CSV file. Each sheet was probably customized by a different person. The column count ranges from 18 to 30, and even the common columns may not have consistently named headers. They are formatted in too many ways to count, and many have protected cells that prevent an easy copy. I used a batch process that would bulk-export the spreadsheets to CSV files, but some columns contained so many commas that the export produced rows with too many columns. The same is true for Excel, which encloses text columns in double quotes only when it decides quoting is needed; that logic breaks down for some columns with many commas. Only Access, which applies no logic and consistently encloses every text column in double quotes, exported the data correctly. The extra columns normally appear after column 10 and before the last 8 columns.
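The Access behavior described above (quoting every text field regardless of content, so embedded commas can never be mistaken for delimiters) can be mimicked in most CSV libraries. A minimal illustrative sketch in Python, with made-up sample rows; the actual export here was done through Access, not code:

```python
import csv
import io

# Sample rows (hypothetical data) where text fields contain commas.
rows = [
    ["Acme, Inc.", "123 Main St, Suite 5, Bldg 2", "90210"],
    ["Widgets Co", "9 Elm St", "10001"],
]

buf = io.StringIO()
# QUOTE_ALL quotes every field, like Access, instead of only "when needed".
writer = csv.writer(buf, quoting=csv.QUOTE_ALL)
writer.writerows(rows)

# Reading the result back yields exactly 3 columns per row, commas intact.
buf.seek(0)
for row in csv.reader(buf):
    assert len(row) == 3
```

Because every field is quoted, a downstream parser never sees an ambiguous column count, which is exactly why the Access export succeeded where the "quote only if needed" exports broke.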
I have about a year before I get to do this again. So far, I can bulk-export the spreadsheets to CSV test files using a utility, then run them through a Clipper routine that counts column header names, counts rows, detects rows with too many commas, and creates a report with one row per spreadsheet. That way I can bulk-process the sheets in batches that all have the same number of columns. The CSV files created by the utility can't be used if the file contains any too-many-comma rows. I have another utility that processes the combined CSV: it creates new columns, checks for various errors, and removes the header rows. It checks some of the last columns that can be easily tested, such as a zip code column, and I can view all the rows that have non-numeric values for zip code. The errors should be header info or Canadian zips, and since these should be less than 1% a quick review works for me. I keep the headers until the last step, which allows me to reprocess any 'bad' spreadsheets manually and then cut and paste the corrections where they belong.
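The per-spreadsheet report the Clipper routine produces could be sketched along these lines. This is an illustrative Python version, not the actual routine; the `column_report` function name and the one-CSV-per-spreadsheet folder layout are assumptions:

```python
import csv
from pathlib import Path

def column_report(folder):
    """One report row per CSV file: file name, header column count,
    data row count, and how many data rows have a field count that
    differs from the header (the 'too many commas' rows)."""
    report = []
    for path in sorted(Path(folder).glob("*.csv")):
        with open(path, newline="", encoding="utf-8") as f:
            reader = csv.reader(f)
            header = next(reader, [])
            data_rows = bad_rows = 0
            for row in reader:
                data_rows += 1
                if len(row) != len(header):
                    bad_rows += 1
        report.append((path.name, len(header), data_rows, bad_rows))
    return report
```

Sorting the report by header column count would group the sheets into same-width batches, and a nonzero bad-row count flags the files that need manual reprocessing before the combined export.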
I would prefer to automate this process as much as I can. My first pass was completely manual, and I suspect I made mistakes I didn't catch; 20,000 rows made it difficult for both me and my clients to detect the remaining errors. I am using Office 2010. I can't find the 'about' information, but I guess it is the Professional version.
I did find this link: http://support.microsoft.com/kb/291296/en-us. For my money, if I were to do VB programming it would be in Access. I would like to process all the spreadsheets in a folder. For the spreadsheets with more than 18 columns, I would like to delete the extra columns. Even if I could only process the spreadsheets that have the same number of columns, that would be enough.
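The eventual automation would presumably be VBA in Access or Excel, but the trimming rule implied earlier (the extra columns appear after column 10 and before the last 8) can be stated simply: keep the first 10 fields and the last 8. A hypothetical Python helper, with the 10/8 split and the `trim_to_18` name as assumptions:

```python
def trim_to_18(row, keep_first=10, keep_last=8):
    """Drop the extra middle columns from one row of fields.

    Assumes the extra columns always fall after column 10 and before
    the last 8 columns; rows already at 18 or fewer fields pass
    through unchanged."""
    if len(row) <= keep_first + keep_last:
        return row
    return row[:keep_first] + row[-keep_last:]
```

Applied to every row of a 30-column sheet, this yields a uniform 18 columns, after which all the sheets can be appended and exported as one batch. The same slice-from-front, slice-from-back logic translates directly to a VBA loop that deletes the middle columns of each worksheet.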
If it could also export the tables to CSV, or anything else, that would be a bonus.
Thanks!