Hi everyone,
There are LOTS of posts regarding Excel date formatting (172 pages or so) but tons of skimming, I didn't see my situation--so apologies in advance if this is a duplicate thread.
I'm helping a nonprofit organization with their records and came across a VERY messy dataset for their donations. I can only imagine how many volunteers entered the data using their own data entry preferences.
This data has nearly 2400 rows, where each row is a donation and should have a date. While nearly all of the rows have a date, the formats vary widely. My goal is to get all the dates into some sort of common format and I can later use for sorting and in other functions. No luck after many hours of fiddling with it.
Any help is immensely appreciated! I've extracted the dates into a sample file and attached it here.
I have 4 categories of dates to get into a clean, common format.
1. Most of the dates were entered as MM-DD-YY (or YYYY), although Excel is reading them as DD-MM-YYYY. That isn't so bad to fix, but is complicated with the other date data entries.
2. Some date info looks numeric (MM-DD-YY), but is text. Again, not so bad to fix, but is troublesome given the other type of date data in the column.
3. Also, in many rows, the date info contains inconsistently formatted alphanumeric data that I haven't been able to nicely identify and convert.
4. Lastly, there are a few random rows that are not date information at all or has non-date text mixed with date info--the non-date text doesn't belong in the date column and needs its own column.
Below is an example of the messy dates with Cat. matching the 4 categories above and testing to see which rows are text v numeric data:
Cat. IsText Date
1 FALSE 04-10-06
2 TRUE 09-16-2004
1 FALSE 03-09-05
1 FALSE 08-03-06
3 TRUE Nov. ‘02
3 TRUE Jan/Feb 2011
2 TRUE 2/16/03
2 TRUE 2/22/04
1 FALSE 01-02-05
4 TRUE 8/12/04 (scholarship)
4 TRUE Paid
I tried:
Cell formatting features, but it didn't work for me due to the different text & numeric format types in each cell.
Text-to-Columns to help separate and format each piece, but again the mixed formats caused trouble.
Many text conversion formulas found on this and other forums, but most just produced #Value results.
Unfortunately, I don't know VBA, but I might be able to follow a solution that uses it.
Any thoughts on how this can be cleaned up? It would be great to have the dates in (any format) one column and any 'extra' non-date text in a separate column.
For rows that only show a year (or month & year), I don't mind using a 01 proxy for the missing day (or month & day).
For any instances that have more than one month listed, it doesn't matter to me which one is kept and the other can be dropped.
If you find a solution in the sample file, please keep the Count column so I can match the data to my master file.
Again, any thoughts are greatly appreciated! Thank you for the time and consideration.
Bookmarks