+ Reply to Thread
Results 1 to 11 of 11

How to clean up and format very messy date data (mix of text and date)

  1. #1
    Registered User
    Join Date
    07-10-2012
    Location
    Washington, DC, USA
    MS-Off Ver
    Excel 2016
    Posts
    13

    How to clean up and format very messy date data (mix of text and date)

    Hi everyone,

    There are LOTS of posts regarding Excel date formatting (172 pages or so) but tons of skimming, I didn't see my situation--so apologies in advance if this is a duplicate thread.

    I'm helping a nonprofit organization with their records and came across a VERY messy dataset for their donations. I can only imagine how many volunteers entered the data using their own data entry preferences.
    This data has nearly 2400 rows, where each row is a donation and should have a date. While nearly all of the rows have a date, the formats vary widely. My goal is to get all the dates into some sort of common format and I can later use for sorting and in other functions. No luck after many hours of fiddling with it.
    Any help is immensely appreciated! I've extracted the dates into a sample file and attached it here.

    I have 4 categories of dates to get into a clean, common format.
    1. Most of the dates were entered as MM-DD-YY (or YYYY), although Excel is reading them as DD-MM-YYYY. That isn't so bad to fix, but is complicated with the other date data entries.
    2. Some date info looks numeric (MM-DD-YY), but is text. Again, not so bad to fix, but is troublesome given the other type of date data in the column.
    3. Also, in many rows, the date info contains inconsistently formatted alphanumeric data that I haven't been able to nicely identify and convert.
    4. Lastly, there are a few random rows that are not date information at all or has non-date text mixed with date info--the non-date text doesn't belong in the date column and needs its own column.

    Below is an example of the messy dates with Cat. matching the 4 categories above and testing to see which rows are text v numeric data:
    Cat. IsText Date
    1 FALSE 04-10-06
    2 TRUE 09-16-2004
    1 FALSE 03-09-05
    1 FALSE 08-03-06
    3 TRUE Nov. ‘02
    3 TRUE Jan/Feb 2011
    2 TRUE 2/16/03
    2 TRUE 2/22/04
    1 FALSE 01-02-05
    4 TRUE 8/12/04 (scholarship)
    4 TRUE Paid

    I tried:
    Cell formatting features, but it didn't work for me due to the different text & numeric format types in each cell.
    Text-to-Columns to help separate and format each piece, but again the mixed formats caused trouble.
    Many text conversion formulas found on this and other forums, but most just produced #Value results.

    Unfortunately, I don't know VBA, but I might be able to follow a solution that uses it.

    Any thoughts on how this can be cleaned up? It would be great to have the dates in (any format) one column and any 'extra' non-date text in a separate column.
    For rows that only show a year (or month & year), I don't mind using a 01 proxy for the missing day (or month & day).
    For any instances that have more than one month listed, it doesn't matter to me which one is kept and the other can be dropped.
    If you find a solution in the sample file, please keep the Count column so I can match the data to my master file.

    Again, any thoughts are greatly appreciated! Thank you for the time and consideration.
    Attached Files Attached Files

  2. #2
    Forum Guru AlKey's Avatar
    Join Date
    07-20-2009
    Location
    Lakeland, FL USA
    MS-Off Ver
    Microsoft Office 2010/ Office 365
    Posts
    8,903

    Re: How to clean up and format very messy date data (mix of text and date)

    Enter formula in C2 and copy down
    Formula: copy to clipboard
    Please Login or Register  to view this content.
    If you like my answer please click on * Add Reputation
    Don't forget to mark threads as "Solved" if your problem has been resolved

    "Nothing is so firmly believed as what we least know."
    --Michel de Montaigne

  3. #3
    Forum Moderator
    Join Date
    01-21-2014
    Location
    St. Joseph, Illinois U.S.A.
    MS-Off Ver
    Office 365 v 2403
    Posts
    13,406

    Re: How to clean up and format very messy date data (mix of text and date)

    This is much longer. Attempts to remove additional characters.

    Formula: copy to clipboard
    Please Login or Register  to view this content.


    Filter on this column for 1905, 1900, etc. and do those manually. I found only 80.
    Dave

  4. #4
    Registered User
    Join Date
    07-10-2012
    Location
    Washington, DC, USA
    MS-Off Ver
    Excel 2016
    Posts
    13

    Re: How to clean up and format very messy date data (mix of text and date)

    Thank you very much to AlKey and FlameRetired for the assist. However, I my results from both solutions didn't quite match want I need. Perhaps I miscommunicated.

    In the example here, I only use the written month to illustrate the difference in the needed and previous solution dates.
    Help AlKey FlameRetired Reformatted What it should be Comments
    04-10-16 38994 38994 4-Oct-16 10-Apr-16 Month & day are flipped
    09-16-2004 42614 1-Sep-16 16-Sep-2004 Month & day are flipped, FlameRetired had the wrong year & AlKey had or no result/text conversion
    Nov. ‘02 37561 1-Nov-02 1-Nov-02 This one works. Thanks FlameRetired.
    Jan/Feb 2011 40575 1-Feb-11 1-Feb-11 or 1-Jan-11 This one works. Thanks FlameRetired.
    8/12/04 (scholarship) 38329 8-Dec-04 12-Aug-04 & "scholarship" in next column AlKey was blank/skipped; FlameRetired had Month & Day flipped and dropped the text
    Paid #N/A #N/A 1st col. blank, "Paid" in next col. AlKey was blank/skipped; FlameRetired was #N/A

    AlKey had a great first pass, while FlameRetired picked up many of the dates that were entered as text and extracting the text date that had other non-date text in the cell. However, FlameRetired's formula flipped the months & days for numeric dates, while working well for text dates. Both formulas either passed over or dropped non-date text.

    Quite the head scratcher, but I REALLY appreciate the help. Those are some clever formulas!
    Last edited by RUJedi; 08-04-2017 at 12:02 PM. Reason: Put example feedback into a table for easier viewing.

  5. #5
    Forum Moderator
    Join Date
    01-21-2014
    Location
    St. Joseph, Illinois U.S.A.
    MS-Off Ver
    Office 365 v 2403
    Posts
    13,406

    Re: How to clean up and format very messy date data (mix of text and date)

    RUJedi thank you for the detailed feedback. I/we really appreciate when the OP does that.

    I couldn’t get the days/months to flip at my end. Those were all correct

    I was able to take care of most of the year only “dates”. With this formula there are only 3 items left to clean up manually.
    Formula: copy to clipboard
    Please Login or Register  to view this content.

    I was not aware that you wanted formula to extract the “notes” part. Unfortunately this formula only returns a portion of the extra notes. Many of the strings are only partial. After applying this formula filtering and manually correcting those might be your only option.
    Formula: copy to clipboard
    Please Login or Register  to view this content.

  6. #6
    Registered User
    Join Date
    07-10-2012
    Location
    Washington, DC, USA
    MS-Off Ver
    Excel 2016
    Posts
    13

    Re: How to clean up and format very messy date data (mix of text and date)

    Thanks again FlameRetired.

    Looks like we're getting there, but I found a few bugs in your recent and previous formulas.
    The first one did something weird that I didn't catch on your initial reply. When a date was initially entered as numeric, your formula produced the right date. However, when it was entered as text, your formula treated the day as the year, dropped the actual year, and added a 1 for the day which was not there previously. Example below. I tinkered to change dash (-) and slash (/) separators in the source cell, but it didn't seem to make any difference in the result.

    Help Help type FR's forumula Reformatted What it should be Comments
    07-09-01 number 37141 07-Sep-01 07-Sep-01 This works
    03/30/05 text 11018 1-Mar-30 30-Mar-05 Used the day as the year, dropped the actual year, used 1 as the day which was not in the original date

    Your second formula did as great job extracting the non-date text from the date. I saw what you meant about sometime extracting only part of the text. There are not many cells that have both date and non-date text, so I should be able to manually correct them. Thanks! I might also have to manually correct a few instances of incorrect dates it produced, but that isn't a big deal. For reference of incorrect dates, it did this:
    Help FR's forumula Reformatted What it should be 2nd column Comments
    Deposited in Bratislava in 2001 2001 23-Jun-05 01-01-2001 in Bratislava in 2001 Looks like the formula found the year as text and put that as the unformatted date, which then made the wrong date when formatted. I can fix the 2nd column.

    I'll keep looking up if there are ways to recognize and fix the month & day flips. Thanks again.
    Last edited by RUJedi; 08-05-2017 at 07:46 PM. Reason: added column in table so show if text or number.

  7. #7
    Forum Moderator
    Join Date
    01-21-2014
    Location
    St. Joseph, Illinois U.S.A.
    MS-Off Ver
    Office 365 v 2403
    Posts
    13,406

    Re: How to clean up and format very messy date data (mix of text and date)

    RUJedi I am getting the right date on 03/30/05 at my end. See the attachment.

    I am not sure how to resolve special exceptions like Deposited in Bratislava in 2001 in formula. I've reached the point where solution of each exception defeats solutions already solved.

    Have a look at the attached.

  8. #8
    Registered User
    Join Date
    07-10-2012
    Location
    Washington, DC, USA
    MS-Off Ver
    Excel 2016
    Posts
    13

    Re: How to clean up and format very messy date data (mix of text and date)

    Thanks again FlameRetired. Your help is much appreciated.
    Did you get the right date for 3/30/05? I downloaded your file and the original data of 3/30/05 (March 30th 2005) appears on row 18, consistent with my sample file that I applied your formula. In your file, C18 shows 01-03-30 (March 1st 2030). Where else did you see the 03/30/05 converted properly?

    Curious--I notice your use of ROW($1:$30) nested within the larger date converting formula, although the formula was copied and operates down all 2373 rows. What was the purpose of the Row 1 through 30 range, but more specifically--why did you stop it at 30 rather than row 2373? At a first glance, using ROW$1$30 produces the same output as ROW$1$2373.

    No worries on extra solutions for the special exceptions. There are few enough that I should be able to fix them manually. Your formula to extract them in a separate column was the real heavy lifting.

    Thanks again!

  9. #9
    Forum Moderator
    Join Date
    01-21-2014
    Location
    St. Joseph, Illinois U.S.A.
    MS-Off Ver
    Office 365 v 2403
    Posts
    13,406

    Re: How to clean up and format very messy date data (mix of text and date)

    Where else did you see the 03/30/05 converted properly?
    In Count numbers 1345 and 1993.

    C18 shows 01-03-30 (March 1st 2030)
    I don't find 01-03-30 anywhere in the 'messy' dates column and there are no dates in column C in the year 2030. They only go as high as 11/2/2017 (Count number 5). Although I just noticed in Count number 497 that Nov. 02 failed to convert properly.

    Curious--I notice your use of ROW($1:$30) nested within the larger date converting formula, although the formula was copied and operates down all 2373 rows. What was the purpose of the Row 1 through 30 range, but more specifically--why did you stop it at 30 rather than row 2373?
    The ROW($1:$30) doesn't apply to spreadsheet rows. It applies to the num_chars argument

    LEFT(text, [num_chars])

    ROW($1:$30) returns multiple increasingly longer strings starting at one character until exhausting the length of the source text. I chose 30 because it was sufficiently large enough to exhaust the length of the text. Its value from there is arbitrary.

    I you type in a temporary cell E19 (Count 18) this formula you will see what I mean.

    =LEFT(B19,ROW($1:$30))

    Then click in the formula bar and press the F9 function key you will see ...
    {"3";"3/";"3/3";"3/30";"3/30/";"3/30/0";"3/30/05";"3/30/05";"3/30/05";"3/30/05";"3/30/05";"3/30/05";"3/30/05";"3/30/05";"3/30/05";"3/30/05";"3/30/05";"3/30/05";"3/30/05";"3/30/05";"3/30/05";"3/30/05";"3/30/05";"3/30/05";"3/30/05";"3/30/05";"3/30/05";"3/30/05";"3/30/05";"3/30/05"}

    Once converted to numbers

    {3;#VALUE!;42797;42824;#VALUE!;36615;38441;38441;38441;38441;38441;38441;38441;38441;38441;38441;38441;38441;38441;38441;38441;38441;38441;38441;38441;38441;38441;38441;38441;38441}

    LOOKUP ignores errors and returns the right most number in that array that is less than 1E306 (a sufficiently large number guaranteed to be larger than anything in the array).

    Hope that answers.

    Glad to hear these have worked out, and thank you for the feedback.

  10. #10
    Registered User
    Join Date
    07-10-2012
    Location
    Washington, DC, USA
    MS-Off Ver
    Excel 2016
    Posts
    13

    Re: How to clean up and format very messy date data (mix of text and date)

    Very clever use of the ROW$1:$30. Thanks for the detailed explanation.

  11. #11
    Forum Moderator
    Join Date
    01-21-2014
    Location
    St. Joseph, Illinois U.S.A.
    MS-Off Ver
    Office 365 v 2403
    Posts
    13,406

    Re: How to clean up and format very messy date data (mix of text and date)

    You're welcome. Thanks for the feedback.

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. Stop values from being entered and clean up messy code
    By cgibbons461 in forum Excel Programming / VBA / Macros
    Replies: 1
    Last Post: 01-31-2017, 11:02 AM
  2. [SOLVED] Dates in Text and Date format, need help converting all to date format...
    By Branbran10 in forum Excel Formulas & Functions
    Replies: 9
    Last Post: 03-14-2016, 12:21 PM
  3. Replies: 2
    Last Post: 05-23-2015, 08:27 AM
  4. [SOLVED] Converting Date Text to Number format to use in a VLOOKUP to polpulate data based by date
    By Rossovich in forum Excel Formulas & Functions
    Replies: 12
    Last Post: 04-08-2015, 10:43 AM
  5. Parse, Clean and Format Address Data from Text String
    By jerrydiaz in forum Excel Programming / VBA / Macros
    Replies: 12
    Last Post: 12-04-2013, 11:52 AM
  6. Cannot convert date (which is in text format) to date in numeric format
    By geniuspro in forum Excel Charting & Pivots
    Replies: 2
    Last Post: 10-19-2012, 09:21 AM
  7. [SOLVED] Keep Date in Date Text Format not Date Value
    By wonderfulle in forum Excel General
    Replies: 2
    Last Post: 08-10-2010, 07:44 PM

Tags for this Thread

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1