Hello,
I want a tool that identifies any matches/similarities from one list of names, to another list of names. This will be helpful to quickly compare raw data to what we have on file and recognize any duplicates/similarities.
Problem:
In a perfect world, the raw data (names) I receive comes spelled correctly and in a standard format: First Name + Last Name. This isn’t the case.
As you will see in my attached example, I will have two lists I want to compare: Correct Spelling and Raw Data. I want to ensure that any names under Raw Data will be highlighted/bolded to signify that we already have it on file.
This example shows that I don’t receive perfect raw data, in more ways than one. The data I receive sometimes has spelling mistakes and mixed formats, along with commas and hyphens.
I purposefully made only one record correct and identifiable (Sasha Vi) in the long list of ‘#NA’s. As for the other names, they are only partially similar, but I’m not sure how to make their similarities/differences quantifiable.
Solution Preference:
I prefer a solution where there’s a ‘percentage of accuracy/match’ next to the name under Raw Data. This percentage would then later be conditionally formatted by colour.
Research:
- I found an add-in for excel: ‘Fuzzy Match/Lookup’. I’m not sure I want to rely on this add-in. If this is the only solution to my problem, then I’ll look into it some more.
- I am familiar with VBA, so if that’s the solution, I would be happy to receive any coding ideas.
I hope there’s an easy fix. Or better yet, some work-arounds that I haven’t considered.
Thanks in advance for the help!
Bookmarks