Hi,
I'm doing name-matching work for my job and have looked into a number of fuzzy matching tools, but what I've anecdotally seen work best is a combination of name-type-specific "data cleaning", a word-order algorithm, and the Jaccard coefficient.
Together, these have produced matching confidence as high as 95% and no lower than 70%, which is very good for company name matching.
I hate to say it, but I've been working on this so long that I'm just too wiped to code a Jaccard coefficient right now.
So, here I am asking for help from my Excel Brothers-and-Sisters-in-Arms.
Does anyone here happen to have a Jaccard coefficient formula lying around somewhere that they wouldn't mind posting?
It'd be greatly appreciated!
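In case it helps to pin down what I mean: the Jaccard coefficient of two names is the number of words they share divided by the number of distinct words across both. Below is a rough Python sketch of that idea, not an Excel formula. The function name `jaccard`, the whitespace tokenization, the lowercasing, and returning 0 for two empty names are all my own assumptions for illustration.

```python
def jaccard(name_a: str, name_b: str) -> float:
    """Jaccard coefficient of two names, compared as sets of lowercase words."""
    # Assumption: words are separated by whitespace and case is ignored.
    tokens_a = set(name_a.lower().split())
    tokens_b = set(name_b.lower().split())

    union = tokens_a | tokens_b
    if not union:
        # Assumption: two empty names score 0 rather than raising an error.
        return 0.0

    # |A intersect B| / |A union B|
    return len(tokens_a & tokens_b) / len(union)


print(jaccard("Acme Widget Co", "Acme Widget Company"))  # 2 shared / 4 distinct = 0.5
```

Because the names are compared as sets, word order does not affect the score, which is why I pair it with a separate word-order step.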
Thanks,
rjw524