+ Reply to Thread
Results 1 to 22 of 22

Compare words between two cells and get matching percentage

  1. #1
    Registered User
    Join Date
    12-01-2006
    Posts
    55

    Question Compare words between two cells and get matching percentage

    Hi all,

    I have a column with 10,000+ lines, and lots of them are duplicated.
    I have a formula that tells me how many times they are duplicated, but it only looks for 100% matching cells.

    Now, lots of cells are partially duplicated, for example:

    A1 - Team Abcde
    A2 - Team Abcde Fghi
    B1 - Team Abc

    I would like to have a formula giving me a percentage of how two cells are similar to each others by checking the words.
    In this case, B1 is 50% similar to A1 because only the word 'Team' is duplicated.
    B1 is only 33,34% similar to A2.

    I hope any of you can give me a hand on this, thanks :-)
    Last edited by paokun; 04-26-2010 at 11:04 AM.

  2. #2
    Registered User
    Join Date
    04-29-2008
    Location
    Tennessee
    MS-Off Ver
    2007
    Posts
    37

    Re: Compare words between two cells and get matching percentage

    disregard this post...I had a formula here but it didn't work.
    Last edited by Rebuild8; 04-19-2010 at 10:39 AM.

  3. #3
    Registered User
    Join Date
    04-15-2010
    Location
    Greeley, PA
    MS-Off Ver
    Excel 2007
    Posts
    13

    Re: Compare words between two cells and get matching percentage

    Quote Originally Posted by paokun View Post
    Hi all,

    I have a column with 10,000+ lines, and lots of them are duplicated.
    I have a formula that tells me how many times they are duplicated, but it only looks for 100% matching cells.

    Now, lots of cells are partially duplicated, for example:

    A1 - Team Abcde
    A2 - Team Abcde Fghi
    B1 - Team Abc

    I would like to have a formula giving me a percentage of how two cells are similar to each others by checking the words.
    In this case, B1 is 50% similar to A1 because only the word 'Team' is duplicated.
    B1 is only 33,34% similar to A2.

    I hope any of you can give me a hand on this, thanks :-)

    I was able to do this with a Levenshtein formula. I borrowed the VBA algorithm from here:

    http://en.wikibooks.org/wiki/Algorit...htein_distance

    Insert a module into your workbook by hitting Alt-F11, right-click vbaproject, insert > module. Paste this code into the window:
    Please Login or Register  to view this content.
    Now you can use this formula to compare A1 and B1:

    Please Login or Register  to view this content.
    This might compare a bit more accurately than you were asking for, as it is on a character-by-character basis rather than word-by-word (among other things). This gives matches of 80% and 53.33%, but I think it should suit your purposes.

  4. #4
    Registered User
    Join Date
    12-01-2006
    Posts
    55

    Re: Compare words between two cells and get matching percentage

    Hi, thanks a lot for your reply!
    I've tried the formula and it works, thing is that when I have, for example:

    A1 - A look at the sea.
    B1 - kkkkkkkk kkkkkk kkk

    It gives me a 10.53% match, but I'd like to have 0%.

    Also I would prefer to check by words. For example:

    A1 - A look at the sea.
    B1 - Beach look at the a.

    Without comparing the words order, out of 5 words, only 4 match, so I'd like to have 80% as a result.

    Pretty complicated I guess, but since you are geniuses I'm sure it's worth asking.

  5. #5
    Registered User
    Join Date
    12-01-2006
    Posts
    55

    Re: Compare words between two cells and get matching percentage

    Bump.. anyone?

  6. #6
    Forum Guru DonkeyOte's Avatar
    Join Date
    10-22-2008
    Location
    Northumberland, UK
    MS-Off Ver
    O365
    Posts
    21,531

    Re: Compare words between two cells and get matching percentage

    I think you need to clarify what happens regards

    a) punctuation, eg A and a. are not a match
    b) case sensitivity (eg A and a are not a match per se)

    and what happens in the case of say a common word with differing frequencies, eg:

    A1: my the weather is lovely
    B1: my oh my what wonderful weather

    In this instance what is the % output ?

    Given B seemingly takes precedence in calculating % of the 6 words used in B1 3 are found, however, "my" appears with differing frequency - once in A1 and twice in B1.

  7. #7
    Registered User
    Join Date
    12-01-2006
    Posts
    55

    Re: Compare words between two cells and get matching percentage

    Hi,

    Thanks a lot for your reply and questions!

    a) punctuation, eg A and a. are not a match
    No need to consider punctuation.

    b) case sensitivity (eg A and a are not a match per se)
    'Aurora' and 'aurora' should be considered different words.

    Regarding the frequency words appear, for example:
    A1: my the weather is lovely
    B1: my oh my what wonderful weather

    Each repeated word can be considered only once. So, B1 has 5 words, two of which appear in A1 too, so the output should be 40%.

    Still, complicated, but I appreciate any help! Thanks a lot.

  8. #8
    Forum Guru DonkeyOte's Avatar
    Join Date
    10-22-2008
    Location
    Northumberland, UK
    MS-Off Ver
    O365
    Posts
    21,531

    Re: Compare words between two cells and get matching percentage

    The case sensitivity makes this a little more complex I think...

    Please Login or Register  to view this content.
    the above, stored in a module, could be called from a cell along the lines of:

    Please Login or Register  to view this content.
    this would generate the results outlined for your examples with the exception of:

    Quote Originally Posted by paokun
    A1 - A look at the sea.
    B1 - Beach look at the a.

    Without comparing the words order, out of 5 words, only 4 match, so I'd like to have 80% as a result.
    which given stipulated case sensitivity should actually return 60% given a <> A.
    The UDF has optional third parameter (Boolean) for Case Sensitivity - default being True - if explicitly set to False in the call you would get the 80%, eg:

    Please Login or Register  to view this content.
    I hope that helps.

    Note: in the UDF only a basic punctuation assumption is made (eg period delimiter) - pending complexity of your real strings you will most likely need to elaborate on this part.

  9. #9
    Registered User
    Join Date
    12-01-2006
    Posts
    55

    Re: Compare words between two cells and get matching percentage

    Wow, this is fantastic stuff!

    Thanks a lot for your quick reply, I really appreciate it.
    It works like a charm, and being able to use the 'false' tag is also extremely helpful.

    I've got one last question. Would it be possible to compare B1 with a range of cells in A, and show as a result the highest matching percentage?

    Edit: .. and show as a result the highest matching percentage and the best matching cell?

    For example:
    C1 - 86%
    D1 - my the weather is lovely (<- text content of the most similar cell)
    Last edited by paokun; 04-21-2010 at 08:38 AM.

  10. #10
    Registered User
    Join Date
    04-15-2010
    Location
    Greeley, PA
    MS-Off Ver
    Excel 2007
    Posts
    13

    Re: Compare words between two cells and get matching percentage

    Edit: Unnecessary.
    Last edited by SpeedingLunatic; 04-21-2010 at 11:27 AM.

  11. #11
    Forum Guru DonkeyOte's Avatar
    Join Date
    10-22-2008
    Location
    Northumberland, UK
    MS-Off Ver
    O365
    Posts
    21,531

    Re: Compare words between two cells and get matching percentage

    I'm not quite sure what you're asking for... if you want the UDF to accept multiple strings for comparative purposes then perhaps the below adaptation might help ?

    Please Login or Register  to view this content.
    the above is now called along the lines of:

    Please Login or Register  to view this content.
    to return the greatest percentage

    Please Login or Register  to view this content.
    to return the associated string

    In the case of matching "max" % the first string is returned and as such the above code would stop should an exact match be found - ie it won't iterate through additional cells unnecessarily.

    As before you have the optional Boolean as final parameter to denote case sensitivity (default is TRUE - ie Case Sensitive matching)


    (edit: and apologies for slight delay in responding - I tend to pop in and out of the forum in the afternoons)
    Last edited by DonkeyOte; 04-21-2010 at 10:20 AM.

  12. #12
    Registered User
    Join Date
    12-01-2006
    Posts
    55

    Re: Compare words between two cells and get matching percentage

    Hi, apologies from my side for the delay, I've been out on a business trip and didn't get a minute to access the forum.
    I've just tried your solution and it works like a charm, you're a genius! I'm impressed, just copying and pasting your code makes the trick.

    Thanks a lot, I really appreciate your big help! :-)

  13. #13
    Registered User
    Join Date
    12-01-2006
    Posts
    55

    Re: Compare words between two cells and get matching percentage

    Edit!

    If I compare exactly the same sentence:
    A1 - Mark scored an absolutely great point.
    B1 - Mark scored an absolutely great point.

    it gives me 83.33%. Removing the dot in the end would give me a 100% match.
    Would it be possible to have 100% even with the dots?

    Also, I've got some cells with line-breaks:
    A1 - Mark scored an absolutely
    great point.
    B1 - Mark scored an absolutely great point.

    Although the content is a 100%, because of the line-break the percentage goes down to 50%.
    Also, the result text I get in D1 is:

    D1 - Mark scored an absolutelygreat point.

    It would be best to have it on two lines like A1.

    Sorry to keep pestering you, but we're almost there!

  14. #14
    Forum Guru DonkeyOte's Avatar
    Join Date
    10-22-2008
    Location
    Northumberland, UK
    MS-Off Ver
    O365
    Posts
    21,531

    Re: Compare words between two cells and get matching percentage

    Going forward it would be a lot easier if you posted sample files so we can ensure we're both dealing with the same variations.

    Based on your post perhaps:

    Please Login or Register  to view this content.
    Note however that regards:

    Quote Originally Posted by paokun
    It would be best to have it on two lines like A1.
    this can only be achieved by altering the word-wrap setting on the cell itself (via Formatting) - once done the result will be per A1 (ie the line break is present it's simply not visible given the cell formatting) ... the UDF can not alter the cell's format.

  15. #15
    Registered User
    Join Date
    12-01-2006
    Posts
    55

    Re: Compare words between two cells and get matching percentage

    Thank you very much again, that was quick!
    Now it's perfect, it does exactly what I need.

    Thank you again very much for your big support :-)

  16. #16
    Forum Guru DonkeyOte's Avatar
    Join Date
    10-22-2008
    Location
    Northumberland, UK
    MS-Off Ver
    O365
    Posts
    21,531

    Re: Compare words between two cells and get matching percentage

    No problem - please remember to denote thread as solved if resolved to your satisfaction.

  17. #17
    Registered User
    Join Date
    09-07-2010
    Location
    Madrid, Spain
    MS-Off Ver
    Excel 2003
    Posts
    1

    Re: Compare words between two cells and get matching percentage

    Sorry, my mistake. This code is great.

    thanks a lot
    Last edited by dosbajistas; 09-07-2010 at 11:48 AM.

  18. #18
    Registered User
    Join Date
    05-19-2011
    Location
    Portland, OR
    MS-Off Ver
    Excel 2007
    Posts
    11

    Re: Compare words between two cells and get matching percentage

    Will the macro work regardless of which two columns you're comparing? If I wanted to get a % between two cells but have several columns in the table..

    Thank you in advance.

  19. #19
    Registered User
    Join Date
    05-25-2009
    Location
    Orlando
    MS-Off Ver
    Excel 2007
    Posts
    9

    Re: Compare words between two cells and get matching percentage

    please remove
    Last edited by stussy5555; 12-07-2016 at 03:46 PM.

  20. #20
    Administrator FDibbins's Avatar
    Join Date
    12-29-2011
    Location
    Duncansville, PA USA
    MS-Off Ver
    Excel 7/10/13/16/365 (PC ver 2310)
    Posts
    52,917

    Re: Compare words between two cells and get matching percentage

    Quote Originally Posted by stussy5555 View Post
    Not to hijack this thread, but .........
    we actually have a rule about that, please start your own thread and reference this 1 if necessary
    1. Use code tags for VBA. [code] Your Code [/code] (or use the # button)
    2. If your question is resolved, mark it SOLVED using the thread tools
    3. Click on the star if you think someone helped you

    Regards
    Ford

  21. #21
    Registered User
    Join Date
    05-25-2009
    Location
    Orlando
    MS-Off Ver
    Excel 2007
    Posts
    9

    Re: Compare words between two cells and get matching percentage

    Quote Originally Posted by FDibbins View Post
    we actually have a rule about that, please start your own thread and reference this 1 if necessary
    Sorry about that, started a new thread

  22. #22
    Registered User
    Join Date
    10-29-2015
    Location
    sydney
    MS-Off Ver
    2013
    Posts
    1

    Re: Compare words between two cells and get matching percentage

    this message has been removed
    Last edited by coop624; 10-10-2019 at 03:14 AM. Reason: starting new thread

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1