+ Reply to Thread
Results 1 to 7 of 7

Highlight and Tag Unique Instances

  1. #1
    Forum Contributor
    Join Date
    06-21-2010
    Location
    -
    MS-Off Ver
    Excel 2010
    Posts
    1,211

    Highlight and Tag Unique Instances

    I have a list of about 20,000 items in a single column. It is a concatenated list containing name, address, and bank account details. I need to try and find a way to highlight unique duplicate entries and then put ideally a sequential number in the adjoining column, so for example if I have in Column A say:

    ABC Company ABC Road 123456 98765432
    XYZ Windows Roman Road 456783 546512
    Rotten Green Tomatoes Tomatoe Avenue 458712 56128693
    Big Yellow Box Little Yellow Street 345612 56872523
    ABC Company ABC Road 123456 98765432
    Rotten Green Tomatoes Tomatoe Avenue 458712 56128693
    Big Blue Sky Blue Mews 451278 65329845

    I would want row 1 and 5 to have a highlight added (which I could obviously do using the highlight duplicate option), but what I want to do is to put the number 000001 in Column B against both instances of 'ABC Company ABC Road 123456 98765432' and then number 000002 in Column B against both instances of 'Rotten Green Tomatoes Tomatoe Avenue 458712 56128693' and so on, so I can isolate the duplicates.

    I also need to find a similar method to 'tag' partial matches, is this possible?

    Many thanks

  2. #2
    Forum Expert JBeaucaire's Avatar
    Join Date
    03-21-2004
    Location
    Bakersfield, CA
    MS-Off Ver
    2010, 2016, Office 365
    Posts
    33,492

    Re: Highlight and Tag Unique Instances

    Partial matches?? Eeek. Good luck with that!

    To "code" your like items, one way:

    1) Use the Advanced Filter on the data to create a UNIQUE List in another column, say column M.
    2) In column N add a "code" to index each item. You could put '000001 in N2 and then pull down the lower right corner to create a sequential list
    3) Now over in column B, use a VLOOKUP formula to get your "code" for each string. In B2: =VLOOKUP($A2, $M:$N, 2, 0)
    ...copy B2 downward.
    _________________
    Microsoft MVP 2010 - Excel
    Visit: Jerry Beaucaire's Excel Files & Macros

    If you've been given good help, use the icon below to give reputation feedback, it is appreciated.
    Always put your code between code tags. [CODE] your code here [/CODE]

    ?None of us is as good as all of us? - Ray Kroc
    ?Actually, I *am* a rocket scientist.? - JB (little ones count!)

  3. #3
    Forum Contributor
    Join Date
    06-21-2010
    Location
    -
    MS-Off Ver
    Excel 2010
    Posts
    1,211

    Re: Highlight and Tag Unique Instances

    Hi JBeaucaire,

    Thanks for that suggestion, I'll give that a go to see if it soes what I need it to!

    How about the Microsoft Fuzzy Logic Add In? I've not tried it as I'm running Excel 2007 and I think it only works with 2010 (I can't find a downloadable version for 2007). Do you or anyone know if that would help with both exact and partial matches?

  4. #4
    Forum Expert JBeaucaire's Avatar
    Join Date
    03-21-2004
    Location
    Bakersfield, CA
    MS-Off Ver
    2010, 2016, Office 365
    Posts
    33,492

    Re: Highlight and Tag Unique Instances

    Quote Originally Posted by HangMan View Post
    Do you or anyone know if that would help with both exact and partial matches?
    Based on what? This is a can of worms like no other. The human eye sees context that Excel simply can't. You have to explain everything. You see all these as equivalent:


    John Doe, 1234 Main Street, Sometown, Pennsylvania
    J Doe, Sometown
    J Williams Doe, PA
    Doe Residence, Main Street, PA, USA

    And this is just GLARING contextual comparisons. You're probably wanting even more surgical matches with only a few characters missing.

    Partial matches is not a trivial matter. You would have to create and solve every possible "scenario" in a macro trying to accomplish this, and I'm not up for that. Sorry.

  5. #5
    Forum Contributor
    Join Date
    06-21-2010
    Location
    -
    MS-Off Ver
    Excel 2010
    Posts
    1,211

    Re: Highlight and Tag Unique Instances

    I wasn't expecting you to even attempt to write a macro for this! That would truly be an almost impossible task I'm sure!

    I wasn't sure if Microsoft's own Fuzzy Logic Add In might do the trick, but as mentioned, I think it is only available for 2010, I'm running 2007, so I've not been able to test it and wondered if anyone could advise or has experience using it and whether it will do what I'm trying to achieve?

    I totally agree that the human eye sees context that Excel simply can't. My issue is that trying do do this manually for 20,000 records is incredibly time consuming. I made a start and have calculated that it would take around 150 hours to go through the whole list, which isn't very productive use of my time, so I was hoping to find some sort of automated method.

    Sure, I could identify 'exact matches', but there are so many factors involved from typo's to duplicates where the data doesn't always sit in the same column for what are effectively duplicate records, which only a critical eye could identify using normal means, hence the thoughts re Fuzzy Locgic...

    Any thoughts, input would be greatly received...

  6. #6
    Forum Expert JBeaucaire's Avatar
    Join Date
    03-21-2004
    Location
    Bakersfield, CA
    MS-Off Ver
    2010, 2016, Office 365
    Posts
    33,492

    Re: Highlight and Tag Unique Instances

    I don't, no. I'll draw some other eyes to see what others may offer that's helpful.

  7. #7
    Forum Guru TMS's Avatar
    Join Date
    07-15-2010
    Location
    The Great City of Manchester, NW England ;-)
    MS-Off Ver
    MSO 2007,2010,365
    Posts
    44,411

    Re: Highlight and Tag Unique Instances

    A couple of comments:

    1. Your example doesn't actually show any partial matches, unless I'm missing something.

    2. It looks like JB's suggestion would give you what you have asked for. But I don't see the point unless you're going to sort on the number. Even then, there's no partial matching.

    You might just as well sort by company and address and highlight (using Conditional Formatting) the first row when they are different.

    I think a sample workbook with some typical, if desensitised, data would be useful.


    Regards, TMS
    Trevor Shuttleworth - Retired Excel/VBA Consultant

    I dream of a better world where chickens can cross the road without having their motives questioned

    'Being unapologetic means never having to say you're sorry' John Cooper Clarke


+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1