+ Reply to Thread
Results 1 to 1 of 1

Finding almost duplicates

  1. #1
    Registered User
    Join Date
    12-17-2013
    Location
    Greater NYC Area
    MS-Off Ver
    Excel 2010
    Posts
    84

    Finding almost duplicates

    Good afternoon all,

    I have two big sets of data.

    1. lists all of our clients and includes all sorts of information including their name, address, contact information etc.
    2. lists a set of prospective clients that we get from an outside database.

    My job is to compare the lists in data 1 with data 2. If there are duplicates those are to be ignored, but if they are not already in data 1 then I need to pull their data and input them in to our system.

    The trick is that our naming conventions rarely match up. So company X is listed as Jones Farm in data 1, and Jones Farm, Inc. in data 2 and so on and so forth. Sometimes the naming is off by larger amounts because of abbreviations. So in data 1 there is "Jones Farm, Inc." and in data 2 "JF, Inc."

    So what I normally do is first do a vlookup to get exact matches and then mark them as such.
    Then I do a fuzzy lookup from this macro that was put up on an excel forum several years ago and use it to compare the name of data 1 with the name of data 2.

    That has been working alright for me, aside from being painfully slow, but I'm sure there is a better way to compare data. Fuzzylookup seems like it is geared towards typos in the two data sets or minor misspellings, not so much big abbreviations.

    Then I remembered I have all these addresses and other contact markers. Is there a way to use a different macro perhaps, or to narrow my fuzzy or vlookups to evaluate not only the names, but also incorporate the addresses? I know that the naming convention problem extends to addresses as well so it won't be a panacea, but some things are pretty reliable like state abbreviations. So could I develop a formula / macro that evaluates both the names, and then in order to look more specifically also looks at the corresponding address components, to find possible matches?

    I will try and build a template of my problem and upload it as soon as possible. For confidentiality reasons I need to make up all new names.

    Thanks

    *note my profile may have listed that I use Excel 2010, but here at work I'm stuck on Excel 2007

  2. #2
    Registered User
    Join Date
    12-17-2013
    Location
    Greater NYC Area
    MS-Off Ver
    Excel 2010
    Posts
    84

    Re: Finding almost duplicates

    *update I actually just upgraded to excel 2013. Are there any tools in 2013 that would work for this that weren't available in 2007?

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. [SOLVED] Help finding duplicates
    By gogz2005 in forum Excel General
    Replies: 10
    Last Post: 03-13-2014, 05:07 PM
  2. Replies: 2
    Last Post: 12-27-2013, 09:24 AM
  3. Finding Duplicates
    By kumark in forum Excel Programming / VBA / Macros
    Replies: 1
    Last Post: 06-06-2013, 03:37 AM
  4. Finding duplicates
    By prd689 in forum Excel Formulas & Functions
    Replies: 2
    Last Post: 09-18-2008, 12:17 PM
  5. Finding Duplicates
    By nospaminlich in forum Excel Formulas & Functions
    Replies: 4
    Last Post: 02-05-2005, 08:06 PM

Tags for this Thread

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1