+ Reply to Thread
Results 1 to 16 of 16

Finding closest match in large dataset

  1. #1
    Registered User
    Join Date
    09-02-2024
    Location
    Bristol, England
    MS-Off Ver
    365
    Posts
    9

    Question Finding closest match in large dataset

    Hi all,

    I'm new here and looking some advice on a formula for finding the `best match` in a large dataset. I've attached an example
    I have a sheet of 95,000 locations, each are given a score based on certain criteria (listed A-V). I need a formula whereby if a user types in the scores they want to see (in the green cells), cell A12 will tell them the location that best matches their requirements (which is the location that matches each score (values in A-V) within a tolerance of, ideally, but not essentially, 5%). The scores will never be exact, but what I'm looking for is the closest overall match

    Due to security restrictions, VBA or Add-ins can't be used

    Any help would be gratefully received

    Example.xlsx
    Last edited by neil_999; 09-03-2024 at 05:47 AM. Reason: Attachment added

  2. #2
    Forum Expert
    Join Date
    09-25-2015
    Location
    Milan Italy
    MS-Off Ver
    office 365
    Posts
    2,009

    Re: Finding closest match in large dataset

    , Fast answers need clear examples. Post a small Excel sheet (not a picture) showing realistic & representative sample data WITHOUT confidential information (10-20 rows, not thousands...) and some manually calculated results. For a new thread (1st post), scroll to Manage Attachments, otherwise scroll down to GO ADVANCED, click, and then scroll down to MANAGE ATTACHMENTS and click again. Now follow the instructions at the top of that screen.

  3. #3
    Forum Guru TMS's Avatar
    Join Date
    07-15-2010
    Location
    The Great City of Manchester, NW England ;-)
    MS-Off Ver
    MSO 2007,2010,365
    Posts
    48,946

    Re: Finding closest match in large dataset

    Please define "the location that best matches their requirements."
    Trevor Shuttleworth - Retired Excel/VBA Consultant

    I dream of a better world where chickens can cross the road without having their motives questioned

    'Being unapologetic means never having to say you're sorry' John Cooper Clarke


  4. #4
    Registered User
    Join Date
    09-02-2024
    Location
    Bristol, England
    MS-Off Ver
    365
    Posts
    9

    Re: Finding closest match in large dataset

    I've added the attachment and been more specific with the requirements

  5. #5
    Forum Moderator AliGW's Avatar
    Join Date
    08-10-2013
    Location
    Retired in Ipswich, Suffolk, but grew up in Sawley, Derbyshire (both in England)
    MS-Off Ver
    MS 365 Subscription Insider Beta Channel v. 2507 (Windows 11 Home 24H2 64-bit)
    Posts
    91,800

    Re: Finding closest match in large dataset

    There is NO explanation of the values in pink. Are these lookup values or expected results? If the latter, how did you arrive at them? If the former, what would the expected results be?
    Last edited by AliGW; 09-03-2024 at 06:22 AM. Reason: Typos fixed.
    Ali


    Enthusiastic self-taught user of MS Excel who's always learning!
    Don't forget to say "thank you" in your thread to anyone who has offered you help. It's a universal courtesy.
    You can reward them by clicking on * Add Reputation below their user name on the left, if you wish.

    NB:
    as a Moderator, I never accept friendship requests.
    Forum Rules (updated August 2023): please read them here.

  6. #6
    Registered User
    Join Date
    09-02-2024
    Location
    Bristol, England
    MS-Off Ver
    365
    Posts
    9

    Re: Finding closest match in large dataset

    The % values in pink are the lookup values (against the others). The expected results would be the location number in pink where the lookup values most closely match collectively

  7. #7
    Forum Moderator AliGW's Avatar
    Join Date
    08-10-2013
    Location
    Retired in Ipswich, Suffolk, but grew up in Sawley, Derbyshire (both in England)
    MS-Off Ver
    MS 365 Subscription Insider Beta Channel v. 2507 (Windows 11 Home 24H2 64-bit)
    Posts
    91,800

    Re: Finding closest match in large dataset

    Mmm. I think you need to provide a couple more examples. It would help if you highlighted in the main table which is the 'best fit' in each column and explain how you'd break a tie.

  8. #8
    Forum Expert
    Join Date
    01-05-2013
    Location
    Singapore
    MS-Off Ver
    H&B2019
    Posts
    4,542

    Re: Finding closest match in large dataset

    One way I will do is to find the aggregate of AVERAGE, MAX, MIN and MEDIAN, and match the lowest aggregate to the location. Please refer formulas highlighted in yellow.
    Attached Files Attached Files

  9. #9
    Forum Expert
    Join Date
    09-01-2012
    Location
    Norway
    MS-Off Ver
    Office 365
    Posts
    2,887

    Re: Finding closest match in large dataset

    I just did the least square sum thing. No idea how valid that is.
    Attached Files Attached Files
    <----- If you were helped by my posts you can say "Thank you" by clicking the star symbol down to the left

    If the problem is solved, finish of the thread by clicking SOLVED under Thread Tools
    I don't wish to leave you with no answer, yet I sometimes miss posts. If you feel I forgot you, remind me with a PM or just bump the thread.

  10. #10
    Forum Expert
    Join Date
    09-01-2012
    Location
    Norway
    MS-Off Ver
    Office 365
    Posts
    2,887

    Re: Finding closest match in large dataset

    I ran your method against mine, josephteh. Interestingly they sometimes produce the opposite results. There seems to be almost no correlation between these two methods.

  11. #11
    Forum Expert
    Join Date
    09-01-2012
    Location
    Norway
    MS-Off Ver
    Office 365
    Posts
    2,887

    Re: Finding closest match in large dataset

    After googling this for several seconds I realized that weighted least square is probably the way to go. The plain least square that I posted will be dominated by the large numbers and small numbers will be more or less ignored.
    Weighting the data should even out that difference. I'll leave this to people who actually knows statistic.

  12. #12
    Registered User
    Join Date
    09-02-2024
    Location
    Bristol, England
    MS-Off Ver
    365
    Posts
    9

    Re: Finding closest match in large dataset

    This is amazing, thank you! Some great ideas

  13. #13
    Forum Expert
    Join Date
    01-05-2013
    Location
    Singapore
    MS-Off Ver
    H&B2019
    Posts
    4,542

    Re: Finding closest match in large dataset

    Jacc's solution is definitely better. Least square or weighted least square is definitely the way to go. Btw, how will you calculate weighted least square, Jacc?

    I made a mistake with my method.. ABS should apply to MIN as well.

    Actually, Excel has a function for this thing: CORREL.

    Here is a Rep for you, Jacc!
    Attached Files Attached Files
    Last edited by josephteh; 09-03-2024 at 08:08 PM.

  14. #14
    Forum Expert
    Join Date
    09-01-2012
    Location
    Norway
    MS-Off Ver
    Office 365
    Posts
    2,887

    Re: Finding closest match in large dataset

    Thanks josephtet! Nice comparison of methods!
    As for the weighted least square I don't know how to calculate that. I'm not even sure that is correct.
    I was hoping someone with a proper statistical background would show up in this thread and enlighten us.
    Last edited by Jacc; 09-04-2024 at 04:43 AM.

  15. #15
    Forum Expert
    Join Date
    09-01-2012
    Location
    Norway
    MS-Off Ver
    Office 365
    Posts
    2,887

    Re: Finding closest match in large dataset

    I just looked at it once more, your comparison matrix with good match highlighted in green might has me wondering. Should the comparison be made by division instead of subtraction? It would be difficult to match smaller numbers otherwise.

  16. #16
    Forum Expert
    Join Date
    01-05-2013
    Location
    Singapore
    MS-Off Ver
    H&B2019
    Posts
    4,542

    Re: Finding closest match in large dataset

    Here you go, Jacc! Subtraction or division do not seem to make any difference. Better to stick with CORREL or your least square solution.

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. [SOLVED] Formula to match certain data in large dataset
    By EvansB in forum Excel General
    Replies: 8
    Last Post: 09-21-2023, 03:09 AM
  2. Index match equivalent in VBA for large dataset?
    By jhuvba in forum Excel Programming / VBA / Macros
    Replies: 8
    Last Post: 04-26-2018, 11:49 PM
  3. Replies: 3
    Last Post: 10-20-2017, 12:40 AM
  4. VBA code to replace INDEX/MATCH in large dataset
    By iantix in forum Excel Formulas & Functions
    Replies: 33
    Last Post: 07-25-2017, 11:23 PM
  5. Finding similar issues in a large excel dataset
    By Mary22 in forum Excel General
    Replies: 4
    Last Post: 07-01-2016, 01:53 AM
  6. [SOLVED] Finding ALL CAPS in large dataset
    By renee705 in forum Excel Formulas & Functions
    Replies: 4
    Last Post: 07-30-2012, 02:35 PM
  7. Finding first and bottomlast years in large dataset
    By laurafv in forum Excel General
    Replies: 2
    Last Post: 02-28-2011, 09:55 AM

Tags for this Thread

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1