+ Reply to Thread
Results 1 to 24 of 24

Finding similar cities based on multiple criteria

  1. #1
    Forum Contributor
    Join Date
    06-30-2018
    Location
    Sweden
    MS-Off Ver
    2016.
    Posts
    397

    Finding similar cities based on multiple criteria

    I have an example data table with cities and I want to find similar cities based on multiple criteria. I have no idea where to start but I think that correlation might be important. While I am writing this I am thinking about vlookup with an approximative match?

    The criteria could be a number of variables like population, number of tourists, number of sport arenas, number of cars or pretty much anything.

    I would like to say that City1 is similar to City 39 and City 39 is similar to City 45 based on the criteria.

    All suggestions are welcome on how I can achieve this result! A good start would be to see if similarity based on population can be achieved?

    Ex.

    City 1 has 100 000 people and City 2 has 50 000 people and City 3 has 95 000 people. City 1 and City3 would be similar based on number of people.

    See attached file for more details!
    Attached Files Attached Files

  2. #2
    Forum Contributor
    Join Date
    06-30-2018
    Location
    Sweden
    MS-Off Ver
    2016.
    Posts
    397

    Re: Finding similar cities based on multiple criteria

    Any suggestions on how I should proceed? What formulas I could try to use?

    I have calculated cosine similarity for my cities but I don't know how to interpret the results.

    How should I interpret a cosine similarity score of 0,676881937 and a distance of 0,32 compared to a cosine similarity score of 0,413100032 and a distance of 0,59?

  3. #3
    Forum Expert dflak's Avatar
    Join Date
    11-24-2015
    Location
    North Carolina
    MS-Off Ver
    365
    Posts
    7,920

    Re: Finding similar cities based on multiple criteria

    Here is my take on it.

    Way over in columns Y-AA, set up the parameters for "similar." In this case, I set everything to be within +/- 25% of the selected city.

    Column V is merely a list of the cities. It's a pivot table overlaid with a name dynamic range called List_City. It's only purpose is to drive the drop-down list in cell B1.

    Column Q is another pivot table with an overlaid named dynamic range: List_Criteria. This is where you select the criteria by which you want to evaluate the cities. It is an AND condition. In this case, the population, number of tourists, and number of cars must all be within 25% for the two cities to be similar. Other criteria do not count.

    Cell C1 has the formula: =MATCH($B$1,Table_Cities[Name],0) - this formula finds the row number on which the selected city is in the City Table.

    Now we get down to the helper columns in columns H:M, All of these have a similar formula:
    =IF(ISNUMBER(MATCH("Population",List_Criteria,0)),AND([@Population]>=INDEX(Table_Cities,$C$1,2)*INDEX(Table_Criteria,1,2),[@Population]<=INDEX(Table_Cities,$C$1,2)*INDEX(Table_Criteria,1,3)),TRUE)

    It makes sense if we take it piece by piece.
    INDEX(Table_Cities,$C$1,2) - this gets the population of the selected city.
    INDEX(Table_Criteria,1,2) - this gets the lower limit for the metric
    INDEX(Table_Criteria,1,3) - this gets the upper limit for the metric

    So the inside part of the formula is evaluating whether the population for the selected city is within 25% of the population of the city on the individual rows.

    All this is wrapped in an if statement:
    IF(ISNUMBER(MATCH("Population",List_Criteria,0))
    This part of the formula evaluates if Population is one of the selected metrics. If so, then Population needs to be evaluated by the criteria. If the metric is not selected, then this statement is false; the metric does not matter and should not be considered as a limiting factor. So, make the cell TRUE and skip the evaluation.

    Column N has the formula: =AND(Table_Cities[@[C Population]:[C Other 2]]) - If all of the selected metrics are true (all of the non-selected metrics are automatically true) then this is a matching city.

    I applied some conditional formatting to highlight the rows. Naturally, a city matches itself.

    You can hide all of the helper cells.
    Attached Files Attached Files
    One spreadsheet to rule them all. One spreadsheet to find them. One spreadsheet to bring them all and at corporate, bind them.

    A picture is worth a thousand words, but a sample spreadsheet is more likely to be worked on.

  4. #4
    Forum Contributor
    Join Date
    06-30-2018
    Location
    Sweden
    MS-Off Ver
    2016.
    Posts
    397

    Re: Finding similar cities based on multiple criteria

    Hi dflak,

    thank you for your excellent reply. There are a lot of pieces to understand but your take on it is really impressive and it is clever that you define what is "similar".

    I am going to see if I can replicate your take on it in my own version this evening. One similarity to rule them all.

  5. #5
    Forum Contributor
    Join Date
    06-30-2018
    Location
    Sweden
    MS-Off Ver
    2016.
    Posts
    397

    Re: Finding similar cities based on multiple criteria

    I think that this workbook shows the 5 most similar and the 5 least similar cities in my data table based on 7 metrics.

    I am not sure on how to interpret the cosine similarity and the distance?
    Attached Files Attached Files

  6. #6
    Forum Expert dflak's Avatar
    Join Date
    11-24-2015
    Location
    North Carolina
    MS-Off Ver
    365
    Posts
    7,920

    Re: Finding similar cities based on multiple criteria

    I have no idea what is going on with that file.

    However, it does bring to my mind another approach which is less binary (it either matches or it doesn't). That is to compute the sum of the squares of the differences of all possible matches - those with the smallest sum are most similar, those with the largest are most unlike. However, this brings up the question as to if any of the criteria should have more weight than the others.

  7. #7
    Forum Contributor
    Join Date
    06-30-2018
    Location
    Sweden
    MS-Off Ver
    2016.
    Posts
    397

    Re: Finding similar cities based on multiple criteria

    That is a good question and I don't know if I should weight the criteria and what the weights should be.

    If you are reading this thread, please take a look at my top5 file and see if you can help me figure out what is going on in the file.

  8. #8
    Forum Contributor
    Join Date
    06-30-2018
    Location
    Sweden
    MS-Off Ver
    2016.
    Posts
    397

    Re: Finding similar cities based on multiple criteria

    I am still trying to figure out exactly what I am calculating and what it means in plain English.

    Any suggestions?

  9. #9
    Forum Expert dflak's Avatar
    Join Date
    11-24-2015
    Location
    North Carolina
    MS-Off Ver
    365
    Posts
    7,920

    Re: Finding similar cities based on multiple criteria

    When you do the least squares fit of a curve in regression analysis, what you are trying to do is minimize the some of the squares of the difference between the observed values and those computed by coordinates in the equation. I am using similar logic here. Once again I've arbitrarily set the criteria in a separate table. I assigned weights based on my personal preferences.

    Select a city in cell B1. The formulas are set up to compute the sum of the squares of the differences accounting for the weighting factors. The formulas only consider those metrics selected in Column J.

    Right click in either of the pivot tables for Best / Least Match - these tables provide the top / bottom 5 cities.

    Interestingly, this model and the other model yield similar results.
    Attached Files Attached Files

  10. #10
    Forum Contributor
    Join Date
    06-30-2018
    Location
    Sweden
    MS-Off Ver
    2016.
    Posts
    397

    Re: Finding similar cities based on multiple criteria

    Thank you for your reply, your 2nd take on it is interesting and I am looking at it right now.

    It going to take some time to figure out how you did it!

  11. #11
    Forum Contributor
    Join Date
    06-30-2018
    Location
    Sweden
    MS-Off Ver
    2016.
    Posts
    397

    Re: Finding similar cities based on multiple criteria

    I have read up on cosine similarity and it seems to be what I want, in my file top5.xlxs I am manually calculating cosine similarity.


    I am asking for your help with how to automatically:


    1. Calculate the dot product of two arrays with X number of rows and Y number of columns

    Please Login or Register  to view this content.
    B2:H2 is the baseline and I want to be able to select B to X columns by using a InputBox or an array with the desired columns.


    2. Calculate the square root of sum of squares

    Please Login or Register  to view this content.

    3. Calculate the cosine similarity

    Please Login or Register  to view this content.

    4. Calculate the cosine distance

    Please Login or Register  to view this content.

    5. Copy and paste the top5 and the bottom5 values to a another spreadsheet in the same workbook

    Please Login or Register  to view this content.
    Please Login or Register  to view this content.

    I don't understand how to go from formulas in my spreadsheet to VBA code and all help is very much appreciated.

  12. #12
    Forum Contributor
    Join Date
    06-30-2018
    Location
    Sweden
    MS-Off Ver
    2016.
    Posts
    397

    Re: Finding similar cities based on multiple criteria

    I am trying to create this myself but I don't know where to start?

    Maybe with a fixed number of columns and rows?

    All suggestions are welcome!

  13. #13
    Forum Contributor
    Join Date
    06-30-2018
    Location
    Sweden
    MS-Off Ver
    2016.
    Posts
    397

    Re: Finding similar cities based on multiple criteria

    Please Login or Register  to view this content.
    This is the code I have right now. I am trying to loop through a range and then put randbetween in the cells in the range.

    All help is very much appreciated!

  14. #14
    Forum Contributor
    Join Date
    06-30-2018
    Location
    Sweden
    MS-Off Ver
    2016.
    Posts
    397

    Re: Finding similar cities based on multiple criteria

    Anyone? I am really struggling with this!

  15. #15
    Forum Contributor
    Join Date
    06-30-2018
    Location
    Sweden
    MS-Off Ver
    2016.
    Posts
    397

    Re: Finding similar cities based on multiple criteria

    Please Login or Register  to view this content.
    How can I improve this code to automatically figure out the number of columns that is being used to insert my formulas in the right adjacent column?

  16. #16
    Forum Contributor
    Join Date
    06-30-2018
    Location
    Sweden
    MS-Off Ver
    2016.
    Posts
    397

    Re: Finding similar cities based on multiple criteria

    Now I have a working macro for my file but I want to loop through a range of 1600 rows in steps of 40 and for every 40 line change the range of the formulas.

    The formulas are the same but the range of the formulas changes.

    I have tried to do it manually but there must be a better way??

  17. #17
    Forum Expert dflak's Avatar
    Join Date
    11-24-2015
    Location
    North Carolina
    MS-Off Ver
    365
    Posts
    7,920

    Re: Finding similar cities based on multiple criteria

    I suspect that this could be done with Tables and/or the MOD formula. However, I can't really tell without seeing the file. Post a copy of the workbook and maybe we can help you out.

  18. #18
    Forum Contributor
    Join Date
    06-30-2018
    Location
    Sweden
    MS-Off Ver
    2016.
    Posts
    397

    Re: Finding similar cities based on multiple criteria

    Hi dflak, thank you for your reply.

    I am trying to automate the formulas used in the calculation, one thing that I am thinking about is the placement of the baseline city.

    It starts with city1 as the baseline, then city2 as the baseline all the way down to city40 as the baseline. This spreadsheet has 1600 rows + header but I also have a spreadsheet with 3481 + header rows.

    I think that it would be easiest to copy the current baseline city to the top of current batch of 40.

    Because then the formulas would be the same all the time and only the range changes?
    Attached Files Attached Files

  19. #19
    Forum Contributor
    Join Date
    06-30-2018
    Location
    Sweden
    MS-Off Ver
    2016.
    Posts
    397

    Re: Finding similar cities based on multiple criteria

    Perhaps I could use OFFSET to offset the formulas since they change with the same 40 or X batches?

  20. #20
    Forum Contributor
    Join Date
    06-30-2018
    Location
    Sweden
    MS-Off Ver
    2016.
    Posts
    397

    Re: Finding similar cities based on multiple criteria

    I am still trying to loop through these formulas, perhaps someone could show me how to loop through 40 items in a time and change the formula from one range to another.

    I think that I need a For loop and string concatenation? And perhaps offset to offset the range from column P to column T?

    Please Login or Register  to view this content.
    This gives me an error at
    Please Login or Register  to view this content.
    Last edited by waimea; 12-18-2018 at 05:52 AM.

  21. #21
    Forum Contributor
    Join Date
    06-30-2018
    Location
    Sweden
    MS-Off Ver
    2016.
    Posts
    397

    Re: Finding similar cities based on multiple criteria

    I am still working on this withous success.

    Perhaps a combination of Resize and IF(AND formula could work??

    Please Login or Register  to view this content.

  22. #22
    Forum Expert dflak's Avatar
    Join Date
    11-24-2015
    Location
    North Carolina
    MS-Off Ver
    365
    Posts
    7,920

    Re: Finding similar cities based on multiple criteria

    Of the things you are trying, offset offers the most hope.

    http://www.utteraccess.com/wiki/Offs...Dynamic_Ranges

  23. #23
    Forum Contributor
    Join Date
    06-30-2018
    Location
    Sweden
    MS-Off Ver
    2016.
    Posts
    397

    Re: Finding similar cities based on multiple criteria

    Thanks for your reply and I'll read up on your link!

  24. #24
    Forum Contributor
    Join Date
    06-30-2018
    Location
    Sweden
    MS-Off Ver
    2016.
    Posts
    397

    Re: Finding similar cities based on multiple criteria

    Is there anything bad or wrong with using a lot of custom sorts?

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. Finding Matching Values from Multiple Columns Based on Criteria
    By rmmohan in forum Excel Formulas & Functions
    Replies: 10
    Last Post: 08-29-2017, 03:37 PM
  2. HELP with Finding similar values in same date range -- Multiple Criteria
    By yessuz in forum Excel Formulas & Functions
    Replies: 3
    Last Post: 01-08-2016, 04:33 AM
  3. Replies: 1
    Last Post: 04-29-2014, 09:06 PM
  4. [SOLVED] Help regarding finding a time based on multiple criteria
    By wassm in forum Excel Formulas & Functions
    Replies: 5
    Last Post: 04-15-2014, 07:51 PM
  5. [SOLVED] Finding Nth ranking value based on multiple criteria
    By scaffdog845 in forum Excel Formulas & Functions
    Replies: 18
    Last Post: 04-26-2013, 01:00 PM
  6. Finding the lowest value based on multiple text criteria
    By meherenow9 in forum Excel General
    Replies: 14
    Last Post: 09-16-2011, 02:01 PM
  7. Finding top 2 values per group based on multiple criteria
    By schuc in forum Excel Formulas & Functions
    Replies: 9
    Last Post: 05-22-2007, 10:51 PM

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1