This is my first time using Excel for much besides sorting data, and I figure this has got to be possible, but I have no idea how.
I have a data set that looks something like this:
a| b | c | d
33 | 102 | 1 | 1
34 | - | 2 | 2
119 | - | 33 | 6
- | - | 119 | 33
- | - | - | 119
Where the column header represents a word, and the numbers represent the cities where the word is found. The number of cities connected to a word varies from 1 to 50+. There's about a thousand words.
I'm trying to find which words share the greatest number of cities - that is, the columns with the most cells in common. Ideally, I'd like to calculate this at different thresholds (columns with 2+ cells in common, 3+, etc.)
It'd also be cool to do additional queries, like "for columns that share cell X, what other cells do they share?" But that might be too much to ask for...
Any help would be wonderful!
Bookmarks