# Generate a "similarity" matrix on non-continuous values (not numbers)?

1. ## Generate a "similarity" matrix on non-continuous values (not numbers)?

Dear all,

I have this matrix of non-continuous, independent data (let's call them "a", "b", "c")

``Please Login or Register  to view this content.``
Values are not continous numbers, or measures, but more of a label for each "Variable". Additionally, "a" in Variable 1 does not relate to "a" in Variable 2.

I would like a way to assess similarity (=shared values) between "Samples". I don't care to know which "Variable" is similar or different between the two "Samples", just the number of shared values is fine.

For the example above, we see that:
- Sample 1 has 2 shared value with Sample 3 (for Variable 2)
- Sample 2 has 0 shared value with Sample 1 for any variable
- Sample 3 has 0 shared values with Sample 2
- Each sample has 3 shared values with itself

In that example, Sample 1 and 3 are more similar to each other than Sample 2 (if we exclude self-similarity).

I guess a good way of outputing this is to create a "similarity" matrix:

Shared values:

``Please Login or Register  to view this content.``
Does it sound like something possible to do in Excel?

Thanks a lot for any help. I hope I was clear enough!

All the best,

G.

2. ## Re: Generate a "similarity" matrix on non-continuous values (not numbers)?

Try this where your data table (including headers is in H1:K5) and your comparision table is in A1:E5

In B2
=SUM(1*(INDEX(\$I\$2:\$K\$5,MATCH(B\$1,\$H\$2:\$H\$5,0),)=INDEX(\$I\$2:\$K\$5,MATCH(\$A2,\$H\$2:\$H\$5,0),))) entered as an array (CNTRL SHFT ENTER instead of ENTER)
Drag across and down

See attachment

3. ## Re: Generate a "similarity" matrix on non-continuous values (not numbers)?

It works perfectly, thanks a lot!

4. ## Re: Generate a "similarity" matrix on non-continuous values (not numbers)?

Thanks for the solution indeed !

I am now struggling to adapt/expand the formula provided by ChemistB using a condition -> something like: "if the two cells that are compared are empty, don't count them as similar"...

See the objective in attachment: ComparisonTable_v2.xlsx

Any help would be much appreciated

Cheers,

Y.

yomlao,

6. ## Re: Generate a "similarity" matrix on non-continuous values (not numbers)?

Dear all,

I have previously solved my initial problem using the answer above, but I have the same related question as #4, so I post it here.

The formula provided in #2 works perfectly with the exception that empty cells are considered to be identical. As the poster in #4, I am also struggling to adapt the formula to not take into account the empty cells and would be grateful for some help in this.

G.

7. ## Re: Generate a "similarity" matrix on non-continuous values (not numbers)?

(Just marked the thread as unsolved, as I was advised to post the new related question below it)

There are currently 1 users browsing this thread. (0 members and 1 guests)

#### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1