Welcome to the Excel Forum

If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed.

Please Register to Remove these Ads

Please Register to Remove these Ads



Reply
  #1  
Old 01-17-2010, 12:22 PM
JayUSA JayUSA is offline
Forum Contributor
 
Join Date: 26 Apr 2009
Location: Orlando
MS Office Version:Excel 2007
Posts: 128
JayUSA is becoming part of the community
Count unique entries in one column that meet conditions

Please Register to Remove these Ads

Hi Everyone,

[I tried to ask this question yesterday -- but it was a follow-up question stuck at the bottom of a thread. So, with your indulgence, here is a simpler version of the question, complete with an attached spreadsheet, if you wish to use it. I also closed the other thread by marking it "Solved", since it answered my initial question.]


The situation:

I have two columns of data. The data is not in alphabetical order, and every column includes duplicate values.

name gender
jones m
martin f
smith f
collins f
wilson m
jones m
martin f
hughes m
wilson m
martin m
smith f
west f
jones m
west f
martin m


The challenge:

In one cell, count the number of unique names that appear in the name column 3 or more times... with the additional condition that each unique name (which appears at least 3 times) must include at least one one woman!


The correct result:

1. ["Martin" appears 4 times, and includes 2 women.]


Limitations:

For various reasons, I'm really keen to find a stand-alone formula that sits in a single cell. CSE formulas welcome. VB scripts are, unfortunately, not useful to me as I don't have the skills to extend them.


Helpful hints and/or starting points:

The following formulas have been suggested as methods to count the number of unique names that appear at least 3 times. But these formulas *do not address the additional requirement* that each distinct name also include at least one woman.


=SUMPRODUCT((A1:A50<>"")/COUNTIF(A1:A50,A1:A50&"")*(COUNTIF(A1:A50,A1:A50)>=3)) [thanks to: daddylonglegs]

=SUMPRODUCT(--($A$2:$A$51<>""),--(COUNTIF($A$2:$A$51,$A$2:$A$51)>=3)
,--(MATCH($A$2:$A$51&"",$A$2:$A$51&"",0)=ROW($A$2:$A$51)-ROW($A$2)+1)) [thanks to: ron coderre]

=SUM(IF(FREQUENCY(IF(A1:A50<>"",MATCH(A1:A50,A1:A50,0)),ROW(A1:A50)-ROW(A1)+1)>=3,1)) [CSE formula, with thanks to: daddylonglegs]


Well, I've done my best to make this challenge as interesting and efficient as possible for everyone. Unfortunately, I have less than 24 hours to pick an approach. Hopefully, an Excel Forum member will once again ride to my rescue.

With sincerest thanks,

Jay
Attached Files
File Type: xlsx forexcelforum.xlsx (9.7 KB, 4 views)

Last edited by JayUSA; 01-23-2010 at 10:53 PM.
Reply With Quote
  #2  
Old 01-17-2010, 12:33 PM
DonkeyOte's Avatar
DonkeyOte DonkeyOte is offline
Forum Guru
 
Join Date: 22 Oct 2008
Location: Suffolk, UK
MS Office Version:2002 & 2007
Posts: 13,667
DonkeyOte has a reputation beyond repute DonkeyOte has a reputation beyond repute DonkeyOte has a reputation beyond repute DonkeyOte has a reputation beyond repute DonkeyOte has a reputation beyond repute DonkeyOte has a reputation beyond repute DonkeyOte has a reputation beyond repute DonkeyOte has a reputation beyond repute DonkeyOte has a reputation beyond repute DonkeyOte has a reputation beyond repute DonkeyOte has a reputation beyond repute
Re: Simpler version: count unique entries in one column that meet conditions (2007)

I doubt you will be able to "simplify" - after all you have had help from the two of the very best around.

Perhaps you could use:

Code:
=SUMPRODUCT(((Table1[name]<>"")/COUNTIF(Table1[name],Table1[name]&""))*(COUNTIF(Table1[name],Table1[name])>=3)*ISNUMBER(MATCH(Table1[name]&":f",Table1[name]&":"&Table1[gender],0)))
Reply With Quote
  #3  
Old 01-17-2010, 01:05 PM
JayUSA JayUSA is offline
Forum Contributor
 
Join Date: 26 Apr 2009
Location: Orlando
MS Office Version:Excel 2007
Posts: 128
JayUSA is becoming part of the community
Re: Simpler version: count unique entries in one column that meet conditions (2007)

DonkeyOte,

The formula that you provided is amazing and astonishing -- best of all, it works!!!!!!!!!!!!!

Thanks to you and daddylonglegs and Ron Coderre and everyone who contributed in stages to the resolution of this challenge.

From my perspective, this was the trickiest question that I have ever posted on this forum... and I never believed that it could possibly be solved as elegantly, let alone without CSE.

Eternal thanks to all -- marked SOLVED!!

Jay

Last edited by JayUSA; 01-17-2010 at 01:18 PM.
Reply With Quote
  #4  
Old 01-17-2010, 01:10 PM
DonkeyOte's Avatar
DonkeyOte DonkeyOte is offline
Forum Guru
 
Join Date: 22 Oct 2008
Location: Suffolk, UK
MS Office Version:2002 & 2007
Posts: 13,667
DonkeyOte has a reputation beyond repute DonkeyOte has a reputation beyond repute DonkeyOte has a reputation beyond repute DonkeyOte has a reputation beyond repute DonkeyOte has a reputation beyond repute DonkeyOte has a reputation beyond repute DonkeyOte has a reputation beyond repute DonkeyOte has a reputation beyond repute DonkeyOte has a reputation beyond repute DonkeyOte has a reputation beyond repute DonkeyOte has a reputation beyond repute
Re: Simpler version: count unique entries in one column that meet conditions (2007)

Quote:
Originally Posted by JayUSA
I never believed that it could possibly be solved as elegantly, let alone without CSE.
I'm obviously glad you have resolved this - all I would say regards the above is don't be duped into thinking that SUMPRODUCT is significantly more efficient than an Array - it is not - there is a negligible difference
(and in which "favour" is debated though general opinion seems to support SUMPRODUCT as being the slightly quicker of the two)

The main "advantage" of SUMPRODUCT over an Array is that is does not require CTRL + SHIFT + ENTER entry making it slightly more robust from an end user perspective... Arrays on the other hand are generally more flexible when it comes to handling underlying error values etc... ie both have their advantages !

Soap Box over.
Reply With Quote
  #5  
Old 01-17-2010, 04:25 PM
JayUSA JayUSA is offline
Forum Contributor
 
Join Date: 26 Apr 2009
Location: Orlando
MS Office Version:Excel 2007
Posts: 128
JayUSA is becoming part of the community
Re: Count unique entries in one column that meet conditions

Hello DonkeyOte,

As I mentioned in my post, I am enormously grateful for your assistance (and everyone's assistance) with the problem described in this thread.

I have already used your formula to great success in several contexts, and I can immediately see where it allows whole new worlds of analysis that have always been cumbersome before.

However, I do have a few simple-seeming follow-up questions, and I hope that you might answer them:


As a reminder, here is the formula that you provided, parsed by me in order to be easier to scan and discuss:

=SUMPRODUCT(

((Table1[name]<>"")/COUNTIF(Table1[name],Table1[name]&""))

* (COUNTIF(Table1[name],Table1[name])>=3)

* ISNUMBER(MATCH(Table1[name]&":f",Table1[name]&":"&Table1[gender],0))

)


My questions are:

#1) The final section of your formula uses the colon symbol in 2different places, but your use of the colon here is new to me. Where can I read about using the colon symbol in this manner? Unfortunately, searching Google for "sumproduct" and "colon" yielded nothing interesting.

#2) How would your formula change if I wanted not "at least one woman", but "exactly two women"? I tried to fiddle the formula to produce this result, but couldn't get it to work.

#3) Finally, how would your formula change if I wanted not "at least one woman", but "any number of women except for exactly two women"?

#4) Lastly, do your answers to the above questions change if the gender is not coded as "m" and "f", but as "0" and "1"? In other words, is there anything intrinsically text-specific in your formulae?


Finally, let me repeat that you and the others have opened vast new horizons to me with with this formula as a model, and I thank you again.

Cheers,

Jay

PS: You can still respond via the uploaded spreadsheet if that is in any way easier for you.
Reply With Quote
  #6  
Old 01-17-2010, 05:55 PM
DonkeyOte's Avatar
DonkeyOte DonkeyOte is offline
Forum Guru
 
Join Date: 22 Oct 2008
Location: Suffolk, UK
MS Office Version:2002 & 2007
Posts: 13,667
DonkeyOte has a reputation beyond repute DonkeyOte has a reputation beyond repute DonkeyOte has a reputation beyond repute DonkeyOte has a reputation beyond repute DonkeyOte has a reputation beyond repute DonkeyOte has a reputation beyond repute DonkeyOte has a reputation beyond repute DonkeyOte has a reputation beyond repute DonkeyOte has a reputation beyond repute DonkeyOte has a reputation beyond repute DonkeyOte has a reputation beyond repute
Re: Count unique entries in one column that meet conditions

Quote:
Originally Posted by JayUSA
#1) The final section of your formula uses the colon symbol in 2different places, but your use of the colon here is new to me. Where can I read about using the colon symbol in this manner? Unfortunately, searching Google for "sumproduct" and "colon" yielded nothing interesting.
the use of the colon is used to concatenate the two columns of interest (name & gender) and to subsequently conduct a test to see if on at least one occasion the combination of "name:f" is found ...
you need not use a colon per se (the character serves no "functional" purpose so to speak), you could use any reasonable delimiter of choice - the main criteria being that it (the delimiter) is not found within either field being concatenated.

Quote:
Originally Posted by JayUSA
#2) How would your formula change if I wanted not "at least one woman", but "exactly two women"? I tried to fiddle the formula to produce this result, but couldn't get it to work.
Given your use of XL2007 you could replace the ISNUMBER(MATCH with a COUNTIFS test, eg:

Code:
=SUMPRODUCT(((Table1[name]<>"")/COUNTIF(Table1[name],Table1[name]&""))*(COUNTIF(Table1[name],Table1[name])>=3)*(COUNTIFS(Table1[name],Table1[name]&"",Table1[gender],"f")=2))
Quote:
Originally Posted by JayUSA
#3) Finally, how would your formula change if I wanted not "at least one woman", but "any number of women except for exactly two women"?
Assuming you mean a min. of one and/or greater than two then you could perhaps use:

Code:
=SUMPRODUCT(((Table1[name]<>"")/COUNTIF(Table1[name],Table1[name]&""))*(COUNTIF(Table1[name],Table1[name])>=3)*LOOKUP(COUNTIFS(Table1[name],Table1[name]&"",Table1[gender],"f"),{0,1,2,3},{0,1,0,1}))
In the above the output of the COUNTIFS is compared to a LOOKUP to determine multiplier such that 0 & 2 return multiplier of 0 whereas all other instances return multiplier of 1.

Quote:
Originally Posted by JayUSA
#4) Lastly, do your answers to the above questions change if the gender is not coded as "m" and "f", but as "0" and "1"? In other words, is there anything intrinsically text-specific in your formulae?
Yes, you would need to replace instances of "f" with 1.
Reply With Quote
  #7  
Old 01-17-2010, 06:30 PM
JayUSA JayUSA is offline
Forum Contributor
 
Join Date: 26 Apr 2009
Location: Orlando
MS Office Version:Excel 2007
Posts: 128
JayUSA is becoming part of the community
Re: Count unique entries in one column that meet conditions

DonkeyOte,

I am THRILLED to hear from you this late on a Sunday evening, especially given that you are apparently in the UK!

Your answers to my questions will keep me fully occupied -- understanding, experimenting, implementing, and debugging -- for the next 6 hours or more!

Thanks again!!

Jay
Reply With Quote


Reply

Bookmarks


Currently Active Users Viewing This Thread: 1 (0 members and 1 guests)
 
Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are Off
Pingbacks are Off
Refbacks are Off

Forum Jump