I have a file (see SAMPLE file attached) with over 2000 records. Is there a way to sample or randomly select by ethnicity and gender, so that I end up with a diverse group of only 60 individuals? The selection criteria should be ethnicity and gender.
Is this possible?
Thanks.
The example doesn't show either of those criteria!
Mojito connoisseur and a dabbler in Cisco
where does code go ?
look here
how to insert code
how to enter array formula
why use -- in sumproduct
recommended reading
wiki Mojito
how to say no convincingly
most important thing you need
Martin Wilson: SPV
and RSMBC
Sorry about that. I have uploaded a new file that includes the ethnicity and gender criteria. Would you please take a look?
Thanks.
I must admit I don't know a fig about stats!
But so far I have worked out how many are required from each group to make up 60.
As the data isn't realistic, there are very few of some groups (I added a few but got bored!). See sheet 1.
Here's an alternative. Enter the sample size in H2; col I shows the lucky selectees.
Last edited by shg; 06-29-2009 at 07:06 PM.
Microsoft MVP - Excel
Entia non sunt multiplicanda sine necessitate
Martin,
Thanks for your assistance. I guess I need to be more specific about what I need to make happen. I need to actually pull the names of the 60 so that I can put together an invite list for an event.
Is there any way to do that?
Thanks.
Here's how I'd approach it. I haven't opened shg's because it's probably very clever and simple,
and it would break my heart to see I had wasted so much time, lol.
WOW! You took this to the next level! Thanks. I took the opportunity to plug in the real data records and it works like a charm. MISSION ACCOMPLISHED.
I would, however, like to go over the formulas and the functionality of the spreadsheet in more detail.
How do you know so much? How did you gain so much knowledge about Excel?
Much appreciation to you and what you do for others!!!!!!
Lillian,
The sheet applies the same method twice: fractional allocation of a number across categories. In the upper section, there is this formula in H4 and down:
=IF(G4=0, 0, ROUND( G4 * ($H$2 - SUM(H$3:H3) ) / ($G$14 - SUM(G$3:G3)), 0) )
Paraphrased, it says "select people in this proportion: (SampleSize - PeopleSelectedSoFar) / (TotalNumberOfPeople - PeopleConsideredSoFar)", applied to the number of people in the current category.
The ROUND function acknowledges that you can only select whole people.
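For anyone who finds the allocation formula easier to follow outside Excel, here is a minimal Python sketch of the same running-proportion idea. The function name allocate and the sample counts are made up for illustration; they are not taken from the workbook. It rounds halves up, as ROUND does for positive numbers, and skips empty categories the way the IF(G4=0, 0, ...) guard does.

```python
def allocate(counts, sample_size):
    """Split sample_size across categories in proportion to counts.

    Each category gets
        count * (sample still needed) / (people not yet considered),
    rounded to a whole number, mirroring the running sums in the formula.
    """
    total = sum(counts)
    allocated = []
    considered = 0  # people in the categories already processed
    for count in counts:
        still_needed = sample_size - sum(allocated)
        still_available = total - considered
        if count == 0:
            share = 0  # same guard as IF(G4=0, 0, ...)
        else:
            # Integer form of round-half-up, which avoids float surprises.
            share = (2 * count * still_needed + still_available) // (2 * still_available)
        allocated.append(share)
        considered += count
    return allocated


# Hypothetical category sizes, not the real data.
print(allocate([500, 300, 0, 200], 60))  # [30, 18, 0, 12]
```

Because each step works from what is still needed rather than from the original total, rounding errors do not accumulate: the last non-empty category simply receives whatever remains, so the shares add up to the sample size exactly.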
The formula in I17 and down (shown here as it appears in row 84) is
=IF(RAND() < (INDEX($H$4:$H$13, E84) - SUMPRODUCT( (E$16:E83=E84) * (I$16:I83="x") ) ) / (INDEX($G$4:$G$13, E84) - COUNTIF(E$16:E83, E84) ), "x", "")
This uses a random number to select someone (or not), based on
(DesiredSampleSizeByCategory - PeopleSelectedInCategorySoFar) / (PeopleInCategory - PeopleInCategoryConsideredSoFar)
This fraction is the share of the remaining people in the category that must be selected to achieve the required sample for that category.
Column I just verifies that the sampling worked as desired.
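The row-by-row selection can also be sketched in Python, for readers who want to see the logic without the cell references. This is only an illustration under assumed names: select, the quota dictionary, and the tiny list of names are hypothetical and are not part of the workbook. Each row is picked with probability (still needed in its category) / (rows of that category not yet considered), which is the same fraction described above.

```python
import random
from collections import Counter


def select(categories, quota, rng=random):
    """Return one True/False mark per row, hitting each category's quota exactly.

    categories: the category of each row, in sheet order.
    quota: desired sample size per category (must not exceed the category size).
    """
    in_category = Counter(categories)  # people in each category
    seen = Counter()                   # rows of each category considered so far
    chosen = Counter()                 # rows of each category selected so far
    marks = []
    for cat in categories:
        need = quota.get(cat, 0) - chosen[cat]
        left = in_category[cat] - seen[cat]  # always >= 1: includes this row
        pick = rng.random() < need / left    # certain when need == left, never when need == 0
        marks.append(pick)
        seen[cat] += 1
        chosen[cat] += int(pick)
    return marks


# Hypothetical rows: (name, category). Pulling the names gives the invite list.
rows = [("Ann", "F-A"), ("Bob", "M-A"), ("Cy", "M-A"), ("Di", "F-A"), ("Ed", "M-B")]
marks = select([cat for _, cat in rows], {"F-A": 1, "M-A": 1, "M-B": 1})
print([name for (name, _), picked in zip(rows, marks) if picked])
```

As the last few rows of a category approach, the probability is forced to 1 or 0 as needed, so each category ends with exactly its quota while every row in the category has the same overall chance of being invited.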
Told you he was clever!
I think clever is an understatement... I would say he is a freaking genius!