+ Reply to Thread
Results 1 to 11 of 11

Random sampling without replacement

  1. #1
    Registered User
    Join Date
    08-01-2006
    Posts
    27

    Random sampling without replacement

    Hi,

    I'm trying to take a random sample without replacement. The sample is also partially representative, e.g. if a certain proportion of the data are Class A and a certain proportion is Class B, that proportion is maintained in the sample. After the proportions are set, the observations within each class are randomized.

    Is there anyway to sample a number of rows, where the number of rows is based on the value in another cell, without having to use a lookup function or index match? Sorry if the question is a little unclear; please let me know if you need any clarification.

    Thanks for any help!

  2. #2
    Forum Moderator zbor's Avatar
    Join Date
    02-10-2009
    Location
    Croatia
    MS-Off Ver
    365 ProPlus
    Posts
    15,614

    Re: Random sampling without replacement

    Some workbook exampl ewould be helpfull... range, input, desired ouput etc

  3. #3
    Forum Expert JBeaucaire's Avatar
    Join Date
    03-21-2004
    Location
    Bakersfield, CA
    MS-Off Ver
    2010, 2016, Office 365
    Posts
    33,492

    Re: Random sampling without replacement

    Hehe...you give just enough info to totally peak my interest, but not enough to actually help.

    Probably would be easiest if you generated a sample sheet (not TOO simple) showing your data. Then show us where you're wanting to do this random averaging and based on what you show, how you arrived at your sample "results".

    By letting us see your goal line laid out this way, helping you automate become much simpler.

    Click on GO ADVANCED and use the paperclip icon to upload your DATA/RESULTS sample book.
    _________________
    Microsoft MVP 2010 - Excel
    Visit: Jerry Beaucaire's Excel Files & Macros

    If you've been given good help, use the icon below to give reputation feedback, it is appreciated.
    Always put your code between code tags. [CODE] your code here [/CODE]

    ?None of us is as good as all of us? - Ray Kroc
    ?Actually, I *am* a rocket scientist.? - JB (little ones count!)

  4. #4
    Registered User
    Join Date
    08-01-2006
    Posts
    27

    Re: Random sampling without replacement

    I've attached a sample. Essentially, I'm separating the observations by class, and then want to construct a random sample of observations while keeping the proportion of each class the same in the sample as it is in the population. In the sample, the population consists of 11 observations, but the sample will only be 6. The desired number of observations from each class in the random sample is in Row 15. My question is, as best as I can word it, how can I construct a sample of observations where the number of each class represented is listed beside desired sample size, and the sample is random otherwise?

    My only other concern is that I don't want a new sample to be taken every time the sheet recalculates, which is why I'm hesitant when I use RAND() to pick out observations.
    Attached Files Attached Files
    Last edited by montreal1775; 06-05-2009 at 01:56 PM. Reason: With attachment this time. . .

  5. #5
    Forum Expert shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2003, 2010
    Posts
    40,678

    Re: Random sampling without replacement

    See attached.
    Attached Files Attached Files
    Last edited by shg; 06-05-2009 at 06:35 PM.
    Entia non sunt multiplicanda sine necessitate

  6. #6
    Registered User
    Join Date
    08-01-2006
    Posts
    27

    Re: Random sampling without replacement

    Thanks! The process definitely works as I needed it to, but I'm a little confused with the syntax. The first thing I don't understand and can't seem to find online is what "<>" means within the IF function. The other question I have is the syntax within a few of the SUM functions. What does SUM($B2:B2) imply in C2 cell, and the others across the row? Isn't B2="Sample" Shouldn't summing that cell result in an error?

    Thanks for the help; sorry I can't work through the syntax.

  7. #7
    Forum Expert martindwilson's Avatar
    Join Date
    06-23-2007
    Location
    London,England
    MS-Off Ver
    office 97 ,2007
    Posts
    19,320

    Re: Random sampling without replacement

    <> means not equal to

    SUM($B2:B2) when dragged across becomes
    SUM($B2:C2),SUM($B2:D2) its just to make it easier to extend across columns ,sum ignores text so SUM($B2:B2) =0
    "Unless otherwise stated all my comments are directed at OP"

    Mojito connoisseur and now happily retired
    where does code go ?
    look here
    how to insert code

    how to enter array formula

    why use -- in sumproduct
    recommended reading
    wiki Mojito

    how to say no convincingly

    most important thing you need
    Martin Wilson: SPV
    and RSMBC

  8. #8
    Registered User
    Join Date
    08-01-2006
    Posts
    27

    Re: Random sampling without replacement

    Also, I used my own functions for Sample and Population (basically the ones that were in the original sample spreadsheet) because I couldn't translate the functions to my sheet without any hiccups. I get the same numbers you do in the Sample and Population rows. I then used the function you entered in columns 5-9 and rows 5-15 into the corresponding rows and columns on my sheet. When I do that though, the number sampled in each don't sum to the desired sample size, nor are the proportions preserved.

    I guess my problem is I don't fully understand how the first argument of the second IF function in those cells works (IF(RAND() < (C$2 - SUM(C$4:C4) ) / (C$3 - COUNTIF($A$4:$A4, C$4) )). This function, as I understand it, reads: if a random number between 0 and 1 is less than the ratio of the difference of sample size and the sum of selected observations (i.e. those with value 1) and the difference between the population of the relevant class and the number of already selected observations within that class, then 1 is returned. I'm confused as to the ratio, and also the way its calculating the ratio. Will there be a higher probability of selecting higher up observations if the count goes from the top down?
    Last edited by montreal1775; 06-08-2009 at 11:26 AM. Reason: Being unclear.

  9. #9
    Registered User
    Join Date
    08-01-2006
    Posts
    27

    Re: Random sampling without replacement

    Thanks for the quick response.

  10. #10
    Forum Expert shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2003, 2010
    Posts
    40,678

    Re: Random sampling without replacement

    Post your workbook that doesn't work.

  11. #11
    Forum Expert shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2003, 2010
    Posts
    40,678

    Re: Random sampling without replacement

    I guess my problem is I don't fully understand how the first argument of the second IF function in those cells works
    To answer your question, suppose you want to choose 3 of 5 members.

    The probabability of choosing the first is 3/5

    The probability of choosing the second depends on whether the first was selected; if it was, then choose the second with probability (3-1)/(5-1) = 1/2. If it wasn't, then (3-0)/(5-1) = 3/4

    For choosing the third, it depends on whether 0, 1, or 2 of the prior members were selected, so

    If 0, (3-0)/(5-2) = 1 (and RAND() is always less than 1)

    If 1, (3-1)/(5-2) = 2/3

    If 2, (3-2)/(5-2) = 1/3

    ... and so forth. That's what the formula does.

    You could read it as

    (numberToSelect - numberSelectedSoFar) / (candidates - candidatesEvaluatedSoFar)

    or

    remaindertoSelect / remainingCandidates

    The result is, it always converges to the correct sample size.

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1