+ Reply to Thread
Results 1 to 5 of 5

Populate Data set on Mean Standard Deviation and Correlation

  1. #1
    Registered User
    Join Date
    03-13-2009
    Location
    London, England
    MS-Off Ver
    Excel 2016
    Posts
    98

    Populate Data set on Mean Standard Deviation and Correlation

    Hi, I have 2 variables (CPX and DEFS)

    DEFS is the dependant variable.

    I am trying to replicate a calculation but do not have the source date but have the Mean, Standard Deviation and Correlation between the variables. I have used the NormInv function to populate the data for both variables using the Mean and Standard Deviation. However I cannot get the correlation to get close to its desired value. Is there a way of populating the data using the mean, standard deviation and correlation.

    I have attached the excel sheet. Tab -Sample Data is where I have attempted to do the calculation. The only thing I can think of is keeping the values of 1 of the variables the same, and then changing the sequence of the numbers in the other variable and optimise the correlation.

    Any help will be much appreciated. Thanks
    Attached Files Attached Files

  2. #2
    Forum Guru Kaper's Avatar
    Join Date
    12-14-2013
    Location
    Warsaw, Poland
    MS-Off Ver
    most often: Office 365 in Windows environment
    Posts
    8,613

    Re: Populate Data set on Mean Standard Deviation and Correlation

    I'll start with two general comments:
    - as you are looking for correlation it does not matter which variable is dependant :mrgreen:
    - you wisely used NormInv - but have you tested the results? You have relatively small sample (30 items), so generated data are usually noticeably far from expected mean/std - they would probably pass the statistical tests but it does not look nicely. try
    Formula: copy to clipboard
    Please Login or Register  to view this content.

    and few <F9> strokes - quite a spread isn't it?

    And below is my vision how to do it:

    First - generate two normally distributed variables, letting Excel to repeat generation until mean and stddev are pretty close to these listed in upperleft corner
    Second: do special sorting of data - with both random and data correlation influence.

    The first part I adopted one of procedures I used some time ago. The second part is done just for this task, so I made it quick and dirty way - by macro using the spreadsheet for almost whole "calculation engine" and all adresses hardcoded in macro.

    The code is as follows (you can use it in your file or test it first in attached one):
    Please Login or Register  to view this content.
    Not tested extensively, but shall do the work. Note that it is not effective way to generate sets of high (either negative or positive) correlation like 0.9 or -0.9, not to mention -1 or 1 . If there would be such a need, at least one variable shall be generated several times in an inner loop. Something similar to:
    Please Login or Register  to view this content.
    Attached Files Attached Files
    Best Regards,

    Kaper

  3. #3
    Registered User
    Join Date
    03-13-2009
    Location
    London, England
    MS-Off Ver
    Excel 2016
    Posts
    98

    Re: Populate Data set on Mean Standard Deviation and Correlation

    Hi Kaper, really appreciate your help and points taken on sample size and the statistics.

    I actually have 6 independent variables (CPX,Fan-In,MAC ,Includes,DCR,PTR3) and 1 dependent variable (DEFS). I wanted to populate 75 data points for each based on mean, standard deviation and correlation with the dependent variable.

    I have used your VBA to populate the data for the independent variables, however since the values of the dependent variable (DEFS) is not static the correlation changes.

    I would essentially like to create 75 data points for each which produce the correlation, mean and standard deviation with a static data set for DEFS (with a given mean and standard deviation)
    Attached Files Attached Files

  4. #4
    Forum Guru Kaper's Avatar
    Join Date
    12-14-2013
    Location
    Warsaw, Poland
    MS-Off Ver
    most often: Office 365 in Windows environment
    Posts
    8,613

    Re: Populate Data set on Mean Standard Deviation and Correlation

    After tiny changes almost the same code can be applied.
    Just changed obvious addresses because layout is now different, moved DEFS location leftmost (to facilitate easy sorting of other variables), moved parts to sort generated DEFS in ascending order before generation of other parameters and in random order after generation from subroutine to main unit. This was key modification, to keep every next variable fitted (in sense of correlation coefficient) to DEFS without disturbing correlation acheved between DEFS and previus variables).
    As I added parameter to call SpecialSort in appropriate column, I made it public (previously was Private not to be visible in Macro: Alt-F8 dialog).

    In the worksheet I added regression coefficient in row 81 for each pair (row 79 and 80 Avg and stdev). That's it. So few minor changes and it works. See attachment.

    Please Login or Register  to view this content.
    Attached Files Attached Files
    Last edited by Kaper; 05-30-2014 at 10:40 AM.

  5. #5
    Registered User
    Join Date
    03-13-2009
    Location
    London, England
    MS-Off Ver
    Excel 2016
    Posts
    98

    Re: Populate Data set on Mean Standard Deviation and Correlation

    Thank you very much for this. I really appreciate it

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. [SOLVED] Computing monthly standard deviation from daily data
    By Quantopic in forum Excel Formulas & Functions
    Replies: 9
    Last Post: 04-07-2014, 07:42 AM
  2. Standard deviation and Skweness of time series data
    By rjerung in forum Excel Programming / VBA / Macros
    Replies: 2
    Last Post: 02-17-2012, 02:19 AM
  3. Data with fixed mean and standard deviation
    By Klorgrodan in forum Excel General
    Replies: 2
    Last Post: 02-10-2009, 10:23 AM
  4. Median, Average, and Standard Deviation from large set of data
    By Humberto Goyen in forum Excel General
    Replies: 6
    Last Post: 11-30-2005, 08:40 PM

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1