Calculating Correlation from Summary Table

  1. #1
    Registered User
    Join Date
    06-04-2014
    Location
    USA
    MS-Off Ver
    Office 2016
    Posts
    72

    Calculating Correlation from Summary Table

    I am curious if anyone has gone through the steps of calculating correlation from a summary table.

    For example: you have two variables, each on a 5-point scale, that have been cross-tabulated, and you want to determine the correlation between them without unwinding the data. Each cell holds a count, not a proportion.

    I know that in this example the data are not continuous.

    For the data set in question, I have already unwound the data; I am just curious whether there is a formula, function or add-in that works directly with a summary table.

    EDIT: Data set attached. Is there a way to calculate a Correlation Coefficient from "Raw Data" so that you do not need to go through the process of creating what is in "Unwound and Plotted"?
    Last edited by McStagger; 02-06-2017 at 12:48 PM. Reason: Updated with resolution state

  2. #2
    Forum Moderator - RIP Richard Buttrey
    Join Date
    01-14-2008
    Location
    Stockton Heath, Cheshire, UK
    MS-Off Ver
    Office 365, Excel for Windows 2010 & Excel for Mac
    Posts
    29,464

    Re: Calculating Correlation from Summary Table

    Please upload a workbook or a representative cut-down copy, anonymised if necessary. It is always easier to advise if we can see your request in its context.

    Show a before and after situation with manually calculated results, explaining which information is data and which is results, and, if it's not blindingly obvious how you have arrived at your results, some explanatory notes as well.

    In addition, please add your Excel version and location to your profile. This often helps when we need to consider things like time zones and regional settings for dates and list separators.
    Richard Buttrey

    RIP - d. 06/10/2022

    If any of the responses have helped then please consider rating them by clicking the small star icon below the post.

  3. #3
    Registered User
    Join Date
    06-04-2014
    Location
    USA
    MS-Off Ver
    Office 2016
    Posts
    72

    Re: Calculating Correlation from Summary Table

    Post updated as requested.

  4. #4
    Forum Expert shg
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2003, 2010
    Posts
    40,678

    Re: Calculating Correlation from Summary Table

    It seems like there should be a way, but nothing is jumping to mind. You could (someone could) certainly write a UDF that does the unwinding.
    Entia non sunt multiplicanda sine necessitate

  5. #5
    Registered User
    Join Date
    06-04-2014
    Location
    USA
    MS-Off Ver
    Office 2016
    Posts
    72

    Re: Calculating Correlation from Summary Table

    My current process for unwinding uses Power Query's "From Table" tool to unpivot the columns. From there, a script duplicates each row according to the count for that combination; I then plot the data, generate the trend line and take the square root of the R^2. This isn't too bad for a one-off, but for a larger set of tables it's less than ideal.
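
    For readers who want to see the unwinding step outside Excel, here is a minimal Python sketch of the same idea: repeat each (x, y) pair as many times as its count, then compute an ordinary Pearson correlation. The counts are the ones from the example table posted in this thread; the helper names `unwind` and `pearson` are made up for this illustration and are not part of Power Query or Excel.

```python
import math

# Counts from the thread's example table: rows are y = 1..5, columns are x = 1..5.
counts = [
    [639, 642, 263, 157, 641],
    [160,   5, 288, 416, 903],
    [701, 255, 205,  82, 473],
    [727, 536, 501, 494, 550],
    [314, 451, 160, 718,  30],
]
scale = [1, 2, 3, 4, 5]


def unwind(table, x_values, y_values):
    """Expand a table of counts into one (x, y) pair per observation."""
    xs, ys = [], []
    for y, row in zip(y_values, table):
        for x, n in zip(x_values, row):
            xs.extend([x] * n)
            ys.extend([y] * n)
    return xs, ys


def pearson(xs, ys):
    """Plain (unweighted) Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs, ys = unwind(counts, scale, scale)
r = pearson(xs, ys)
print(len(xs), round(r, 4))
```

    Unlike taking the square root of R^2, computing r this way keeps the sign of the correlation.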

  6. #6
    Forum Expert shg
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2003, 2010
    Posts
    40,678

    Re: Calculating Correlation from Summary Table

    With reference to https://en.wikipedia.org/wiki/Pearso...on_coefficient, it seems like this should work, but it doesn't:

        B      C    D    E    F    G    H  I              J        K
    1                                      Weighted Mean
    2   y \ x  1    2    3    4    5       mxw            3.0087   J2: =SUMPRODUCT(wi * xi) / SUM(wi)
    3   1      639  642  263  157  641     myw            2.9707   J3: =SUMPRODUCT(wi * yi) / SUM(wi)
    4   2      160  5    288  416  903
    5   3      701  255  205  82   473     Weighted Cov
    6   4      727  536  501  494  550     covxxw         2.3574   J6: =SUMPRODUCT(wi * (xi - mxw) * (xi - mxw)) / SUM(wi)
    7   5      314  451  160  718  30      covxyw         -0.1872  J7: =SUMPRODUCT(wi * (xi - mxw) * (yi - myw)) / SUM(wi)
    8                                      covyyw         2.0009   J8: =SUMPRODUCT(wi * (yi - myw) * (yi - myw)) / SUM(wi)
    9
    10                                     Correlation
    11                                                    -0.0862  J11: =covxyw / SQRT(covxxw * covyyw)
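
    As a cross-check outside Excel, the sheet's formulas can be mirrored in a few lines of Python. This is only a sketch: it assumes that the counts are the weights and that the row and column headers (1 to 5) are the y and x values. The variable names follow the sheet's labels.

```python
import math

# Counts from the table above: rows are y = 1..5, columns are x = 1..5.
counts = [
    [639, 642, 263, 157, 641],
    [160,   5, 288, 416, 903],
    [701, 255, 205,  82, 473],
    [727, 536, 501, 494, 550],
    [314, 451, 160, 718,  30],
]
x_values = [1, 2, 3, 4, 5]
y_values = [1, 2, 3, 4, 5]

# Flatten the grid into (weight, x, y) triples, one per cell.
cells = [
    (w, x, y)
    for y, row in zip(y_values, counts)
    for x, w in zip(x_values, row)
]
total = sum(w for w, _, _ in cells)  # SUM(wi)

# Weighted means (J2 and J3 in the sheet).
mxw = sum(w * x for w, x, _ in cells) / total
myw = sum(w * y for w, _, y in cells) / total

# Weighted covariances (J6, J7 and J8).
covxxw = sum(w * (x - mxw) ** 2 for w, x, _ in cells) / total
covxyw = sum(w * (x - mxw) * (y - myw) for w, x, y in cells) / total
covyyw = sum(w * (y - myw) ** 2 for w, _, y in cells) / total

# Correlation (J11).
r = covxyw / math.sqrt(covxxw * covyyw)
print(round(mxw, 4), round(myw, 4), round(r, 4))
```

    No unwinding is needed: each cell contributes once, scaled by its count.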

  7. #7
    Forum Guru
    Join Date
    04-13-2005
    Location
    North America
    MS-Off Ver
    2002/XP and 2007
    Posts
    15,827

    Re: Calculating Correlation from Summary Table

    Would you be interested in "unwinding" the math/algebra of the thing? SQRT(RSQ()) is simply the magnitude of PEARSON() (the square root drops the sign). My statistics is weak, but I would expect you could modify the standard form of the PEARSON() function to give you a "weighted" Pearson function, which should be what you are looking for.

    Wikipedia article -- see section weighted correlation coefficient: https://en.wikipedia.org/wiki/Pearso...on_coefficient

    Naturally, a weighted Pearson is not a built-in function in Excel (to my knowledge).

    Can someone whose statistics Kung Fu is stronger than mine verify that a weighted Pearson function would be correct for this? If it is correct, one could solve the problem by:

    1) Calculate the two weighted averages.
    2) Calculate the three weighted covariances (xx, xy and yy) from the raw data and the two weighted averages.
    3) Calculate r from the three covariances.
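
    Written out, the three steps look roughly like this. The notation is chosen here for illustration: the index i runs over the cells of the table, w_i is the count in a cell, and x_i and y_i are the scale points for that cell's column and row.

```latex
\begin{align*}
m_x &= \frac{\sum_i w_i x_i}{\sum_i w_i}, \qquad
m_y = \frac{\sum_i w_i y_i}{\sum_i w_i} \\
\operatorname{cov}(x, y; w) &= \frac{\sum_i w_i (x_i - m_x)(y_i - m_y)}{\sum_i w_i} \\
r &= \frac{\operatorname{cov}(x, y; w)}
          {\sqrt{\operatorname{cov}(x, x; w)\,\operatorname{cov}(y, y; w)}}
\end{align*}
```

    When the weights are whole-number counts, as they are here, this is the same calculation as an ordinary Pearson correlation on the unwound list.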

    I would agree with shg that I would like to see that in a UDF if this was something I was going to do a lot. If the statistics is right, that might be easier than unwinding this table into a longer list like you are doing.

    on edit: shg seems to have the same idea...
    Mathematics is the native language of the natural world. Just trying to become literate.

  8. #8
    Forum Expert shg
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2003, 2010
    Posts
    40,678

    Re: Calculating Correlation from Summary Table

    It looks to me like all of the SUM(wi) terms should cancel out, but replacing all of them with 1 doesn't give the same result. Curious. MrS, see anything obviously wrong?
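
    A possible explanation, offered as an assumption and not something confirmed in the thread: SUM(wi) cancels only where it divides the three covariances. The SUM(wi) inside the two weighted means does not cancel, so replacing every SUM(wi) with 1 changes the means and therefore the result. The sketch below illustrates this with the thread's table; the function `correlation` and its two divisor parameters are made up for this example.

```python
import math

# Counts from the thread's example table: rows are y = 1..5, columns are x = 1..5.
counts = [
    [639, 642, 263, 157, 641],
    [160,   5, 288, 416, 903],
    [701, 255, 205,  82, 473],
    [727, 536, 501, 494, 550],
    [314, 451, 160, 718,  30],
]
scale = [1, 2, 3, 4, 5]
cells = [(w, x, y) for y, row in zip(scale, counts) for x, w in zip(scale, row)]
total = sum(w for w, _, _ in cells)  # SUM(wi)


def correlation(mean_divisor, cov_divisor):
    """Weighted r with the two kinds of SUM(wi) divisor made explicit."""
    mx = sum(w * x for w, x, _ in cells) / mean_divisor
    my = sum(w * y for w, _, y in cells) / mean_divisor
    sxx = sum(w * (x - mx) ** 2 for w, x, _ in cells) / cov_divisor
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in cells) / cov_divisor
    syy = sum(w * (y - my) ** 2 for w, _, y in cells) / cov_divisor
    return sxy / math.sqrt(sxx * syy)


r_full = correlation(total, total)     # every SUM(wi) kept
r_cov_dropped = correlation(total, 1)  # SUM(wi) dropped from the covariances only
r_all_dropped = correlation(1, 1)      # SUM(wi) dropped from the means as well
print(r_full, r_cov_dropped, r_all_dropped)
```

    The first two values agree and the third does not. This matches the shape of the one-cell formula shg posts later in the thread, which keeps SUM(wi) inside the means and leaves it out of the products.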

  9. #9
    Forum Guru
    Join Date
    04-13-2005
    Location
    North America
    MS-Off Ver
    2002/XP and 2007
    Posts
    15,827

    Re: Calculating Correlation from Summary Table

    Why do we think that -0.0862 is wrong? If I put =PEARSON(A:A,B:B) into D1 of the "Unwound and Plotted" sheet, I also get -0.0862. Squaring that gives 0.007431 (the chart says 0.0074). Maybe we are right?

  10. #10
    Forum Expert shg
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2003, 2010
    Posts
    40,678

    Re: Calculating Correlation from Summary Table

    Doh! Well, there you go.

  11. #11
    Forum Expert shg
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2003, 2010
    Posts
    40,678

    Re: Calculating Correlation from Summary Table

    So ...

        B      C    D    E    F    G    H  I                      J        K
    3   y \ x  1    2    3    4    5       Weighted Mean
    4   1      639  642  263  157  641     mxw                    3.0087   J4: =SUMPRODUCT(wi * xi) / SUM(wi)
    5   2      160  5    288  416  903     myw                    2.9707   J5: =SUMPRODUCT(wi * yi) / SUM(wi)
    6   3      701  255  205  82   473
    7   4      727  536  501  494  550     Weighted Cov
    8   5      314  451  160  718  30      covxxw                 2.3574   J8: =SUMPRODUCT(wi * (xi - mxw) * (xi - mxw)) / SUM(wi)
    9                                      covxyw                 -0.1872  J9: =SUMPRODUCT(wi * (xi - mxw) * (yi - myw)) / SUM(wi)
    10                                     covyyw                 2.0009   J10: =SUMPRODUCT(wi * (yi - myw) * (yi - myw)) / SUM(wi)
    11
    12                                     Correlation            -0.0862  J12: =covxyw / SQRT(covxxw * covyyw)
    13                                     R2                     0.0074
    14
    15                                     Instant gratification  0.0074   J15: =SUMPRODUCT(wi * (xi - SUMPRODUCT(wi * xi) / SUM(wi)) * (yi - SUMPRODUCT(wi * yi) / SUM(wi)))^2 /
                                                                                (SUMPRODUCT(wi * (xi - SUMPRODUCT(wi * xi) / SUM(wi)) ^ 2) * SUMPRODUCT(wi * (yi - SUMPRODUCT(wi * yi) / SUM(wi)) ^ 2))
    Last edited by shg; 02-03-2017 at 06:47 PM.

  12. #12
    Registered User
    Join Date
    06-04-2014
    Location
    USA
    MS-Off Ver
    Office 2016
    Posts
    72

    Re: Calculating Correlation from Summary Table

    Thanks a lot for tackling this, guys. Now I need to work out how each of these values is calculated, so I have some Excel work ahead of me. Good to see there is a way, though!

    Thanks again!

    Edit: Replicated the above on a couple of data sets and it works!

    In case this is looked up in the future...

    wi = named range holding the counts (the body of the table)
    xi = named range holding the x scale points
    yi = named range holding the y scale points
    Last edited by McStagger; 02-06-2017 at 01:34 PM. Reason: Corrected notes

  13. #13
    Forum Expert shg
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2003, 2010
    Posts
    40,678

    Re: Calculating Correlation from Summary Table

    Quote, originally posted by McStagger:
        In case this is looked up in the future...

        wi = total number of data points
        xi = Sum of x scale range
        yi = Sum of y scale range

    No! They are the named ranges that contain the weight, x, and y values respectively.
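
    To make the distinction concrete, here is a small Python sketch of what the three ranges hold and what a product such as wi * xi does. It assumes the layout shown in the tables in this thread: wi is the 5 x 5 block of counts, xi is the header row of x values and yi is the header column of y values. The list comprehensions imitate the cell-by-cell pairing Excel performs inside SUMPRODUCT.

```python
# wi: the weights, i.e. the 5 x 5 block of counts (rows are y, columns are x).
wi = [
    [639, 642, 263, 157, 641],
    [160,   5, 288, 416, 903],
    [701, 255, 205,  82, 473],
    [727, 536, 501, 494, 550],
    [314, 451, 160, 718,  30],
]
xi = [1, 2, 3, 4, 5]  # x values: one per column of wi
yi = [1, 2, 3, 4, 5]  # y values: one per row of wi

# wi * xi pairs every count with the x value heading its column.
wi_times_xi = [[w * x for w, x in zip(row, xi)] for row in wi]

# wi * yi pairs every count with the y value heading its row.
wi_times_yi = [[w * y for w in row] for row, y in zip(wi, yi)]

sum_wi = sum(sum(row) for row in wi)  # SUM(wi): total number of observations

# SUMPRODUCT(wi * xi) / SUM(wi) and SUMPRODUCT(wi * yi) / SUM(wi)
mxw = sum(sum(row) for row in wi_times_xi) / sum_wi
myw = sum(sum(row) for row in wi_times_yi) / sum_wi
print(sum_wi, round(mxw, 4), round(myw, 4))
```

    So the total number of data points is SUM(wi), not wi itself, and xi and yi are the scale points, not sums.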

  14. #14
    Registered User
    Join Date
    06-04-2014
    Location
    USA
    MS-Off Ver
    Office 2016
    Posts
    72

    Re: Calculating Correlation from Summary Table

    Whoops! Clearly I was not paying attention to the formula or concept. You are correct!
