I need to test the relationship between the values in two dynamic columns. The position of the first column will always be known. The position of the second column will vary depending on the column heading.
Observations are in columns, with an observation heading ID (0 ? 300) and a rating (1/0 for Observed/Not Observed).
Rows contain information (date, location, observer, etc.) and each specific column?s rating. Depending on the dataset, the number of columns can vary from 30 to 300 with the number of rows varying from 30 to 4500. The row and column names will vary from analysis to analysis.
Each heading ID is assigned to a specific group (A ? F for this example) in a many to one relationship. Also, the order of the observation headings cannot be altered.
A simplified dataset is attached. I have placed the correlation formulas in row 2 and the group totals starting in column AP. I?ve also color-coded columns by group to make tracking the group calculations easier. The actual data spans 5 worksheets across 2 workbooks.
By using SUMIF I have aggregated the column scores into a single group score for each row (columns AP:AU).
I now need to use CORREL to test the relationship between the ratings in each column and that column?s total group score. I believe HLOOKUP will be the best approach to identify the proper group score with each individual column. However, from what I can tell HLOOKUP indexing will not work. I need to include the full range of rows for both columns. Again, the number of rows may vary from 300 to 4500.
For example, the formula in B2 would start with ?CORREL(B6:B25,? and the second vector in AQ6 would be identified via ?HLOOKUP(B4,AP5:AU25,?. Instead of the row argument specifying and returning a single value, I need the entire vector AQ6:AQ25 returned to the CORREL function.
I?d appreciate any suggestions/guidance/alternative approaches anyone can provide.
Thanks in advance.
Bookmarks