+ Reply to Thread
Results 1 to 17 of 17

big LOOKUP and DATA wrangling for regression analysis

  1. #1
    Registered User
    Join Date
    09-08-2016
    Location
    Los Angeles, CA
    MS-Off Ver
    Office 365
    Posts
    22

    big LOOKUP and DATA wrangling for regression analysis

    Dear exalted guru(s):

    I'm trying to join a single X- and multiple Y-variables into a dataset for histogram, scatterplot, and OLS regression analysis. I've been spinning at this one for days. Excel file is attached.

    X and Ys are defined as follows:

    X: "xy_rating" column as shown in tab "problem_1" per date of rating action
    Ys: store count changes, both annual and cumulative, under date columns in the nine yellow-colored tabs per year y-action occurred, since there are multiple Ys, let's call it "y-actions" for now.
    key: names in column "xy_tic"

    Dates are the non-key, common element: first, date of rating with respect to X and second, year y-action occurred with respect to Y. Because I want to test relationship of "xy_rating" on each subsequent year's Y store count, only until the next X for each xy_tic, the final data set will have hundreds of XY combinations. I planned to do this in three steps:

    STEP ONE: Lookups with INDEX/MATCH/INDIRECT

    First, apply a lookup to populate cells P4:ET124 in "example_1" - the first section has my bad formula. My formula seems to fail evaluating dates, among possibly other errors. I haven't even gotten to the INDIRECT part - which confuses the heck out of me.

    For each xy_tic, the lookup should return the "y-action" that occurred during the period the rating was in effect. This period is defined as the years following the "y-action" but[INDENTbefore the next rating or when the rating ends.

    STEP TWO: join observations into XY dataset observations

    Complete the join by putting together the lookup Ys with xy_rating data. This one is harder to explain, so I'll do it with pictures.
    Screen Shot 2019-09-07 at 6.30.59 AM.png

    The data in the first picture results in six different observations, as shown in the second picture. I have to do it for ALL NINE Ys, with the table in columns A:J in tab "example_2" with subsequent Ys added on.
    Screen Shot 2019-09-07 at 6.32.14 AM.png

    How can I arrange the data in the first picture into the form of the second picture given how I set it up in "example_1"? Is there a better way to set it up? Can I do this with Power BI or some relational database function within Excel? Or is this only a job for VBA?

    Unless I can automate that, it seems like a very long, error-prone slog.

    STEP THREE: histograms, scatterplot, and OLS regression

    Run pivot table on data per "example_2", then manually a routine of set up bins, histograms, descriptive stats, scatterplots, and OLS regression. That's a lot of steps for each subset.

    If I isolate each XY subset, can I automate the routine?

    FINAL QUESTIONS:

    While my data set isn't huge, the way I'm going about the task is a bit unruly; I don't think I could manage more. Is there a better way to organize it all within Excel?

    Thank you so very much for even reading this! I'm grateful for any suggestions, even if only for one of the parts!
    Attached Files Attached Files

  2. #2
    Forum Guru MarvinP's Avatar
    Join Date
    07-23-2010
    Location
    Woodinville, WA
    MS-Off Ver
    Office 365
    Posts
    16,167

    Re: big LOOKUP and DATA wrangling for regression analysis

    Hi rkong,

    You need to learn the Formulas -> Evaluate tool built into Excel. Select cell P6 and then run the Evaluate Formula tool and step through what it does. It looks like P3 is the first problem. Figure out why and keep going.

    https://www.bing.com/videos/search?q...6FBD&FORM=VIRE
    One test is worth a thousand opinions.
    Click the * Add Reputation below to say thanks.

  3. #3
    Registered User
    Join Date
    09-08-2016
    Location
    Los Angeles, CA
    MS-Off Ver
    Office 365
    Posts
    22

    Re: big LOOKUP and DATA wrangling for regression analysis

    Thanks MarvinP. After evaluating, I'm still have trouble with reference returning a serial instead of a year, per below.

    Screen Shot 2019-09-07 at 10.34.50 AM.png

    Smaller example of this problem is also attached.

    What can I do about this?

    I mistakenly went against the rules and didn't cross-link to thread that address part of the problem. Here is the cross-link.
    Attached Files Attached Files
    Last edited by rkong; 09-07-2019 at 02:37 PM. Reason: adding cross-link

  4. #4
    Forum Moderator - RIP Richard Buttrey's Avatar
    Join Date
    01-14-2008
    Location
    Stockton Heath, Cheshire, UK
    MS-Off Ver
    Office 365, Excel for Windows 2010 & Excel for Mac
    Posts
    29,464

    Re: big LOOKUP and DATA wrangling for regression analysis

    Administrative Note:



    We would very much like to help you with your query, however it has been brought to our attention that the same query has been posted on one or more other forums and you have not provided the required cross-post link(s) here.

    Please see Forum Rule #3 about cross-posting and adjust accordingly. Read this to understand why we (and other sites like us) consider this to be important.

    (Note: this requirement is not optional. No help to be offered until the link is provided.)
    Richard Buttrey

    RIP - d. 06/10/2022

    If any of the responses have helped then please consider rating them by clicking the small star icon below the post.

  5. #5
    Registered User
    Join Date
    09-08-2016
    Location
    Los Angeles, CA
    MS-Off Ver
    Office 365
    Posts
    22

    Re: big LOOKUP and DATA wrangling for regression analysis

    Hi Richard!

    Would you help me with an administrative question? I enjoy the forum and wish to avoid violating rules.

    Since my original question was a bit complex with different parts, should close and break it up into its components one at a time. I've made a little progress since posting and my immediate question has now changed.

    Thank you.
    Last edited by rkong; 09-08-2019 at 06:44 AM.

  6. #6
    Forum Moderator AliGW's Avatar
    Join Date
    08-10-2013
    Location
    Retired in Ipswich, Suffolk, but grew up in Sawley, Derbyshire (England)
    MS-Off Ver
    MS 365 Subscription Insider Beta Channel v. 2404 (Windows 11 22H2 64-bit)
    Posts
    80,780

    Re: big LOOKUP and DATA wrangling for regression analysis

    No - you should not do that. Your query should remain here in this thread - it can be dealt with step-by-step. Any duplicate threads will be closed (you need to read the forum rules). What you SHOULD do is provide the cross-post link(s) requested above - you have not done so yet. Please do this NOW. Thanks.
    Ali


    Enthusiastic self-taught user of MS Excel who's always learning!
    Don't forget to say "thank you" in your thread to anyone who has offered you help.
    You can reward them by clicking on * Add Reputation below their user name on the left, if you wish.

    Forum Rules (updated August 2023): please read them here.

  7. #7
    Registered User
    Join Date
    09-08-2016
    Location
    Los Angeles, CA
    MS-Off Ver
    Office 365
    Posts
    22

    Re: big LOOKUP and DATA wrangling for regression analysis

    Hi Ali,

    Here's the link to the cross-post.

    I included it to my edited post yesterday, 10:52 AM, in response to Richard.

    Is there a designated place and way to post the link?

  8. #8
    Forum Moderator AliGW's Avatar
    Join Date
    08-10-2013
    Location
    Retired in Ipswich, Suffolk, but grew up in Sawley, Derbyshire (England)
    MS-Off Ver
    MS 365 Subscription Insider Beta Channel v. 2404 (Windows 11 22H2 64-bit)
    Posts
    80,780

    Re: big LOOKUP and DATA wrangling for regression analysis

    Sorry - it was not very visible. Better just to post what is clearly a link to a URL like this:

    https://www.excelforum.com/excel-gen...7-35795-a.html

    However, that is NOT a cross-post link - it's a link to another thread on this forum. What we are asking for is a link to the other forum or forums where you have asked this question.

    Please deal with this request immediately. Thanks.

  9. #9
    Registered User
    Join Date
    09-08-2016
    Location
    Los Angeles, CA
    MS-Off Ver
    Office 365
    Posts
    22

    Re: big LOOKUP and DATA wrangling for regression analysis

    Ah, thank you for the example Ali.

    Given that definition, I have no cross-post links; only a post to the other thread in this forum.

    Richard, to what were you referring?

  10. #10
    Forum Moderator - RIP Richard Buttrey's Avatar
    Join Date
    01-14-2008
    Location
    Stockton Heath, Cheshire, UK
    MS-Off Ver
    Office 365, Excel for Windows 2010 & Excel for Mac
    Posts
    29,464

    Re: big LOOKUP and DATA wrangling for regression analysis

    Would you clarify the request please.

    Are you trying to complete the matrix in columns P:ET on the example_1 sheet.

    Inwhich case would you manually add some typical results and explain how you arrived at them.
    P6 for instance shows an error since there is no column for 1985 on the stores_beg sheet.

  11. #11
    Forum Moderator AliGW's Avatar
    Join Date
    08-10-2013
    Location
    Retired in Ipswich, Suffolk, but grew up in Sawley, Derbyshire (England)
    MS-Off Ver
    MS 365 Subscription Insider Beta Channel v. 2404 (Windows 11 22H2 64-bit)
    Posts
    80,780

    Re: big LOOKUP and DATA wrangling for regression analysis

    Richard - is the request for cross-post links moot?

  12. #12
    Forum Moderator - RIP Richard Buttrey's Avatar
    Join Date
    01-14-2008
    Location
    Stockton Heath, Cheshire, UK
    MS-Off Ver
    Office 365, Excel for Windows 2010 & Excel for Mac
    Posts
    29,464

    Re: big LOOKUP and DATA wrangling for regression analysis

    ISTR mentioning the words 'moot point' elsewhere in connection with a question mark over whether something could be considered a duplicate thread.
    And yes I do consider there are circumstances where whether something is or isn't a duplicate thread is moot.
    Last edited by Richard Buttrey; 09-08-2019 at 05:47 PM.

  13. #13
    Registered User
    Join Date
    09-08-2016
    Location
    Los Angeles, CA
    MS-Off Ver
    Office 365
    Posts
    22

    Re: big LOOKUP and DATA wrangling for regression analysis

    Thank you Richard. I've split the problem into smaller parts, and resolved some pieces on my own.

    The roadblock at present appears to be this part of the code:
    Please Login or Register  to view this content.
    The match returns "Value Not Available Error" even though its lookup_value and lookup_array both return 1997.

    Screen Shot 2019-09-08 at 2.40.23 PM.png

    Attached updated helper file gets more to the point. Thank you!
    Attached Files Attached Files
    Last edited by rkong; 09-08-2019 at 06:03 PM. Reason: trying to get .png file to appear in message

  14. #14
    Forum Moderator - RIP Richard Buttrey's Avatar
    Join Date
    01-14-2008
    Location
    Stockton Heath, Cheshire, UK
    MS-Off Ver
    Office 365, Excel for Windows 2010 & Excel for Mac
    Posts
    29,464

    Re: big LOOKUP and DATA wrangling for regression analysis

    That's because D4 is a number and the header row on stores_beg are text. Use instead

    Formula: copy to clipboard
    Please Login or Register  to view this content.

  15. #15
    Registered User
    Join Date
    09-08-2016
    Location
    Los Angeles, CA
    MS-Off Ver
    Office 365
    Posts
    22

    Re: big LOOKUP and DATA wrangling for regression analysis

    Thank you so much Richard! Does this mean data table headers are TEXT by default? Using Format Cells dialogue (or, alt+H), the headers show up as General? How do I know which one it really is?

  16. #16
    Registered User
    Join Date
    09-08-2016
    Location
    Los Angeles, CA
    MS-Off Ver
    Office 365
    Posts
    22

    Re: big LOOKUP and DATA wrangling for regression analysis

    After applying Richard's fix, here's the next big hurdle:

    Is it possible to combine INDEX/MATCH/INDIRECT with named tables? My hope is to simply drag across, so F4:H5 looks like J4:L5.

    Screen Shot 2019-09-08 at 5.41.52 PM.png

    Is it possible for table-referencing lookup_arrays to remain relative? If not with INDIRECT, as in like the formula below, perhaps by another method? Or, do I need to change tables to ranges, and apply INDEX/MATCH/INDIRECT to the tabs?
    Please Login or Register  to view this content.
    Thank you!
    Last edited by rkong; 09-08-2019 at 11:31 PM.

  17. #17
    Forum Moderator AliGW's Avatar
    Join Date
    08-10-2013
    Location
    Retired in Ipswich, Suffolk, but grew up in Sawley, Derbyshire (England)
    MS-Off Ver
    MS 365 Subscription Insider Beta Channel v. 2404 (Windows 11 22H2 64-bit)
    Posts
    80,780

    Re: big LOOKUP and DATA wrangling for regression analysis

    Quote Originally Posted by Richard Buttrey View Post
    ISTR mentioning the words 'moot point' elsewhere in connection with a question mark over whether something could be considered a duplicate thread.
    And yes I do consider there are circumstances where whether something is or isn't a duplicate thread is moot.
    I am completely baffled. Perhaps post #4 should be removed, then.

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. Replies: 0
    Last Post: 10-14-2018, 08:38 PM
  2. non-linear result of regression data analysis
    By Balumderbaer in forum Excel Formulas & Functions
    Replies: 5
    Last Post: 06-22-2017, 03:27 PM
  3. Incorporating Data Analysis: Regression into a macro
    By elyse in forum Excel Programming / VBA / Macros
    Replies: 2
    Last Post: 04-10-2009, 12:45 PM
  4. Recalculating a Regression Output in Data Analysis
    By Mike Middleton in forum Excel Formulas & Functions
    Replies: 5
    Last Post: 09-06-2005, 05:05 PM
  5. Recalculating a Regression Output in Data Analysis
    By karin in forum Excel Formulas & Functions
    Replies: 0
    Last Post: 09-06-2005, 02:05 PM
  6. [SOLVED] Recalculating a Regression Output in Data Analysis
    By karin in forum Excel Formulas & Functions
    Replies: 0
    Last Post: 09-06-2005, 12:05 PM
  7. [SOLVED] How to use Tools, Data Analysis, Regression in Macros
    By Bruce Edwards in forum Excel Programming / VBA / Macros
    Replies: 1
    Last Post: 06-20-2005, 03:05 PM

Tags for this Thread

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1