So, I want to start working on a side project to see if I can build a model that predicts the length of stay of current subjects/patients/tenants. The idea is to assign values to a standard set of identifiable factors (maybe 10 or so) that turned out to be statistically significant for length of stay among the subjects/patients/tenants who have already moved out.
I am confident that I can build the model if I can determine the statistical significance (if any) of these factors and their effects on length of stay.
However, my statistics knowledge is a little limited. I would like to run the data through Excel's regression tool to determine the statistical significance of factors like air conditioning, power, square footage, etc. I'd like to test each variable on its own and in every combination of the variables that I have available, all in relation to length of stay in days or months.
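To get a feel for how many combinations "every combination" means, here is a quick sketch in Python. The three factor names are just placeholders for my real list, and `all_subsets` is a helper name I made up for illustration.

```python
from itertools import combinations

# Placeholder factor names; the real list would have about 10 entries.
factors = ["air_conditioning", "power", "square_footage"]

def all_subsets(names):
    """Return every non-empty combination of the given factor names."""
    subsets = []
    for size in range(1, len(names) + 1):
        subsets.extend(combinations(names, size))
    return subsets

subsets = all_subsets(factors)
for subset in subsets:
    print(subset)
```

With n factors there are 2^n - 1 non-empty combinations, so 10 factors would mean 1,023 separate regressions if each combination gets its own run.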
However, I'm starting from square one with regard to how to set up the original data table for the regression analysis so that I can get this data. Specifically, how do I assign a value to "all or nothing" factors like air conditioning? Should I just use 1 or 0?
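To make the table layout concrete, here is a minimal sketch of what I have in mind, assuming one row per past move-out, one column per factor, and 1/0 for the yes/no factors. Every record below is invented for illustration (I built them to follow stay = 100 + 50 * AC + 0.2 * square feet exactly, so the fitted air conditioning coefficient should come out at about 50 days). The `encode`, `solve`, and `fit` helpers are hypothetical names, and the fit is plain ordinary least squares. It only estimates coefficients; it does not compute p-values or any other significance measure.

```python
# Hypothetical move-out records; all values are made up for illustration.
records = [
    {"air_conditioning": "yes", "square_feet": 500, "stay_days": 250},
    {"air_conditioning": "no", "square_feet": 500, "stay_days": 200},
    {"air_conditioning": "yes", "square_feet": 800, "stay_days": 310},
    {"air_conditioning": "no", "square_feet": 1000, "stay_days": 300},
    {"air_conditioning": "yes", "square_feet": 1200, "stay_days": 390},
    {"air_conditioning": "no", "square_feet": 700, "stay_days": 240},
]

def encode(record):
    """Turn one record into numbers: [intercept, AC as 1/0, square feet]."""
    ac = 1.0 if record["air_conditioning"] == "yes" else 0.0
    return [1.0, ac, float(record["square_feet"])]

def solve(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x

def fit(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

X = [encode(r) for r in records]
y = [float(r["stay_days"]) for r in records]
coefficients = fit(X, y)  # [intercept, AC effect in days, days per square foot]
print(coefficients)
```

Read this way, the coefficient on the 1/0 column is the estimated difference in length of stay between units with and without air conditioning, holding square footage fixed.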
Any direction on the initial stages of this would be great. Or, if this is something that Excel can't do because it requires something other than linear regression, it would be great to hear some thoughts before I waste a bunch of time on something that's not possible - lol.
Thanks!