A multivariate linear regression for explaining impacts of the predictors

I am trying to build a multivariate linear regression and the main goal is to understand how the various features impact the response by understanding the coefficients and their confidence intervals.

I chose multivariate linear regression because its coefficients are intuitive to interpret, and from the standard error and the degrees of freedom I can get the 95% confidence interval of each coefficient. I can therefore tell what impact a unit increase in a predictor has on the outcome. I could use more complex models such as tree-based models, but even though they give variable importance, it is not easy to quantify the effect of each variable.
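To make the goal concrete, here is a minimal sketch of reading coefficients and 95% confidence intervals off an ordinary least squares fit. The data are simulated and the helper name `ols_with_ci` is hypothetical. It also assumes a normal quantile in place of the Student-t quantile, which is only reasonable when the degrees of freedom are large.

```python
import numpy as np
from statistics import NormalDist

def ols_with_ci(X, y, level=0.95):
    """Return OLS coefficients, standard errors, and confidence bounds."""
    X = np.column_stack([np.ones(len(y)), X])    # prepend an intercept column
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = n - k                                  # residual degrees of freedom
    sigma2 = resid @ resid / dof                 # unbiased error variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)        # covariance of the coefficients
    se = np.sqrt(np.diag(cov))
    # Assumption: normal quantile instead of the t quantile (fine for large dof)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return beta, se, beta - z * se, beta + z * se

# Simulated data: intercept 1.0, slopes 2.0 and -0.5 (illustrative values only)
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.3, size=500)

beta, se, lower, upper = ols_with_ci(X, y)
for name, b, lo, hi in zip(["intercept", "x1", "x2"], beta, lower, upper):
    print(f"{name}: {b:.3f}  [{lo:.3f}, {hi:.3f}]")
```

These intervals are only trustworthy when the errors are uncorrelated, which is exactly what fails for the time series described next.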

My problem, though, is that the data is a time series and therefore shows autocorrelation. I know that regression with ARIMA errors can handle the autocorrelation, but I find the coefficients difficult to interpret, especially when d is non-zero in ARIMA(p, d, q). The attached image, for example, is a model diagnosis for one of the models I have (I have to build thousands of them).

How can I handle the autocorrelation and still get coefficients that are as readily interpretable as those of a multivariate linear regression? My residuals are not normally distributed, and I am planning to try a Box-Cox transform to see whether it solves that problem. But I am not sure what to use for the autocorrelation.

Topic linear-regression time-series machine-learning

Category Data Science


To handle autocorrelation, you should try to "subtract it" from the series. In other words, use differencing, de-seasonalization, and transformations to remove the effect of time from your time series data. Once you end up with a series that looks like white noise, you can run your regression.
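The "subtract it" idea can be sketched on simulated data. The example below assumes the errors follow a random walk, so a single first difference is enough; real data may need seasonal differencing or other transformations. The helper names `ols` and `lag1_autocorr` are hypothetical. Note that when y = a + b*x + u, differencing both sides gives diff(y) = b*diff(x) + diff(u), so the slope b keeps its "effect of a one-unit change in x" meaning.

```python
import numpy as np

def lag1_autocorr(e):
    """Sample lag-1 autocorrelation of a residual series."""
    e = e - e.mean()
    return float(e[1:] @ e[:-1] / (e @ e))

def ols(x, y):
    """Fit y = a + b*x by least squares; return coefficients and residuals."""
    X = np.column_stack([np.ones(len(y)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta

# Simulated series (assumption): true slope 1.5, random-walk errors
rng = np.random.default_rng(1)
n = 2000
x = np.cumsum(rng.normal(size=n))     # trending predictor
u = np.cumsum(rng.normal(size=n))     # random-walk error, strongly autocorrelated
y = 3.0 + 1.5 * x + u

beta_lvl, resid_lvl = ols(x, y)                       # regression on levels
beta_diff, resid_diff = ols(np.diff(x), np.diff(y))   # regression on first differences

print(f"levels:      slope={beta_lvl[1]:.3f}  residual lag-1 autocorr={lag1_autocorr(resid_lvl):.3f}")
print(f"differences: slope={beta_diff[1]:.3f}  residual lag-1 autocorr={lag1_autocorr(resid_diff):.3f}")
```

In the differenced fit the residuals should look close to white noise, so the usual standard errors and confidence intervals become meaningful again.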

You can also do it with an ARIMA model: simply add all the AR(), I(), and MA() components to the regression equation, and then ignore them when you evaluate the impact of the other coefficients.
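As a rough illustration of putting the time-series terms into the regression and then ignoring them, the sketch below uses a simplified stand-in rather than a full ARIMA fit: a single AR(1) error term estimated by Cochrane-Orcutt iteration in plain numpy. The data are simulated and the function name `cochrane_orcutt` is hypothetical; a real model would typically come from a library that fits regression with ARIMA errors.

```python
import numpy as np

def cochrane_orcutt(x, y, n_iter=20):
    """Estimate y_t = a + b*x_t + u_t with AR(1) errors u_t = rho*u_{t-1} + e_t."""
    rho = 0.0
    for _ in range(n_iter):
        # Quasi-difference both sides using the current estimate of rho
        ys = y[1:] - rho * y[:-1]
        xs = x[1:] - rho * x[:-1]
        # The intercept column is (1 - rho), so the fitted coefficient is a itself
        Z = np.column_stack([np.full(len(ys), 1.0 - rho), xs])
        (a, b), *_ = np.linalg.lstsq(Z, ys, rcond=None)
        u = y - a - b * x                                 # residuals on the original scale
        rho = float(u[1:] @ u[:-1] / (u[:-1] @ u[:-1]))   # update the AR(1) coefficient
    return float(a), float(b), rho

# Simulated data (assumption): true slope 1.5, AR(1) errors with rho = 0.7
rng = np.random.default_rng(2)
n = 3000
x = rng.normal(size=n)
e = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.7 * u[t - 1] + e[t]
y = 2.0 + 1.5 * x + u

a, b, rho = cochrane_orcutt(x, y)
print(f"slope b = {b:.3f} (the coefficient of interest)")
print(f"rho     = {rho:.3f} (nuisance term, ignored when interpreting b)")
```

Here b is read exactly like an ordinary regression slope, while rho only absorbs the autocorrelation and is not interpreted.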
