Okay, now that we know the effects that multicollinearity can have on our regression analyses and subsequent conclusions, how do we tell when it exists? That is, how can we tell if multicollinearity is present in our data?

Some of the common methods used for detecting multicollinearity include:

- The analysis exhibits the signs of multicollinearity, such as estimates of the coefficients varying excessively from model to model.
- The t-tests for each of the individual slopes are non-significant (P > 0.05), but the overall F-test for testing that all of the slopes are simultaneously 0 is significant (P < 0.05).
- The correlations among pairs of predictor variables are large.

Looking at correlations only among pairs of predictors, however, is limiting. It is possible that the pairwise correlations are small, and yet a linear dependence exists among three or even more variables - for example, if \(X_3 = 2X_1 + 5X_2 + \text{error}\), say. That's why many regression analysts often rely on what are called variance inflation factors (VIF) to help detect multicollinearity.

## What is a Variance Inflation Factor?

As the name suggests, a variance inflation factor (VIF) quantifies how much the variance is inflated. But what variance? Recall that we learned previously that the standard errors - and hence the variances - of the estimated coefficients are inflated when multicollinearity exists.

A variance inflation factor exists for each of the predictors in a multiple regression model. For example, the variance inflation factor for the estimated regression coefficient \(b_j\) - denoted \(VIF_j\) - is just the factor by which the variance of \(b_j\) is "inflated" by the existence of correlation among the predictor variables in the model. In particular, the variance inflation factor for the j-th predictor is:

\[VIF_j = \frac{1}{1 - R_j^2}\]

where \(R_j^2\) is the \(R^2\)-value obtained by regressing the j-th predictor on the remaining predictors. In our example, the variance inflation factor for Weight works out to \(VIF_{Weight} = \frac{1}{1 - R_{Weight}^2} = 8.42\). Again, this variance inflation factor tells us that the variance of the Weight coefficient is inflated by a factor of 8.42 because Weight is highly correlated with at least one of the other predictors in the model.

So, what to do? One solution to dealing with multicollinearity is to remove some of the violating predictors from the model. If we review the pairwise correlations again, we see that the predictors Weight and BSA are highly correlated (r = 0.875). We can choose to remove either predictor from the model; the decision of which one to remove is often a scientific or practical one. For example, if the researchers here are interested in using their final model to predict the blood pressure of future individuals, their choice should be clear: which of the two measurements - body surface area or weight - would be easier to obtain? If indeed weight is an easier measurement to obtain than body surface area, then the researchers would be well-advised to remove BSA from the model and leave Weight in the model.
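To make the formula concrete, here is a minimal Python sketch of the VIF calculation. The synthetic data and the predictor names (`Age`, `Weight`, `BSA`) are illustrative assumptions standing in for the blood pressure data discussed above, so the printed numbers will not match the 8.42 from the lesson; the statsmodels helper `variance_inflation_factor` is a real function used here for cross-checking.

```python
# Sketch: computing VIFs two ways, assuming synthetic stand-in data.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Hypothetical predictors; BSA is deliberately tied to Weight so that
# both show inflated VIFs, mimicking the r = 0.875 situation above.
rng = np.random.default_rng(0)
n = 100
age = rng.normal(50, 10, n)
weight = rng.normal(90, 8, n)
bsa = 0.02 * weight + rng.normal(0, 0.05, n)
X = pd.DataFrame({"Age": age, "Weight": weight, "BSA": bsa})

# Manual VIF: VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from
# regressing predictor j on all of the remaining predictors.
for col in X.columns:
    others = sm.add_constant(X.drop(columns=col))
    r2 = sm.OLS(X[col], others).fit().rsquared
    print(f"{col}: VIF = {1.0 / (1.0 - r2):.2f}")

# Cross-check with statsmodels' built-in helper (it expects a design
# matrix that already includes the constant column at index 0).
Xc = sm.add_constant(X)
for i, col in enumerate(X.columns, start=1):
    print(f"{col}: VIF = {variance_inflation_factor(Xc.values, i):.2f}")
```

Run against the actual blood pressure predictors, this same calculation is what produces the VIF of 8.42 reported for Weight. Rules of thumb vary, but VIFs exceeding 4 are often taken as worth investigating, and values exceeding 10 as a sign of serious multicollinearity.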