Multicollinearity

The predictors should not be strongly correlated with each other. In stepwise selection, one variable may absorb the predictive power of another, correlated variable, so that the second variable is never taken into the model. Correlated variables may also end up with coefficients of an inconsistent sign (negative when the actual effect is positive).

If we use wait events, there will be multicollinearity in a lot of cases. For example, if we see a lot of waits on “db file sequential read”, we can often expect to see significant waits on “db file scattered read” as well.

Using v$sysstat makes this even more of a problem. For example, “user I/O wait time” will almost certainly be correlated with “physical reads”.

As such, we need to programmatically identify those predictors that are not correlated with each other.
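One possible way to do this is a greedy filter on pairwise Pearson correlation. The following is only a minimal sketch: the function names `pearson` and `select_uncorrelated`, the 0.8 threshold, and the sample values are all assumptions made for illustration, not measurements or an established method from this post.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series is treated as uncorrelated
    return cov / sqrt(var_x * var_y)

def select_uncorrelated(samples, threshold=0.8):
    """Greedily keep each predictor whose |r| with every
    already-kept predictor is below the threshold (hypothetical cutoff)."""
    kept = []
    for name, values in samples.items():
        if all(abs(pearson(values, samples[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Made-up per-snapshot wait values, for illustration only.
samples = {
    "db file sequential read": [10, 20, 30, 40, 50],
    "db file scattered read": [12, 19, 33, 41, 48],
    "log file sync": [5, 1, 4, 2, 3],
}
print(select_uncorrelated(samples))
```

Note that the result depends on the order in which predictors are visited: the first member of a correlated group is kept and the later ones are dropped. Pairwise correlation also misses cases where one predictor is a combination of several others, so this is a first-pass filter rather than a complete check.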
