Does correlation help at all?

While studying how we can use statistics to better understand the performance of our applications, I came upon the concept of kurtosis, a measure of how heavy the tails of a distribution are. Essentially, a distribution whose kurtosis is very high is not normal. OK, that sounds like a fatal disease, so what does it mean?

A normal distribution means, roughly, that the values cluster symmetrically around the average: the vast majority are “pretty close” to it, and they become rarer the farther out you go. How close is “pretty close” can vary greatly. For example, if the average of 50 samples is 25 and the range is 20-30, the data is plausibly normally distributed. However, if the average is 15 and 95% of the samples are less than 15, it is probably not normally distributed, because a handful of very large values are pulling the average up. How can this happen? Well, in the case of Oracle’s AWR, if most of our samples have a “db file sequential read” performance window of 500 seconds, and two have 15,000 seconds, our distribution is not “normal”. As a result, the standard deviation and correlation calculations we have been using to predict performance may fly out the window of validity.

Kurtosis (along with skewness) gives us insight into how normal our data is, so we can test it before drawing conclusions.
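
To make this a bit more concrete, here is a minimal Python sketch of how skewness and excess kurtosis could be computed from a set of samples. It uses the simple population-moment formulas (no small-sample bias correction), and the function names and the `waits` data are made up for illustration: 48 samples at 500 seconds and two at 15,000, loosely mirroring the AWR example above, not real measurements.

```python
def central_moments(samples):
    """Return the 2nd, 3rd and 4th central moments of the samples."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m3 = sum((x - mean) ** 3 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    return m2, m3, m4


def skewness(samples):
    """Population skewness: 0 for a perfectly symmetric sample."""
    m2, m3, _ = central_moments(samples)
    return m3 / m2 ** 1.5


def excess_kurtosis(samples):
    """Population kurtosis minus 3, so a normal distribution scores about 0."""
    m2, _, m4 = central_moments(samples)
    return m4 / m2 ** 2 - 3.0


# Hypothetical wait times in seconds (illustrative only, not real AWR data).
waits = [500.0] * 48 + [15000.0] * 2

# A symmetric sample for comparison: the integers 20 through 30.
symmetric = [float(x) for x in range(20, 31)]

print("waits:     skew=%.2f  excess kurtosis=%.2f"
      % (skewness(waits), excess_kurtosis(waits)))
print("symmetric: skew=%.2f  excess kurtosis=%.2f"
      % (skewness(symmetric), excess_kurtosis(symmetric)))
```

For the made-up wait times, both numbers come out far from zero (skewness around 4.7 and excess kurtosis around 20), while the symmetric sample has a skewness of zero. Where exactly to draw the line for “too high” is a judgment call, and a formal normality test would be the more rigorous route.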

More to come…
