Heteroscedasticity correction
The accompanying video uses data on profits (B) and sales (V) from 20 companies to estimate, by Ordinary Least Squares (OLS), a linear model that explains profits in terms of sales, and then tests for the presence of heteroscedasticity.
The model estimated by OLS is B(t) = 10.2229 + 0.0112223 * V(t) for t = 1, 2, ..., 20, with a coefficient of determination of 0.642619.
Plotting the residuals against the observation number shows that different groups of observations (for example, 1 to 7, 8 to 16 and 17 to 20) have different dispersion, which suggests that the disturbances do not have constant variance. Similarly, in the plot of the residuals against the variable assumed to cause the heteroscedasticity (sales), the dispersion of the residuals increases as sales increase. Based on these graphical methods, we conclude that heteroscedasticity is present in the model considered.
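The graphical check by observation groups can also be summarised numerically. The following is a minimal sketch, not the procedure used in the video: the helper `dispersion_by_group` is a hypothetical name, and the residuals are made-up illustrative values, not the residuals of the 20 companies.

```python
from statistics import pstdev

def dispersion_by_group(residuals, groups):
    """Standard deviation of the residuals within each (start, end) group.

    Groups use 1-based, inclusive observation numbers, as in the text.
    """
    return {(a, b): pstdev(residuals[a - 1:b]) for a, b in groups}

# Hypothetical residuals (NOT the data from the video): alternating sign,
# with a magnitude that grows with the observation number.
residuals = [(-1) ** t * 0.5 * t for t in range(1, 21)]

groups = [(1, 7), (8, 16), (17, 20)]
spread = dispersion_by_group(residuals, groups)
for (a, b), sd in spread.items():
    print(f"observations {a}-{b}: sd = {sd:.3f}")
```

A clearly different standard deviation from one group to the next is the numerical counterpart of the changing dispersion seen in the residual plot.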
White's test (an analytical method) leads us to reject the null hypothesis of homoscedasticity, since the p-value obtained is 0.04256. The p-value is the smallest significance level at which the null hypothesis is rejected: for significance levels greater than 0.04256 we reject the null hypothesis, and for smaller levels we do not reject it. Since we are working at the 5% significance level and 0.05 is greater than 0.04256, we reject the null hypothesis and conclude that heteroscedasticity is present in the model.
As we know, White's test requires an auxiliary regression that explains the squared OLS residuals in terms of the original regressors, their squares and their cross-products, excluding repetitions. In this case the regressors in the auxiliary regression are a constant, sales and sales squared (every other possibility repeats one of these). Writing e(t) for the OLS residuals, the OLS estimate of this regression is e(t)^2 = -5.17545 + 0.0833682 * V(t) - 0.000132827 * V(t)^2 for t = 1, 2, ..., 20, with a coefficient of determination of 0.315679. Finally, since the test statistic is obtained by multiplying the number of observations by the auxiliary regression's coefficient of determination, its value is 20 * 0.315679, approximately 6.313575.
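The arithmetic of White's test can be checked with a short sketch. It takes the number of observations and the auxiliary coefficient of determination from the text, and assumes the usual asymptotic result that the statistic follows a chi-square distribution with as many degrees of freedom as the auxiliary regression has regressors besides the constant (here 2: sales and sales squared). The variable names are illustrative.

```python
import math

n = 20               # number of observations (from the text)
r2_aux = 0.315679    # R^2 of the auxiliary regression (from the text)
df = 2               # auxiliary regressors excluding the constant (assumption stated above)

# White's statistic: number of observations times the auxiliary R^2.
white_stat = n * r2_aux

# For a chi-square distribution with exactly 2 degrees of freedom,
# the upper-tail probability has the closed form exp(-x / 2).
p_value = math.exp(-white_stat / 2)

alpha = 0.05         # significance level used in the text
reject_homoscedasticity = p_value < alpha

print(f"n*R^2 = {white_stat:.4f}, p-value = {p_value:.4f}, "
      f"reject H0: {reject_homoscedasticity}")
```

Under these assumptions the computed p-value agrees with the 0.04256 reported for White's test, and the null hypothesis of homoscedasticity is rejected at the 5% level.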
Since heteroscedasticity is present in the model, the OLS estimator is no longer optimal: it is no longer the most efficient linear unbiased estimator. To correct this problem we must, as we know, transform the original data. As the video shows, the transformation consists of dividing the data by the square root of V. Re-estimating by OLS on the transformed data gives B(t) = 10.2147 + 0.063917 * V(t) with a coefficient of determination of 0'
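The transformation can be sketched as follows. This is an illustration under stated assumptions, not the computation from the video: it assumes the disturbance variance is proportional to V (which is what dividing by the square root of V corrects), the function names `ols_two_regressors` and `corrected_fit` are hypothetical, and the data are made up so that profits lie exactly on a known line.

```python
import math

def ols_two_regressors(y, x1, x2):
    """OLS of y on x1 and x2 with no intercept, via the 2x2 normal equations."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    coef1 = (s22 * s1y - s12 * s2y) / det
    coef2 = (s11 * s2y - s12 * s1y) / det
    return coef1, coef2

def corrected_fit(profits, sales):
    """Divide every term of B = b0 + b1*V + u by sqrt(V), then apply OLS."""
    root_v = [math.sqrt(v) for v in sales]
    y_star = [b / r for b, r in zip(profits, root_v)]   # B / sqrt(V)
    const_star = [1.0 / r for r in root_v]              # transformed constant
    v_star = root_v                                     # V / sqrt(V) = sqrt(V)
    return ols_two_regressors(y_star, const_star, v_star)

# Hypothetical data, NOT the 20 companies from the video.
sales = [100.0, 200.0, 400.0, 800.0, 1600.0]
profits = [10.0 + 0.01 * v for v in sales]

b0, b1 = corrected_fit(profits, sales)
print(b0, b1)
```

Note that the transformed regression has no separate intercept: the original constant becomes the coefficient on 1/sqrt(V), and the slope becomes the coefficient on sqrt(V). Because the illustrative profits lie exactly on a line, the sketch recovers that line's intercept and slope.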