One way of explaining degrees of freedom is to use a diagram. With two data
points we can fit a line exactly, but it is impossible to tell how well this line
fits the true underlying relationship. With three data points we can not only fit a
line but also learn something about the distribution of the data around the line.
That is, with three data points we have one degree of freedom beyond the two we
need to determine a line. The more degrees of freedom we have, the more we can
determine about the distribution.
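The diagram can be backed up with a quick calculation. The sketch below is an
illustration only, using made-up data and hypothetical helper names (fit_line,
residual_summary). It fits a least-squares line to two points and then to three,
and reports the residual degrees of freedom and the sum of squared residuals.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def residual_summary(xs, ys):
    """Return (degrees of freedom, sum of squared residuals) for the fitted line."""
    a, b = fit_line(xs, ys)
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    df = len(xs) - 2  # two parameters are estimated: intercept and slope
    return df, sse


# Made-up data, for illustration only.
two_points = residual_summary([1.0, 2.0], [2.0, 3.5])
three_points = residual_summary([1.0, 2.0, 3.0], [2.0, 3.5, 4.0])
print("two points:   df, SSE =", two_points)
print("three points: df, SSE =", three_points)
```

With two points the line passes through both, so the residuals are zero and say
nothing about the scatter. With three points one degree of freedom remains, and
the residuals give a first, very rough estimate of the variance around the line.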
The concept behind R² is intuitive for most students. Its main shortcoming can be
emphasized by exploring what happens when explanatory variables are added.
Ultimately, when there are as many explanatory variables (including the constant)
as there are data points, the fit becomes perfect. However, we have then used up
all of our degrees of freedom and with them any confidence we have in the
specification. That is, adding explanatory variables has a cost in terms of lost
degrees of freedom. The adjusted R² is designed to take this cost into account.
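The cost of an extra variable can be made concrete with the standard textbook
formula for the adjusted statistic, which the note above does not spell out:
adjusted R² = 1 - (1 - R²)(n - 1)/(n - k), where n is the number of data points
and k the number of coefficients including the constant. The sketch below is a
minimal illustration; the function name and the sample figures are hypothetical.

```python
def adjusted_r_squared(r_squared, n_obs, n_coefs):
    """Adjusted R-squared; n_coefs counts the constant as a coefficient."""
    df_resid = n_obs - n_coefs  # degrees of freedom left after estimation
    if df_resid <= 0:
        raise ValueError("no degrees of freedom left; adjusted R-squared is undefined")
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / df_resid


# Hypothetical figures: 20 data points, constant plus two explanatory variables.
base = adjusted_r_squared(0.80, n_obs=20, n_coefs=3)
# Two more variables raise the unadjusted fit only slightly, from 0.80 to 0.81.
bigger = adjusted_r_squared(0.81, n_obs=20, n_coefs=5)
print(round(base, 4), round(bigger, 4))
```

In this made-up case the unadjusted fit rises while the adjusted figure falls,
because the small gain in fit does not pay for the two degrees of freedom given up.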
For the F statistic we emphasize that it will have an F distribution if all of the
coefficients on the explanatory variables (that is, all except the constant) are zero.
Given the assumption that all of these coefficients are zero, we can ask how likely
it is that we would get a value of F at least this large. If it is not very likely,
then we reject the hypothesis that all of the coefficients are zero.
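For a regression that includes a constant, the overall F statistic can be computed
directly from R² as F = [R²/(k - 1)] / [(1 - R²)/(n - k)], with n data points and
k coefficients including the constant. The sketch below assumes that standard
relationship; the function name and the sample figures are hypothetical.

```python
def f_statistic(r_squared, n_obs, n_coefs):
    """Overall F statistic for a regression with a constant.

    n_coefs counts the constant, so there are n_coefs - 1 explanatory variables.
    Returns (F, numerator df, denominator df).
    """
    df_model = n_coefs - 1      # coefficients restricted to zero under the null
    df_resid = n_obs - n_coefs  # degrees of freedom left for the residuals
    f_value = (r_squared / df_model) / ((1.0 - r_squared) / df_resid)
    return f_value, df_model, df_resid


# Hypothetical figures: R-squared of 0.80, 20 data points, constant plus two variables.
f_value, df_model, df_resid = f_statistic(0.80, n_obs=20, n_coefs=3)
print(f_value, df_model, df_resid)
```

In this made-up case F is about 34 with (2, 17) degrees of freedom, far above the
5% critical value of roughly 3.59, so a value this large would be very unlikely if
all of the coefficients on the explanatory variables were zero.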
All spreadsheet-based regression programs now report p-values that directly
convey the statistical significance of the regression as a whole (the F statistic) and
of individual coefficients of the potential explanatory variables (respective
t-statistics). This frees students from having to pick critical values from statistical
tables to benchmark significance of their regression results. For instance, suppose
that an estimated coefficient is .76, its t-statistic is 2.31, and its associated p-value
is .032 (for, say, a two-sided test). Then we immediately know that the coefficient
is significant at the 5% level, because .032 is less than .05. That is, the estimate
.76 is statistically different from zero at that level.
Remember to explain to students what the p-value means. Namely, if the true
coefficient were zero, the chance that the estimate would be at least as extreme as
.76 (due to luck) would be only .032. Because this chance is so small, we reject the
null hypothesis of a zero coefficient. This is the correct formal meaning of rejecting
the null hypothesis, but it’s a bit “roundabout” and not always easy to remember.
(Informally, we tell students that it’s OK to think of the p-value as reflecting the