22) You have collected data on individuals and their attributes. Consequently you have generated several
binary variables, which take on a value of “1” if the individual has that characteristic and are “0”
otherwise. One example is the binary variable DMarr, which is “1” for married individuals and “0” for
non-married individuals. Suppose you run the following regression:
ahe_i = β0 + β1×educ_i + β2×DMarr_i + u_i
a. What is the interpretation of β2?
b. You are interested in directly observing the effect that being non-married (“single”) has on earnings,
controlling for years of education. Instead of recoding all observations such that they are “1” for a
non-married individual and “0” for a married person, how can you generate such a variable (DSingle) through
a simple command in your regression program?
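One way to see what such a command does is a short simulation. The sketch below is only illustrative: the data are randomly generated, the coefficient values (2.0, 1.5, 3.0) are made up, and the helper `ols` and the variable names `d_marr` and `d_single` are assumptions introduced here, not part of the question. It builds the new dummy as one minus the existing one and fits both versions of the regression.

```python
import numpy as np

# Simulated data; all numeric values below are hypothetical.
rng = np.random.default_rng(0)
n = 500
educ = rng.integers(8, 21, size=n).astype(float)    # years of education
d_marr = rng.integers(0, 2, size=n).astype(float)   # 1 = married, 0 = not married
u = rng.normal(0.0, 1.0, size=n)
ahe = 2.0 + 1.5 * educ + 3.0 * d_marr + u

# A single transformation flips the coding of the dummy.
d_single = 1.0 - d_marr

def ols(y, *regressors):
    """Least-squares coefficients for y on a constant plus the given regressors."""
    X = np.column_stack([np.ones_like(y), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

b_marr = ols(ahe, educ, d_marr)      # [intercept, educ, DMarr]
b_single = ols(ahe, educ, d_single)  # [intercept, educ, DSingle]

print("with DMarr:  ", b_marr)
print("with DSingle:", b_single)
```

In this sketch the coefficient on DSingle is the negative of the coefficient on DMarr, the intercept shifts by the DMarr coefficient, and the coefficient on educ is unchanged.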
23) Consider the following earnings function:
ahe_i = β0 + β1×DFemme_i + β2×educ_i + … + u_i
versus the alternative specification
ahe_i = γ0×DMale_i + γ1×DFemme_i + γ2×educ_i + … + u_i
where ahe is average hourly earnings, DFemme is a binary variable which takes on the value of “1” if the
individual is female and is “0” otherwise, educ measures the years of education, and DMale is a binary
variable which takes on the value of “1” if the individual is male and is “0” otherwise. There may be
additional explanatory variables in the equation.
a. How do the βs and γs compare? Put differently, having estimated the coefficients in the first
equation, can you derive the coefficients in the second equation without re-estimating the regression?
b. Will the goodness of fit measures, such as the regression R², differ between the two equations?
c. Why do economists typically prefer the second specification over the first?
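The two specifications can be compared numerically. The following is a minimal sketch under stated assumptions: the data are simulated, the coefficient values (5.0, -2.0, 1.2) are invented for illustration, no additional explanatory variables are included, and the names `X_first`, `X_second`, `beta`, `gamma`, `ssr_first` and `ssr_second` are hypothetical. It fits the equation with an intercept and DFemme, then the equation with DMale and DFemme and no intercept, and compares coefficients and sums of squared residuals.

```python
import numpy as np

# Simulated data; all numeric values below are hypothetical.
rng = np.random.default_rng(1)
n = 400
educ = rng.integers(8, 21, size=n).astype(float)     # years of education
d_femme = rng.integers(0, 2, size=n).astype(float)   # 1 = female
d_male = 1.0 - d_femme                               # 1 = male
u = rng.normal(0.0, 1.0, size=n)
ahe = 5.0 - 2.0 * d_femme + 1.2 * educ + u

# First specification: constant, DFemme, educ.
X_first = np.column_stack([np.ones(n), d_femme, educ])
beta, *_ = np.linalg.lstsq(X_first, ahe, rcond=None)

# Second specification: DMale, DFemme, educ, with no constant.
X_second = np.column_stack([d_male, d_femme, educ])
gamma, *_ = np.linalg.lstsq(X_second, ahe, rcond=None)

# Sum of squared residuals for each fit.
ssr_first = np.sum((ahe - X_first @ beta) ** 2)
ssr_second = np.sum((ahe - X_second @ gamma) ** 2)

print("beta: ", beta)
print("gamma:", gamma)
print("SSR:  ", ssr_first, ssr_second)
```

Because DMale + DFemme = 1 for every observation, both design matrices span the same column space. In this sketch γ0 equals β0, γ1 equals β0 + β1, γ2 equals β2, and the residuals (hence the sum of squared residuals) coincide.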