Chapter 23 1 General Represent Nominal Independent Variable

subject Type Homework Help
subject Pages 14
subject Words 4432
subject Authors Eliyathamby A. Selvanathan, Gerald Keller, Saroja Selvanathan

Unlock document.

This document is partially blurred.
Unlock all pages and 1 million more documents.
Get Access
page-pf1
Chapter 23Model building
MULTIPLE CHOICE
1. In explaining the amount of money spent on children’s clothes each month, which of the following
independent variables is best represented with an indicator variable?
A.
Age.
B.
Height.
C.
Gender.
D.
Weight.
2. In explaining the income earned by college graduates, which of the following independent variables is
best represented by a dummy variable?
A.
Grade point average.
B.
Age.
C.
Number of years since graduating from high school.
D.
College major.
3. In explaining students’ test scores, which of the following independent variables would not be
adequately represented by an indicator variable?
A.
Gender.
B.
Race.
C.
Number of hours studying for the test.
D.
Marital status.
4. In explaining starting salaries for graduates of computer science programs, which of the following
independent variables would not be adequately represented with a dummy variable?
A.
Grade point average.
B.
Gender.
C.
Race.
D.
Marital status.
5. Suppose that the estimated regression equation for 200 business graduates is
ŷ = 20 000 + 2000x + 1500I,
where y is the starting salary, x is the grade point average and I is an indicator variable that takes the
value of 1 if the student is a computer information systems major and 0 if not. A business
administration major graduate with a grade point average of 4 would have an average starting salary
of:
A.
$20 000.
B.
$26 000.
C.
$29 500.
D.
$28 000.
page-pf2
6. For the regression equation
2121 35820
öxxxxy +++=
, which combination of
1
x
and
2
x
,
respectively, results in the largest average value of y?
A.
3 and 5.
B.
5 and 3.
C.
6 and 3.
D.
3 and 6.
7. For the estimated regression equation ŷ = 15 + 6x1 + 5x2 + 4x1x2, a unit increase in x2, while keeping x1
constant, increases the value of y on average by:
A.
5.
B.
9.
C.
20.
D.
an amount that depends on the value of x1.
8. In a first-order model with two predictors,
1
x
and
2
x
, an interaction term may be used when the:
A.
relationship between the dependent variable and the independent variables is linear.
B.
effect of
1
x
on the dependent variable is influenced by
2
x
.
C.
effect of
2
x
on the dependent variable is influenced by
1
x
.
D.
Both B and C are correct.
9. For the estimated regression equation ŷ = 10 + 3x1 + 4x2, a unit increase in x1, while keeping x2
constant, increases the value of y on average by:
A.
13.
B.
3.
C.
17.
D.
an amount that depends on the value of
1
x
.
10. For the regression equation
2121 4512100
öxxxxy +=
, a unit increase in
1
x
, while holding
2
x
constant at a value of 2, decreases the value of y on average by:
A.
92.
B.
85.
C.
20.
D.
an amount that depends on the value of
1
x
.
11. For the regression equation
, a unit increase in
2
x
, while holding
1
x
constant at a value of 3, decreases the value of y on average by:
A.
56.
B.
22.
C.
50.
D.
an amount that depends on the value of
2
x
.
page-pf3
12. A manufacturing company opened a second branch plant in the same city where its first plant operates.
Some of the employees had been assigned involuntarily to the new plant while some others
volunteered to be transformed from the first plant to the new plant. The production manager of the new
plant would like to find out whether employees who volunteered for the new plant and those who were
involuntarily assigned to it differ with respect to productivity. From a random sample of 50 employees,
the manager estimated the following multiple regression equation:
ŷ = 2400 + 140x1 250x2
where y is the average number of units produced by an employee a day during the second month after
joining the new plant, x1 is the employee’s aptitude test score, and x2 is a dummy variable coded 1 for
involuntary assignment and 0 for voluntary assignment.
For an employee who joined the new plant voluntarily and whose aptitude test score is 40, the
estimated average number of units produced is:
A.
3200.
B.
8000.
C.
2950.
D.
7750.
13. A manufacturing company opened a second branch plant in the same city where its first plant operates.
Some of the employees had been assigned involuntarily to the new plant while some others
volunteered to be transformed from the first plant to the new plant. The production manager of the new
plant would like to find out whether employees who volunteered for the new plant and those who were
involuntarily assigned to it differ with respect to productivity. From a random sample of 50 employees,
the manager estimated the following multiple regression equation:
ŷ = 2400 + 140x1 250x2
where y is the average number of units produced by an employee a day during the second month after
joining the new plant, x1 is the employee’s aptitude test score, and x2 is a dummy variable coded 1 for
involuntary assignment and 0 for voluntary assignment.
For each additional aptitude test score, the average number of units produced by an employee who
joined the new plant involuntarily:
A.
increases by 250.
B.
increases by 140.
C.
decreases by 140.
D.
decreases by 250.
14. A manufacturing company opened a second branch plant in the same city where its first plant operates.
Some of the employees had been assigned involuntarily to the new plant while some others
volunteered to be transformed from the first plant to the new plant. The production manager of the new
plant would like to find out whether employees who volunteered for the new plant and those who were
involuntarily assigned to it differ with respect to productivity. From a random sample of 50 employees,
the manager estimated the following multiple regression equation:
ŷ = 2400 + 140x1 250x2
where y is the average number of units produced by an employee a day during the second month after
joining the new plant, x1 is the employee’s aptitude test score, and x2 is a dummy variable coded 1 for
involuntary assignment and 0 for voluntary assignment.
For an employee who joined the new plant involuntarily and whose aptitude test score is 40, the
estimated average number of units produced is:
page-pf4
A.
3200.
B.
8000.
C.
2950.
D.
7750.
15. A manufacturing company opened a second branch plant in the same city where its first plant operates.
Some of the employees had been assigned involuntarily to the new plant while some others
volunteered to be transformed from the first plant to the new plant. The production manager of the new
plant would like to find out whether employees who volunteered for the new plant and those who were
involuntarily assigned to it differ with respect to productivity. From a random sample of 50 employees,
the manager estimated the following multiple regression equation:
ŷ = 2400 + 140x1 250x2
where y is the average number of units produced by an employee a day during the second month after
joining the new plant, x1 is the employee’s aptitude test score, and x2 is a dummy variable coded 1 for
involuntary assignment and 0 for voluntary assignment.
The difference between the average numbers of units produced by an employee who joined the new
plant voluntarily and by another employee who was involuntarily assigned to the new plant is:
A.
250.
B.
250.
C.
140.
D.
140.
16. Suppose that the sample regression equation of a second-order model is:
2
45.015.050.2
öxxy ++=
.
The value 2.50 is the:
A.
intercept where the response surface strikes the y-axis.
B.
intercept where the response surface strikes the x-axis.
C.
predicted value of y.
D.
predicted value of y when x = 1.
17. Suppose that the sample regression equation of a second-order model is
2
45.015.050.2
öxxy ++=
.
The value 4.60 is the:
A.
predicted value of y for any positive value of x.
B.
predicted value of y when x = 2.
C.
estimated change in y when x increases by 1 unit.
D.
intercept where the response surface strikes the x-axis.
18. If a qualitative independent variable has three possible categories, the number of dummy variables
needed to represent these categories is:
A.
4.
B.
3.
C.
2.
D.
1.
page-pf5
19. For the regression equation
2121 5152075
öxxxxy ++=
, a unit increase in
2
x
, while holding
1
x
constant at 1, changes the value of y on average by:
A.
5.
B.
5.
C.
10.
D.
10
20. The model y =
0 +
1x +
2x2 + … +
pxp +
is referred to as a polynomial model with:
A.
one predictor variable.
B.
p predictor variables.
C.
(p + 1) predictor variables.
D.
(p 1) predictor variables.
21. The following model
2
210 xxy
++=
+
is referred to as a:
A.
simple linear regression model.
B.
first-order model with one predictor variable.
C.
second-order model with one predictor variable.
D.
third-order model with two predictor variables.
22. When we plot x versus y, the graph of the model
2
210 xxy
++=
+
is shaped like a:
A.
straight line going upwards.
B.
straight line going downwards.
C.
circle.
D.
parabola.
23. The model
22110 xxy
++=
+
is referred to as a:
A.
first-order model with one predictor variable.
B.
first-order model with two predictor variables.
C.
second-order model with one predictor variable.
D.
second-order model with two predictor variables.
24. The following model
22110 xxy
++=
+
is used whenever the statistician believes that, on
average, y is linearly related to:
A.
1
x
, and the predictor variables do not interact.
B.
2
x
, and the predictor variables do not interact.
C.
Either A or B is correct.
D.
Both A and B are correct.
page-pf6
25. Suppose that the sample regression line of a first order model is
21 328
öxxy ++=
.
If we examine the relationship between y and
1
x
for four different values of
2
x
, we observe that the:
A.
effect of x
1
on y remains the same no matter what the value of x
2
.
B.
effect of x
1
on y remains the same no matter what the value of x
1
.
C.
only difference in the four equations produced is the coefficient of x
2
.
D.
Not enough information is given to answer this question.
26. The model
21322110 xxxxy
+++=
+
is referred to as a:
A.
first-order model with two predictor variables with no interaction.
B.
first-order model with two predictor variables with interaction.
C.
second-order model with three predictor variables with no interaction.
D.
second-order model with three predictor variables with interaction.
27. Suppose that the sample regression equation of a model is
2121 3410
öxxxxy ++=
.
If we examine the relationship between
1
x
and y for three different values of
2
x
, we observe that the:
A.
three equations produced differ only in the intercept.
B.
coefficient of
2
x
remains unchanged.
C.
coefficient of
1
x
varies.
D.
three equations produced differ not only in the intercept term but the coefficient of
1
x
,
also varies.
28. An indicator variable is a variable that can assume:
A.
one of two values (usually 0 and 1).
B.
one of three values (usually 0, 1 and 2).
C.
any number of values.
D.
None of the above answers is correct.
29. An indicator variable is also called:
A.
an independent variable.
B.
a dummy variable.
C.
a predictor variable.
D.
a dependent variable.
30. In general, to represent a qualitative independent variable that has m possible categories, we must
create:
A.
(m + 1) indicator variables.
B.
m indicator variables.
C.
(m 1) indicator variables.
D.
(m 2) indicator variables.
page-pf7
31. Stepwise regression is an iterative procedure that:
A.
adds one independent variable at a time.
B.
deletes one independent variable at a time.
C.
Either A or B is correct.
D.
Both A and B are correct.
32. In a stepwise regression procedure, if two independent variables are highly correlated, then:
A.
both variables will enter the equation.
B.
only one variable will enter the equation.
C.
neither variable will enter the equation.
D.
Not enough information is given to answer this question.
33. In a stepwise regression procedure, the independent variable with the largest F-statistic, or
equivalently with the smallest p- value, is chosen as the first entering variable. The standard, also
called the F-to-enter, is usually set at F equals:
A.
4.
B.
2.
C.
1.
D.
0.
34. In regression analysis, indicator variables allow us to use:
A.
quantitative variables.
B.
qualitative variables.
C.
only quantitative variables that interact.
D.
only qualitative variables that interact.
35. Which of the following is not an advantage of multiple regression as compared with analysis of
variance?
A.
Multiple regression can be used to estimate the relationship between the dependent
variable and independent variables.
B.
Multiple regression handles qualitative variables better than analysis of variance.
C.
Multiple regression handles problems with more than two independent variables better
than analysis of variance.
D.
All of the above are advantages of multiple regression as compared with analysis of
variance.
TRUE/FALSE
1. The graph of the model
2
0 1 2
ö
i i i
y x x
 
= + +
is shaped like a straight line going upwards.
page-pf8
2. Suppose that the sample regression equation of a model is
1 2 1 2
4 1.5 2
ö
y x x x x= + +
. If we examine
the relationship between
1
x
and y for four different values of
2
x
, we observe that the four equations
produced differ only in the intercept term.
3. The model
0 1 1 2 2
ö
y x x
 
= + +
is used whenever the statistician believes that, on average,
y
is
linearly related to
1
x
and
2
x
, and the predictor variables do not interact.
4. The model y =
0 +
1x +
2x2 + … +
pxp +
is referred to as a polynomial model with p predictor
variables.
5. Suppose that the sample regression line of a first-order model is
1 2
4 3 2
ö
y x x= + +
. If we examine the
relationship between y and
1
x
for three different values of
2
x
, we observe that the effect of
1
x
on
y
remains the same no matter what the value of
2
x
.
6. The model
0 1 1 2 2 3 1 2
ö
y x x x x
 
= + + +
is referred to as a second-order model with two predictor
variables with interaction.
7. In stepwise regression procedure, if two independent variables are highly correlated, then neither
variable will enter the equation.
8. The model
0 1 1 2 2
ö
y x x
 
= + +
is referred to as a first-order model with two predictor variables with
no interaction.
9. In regression analysis, indicator variables are also called dependent variables.
10. In general, to represent a nominal independent variable that has n possible categories, we would create
(n 1) dummy variables.
11. The model y =
0 +
1x +
is referred to as a simple linear regression model.
page-pf9
12. Stepwise regression is especially useful when there are many independent variables.
13. Stepwise regression is an iterative procedure that adds and deletes one independent variable at a time.
14. In a first-order model with two predictors,
1
x
and
2
x
, an interaction term may be used when the
relationship between the dependent variable
y
and the predictor variables is linear.
15. In the first-order regression model ŷ = 12 + 6x1 +8x2 + 4x1x2, a unit increase in x2 increases the value
of
y
on average by 8 units.
16. In explaining the amount of money spent on children’s toys during Christmas each year, the
independent variable ‘gender’ is best represented by a dummy variable.
17. In the first-order model ŷ = 60 + 40x1 10x2 + 5x1x2, a unit increase in x1, while holding x2 constant at
1, increases the value of
y
on average by 45 units.
18. In the first-order model
1 2 1 2
50 25 10 6
ö
y x x x x+ − −=
, a unit increase in
2
x
, while holding
1
x
constant
at a value of 3, decreases the value of
y
on average by 3 units.
19. In the first-order model ŷ = 8 + 3x1 +5x2, a unit increase in
2
x
, while holding
1
x
constant, increases the
value of
y
on average by 3 units.
20. In the first-order model
1 2 1 2
75 12 5 3
ö
y x x x x + −=
, a unit increase in
1
x
, while holding
2
x
constant
at a value of 2, decreases the value of
y
on average by 8 units.
21. Regression analysis allows the statistics practitioner to use mathematical models to realistically
describe relationships between the dependent variable and independent variables.
page-pfa
22. Suppose that the sample regression equation of a model is
1 2 1 2
4.7 2.2 2.6
ö
y x x x x= + +
. If we
examine the relationship between y and
2
x
for
1
x
= 1, 2 and 3, we observe that the three equations
produced not only differ in the intercept term, but the coefficient of
2
x
also varies.
23. An indicator variable (also called a dummy variable) is a variable that can assume either one of two
values (usually 0 and 1), where one value represents the existence of a certain condition, and the other
value indicates that the condition does not hold.
24. In regression analysis, a nominal variable such as season, with four different categories such as spring,
summer, autumn and winter, is best represented by three indicator variables to represent the four
seasons.
25. We interpret the coefficients in a multiple regression model by holding all variables in the model
constant.
SHORT ANSWER
1. A regression analysis was performed to study the relationship between a dependent variable and five
independent variables. The following information was obtained from the regression analysis:
20.80R=
, SSR = 9600, n = 40.
Develop the ANOVA table.
2. A regression analysis was performed to study the relationship between a dependent variable and five
independent variables. The following information was obtained from the regression analysis:
20.80R=
, SSR = 9600, n = 40.
Test the overall validity of the model at the 5% significant level.
page-pfb
3. A regression analysis involving 25 observations and four independent variables revealed that the total
variation in the dependent variable y is 1600 and that the mean squares for error is 20. Develop the
ANOVA table.
4. A regression analysis involving 25 observations and four independent variables revealed that the total
variation in the dependent variable y is 1600 and that the mean squares for error is 20. Test the validity
of the model at the 1% significance level.
5. Consider the following data for two variables, x and y.
x
7
10
3
5
3
10
4
14
5
8
y
35.0
28.5
45.0
45.0
55.0
25.0
37.5
27.5
30.0
27.5
Use Excel to develop an estimated regression equation of the form ŷ = b0 +b1x.
6. Consider the following data for two variables, x and y.
x
7
10
3
5
3
10
4
14
5
8
y
35.0
28.5
45.0
45.0
55.0
25.0
37.5
27.5
30.0
27.5
Use Excel to determine whether there is sufficient evidence at the 1% significance level to infer that
the relationship in the previous question is significant.
page-pfc
7. Consider the following data for two variables, x and y.
x
7
10
3
5
3
10
4
14
5
8
y
35.0
28.5
45.0
45.0
55.0
25.0
37.5
27.5
30.0
27.5
Use Excel to find the coefficient of determination. What does this statistic tell you about this simple
linear model?
8. Consider the following data for two variables, x and y.
x
7
10
3
5
3
10
4
14
5
8
y
35.0
28.5
45.0
45.0
55.0
25.0
37.5
27.5
30.0
27.5
Use Excel to develop a scatter diagram for the data. Does the scatter diagram suggest an estimated
regression equation of the form ŷ = b0 +b1x + b2x2? Explain.
page-pfd
9. Consider the following data for two variables, x and y.
x
7
10
3
5
3
10
4
14
5
8
y
35.0
28.5
45.0
45.0
55.0
25.0
37.5
27.5
30.0
27.5
Use Excel to develop an estimated regression equation of the form ŷ = b0 +b1x + b2x2..
10. Consider the following data for two variables, x and y.
x
7
10
3
5
3
10
4
14
5
8
y
35.0
28.5
45.0
45.0
55.0
25.0
37.5
27.5
30.0
27.5
Use Excel to determine whether there is sufficient evidence at the 1% significance level to infer that
the relationship between y, x and
2
x
in ŷ = 66.799 7.307x + 0.324x2 is significant.
page-pfe
11. Consider the following data for two variables, x and y.
x
7
10
3
5
3
10
4
14
5
8
y
35.0
28.5
45.0
45.0
55.0
25.0
37.5
27.5
30.0
27.5
Use Excel to find the coefficient of determination. What does this statistic tell you about this
curvilinear model?
12. Consider the following data for two variables, x and y.
x
7
10
3
5
3
10
4
14
5
8
y
35.0
28.5
45.0
45.0
55.0
25.0
37.5
27.5
30.0
27.5
Use the model in ŷ = 66.799 7.307x + 0.324x2 to predict the value of y when x = 10.
13. An avid football fan was in the process of examining the factors that determine the success or failure
of football teams. He noticed that teams with many rookies and teams with many veterans seem to do
quite poorly. To further analyse his beliefs, he took a random sample of 20 teams and proposed a
second-order model with one independent variable. The selected model is:
+++= 2
210 xxy
.
where
y = winning teams percentage.
x = average years of professional experience.
The computer output is shown below:
page-pff
THE REGRESSION EQUATION IS:
=y
2
48.096.56.32 xx +
Predictor
Coef
StDev
T
Constant
32.6
19.3
1.689
x
5.96
2.41
2.473
2
x
0.48
0.22
2.182
S = 16.1 R-Sq = 43.9%.
ANALYSIS OF VARIANCE
Source of Variation
df
SS
MS
F
Regression
2
3452
1726
6.663
Error
17
4404
259.059
Total
19
7856
Do these results allow us to conclude at the 5% significance level that the model is useful in predicting
the team’s winning percentage?
14. An avid football fan was in the process of examining the factors that determine the success or failure
of football teams. He noticed that teams with many rookies and teams with many veterans seem to do
quite poorly. To further analyse his beliefs, he took a random sample of 20 teams and proposed a
second-order model with one independent variable. The selected model is:
+++= 2
210 xxy
.
where
y = winning teams percentage.
x = average years of professional experience.
The computer output is shown below:
THE REGRESSION EQUATION IS:
=y
2
48.096.56.32 xx +
Predictor
Coef
StDev
T
Constant
32.6
19.3
1.689
x
5.96
2.41
2.473
2
x
0.48
0.22
2.182
S = 16.1 R-Sq = 43.9%.
page-pf10
ANALYSIS OF VARIANCE
Source of Variation
df
SS
MS
F
Regression
2
3452
1726
6.663
Error
17
4404
259.059
Total
19
7856
Test to determine at the 10% significance level if the linear term should be retained.
15. An avid football fan was in the process of examining the factors that determine the success or failure
of football teams. He noticed that teams with many rookies and teams with many veterans seem to do
quite poorly. To further analyse his beliefs, he took a random sample of 20 teams and proposed a
second-order model with one independent variable. The selected model is:
+++= 2
210 xxy
.
where
y = winning teams percentage.
x = average years of professional experience.
The computer output is shown below:
THE REGRESSION EQUATION IS:
=y
2
48.096.56.32 xx +
Predictor
Coef
StDev
T
Constant
32.6
19.3
1.689
x
5.96
2.41
2.473
2
x
0.48
0.22
2.182
S = 16.1 R-Sq = 43.9%.
ANALYSIS OF VARIANCE
Source of Variation
df
SS
MS
F
Regression
2
3452
1726
6.663
Error
17
4404
259.059
Total
19
7856
Test to determine at the 10% significance level whether the
2
x
term should be retained.
page-pf11
16. An avid football fan was in the process of examining the factors that determine the success or failure
of football teams. He noticed that teams with many rookies and teams with many veterans seem to do
quite poorly. To further analyse his beliefs, he took a random sample of 20 teams and proposed a
second-order model with one independent variable. The selected model is:
+++= 2
210 xxy
.
where
y = winning teams percentage.
x = average years of professional experience.
The computer output is shown below:
THE REGRESSION EQUATION IS:
=y
2
48.096.56.32 xx +
Predictor
Coef
StDev
T
Constant
32.6
19.3
1.689
x
5.96
2.41
2.473
2
x
0.48
0.22
2.182
S = 16.1 R-Sq = 43.9%.
ANALYSIS OF VARIANCE
Source of Variation
df
SS
MS
F
Regression
2
3452
1726
6.663
Error
17
4404
259.059
Total
19
7856
What is the coefficient of determination? Explain what this statistic tells you about the model.
17. A traffic consultant has analysed the factors that affect the number of traffic fatalities. She has come to
the conclusion that two important variables are the number of cars and the number of tractortrailer
trucks. She proposed the second-order model with interaction:
=y
++++++ 215
2
24
2
1322110 xxxxxx
.
page-pf12
Where:
y = number of annual fatalities per shire.
1
x
= number of cars registered in the shire (in units of 10 000).
2
x
= number of trucks registered in the shire (in units of 1000).
The computer output (based on a random sample of 35 shires) is shown below.
THE REGRESSION EQUATION IS
=y
21
2
2
2
121 13.051.015.161.73.117.69 xxxxxx ++
.
Predictor
Coef
StDev
T
Constant
69.7
41.3
1.688
1
x
11.3
5.1
2.216
2
x
7.61
2.55
2.984
2
1
x
1.15
0.64
1.797
2
2
x
0.51
0.20
2.55
21xx
0.13
0.10
1.30
S = 15.2 R-Sq = 47.2%.
ANALYSIS OF VARIANCE
Source of
Variation
df
SS
MS
F
Regression
5
5959
1191.800
5.181
Error
29
6671
230.034
Total
34
12 630
Is there enough evidence at the 5% significance level to conclude that the model is useful in predicting
the number of fatalities?
18. A traffic consultant has analysed the factors that affect the number of traffic fatalities. She has come to
the conclusion that two important variables are the number of cars and the number of tractortrailer
trucks. She proposed the second-order model with interaction:
=y
++++++ 215
2
24
2
1322110 xxxxxx
.
Where:
y = number of annual fatalities per shire.
1
x
= number of cars registered in the shire (in units of 10 000).
2
x
= number of trucks registered in the shire (in units of 1000).
page-pf13
The computer output (based on a random sample of 35 shires) is shown below.
THE REGRESSION EQUATION IS
=y
21
2
2
2
121 13.051.015.161.73.117.69 xxxxxx ++
.
Predictor
Coef
StDev
T
Constant
69.7
41.3
1.688
1
x
11.3
5.1
2.216
2
x
7.61
2.55
2.984
2
1
x
1.15
0.64
1.797
2
2
x
0.51
0.20
2.55
21xx
0.13
0.10
1.30
S = 15.2 R-Sq = 47.2%.
ANALYSIS OF VARIANCE
Source of
Variation
df
SS
MS
F
Regression
5
5959
1191.800
5.181
Error
29
6671
230.034
Total
34
12 630
Test at the 1% significance level to determine whether the
1
x
term should be retained in the model.
19. A traffic consultant has analysed the factors that affect the number of traffic fatalities. She has come to
the conclusion that two important variables are the number of cars and the number of tractortrailer
trucks. She proposed the second-order model with interaction:
=y
++++++ 215
2
24
2
1322110 xxxxxx
.
Where:
y = number of annual fatalities per shire.
1
x
= number of cars registered in the shire (in units of 10 000).
2
x
= number of trucks registered in the shire (in units of 1000).
The computer output (based on a random sample of 35 shires) is shown below.
THE REGRESSION EQUATION IS
=y
21
2
2
2
121 13.051.015.161.73.117.69 xxxxxx ++
.
page-pf14
Predictor
Coef
StDev
T
Constant
69.7
41.3
1.688
1
x
11.3
5.1
2.216
2
x
7.61
2.55
2.984
2
1
x
1.15
0.64
1.797
2
2
x
0.51
0.20
2.55
21xx
0.13
0.10
1.30
S = 15.2 R-Sq = 47.2%.
ANALYSIS OF VARIANCE
Source of
Variation
df
SS
MS
F
Regression
5
5959
1191.800
5.181
Error
29
6671
230.034
Total
34
12 630
Test at the 1% significance level to determine whether the
2
x
term should be retained in the model.
20. A traffic consultant has analysed the factors that affect the number of traffic fatalities. She has come to
the conclusion that two important variables are the number of cars and the number of tractortrailer
trucks. She proposed the second-order model with interaction:
=y
++++++ 215
2
24
2
1322110 xxxxxx
.
Where:
y = number of annual fatalities per shire.
1
x
= number of cars registered in the shire (in units of 10 000).
2
x
= number of trucks registered in the shire (in units of 1000).
The computer output (based on a random sample of 35 shires) is shown below.

Trusted by Thousands of
Students

Here are what students say about us.

Copyright ©2022 All rights reserved. | CoursePaper is not sponsored or endorsed by any college or university.