Book Title
Basic Marketing Research: Using Microsoft Excel Data Analysis 3rd Edition

978-0135078228 Chapter 14 Lecture Note

June 7, 2019
Determining Relationships
To learn what is meant by a “relationship” between two variables
To understand when and how cross-tabulations with chi-square analysis are applied
To become knowledgeable about the use and interpretation of correlations
To learn about the application and interpretation of regression analysis
To become proficient in the use of the XL Data Analyst to execute various types of
relationship analyses
What Is a Relationship Between Two Variables?
Categorical Variables Relationships
Cross-Tabulation Analysis
Types of Frequencies and Percentages in a Cross-Tabulation Table
Chi-Square Analysis of a Cross-Tabulation Table
How to Present a Significant Cross-Tabulation Finding
How to Perform Cross-Tabulation Analysis with the XL Data Analyst
Correlation: Assessing Metric Variables Relationships
Correlation Coefficients and Covariation
Statistical Significance of a Correlation
Rules of Thumb for Correlation Strength
The Pearson Product Moment Correlation Coefficient
How to Perform Correlation Analysis with the XL Data Analyst
How to Present Correlation Findings
Regression Analysis
Computing the Intercept and Slope for Bivariate Regression
Testing for Statistical Significance of the Intercept and the Slope
Making a Prediction with Bivariate Regression Analysis
Multiple Regression Analysis
Working with Multiple Regression
Using “Dummy” Independent Variables
Three Uses of Multiple Regression
How to Use the XL Data Analyst to Perform Regression Analysis
How to Present Regression Analysis Findings
Final Comments on Multiple Regression Analysis
Flow Chart on Relationship Analyses
Bivariate regression analysis
Chi-square analysis
Coefficient of determination
Column percentages table
Correlation coefficient
Linear relationship
Multiple R
Multiple regression analysis
Null hypothesis for a correlation
“Observed frequencies”
Pearson product moment correlation
Regression analysis
Cross-tabulation analysis
Cross-tabulation cell
Cross-tabulation table
Dependent variable
Dummy independent variable
“Expected frequencies”
Frequencies table
Independent variable
Least squares criterion
Row percentages table
R-Square value
Scatter diagram
Screening device
Standard error of the estimate
Standardized beta coefficient
Straight line formula
1. In this edition, we dropped the notion of the use of Boolean operators to describe how
cross-tabulations are determined. Instructors may retain this approach in their class
presentations as it utilizes a concept that students know and use (AND, OR, etc. in search
engines) with a concept that is new to them (cross-tabulations of categorical variables). The
description relies heavily on students’ understanding that the category or group labels are
used. It is recommended that instructors strongly emphasize that group labels are being used
throughout their coverage of cross-tabulations.
2. Cross-tabulation with Chi-square analysis is a nonparametric statistical concept.
Nonparametic procedures, of course, are very different from parametric statistical techniques.
Instructors who are comfortable with them may wish to delve deeper into nonparametric
statistical techniques (assumptions, distributions, scales, etc.) as a way of helping students
keep cross-tabulation analysis separate from correlation or other analyses.
3. Experience has taught that students will become confused with initial encounters with the
various types of frequencies and percentages possible in cross-tabulation tables. The chapter
takes students step-by-step through these slowly; however, Instructors should consider using
class time to review the various steps so students will gain a conceptual understanding of
what each type is and how it is used. At the very least, students should understand how row
and column percentages are useful in identifying the underlying association once statistical
significance is found. This fact is underlined by the XL Data Analyst, which provides row
and column percentages only if the cross-tabulation relationship is significant at the 95
percent level of confidence.
4. The null hypothesis is omnipresent in associative analysis tests, and it is the foundation for
practically all statistical tests. We recommend that Instructors continually remind students of
the null hypothesis of no association as they review the various associative analysis tests. It
may be worthwhile to remind students that the null hypothesis is present in statistical
inference tests such as t-tests (no difference between the means of the two groups), or
analysis of variance (no difference between any two group means). For Instructors’
information, the null hypothesis concept is emphasized in the next chapter, particularly with
descriptions of bivariate and multiple regression analyses.
5. For Instructors who want their student to actually compute statistics, consider end-of-chapter
questions numbers 14 and 15. Number 14 provides observed frequencies in a cross-tabulation
along with the computed Chi-square value. It has only four cells. The answer to question 14
below has the XL Data Analyst cross-tabulation output with the row and cell percentages, so
Instructors will be spared the requirement to compute them. Question 15 will require students
to set up the cross-tabulation table and to determine the observed frequencies, plus do all
other computations.
6. For Instructors who want to emphasize the scatter diagram interpretation of a correlation,
consider using Application Question 16. The correlation matrix for all five variables is
provided in the answer to question 16 below as are all the scatter diagrams. Students should
be able to build the data set in Excel for Windows quickly, and they can have Excel create all
possible scatter diagrams.
7. There are many nuances to regression analysis not treated in this chapter’s introduction to the
topic. The intent is to describe the basic concepts and to have students identify their related
values on a printout. Instructors should be aware that in no way will students become more
than fundamentally knowledgeable about regression, running it and interpreting the findings.
8. A constraint in Excel is that it will allow no more than 16 independent variables when
performing regression analysis. When students (or instructors) are confronted with a larger
number of independent variables, we recommend the following approach.
a. Identify groups of independent variables such as demographics, lifestyle
characteristics, usage and/or behavioral variables, performance evaluations and/or
satisfaction variables, etc.
b. Use regression analysis to identify the statistically significant independent variables
in the first group (such as usage and/or behavioral variables).
c. Add the next group to these variables (such as lifestyle characteristics) and determine
the statistically significant independent variables. (It is necessary to rerun the
regression each time as the addition of new independent variables affects the outcome
every time due to changes in sample size, non-inclusion of missing observations, etc.)
d. Continue with this approach until the final set of statistically significant independent
variables are determined.
9. When running multiple regression, the XL Data Analyst checks for multicolinearity among
the independent variables. That is, a requirement of multiple regression is that there is not
high correlation between any two independent variables. Using a “tolerance” test, the
program will detect the level of multicolinearity and issue a warning if the level is too high.
Processing stops. Users should systematically find the two independent variables with the
highest correlation and eliminate one from the multiple regression analysis. This approach
may take several iterations if several independent variables are highly correlated.