978-1111826925 Chapter 20 Lecture Note

Authors: Barry J. Babin, Jon C. Carr, Mitch Griffin, William G. Zikmund

Chapter 20
Basic Data Analysis: Descriptive Statistics
AT-A-GLANCE
I. The Nature of Descriptive Statistics
II. Tabulation
III. Cross Tabulation
A. Contingency table
B. Percentage cross-tabulations
C. Elaboration and refinement
D. How many cross-tabulations?
E. Quadrant analysis
IV. Data Transformations
A. Simple transformations
B. Problems with data transformations
C. Index numbers
V. Calculating Rank Order
VI. Tabular and Graphic Methods of Displaying Data
VII. Computer Programs for Analysis
A. Statistical packages
B. Computer graphics and computer mapping
VIII. Interpretation
LEARNING OUTCOMES
1. Know what descriptive statistics are and why they are used
2. Create and interpret simple tabulation tables
3. Understand how cross-tabulations can reveal relationships
4. Perform basic data transformations
5. List different computer software products designed for descriptive statistical analysis
6. Understand a researcher’s role in interpreting the data
CHAPTER VIGNETTE: Choose Your “Poison”?
Most Americans enjoy an adult beverage occasionally, but not all like the same drink. Researchers could
apply sophisticated statistics to address questions related to drinking preferences, but a lot can be learned
from just counting what people are buying. For example, a grocery store allocates a certain percentage of
space to different adult beverages (i.e., beer, spirits, and wine), but the customer base has changed. In
1992, American consumers showed a heavy preference toward beer, but in 2005, preferences changed
more in favor of wine. Also, the types of stores where consumers purchase adult beverages have shifted.
Younger consumers tend to purchase beer from convenience stores, but wine consumers are more
attractive from several perspectives (i.e., they are more likely to buy products like prime choice beef and
imported cheeses).
SURVEY THIS!
Students are asked to draw some conclusions about which job is most attractive: (1) Calculate the number of
respondents who rank each profession as the most attractive (assign it a 1). Report this tabulation. (2) Do
you think female and male respondents respond similarly to this item? Try to create the appropriate
cross-tabulation table.
RESEARCH SNAPSHOTS
Our Four-Legged Family Members
Many families have a pet, but what does it mean to be a “member of the family,” and do men or
women treat their four-legged friends as family members in different ways? This would suggest a
set of contingent arguments, that is, does treating a pet as a member of the family depend upon
whether you are a male or female? The Harris Interactive Poll conducted a stratified survey of
over 2,000 adults across the United States. A 2x2 contingency table of males’ and females’
answers to the question, “Do you consider your pet to be a member of your family?” is given.
Women are much more likely to see their pet as a part of the family. The second contingent
question shows that females are more likely to have their pet sleep in the bed and buy them a
holiday present, but males appear more likely to buy a birthday present, cook something special
for them, and take them to work.
Twitter and the ReTweetability Index
Twitter is one of the fastest growing social networks and has evolved into a real-time messaging
service compatible with several different networks and multiple devices. Simplicity is the key to
Twitter’s success. It asks one question, “What are you doing?” Answers must be under 140
characters and can be sent via mobile texting, instant message, or the web. Some Twitter
terminology: tweet, followers, retweet (or RT). A retweet occurs when a follower takes a tweet
and then tweets that message to everyone in their own Twitter network. Encouraging others to
retweet your message is the key to spreading your message across the Twitterspace. The
Retweetability index provides a score and ranking of Twitter users based on the power of their
tweets: the higher the number, the more influential a Twitter user you are!
OUTLINE
I. THE NATURE OF DESCRIPTIVE STATISTICS
Can summarize responses from large numbers of respondents in a few simple statistics.
Sample descriptive statistics are used to make inferences about characteristics of the entire
population of interest.
Simple but powerful and are used very widely.
Descriptive analysis is the elementary transformation of data in a way that describes the
basic characteristics such as central tendency, distribution, and variability.
Means, medians, modes, variance, range and standard deviation typify widely applied
descriptive statistics.
Exhibit 20.1 shows how the level of scale measurement influences the choice of descriptive
statistics.
A histogram is a graphical way of showing a frequency distribution in which the height of a
bar corresponds to the frequency of a category.
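The statistics named above can be computed with Python's standard library. This is a minimal sketch; the `ratings` list is hypothetical data on an assumed 1-to-7 scale, not data from the chapter.

```python
import statistics

# Hypothetical ratings on a 1-7 scale (illustrative only).
ratings = [2, 4, 4, 5, 5, 5, 6, 7]

mean = statistics.mean(ratings)            # central tendency: arithmetic average
median = statistics.median(ratings)        # central tendency: middle value
mode = statistics.mode(ratings)            # central tendency: most frequent value
value_range = max(ratings) - min(ratings)  # variability: highest minus lowest
variance = statistics.variance(ratings)    # variability: sample variance (n - 1)
std_dev = statistics.stdev(ratings)        # variability: sample standard deviation

print(mean, median, mode, value_range, round(variance, 3), round(std_dev, 3))
```

The sample (n - 1) formulas are used here because the notes describe sample statistics used to infer population characteristics.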
II. TABULATION
Tabulation refers to the orderly arrangement of data in a table or other summary format.
When done by hand, the term tallying is used.
Counting the different ways respondents answered a question and arranging them in a simple
tabular form yields a frequency table.
The actual number of responses to each category is a variable’s frequency distribution.
A simple tabulation of this type is sometimes called a marginal tabulation.
Simple tabulation tells the researcher how frequently each response occurs, and this starting
point for analysis requires the researcher to count responses or observations for each category
or code assigned to a variable.
The frequency column shows the tally result or the number of respondents for each
category.
The percent column shows the total percentage in each category.
The cumulative percentage shows the percentage indicating either a particular category or
any preceding category.
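A frequency table with the three columns described above might be built as follows. This is a sketch only; the `responses` list, the category order, and the function name `frequency_table` are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical answers to one categorical survey question.
responses = ["agree", "neutral", "agree", "disagree", "agree",
             "neutral", "agree", "disagree", "agree", "neutral"]
category_order = ["disagree", "neutral", "agree"]

def frequency_table(values, categories):
    """Return (category, frequency, percent, cumulative percent) rows."""
    counts = Counter(values)          # tally of responses per category
    total = len(values)
    rows = []
    cumulative = 0.0
    for category in categories:
        freq = counts[category]
        pct = 100.0 * freq / total
        cumulative += pct             # this category plus all preceding ones
        rows.append((category, freq, pct, cumulative))
    return rows

table = frequency_table(responses, category_order)
for category, freq, pct, cum in table:
    print(f"{category:<10}{freq:>5}{pct:>8.1f}{cum:>8.1f}")
```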
III. CROSS TABULATION
The mere tabulation of data may answer many research questions, and as long as a question
deals with only one categorical variable, tabulation is probably the best approach.
Cross-tabulation is the appropriate technique for addressing research questions involving
relationships among multiple less-than-interval variables.
One key to interpreting a cross-tabulation table is comparing the observed table values with
hypothetical values that would result from pure chance.
Contingency Tables
Exhibit 20.3 shows example cross-tabulation results using contingency tables.
A contingency table is a data matrix that displays the frequency of some combination of
possible responses to multiple variables.
Two-way contingency tables (i.e., those involving two less-than-interval variables) are used most
often.
Beyond three variables, contingency tables become difficult to analyze and explain easily.
The row and column totals are often called marginals, because they appear in the table’s
margin.
Researchers usually are more interested in the inner cells of a contingency table, which
display conditional frequencies (combinations).
Any cross-tabulation table may be classified according to the number of rows by the
number of columns (R by C: e.g., 2 x 2 table for two variables with two levels each or 3
x 4 table for two variables, one with three levels and the other with four).
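As a rough illustration, a 2 x 2 contingency table, its marginals, and the cell counts expected under pure chance can be tallied as below. The `observations` data and the function names are hypothetical. The expected count uses the standard formula (row total times column total, divided by the number of observations), which the notes do not state explicitly.

```python
from collections import Counter

# Hypothetical (gender, answer) pairs for 80 respondents.
observations = (
    [("female", "yes")] * 30 + [("female", "no")] * 10
    + [("male", "yes")] * 20 + [("male", "no")] * 20
)

def contingency_table(pairs):
    """Return inner cell counts, row marginals, column marginals, and n."""
    cells = Counter(pairs)                            # conditional frequencies
    row_totals = Counter(row for row, _ in pairs)     # row marginals
    col_totals = Counter(col for _, col in pairs)     # column marginals
    return cells, row_totals, col_totals, len(pairs)

def expected_count(row_total, col_total, n):
    """Cell count expected if the two variables were unrelated."""
    return row_total * col_total / n

cells, row_totals, col_totals, n = contingency_table(observations)
expected_female_yes = expected_count(row_totals["female"], col_totals["yes"], n)
print(cells[("female", "yes")], expected_female_yes)
```

Comparing the observed 30 with the expected 25 is the kind of observed-versus-chance comparison the notes describe for interpreting a cross-tabulation.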
Percentage Cross-Tabulations
When data from a survey are cross-tabulated, percentages help the researcher understand
the nature of the relationship by making relative comparisons simpler.
The total number of respondents or observations may be used as a statistical base for
computing the percentage in each cell.
One of the questions is commonly chosen as a base for computing percentages.
Selecting either the row percentages or the column percentages will emphasize a
particular comparison or distribution.
The nature of the problem the researcher wishes to answer will determine which
marginal total will serve as a base for computing percentages.
There is a conventional rule for determining the direction of percentages if the
researcher has identified which variable is the independent variable and which is the
dependent variable – independent variables should form the rows and the margin total of
the independent variable should be used as the base for computing the percentages.
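The convention above, with the independent variable in the rows and its marginal total as the base, can be sketched as row percentages. The cell counts and the name `row_percentages` are hypothetical.

```python
def row_percentages(cells):
    """cells maps (row, column) -> count; returns percentages based on row totals."""
    row_totals = {}
    for (row, _), count in cells.items():
        row_totals[row] = row_totals.get(row, 0) + count
    return {
        (row, col): 100.0 * count / row_totals[row]
        for (row, col), count in cells.items()
    }

# Hypothetical counts: gender is treated as the independent (row) variable.
cells = {
    ("female", "yes"): 30, ("female", "no"): 10,
    ("male", "yes"): 20, ("male", "no"): 20,
}
pct = row_percentages(cells)
print(pct[("female", "yes")], pct[("male", "yes")])
```

Each row sums to 100 percent, so the "yes" rates of the two groups can be compared directly even if the groups differ in size.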
Elaboration and Refinement
Once the basic relationship between two variables has been examined, the researcher may
wish to investigate this relationship under a variety of different conditions.
Typically, a third variable is introduced into the analysis to elaborate and refine the
researcher’s understanding by specifying the conditions under which the relationship
between the first two variables is strongest and weakest.
Elaboration analysis involves the basic cross-tabulation within various subgroups of the
sample.
The researcher breaks down the analysis for each level of another variable.
Interactions between variables examine moderating variables.
A moderator variable is a third variable that changes the nature of a relationship
between the original independent and dependent variables.
In other situations the adding of a third variable to the analysis may lead us to reject the
original conclusion about the relationship.
When this occurs, the elaboration analysis suggests the relationship between the
original variables is spurious (see Chapter 3).
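One way to picture elaboration analysis is to repeat the basic cross-tabulation within each level of a third variable. In the sketch below, the records, the variable names, and the helper functions are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical records: (age_group, gender, answer).
records = (
    [("under 40", "female", "yes")] * 8 + [("under 40", "female", "no")] * 2
    + [("under 40", "male", "yes")] * 4 + [("under 40", "male", "no")] * 6
    + [("40 and over", "female", "yes")] * 5 + [("40 and over", "female", "no")] * 5
    + [("40 and over", "male", "yes")] * 5 + [("40 and over", "male", "no")] * 5
)

def elaborate(rows):
    """Build one (independent, dependent) cross-tab per level of the third variable."""
    tables = defaultdict(Counter)
    for third, independent, dependent in rows:
        tables[third][(independent, dependent)] += 1
    return tables

def percent_yes(table, group):
    """Row percentage answering 'yes' within one level of the independent variable."""
    yes = table[(group, "yes")]
    no = table[(group, "no")]
    return 100.0 * yes / (yes + no)

tables = elaborate(records)
for level, table in tables.items():
    print(level, percent_yes(table, "female"), percent_yes(table, "male"))
```

In this made-up data the gender difference appears only in the younger subgroup, which is the pattern a moderator variable would produce.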
How Many Cross-Tabulations?
Surveys may ask dozens of questions and hundreds of categorical variables can be stored
in a data warehouse.
A researcher addressing an exploratory research question may find some benefit in such a
fishing expedition.
Computer-assisted researchers can “fish” for relationships by cross-tabulating every
categorical variable with every other categorical variable.
CHAID (chi-square automatic interaction detection) software exemplifies software
that makes searches through large numbers of variables possible.
Data-mining can be conducted in a similar fashion and may suggest relationships that
are worth considering further.
Outside of exploratory research, researchers should conduct cross-tabs that address
specific research questions or hypotheses, and when categorical variables are involved,
cross-tabs are the right tool.
Quadrant Analysis
Quadrant analysis is a variation of cross-tabulation in which responses to two rating
scale questions are plotted in four quadrants of a two-dimensional table.
Most quadrant analysis in research portrays or plots the relationship between the average
responses about a product attribute’s importance and average ratings of a company’s (or
brand’s) performance on that product feature.
Sometimes the term importance-performance analysis is used because consumers rate
perceived importance of several attributes and rate how well the company’s brand
performs on that attribute.
Generally speaking, the business would like to end up in the quadrant indicating high
performance on an important attribute.
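A simple importance-performance placement could look like the following sketch. The attribute ratings are hypothetical, and using the scale midpoint (4 on an assumed 1-to-7 scale) as the quadrant boundary is an assumption; the notes do not specify where the boundaries fall.

```python
from statistics import mean

SCALE_MIDPOINT = 4  # assumed quadrant boundary for a 1-7 rating scale

# Hypothetical ratings per attribute.
ratings = {
    "price":   {"importance": [6, 7, 6], "performance": [3, 2, 4]},
    "service": {"importance": [6, 6, 7], "performance": [6, 5, 7]},
    "parking": {"importance": [2, 3, 2], "performance": [6, 6, 5]},
    "decor":   {"importance": [2, 2, 3], "performance": [3, 2, 2]},
}

def quadrant(importance, performance, cutoff=SCALE_MIDPOINT):
    """Label the quadrant for one attribute's average importance and performance."""
    imp = "high" if importance >= cutoff else "low"
    perf = "high" if performance >= cutoff else "low"
    return f"{imp} importance, {perf} performance"

placement = {
    attribute: quadrant(mean(r["importance"]), mean(r["performance"]))
    for attribute, r in ratings.items()
}
for attribute, label in placement.items():
    print(attribute, "->", label)
```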
IV. DATA TRANSFORMATIONS
Simple Transformations
Data transformation (also called data conversion) is the process of changing data from
their original form to a format suitable for performing a data analysis that will achieve the
research objectives.
Researchers often modify the values of scalar data or create new variables.
For example, asking respondents for their year of birth may produce less response
bias than if asked for their age, and the researcher can simply transform the data.
Recoding and creating summated scales are common data transformations.
Collapsing or combining adjacent categories of a variable is a common form of data
transformation used to reduce the number of categories.
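The three transformations mentioned above (deriving age from year of birth, collapsing adjacent categories, and creating a summated scale) can be sketched as below. The five-point coding, the `COLLAPSE` mapping, and the function names are assumptions for illustration.

```python
def age_from_birth_year(birth_year, survey_year):
    """Approximate age; exact age would also need the birth date."""
    return survey_year - birth_year

# Assumed 5-point agreement scale collapsed into three categories.
COLLAPSE = {1: "disagree", 2: "disagree", 3: "neutral", 4: "agree", 5: "agree"}

def collapse(score):
    """Recode a 5-point response by combining adjacent categories."""
    return COLLAPSE[score]

def summated_scale(item_scores):
    """Create a new variable by summing a respondent's scores on several items."""
    return sum(item_scores)

print(age_from_birth_year(1990, 2022))
print([collapse(s) for s in [1, 2, 3, 4, 5]])
print(summated_scale([4, 5, 3]))
```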
Problems With Data Transformations
Researchers often perform a median split to collapse a scale with multiple response
points into two categories.
The median split means respondents below the observed median go into one category
and respondents above the median go into another.
This approach is best applied only when the data do indeed exhibit bimodal
characteristics.
When a sufficient number of responses exist and a variable is ratio, the researcher may
choose to delete one-fourth to one-third of the responses around the median to effectively
ensure a bimodal distribution.
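A median split as described above might be coded like this. The scores are hypothetical, and leaving responses that equal the median unassigned is an assumption, since the notes only say where respondents below and above the median go.

```python
from statistics import median

def median_split(scores):
    """Return the observed median and a 'low'/'high' label for each score."""
    cut = median(scores)
    labels = []
    for score in scores:
        if score < cut:
            labels.append("low")
        elif score > cut:
            labels.append("high")
        else:
            labels.append(None)  # assumption: ties at the median stay unassigned
    return cut, labels

# Hypothetical responses on a multi-point scale.
scores = [1, 2, 2, 3, 5, 6, 6, 7]
cut, labels = median_split(scores)
print(cut, labels)
```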
Index Numbers
Index numbers represent simple data transformations that allow researchers to track a
variable’s value over time and compare a variable with other variables.
Recalibration allows scores or observations to be related to a certain base period or base
number.
If the data are time-related, a base year is chosen.
Index numbers are computed by dividing each year’s activity by the base-year activity
and multiplying by 100.
Require ratio measurement scales.
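The index-number calculation (each year's activity divided by the base-year activity, multiplied by 100) can be sketched as follows. The `sales` figures and the choice of base year are hypothetical.

```python
def index_numbers(series, base_year):
    """Express each year's value relative to the base year (base year = 100)."""
    base = series[base_year]
    return {year: 100.0 * value / base for year, value in series.items()}

# Hypothetical yearly activity measured on a ratio scale.
sales = {2019: 200, 2020: 220, 2021: 250}
index = index_numbers(sales, base_year=2019)
print(index)
```

An index of 125 for 2021 reads as activity 25 percent above the base year.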
V. CALCULATING RANK ORDER
Ranking data can be summarized by performing a data transformation.
The transformation involves multiplying the frequency by the ranking score for each choice
to result in a new scale.
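The rank-order transformation can be sketched by multiplying each rank by the number of respondents who gave it and summing per choice. The frequencies are hypothetical, and treating the lowest total as the most preferred choice assumes that a rank of 1 means "best".

```python
def rank_totals(rank_frequencies):
    """rank_frequencies maps choice -> {rank: number of respondents giving that rank}."""
    return {
        choice: sum(rank * freq for rank, freq in freqs.items())
        for choice, freqs in rank_frequencies.items()
    }

# Hypothetical rankings of three choices by 10 respondents.
rank_frequencies = {
    "A": {1: 5, 2: 3, 3: 2},
    "B": {1: 3, 2: 4, 3: 3},
    "C": {1: 2, 2: 3, 3: 5},
}
totals = rank_totals(rank_frequencies)
# Assumption: rank 1 is the most preferred, so the smallest total ranks first.
overall_order = sorted(totals, key=totals.get)
print(totals, overall_order)
```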
VI. TABULAR AND GRAPHIC METHODS OF DISPLAYING DATA
Tables, graphs and charts may simplify and clarify data.
Graphical representations of data may take a number of forms, ranging from a computer
printout to an elaborate pictograph.
Tables, graphs, and charts all facilitate summarization and communication.
Today’s researcher has many convenient tools to quickly produce charts, graphs, or tables
(i.e., even basic word processing programs like Word include chart functions).
Bar charts (histograms), pie charts, curve/line diagrams and scatter plots are among the most
widely used tools.
Chapter 25 discusses how these and other graphic aids may improve the communication
value of a written report or oral presentation.
VII. COMPUTER PROGRAMS FOR ANALYSIS
Statistical Packages
Unlike in the past, computing power is seldom a barrier to completing a research
project today.
Today, most spreadsheet packages (e.g., Excel) can perform a wide variety of basic
statistical operations.
Most of the basic statistical features are now menu driven, reducing the need to memorize
function labels.
Despite the advances in spreadsheet applications, commercialized statistical software
packages remain extremely popular among researchers.
Statistical packages are more tailored to the types of analyses performed by statistical
analysts.
SAS has been widely used in engineering and other technical fields.
SPSS is commonly used by university business and social science students.
Business researchers have used SPSS more than any other statistical software tool.
Economists sometimes favor MINITAB, which traditionally has not been viewed as
user friendly relative to other choices.
All the major software packages can work from data entered into a spreadsheet, which
can be imported into the data windows or simply read by the program.
Computer Graphics and Computer Mapping
Graphic aids prepared by computers have practically replaced graphic presentation aids
drawn by artists.
They are extremely useful for descriptive analysis.
Many computer maps are used by business executives to show locations of high-quality
customer segments, and competitors’ locations are often overlaid for additional quick and
easy visual reference.
Many computer programs can draw box and whisker plots, which provide graphic
representations of central tendencies, percentiles, variabilities, and the shapes of
frequency distributions (see Exhibit 20.15).
The response categories are shown on the vertical axis.
The small box inside the plot represents responses for half of all respondents.
This gives a measure of variability called the interquartile range, but the term
midspread is less complex and more descriptive.
The location of the line within the box indicates the median.
The dashed lines that extend from the top and bottom of the box are the whiskers, and
each whisker extends either the length of the box or to the most extreme observation
in that direction.
An outlier is a value that lies outside the normal range of the data.
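The numbers behind a box and whisker plot can be computed as in this sketch. The data are hypothetical. The whisker rule follows the description above (a whisker is at most one box length long), and quartile conventions vary between programs, so the quartiles from `statistics.quantiles` are one reasonable choice rather than the only one.

```python
import statistics

def box_plot_summary(values):
    """Return quartiles, interquartile range, whisker ends, and outliers."""
    data = sorted(values)
    q1, q2, q3 = statistics.quantiles(data, n=4)  # default quartile method
    iqr = q3 - q1                                 # interquartile range (midspread)
    lower_limit = q1 - iqr                        # whisker is at most one box length
    upper_limit = q3 + iqr
    inside = [v for v in data if lower_limit <= v <= upper_limit]
    outliers = [v for v in data if v < lower_limit or v > upper_limit]
    return {
        "q1": q1, "median": q2, "q3": q3, "iqr": iqr,
        "whiskers": (min(inside), max(inside)),   # most extreme values within limits
        "outliers": outliers,
    }

# Hypothetical responses with one extreme value.
summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(summary)
```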
VIII. INTERPRETATION
In research, the interpretation process explains the meaning of the data.
Interpretation is drawing inferences from the analysis results.
Inferences drawn from interpretations lead to managerial implications.
The logical interpretation of the data and statistical analysis are closely intertwined.
From a management perspective, the qualitative meaning of the data and their managerial
implications are an important aspect of interpretation.
Interpretation is crucial, but the process is difficult to explain in a textbook because there is
no one best way to interpret data.
Data are sometimes merely reported and not interpreted.
At the other extreme, some researchers tend to analyze every possible relationship between
each and every variable in the study, which is a sign that the research problem was not
adequately defined.
