Chapter 06 – Training Evaluation
© 2013 by McGraw-Hill Education. This is proprietary material solely for authorized instructor use. Not authorized for sale or distribution in
any manner. This document may not be copied, scanned, duplicated, forwarded, distributed, or posted on a website, in whole or part.
CHAPTER 6
TRAINING EVALUATION
This chapter provides an overview of how to evaluate training programs, including the types of
outcomes that need to be measured and the types of evaluation designs available. The chapter
focuses on the evaluation of training programs and learner outcomes. It explains the criticality of
evaluating whether the training has accomplished its objectives and, particularly, whether job
performance and organizational results have improved as a result. Formative and summative
evaluation are discussed and compared and reasons for evaluating are identified. The process of
evaluating training is outlined and outcomes used to evaluate training are described in some
detail. Kirkpatrick’s five-level framework of evaluation is
highlighted, and the six major categories of outcomes are presented more extensively. Another
important issue, regarding how good the designated outcomes are, is addressed. Perhaps most
importantly, evaluation designs, important elements of evaluation design, and the preservation of
internal validity are discussed as well as the calculation of return on investment for the training
dollar. In an environment of accountability, knowledge of how to show return on investment is
invaluable. Further, this chapter gives the student knowledge of the various evaluation strategies
and how to choose an approach. A list of key terms, discussion questions, and application
assignments follows at the end of the chapter.
Objectives
1. Explain why evaluation is important.
2. Identify and choose outcomes to evaluate a training program.
3. Discuss the process used to plan and implement a good training evaluation.
4. Discuss the strengths and weaknesses of different evaluation designs.
5. Choose the appropriate evaluation design based on the characteristics of the company and the
importance and purpose of the training.
6. Conduct a cost-benefit analysis for a training program.
7. Explain the role of workforce analytics and dashboards in determining the value of training
practices.
I. Introduction
A. Training effectiveness refers to the benefits that the company and the trainees experience
as a result of training. Benefits for the trainees include learning new knowledge, skills,
and behaviors. Potential benefits for the company include increased sales, improved
quality and more satisfied customers.
B. Training outcomes or criteria refer to measures that the trainer and the company use to
evaluate training programs.
C. Training evaluation refers to the process of collecting data regarding outcomes needed to
determine whether training is effective.
D. Evaluation design refers to the collection of information (including what, when, how,
and from whom) that will be used to determine the effectiveness of the training program.
II. Reasons for Evaluating Training
A. Because companies have made large dollar investments in training and education and
view training as a strategy to be successful, they expect the outcomes or benefits related
to training to be measurable.
B. Training evaluation provides a way to understand the returns that training investments
produce and provides information needed to improve training.
Formative Evaluation
Formative evaluation refers to the evaluation of training that takes place during program
design and development. It is conducted to improve the training process, ensuring that the
training program is well-organized and runs smoothly and trainees are learning and are
satisfied with the training.
A. As a result of the formative evaluation, training content may be changed to be more
accurate, easier to understand, or more appealing; the training method can be adjusted to
improve learning.
B. Introducing the training program as early as possible to managers and customers helps in
getting them to buy into the program, which is critical for their role in helping employees
learn and transfer skills; it also allows their concerns to be addressed before the program
is implemented.
C. Pilot testing is the process of previewing a training program with potential trainees and
their managers, or other customers. The pilot testing group is then asked to provide
feedback about the content of the training as well as the methods of delivery. This
feedback enables the trainer to make needed improvements to the training.
Summative Evaluation
Summative evaluation is evaluation conducted to determine the extent to which trainees have
improved or acquired knowledge, skills, attitudes, behaviors, or other outcomes specified in
the learning objectives, as a result of the training. Reasons training programs should be
evaluated:
A. To identify the program’s strengths and weaknesses, including whether the program is
meeting the learning objectives, the quality of the learning environment, and whether transfer of
training back to the job is occurring.
B. To assess whether the various features of the training context and content contribute to
learning and the transfer of learning back to the job.
C. To identify which trainees benefited most or least from the program and why.
D. To gather information, such as trainees’ testimonials, to use for marketing training
programs.
E. To determine financial benefits and costs of the program.
F. To compare the costs and benefits of training versus other human resource investments.
G. To compare the costs and benefits of various training programs in order to choose the
most effective programs.
III. Overview of the Evaluation Process
A. The evaluation process should begin with determining training needs. Needs assessment
helps identify what knowledge, skills, behavior, or other learned capabilities are needed.
B. The next step in the process is to identify specific, measurable training objectives to
guide the program.
C. Based on the learning objectives and analysis of transfer of training, outcome measures
are designed to assess the extent to which learning and transfer have occurred.
D. Once the outcomes have been identified, the next step is to determine an evaluation
strategy. Factors such as expertise, how quickly the information is needed, change
potential, and the organizational culture should be considered in choosing a design.
E. Planning and executing the evaluation involves previewing the program (formative
evaluation), as well as collecting training outcomes according to the evaluation design.
The results of the evaluation are used to modify, market, or gain additional support for
the program.
IV. Outcomes Used in the Evaluation of Training Programs
One of the original frameworks (five-level) for identifying and categorizing training
outcomes was developed by Kirkpatrick. Levels 1 and 2 measures are collected before
trainees return to their jobs. Levels 3, 4, and 5 criteria measure the extent to which the
training transfers back to the job.
Reaction Outcomes
Reaction outcomes refer to the trainees’ perceptions of the training experience, including the
content, the facilities, the trainer and the methods of delivery. These perceptions are typically
obtained at the end of the training session via a questionnaire completed by trainees, but
usually are only weakly related to learning or transfer.
Learning or Cognitive Outcomes
Cognitive outcomes demonstrate the extent to which trainees are familiar with information,
including principles, facts, techniques, procedures, and processes, covered in the training
program.
Behavior and Skill-Based Outcomes
Skill-based outcomes assess the level of technical or motor skills and behaviors acquired or
mastered. This incorporates both the learning of skills and the application of them (i.e.,
transfer).
A. Skill learning is often assessed by observing performance in work samples such as
simulators.
B. Skill transfer is typically assessed by observing trainees on the job or through managerial
and peer ratings.
Affective Outcomes
Affective outcomes include attitudes and motivation. Affective outcomes that might be
collected in an evaluation include tolerance for diversity, motivation to learn, safety attitudes,
and customer service orientation. The attitude of interest depends on training objectives.
Results
Results are those outcomes used to determine the benefits of the training program to the
company. Examples include reduced costs related to employee turnover or accidents,
increased production, and improved quality or customer service.
Return on Investment
Return on Investment involves comparing the training program’s benefits in monetary terms
to the program’s costs, both direct and indirect.
A. Direct costs include salaries and benefits of trainees, trainers, consultants, and any others
involved in the training; program materials and supplies; equipment and facilities; and
travel costs.
B. Indirect costs include office supplies, facilities, equipment and related expenses not
directly related to the training program; travel and expenses not billed to one particular
program; and training department management and staff support salaries.
C. Benefits are the value the company receives from the training.
D. Training Quality Index (TQI) is a computer application that collects data about training
department performance, productivity, budget, and courses and allows for detailed
analysis of the data. TQI tracks all department training data into five categories:
effectiveness, quantity, perceptions, financial impact, and operational impact.
V. Determining Whether Outcomes are Appropriate
Relevance
A. Criteria relevance refers to the extent to which training outcomes appropriately reflect the
content of the training program. The learned capabilities needed to successfully complete
the training program should be the same as those required to successfully perform one’s
job.
B. Criterion contamination refers to the extent that training outcomes measure inappropriate
capabilities or are affected by extraneous conditions.
C. Criterion deficiency refers to the failure of the training evaluation measures to reflect all
that was covered in the training program.
Reliability
Reliability is the degree to which training outcomes can be measured consistently over time.
Predominantly, we are concerned with consistency over time, such that a reliable test
contains items that do not change in meaning or interpretation over time.
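One common way to check consistency over time is a test-retest correlation: the same trainees take the same measure twice and the two sets of scores are correlated. The sketch below is illustrative only. The function name and the scores are hypothetical, and the outline itself does not prescribe this statistic.

```python
from math import sqrt

def pearson_r(first_scores, second_scores):
    """Pearson correlation between two administrations of the same measure."""
    n = len(first_scores)
    if n != len(second_scores) or n < 2:
        raise ValueError("need two equally long lists with at least 2 scores")
    mean_x = sum(first_scores) / n
    mean_y = sum(second_scores) / n
    # Sum of cross-products of deviations from each mean
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(first_scores, second_scores))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in first_scores))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in second_scores))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / (spread_x * spread_y)

# Hypothetical scores for five trainees tested two weeks apart
week_1 = [72, 85, 90, 64, 78]
week_3 = [70, 88, 91, 60, 80]
print(f"Test-retest correlation: {pearson_r(week_1, week_3):.2f}")
```

A value near 1 suggests the measure ranks trainees consistently across administrations. What counts as acceptable depends on the purpose of the evaluation.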
Discrimination
Discrimination refers to the degree to which trainees’ performance on the outcome actually
reflects true differences in performance; that is, we want the test to discriminate on the basis
of performance and not other things.
Practicality
Practicality is the ease with which the outcome measures can be collected. One reason
companies give for not including learning, performance, and behavior outcomes in their
evaluation of training programs is that collecting them is too burdensome.
VI. Evaluation Practices
Surveys of companies’ evaluation practices indicate that reactions (an affective outcome) and
cognitive outcomes are the most frequently used outcomes in training evaluation. Although
cognitive, behavioral, and results outcomes are collected less often than reactions, research
suggests that training can have a positive effect on these outcomes.
Which Training Outcomes Should Be Collected?
A. To ensure adequate training evaluation, companies should collect outcome measures
related to both learning and transfer of training.
B. Outcome measures are largely independent of each other; it cannot be assumed that
positive reactions to the training program mean that trainees learned more and will apply
what they learned back on the job.
C. To the extent possible, evaluations should include measuring job behavior and results
level outcomes to determine whether transfer of the training has occurred.
D. Learning, behavior, and results should be measured after sufficient time has elapsed to
determine whether training has had an influence on these outcomes.
E. There are three types of transfer:
1. Positive transfer is demonstrated when learning occurs and positive changes in
skill-based, affective, or results outcomes are also observed. This is the desirable
type of transfer.
2. No transfer of training is demonstrated if learning occurs, but no changes are
observed in skill-based, affective, or results outcomes.
3. Negative transfer is evident when learning occurs, but skills, affective outcomes, or
results are less than at pretraining levels.
VII. Evaluation Designs
The design of the training evaluation determines the confidence that can be placed in the
results. No training evaluation can be absolutely certain that the results of the evaluation are
completely true.
Threats to Validity: Alternative Explanations for Evaluation Results
A. Threats to validity refer to factors that will lead an evaluator to question either (1) the
believability of the study results or (2) the extent to which the evaluation results are
generalizable to other groups of trainees and situations.
B. Internal validity is the believability of the study. An evaluation study needs internal
validity to provide confidence that the results of the evaluation are due to the training
program and not to another factor.
C. External validity refers to the generalizability of the evaluation results to other groups
and other situations.
D. Methods to control for threats to validity:
1. Use pre-tests and post-tests to determine the extent to which trainees’ knowledge,
skills or behaviors have changed from pretraining to post-training measures. The
pretraining measure essentially establishes a baseline.
2. Use a comparison (or control) group (i.e., a group that participates in the evaluation
study, but does not receive the training) to rule out factors other than training as the
cause of changes in the trainees. The group that does receive the training is referred to
as the training group or treatment group. Often employees in an evaluation will
perform better just because of the attention they are receiving. This is known as the
Hawthorne effect.
3. Random assignment refers to assigning employees to the control and training groups
on the basis of chance. Randomization helps to ensure that members of the control
group and training group are of similar makeup prior to the training. It can be
impractical and/or even impossible to employ in company settings.
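Random assignment, as described in item 3 above, can be sketched in a few lines. This is a minimal illustration, assuming a simple even split. The function name, the employee names, and the fixed seed are hypothetical.

```python
import random

def assign_groups(employees, seed=None):
    """Randomly split employees into a training group and a control group."""
    rng = random.Random(seed)      # a seed makes the split reproducible for audits
    shuffled = list(employees)     # copy so the caller's list is left untouched
    rng.shuffle(shuffled)          # chance alone decides the order
    half = len(shuffled) // 2
    return {"training": shuffled[:half], "control": shuffled[half:]}

# Hypothetical roster
roster = ["Ana", "Ben", "Chen", "Dee", "Eli", "Fay"]
groups = assign_groups(roster, seed=42)
print("Training group:", groups["training"])
print("Control group:", groups["control"])
```

Because every employee has the same chance of landing in either group, pre-existing differences tend to balance out as group size grows. With very small groups, chance imbalances remain likely.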
Types of Evaluation Designs
Types of evaluation designs vary as to whether they include a pretest and posttest, a control
or comparison group and randomization.
A. The posttest only design involves collecting only posttraining outcome measures. It
would be strengthened by the use of a control group, which would help to rule out
alternative explanations for changes in performance.
B. The pretest/posttest design involves collecting both pretraining and posttraining outcome
measures to determine whether a change has occurred. Because it lacks a control group, it
cannot rule out alternative explanations for any change that does occur.
C. The pretest/posttest with comparison group design includes pretraining and posttraining
outcome measurements as well as a comparison group in addition to the group that
receives training. If the posttraining improvement is greater for the group that receives
training, as we would expect, this provides evidence that training was responsible for the
change.
D. The time series design involves collecting outcome measurements at periodic intervals
pre- and posttraining. A comparison group may also be used. The strength of this design
can be improved by using reversal, which refers to a time period in which participants no
longer receive the training intervention. Its advantages are that it allows an analysis of
the stability of training outcomes over time and that using both the reversal and a
comparison group helps to rule out alternative explanations for the evaluation results.
E. The Solomon Four-Group design combines the pretest/posttest comparison group design
and the posttest-only control group design. It involves the use of four groups: a training
group and comparison group for which outcomes are measured both before and after training
and a training group and comparison group for which outcomes are measured only after
training. This design provides the most controls for internal and external validity.
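The logic of the pretest/posttest with comparison group design (item C above) can be illustrated with a simple comparison of average gains. This is a rough sketch, not a full statistical analysis. The function names and scores are hypothetical, and a real evaluation would also test whether the difference is statistically significant.

```python
from statistics import mean

def mean_gain(pre_scores, post_scores):
    """Average posttest-minus-pretest change for one group."""
    if len(pre_scores) != len(post_scores):
        raise ValueError("each trainee needs both a pretest and a posttest score")
    return mean(post - pre for pre, post in zip(pre_scores, post_scores))

def training_effect(train_pre, train_post, comp_pre, comp_post):
    """Gain in the training group beyond the gain in the comparison group."""
    return mean_gain(train_pre, train_post) - mean_gain(comp_pre, comp_post)

# Hypothetical test scores
train_pre, train_post = [60, 65, 70], [80, 85, 90]   # average gain of 20
comp_pre, comp_post = [62, 66, 68], [67, 71, 73]     # average gain of 5

effect = training_effect(train_pre, train_post, comp_pre, comp_post)
print(f"Gain attributable to training (estimate): {effect} points")
```

Subtracting the comparison group's gain removes changes that would have occurred without training, such as practice effects from taking the pretest.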
Considerations in Choosing an Evaluation Design
A. A more rigorous evaluation design should be considered if any of the following
conditions are true:
1. The evaluation results can be used to change the program.
2. The training program is ongoing and has the potential to affect many employees.
3. The training program involves multiple classes and a large number of trainees.
4. Cost justification for training is based on numerical indicators.
5. Trainers or others in the company have the expertise to design and evaluate the data
collected from an evaluation study.
6. The cost of the training creates a need to show that it works.
7. There is sufficient time for conducting an evaluation. Here, information regarding
training effectiveness is not needed immediately.
8. There is interest in measuring change from pretraining levels or in comparing two or
more different programs.
B. Evaluation designs without pretesting or comparison groups are most appropriate when
you are interested only in whether a specific level of performance has been achieved, and
not how much change has occurred.
VIII. Determining Return on Investment
A. Cost-benefit analysis of training is the process of determining the net economic benefits
of training using accounting methods. Training cost information is important for several
reasons:
1. To understand total expenditures for training, including direct and indirect costs
2. To compare the costs of alternative training programs
3. To evaluate the proportion of the training budget spent on the development of
training, administrative costs, and evaluation as well as how much is spent on various
types of employees (e.g., exempt versus nonexempt)
4. To control costs
B. There is an increased interest in measuring the ROI of training and development
programs because of the need to show the results of these programs to justify funding and
to increase the status of the training and development function.
C. The process of determining ROI:
1. Understand the objectives of the training program.
2. Isolate the effects of training from other factors that might influence the data.
3. Convert the data to a monetary value and calculate ROI.
D. Because ROI analysis can be costly, it should be limited only to certain training
programs.
Determining Costs
A. The resource requirements model compares equipment, facilities, personnel, and
materials costs across different stages of the training process (needs assessment,
development, training design, implementation, and evaluation).
B. There are seven categories of cost sources: program development or purchase;
instructional materials; equipment and hardware; facilities; travel and lodging; salaries
of the trainer and support staff; and the cost of either lost productivity or replacement
workers while trainees are away from their jobs for the training.
Determining Benefits
A. Determining benefits can be done via a number of methods, including:
1. Reviewing technical, practitioner, and academic literature that summarizes the benefits
of training programs.
2. Running pilot training programs to assess the benefits for a small group of trainees
before a company commits more resources.
3. Observing successful job performers to determine what they do differently from
unsuccessful performers.
4. Asking trainees and their managers to provide estimates of training benefits.
B. To calculate return on investment, follow these steps:
1. Identify outcomes.
2. Place a value on the outcomes.
3. Determine the change in performance after eliminating other potential influences on
training results.
4. Obtain an annual amount of benefits from training by comparing results after training
to results before training.
5. Determine the training costs.
6. Calculate the total savings by subtracting the training costs from benefits.
7. Calculate the ROI by dividing benefits by costs. The ROI gives an estimate of the
dollar return expected from each dollar invested in training.
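Steps 5 through 7 above reduce to simple arithmetic, sketched below. All dollar figures are hypothetical, the function name is illustrative, and the ratio follows the definition in step 7 (benefits divided by costs).

```python
def training_roi(annual_benefits, training_costs):
    """Return (net_savings, roi_ratio) for a training program."""
    if training_costs <= 0:
        raise ValueError("training_costs must be positive")
    net_savings = annual_benefits - training_costs   # step 6: benefits minus costs
    roi_ratio = annual_benefits / training_costs     # step 7: benefits divided by costs
    return net_savings, roi_ratio

# Hypothetical figures for one program
direct_costs = 20_000      # e.g., trainer salaries, materials, travel
indirect_costs = 5_000     # e.g., a share of training department overhead
annual_benefits = 100_000  # e.g., the dollar value of reduced rework

savings, roi = training_roi(annual_benefits, direct_costs + indirect_costs)
print(f"Net savings: ${savings:,.0f}")
print(f"ROI: {roi:.1f} to 1")
```

With these numbers, each dollar invested in training is estimated to return four dollars in benefits. Some practitioners instead report ROI as net savings divided by costs, so it helps to state which definition is being used.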
Example of a Cost-Benefit Analysis
A cost-benefit analysis is best explained by an example.
Other Methods of Cost-Benefit Analysis
A. Utility analysis assesses the dollar value of training based on estimates of the difference
in job performance between trained and untrained employees, the number of employees
trained, the length of time the program is expected to influence performance, and the
variability in job performance in the untrained group of employees. This is a highly
sophisticated formula that requires the use of pretest and posttest with a comparison
group.
B. Other types of economic analysis evaluate training as it benefits the firm or government
using direct and indirect costs, incentives paid by the government for training, wage
increases received by trainees as a result of the training, tax rates, and discount rates.
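The inputs listed for utility analysis in item A correspond to a commonly cited utility formula: years of effect, times number trained, times the performance difference in standard deviation units, times the dollar value of one standard deviation of job performance, minus total training cost. The sketch below assumes that form. The outline does not give the formula itself, the cost term is an added assumption, and all figures are hypothetical.

```python
def utility_gain(years, n_trained, effect_size, sd_dollars, cost_per_trainee):
    """Estimated dollar value of training under a standard utility formula.

    years            -- length of time training is expected to influence performance
    n_trained        -- number of employees trained
    effect_size      -- trained vs. untrained performance difference, in SD units
    sd_dollars       -- dollar value of one SD of job performance (untrained group)
    cost_per_trainee -- training cost per employee (assumption, not in the outline)
    """
    gross_value = years * n_trained * effect_size * sd_dollars
    total_cost = n_trained * cost_per_trainee
    return gross_value - total_cost

# Hypothetical inputs
value = utility_gain(years=2, n_trained=50, effect_size=0.5,
                     sd_dollars=10_000, cost_per_trainee=1_000)
print(f"Estimated utility of training: ${value:,.0f}")
```

The effect size is the hardest input to obtain, which is why the approach relies on a pretest/posttest design with a comparison group.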
Practical Considerations in Determining Return on Investment
Training programs best suited for ROI analysis have clearly identified outcomes, are not one-
time events, are highly visible in the company, are strategically focused, and have effects that
can be isolated.
Success Cases and Return on Expectations
A. Return on expectations (ROE) refers to the process through which evaluation
demonstrates to key business stakeholders, such as top-level managers, that their
expectations about training have been satisfied.
B. Success cases refer to concrete examples of the impact of training that show how learning
has led to results that the company finds worthwhile and the managers find credible.
IX. Measuring Human Capital and Training Activity
A. Metrics are valuable for benchmarking purposes, for understanding the current amount of
training activity in a company, and for tracking historical trends in training activity.
B. The value of learning activities is best determined through the use of workforce analytics.
Workforce analytics refers to the practice of using quantitative methods and scientific
methods to analyze data from human resource databases, corporate financial statements,
employee surveys, and other data sources to make evidence-based decisions and show that
human resource practices (including training, development, and learning) influence
important company metrics.
C. Dashboards refer to a computer interface designed to receive and analyze the data from
departments within the company to provide information to managers and other
decision makers.
Chapter Summary
This chapter provides a sound base of knowledge regarding training evaluation, the issues
surrounding it, and how to approach it. Reasons for evaluating training were described, and the
process of evaluating training was outlined. Kirkpatrick’s model of evaluation was explained, as
well as the six major categories of outcomes (reaction, cognitive, skill-based, affective,
results, and ROI) that can be measured to evaluate training effectiveness. Good training
outcomes need to be relevant, reliable, discriminating, and practical. Next, threats to both
internal and external validity
were discussed. Various evaluation designs were explained with an emphasis on related costs,
time, and strength. Return on Investment (ROI) and cost-benefit analysis were explained, and
examples given. The chapter concluded with a listing of key terms, discussion questions, and
application assignments.