You’ve spent weeks researching, planning, and designing your training. You’ve edited and reworked the instructional materials countless times. Finally, after what seems like years of hard work, you’ve created a training course that you think is as effective and engaging as it can possibly be.
But how do you know it’s as good as you think it is?
Instructional design evaluation will help you find out. Learning and development is a high-stakes activity—developing and implementing trainings can cost tens of thousands of dollars for large organizations.
And if those trainings aren’t having measurable real-world effects, executives aren’t going to be happy. Trainers, human resources managers, and instructional designers need to be able to show that the trainings are working.
Effective instructional design evaluation uncovers evidence of the training’s value. Or, if the value is unexpectedly low, it shows where to make improvements and boost the ROI.
What gets measured, as they say, gets managed. It’s painfully cliché, but it’s absolutely true. Instructional design evaluation is how you measure your trainings. Here’s what you need to know to get started.
Where Evaluation Fits into the Instructional Design Process
Ross and Morrison (2010) sum up three high-level types of evaluations quite nicely:
Formative evaluation is used to provide feedback to designers as the instruction is “forming” or being developed. Summative evaluation is conducted to determine the degree to which a completed instructional product produces the intended outcomes. Confirmative evaluation examines the success of instruction as it is used over time.
If you have the resources to conduct all three assessment types, there’s no question about where evaluation fits into the design sequence. It happens throughout the entire process (and continues after design is done, as well).
Of course, not every company will be able to run the large number of assessments necessary for formative, summative, and confirmative evaluation of every course. In these cases, determining when to evaluate your instructional design may come down to your goals.
To make sure that your design is effective when it debuts, for example, formative evaluation is useful. Summative evaluations determine how much employees learned and whether that knowledge creates behavior change. And confirmative evaluations measure the long-term success of the program.
Frequent evaluation allows for continuous improvement. So it’s best to assess your instructional design as often as you can.
Designers who follow the ADDIE (analysis, design, development, implementation, evaluation) model might be tempted to leave all evaluation for the end of the process. But evaluating more often has many benefits.
Play to your organization’s strengths here. If you have a budget big enough for multiple evaluations throughout the process, go for it. If not, you may need to get creative. We’ll talk more about these three types of assessment a bit later.
Instructional Design Evaluation Strategies: The Kirkpatrick Model
How, then, do we measure the effectiveness of instructional design? There are many theories, each with advantages and disadvantages.
One of the most useful frameworks for evaluating your instructional design is the Kirkpatrick model. Although it has drawbacks, its simplicity and popularity make it a good option, especially if you’re interested in comparing results to other companies in your industry. The model has four levels:
- Reaction
- Learning
- Behavior
- Results
Let’s take a look at each one individually.
1. Reaction
How do learners react to the training? If their reaction is positive, they’re more likely to have positive learning outcomes. A negative reaction doesn’t preclude learning, but it makes learning less likely.
The factors that elicit positive reactions vary. Trainers, training methodologies, and audiences all have unique needs and preferences. But there are some commonalities. A 2009 study found that trainees’ perceptions of the training’s efficiency and usefulness, as well as the trainer’s performance, were correlated with their reactions.
The usefulness of the training was the most important factor in the study. Which won’t surprise anyone who’s ever been to an irrelevant or useless training.
Trainings that are efficient (i.e., don’t waste time), useful, and conducted by likable, interesting, effective trainers are going to get the most positive reactions. And that will help boost learning.
(Though it’s important to note that there are multiple types of reactions to instruction, and not all of them may be correlated with learning.)
2. Learning
How much did participants learn? Did they retain the information over time? These questions don’t seem overly complicated. But getting reliable objective measures can be difficult.
For example, when measuring learning, you need to know your participants’ level of knowledge before they took the training. If trainees were already experts, showing that they’re experts at the end of the training isn’t very interesting. That’s why both pre- and post-testing are necessary for effective measurement of learning.
There are many ways of testing the acquisition of knowledge: assessments, self-assessments, informal testing, and so on. Choosing the right instrument may depend on the type of training you’re running and your measurement objective.
Knowledge acquisition, for example, can be measured with a simple multiple-choice quiz. The ability to apply that knowledge, however, is more difficult to measure, and may require a more in-depth assessment.
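To make the pre/post idea concrete, here’s a minimal sketch in Python with made-up quiz scores. The normalized-gain formula used here, (post - pre) / (max - pre), is just one common way to account for how much room each trainee had to improve; it isn’t prescribed by the Kirkpatrick model, and the names and numbers are purely illustrative.

```python
# Minimal sketch: comparing made-up pre- and post-test quiz scores.
# The "normalized gain" (post - pre) / (max - pre) expresses how much of
# each trainee's possible improvement the training actually achieved.

trainees = {
    "alice": {"pre": 40, "post": 85},
    "bob": {"pre": 90, "post": 92},  # already near expert before the training
    "carol": {"pre": 55, "post": 80},
}

MAX_SCORE = 100


def normalized_gain(pre: float, post: float, max_score: float = MAX_SCORE) -> float:
    """Fraction of the available headroom (max_score - pre) that was gained."""
    headroom = max_score - pre
    return (post - pre) / headroom if headroom else 0.0


for name, scores in trainees.items():
    gain = normalized_gain(scores["pre"], scores["post"])
    print(f"{name}: pre={scores['pre']}, post={scores['post']}, gain={gain:.2f}")
```

Notice that bob finishes with the highest raw post-test score but the smallest gain. Without the pre-test, the training would look far more effective for him than it actually was, which is exactly why pre-testing matters.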
3. Behavior
Are your participants applying what they learned to their jobs? This is possibly the ultimate test of the effectiveness of instruction.
Successfully measuring behavior change isn’t easy. You’ll need to determine the most relevant behaviors, measure them before the instruction, measure them after the instruction, and figure out if the training caused the change.
Let’s look at an example. If you run a course on sales enablement, the employee behavior you’re trying to affect might be the sharing of information between marketing and sales departments. You’ll need to identify a measurable metric to see if this is actually happening.

Which metrics might you look at to assess behavior change in sales enablement? Many knowledge-sharing systems provide analytics on content usage, like the number of times a particular piece of content has been accessed or used in a sales pitch.

How many sales interactions include materials developed by marketing? That’s a metric you can compare before and after the training.
Other software platforms can provide you with similar metrics. And while behavior change can be measured independently of tools, those tools make the process much easier.
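As a rough sketch of what that before-and-after comparison could look like, here’s a short Python example with invented data. The yes/no flags stand in for whatever usage records your knowledge-sharing platform exports; the variable names and numbers are hypothetical.

```python
# Minimal sketch with invented data: what share of sales interactions
# included marketing-developed content before vs. after the training?

# Each boolean marks one logged sales interaction: True if it used
# marketing-created material, False if it did not.
before_training = [True, False, False, False, True, False, False, False]
after_training = [True, True, False, True, True, False, True, True]


def usage_rate(interactions: list[bool]) -> float:
    """Fraction of interactions that included marketing content."""
    return sum(interactions) / len(interactions)


before_rate = usage_rate(before_training)
after_rate = usage_rate(after_training)

print(f"Before training: {before_rate:.0%} of interactions used marketing content")
print(f"After training:  {after_rate:.0%} of interactions used marketing content")
print(f"Change: {after_rate - before_rate:+.0%}")
```

In practice, those flags would come from the analytics your content platform already collects, and you’d still need to rule out other explanations for the change before crediting the training, as noted above.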
While they aren’t as rigorous as objective measures, employee surveys are useful as well. In our case, we might ask employees how often they use the sales enablement system, what percentage of documents they think are shared between the two groups, and similar questions.
The most important thing to remember here is to choose a metric that’s closely tied to the behavior you’re trying to influence. Choosing the wrong measure of behavior can skew your results.