How do I know assessments are comparable for measuring growth?

Data points must be similar in order to be comparable.

Consider this example: could you easily tell someone how much weight you lost if you weighed yourself in pounds one week and took a body fat test a few weeks later? Yes, it could be done, but it would require conversions and lead to imprecision. In the classroom, time is precious. Making assessments more comparable makes the data more reliable and more accurate, and it makes the whole process easier.

Comparable assessment tools that allow educators to collect good growth data are called mirrored assessments. There are three basic elements to creating mirrored assessments: form, content, and complexity.

The assessment form is how it is written. For example, results from a multiple-choice test are not easily compared to results from a performance assessment, so two mirrored assessments should be of the same form. Some forms lend themselves more easily to measuring student growth, but no one form is better than another. It should be noted that any assessment with an open-ended nature (e.g., free response, constructed response) must be graded with the same rubric in order to yield growth data.
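
To make the "same form, same rubric" requirement concrete, here is a minimal Python sketch of a comparability guard. The data model, field names, and form labels are hypothetical, invented for illustration rather than taken from any particular gradebook tool:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Assessment:
    name: str
    form: str                 # e.g. "multiple_choice", "constructed_response"
    rubric_id: Optional[str]  # required for open-ended forms, else None

# Hypothetical set of forms that require rubric scoring.
OPEN_ENDED_FORMS = {"free_response", "constructed_response"}

def check_comparable(first: Assessment, second: Assessment) -> None:
    """Raise an error unless the two assessments mirror in form (and rubric)."""
    if first.form != second.form:
        raise ValueError(f"Forms differ ({first.form} vs {second.form}); "
                         "scores are not directly comparable.")
    if first.form in OPEN_ENDED_FORMS and first.rubric_id != second.rubric_id:
        raise ValueError("Open-ended assessments must share the same rubric.")

pre = Assessment("Unit 1 pre-test", "constructed_response", "inference-v1")
post = Assessment("Unit 1 post-test", "constructed_response", "inference-v1")
check_comparable(pre, post)  # passes: same form, same rubric
```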

Content/Skills are essentially what the teacher wants the students to demonstrate they know. If there is a chapter test on the types of rocks and the next chapter test covers weather patterns, the data isn't comparable. An educator can't talk about growth in understanding of rocks based on how students performed on a weather test. Don't throw the baby out with the bathwater just yet! If educators pull the skills out of the assessments, there may be some comparable elements. For example, the rock assessment might include a chart that asks students to make inferences about rocks. The weather assessment might also include a chart asking students to make inferences about weather patterns. The thread running through the two assessments, making inferences, can be used to monitor student growth.
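
As a rough illustration of pulling that shared skill thread out of two different tests, here is a hedged Python sketch. The skill tags, the 4-point rubric scale, and all of the scores are invented for the example, and it assumes both sets of inference items were scored with the same rubric:

```python
# Hypothetical item-level results: each item is tagged with the skill it measures.
rock_test = [
    {"skill": "inference",        "score": 2},
    {"skill": "rock_types",       "score": 3},
    {"skill": "inference",        "score": 1},
]
weather_test = [
    {"skill": "inference",        "score": 3},
    {"skill": "weather_patterns", "score": 2},
    {"skill": "inference",        "score": 3},
]

def skill_average(items, skill):
    """Average rubric score on just the items tagged with the given skill."""
    scores = [item["score"] for item in items if item["skill"] == skill]
    return sum(scores) / len(scores)

# Only the shared thread (inference) supports a growth claim; the
# rock-specific and weather-specific items stay out of the comparison.
growth = skill_average(weather_test, "inference") - skill_average(rock_test, "inference")
print(f"Growth on inference: {growth:+.2f} rubric points")  # +1.50
```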

Complexity refers to the spectrum of difficulty. We can all agree that not all questions are of the same level: some are easy, and some are hard! If your first test is easier than the second, students may not show growth when in fact they have improved. Conversely, if your first test is really difficult and the second is easier, the data may show false positive growth when learning has not actually changed; the assessment itself is what changed. When mirroring an assessment set, it is important that the questions mirror each other in level of difficulty. If the first assessment asks an apply/analyze-level question on a skill, the subsequent assessments must ask a comparable apply/analyze question. Constructing items at matched points on the spectrum of complexity is essential to building assessments of the same difficulty that can be compared for student growth.
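
One simple way to sanity-check this, assuming each item has already been tagged with a cognitive-demand level (the level names and tags below are illustrative, loosely borrowed from Bloom's taxonomy), is to compare the two assessments level for level:

```python
from collections import Counter

# Hypothetical tagging: one cognitive-demand level per item.
first_test  = ["recall", "recall", "apply", "analyze"]
second_test = ["recall", "recall", "apply", "analyze"]

def mirrors_in_complexity(a, b):
    """True when both assessments ask the same number of items at each level."""
    return Counter(a) == Counter(b)

print(mirrors_in_complexity(first_test, second_test))     # True: mirrored pair
print(mirrors_in_complexity(first_test, ["recall"] * 4))  # False: easier second test
```

Counting items per level is a coarse check, of course; it catches a second test that drifts easier or harder overall, but the judgment that two apply/analyze questions are genuinely comparable still rests with the educator.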

This article was written by Kids at the Core founder, Anne Weerda.

Anne is an assessment and curriculum specialist best known for her work in assessment design, data analysis, and instructional effectiveness. Anne is a sought-after speaker in the areas of assessment design, curriculum, and instruction.