Thursday, September 17, 2009

Thinking about assessment

The education literature likes to make a distinction between assessment for learning and assessment of learning. The distinction is, in my view, a necessary insight, but the way that it is conceived is both too limiting and prone to confusion. In this rant I am going to present a somewhat richer framework for discussing different types of assessment for different purposes.

Where I'm coming from

As I've mentioned before, I am training to be a high school math teacher, and I am enrolled in what I consider to be an outstanding program through Collin College. I must confess that when I signed up for the program, I, in my arrogance, did not think that I would learn much. I am pleased to report that I was dead wrong. I won't go into why I was wrong, but I will say that I go to bed thinking about the ideas that come up from class discussion and readings and I wake up thinking about them. I remain (very) critical of some of the argumentation and scholarship in the readings, but it is extremely helpful for me to read them. I'm gobbling them up and loving it.

I have been, and remain, highly critical of the kinds of testing and incentive systems that have been set up by NCLB, even though I fully support the goal of keeping schools and districts accountable for how well they serve all students, particularly the ones who are at risk of being left behind. Please see my previous posts on the matter (and more to come). NCLB does appear to be reaching that stated goal, but it distorts the educational system as a whole and hinders progress in other important areas. But this essay is about assessment (testing and similar things). Whether you are a critic or supporter of NCLB, you will agree that it has greatly intensified the amount and importance of (standardized) testing in schools.

The Educators' Complaint

The education literature makes a distinction between assessment of learning and assessment for learning. A similar distinction is drawn between summative assessment and formative assessment. I will not attempt to give a full definition of these here. I don't think that the definitions in the literature bear up under close inspection, and the fuller the definition the less enlightening it is. Instead, here is the rough idea through examples. Assessment of learning includes things like the TAKS, end of term exams, and major examinations that determine a student's grade. Assessment for learning is the ongoing assessment that teachers engage in while teaching. It includes asking questions of the class and seeing what sorts of questions students ask. These are considered assessment for learning because they help the teacher adapt teaching to the particular student.

The problem with our increased emphasis on assessment of learning is that most of that assessment isn't pedagogically useful. Some even argue that it is harmful in and of itself beyond the misdirection of resources (although I have my doubts about that claim). NCLB is a reality (which really does appear to be meeting its narrow, but important, goals), but the concern among educators is that it leads to too much pedagogically useless assessment. I agree, but I think that we are talking about assessment in a far too limiting framework.

Distinguishing distinctions

When we look at assessment, and try to categorize it, I think that we need to be looking at two dimensions, instead of the one-dimensional approach in the of-for distinction. We need to ask

  1. What is the form of the assessment?
  2. What is the purpose of the assessment?

The current discussion seems to assume that all standardized tests (form) serve only to assess what a student has learned and not to adjust teaching (purpose), while all of the less formal (form) assessments are only used to adjust teaching (purpose). Certainly there is a strong connection between form and function, but when looking at assessment it will be useful to consider it along these two not-quite-independent dimensions.

Three purposes

When it comes to considering the various purposes of assessment I think that it is helpful to consider three separate purposes, not just the two in the existing conceptualization.

  1. Adjusting: to help adjust teaching to the needs of the particular student
  2. Grading: to provide feedback to student and family, to assign grades and work as an incentive
  3. Accounting: to evaluate the teaching of the teacher, school, district.

Accounting is what we see in the testing that follows from NCLB. It is about rating and evaluating schools and districts (and within districts it will be used to evaluate teachers). It is the school administrators who have the most to gain or lose by these test results. The tests are typically given at the end of the school year. Although students who fail the test will be intensively tutored so that they will pass a retake, these tests are not used to help students directly.

Grading covers the assessments that a course grade is based upon. These are presented to parents and students. They become part of a student's record and are intended to indicate how much the student learned. Of course these will also feed back into how a particular student is taught. A teacher can learn from them that a student is not meeting expectations and so can look for ways to help the student. One characteristic of grading assessment is that it (almost) never goes beyond what has been taught in class.

Adjusting is used primarily to help determine how to teach a particular student. At one extreme, these assessments are the everyday queries a teacher makes while teaching to see if students are getting it or not. At the other extreme, they are the kinds of evaluations that are used to determine whether a student should be in a gifted and talented program or in special education. Those typically involve highly formalized exams, but are used exclusively for determining how best to teach an individual student. Homework may be part of a student's grade (usually to get them to do it), but is used primarily as a frequent check of whether something needs to be retaught.

Any particular assessment can (and often will) serve multiple purposes. But when looking at any particular assessment it is useful to keep those three purposes in mind.

Form follows function except for when it doesn't

If you've been talking about the differences between similes and metaphors in class, you may ask for examples to help with the learning that day (adjusting). But you may also ask for examples of each on an end of term examination (grading). So the same form can be used for different purposes in different contexts. I've praised the MAP testing that PISD does. But I honestly don't know what they use it for. I would hope that they use it to help differentiate teaching (adjusting), but it may be used primarily to track teacher performance (accounting). So here a particular standardized test, administered in exactly the same way, could be used for entirely different purposes.

Some forms of assessment really are single purpose. Some, like the Texas TAKS tests, can't be used for much other than accounting, and then only a limited type of accounting. The test is designed to distinguish students who have acquired the basic knowledge expected for the grade level from those who have not. It doesn't do a very good job of discriminating between students at the high end or the very low end. It is hard for me to imagine a set of exams that is more narrowly focused on one purpose.

With understanding come solutions

This understanding of purposes can bring real, practical recommendations. Because the TAKS serves little direct pedagogical purpose and is useful mainly for accounting, we could save a great deal of time and money (that could then go to actually improving education) by sampling. Not every student needs to take the TAKS in every subject. Consider fifth grade TAKS requirements. Students take Reading, Math and Science. Not counting make-ups and such, that takes three full days for the students to complete. But if the goal is to measure a school's performance, then have one third of the students take Reading, one third Math, and one third Science. Students would be randomly assigned, with neither student nor school staff knowing which student gets which test until test day. All of the tests can then be given on the same day.
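
For readers who like to see such things spelled out, the random assignment described above could be sketched roughly as follows. This is only an illustration in Python, not anything a district actually uses. The function name assign_tests, the made-up roster of student IDs, and the seed value are all assumptions invented for the example.

```python
import random
from collections import Counter

# The three fifth grade subjects named in the post.
SUBJECTS = ["Reading", "Math", "Science"]


def assign_tests(student_ids, subjects=SUBJECTS, seed=None):
    """Randomly give each student exactly one subject test.

    Hypothetical helper: shuffles the roster, then deals students out
    round-robin so that group sizes differ by at most one.
    """
    rng = random.Random(seed)  # a seed is only useful for reproducible demos
    shuffled = list(student_ids)
    rng.shuffle(shuffled)
    return {
        student: subjects[i % len(subjects)]
        for i, student in enumerate(shuffled)
    }


if __name__ == "__main__":
    # A made-up roster of 90 fifth graders.
    roster = [f"student_{n:03d}" for n in range(1, 91)]
    assignment = assign_tests(roster, seed=2009)
    # Show how many students landed in each subject group.
    print(Counter(assignment.values()))
```

Shuffling first and then dealing the students out in turn keeps the three groups as close to equal thirds as the roster size allows. In practice the assignment would be generated without a fixed seed and kept sealed until test day, in keeping with the idea that neither students nor staff know it in advance.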

I believe that the framework I've introduced above, first separating form from purpose and then distinguishing three separate purposes for assessment, allows for a more useful discussion of assessment than is common. At least it helps me think about these things more carefully, and I hope it does the same for any readers I might have.
