Monday, February 9, 2009

No child gets ahead

Before I begin my rant about the No Child Left Behind program, let me state clearly four views that I enthusiastically share with the program.
  1. Every child should become at least minimally proficient in certain essential skills.
  2. It is important to provide a way of comparing how well schools and districts achieve educational goals.
  3. Incentives to schools and districts do affect school policies.
  4. Standard tests are the least subjective way of measuring students' progress.
Looking at that list, you might think that I was also an enthusiastic supporter of NCLB. But you would be wrong.
What the above leads to is a system in which the incentive for schools, school districts, property owners and so on is the portion of students in a school who meet a minimum standard. The overwhelming portion of what goes into a school's rating is whether a sufficient portion of the children reach a minimum standard.
Now let's look at the consequences of this:
  • Effort going to serve the educational needs of children who will comfortably exceed the minimum standard will be wasted. That is, improvement among students who would meet the minimum standard in the first place will not be reflected in school ratings.
  • Effort going to serve students who are unlikely to meet the minimum standard is also wasted in that improvements there are not reflected in the ratings.
  • Success of a school, district or state becomes measured in terms of minimal standards, thus curriculum will be geared toward the minimum standard.
As mentioned in my points of agreement above, incentives do work. Therefore, we need to check very carefully what kind of incentives we establish. All of the negative consequences that I've listed are the result of the incentive system that NCLB creates.
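To make the incentive problem concrete, here is a minimal sketch in Python. The eight scores and the passing mark of 70 are invented for illustration and do not come from any real test. It compares what happens to a pass-rate rating when large gains go to the weakest and strongest students versus when small gains go to the students just under the cutoff.

```python
# Illustrative only: the scores and the passing mark of 70 are made up.
PASSING_SCORE = 70


def pass_rate(scores, passing=PASSING_SCORE):
    """Fraction of students scoring at or above the passing mark."""
    return sum(score >= passing for score in scores) / len(scores)


before = [35, 40, 68, 69, 72, 80, 85, 90]

# Ten-point gains for the two weakest and the two strongest students.
gains_at_extremes = [45, 50, 68, 69, 72, 80, 95, 100]

# Two-point gains for the two students sitting just below the passing mark.
gains_near_cutoff = [35, 40, 70, 71, 72, 80, 85, 90]

for label, scores in [
    ("before", before),
    ("gains at the extremes", gains_at_extremes),
    ("gains near the cutoff", gains_near_cutoff),
]:
    print(f"{label}: pass rate {pass_rate(scores):.2f}")
```

In this toy class, forty points of total improvement at the extremes leave the pass rate at 0.50, while four points of improvement near the cutoff raise it to 0.75.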
How not to fix the system.
The system that we have in place provides us (at best) with measures of how well schools have students meet minimum standards. The worst thing that we could do is to treat that measure as anything other than what it is. It is not a measure of how well the school educates students on average; it is not a measure in any real sense of how "good" a school is. The only thing that this measure tells us is whether or not a school is failing to get most of its students to meet a minimal standard.
But because this is the only measure that we have for comparing schools and districts within a state, it gets used for purposes that it was never designed to serve. A test of how many students meet a minimal standard can only be used to identify failing schools. It is entirely useless at distinguishing decent schools from excellent schools. For example, in a successful district like Plano Independent School District, more students hit the ceiling (reach the maximum score possible on the test) than actually fail it [get source for this. It was buried in one of the NWEA documents]. The Texas Assessment of Knowledge and Skills (TAKS) completely fails to provide a measure that can be used to compare those who well exceed its minimum standard.
A natural reaction and proposed solution is to raise the tests' standards: Make the test harder. But that would actually make matters worse. If we raise the standards of the test to a degree where it would be meaningful for schools that are doing well within the state, then it no longer works as a way of ensuring that all students reach a minimum standard. It will place the higher standard out of reach of some students. The fact of the matter is that not all students are alike. Not all students are college bound. And when we establish truly minimum standards we need to take that into account. If our "minimum" is no longer truly minimum and so becomes out of reach for some children, then those children will be left behind.
Of course we should set high goals for everyone so that we get the best from each, but we shouldn't insist that everyone reach high goals.
Possible fixes
I have several ideas for how to improve the system. I will sketch a few of them below, but will need to expand on them at some later date.
Make every score count
Instead of reporting just the number of students who merely pass the test, also report the average score. This way an improvement for any student (whether well below passing or well above passing) gets credited to the school. A simple average may not be the best number because of how outliers affect the results, but some statistic or set of statistics that makes every child's performance count will help motivate schools to help each and every child whether they are near the passing threshold or not.
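As a rough sketch of what such a report might look like, the Python below puts a pass rate, a mean, and a median side by side. The scores and the passing mark of 70 are hypothetical, and the choice of these three statistics is only an example, not a recommendation for a particular rating formula.

```python
from statistics import mean, median

PASSING_SCORE = 70  # assumed passing mark, for illustration only


def summarize(scores, passing=PASSING_SCORE):
    """Report several statistics so that every student's score counts."""
    return {
        "pass_rate": sum(s >= passing for s in scores) / len(scores),
        "mean": mean(scores),
        "median": median(scores),
    }


# Made-up class: the two weakest and two strongest students each gain 10 points.
before = [35, 40, 68, 69, 72, 80, 85, 90]
after = [45, 50, 68, 69, 72, 80, 95, 100]

print("before:", summarize(before))
print("after: ", summarize(after))
```

With these invented numbers the mean rises by five points while the pass rate and the median both stay put, which is one reason a set of statistics may serve better than any single one.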
But this can't be done with the test as it stands (at least in Texas). As I mentioned, the Texas test has a very low ceiling. In my district more students score 100% on each test than fail it. This means that the test is providing absolutely no useful measurement for those students. Also, when scores reach a ceiling they have a perverse effect on averages. Developing a test which has the appropriate scoring system is difficult, but achievable. But I will leave that for another post as well.
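Here is a small Python sketch of how a ceiling distorts an average. The "latent" scores are a hypothetical stand-in for what a test without a ceiling might report; they are my own illustrative assumption, not anything TAKS measures.

```python
from statistics import mean

TEST_CEILING = 100  # highest score the test can report


def observed(latent_scores, ceiling=TEST_CEILING):
    """Scores as the test reports them: anything above the ceiling is clipped."""
    return [min(score, ceiling) for score in latent_scores]


# Hypothetical latent scores; two students are already beyond the ceiling.
latent_before = [80, 95, 110, 120]
# Every student improves by the same 10 points.
latent_after = [score + 10 for score in latent_before]

latent_gain = mean(latent_after) - mean(latent_before)
observed_gain = mean(observed(latent_after)) - mean(observed(latent_before))

print(f"gain in latent average:   {latent_gain}")
print(f"gain in observed average: {observed_gain}")
```

In this toy example every student improves by 10 points, yet the reported average rises by only 3.75, because the students at the ceiling have nowhere to go.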
Distinguish between pedagogically useful and useless tests
Tests like the TAKS provide little information to the teacher about how to help an individual child. The tests are not pedagogically useful. That's fine because that is not their intent. They are designed to help us compare schools and districts. Unfortunately a great deal of school, teacher, and student effort goes into pedagogically useless activity. There are several ideas for how to deal with this, all of which involve less individual testing. One would be to test less frequently, and another would be to test only a sample of the students in a school instead of testing the entire school population.
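To give a feel for the sampling idea, here is a rough Python sketch. The school of 500 students, the sample of 100, and the true pass rate of 80% are all invented, and the standard-error formula is the usual textbook approximation for a simple random sample drawn without replacement, not anything prescribed by NCLB.

```python
import math
import random


def estimate_pass_rate(passed_flags, sample_size, rng):
    """Estimate a school's pass rate from a simple random sample of students."""
    population_size = len(passed_flags)
    sample = rng.sample(passed_flags, sample_size)  # without replacement
    p_hat = sum(sample) / sample_size
    # Finite population correction: sampling a large share of a small
    # school leaves less uncertainty than the plain formula suggests.
    fpc = (population_size - sample_size) / (population_size - 1)
    std_error = math.sqrt(p_hat * (1 - p_hat) / sample_size * fpc)
    return p_hat, std_error


# Hypothetical school: 400 of 500 students would pass if everyone were tested.
school = [True] * 400 + [False] * 100
rng = random.Random(2009)  # fixed seed so the sketch is repeatable

estimate, std_error = estimate_pass_rate(school, 100, rng)
print(f"estimated pass rate: {estimate:.2f} (standard error about {std_error:.3f})")
```

The trade-off is that a sample yields an estimate with some uncertainty rather than an exact count, so the sample size would have to be chosen with the desired precision in mind.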
Simplify administration of tests by eliminating the conflict of interest
The rules and procedures that schools have to follow for administering the tests are beyond belief. Visit your local school and ask to see the printed guidelines. You won't have time to read them, and you may not even have time to count the pages. Just weigh them on a bathroom scale.
The reason for many of these rules is that there is a truly awful design decision in the administration of the tests. The people who have the most at stake (the teachers and the school officials) are the ones who are asked to administer the test. This is a massive conflict of interest. And if you believe, as I do, that incentive systems do affect how people behave, then you see that there is a terrible opportunity for test administrators to help students cheat on the tests.
Of course I believe that most teachers are honest, but I also lock my car even though I think that most people wouldn't steal it. If you ask any teacher about cheating on these tests they will of course tell you that it doesn't happen in their school, but they will also be familiar with some terrible abuses in other schools or districts. We have set up a systematic administrative conflict of interest, and so we add boatloads of Band-Aids to cover a wound that is wider than a church door and deeper than a well.
The entire administration of these non-pedagogical exams could be simplified if we had them administered by some third party.
I will try to expand on these thoughts and fill in many of the blanks in further posts. But at this point, I should just upload this one even though it falls far short of what I had hoped for it.