November 23, 2011

Should value-added teacher ratings be adjusted for poverty?


By Sarah Garland

In Washington, D.C., one of the first places in the country to use value-added teacher ratings to fire teachers, teacher-union president Nathan Saunders likes to point to the following statistic as proof that the ratings are flawed: Ward 8, one of the poorest areas of the city, has only 5 percent of the teachers defined as effective under the new evaluation system known as IMPACT, but more than a quarter of the ineffective ones. Ward 3, encompassing some of the city’s more affluent neighborhoods, has nearly a quarter of the best teachers, but only 8 percent of the worst.
The discrepancy highlights an ongoing debate about the value-added measures of student test-score growth that an increasing number of states—soon to include Florida—are using to evaluate teachers. Are the best, most experienced D.C. teachers concentrated in the wealthiest schools, while the worst are concentrated in the poorest schools? Or does the statistical model ignore the possibility that it’s more difficult to teach a room full of impoverished children?
Saunders thinks it’s harder for teachers in high-poverty schools. “The fact that kids show up to school hungry and distracted and they have no eyeglasses and can’t see the board, it doesn’t even acknowledge that,” he said.
But many researchers argue that value-added models don’t need to control for demographic factors like poverty, race, English-learner or special-education status at the individual student level, as long as enough test score data (at least three years) are included in the formula. They say states and districts choose to include demographic characteristics in the models to satisfy unions and other constituents—not because it’s statistically necessary.
William Sanders, a former University of Tennessee researcher now at the SAS Institute Inc., has spent nearly three decades working on a complex statistical formula that’s been adopted in districts serving a total of 12 million students around the country. With at least three years of test-score data from different academic subjects, he says he is able to home in on a good prediction of what a particular student’s progress should look like in a given year—and thus, how much a teacher should be expected to teach the student. Adding demographic factors only muddies the picture, he argues.
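To make the idea concrete, here is a minimal sketch, in Python, of the kind of calculation value-added models build on: predict each student’s current score from prior scores alone, then average the leftover gains by teacher. The column names, the two-year lookback, and the simple regression are all hypothetical simplifications; the actual formulas used by Sanders and by D.C.’s IMPACT system are substantially more elaborate.

```python
# A minimal sketch of the value-added idea -- NOT any district's actual formula.
# Assumes a toy DataFrame with hypothetical columns: "score_2011" (this year's
# test score), "score_2010" and "score_2009" (prior years), and "teacher".
import pandas as pd
import statsmodels.api as sm

def simple_value_added(df: pd.DataFrame) -> pd.Series:
    """Average residual gain per teacher, using prior scores as the only predictors."""
    # Predict this year's score from prior achievement alone -- the
    # "no demographic controls" specification described above.
    X = sm.add_constant(df[["score_2010", "score_2009"]])
    fit = sm.OLS(df["score_2011"], X).fit()
    # A teacher's rating is how far their students land, on average,
    # above or below what prior achievement predicted.
    return fit.resid.groupby(df["teacher"]).mean()
```

In this framing, the dispute in the article is over what else belongs on the right-hand side of that regression: nothing (Sanders), individual student characteristics such as poverty status (the D.C. model), or characteristics of the whole classroom.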
“If you’ve got a poor black kid and a rich white kid that have exactly the same academic achievement levels, do you want the same expectations for both of them the next year? If the answer is yes, then you don’t want to be sticking things in the model that will be giving the black kid a boost,” he said.
But Eric Isenberg, a Mathematica researcher and one of the designers of the IMPACT value-added model for Washington, D.C., says he’s “never been really compelled by the lower-the-expectations-for-students argument.” The D.C. model only uses one year of data, and incorporates the poverty status of individual students, among other factors, to protect against biasing the ratings.
“Nobody ever makes the argument that you’re holding the kids that started at a lower [achievement level] to lower standards,” he said.
There is also debate among researchers about whether the concentration of disadvantaged students in a classroom should be taken into account. Only a handful of value-added models do so.
A large body of research has found that student achievement is affected not only by a student’s individual circumstances at home, but also by the circumstances of other children in the same school and classroom. Studies have found that students surrounded by more advantaged peers tend to score higher on tests than similarly performing students surrounded by less advantaged peers.
To some experts, this research suggests that a teacher with a large number of low-achieving minority children in a classroom, for example, might have a more difficult job than another teacher with few such students.
D.C.’s model doesn’t account for classroom characteristics, but Florida’s model accounts for the percentage of students scoring at similar levels in a class, a variable that may partly address the issue.
Controlling for the demographics of a whole class can be messy, says Douglas Harris, a University of Wisconsin-Madison professor who has studied both value-added modeling and how a student’s peers affect his or her own achievement.
“It’s very hard in a statistical sense to separate for those things,” Harris said. “Accounting for the student level and the classroom and school level is not going to make that much difference.”
Isenberg agrees: “I haven’t seen anything to date that suggests peer effects make a large difference” in the context of value-added teacher evaluations. Nevertheless, he is currently leading research in D.C. and 30 other cities to see if factoring in the concentration of disadvantaged students in a class will make a difference in teachers’ scores.
Daniel McCaffrey, a senior statistician at the RAND Corporation, a nonprofit research group, argues that peer effects can make a difference, however. If there are enough years of test-score data, “including individual-level race and income … in the model doesn’t matter very much,” he said. On the other hand, including classroom-level data “tends to matter more and can make meaningful changes” to a teacher’s rating.
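As an illustration of the distinction McCaffrey draws, the sketch below (a hypothetical specification, not the D.C. or Florida formula) shows the difference between an individual-level poverty indicator and a classroom-level one: the same flag can be entered per student, or averaged within each teacher’s class and attached to every student in it. The column name "low_income" is an assumption for the example.

```python
# Hypothetical illustration only: extends the earlier sketch with a
# student-level poverty flag ("low_income") and, optionally, the share of
# low-income students in each teacher's class.
import pandas as pd
import statsmodels.api as sm

def value_added_with_controls(df: pd.DataFrame, classroom_level: bool = False) -> pd.Series:
    controls = ["score_2010", "low_income"]  # individual-level control
    if classroom_level:
        # Classroom-level concentration: the mean of the flag within each
        # teacher's class, repeated for every student in that class.
        df = df.assign(
            class_low_income=df.groupby("teacher")["low_income"].transform("mean")
        )
        controls.append("class_low_income")
    X = sm.add_constant(df[controls])
    fit = sm.OLS(df["score_2011"], X).fit()
    return fit.resid.groupby(df["teacher"]).mean()
```

The researchers quoted here suggest the individual-level term changes little once several years of prior scores are included, while the classroom-level term is the one that can meaningfully shift some teachers’ ratings.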
Sanders says that in his years of research, he has found no correlation between the concentration of disadvantaged students and school performance on value-added measures. “It becomes a question of where do you want to put your risk,” he said. Should school districts risk hiding the fact that high-poverty schools tend to get more ineffective teachers, he asked, or risk rating teachers with high numbers of disadvantaged students incorrectly?

COMMENTS (2)

EMMALIE
I taught for four years in the Baltimore City Public Schools, and while I do think the arguments about peer impact are valid, there’s a bigger point being left out of the debate described above.
Children who live with deep poverty, home or community violence, food insecurity, or unstable parental relationships experience toxic stress in early childhood. When cortisol, the stress hormone, is at high levels in a young child’s brain, the hippocampus doesn’t grow properly, leading to weaker short-term and long-term memory, learning, and thinking skills. Stress has a myriad of other long-term effects, none of them good (see the ACEs study by Anda et al. (2006) for some horrifying data).
Teachers in high-poverty areas not only have to overcome the additional challenges they may face because of students’ behaviors and lack of parent support, but also face the harder task of teaching children whose brains are not wired for learning. Until all children have identical early experiences (and that’s said with all due irony), it will be impossible to have blanket expectations of all teachers and students.
JOHN THOMPSON
You accurately say that a large body of research has found student performance is affected by peer effects. Then you could add in the professional judgments of teachers about the effects of concentrated poverty. And then there is common sense.
And yet you cite several researchers who haven’t found evidence to support that. It’s only fair to note that their failure to find such evidence cuts both ways. It is just as likely (or much, much more likely) that their failure reflects inherent flaws in value-added, or their lack of understanding of reality in inner-city schools, as it is that it outweighs the cumulative body of knowledge of researchers and educators. (For instance, I wonder how well DC people briefed their RAND researcher on the logistics of their schools when the model was being constructed.)
Firstly, there are whole ranges of poverty. The key issue is concentrations of generational poverty, which usually co-exist with high levels of trauma and low levels of social trust. And yet, I’ve never heard of a researcher controlling for high concentrations of seriously emotionally disturbed special education students, as opposed to straightforward learning disabilities.
Secondly, high-poverty magnet schools ought to produce high rates of growth, probably even greater than low-poverty schools. The effects of poverty wash out at the macro level, and value-added researchers work largely with data at that level; I wonder if that influences their perspectives.
And what do economists mean when they find that controlling for poverty only makes a small difference? What do they define as small? How much of their definitions are mathematical constructs? If a teacher has a 15% chance per year of having their career destroyed because they were misidentified, is that small?
Here’s what I’d call a significant difference: when the difficulty of meeting growth targets produces an exodus of teaching talent from schools with high concentrations of poverty. By then, it will be too late to say “oops!”
I hope everyone reads your piece in the context of Hechinger’s previous piece on the growing concentrations of poverty.
