By Lisa Gartner
Times Staff Writer
Teachers stop her in the hallways of the schools she visits, and they stop her at Publix, and they all ask Linda Lerner the same question: Why?
The Pinellas School Board member ran into one teacher in the grocery aisles who had been teacher of the year, who taught Advanced Placement classes, but who received a disappointing evaluation score.
Because she did not teach students who took the Florida Comprehensive Assessment Test, 50 percent of her evaluation was based on the performance of other teachers' students.
Teachers ask why. Where do they get these numbers?
Lerner and her colleagues have no answers.
When the Florida Legislature moved to base teacher evaluations on their students' FCAT score growth, it ran into a complication: Many teachers don't have students who take the test. So after calculating a score for each teacher under the "value-added model," or VAM, state-hired analysts aggregated them into a "schoolwide VAM score."
In the process, they invented a new set of numbers designed to measure the quality of teaching at a school. These numbers play an outsized role in the evaluations of thousands of teachers who don't have FCAT test-takers, comprising 40 to 50 percent of their ratings.
But a Tampa Bay Times analysis suggests that schoolwide VAM scores often don't match up with other state-driven measures of schools' success. Think Tarpon Springs High, Thurgood Marshall Fundamental Middle and Palm Harbor University High — all A schools with VAM scores suggesting teaching isn't what it should be.
Last week, Florida Senate President Don Gaetz questioned whether the evaluation system should be overhauled. Pinellas has declined to use the schoolwide numbers to make changes at the schools. Even state officials who distribute the scores can't fully explain how they should be used.
If schoolwide VAM scores are accurate measures of teaching, why is no one eager to make much hay over them? And if they're not, why are they being used so heavily in evaluations?
Over the push bar of her shopping cart, Lerner tells teachers what she and her colleagues tell everyone: They do not believe in these numbers.
• • •
The schoolwide VAM score is a temporary fix.
A teacher's individual score is an attempt to measure the impact he or she has on students' learning. It is meant to be separated from other factors affecting students: poverty, disabilities, gifted status and more. State-hired analysts consider a student's previous-year score and how similar students across Florida performed. Based on that, they predict how many points a student should improve on the FCAT.
If a student surpasses that predicted score, the teacher has "added value," hence the model's name. But if a student fails to meet the benchmark — even if the score improved — the teacher earns a low score.
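To make that logic concrete, here is a minimal sketch in Python. It is not the state's model, whose statistical details are not given in this article. It assumes, purely for illustration, that a teacher's score is the average gap between each student's actual and predicted score. The function names and all numbers are hypothetical.

```python
from statistics import mean


def student_residual(actual_score: float, predicted_score: float) -> float:
    """Points by which a student beat (positive) or missed (negative) the prediction."""
    return actual_score - predicted_score


def teacher_value_added(students: list[tuple[float, float]]) -> float:
    """Average residual over (actual, predicted) pairs.

    Assumption: a plain average stands in for the state's real model.
    """
    if not students:
        raise ValueError("teacher has no tested students")
    return mean(student_residual(actual, predicted) for actual, predicted in students)


# Hypothetical student: scored 300 last year, was predicted to reach 310, reached 306.
prior, predicted, actual = 300.0, 310.0, 306.0
improved = actual > prior                       # the student's score went up
residual = student_residual(actual, predicted)  # but fell short of the prediction

# Hypothetical two-student class: one misses the prediction, one beats it.
class_score = teacher_value_added([(306.0, 310.0), (320.0, 312.0)])
```

In this sketch the first student improved by six points yet still counts against the teacher, because the prediction called for ten.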
But some teachers — nearly 4,000 in Pinellas — are so far removed from the testing arena that the state can't evaluate them on an individual VAM score alone.
By the 2014-15 school year, districts will have created enough assessments to give all teachers an individual score.
To measure these teachers' performance in the interim, analysts created a schoolwide score by averaging individual scores at each school by subject and grade level. Those subject-and-grade averages are then averaged together again into a single number.
The product is a long, ugly decimal that hugs close to zero. A score that rounds to 0.15 would mean students scored 15 percent better than typical, similar children across Florida — thanks to quality teaching.
That is, if the number means what it says it means.
"Teaching is one of the most cognitively complex careers or professions that there is. So to take all of that and take all those kids and put it into one number is difficult," says Lisa Grant, director of professional development for Pinellas Schools.
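The two-step averaging behind the schoolwide score can be sketched in Python as follows. This is an illustration under stated assumptions, not the analysts' actual procedure: the article does not say whether the averages are weighted, so the sketch uses unweighted means, and the function name and all scores are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def schoolwide_score(teacher_scores: list[tuple[str, int, float]]) -> float:
    """Average individual scores within each (subject, grade) group,
    then average those group means into one number.

    Assumption: both averages are unweighted.
    """
    if not teacher_scores:
        raise ValueError("no individual scores to aggregate")
    groups: dict[tuple[str, int], list[float]] = defaultdict(list)
    for subject, grade, score in teacher_scores:
        groups[(subject, grade)].append(score)
    group_means = [mean(scores) for scores in groups.values()]
    return mean(group_means)


# Hypothetical individual scores at one school: (subject, grade, score).
scores = [
    ("reading", 9, 0.10),
    ("reading", 9, 0.30),
    ("reading", 10, -0.20),
    ("math", 9, 0.40),
]
school = schoolwide_score(scores)                  # mean of the three group means
flat = mean(score for _, _, score in scores)       # simple mean of all four teachers
```

The two results differ here because a mean of group means gives a one-teacher group the same weight as a two-teacher group. Which weighting the state uses is not stated in this article.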
• • •
As Florida's deputy chancellor for educator quality, Kathy Hebda believes schoolwide VAM scores are a more accurate measure of the quality of teaching than previous measures.
In her view, VAM scores level the playing field by removing outside-the-classroom influences. While Hebda acknowledges it's too soon to say how schools should respond to a low score, she believes such a score is a "hint" that something is happening.
According to the state's numbers, Tarpon Springs High, a solid A school, received one of the lowest scores in Pinellas. The score indicates teachers caused students to perform 28 percent worse than like students across Florida.
Palm Harbor University High, ranked the top Pinellas high school by FCAT, was middle-of-the-pack under VAM, although students did make gains. Pinellas Park came out on top.
A minus-0.28 VAM score put well-regarded Thurgood Marshall Fundamental Middle below several C and D schools.
"It was shocking," says principal Solomon Lowery.
School Board member Peggy O'Shea laughs at these scores.
"Where are they coming up with this stuff?" O'Shea said.
Bruce Proud, executive director of the Pinellas teachers' union, said some skilled teachers may leave the profession when they see scores "that have no real connection to them."
More than a few heads snapped up last week when Gaetz, the reform-minded president of Florida's Republican-dominated Senate, questioned whether the new evaluations were working.
Why, Gaetz asked, were schools with C or D grades boasting a teacher effectiveness rate of 90 percent or more?
"I think we have to start with drawing a line that connects those data points," he said. "And if we can't do that, then I think we're going to have a hard time explaining this to teachers and explaining it to parents."
• • •
Pinellas school administrators believe the public shouldn't see schoolwide VAM scores.
"Evaluations are very personal," Lisa Grant says. "We're in the middle of a paradigm shift, so there are some very good pieces about that, but (also) a measure that you don't always directly control, or even understand completely."
The Times asked the district for schoolwide data in December. The School Board attorney said the data revealed too much about a teacher's individual score.
After the Times got the numbers from the state, district officials continued to argue against publication. Late last week, for the first time, they revealed that they recalculate the state numbers for a more accurate measurement. But they declined to release those numbers as well.
• • •
Grant says a low score tells her nothing about a school's problems. Other measures allow her team to dig deeper. And one year of data is not a trend.
Up in Tallahassee, Kathy Hebda maintains that schoolwide scores are useful.
As for the nonbelievers, "In any new system, even in a system that may have been around for many years, improvements continue to be made," she says. "That happens in technology. It happens in the auto industry. It happens in everything."
Why, she asks, should a teacher evaluation be any different?
Lisa Gartner can be reached at firstname.lastname@example.org.