
For-profit standardized testing industry can't be trusted

Published May 19, 2012

The scores for the writing portion of this year's FCAT plummeted so precipitously that the abilities of Florida's student writers aren't even being called into question. The validity of the scoring statistics is. While I don't want to say "I told you so" regarding the dubiousness of those statistics, I did tell you so, as my 2009 book highlighted in detail all the ways the numbers produced by the for-profit standardized testing industry cannot be trusted.

Take the stats produced at Pearson scoring centers around the country, where I worked for the better part of 15 years. On the first project I worked scoring student essays, I had to pass a qualifying exam to stay on the job. When I failed that qualifying exam (twice), I was unceremoniously fired. So were half the original hundred scorers who had also failed the tests. Of course, when Pearson realized the next morning they no longer had enough scorers to complete the project on time, they simply lowered the "passing" grade on the qualifying test and put us flunkies right back on the job.

Yes, those of us considered unable to score student essays 12 hours before were welcomed back into the scoring center with open arms, deemed qualified after all.

Such duplicity was not an aberration in my experience either. For a decade and a half I saw every sort of corporate chicanery and statistical tomfoolery. The test-scoring industry seemed focused on getting deadlines met, projects completed and scores put on tests, but only then did any thought seem to be given to meaningful scores being put on them.

I regularly saw unqualified people (myself included, apparently) keep their jobs scoring student responses even when they were altogether no good at the job, either when the acceptable qualifying grades were dropped so low that anyone could meet them, or when the correct answers to the qualifying exams were handed out even before the tests were taken.

I regularly saw statistics get doctored to make group reliability numbers (agreement between the scorers) look better than they really were, as high reliability stats were necessary to convince customers how standardized a job was being done and how "valid" the work really was. I regularly saw distribution numbers fixed to make score results look however a client might have wanted.

Once I attended a range-finding meeting with other test-scoring experts and English professors from around the country, the bunch of us trying to figure out how to score writing samples for a national test. After that group of experienced test scorers and esteemed writing teachers had hammered out some consensus regarding the writing rubric and writing samples we'd been reviewing, we were told we were scoring "wrong." We test-scoring experts and writing teachers were told our scoring wasn't matching the predictions of the omniscient psychometricians (statisticians/testing gurus), and we were told we had to match those predictions even though the psychometricians had never actually seen the student responses.

When the next year I read in the New York Times that student writing scores had ended up exactly in the middle of the psychometricians' predictions, I can't say I was surprised: We had made sure they did.

And that's the thing: In my experience, the for-profit test-scoring industry could produce results on demand. There was no statistic that couldn't be doctored, no number that couldn't be fudged, no figure that couldn't be bent to our collective will. Once, when a state Department of Education (it wasn't Florida's) didn't like the distribution of essay scores we'd been producing over the first two weeks of a project, we simply followed its instruction to give more upper-level scores. "More 3's!" became our battle cry on that project, even if randomly giving more 3's was fundamentally unfair to all the students whose essays had been assessed differently in the days before.

In the end, I guess I'm saying you probably needn't worry too much about this year's falling FCAT scores, because they're only a number. If you want a different number next year, just ask; surely Pearson will just make more.

Todd Farley is the author of "Making the Grades: My Misadventures in the Standardized Testing Industry." His opinions have been published in the New York Times, Washington Post and Education Week.

