Glitches, cyberattacks deal blow to credibility of Florida's new school tests

The debut of standardized computer testing in Florida has proved anything but standard for the state’s eighth-, ninth- and 10th-graders, including these ninth-graders at Land O’Lakes High School.
Published March 11, 2015

The debut of new standardized computer testing in Florida has proved anything but standard for the state's eighth-, ninth- and 10th-graders.

First came server problems caused by the state's vendor, American Institutes for Research. Then, according to officials, a cyberattack known as a "distributed denial of service," or DDOS, struck the system, and law enforcement is investigating.

Some kids took the test without a hitch. Some struggled to overcome the glitches. Some have yet to finish, days after starting.

The upshot has been doubt. Parents, educators and decisionmakers have demanded to know: Can the results ever be trusted?

"It's an excellent question — and one the Department of Education will need to answer empirically," said state Sen. Don Gaetz, a legislative leader on school issues and a former Okaloosa County superintendent.

Last year, Kansas leaders threw out results from their trial run of a statewide exam. Their situation was, in many ways, parallel to Florida's.

"Our first week we had problems that were really our fault," tied to server errors, said Marianne Perie, director of the University of Kansas Center for Educational Testing and Evaluation. "Then we had a period of time when we got hit by a DDOS."

Eventually, the state had a time of trouble-free testing.

But multiple studies of student performance led experts to recommend throwing out the scores.

One analysis showed that students who had problems with test access skipped an average of 15 percent of the questions, while those who had no difficulties skipped 1 percent. Another review matched students with similar academic and demographic characteristics, and found that the scores of children who tested while problems were occurring were no longer comparable to those of children who tested later.

"Bad scores are worse than no scores," Perie said, explaining why she recommended losing the scores.

As their name implies, standardized tests rely on uniform conditions in order to get usable information. The standards are considered a foundation of fairness, comparability and integrity in scoring.

It's difficult to test all students under identical circumstances, given human error, but schools try.

"It's what we don't know that kills us," said Scott Marion, associate director of the National Center for the Improvement of Educational Assessment.

Not knowing the full extent of the problems, and how they affected students, makes it difficult to analyze their effects, he said. Comparing the scores becomes problematic, making the data questionable.

"If you get to take your test and you get to go through it seamlessly, and I'm going through it and I don't, do you have an advantage? I think you probably do," Marion said. "That becomes an issue for comparability."

Florida could face that problem.

"Any sense of standard conditions seems to have been compromised by the events reported," said Steve Dunbar, a testing expert at the University of Iowa. "Problems of this frequency and magnitude are significant in how they might affect student performance on the test. They would not be tolerated in college admissions or certification testing."

The same could be said for school accountability testing that carries high stakes, he said. Florida's 10th-grade writing exam is part of the state's graduation requirement.

The Florida Department of Education has yet to address how it will handle results from this round of testing. Meanwhile, a growing number of critics have called upon state leaders to at least abandon attaching consequences to the scores, given the swirl of uncertainties.

Perie agreed that the state should make no decisions until it can run the statistics and determine if problems are real.

"They have to be able to compare (the first week) to a clean testing window," she said, adding that the evaluation must be conducted by an independent firm, not the state or its testing vendor.

At the same time, she added, "there's a difference between statistical analysis and credibility."

Perie referred to Oklahoma's 2014 testing cycle, during which vendor software did not work properly. A review uncovered no issues with the results, she said, but the public outcry prompted officials to throw out some student scores and fire the vendor.

"At a certain point, even if you don't like to give up on accountability, you're risking the credibility of the system," Marion said.

Miami-Dade superintendent Alberto Carvalho wondered whether Florida hadn't already hit that point.

He said the state rushed to put the Florida Standards Assessments into play, and its administration has been "less than smooth."

Even if everything goes well from here, Carvalho said, he questioned whether the problems so far had eroded the tests' credibility.

Will the state now push forward and still base school consequences on these scores? "That's a question that needs to be taken up," he said.

Times-Herald Tallahassee bureau reporter Kathleen McGrory contributed to this report. Contact Jeffrey S. Solochek at (813) 909-4614. Follow @jeffsolochek.