Bigger samples, and account for the girls, please
The BUROS Center for Testing - the University of Nebraska agency the state hired to review the FCAT - has issued its first set of recommendations on how to improve the scoring of the test. The hope is that Florida avoids another embarrassment like the one it suffered when it was forced to admit that the 2006 third-grade reading results had been inflated because of bad methods.
Admittedly, we at the Gradebook aren't testing experts. The word psychometrician actually scares us a bit. But we can gather this much from the pros' rather technical writings.
First, the reviewers suggest that Florida uses too small a sample of students - about 1,500 - when trying to determine whether the questions are appropriate for the grade level. "We believe that by increasing the size of this equating sample, the year-to-year fluctuations will be decreased," they write.
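For the curious, here is a rough sketch of why a bigger sample should mean less bouncing around from year to year. It is not the reviewers' equating method, just the textbook standard-error formula for a proportion. The function name, the 50 percent correct rate and the larger sample sizes are all our own assumptions for illustration; only the 1,500 figure comes from the report.

```python
import math


def standard_error(p: float, n: int) -> float:
    """Textbook standard error of a sample proportion p based on n students."""
    return math.sqrt(p * (1 - p) / n)


# Hypothetical: half of the sampled students answer a question correctly.
P_CORRECT = 0.5

# 1,500 is the sample size cited in the report; the others are made up.
for n in (1500, 6000, 24000):
    se = standard_error(P_CORRECT, n)
    print(f"sample of {n:>6}: standard error about {se:.4f}")
```

The takeaway under these assumptions: the uncertainty shrinks with the square root of the sample size, so quadrupling the sample cuts the expected wobble roughly in half.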
Second, they recommend using gender as a variable when assessing a question. Why? Girls do better than boys, on average, on the reading tests. This matters because girls have made up a larger share of the testing sample than of the entire student body. Meaning, boys might be rated unfairly because too many girls are counted.
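To see how that could play out, here is a small sketch with numbers we made up entirely. The report, as far as we can tell, does not give these figures, and this is simple reweighting rather than whatever method the reviewers have in mind. The function name `weighted_mean` and every percentage below are hypothetical.

```python
def weighted_mean(group_means: dict, group_shares: dict) -> float:
    """Average the group results, weighting each group by its share."""
    return sum(group_means[g] * group_shares[g] for g in group_means)


# Hypothetical share of students answering a question correctly, by group.
correct_rate = {"girls": 0.70, "boys": 0.60}

# Hypothetical makeup of the testing sample vs. the whole student body.
sample_shares = {"girls": 0.60, "boys": 0.40}
population_shares = {"girls": 0.50, "boys": 0.50}

as_sampled = weighted_mean(correct_rate, sample_shares)
reweighted = weighted_mean(correct_rate, population_shares)

print(f"correct rate as sampled:       {as_sampled:.2f}")
print(f"correct rate after reweighting: {reweighted:.2f}")
```

With these invented numbers, the girl-heavy sample makes the question look a bit easier than it would be for the student body as a whole. Accounting for gender is one way to correct for that.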
Well, that's the way we read it. Check the document for yourself if you wish, by clicking here. Oh, yeah. The FCAT review panel meets again tomorrow in Orlando to talk about this report. To see the agenda, click here.