How does Florida's VAM work (in English)?
With this week's release of Florida teacher value-added, or VAM, scores, much has been said about how the complex statistical formula is unfair, imperfect, "junk science" and worse. "I will not dignify VAM's flawed and intellectually limited examination, celebrate its pseudo-high flyers, or bemoan its negative outliers," Miami-Dade superintendent Alberto Carvalho wrote on his Twitter account.
One frequent criticism of the system is that it's incomprehensible to most people, including teachers, and as such its public release means little. Chances are, observers said, that anyone who uses the VAM numbers will do so incorrectly, by trying to say one teacher or school is better or worse than another.
That prompted the question: how does the Department of Education actually calculate VAM scores? If you know what goes into the formula, and how it's accounted for, then perhaps you can determine whether comparisons are in order.
So we asked the DOE for an English-language explanation of how VAM works. The response was two pages long and, though complicated, easier to understand than the mathematical equation that makes so many people scratch their heads.
From this explanation, which you'll find in full at the end of this post, it becomes clear that these scores are calculated based on individual details, and not to compare teachers within schools, or from school to school. It notes, for instance, that a teacher's VAM score includes consideration for individual student performance and expectations, as well as classroom-level variables such as class size, and school-level specifics.
"Because schools exhibit differential amounts of student learning growth that may be attributable to independent factors at the school outside of the teacher’s control in addition to reflecting the teaching across the school, a common school component is also calculated, and added to the teacher-level calculation," the document explains.
Half of that component is included in a teacher's VAM score. Read the full overview below. Does it help?
Florida Department of Education VAM Overview
The formula for measuring student growth using FCAT Reading and FCAT Mathematics results is a covariate-adjusted Value-Added model. Value-added models are a form of statistical modeling designed to estimate a particular teacher’s contribution to student learning. The value-added model begins by establishing the expected learning growth for each student, called a predicted score. As a covariate adjustment model, Florida’s VAM model bases each student’s predicted score on the typical learning growth seen among students who share the characteristics, called covariates, that are statistically controlled for in the model. Florida’s VAM model contains the following student-level covariates:
• Up to two prior years of achievement scores (the strongest predictor of student growth)
• The number of subject-relevant courses in which the student is enrolled
• Students with Disabilities (SWD) status
• English Language Learner (ELL) status
• Gifted status
• Mobility (number of transitions)
• Difference from modal age in grade (as an indicator of retention)
Other covariates the model considers are measured at the classroom level. Because the model takes into account relationships between student and classroom characteristics, data in the model are said to be “nested” (in this case students, within teachers, within schools). Classroom-level covariates in the model include:
• Class size
• Similarity of prior test scores among students in the class
Because schools exhibit differential amounts of student learning growth that may be attributable to independent factors at the school outside of the teacher’s control in addition to reflecting the teaching across the school, a common school component is also calculated, and added to the teacher-level calculation. The common school component describes the amount of learning growth by grade and subject that is typical for students in each school that differs from the statewide expectation. Fifty percent of the common school component is included in a teacher’s VAM score.
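To make the 50 percent rule concrete, here is a minimal sketch in Python. It assumes the teacher-level piece and the common school component are already known and are simply added together, as the overview describes; the function name, the variable names and the sample numbers are ours for illustration, not the DOE's.

```python
# Illustrative only: names and numbers are hypothetical, not from the DOE.
SCHOOL_SHARE = 0.5  # "Fifty percent of the common school component"


def vam_score(teacher_component, school_component, school_share=SCHOOL_SHARE):
    """Add half of the common school component to the teacher-level piece."""
    return teacher_component + school_share * school_component


# A school whose students grow more than the statewide expectation
print(vam_score(6.0, 8.0))   # 6.0 + 0.5 * 8.0
# A school whose students grow less than the statewide expectation
print(vam_score(6.0, -8.0))  # 6.0 + 0.5 * -8.0
```

In this sketch, two teachers with the same teacher-level result end up with different scores if their schools differ, which is one reason comparisons across schools are tricky.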
The predicted score for each student is based on the FCAT developmental scale and is estimated by computing the average performance of students with the same values on the covariates, including how they performed on the test the prior year. This predicted score is then compared with their actual performance on the test for the current year to determine how much above or below this expected score the student actually performed. The difference between these two scores is called a residual, which will be positive in cases where a student’s performance exceeded the expectation, and negative in cases where a student’s performance fell short of the expectation.
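The predicted score and residual can be illustrated with a small Python sketch. It is a deliberate simplification: it treats each student's covariates as a single "profile" label and takes the plain average score of students who share that label, whereas the state's actual model is a statistical estimate. The student records, profile labels and scores below are made up.

```python
from collections import defaultdict
from statistics import mean


def predicted_scores(students):
    """Average current-year score for each covariate profile (simplified)."""
    by_profile = defaultdict(list)
    for s in students:
        by_profile[s["profile"]].append(s["score"])
    return {profile: mean(scores) for profile, scores in by_profile.items()}


def residuals(students):
    """Actual score minus predicted score, per student."""
    expected = predicted_scores(students)
    return {s["id"]: s["score"] - expected[s["profile"]] for s in students}


# Hypothetical students; "profile" stands in for the full covariate list.
students = [
    {"id": "a", "profile": ("prior=300", "ELL"), "score": 330},
    {"id": "b", "profile": ("prior=300", "ELL"), "score": 310},
    {"id": "c", "profile": ("prior=320", "gifted"), "score": 350},
]
print(residuals(students))
```

Students "a" and "b" share a profile with a predicted score of 320, so "a" gets a positive residual and "b" a negative one of the same size.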
The model groups students based on the teachers’ classes they are assigned to in the appropriate reading/English language arts and mathematics courses aligned to FCAT. The model is then run for each subject (reading or mathematics) and at each individual grade level taught by the teacher in order to produce a VAM score that is either positive, negative, or “0.” The VAM score represents the amount, on average, that students taught by a given teacher performed above or below their predicted level of performance. A positive score indicates that the teacher’s students performed better than expected; a negative score indicates that the teacher’s students performed worse than expected; and a score of “0” indicates that the teacher’s students performed no better or worse than expected based on the factors accounted for in the model.
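Read literally, that description amounts to averaging student residuals for each teacher, subject and grade. The Python sketch below does exactly that and nothing more; it is an assumption-laden simplification, since the real model estimates teacher effects inside a nested statistical model and also folds in the school component. The teacher names and residuals are invented.

```python
from collections import defaultdict
from statistics import mean


def teacher_scores(rows):
    """Mean residual per (teacher, subject, grade).

    rows: iterable of (teacher, subject, grade, residual) tuples.
    """
    groups = defaultdict(list)
    for teacher, subject, grade, residual in rows:
        groups[(teacher, subject, grade)].append(residual)
    return {key: mean(values) for key, values in groups.items()}


# Hypothetical residuals (actual score minus predicted score)
rows = [
    ("Teacher A", "reading", 4, 12),
    ("Teacher A", "reading", 4, -2),
    ("Teacher A", "math", 5, -6),
    ("Teacher B", "reading", 4, 0),
]
for key, score in teacher_scores(rows).items():
    label = "above" if score > 0 else "below" if score < 0 else "as"
    print(key, score, f"({label} expected)")
```

Note that the same teacher can land above expectations in one subject and grade and below in another, because the scores are computed separately.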
The grade-level specific VAM scores generated for teachers can also be combined into an aggregate score. To account for differences in the FCAT vertical scale across grade levels, subject areas, and years, the grade-level specific VAM scores are converted to a common metric – “the proportion of an average year’s growth.” This conversion provides a common metric across grade levels and subjects covered by the statewide assessment and provides context to describe the magnitude of the gain or decrease in learning represented by the score. For example, if the average amount of growth for a given grade, subject, and year is 40 scale score points, transforming a VAM score of 20 points into a proportion yields a score of 0.50 (i.e., 20 divided by 40). Now, one can interpret the raw, grade-level specific VAM score of 20 to say on average students performed 50 percent higher than an average year’s growth. Aggregate scores are produced for one year (combining across grade levels and subjects taught by a teacher in a given year), two years (combining across the two most recent years, as well as grades and subjects), and three years (combining across the three most recent years, as well as grades and subjects taught by the teacher).
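The conversion in that example is simple division, sketched here in Python. The overview does not say how the converted scores are combined into an aggregate, so the student-count weighting in `aggregate` is purely our assumption, as are the function names and the second set of numbers.

```python
def proportion_of_year(vam_points, average_growth_points):
    """Convert a raw VAM score to a proportion of an average year's growth."""
    if average_growth_points == 0:
        raise ValueError("average growth must be non-zero")
    return vam_points / average_growth_points


def aggregate(scores):
    """Combine (proportion, n_students) pairs into one score.

    Weighting by student count is an assumption; the overview does not
    specify the combining rule.
    """
    total_students = sum(n for _, n in scores)
    return sum(p * n for p, n in scores) / total_students


# The overview's own example: 20 points against 40 points of average growth
print(proportion_of_year(20, 40))

# Hypothetical teacher with two equally sized classes
print(aggregate([(0.5, 20), (-0.25, 20)]))
```

Because every score is expressed as a share of a typical year's growth, a 20-point result in one grade and a 15-point result in another can be put on the same footing before they are combined.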