Measuring the Wrong Things

Feb 17, 2021

I AM TEMPTED TO WRITE THIS BLOG POST IN ALL CAPS but, of course, screaming at readers is neither polite nor effective.

The temptation to scream arises from accumulated frustration at the intellectual, philosophical and scientific disconnect between those who develop and implement policy and those who actually know how children learn and develop.

Measure the wrong things and you’ll get the wrong behaviors.

This simple statement succinctly characterizes why the American education system continues beating its head against the wall. The more aggressively the education establishment presses its data-driven nonsense, the worse schools will be. Every rise in test scores that vindicates the aggressive practices represents a corresponding deficit in the experiences that best serve children’s development.

Education reformers and so-called policy “experts” are constantly collecting and analyzing data. Many of these experts are, not surprisingly, economists. It’s not for nothing that economics is sometimes called “the dismal science.” The hostile takeover of education by non-educators is filled with intelligent-sounding phrases: “evidence-based,” “data-driven,” “metrics and accountability.” At every level of schooling, mountains of data are collected to inform “best practices” based on the alleged cause-and-effect implications of data-based instruction and the feedback gleaned from tests.

It is not coincidental that the education policy and reform business is highly profitable. Public education is estimated to be a $600-700 billion market. Those who drive the measuring and testing industry are first in line at the trough. Pearson Publishing, for example, has its greedy tentacles in nearly every school district in America. All the iterations of reform — No Child Left Behind, Race to the Top and, more recently, the Every Student Succeeds Act — are driven by (and driving) the collection and interpretation of data.

Throughout education, an increasingly rigid, closed loop of assessment is systematically making schools worse: Define things children should know or be able to do at a certain age; design a curriculum to instruct them in what you’ve decided they should know; set benchmarks; develop tests to see if they have learned what you initially defined; rinse and repeat.

This narrow, mechanistic approach to education does not correspond to the reality of child development and brain science, but the metrics and assessment train charges down the track nevertheless. So what’s wrong with that, you might ask? Isn’t school about teaching kids stuff and then testing them to see what they’ve learned?

In a word, “NO.” It simply doesn’t work, especially with young children.

As Boston College Professor Peter Gray wrote in a Psychology Today article:

Perhaps more tragic than the lack of long-term academic advantage of early academic instruction is evidence that such instruction can produce long-term harm, especially in the realms of social and emotional development.

“Direct instruction” does increase scores on the tests the instruction is aimed toward, even with very young children. This self-fulfilling prophecy is not surprising. But multiple studies also show that the gains in performance are fleeting — they completely wash out after 1-3 years when compared to children who had no such early direct instruction.

“Wash out” is too kind.

A comprehensive study of kindergartens in Germany revealed, as Gray writes:

Despite the initial academic gains of direct instruction, by grade four the children from the direct-instruction kindergartens performed significantly worse than those from the play-based kindergartens on every measure that was used. In particular, they were less advanced in reading and mathematics and less well adjusted socially and emotionally.

In another extensive study of poor children in Ypsilanti, MI, young boys and girls who were in academic, instruction-based early education programs were, by age 23, more than twice as likely to have arrest records, less likely to be married, and more likely to suffer from various types of emotional impairment than their peers who attended play-based preschool.

These behaviors (pressing academic work on young children) are a direct result of measuring the wrong thing (test scores). If we measured the right things (social development, curiosity, empathy, imagination and confidence), we would engage in a whole different set of education behaviors (play, socialization, arts programs, open-ended discovery).

After more than 20 years of reading, observing, teaching and presiding over a school, I’m convinced that this simple statement — “Measure the wrong things and you’ll get the wrong behaviors” — is at the root of what ails education, from cradle to grave. Measuring the wrong thing (standardized scores of 4th graders) drives the wrong behaviors (lots of test prep and dull direct instruction). In later school years, measuring the wrong thing (SAT and other standardized test scores, grade point averages, class rank) continues to invite the wrong behaviors (gaming the system, too much unnecessary homework, suppression of curiosity, risk-aversion, high stress).

Measuring the right things is more complicated and less profitable. But if we measured the things that we should truly value (creativity, joy, physical and emotional health, self-confidence, humor, compassion, integrity, originality, skepticism, critical capacities), we would engage in a very different set of behaviors (reading for pleasure, boisterous discussions, group projects, painting, discovery, daydreaming, recess, music, cooperation rather than competition).

Like most things in America, this infuriating insistence on accountability, based on testing, which drives “direct instruction,” which supplants more meaningful experiences, which in turn inhibits true growth - whew, a mouthful - disproportionately damages poor children, mostly of color. Data-driven practices may, temporarily, narrow the so-called achievement gap, but over time they significantly widen the “capacity gap,” a phrase I coin to identify the deficits in reasoning, critical thought, creativity, originality, emotional health and other qualities that limit life satisfaction and success.

Private schools, where many architects of policy send their children, would not dream of engaging in these sterile, unimaginative practices. Affluent public schools don’t either. They have the political capital to opt out of tests and the financial capital to offer rich menus of elective or extracurricular experiences that foster and celebrate the qualities that maximize life satisfaction and success.

The deepest irony is that if kids had the experiences I noted above - reading for pleasure, boisterous discussions, group projects, painting, discovery, daydreaming, recess, music, cooperation rather than competition - they would do better at reading and math too.


John Roeder

This was refreshing to read, Steve!


Duane Edward Swacker

Whatever is measured counts

Whatever counts is measured

And counting whatever measures

Is measuring whatever counts

SomeDam Poet.

The most misleading concept/term in education is "measuring student achievement" or "measuring student learning". The concept has been misleading educators into deluding themselves that the teaching and learning process can be analyzed/assessed using "scientific" methods which are actually pseudo-scientific at best and at worst a complete bastardization of rationo-logical thinking and language usage.

There never has been and never will be any "measuring" of the teaching and learning process and what each individual student learns in their schooling. There is and always has been assessing, evaluating, judging of what students learn but never a true "measuring" of it.

But, but, but, you're trying to tell me that the supposedly august and venerable APA, AERA and/or the NCME have been wrong for more than the last 50 years, disseminating falsehoods and chimeras?? Who are you to question the authorities in testing???

Yes, they have been wrong, and I (and many others: Wilson, Hoffman, etc.) question those authorities and challenge them (or any of you other advocates of the malpractices that are standards and testing) to answer the following onto-epistemological analysis:

The TESTS MEASURE NOTHING, quite literally when you realize what is actually happening with them. Richard Phelps, a staunch standardized test proponent (he has written at least two books defending the standardized testing malpractices) in the introduction to “Correcting Fallacies About Educational and Psychological Testing” unwittingly lets the cat out of the bag with this statement:

“Physical tests, such as those conducted by engineers, can be standardized, of course [why of course of course], but in this volume, we focus on the measurement of latent (i.e., nonobservable) mental, and not physical, traits.” [my addition]

Notice how he is trying to assert by proximity that educational standardized testing and the testing done by engineers are basically the same, in other words a “truly scientific endeavor”. Asserting sameness by proximity is not a good rhetorical/debating technique.

Since there is no agreement on a standard unit of learning, there is no exemplar of that standard unit and there is no measuring device calibrated against said non-existent standard unit, how is it possible to “measure the nonobservable”?

THE TESTS MEASURE NOTHING for how is it possible to “measure” the nonobservable with a non-existing measuring device that is not calibrated against a non-existing standard unit of learning?????

PURE LOGICAL INSANITY!

The basic fallacy here is the confusing and conflating of metrological measuring (metrology is the scientific study of measurement) with measuring in the sense of assessing, evaluating and judging. The two meanings are not the same, and conflating them is a very easy way to make it appear that standards and standardized testing are "scientific endeavors": objective, and not subjective like assessing, evaluating and judging.

Those supposedly objective results are then used to justify discrimination against many students for their life circumstances and inherent intellectual traits.

