Scaling up vs. spreading out

I’ve been thinking of scale lately. Not the musical sort, but the concept of size. One of the common questions asked in education research is “how well does this scale up?” It’s already no small feat to invent a practice or intervention that makes a measurable difference to students on the scale of a classroom or school. It’s another thing entirely to “scale it up” to a district, state, or nation.

But is “scaling up” even a proper goal? This recent article in The New Republic calls into question the wisdom of scaling up successful interventions in the international development arena. The basic argument is that “context matters.” To quote from the article:

The repeated “success, scale, fail” experience of the last 20 years of development practice suggests something super boring: Development projects thrive or tank according to the specific dynamics of the place in which they’re applied. It’s not that you test something in one place, then scale it up to 50. It’s that you test it in one place, then test it in another, then another. No one will ever be invited to explain that in a TED talk.

The scaling up dynamic seems to work reasonably well in some fields of medicine. A drug or vaccine is developed, tested on a small group of volunteers, found effective in a larger scale trial, and eventually comes to market. The drug “scales up” after passing several smaller-scale tests. We don’t feel the need to re-test the efficacy of a polio vaccine in each and every city of the US, for instance.

But medical research and social research are different in key ways. First, the nature of the “treatment” in medicine is often a pill – something that can be standardized, and whose mechanism relies more on chemistry than on behavior or cognition. That is, we know that if you administer a polio vaccine to 100,000 children, some (high) percentage will develop immunity to polio. It’s basic immunology. Sure, there will be outliers for whom it doesn’t work, and we need to have backup plans for those few.

But the New Republic article cites a similar medical-like intervention – de-worming children. After implementing a de-worming program in Kenya, the researcher found:

The deworming pills made the kids noticeably better off. Absence rates fell by 25 percent, the kids got taller, even their friends and families got healthier. By interrupting the chain of infection, the treatments had reduced worm infections in entire villages. Even more striking, when they tested the same kids nearly a decade later, they had more education and earned higher salaries. The female participants were less likely to be employed in domestic services.

So of course one would want to scale up that intervention. It’s a pill, after all. Completely standardized and easy to administer. Why not de-worm an entire nation? Well, they tried that program within several states of India, and the results were… unclear. I strongly suggest you read the full article for the nuances, but here’s the author’s punch line for this part of the argument:

In 2000, the British Medical Journal (BMJ) published a literature review of 30 randomized control trials of deworming projects in 17 countries. While some of them showed modest gains in weight and height, none of them showed any effect on school attendance or cognitive performance. After criticism of the review by the World Bank and others, the BMJ ran it again in 2009 with stricter inclusion criteria. But the results didn’t change. Another review, in 2012, found the same thing: “We do not know if these programmes have an effect on weight, height, school attendance, or school performance.”

The underlying point is this: many, many things contribute to children’s health, school attendance, and intellectual development. Carrying parasites is one of many problems they face. Eradicating worms is unquestionably a “good thing” in itself, but it may not (re)produce the outcomes one was expecting. And, like it or not, lots of “good things” have to get rated against one another when resources are limited.

So if something as controlled and unproblematic as giving a pill can have radically different results in different social settings, how in the world do we contemplate scaling up a much less standardized intervention like giving every child a laptop? Changing a textbook for an entire state? (I note in passing that textbooks are not like pills – teachers pick and choose which aspects of a book to use and which to ignore). Much of the Big Money in education research is looking for magic interventions that are scalable. As one who works in the trenches of evaluation, I can tell you just how hard it is to tease apart why some things work in some situations and not others.

Thus far I’m convincing myself that we should be conservative with our scaling up desires. Local context matters. Now let’s ask the question: when can one make a case for universal policy? I suspect this is a no-brainer for Policy 101 students, but my last policy analysis class was taught by an active alcoholic while I was in the throes of an undergraduate depression.

We have a national civil rights policy – rooted in education policy – that prohibits de jure discrimination by race. (Whether de facto discrimination is addressed appears to depend on the priorities of a given administration.) But we don’t permit “local control” or “local choice” in matters of racial discrimination, nor should we. I can think of other nasty actions we try to outright prohibit on a large scale: corporal punishment and gender discrimination come to mind.

So where does local control run the risk that the locals will “get it wrong”? Where does a one-size-fits-all policy run the risk of having no effect or, worse, causing harm to a significant segment of the population? In particular I’m thinking of the Common Core standards movement, an attempt to bring unity to academic standards across the states (and, as a consequence, make it much easier and cheaper to have a single national achievement test). What “problem” does Common Core purport to solve? And is it as likely as not to cause problems if local variation is not supported?

I welcome any reader to chime in with a thought.


