Are High-Quality Schools Enough to Close the Achievement Gap? Evidence from a Bold Social Experiment in Harlem

By agelman on November 11, 2009.

Steve Levitt links to this article by Will Dobbie and Roland Fryer on an educational innovation to improve the education of ethnic minority children. Dobbie and Fryer write:

Harlem Children's Zone (HCZ) is arguably the most ambitious social experiment to alleviate poverty of our time. We [Dobbie and Fryer] provide the first empirical test of the causal impact of HCZ on educational outcomes, with an eye toward informing the long-standing debate whether schools alone can eliminate the achievement gap or whether the issues that poor children bring to school are too much for educators to overcome.

Their conclusions are extremely positive:

Harlem Children's Zone is enormously effective at increasing the achievement of the poorest minority children. Taken at face value, the effects in middle school are enough to reverse the black-white achievement gap in mathematics and reduce it in English Language Arts. The effects in elementary school close the racial achievement gap in both subjects. Harlem Gems and The Baby College, the only two community programs in HCZ that keep detailed administrative data, show mixed success. We conclude by presenting three pieces of evidence that high-quality schools or high-quality schools coupled with community investments generate the achievement gains. Community investments alone cannot explain the results.

Here's how they address the potential concern that kids in the program will be better-prepared than the control group of kids not in these schools:

We implement two identification strategies. First, we exploit the fact that HCZ charter schools are required to select students by lottery when the demand for slots exceeds supply. Second, we use the interaction between a student's home address and cohort year as an instrumental variable.

Here's the punch line:

"Winners" here are students who receive a winning lottery number or who are in the top ten of the waitlist.
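
To make the lottery comparison concrete, here is a minimal sketch of the simplest version of such an analysis: a difference in mean test scores between lottery winners and lottery losers, with a standard error. This is not Dobbie and Fryer's actual code or model (their analysis involves more than a raw difference in means); the function name `itt_estimate`, the sample sizes, and the simulated effect of 0.3 standard deviations are all assumptions made up for illustration.

```python
import math
import random
import statistics


def itt_estimate(winner_scores, loser_scores):
    """Difference in mean scores (winners minus losers) and its standard error.

    Hypothetical helper for illustration: compares students by lottery
    outcome, regardless of whether the winners actually enrolled.
    """
    diff = statistics.mean(winner_scores) - statistics.mean(loser_scores)
    se = math.sqrt(
        statistics.variance(winner_scores) / len(winner_scores)
        + statistics.variance(loser_scores) / len(loser_scores)
    )
    return diff, se


# Simulated data only: these are NOT numbers from the paper.
rng = random.Random(0)
assumed_effect = 0.3  # assumed effect, in standard-deviation units
losers = [rng.gauss(0.0, 1.0) for _ in range(200)]
winners = [rng.gauss(assumed_effect, 1.0) for _ in range(200)]

diff, se = itt_estimate(winners, losers)
print(f"estimated effect: {diff:.2f} (se {se:.2f})")
```

Because the lottery assigns winners at random, the two groups should be comparable on average before the program starts, which is what allows a simple comparison like this to be read causally.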

They also show results for English tests, which are positive but less impressive. They remark that "Interventions in education often have larger impacts on math scores as compared to [English] scores (e.g. Decker et al., 2004; Rockoff, 2004; Jacob, 2005). This may be because it is relatively easier to teach math skills, or that reading skills are more likely to be learned outside of school. Another explanation is that language and vocabulary skills may develop early in life, making it difficult to impact reading scores later (Hart and Risley, 1995)."

What does this all mean?

I haven't looked at the statistical details of this paper--that's hard work!--but I do have a few comments, to be made on the assumption that Dobbie and Fryer's analysis is essentially correct.

My first comment is that my mindset, before reading this paper, was that more effective teaching methods do exist--KIPP and the like--and that the way they work is by getting the teachers and students to work harder and longer than is usual during the school day. The Dobbie and Fryer paper did not change my view on this; they write "Our rough estimate is that Promise Academy students that are behind grade level are in school for twice as many hours as a typical public school student in New York City. Students who are at or above grade level still attend the equivalent of about fifty percent more school in a calendar year."

This is not to dismiss the findings--it's not so easy to motivate teachers and students to work twice as hard--but just to connect these results to other things that I've heard.

My second comment is that these schools are described as a way to close the gap between whites and blacks in school performance. But if they're so effective, maybe they'd be applied to white kids also? Or is the point that these school changes would really only be applied as part of a package of interventions in predominantly-minority neighborhoods? I'd like to hear more about this issue in the Conclusion section of the article, which raises the idea of following up in regular public schools.

Silly little things

Dobbie and Fryer's paper has excellent graphs--something you don't always see in work by economists. I'm happy to see that the top economists are presenting their work graphically--this seems like an excellent sign. I just have a couple of minor comments:

I'd prefer if Figure 1 (the map) were shown in a non-distorted way and with more information that is relevant to the study. For example, more information about exactly where the kids live, where the schools are, etc. The existing map is hard to read partly because it is distorted (or so it looks to my eyes), meaning that the distance scale is not so meaningful; also, the orange background color makes it hard to see any details at all. Beyond this, the map includes irrelevant information such as the path of the Central Park road; this is the sort of thing that Ed Tufte correctly calls "chartjunk." In this case, the authors didn't add the chartjunk; they just put their info on an existing map. Nonetheless, the end result of this otherwise-potentially-useful map is to show nothing much more than that the Harlem Children's Zone is, indeed, located in Harlem.

Figure 2 is just great. I have only a few small suggestions:
- Reduce the y-axis scale. There's no reason to go all the way from -.6 to +.5; you can restrict to the range of the data, which is from -.4 to +.3. Even a small change like this will help a lot, actually.
- There's something weird going on with the y-axis. You can't put "percent enrolled" on the same scale as test scores! That's like saying that my groceries cost $25 and it's 15 degrees out, so my groceries are higher than the temperature. Also, you have to be careful with the whole "percentage" thing. Does ".2" on the percentage scale correspond to 0.2% or to 20%?
- Also, once you get rid of the percentage thing, you can really expand the scale, because the red and blue lines are all between -.4 and .02 on the y-axis.
- Beyond this, how do we interpret a test score of -.2? That doesn't seem right. I assume that the actual scores are positive, and that this is all explained in the text, but I really think that graphs should be as self-contained as possible.
- The color scheme is great (once you can explain how percentages and test scores fit on a common scale). I'd recommend labeling the lines directly rather than using a legend. Once you fix the scale, the lines will be farther apart also.
- 2003 should come before 2004. In the graph shown, 2004 is on the left and 2003 is on the right, which is counter to the conventional way of displaying time ordering.
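
On the negative test scores: one common convention that would produce a value like -.2 is standardization, where each raw score is re-expressed as the number of standard deviations above or below a reference mean. Whether Figure 2 uses exactly this scale is an assumption here, not something confirmed above. A minimal sketch, with a hypothetical `standardize` helper and made-up raw scores:

```python
import statistics


def standardize(scores):
    """Convert raw scores to z-scores: (score - mean) / standard deviation.

    Hypothetical helper; uses the population standard deviation of the
    scores passed in as the reference distribution.
    """
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)
    return [(s - mean) / sd for s in scores]


# Made-up raw scale scores, for illustration only.
raw_scores = [620, 650, 680, 710, 740]
z_scores = standardize(raw_scores)

for raw, z in zip(raw_scores, z_scores):
    print(f"raw {raw} -> standardized {z:+.2f}")
```

On such a scale, every raw score is positive, but any student below the reference mean gets a negative standardized score. This is also why a proportion enrolled does not belong on the same axis: the two quantities have different units.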

I won't go over the other graphs line by line, except to say that they're basically fine. I would prefer, however, that they use a consistent color scheme throughout. In Figure 2, blue represents Math score and red represents English score; in the other figures, blue means Lottery Winners and red means Lottery Losers.

And then there are the tables. I think you know already what I'm going to say, so I won't bother to say it. (I mean, 10.424 with a standard error of 7.167? What are these people thinking?) I know, I know, default choices don't need to be justified. But, still . . .
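
For readers who don't already know the complaint: one reading is that an estimate whose standard error is about 7 does not deserve three decimal places. Here is a minimal sketch of a rounding rule along those lines, keeping only as many decimals as the standard error's leading significant digit or two. The rule and the `round_to_se` helper are assumptions for illustration, not a convention taken from the paper.

```python
import math


def round_to_se(estimate, se, se_digits=1):
    """Round an estimate and its standard error to the same decimal place.

    The decimal place is chosen so that the standard error keeps
    `se_digits` significant digits. Hypothetical helper for illustration.
    """
    if se <= 0:
        raise ValueError("standard error must be positive")
    decimals = se_digits - 1 - math.floor(math.log10(se))
    return round(estimate, decimals), round(se, decimals)


# The example from the text: 10.424 with a standard error of 7.167.
print(round_to_se(10.424, 7.167))               # one significant digit of the se
print(round_to_se(10.424, 7.167, se_digits=2))  # two significant digits of the se
```

With one significant digit this reports roughly 10 with a standard error of 7; with two it reports 10.4 and 7.2. Either way the table loses nothing the data can actually support.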

It's worth emphasizing, at this point, that I think the authors present their results very well, both graphically and in the text of their article. It's only because they took the leap to make these solid graphs that I can take the next step and try to help them do even better next time. I think one of the roles of a statistician such as myself is to help researchers do their jobs even better--and this is particularly satisfying in settings such as this, where there's no way I would've been doing the research myself.

The last line of the acknowledgments says, "The usual caveat applies." I have no idea what that means--something in economics-speak? I have noticed in general that econ papers have longer acknowledgment sections than stat papers do. My theory has always been that economists write fewer articles and put more time into each one, whereas statisticians spit out articles at a machine-gun rate and don't look back. The two fields have different systems: my impression is that in econ, it's a big deal to be published in the American Economic Review or wherever, whereas, in stat, an article in JASA or Annals of Statistics or wherever won't necessarily get noticed anyway.

"The usual caveat applies." Usually you thank your chums for improving your paper, while accepting that all the surviving flaws are your own fault. Quite why economists find it necessary to abbreviate such a display of good manners I don't know. Maybe economists think that good manners are effeminate? They do have a reputation for being ill-behaved oafs.

There's this school of thought out there that these lottery winner/lottery loser comparisons are the way to go, the so-called "gold standard" in charter school research.

Frankly, it's bunk. There are important issues with this kind of design.

I wrote a little bit about this on Gotham Schools this fall. This keeps coming up, and this methodology has not been improved in a long time.

http://gothamschools.org/2009/09/23/what-is-the-gold-standard/

Alexander, I read your post. As I mention on your blog, you make good points, but don't offer improvements or acknowledge how much better this methodology is than most studies in education.

"My second comment is that these schools are described as a way to close the gap between whites and blacks in school performance. But if they're so effective, maybe they'd be applied to white kids also?"

Well said.

It's striking that the way everybody tries to sell educational improvements these days is by saying they will close the black-white gap. You might think that the goal of educational reform would be to help all students come closer to achieving their individual potentials, but nobody talks like that anymore.

The "percent enrolled" in the figures should be "fraction enrolled". That is, "0.2" means 20%. It makes no sense otherwise.

I suppose people speak of "closing the black-white gap" because the gap seems incredibly unfair -- these are just kids. And the gap is often large and, large or small, has a specific size. So when you promise to close the gap you are being more specific than simply promising improvement.

Andrew,

I was alarmed to read that educators are not doing enough to reduce the achievement gap.

Specifically, recent news reports indicate that differences between the highest and lowest segments of student populations haven't changed much. I am shocked. How can a statistical distribution STILL follow a bell curve in this day and age with all the money that has been spent to fix that!
