Saturday, April 22, 2017

Publication Bias In Climate Science

Some recent research I've been doing has led to an interesting experience. I'm always frustrated at the way science is communicated to the public, and this was another example of something that disappointed me.

I was trying to figure out if there is publication bias in climate science. More specifically, I was looking for a funnel plot for the "climate sensitivity," something that would quickly and graphically show that there is a bias toward publishing more extreme sensitivity values.

Climate sensitivity is the response of the Earth's average temperature to the concentration of CO2 in the atmosphere. The relationship is logarithmic, so a doubling of CO2 will cause an X-degree increase in average temperature. To increase it by another X degrees would require another doubling, and so on. Obviously there are diminishing returns here: each further X degrees of warming takes twice as much added CO2 as the one before it.
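To make the logarithmic relationship concrete, here is a minimal Python sketch. The function name, the 280 ppm baseline, and the default of 1 degree per doubling are illustrative assumptions, not measurements; the true sensitivity is exactly what is in dispute.

```python
import math

def warming_from_co2(co2_ppm, sensitivity_per_doubling=1.0, baseline_ppm=280.0):
    """Temperature change (deg C) relative to the baseline, assuming a purely
    logarithmic response: one 'sensitivity' of warming per doubling of CO2.

    All default values are illustrative assumptions.
    """
    doublings = math.log2(co2_ppm / baseline_ppm)
    return sensitivity_per_doubling * doublings

# Each successive doubling adds the same increment of warming,
# even though it takes twice as much added CO2 as the previous doubling.
for ppm in (280, 560, 1120, 2240):
    print(f"{ppm:5d} ppm -> {warming_from_co2(ppm):.1f} deg C above baseline")
```

Passing a different `sensitivity_per_doubling` rescales every row by the same factor, which is why the whole argument turns on that one number.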

If we focus on just the contribution from CO2 and ignore feedback, this problem is perfectly tractable and has an answer that can be calculated by paper and pencil. In fact, Arrhenius did so in the 19th century. (He even raved about how beneficial an increase in the Earth's temperature would be, but obviously many modern scientists disagree with his optimism.) A doubling of atmospheric carbon gets you a 1° Celsius increase in average temperature. The problem is that carbon is only part of the story. That temperature increase leads to there being more water vapor in the atmosphere, and water vapor is itself a very powerful greenhouse gas. So the contribution from water vapor amplifies the contribution from carbon, or so the story goes. This doesn't go on forever in an infinite feedback, but "converges" to some value. There are other feedbacks, too, but my understanding is that water vapor is the dominant amplifier.
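The "converges" point can be illustrated with a geometric series. Suppose, purely as an assumption for illustration, that every degree of warming triggers a further f degrees through feedbacks, with 0 <= f < 1. Then the total is initial * (1 + f + f^2 + ...) = initial / (1 - f). The function names and the values of f below are hypothetical, picked only because they turn a 1° no-feedback response into the 3° and 6° figures that come up in this debate.

```python
def total_warming(no_feedback_warming, feedback_fraction):
    """Closed-form limit of the feedback series (hypothetical toy model)."""
    if not 0.0 <= feedback_fraction < 1.0:
        raise ValueError("the series only converges for 0 <= feedback_fraction < 1")
    return no_feedback_warming / (1.0 - feedback_fraction)

def total_warming_by_rounds(no_feedback_warming, feedback_fraction, rounds):
    """Add up the series term by term to show that it settles down."""
    total = 0.0
    increment = no_feedback_warming
    for _ in range(rounds):
        total += increment
        increment *= feedback_fraction  # each round is a fraction of the last
    return total

for f in (0.0, 2 / 3, 5 / 6):
    print(f"f = {f:.2f}: limit {total_warming(1.0, f):.2f} deg C, "
          f"after 10 rounds {total_warming_by_rounds(1.0, f, 10):.2f} deg C")
```

In this toy model the answer is very sensitive to f as it approaches 1, which gives some intuition for why modest disagreements about feedback strength translate into large disagreements about sensitivity.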

This is a live debate. Is the true climate sensitivity closer to 1° C per doubling of CO2, or 3° (a common answer), or 6° (an extreme scenario)? This is what I was looking for: a funnel plot of published estimates for the climate sensitivity that would reveal any publication bias.

I found this paper, titled "Publication Bias in Measuring Climate Sensitivity" by Reckova and Irsova, which appeared to answer my question. (This link should open a pdf of the full paper.)

Figure 2 from their paper shows an idealized funnel plot:

If all circles are actually represented in the relevant scientific literature, there is no publication bias. But if the white circles are missing, an obvious publication bias is present. The idea here is that for lower-precision estimates (with a higher standard error), you will get a big spread of estimates. But journal editors and perhaps the researchers themselves are only interested in effects above a certain size. (Say, only positive effects are interesting and negative effects are thrown out. Or perhaps only climate sensitivities above 3° per doubling of CO2 will ever see the light of day, while analyses finding smaller values will get shoved into a file drawer and never be published.) In fact, here is what the plot looked like for 48 estimates from 16 studies:

It looks like there is publication bias. You can tell from the graph that 1) low-precision low-sensitivity estimates (the lower-left part of the funnel) are missing and 2) the more precise estimates indicate a lower sensitivity. The paper actually builds a statistical model so that you don't have to rely on eyeballing it. The model gives an estimate of the "true" climate sensitivity, correcting for publication bias. From the paper: “After correction for publication bias, the best estimate assumes that the mean climate sensitivity equals 1.6 with a 95% confidence interval (1.246, 1.989).” And this is from a sample with a mean sensitivity of 3.27: “The estimates of climate sensitivity range from 0.7 to 10.4, with an average of 3.27.” So, at least within this sample of the climate literature, the climate sensitivity was being overstated by a factor of two. The corrected sensitivity is half the average of published estimates (again, from an admittedly small sample).
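To see why a missing corner of the funnel inflates the average, and how regressing estimates on their standard errors can pull it back, here is a rough simulation. Everything in it is an assumption for illustration: the "true" value of 1.6, the range of standard errors, the rule that only positive estimates get published, and the precision-weighted regression itself, which is a generic meta-analysis technique and not necessarily the exact model used in the paper.

```python
import random
import statistics

def simulate_estimates(true_value=1.6, n=2000, seed=0):
    """Draw (estimate, standard_error) pairs scattered around true_value."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        se = rng.uniform(0.2, 3.0)  # imprecise studies have a large se
        pairs.append((rng.gauss(true_value, se), se))
    return pairs

def file_drawer(pairs, threshold=0.0):
    """Keep only estimates at or above threshold; the rest go unpublished."""
    return [(est, se) for est, se in pairs if est >= threshold]

def precision_weighted_fit(pairs):
    """Weighted least squares of estimate on standard error (weights 1/se^2).

    Returns (intercept, slope). The intercept is the estimate extrapolated
    to a perfectly precise study (se = 0); the slope measures funnel asymmetry.
    """
    weights = [1.0 / se ** 2 for _, se in pairs]
    w_total = sum(weights)
    x_bar = sum(w * se for w, (_, se) in zip(weights, pairs)) / w_total
    y_bar = sum(w * est for w, (est, _) in zip(weights, pairs)) / w_total
    s_xy = sum(w * (se - x_bar) * (est - y_bar)
               for w, (est, se) in zip(weights, pairs))
    s_xx = sum(w * (se - x_bar) ** 2 for w, (_, se) in zip(weights, pairs))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope

everything = simulate_estimates()
published = file_drawer(everything)
mean_all = statistics.fmean([est for est, _ in everything])
mean_published = statistics.fmean([est for est, _ in published])
intercept, slope = precision_weighted_fit(published)

print(f"mean of all estimates:        {mean_all:.2f}")
print(f"mean of published estimates:  {mean_published:.2f}")
print(f"regression intercept (se=0):  {intercept:.2f}")
print(f"regression slope (asymmetry): {slope:.2f}")
```

The published-only mean sits above the mean of all estimates by construction, since only low values are removed. How closely the intercept lands on the true value depends on the selection rule assumed, so treat the regression as a sketch of the idea and not as a reproduction of the paper's result.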

I read this and concluded that there was probably a publication bias in the climate literature and it probably overstates the amount of warming that's coming. Then I found another paper titled "No evidence of publication bias in climate change science." You can read the entire thing here.

My first impression here was, "Oh, Jeez, we have dueling studies now." Someone writes a paper with a sound methodology casting doubt on the more extreme warming scenarios. It might even be read as impugning the integrity or disinterestedness of the scientists in this field. Of course someone is going to come up with a "better" study and try to refute it, to show that there isn't any publication bias and that the higher estimates for climate sensitivity are more plausible. But I actually read this second paper in its entirety and I don't think that's what's happening. We don't have dueling studies here. Despite the title, the article actually does find evidence of publication bias, and it largely bolsters the argument of the first paper. Don't take my word for it. Here are a few excerpts from the paper itself:
“Before Climategate, reported effect sizes were significantly larger in article abstracts than in the main body of articles, suggesting a systematic bias in how authors are communicating results in scientific articles: Large, significant effects were emphasized where readers are most likely to see them (in abstracts), whereas small or non-significant effects were more often found in the technical results sections where we presume they are less likely to be seen by the majority of readers, especially non-scientists.”

Sounds kind of "biased" to me.

“Journals with an impact factor greater than 9 published significantly larger effect sizes than journals with an impact factor of less than 9 (Fig. 3). Regardless of the impact factor, journals reported significantly larger effect sizes in abstracts than in the main body of articles; however, the difference between mean effects in abstracts versus body of articles was greater for journals with higher impact factors.”

So more prestigious journals report bigger effect sizes. This is consistent with the other study linked to above, the one claiming there is publication bias.

From the Discussion section of the paper:
“Our meta-analysis did not find evidence of small, statistically non-significant results being under-reported in our sample of climate change articles. This result opposes findings by Michaels (2008) and Reckova and Irsova (2015), which both found publication bias in the global climate change literature, albeit with a smaller sample size for their meta-analysis and in other sub-disciplines of climate change science.”

I found the framing here to be obnoxious and incredibly misleading. The Michaels paper and the Reckova and Irsova paper (the latter linked to above) both found significant publication bias in top journals, and the “No evidence of publication bias” paper found essentially the same thing. In fact, here is the very next part:

“Michaels (2008) examined articles from Nature and Science exclusively, and therefore, his results were influenced strongly by the editorial position of these high impact factor journals with respect to reporting climate change issues. We believe that the results presented here have added value because we sampled a broader range of journals, including some with relatively low impact factor, which is probably a better representation of potential biases across the entire field of study. Moreover, several end users and stakeholders of science, including other scientists and public officials, base their research and opinions on a much broader suite of journals than Nature and Science.”

So this new paper looking at a larger collection of publications and published estimates confirmed that top journals publish higher effect sizes. It’s almost like they said, “We did a more thorough search in the literature and we found all those missing points on the funnel plot in Reckova and Irsova.” See the effect size plot, which is figure 3 in the paper:

Notice that for the full collection of estimates (the left-most line marked "N = 1042"), the average estimate is close to the 1.6 estimate from the other paper. Essentially, the first paper said, “We found a bias in top-level, high-visibility journals. We filled in the funnel plot using a statistical model and got a best estimate of 1.6.” And the second paper said, “We found a bias in top-level, high-visibility journals. We filled in the funnel plot by looking at more obscure journals and scouring the contents of the papers more thoroughly and got a best estimate of 1.6.” The later paper should have acknowledged that it was coming to a similar conclusion to the Reckova and Irsova paper. But if you just read the title and the abstract, you’d be misled into thinking this new “better” study refuted the old one. If you Google the name of the paper to find some media reports on it, you will see that some reviewers read the title only, or shallowly skimmed the contents and didn’t read the papers it’s commenting on.

Here is more from the Discussion section:

“We also discovered a temporal pattern to reporting biases, which appeared to be related to seminal events in the climate change community and may reflect a socio-economic driver in the publication record. First, there was a conspicuous rise in the number of climate change publications in the 2 years following IPCC 2007, which likely reflects the rise in popularity (among public and funding agencies) for this field of research and the increased appetite among journal editors to publish these articles. Concurrent with increased publication rates was an increase in reported effect sizes in abstracts. Perhaps a coincidence, the apparent popularity of climate change articles (i.e., number of published articles and reported effect sizes) plummeted shortly after Climategate, when the world media focused its scrutiny on this field of research, and perhaps, popularity in this field waned (Fig. 1). After Climategate, reported effect sizes also dropped, as did the difference in effects reported in abstracts versus main body of articles. The positive effect we see post IPCC 2007, and the negative effect post Climategate, may illustrate a combined effect of editors’ or referees’ publication choices and researchers’ propensity to submit articles or not.”

Remember, this is from a paper titled “No evidence of publication bias in climate change science.” Incredibly misleading. This entire paragraph is about how social influences and specific events have affected what climate journals are willing to publish. 

“What is the true climate sensitivity?” is really a central question in the climate debate. The 3° C figure is frequently cited by climate interventionists (people pushing a carbon tax, de-industrialization, etc.), but the 1.6° C figure is more plausible if you believe there’s a publication bias at work. The actual concentration of carbon has gone from 280 parts per million in pre-industrial times to a bit over 400 parts per million today, and the global average temperature has risen by about 0.8° C. (Maybe it's actually more than 0.8° C; 2015 and 2016 were record years and some commentators are extremely touchy about this point. Apologies if I'm missing something important here, but then again any conclusion that depends on two data-points is probably not very robust.) If the sensitivity is low, then we can keep emitting carbon and it’s really no big deal. If water vapor significantly amplifies the effect of carbon, then we’ll get more warming per CO2 doubling. There is a related question of “How much warming would it take to be harmful?” To do any kind of cost-benefit analysis on carbon reduction we’d need to know that, too. But clearly the sensitivity question is central to the climate change issue. If there’s any sort of publication bias, we need to figure out how to correct for it. People who cite individual papers (because they like that particular paper) or rely on raw averages of top journals need to be reminded of the bias and shamed into correcting for it, or at the very least they need to acknowledge it.
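The back-of-the-envelope arithmetic behind the concentration and temperature figures above can be sketched as follows. It naively attributes all of the observed warming to CO2 and assumes the climate has already fully responded. Both are strong assumptions (other forcings and ocean lag are ignored), so the output is a rough consistency check and not an estimate of the equilibrium sensitivity. The function name and the inputs tried are illustrative; two present-day concentrations and two warming figures are looped over because the result is sensitive to both.

```python
import math

def implied_sensitivity(observed_warming, start_ppm, end_ppm):
    """Warming per doubling implied by one before/after pair, assuming a
    purely logarithmic response and no lag (a deliberately naive model)."""
    doublings = math.log2(end_ppm / start_ppm)
    return observed_warming / doublings

# Illustrative inputs only: 280 ppm pre-industrial, two candidate values for
# the present-day concentration, and two candidate warming figures.
for today_ppm in (380.0, 400.0):
    for warming in (0.8, 1.0):
        s = implied_sensitivity(warming, 280.0, today_ppm)
        print(f"280 -> {today_ppm:.0f} ppm, {warming:.1f} deg C observed: "
              f"{s:.2f} deg C per doubling")
```

As the parenthetical above concedes, anything resting on two data points is fragile, and that applies to this calculation too.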

This is just the beginning of a new literature, I’m sure. There will be new papers that claim to have a “better” methodology, fancier statistics, and a bigger sample size. Or perhaps there will be various fancy methods to re-weight different observations based on…whatever. Or different statistical specifications might shift the best point estimate for the climate sensitivity. (I can imagine a paper justifying a skewed funnel plot because the error is heteroskedastic: “Our regression assumed a non-normal distribution, because for physical reasons the funnel plot is not expected to be symmetric…”) I’m hoping this isn’t the case, but I could easily imagine a world where there are enough knobs to tweak and levers to pull that we’ll just get dueling studies forever. There are enough "researcher degrees of freedom" that everybody can come to their preconceived conclusion while convincing themselves they are doing sound statistics. Nobody will be able to definitively decide this question of publication bias, but each new study will claim to answer the critics of the previous study and prove, once and for all, that publication bias does exist (oops, I mean doesn’t exist). My apologies, but sometimes I’m an epistemic nihilist.


It seems weird to me that there are only a few publications on publication bias in the climate sciences. "Publication Bias in Measuring Climate Sensitivity" was published in September 2015, and "No evidence of publication bias in climate change science" was published in February 2017. I remember trying to search for the funnel plot in early 2015 and not finding it. Possibly the September 2015 paper was the first paper ever to publish such a plot for climate sensitivity. If there is a deeper, broader literature on this topic and it comes to a different conclusion, I apologize for an irrelevant post. (Sometimes the literature is out there, but you just don't know the proper cant with which to search it.) But it looks like these two papers are the cutting edge in this particular vein. If more studies come out, I'll try to keep up with them. 


  1. I don’t even understand this, and to the extent I do, it seems like a total lie? How am I to parse this?

    Our meta-analysis did not find evidence of small, statistically non-significant results being under-reported in our sample of climate change articles. This result opposes findings by Michaels (2008) and Reckova and Irsova (2015), which both found publication bias in the global climate change literature, albeit with a smaller sample size for their meta-analysis and in other sub-disciplines of climate change science.

  2. @Piyo, here's one explanation: you're working with multiple ways of measuring publication bias. In the paragraph you cite, Harlos and colleagues write about a specific way of measuring bias: under-reporting of small, low-powered results. Michaels and Reckova/Irsova find this, but Harlos and colleagues say they do not find it.

    Harlos and colleagues find other patterns: small effects show up less often in prestigious journals and abstracts. This all sounds like the same thing at first, but I suspect Harlos and colleagues regard publication venue and emphasis in the abstract as less serious than outright failure to publish the small numbers / negative results, and I'm inclined to agree. I don't think it's realistic to ask people to write the abstract using a randomly chosen effect from the paper. The abstract will usually, by design, highlight findings with larger implications. What do you think?

  3. Thanks for this! It really helped when I was doing my own analysis.

    I did a simple statistical evaluation of climate model accuracy and found that models systematically overpredict warming, even on a 15-year time horizon where the model builders actually experienced some of those 15 years before doing the runs.

    Then I went looking and found several other lines of evidence against climate model accuracy:

    I used your analysis of publication bias as the capstone. It brought everything together. I share your annoyance at the second article's title. For a field that makes a big deal about "consensus", it seems like the title should have been “Consensus of 1042 estimates from 120 studies puts climate sensitivity at 1.6 deg C."

  4. Thanks for this. It was quite helpful in my own analysis.

    I did a simple evaluation of GCM accuracy over a 15-year forecast horizon and found they systematically overpredict warming.

    Then I cataloged the other lines of evidence against GCM accuracy:

    I used your post as the capstone. It helps everything make sense. I share your frustration with the title of the second article. For a field that seems fixated on "consensus", it seems like the title should have been "Consensus of 1042 estimates from 120 studies puts climate sensitivity at 1.6 deg C"

  5. You missed something important:
    The 2 papers are dealing with completely different aspects of climate change!
    There should be no overlap. The second article does not include anything about climate sensitivity of CO2.
    Therefore the results from the first paper stand unchallenged!

    First one is about climate sensitivity of CO2 (as you explained). The second one is about "climate change in ocean systems".
    Quote from data collection section: "...identified articles for experimental results pertaining to climate change in ocean ecosystems. The search was performed with no restrictions on publication year, using different combinations of the terms: (acidification* AND ocean*) OR (acidification* AND marine*) OR (global warming* AND marine*) OR (global warming* AND ocean*) OR (climate change* AND marine* AND experiment*) OR (climate change* AND ocean* AND experiment*)."

    I think you were confused by the 1.6 estimate in the second paper (roughly in line with the first one). That's not the climate sensitivity. It is a purely statistical measure of effect size, Hedges' d.