Thursday, April 23, 2020

Cause of Death Misidentification is a Big Deal, But Probably Not for COVID-19

I've written on this topic a number of times regarding drug poisoning deaths, and I see it is re-emerging in light of the recent pandemic. Some people are claiming that reported COVID-19 deaths are exaggerated because officials are simply coding anyone who had the virus at the time of death as a "COVID-19 death". It could be incidental but did not kill the person in a "but for" sense, like someone who happens to have a cold having a fatal heart attack. Or it could be one of several contributing causes. Possibly someone in poor health whose remaining life expectancy was in months or weeks was finished off by the coronavirus. In a "but for" sense, the virus killed them, but it didn't really remove much time from their life. Thought of another way, it's just as valid to call it a "chronic lung condition death" or a "cancer death" or a "complications of immunosuppressive drugs death." That is the claim, anyway. I don't know if this is a big deal for COVID-19. We can say that cause of death misattribution is generally a big deal, even if it's not a big deal in the specific case of COVID-19.

I think some people have been too quick to dismiss this as "not even a thing." To be sure, some have latched onto this as a conspiracy theory or a slam-dunk "debunking" of the epidemic. Maybe what I'm seeing is an over-correction by reasonable people of less reasonable people? I want to do my part with this post to tone down some of the snark and the smugness.

See this study, a retrospective study of 601 death certificates:
A total of 580 (93%) death certificates had a change in ICD-10 codes between the original and mock certificates, of which 348 (60%) had a change in the underlying cause-of-death code.
Also see this one, a study on the quality of the information contained on death certificates:

Of the 290 properly completed CODs, 141 (49%) contained disagreements: 73 (52%) on underlying CODs; 44 (31%) on immediate CODs; and 47 (33%) on other significant conditions (part II).
Both studies are very short and quite readable. Both of these studies were cited in a Cato paper by Jeff Miron, Overdosing on Regulation. (I was involved in the writing and editing of this paper.) Rather than implying poor information quality or sloppiness on the part of medical examiners, I think they speak to how inherently difficult it is to properly assign a cause of death in general. Specifically with respect to drug overdoses...

Witness France and Germany having a 3-fold difference in drug overdose deaths without any obvious explanation, other than France being more reluctant to classify a death as a drug overdose. (See the quote from Drug War Heresies.) Reporting bias is real.

Witness the comorbidities of drug overdose deaths. Chronic illnesses are often listed on the death certificate (~1/3 of the time or more). This seems to imply that whoever filled out the certificate thought that the condition contributed to the death. It's worth asking if some of these are drug poisonings at all.

Note the language in the medical textbook The Pathology of Drug Abuse by Stephen Karch. I excerpted it at length here and here. It feels like every other sentence is warning the reader to be skeptical about the true cause of death. An apparent drug overdose might not be one. Given that he's writing a medical textbook, he must think that this is a problem to be corrected, or at least that it is something that's easy to mishandle. Karch writes:
When a doctor “certifies” a cause of death, his certification is based upon his evaluation of the evidence available to him, but it is still just his opinion and does not set a precedent for similar cases.
Like I said, I don't know if misattribution is a big deal regarding COVID-19. If COVID-19 is everywhere, say 10% of the population having been infected, I would say we need to worry about the misattribution problem. If it's much less prevalent, then it's probably small enough to ignore. And keep in mind that errors can happen in two directions. Misattribution is just as likely to lead to under-counting as over-counting, especially given the lack of adequate tests for the virus.

________________________________

A quick back-of-the-envelope calculation. There are about 2.8 million deaths a year in the U.S. There is some mild seasonality, but ignoring that let's divide by 12 and call it 233 thousand each month. So we should get about 466 thousand total deaths from all causes in March and April. Supposing 1% of the population has been infected, we should have gotten about 1% * 466 thousand = 4.66 thousand deaths in two months from people who happened to be infected with COVID-19. Compare this to the ~50 thousand deaths reported to date (with seven days left in the month of April), and it looks like over-counting is not a big deal. (I have been using this source for total deaths; if there's another I should be using instead, let me know.) If you think the true prevalence is more than 1%, say in the 5% to 10% range, then over-counting might be a really big deal. The amount of likely over-counting depends critically on how many people actually have had the virus. If it turns out that's a large fraction of the population, we have to start talking about how many of those people would have died anyway over the relevant time period. The above is only a very crude estimate, and perhaps we'll have a finer-grained understanding of "excess deaths" in the coming weeks and months. If there are enough such excess deaths, it will rule out over-counting as a significant effect. (Maybe this can already be done for New York City and other hard-hit places?)
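Here is the same arithmetic as a minimal Python sketch, so you can plug in your own prevalence guess. The function and constant names are mine, invented for illustration. It carries the same crude assumptions as the paragraph above: no seasonality, and infected people die at the same baseline rate as everyone else over the full two months. It does not round the monthly figure down to 233 thousand, so the 1% case comes out near 4.67 thousand instead of 4.66 thousand.

```python
# Back-of-the-envelope: how many people would have died anyway while infected?
ANNUAL_US_DEATHS = 2_800_000      # approximate all-cause deaths per year (from the post)
MONTHS_ELAPSED = 2                # March and April
REPORTED_COVID_DEATHS = 50_000    # approximate count reported to date (from the post)


def incidental_deaths(prevalence, annual_deaths=ANNUAL_US_DEATHS, months=MONTHS_ELAPSED):
    """Expected all-cause deaths among infected people, assuming infection
    is unrelated to the death and seasonality is ignored."""
    baseline_deaths = annual_deaths / 12 * months
    return prevalence * baseline_deaths


if __name__ == "__main__":
    for prevalence in (0.01, 0.05, 0.10):
        expected = incidental_deaths(prevalence)
        share = expected / REPORTED_COVID_DEATHS
        print(f"prevalence {prevalence:.0%}: about {expected:,.0f} incidental deaths, "
              f"{share:.0%} of reported COVID-19 deaths")
```

The share column is the interesting one: it is an upper bound on over-counting under these assumptions, since it treats every incidental death among the infected as if it had been coded as a COVID-19 death.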

Some mechanics of the cause-of-death attribution process. [Edit. I do not know if COVID-19 death counts are coming from the process described below. I am describing the normal process of filling out and reporting death certificates. I suspect that there is some ad hoc reporting with the coronavirus numbers.]

A death certificate contains a section that looks like this.


The person filling it out is supposed to fill in a cause, or a sequence of causally related conditions, in Part I. The immediate or proximate cause is supposed to go at the top. The conditions or incidents that initiated the sequence go toward the bottom. So "auto accident" might go in Part I line b, while "blunt force trauma to the head" might go in Part I line a. Part II allows the examiner to mention contributing factors that aren't directly part of the sequence.

When the CDC receives this death certificate, it assigns an underlying cause of death. The way it does so is elaborate and kind of cool. It actually parses the raw text of the certificate, converts the free-form entries to a finite set (thousands) of causes of death, each with an ICD-10 code, and puts all this into one big data file, marking which part of the death certificate each cause came from. I have made a ritual of pulling this data file once a year and analyzing it. Up to 20 contributing causes of death are listed on the CDC's death record for that individual (though it's rare to see more than 10 or so). There is then a sequence of rules for how to assign the underlying cause of death:

General Principle: Select the condition on the lowest line of Part I only if it could cause all the above conditions.
Rule #1: If General Principle does not apply, select the cause of the first-mentioned sequence.
Rule #2: If there is no sequence, select the first-mentioned condition.
Rule #3: If previous rules lead to a condition that is obviously caused by something else on the certificate, report that instead.
Other useful rules: Time intervals will always be obeyed, a linkage in Part I will always be preferred over Part II, and the most specific chain will always be chosen.
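To make the General Principle and Rules #1 and #2 concrete, here is a toy Python sketch. It is emphatically not the CDC's software. The `CAN_CAUSE` table, the `can_cause` helper, and the placeholder condition names are hypothetical stand-ins for the large decision tables the real process relies on. Rule #3 and the "other useful rules" are left out, because they need judgments this toy cannot make.

```python
# Toy illustration of the selection rules above. NOT the CDC's actual logic.
# Hypothetical causal table of (cause, effect) pairs, invented for this example.
CAN_CAUSE = {
    ("auto accident", "blunt force trauma to the head"),
    ("condition B", "condition A"),
}


def can_cause(cause, effect):
    """Hypothetical lookup: could `cause` plausibly give rise to `effect`?"""
    return (cause, effect) in CAN_CAUSE


def select_underlying_cause(part1):
    """part1 lists the Part I entries from line a (immediate cause) downward."""
    if not part1:
        raise ValueError("Part I is empty")
    lowest, above = part1[-1], part1[:-1]

    # General Principle: the lowest line wins if it could cause everything above it.
    if all(can_cause(lowest, condition) for condition in above):
        return lowest

    # Rule #1: follow the sequence that ends in the first-mentioned condition,
    # walking down the lines for as long as each lower line could cause the current origin.
    origin = part1[0]
    for lower in part1[1:]:
        if not can_cause(lower, origin):
            break
        origin = lower

    # Rule #2 falls out naturally: with no sequence, origin is still line a.
    return origin


if __name__ == "__main__":
    print(select_underlying_cause(["blunt force trauma to the head", "auto accident"]))
    print(select_underlying_cause(["condition A", "condition B", "condition C"]))
    print(select_underlying_cause(["condition A", "condition C"]))
```

The first call exercises the General Principle with the auto accident example from above, the second falls through to Rule #1, and the third to Rule #2. Everything hard about the real problem is hidden inside `can_cause`, which is exactly where the room for interpretation comes in.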

There are some arcane and hard-to-parse decision rules which attempt to automatically code the underlying cause of death. A very patient employee at the CDC once helped walk me through an example to arrive at the correct cause of death. But she also told me that these records have to be checked by a human being. Apparently these rules leave a lot to interpretation, and there is room for error. (I believe the text parsing software is called MICAR or superMICAR, and the decision rules are called ACME and TRANSAX, if you want to look these up.) In other words, there are a lot of checks in the process, because misattribution is a big deal. It's easy to make mistakes when assigning a cause of death, just like it's easy to make mistakes when assigning a cause to anything.
