Science lesson: The absurdity of “n deaths per year” and “leading preventable cause” claims about smoking.
by Carl V Phillips
Smoking is quite harmful. Lots of people choose to do it. Given these facts, you would think that people who warn/scold/fret about smoking, at the individual or population level, would see no reason to exaggerate. Yet they do. They lie constantly and habitually. Still, in spite of the lying, you might think that they would avoid making mantras of claims that are simply nonsense. Yet they do not.
I have covered most of this before, highlighting some of it as one of the six impossible things tobacco controllers believe, but have not pulled it all together before.
Consider first claims like “smoking causes 483,456.7 deaths per year in the U.S.” What does this even mean? It obviously does not mean what it literally says, that but for smoking, these individuals would not have died. Occasionally someone asserting these figures phrases the claim in a way that highlights the implicit suggestion of immortality, and is rightly ridiculed for it. But in fact, even the standard phrasing implies this if treated as natural language.
Understanding what this might(!) really mean requires understanding the epidemiology definition of causing a death, which, it is safe to say, few of those reciting the claims about smoking understand. This definition is actually, like much of epidemiology, fundamentally flawed, but it gets us closer to something meaningful. The textbook definition is that something is a cause of death if it made the death occur earlier than it otherwise would have. Notice that this means that every death (like every event) has countless causes. E.g., a particular death may have been caused (in this sense of it occurring when it did and not later) by all of: smoking, being born male, not eating perfectly, occupational exposures, and choosing a low-quality physician. (Notice that if we extend to a broader definition of causation, other causes include the evolution of life on Earth and the individual’s grandfather making it home from the war.)
This typical version of the definition is fairly useless because it includes exposures that caused the death to occur only a few seconds sooner than it would have. We are seldom interested in those. Indeed, by that definition, smoking is a cause of death for almost every smoker and former smoker. It is very likely that any smoker who is not killed instantly by trauma would have survived longer because whatever disease killed her would have developed more slowly, or simply because the body would have functioned for a few more minutes. So more useful definitions of a cause of death would be something that we estimate caused the death to occur a month, or a year, or five years earlier than it would have. Note that a far more useful measure, in light of these problems, is “years of potential life lost” (YPLL).
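To see how much the count depends on the definition, here is a minimal sketch. The list of years lost is entirely made up for illustration (one entry per hypothetical smoker who died), and the function names are placeholders, not anything used in the actual statistics. The point is only that the number of "deaths caused" changes with the chosen threshold, while YPLL is a single sum that does not need a threshold.

```python
# Hypothetical data: years by which smoking hastened each of six deaths.
# These values are invented for illustration, not estimates of anything.
years_lost = [0.001, 0.2, 0.8, 3.0, 6.5, 12.0]


def deaths_caused(years_lost, minimum_years):
    """Count deaths hastened by at least `minimum_years`."""
    return sum(1 for y in years_lost if y >= minimum_years)


def total_ypll(years_lost):
    """Years of potential life lost: the sum of all the hastening."""
    return sum(years_lost)


# Thresholds: any hastening at all, one month, one year, five years.
thresholds = [0.0, 1 / 12, 1.0, 5.0]
counts = [deaths_caused(years_lost, t) for t in thresholds]
ypll = total_ypll(years_lost)

for t, c in zip(thresholds, counts):
    print(f"hastened by at least {t:.2f} years: {c} deaths")
print(f"YPLL: {ypll:.3f} years")
```

With these invented inputs, the same six deaths yield four different "deaths caused" counts depending on the threshold, which is why the unqualified count is not a well-defined quantity.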
So which of those definitions is the “X deaths per year” claim based on, given that it is clearly neither the literal meaning (with its implication of immortality) nor the faulty textbook epidemiology definition (which would include approximately all deaths among ever-smokers)? The answer is: none of them. Those statistics are actually a toting up of deaths attributed to a particular list of diseases, each multiplied by an estimate of the portion of those cases that were caused by smoking, in historical U.S. populations. That is, it is the number of lung cancer deaths among smokers, multiplied by the portion of such deaths that are attributed to smoking, plus the number among former smokers multiplied by the attributable fraction for former smokers, plus those for heart attacks, plus those for a few dozen other specific declared causes of death.
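The toting-up described above can be sketched in a few lines. Everything here is a hypothetical placeholder: the disease list is truncated to two entries, and the death counts and attributable fractions are invented round numbers, not CDC figures. The sketch shows only the shape of the calculation, which is a sum of (deaths from a listed disease in a smoking-status group) times (the estimated fraction of those deaths attributed to smoking).

```python
from dataclasses import dataclass


@dataclass
class DiseaseStratum:
    disease: str
    smoking_status: str           # "current" or "former"
    deaths: float                 # deaths from this disease in this group
    attributable_fraction: float  # estimated share attributed to smoking


def smoking_attributed_cases(strata):
    """Sum deaths x attributable fraction over every disease/status stratum."""
    return sum(s.deaths * s.attributable_fraction for s in strata)


# Invented numbers for illustration only.
strata = [
    DiseaseStratum("lung cancer", "current", 1000, 0.9),
    DiseaseStratum("lung cancer", "former", 800, 0.75),
    DiseaseStratum("heart attack", "current", 2000, 0.5),
    DiseaseStratum("heart attack", "former", 1500, 0.25),
]

total = smoking_attributed_cases(strata)
print(total)
```

Note what the result is: a count of fatal cases among listed diseases that are attributed to smoking. Nothing in the calculation asks how much earlier any death occurred.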
As you might guess, based on who is doing the toting, these numbers are biased upwards in various ways. Still, it would be possible to estimate that sum honestly (no one has tried to do so for a few decades, but it would be possible). But the resulting measure would obviously not properly be described as “deaths caused by smoking.” It would not be that hard to identify what the figure really is, especially in serious written material like research papers or government statements: “each year in the U.S. smoking is estimated by the CDC to cause X fatal cases among 29 diseases.” Of course, most “researchers” and “experts” in the field do not even know this is what they are trying to say.
There are also several problems with the numbers themselves, not just the phrasing. First there is the noted, um, shading upward of the numbers. Second, as I alluded to in the third paragraph, the statistic is always presented with too much precision. Even two significant digits (e.g., 480,000) is too much precision. The estimates of the smoking-attributable fraction of cases of those diseases are, at best, not precise to within tens of percent for smokers, let alone for former smokers, a much more heterogeneous category. That makes even one significant digit (e.g., 400,000) an overstatement of the precision.
Third, and more important for most versions of the statistic, is that “in historical U.S. populations” bit. The statistics you see for other countries or the whole world are based on implicit assumptions that everyone shares Americans’ health status and mix of exposures, because almost all the estimates come from U.S. studies (and those in the mix that do not are almost all from the countries that are most similar to the U.S.). At best, the estimated increase in risk for fatal cases of the disease is ported to the calculation for other populations, even though this varies across populations. That is, it is assumed that if the estimate is that half of all heart attacks among ever-smokers are caused by smoking, then that same multiplier is applied to heart attacks among ever-smokers in the other population. Worse, sometimes the attributable fraction itself is just ported, so if a quarter of all heart attacks in the U.S. are attributed to smoking, then that multiplier is applied to all heart attacks in other populations. That would mean, e.g., if a particular population has a lot of extra cases of cancers due to diet, the same fraction of those cancers that is due to smoking in the U.S. is attributed to smoking there.
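A small worked example shows why porting these fractions goes wrong. It assumes a deliberately simple additive model, which is an illustrative assumption and not how any particular published estimate was produced: never-smokers have some baseline risk of the disease, and smoking adds a fixed excess risk on top. All of the risk values and the one-third smoking prevalence are invented, chosen so that the "U.S.-like" population reproduces the one-half and one-quarter figures used above.

```python
def attributable_fractions(baseline_risk, smoking_excess_risk, smoking_prevalence):
    """Return (fraction among smokers, fraction in whole population).

    Assumes a simple additive model: risk in smokers is the baseline
    risk plus a fixed excess due to smoking. Illustration only.
    """
    risk_in_smokers = baseline_risk + smoking_excess_risk
    among_smokers = smoking_excess_risk / risk_in_smokers

    population_risk = baseline_risk + smoking_prevalence * smoking_excess_risk
    in_population = smoking_prevalence * smoking_excess_risk / population_risk
    return among_smokers, in_population


# "U.S.-like" population: smoking doubles the risk, a third of people smoke.
us_smokers, us_population = attributable_fractions(0.01, 0.01, 1 / 3)

# Other population: same smoking effect and prevalence, but three times
# the baseline risk (say, from diet or other exposures).
other_smokers, other_population = attributable_fractions(0.03, 0.01, 1 / 3)

print(us_smokers, us_population)        # fractions that would get ported
print(other_smokers, other_population)  # fractions under this model
```

Under these assumptions, the fraction among smokers falls from one half to one quarter, and the population fraction falls from one quarter to one tenth. Porting the U.S. fractions to the second population would therefore overstate the smoking-attributed cases by a factor of 2 or 2.5, even though smoking does exactly the same thing to each individual in both populations.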
Fourth, and worse still, the forward-looking versions of the statistics would be innumerate nonsense even if none of the other problems existed. These include the infamous prediction of a billion deaths from smoking in the 21st century, as well as assertions about the fate of cohorts who are taking up smoking now. The number of deaths from a list of diseases that are attributable to smoking is going to vary hugely not just across populations, but across time. This is first-week Epidemiology 101 stuff. Population and time matter. There are no constants in epidemiology. The number of deaths from particular diseases will vary with technology. The attributable fraction will vary with the prevalence of other risk factors. Oh, and for those other changes, good news often makes things “worse”: An asteroid destroys higher life on Earth, and smoking stops causing any deaths. War, hunger, and infections are reduced and smoking causes a lot more cases of fatal diseases.
In summary, these statistics are: (a) not actually the number of deaths caused by smoking, (b) exaggerated, (c) far less precise than claimed, even setting aside the intentional bias, (d) only valid for a few populations, and (e) only applicable to the present (or, really, the recent past).
Moving on to the “leading preventable cause of death” claims, this mantra is equally absurd if you pause to actually look at the words. What does “preventable” mean? Typically in such contexts, it means “some obvious top-down action could have averted it.” So, for example, of the 3000 deaths from Hurricane Maria, a few score were hard to do much about. Every one of these was “preventable” in some sense (fly the particular person to Miami in advance of the storm) but this is meaningless; preventing someone, probably a few dozen someones, from getting killed was not a real option. But the vast majority of those deaths were meaningfully preventable — in the sense that an operationalizable action could have kept them from happening — with a competent relief operation.
So if this normal use of the word is what tobacco controllers mean when they recite this mantra, then they are basically testifying that they are horrifically incompetent. They spend their lives trying to prevent this from happening, and they fail even though it is doable. But while it is true that they are generally horrifically incompetent at what they do, it is clearly not doable. Smoking is not preventable by this standard sense of the word.
Perhaps they are saying it is theoretically preventable, in that sense, but no one has figured out how to do it. At least that is plausible, but then the full statement is clearly false. There are more important causes of death that are theoretically preventable. Deterioration with age of cellular repair mechanisms seems to pretty clearly top the list. Humanity will figure out how to largely prevent that. This bit of prevention (in the “we will figure it out eventually” sense) dwarfs preventing the deaths by smoking. Indeed, it will prevent a lot of the fatal disease cases that are caused by smoking. (I have a vision of one of my kids finding this post in an archive 200 years from now, and being sad that this technology came a few decades too late for me. And for most of you too. Sorry.)
Most likely, what they are not-quite-saying is that each individual who “dies from smoking” (i.e., has a fatal case of a disease that was caused by smoking) could have made a choice to not have that happen. In some sense, this suffers from the same problem that such a claim about hurricanes or earthquakes does: Yes, every death from a collapsed building could have been prevented by the person choosing to be in a different building. But it has a bit more legitimacy since it is obvious what the safer choice is and the risk is high enough probability to influence the decision. The problem here is that to make this meaningful statement, tobacco controllers would have to acknowledge that smoking and other tobacco product use is an individual choice. They are not willing to say that out loud (and thus admit that their entire enterprise is devoted to keeping people from making the choices they want), so they hide it behind weasel words like “preventable”.
But just because the statement “the leading cause of death among individual behavioral choices” is meaningful does not mean it is right. Indeed, it is obviously wrong. Go back to the epidemiology textbook definition of a cause of death. Smoking is a cause of death, by that definition, for approximately everyone who smokes. But eating a less-than-optimal diet is, for the same reason, a cause of death for everyone who eats less than optimally. Two or three times as many deaths occur among people who ate less than optimally (i.e., basically everyone), as compared to those who smoked, so smoking is clearly not “leading”. Of course, no one really thinks in terms of that textbook definition. So how about if we limit it to deaths that occurred at least a year earlier than they would have? It is pretty difficult to imagine figuring out the numbers, but I would expect diet still has the edge. How about five years? At that level, smoking might really be leading. How about putting it in terms of YPLLs? Yes, it is probably true that smoking costs more YPLL than any other individual choice.
Aha, so they are right!
Um, yeah. We just have to assume that these stupid phrases really represent deep and subtle thinking on the part of those using them. By “preventable” they actually mean resulting from individuals’ behavioral choices. By “cause of death” they actually mean cause of YPLLs. And their declaration that it is true, rather than speculation, is based on valid estimates of the comparative number of YPLLs from different behavioral choices, even though they never cite such evidence. Also, by “n deaths” they mean “n cases of particular fatal diseases attributed to smoking, if you believe our numbers, and assuming that the future looks exactly like the past and all populations are like the U.S.” Giving someone the benefit of the doubt is sometimes noble, but it would just be silly in this case.
The bottom line is that these mantras are just as false as much of the rest of what tobacco controllers claim. Moreover, they are not just factually wrong, but are a demonstration of just how thinking-free the whole endeavor is. At least things like “second-hand smoking causes 30% of all heart attacks” or “vaping is causing more kids to take up smoking” are meaningful claims. They are obviously false, but they are valid hypotheses and are only false because empirical evidence shows they are false, not because it is impossible for them to be true based on some simple fundamentals of how we know the world works.
Sure, people say things all the time that, if anyone paused to think and ask the question, would not stand up to a “what does that even mean?” query. We are not always precise in all our thinking, let alone how it translates into words. But the claims in question are not fleeting thoughts or ad hoc word choices. They are mantras that get written or said a thousand times per day by supposedly credible people in supposedly credible contexts. The fact that they cannot pass a “what does that even mean?” test is one of the greatest overlooked testaments to the fundamental lack of seriousness in public health. The fact that they get repeated by others is a testament to how influential sloppy public health thinking is, even over those who are attempting to position themselves as opponents of it.