by Carl V Phillips
For an overview of this collection and an explanation of the format of this post, please see this brief footnote post.
This collection will focus mainly on the misleading anti-THR papers produced by tobacco controllers. However, it is useful and important to provide reviews of potentially important papers that might be called pro-THR. This is one example of a paper that has gotten a lot of “ha, take that!”-toned traction.
If a “pro-THR” paper is tight, a review will provide a substantive endorsement, as positive reviews should do (but as the anonymous and secret — and presumptively poor-quality — journal reviews cannot do), as well as a signal boost. If a paper is useful but importantly flawed (as in the present case), the review can correct or identify the errors and focus attention on the defensible bits. And if the paper is fatally flawed, the review should point that out. Bad advice is still bad advice when it feels like it is “on your side”. Even when a paper basically only provides political ammunition and not advice, it is important to assess its accuracy. We are not tobacco controllers, after all, who just make up whatever claims seem to advance their political cause.
Johnson et al. use historical nationally-representative U.S. tobacco use data (NHIS from 2006 to 2016 and CPS over most of that period), for 25- to 44-year-olds, looking at the rate of smoking quit attempts and the association between vaping status and quit attempts or successful smoking abstinence. The authors report an unconditional increase in the population for both quit attempts (measured as the rate of past-year incidence among people who smoke) and medium-term smoking abstinence. They also report a positive association between vaping and smoking quit attempts and abstinence at the individual level. They interpret their results as running contrary to the recent spate of “vapers are less likely to quit” claims, stating “These trends are inconsistent with the hypothesis that e-cigarette use is delaying quit attempts and leading to decreased smoking cessation.”
This is an overstatement, but the results do run contrary to the “vaping is keeping smokers from quitting” trope that the authors position their paper as a response to. This research clearly moves our priors a bit in the direction of “yes, vaping encourages people to quit smoking, and helps them do so.” Our priors only move “a bit” because rational beliefs based on all available evidence tell us we should be very confident of that conclusion already. The authors should instead have said something like “even if you naively believe in those methods, for this data the result is different”, but such (appropriate) epistemic modesty is absent.
The paper is quite frustrating in that the authors seem to not recognize which of their statistics are actually most informative and persuasive, let alone take the deeper dive into specific implications that could have been done. The natural experiment interpretation of some of the results is more compelling than the behavioral-association-based analysis (see below). The authors overstate the value of their association statistics and effectively endorse the same flawed methods that are the source of the “vapers are less likely to quit” literature.
One major problem with the paper is evident in the quoted sentence. These trends are not inconsistent with that hypothesis (even if we add the missing “on average” caveats, which are clearly needed). There are several obvious stories that could reconcile these results with that hypothesis (especially if you just think of the changes as trends, as the authors portray them, rather than jumps — see below). The trends are the opposite of what the hypothesis (in isolation) would predict, and thus are a hypothetico-deductive strike against it, but “inconsistent” is too strong a claim. It appears the authors overstate due to not thinking through the many candidate causal explanations and pathways, which they should have given that these are the key to this analysis.
As is too often the case from non-epidemiologists (or bad epidemiologists) doing epidemiology, the exposure contrast is implicitly treated as if it were a controlled experimental change. Presumably everyone understands it is not, but they proceed with a statistical analysis that implicitly assumes that it is. This is the core problem with the “cessation aid X actually inhibits smoking cessation” literature, including the junk papers about vaping that the present paper is pushing back against: The exposed (cessation aid using) group is not random; its members self-select, perhaps motivated by a stronger commitment to quit (those who are solidly in the process of quitting, and will do so one way or another, give the aid a try: “why not?”) or perhaps by a greater reluctance to quit (those who are most concerned they will fail to quit, perhaps due to a history of failure, give it a try: “maybe this will finally help me!”). Perhaps some are motivated to avoid the aid based on their commitment to quitting (“I don’t need that stuff”). Thus, the association between NRT/vape/whatever use and successful quitting (in either direction) is fairly uninformative by itself. There needs to be an analysis of competing explanations, ideally involving a good predictor of propensity to quit (which, of course, seldom exists in the data). There is also the added complication with vaping (in contrast with, e.g., bupropion) that many smokers do it for purposes other than attempted cessation.
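The self-selection point can be made concrete with a toy simulation. This is a minimal sketch under invented assumptions, not anything from the paper or its data: each simulated smoker has a hypothetical latent `propensity` to quit, more committed smokers are more likely to try the aid, and the aid itself has no causal effect at all. All of the numbers (0.1, 0.6, 0.3) are arbitrary illustrative choices.

```python
import random

def simulate(n=200_000, seed=0):
    """Toy population in which a cessation aid has NO causal effect on quitting."""
    rng = random.Random(seed)
    counts = {True: [0, 0], False: [0, 0]}  # used_aid -> [n_people, n_quit]
    for _ in range(n):
        # Hypothetical latent propensity to quit (rarely observed in real surveys).
        propensity = rng.random()
        # Assumption: more committed smokers are more likely to try the aid.
        used_aid = rng.random() < 0.1 + 0.6 * propensity
        # Quitting depends only on propensity, never on used_aid.
        quit_smoking = rng.random() < 0.3 * propensity
        counts[used_aid][0] += 1
        counts[used_aid][1] += quit_smoking
    return {used: n_quit / n_people for used, (n_people, n_quit) in counts.items()}

rates = simulate()
print(f"quit rate among aid users: {rates[True]:.3f}")
print(f"quit rate among non-users: {rates[False]:.3f}")
```

Aid users quit at a visibly higher rate even though the aid does nothing in this toy world. Changing the selection rule to `0.7 - 0.6 * propensity` (the least confident smokers try the aid) should flip the association the other way, which is why the sign of the raw association tells us little by itself.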
The association-based aspect of the present analysis suffers from these same problems. Those who vaped were more likely to become smoking abstinent. But perhaps those most likely to quit anyway were inclined to try vaping. Some observers claim that most smokers who switched to vaping would have quit at about the same time had vaping not been an option. This is often implausible (see this for a critical analysis of one such claim, that over 90% of switchers would have quit anyway), but it is a theoretical possibility that anyone analyzing this question needs to investigate — explicitly thinking about what their data might tell us about it — rather than just looking at the association.
Thus this paper should have sought to address these points and why the results contrast with some others that use similar methods. Voting (i.e., counting up how many results show a positive or negative association, for some dataset, when analyzed using these methods) is not a legitimate scientific method for reconciling apparent contradictions. Saying “our paper got a different result from those before, because reasons, so the previous ones were wrong” is not adequate.
The population averages aspect of the present analysis avoids this collection of problems because the exposure is “living in a population where vaping has become popular”, a natural experimental intervention. This is an important advantage of the present analysis compared to those that the authors are disputing (and compared to their own individual associations analysis), avoiding the self-selection biases. Here we do not have to worry whether soon-to-be quitters take up vaping, thus creating a non-causal association, if the emergence of vaping in a population seems to cause a net upward shock in average quit rates. The authors give no indication they understand the significance of this.
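To illustrate why the population average is immune to this particular bias, here is a minimal sketch using the same kind of invented toy population (hypothetical `propensity`, arbitrary coefficients, nothing taken from the paper). It compares the overall quit rate in a world without vaping to a world where vaping is available, once with no causal effect and once with an assumed `causal_boost` to the quit probability of those who vape.

```python
import random

def population_quit_rate(vaping_available, causal_boost, n=200_000, seed=1):
    """Overall quit rate in a toy population; who vapes is self-selected."""
    rng = random.Random(seed)
    quits = 0
    for _ in range(n):
        propensity = rng.random()  # hypothetical latent propensity to quit
        u_vape = rng.random()      # always drawn, so every scenario sees the same people
        u_quit = rng.random()
        # Assumption: those more inclined to quit are more likely to take up vaping.
        vapes = vaping_available and u_vape < 0.1 + 0.6 * propensity
        p_quit = 0.3 * propensity + (causal_boost if vapes else 0.0)
        quits += u_quit < p_quit
    return quits / n

before = population_quit_rate(False, causal_boost=0.1)
after_null = population_quit_rate(True, causal_boost=0.0)
after_causal = population_quit_rate(True, causal_boost=0.1)
print(f"no vaping available:           {before:.3f}")
print(f"vaping available, no effect:   {after_null:.3f}")
print(f"vaping available, causal help: {after_causal:.3f}")
```

In this sketch self-selection alone leaves the population rate unchanged, and only a genuine causal effect moves it. What the sketch cannot rule out is something else changing in the population at the same time, which is the usual caveat for an uncontrolled natural experiment.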
The results show an increase in the rate of attempting smoking cessation for 2014, 2015, and 2016, which looks a lot like a new normal, a change from the lower and non-trending (the variations over time look like noise) rate for previous years. Similarly interesting is the new higher normal rate of successful smoking abstinence starting in 2012, with a big uptick in 2016. The authors obscure the potentially important discontinuities by focusing on overall trends across the period, or pairwise comparisons of later years to a baseline year (2006). Indeed, their suggestion that the change from 2006 to 2016 can be seen as a trend, rather than flat followed by an uptick, is flatly misleading.
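The difference between "a trend" and "flat followed by an uptick" can be checked directly rather than eyeballed. The following is a minimal sketch, and the yearly rates in it are invented for illustration only; they are not the paper's estimates. It fits a straight line across the whole period and, separately, a two-level step model with the break year chosen to minimize squared error, then compares the residual sums of squares.

```python
years = list(range(2006, 2017))
# HYPOTHETICAL rates, made up for illustration; NOT the paper's numbers.
rates = [0.50, 0.51, 0.49, 0.50, 0.51, 0.49, 0.50, 0.50, 0.56, 0.57, 0.56]

def sse_linear(x, y):
    """Residual sum of squares from an ordinary least-squares straight line."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    intercept = my - slope * mx
    return sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))

def best_step(x, y):
    """Best single-break step model: (first year of new level, SSE, size of jump)."""
    best = None
    for k in range(1, len(x)):
        left, right = y[:k], y[k:]
        mean_left, mean_right = sum(left) / len(left), sum(right) / len(right)
        sse = sum((v - mean_left) ** 2 for v in left) + sum((v - mean_right) ** 2 for v in right)
        if best is None or sse < best[1]:
            best = (x[k], sse, mean_right - mean_left)
    return best

linear_sse = sse_linear(years, rates)
break_year, step_sse, jump = best_step(years, rates)
print(f"linear trend SSE: {linear_sse:.5f}")
print(f"step model SSE:   {step_sse:.5f} (new level from {break_year}, jump {jump:+.3f})")
```

This is a descriptive comparison, not a formal test: the step model gets an extra free parameter (the break year), so a smaller error alone does not prove a discontinuity. The point is only that the size and timing of the jump are outcome measures worth reporting, which an overall trend test hides.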
This failure is presumably related to the problem of them reporting sampling error statistics (p-values and such), rather than outcome measures, as if they are useful quantifications of outcomes. This is not at all surprising given that the paper is out of a medical school and the first author is in psych (a field particularly notorious for this error), but the implications of the error are worse than usual. This seems to be the common “we have one hammer” problem: They know how to run a particular set of test statistics on their data and so do it without even stopping to think as scientists rather than acting as human algorithms. Thus they focus on statistics that say “it is very unlikely to see this big an overall trend over the entire period due to random sampling error” (yawn!) and never really note “whoa, look at this jump, the only real change over this period, that happened just when vaping became popular!”
Some of the reported results (about which comparisons resulted in a test statistic that beat the arbitrary threshold) might be interpreted as pointing this out. Thus the statistically unsophisticated reader might come away with the right conclusion for the wrong reasons, because he does not realize the test statistics are not measures of worldly phenomena. Such reporting is normal in psych (it is still not good, but since they are typically measuring some artificial lab result, not anything where the effect measures are actually meaningful, it is not a big loss), but is not acceptable in legitimate epidemiology.
Also noteworthy, mentioned but under-examined, is the result that the increase in the rates is all found among people who vaped, with the rates remaining basically constant among those who did not vape. Assessing this is complicated because it blends some self-selection back into the natural experiment (those who did not vape after those around them were doing so were self-selected). The natural experiment is also not controlled, of course; it is possible that the change over time reflects a surge in smoking cessation that was not caused by vaping, and those on a path to quitting (regardless of vaping) are likely to try vaping along the way. The authors should have addressed this possibility as best they could with their data (e.g., by looking for deeper natural experiments based on geographic variations in the popularity of vaping) rather than ignoring it. It is a fruitful topic for further analysis, but the present paper would have been much more valuable if that next step had been included, rather than just reporting the simplest possible analysis.
The authors should have made some effort to explicitly compare the results from their two datasets. This is a common problem in the literature, to ignore questions that would be asked in a more serious science (e.g., “since these are theoretically two measures of the same phenomenon, are they sufficiently concordant, quantitatively and not just when trends are translated into casual language, or are there discordances that should be examined?”). It turns out that in this case the results appear to be quantitatively concordant (often not the case), but since the authors did not report exactly commensurate results, it is difficult to be sure.
The Methods are well described and presented, much better than is typical. The reader could presumably replicate the analysis based on the reporting.
The explanation given for not looking at those age 45 and older seems a bit odd (the explanation for leaving out those 24 and under is more solid), but the choice is not unreasonable. There is no affirmative reason to believe the authors tried different subsets of the data to try to concoct a “better” result, though something seems amiss here: In the discussion, the authors throw in the unquantified observation that similar results are seen for younger and older groups. Given that they did those analyses, they should have been reported in the Methods. Indeed, there is no reason they could not have been reported as separate analyses in the results; it is not as if this paper is exactly dense with quantitative results in its present form.
The definitions of product use status are reasonable, but what they define as “cessation” is really only medium-term (as little as three months) current abstinence. The definition of a quit attempt suffers from the usual problems of self-reporting, which the authors do not seem aware of. Specifically, the required one-day period of smoking abstinence is more likely to be self-defined as a failed quit attempt, in retrospect, if a designated cessation aid was used. (It is not clear, to my knowledge, whether this is true of vaping, as it is with other aids.) There is also the issue of “accidental quitting” with vaping: someone might not have really made a quit attempt, but she still quit. Presumably these are self-defined as a quit attempt in retrospect, though this is ambiguous. These are not major problems, but they are issues with these measures that should not have been ignored as if the responses to the survey questions are unambiguous.
The choice of covariates to include in the statistical models appears to follow the typical pattern for health research: That is, they just threw in whatever variables they thought might be associated with the outcomes, without ever thinking through whether they should actually be included as deconfounders. It probably does not make much difference in this case, and there was no way they could have created decent propensity scores using the data in those surveys, but it is still poor practice. The authors just throw in the variables and tell the reader they did so, without any explanation for choosing them (they probably have none), let alone any reporting of the assessment of the effects they had.
The Introduction is not completely worthless, but it is seriously flawed. It includes unhelpful text (authors writing about behavior measures should spare us the wasted ink about how smoking causes harms, especially if they are going to get some of it badly wrong as they did in this case) and an out-and-out wrong narrative about the history of smoking decline in the US (it was basically all caused by the initial education, and there is no good reason to believe that the various policies cited had measurable effects). A bigger problem is the uncritical reporting of previous papers’ conclusions about vaping and smoking cessation. Most of this is just the typical “Introduction as bad undergraduate essay on the general topic” waste of space, but the latter is indicative of a substantive problem with the paper. The keys to this paper, and to how it relates to the previous (mostly nonsense) literature, are points about causal pathways and explanations for associations. An ideal introduction to this paper would have been entirely about those topics. At the very least, the literature review should have included assessments of what seemed to be really going on in those other papers, rather than just reciting their dubious conclusions.
The Discussion is a legitimate discussion of the study, unlike many such sections. There is only one paragraph of the usual “essay on the general topic, and personal opinions about it” material. (This includes at least one factual error, but sensible readers will skip it in any case.)
However, the analysis in the Discussion is as superficial as that in the Results. The authors do allude to how the last few years of their results stand out, in contrast with the way they did their statistical analyses, but they spend less than a full sentence on it. They repeat the Introduction’s recitation of vaguely-phrased (at the level of popular press headlines) conclusions from previous studies, again with no assessment of the quality of the previous analyses or why the contrasts occurred.
The usual “strengths of the paper” paragraph makes clear that the authors do not actually understand the value of this analysis. Instead of noting the natural experiment aspect (which they seem unaware of) or the opportunity to see discontinuities because they have a time series, they note they have a national target population of particular ages (which is fine, but it is not a strength; it is merely a recitation of the target population). They similarly do not understand the “limitations”, failing to recognize that their association-based analysis suffers from the same overwhelming confounding problems of the studies they criticized. Instead they repeated the usual tropes about the limitations of cross-sectional data and self-reported surveys.
They did obliquely note in that paragraph that their exposure measure (current vaping) means that only those who liked vaping would be considered exposed. But this should not be a throw-away aside in a throw-away paragraph. They should have seriously analyzed how that affected their results and how it might explain contrasts with other results. This explanation for some such contrasts (that smokers who trialed vaping and kept doing it were more inclined to quit smoking) occurred to me immediately upon reading the Methods. It appeared that the authors were entirely unaware of its importance until I read the few words in the Discussion. Rather than being unaware, however, apparently they were aware but did not bother to try to assess its importance.
In the Discussion and Conclusions the authors emphasize the result that every-day smokers showed a stronger association between vaping and quitting than some-day smokers. This is a potentially interesting result, and the authors seem to think it is interesting, but since they did not seriously explore it in their analysis or report enough about it to the reader, it is impossible to take it seriously.
As a conflict of interest, the authors declare only one tangentially-relevant patent that one of them holds. They fail to note that their dependence on funding from the US government creates a substantial COI in this matter (albeit one that creates an incentive to not report the particular result). The authors fail to note whether they have political preferences in the matter; inevitably at least one out of eleven has strong preferences in some direction. However, none of the authors are immediately recognizable as activists and the paper is not written as an activist screed, so this omission is pretty minimal compared to overt political activists who claim they have no COI.