by Carl V Phillips
I was asked by Clive Bates to expand upon his analysis of this paper (open access link): “Evidence that an intervention weakens the relationship between adolescent electronic cigarette use and tobacco smoking: a 24-month prospective study”, which is “by” Mark Conner, Sarah Grogan, Ruth Simms-Ellis, Keira Flett, Bianca Sykes-Muskett, Lisa Cowap, Rebecca Lawton, Christopher Armitage, David Meads, Laetitia Schmitt, Carole Torgerson, Robert West, and Kamran Siddiqi, Tobacco Control, 2019. (Scare quotes on “by” because you know when there are 15 authors, fewer than half of them even read it, let alone wrote it.)
It is yet another “vaping is a gateway to smoking in teenagers” study. Yet another one which provides no evidence that there is a gateway effect. It is yet another thought-free piece of public health garbage in which there is no hint of scientific thinking. Like most such, it was painful to read. There were only a couple of interesting bits. But it is an opportunity to offer some general lessons.
Before that, I will reference a tangent, this tweet from me. It is important.
If it is not immediately obvious to you, see this branch of the ensuing discussion, which explains it.
So, as for the paper, Clive Bates wrote the following [begin long quoted passage; reprinted with permission; no link because he sends his stuff via email]:
Results [as reported in the paper]:
Baseline ever use of e-cigarettes was associated with:
ever smoked cigarettes (OR=4.03, 95% CI 3.33 to 4.88; controlling for covariates, OR=2.78, 95% CI 2.20 to 3.51),
any recent tobacco smoking (OR=3.38, 95% CI 2.72 to 4.21; controlling for covariates, OR=2.17, 95% CI 1.76 to 2.69),
and regularly smoked cigarettes (OR=3.60, 95% CI 2.35 to 5.51; controlling for covariates, OR=1.27, 95% CI 1.17 to 1.39).
Sigh…so, a gateway claim is hinted at…
The present research replicates previous findings in this area in showing a significant association between e-cigarette use and subsequent smoking initiation. It also shows similar effects for measures of regular smoking. These relationships were observed over a period of 24 months in measures of ever smoked cigarettes, any recent tobacco smoking and regularly smoked cigarettes. The strength of these associations was reduced but remained significant when controlling for various predictors of smoking
And duly reinforced in the discussion:
These latter findings are more consistent with the view that e-cigarette use is a risk factor for smoking initiation than the view that e-cigarette use may simply be a marker for those who would go on to smoke cigarettes even without having tried e-cigarettes.
…the problem of residual confounding. Unless the various predictors of smoking completely characterise the reasons why people smoke (hint: impossible), other than any influence attributable to vaping, then the residual association may simply reflect residual uncontrolled confounding. Note how sharply the association weakens when they control for the co-variates they have – e.g. from OR=3.60 to OR=1.27 – the observed effects are dominated by confounding. Who’s to say, if they had more co-variates to better characterise predictors of smoking, that the association would not disappear altogether?
[end long quoted passage]
Clive nails it there. If controlling for some half-assed deconfounder variables, which only roughly proxy for what you really want to be controlling for (in this case, propensity to use a tobacco product, whichever products are readily available), makes most of your association go away, then it is just nuts to think “and so whatever association remains is right.” No, it obviously should be, “probably if we deconfounded this better, all of the association would disappear.”
Thinking about this led me to create a little metaphor for controlling for confounding. Say you have one mouse trap in your basement, and it captures a mouse. If you thought as a typical public health researcher does when they deal with confounding, you would say, “good, no more mouse.” Of course any sensible person would think, “oh crap, there are mice in the basement.” A public health researcher who happens to own four mouse traps (analogy: has four covariates available) would put them all out, catch four more mice, and then say “got them all.”
A genuine expert in getting rid of mice would, of course, not say this. They would formulate a theory about where the mice are living and focus on it. They would investigate where they are entering and try to close it off. They would keep placing traps as long as necessary, not just place whatever they happen to have. They would perform experiments (e.g., putting out food that, if nibbled, would indicate that not all the mice have been caught).
The analogy is that truth-seeking researchers doing epidemiology or a similar science would — in contrast with 99% of the garbage that comes from public health professors — formulate a theory of causal pathways, and determine what confounding they need to control for to answer their question of interest. They would (a) design their analysis to minimize that confounding and (b) collect the best possible data for the control variables, not just do whatever analysis is easy and use whatever variables they happened to have. Finally they would perform some tests to see how that all worked out, rather than just assuming it was all fine. (These steps are probably unfamiliar to you because basically no one in tobacco research does them.)
As Clive puts it: ‘If they are going to pursue the futile endeavour of deconfounding to control for common liabilities then at least try to identify the right common liabilities (in advance) for the actual behaviour in question, collect data on them and use these to control for predictors. The question they will never be able to answer is “was it that first use of an e-cig, or was it whatever caused the first use of the e-cig that caused the first use of a cigarette?”’
I will note that I despise the term “common liability”. It is a distraction as well as being gratuitously judgmental. There is already proper terminology — confounder, common cause — and making up a new term, as the tobacco controllers have done, is just a way to hide the fact that they are not getting a basic Epid 101 principle correct.
I agree that it is basically futile. A failure to control for 10% of the underlying propensity to use these products undoubtedly will produce a greater association than the signal that is being searched for (the actual gateway causation). Even if someone collected all the ideal variables, a little bit of measurement error in them, or just random sampling error, will result in under-deconfounding (or, potentially, over-deconfounding) that has greater magnitude than the signal of interest. When they do not even have good proxies for the confounder (as is pretty much always the case), the situation is far worse.
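To make the under-deconfounding point concrete, here is a toy simulation with entirely invented numbers (nothing here comes from the paper): both vaping and smoking are driven solely by a latent propensity to use tobacco products, with a true gateway effect of exactly zero, and the only covariate available for "controlling" is a noisy proxy for that propensity. Adjusting for the proxy shrinks the association, just as controlling for covariates did in the paper, but a residual odds ratio above 1 remains, and it is nothing but leftover confounding.

```python
import random

random.seed(0)
N = 200_000

kids = []
for _ in range(N):
    u = random.random()                          # latent propensity to use tobacco products
    vaped = random.random() < 0.05 + 0.40 * u    # vaping driven only by u
    smoked = random.random() < 0.02 + 0.30 * u   # smoking driven only by u: no gateway effect
    proxy = (u + random.gauss(0, 0.3)) > 0.5     # the noisy covariate we actually get to measure
    kids.append((vaped, smoked, proxy))

def counts(rows):
    a = sum(1 for v, s in rows if v and s)          # vaped and smoked
    b = sum(1 for v, s in rows if v and not s)      # vaped, did not smoke
    c = sum(1 for v, s in rows if not v and s)      # did not vape, smoked
    d = sum(1 for v, s in rows if not v and not s)  # neither
    return a, b, c, d

# Crude odds ratio, ignoring the confounder entirely
a, b, c, d = counts([(v, s) for v, s, _ in kids])
crude = (a * d) / (b * c)

# Mantel-Haenszel odds ratio, "controlling" for the noisy proxy
num = den = 0.0
for level in (False, True):
    a, b, c, d = counts([(v, s) for v, s, p in kids if p == level])
    n = a + b + c + d
    num += a * d / n
    den += b * c / n
adjusted = num / den

print(f"crude OR = {crude:.2f}, proxy-adjusted OR = {adjusted:.2f}")
```

The adjusted OR drops well below the crude one but stays above 1, even though the true causal effect in the simulation is exactly zero. The leftover association is precisely the part of the propensity that the proxy failed to capture.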
In fairness, these authors did admit to this problem, if not so bluntly. Thus they hung their gateway claim on a further detail:
Our findings also indicated that the association between ever use of e-cigarettes and subsequent ever smoked cigarettes or any recent tobacco smoking (but not regularly smoked cigarettes) was significantly stronger among adolescents with no friends who smoked, a group usually considered to be less susceptible to smoking initiation. …. This appears inconsistent with the idea that e-cigarette users are more interested in all forms of nicotine use and the fact that e-cigarette use came first is purely coincidental.
Having thought about this for a while, I cannot come up with any remotely legitimate basis for this line of reasoning. I cannot even guess with much confidence what silly illegitimate basis they have in mind. The best theory I have is that they are mis-thinking of these kids who were “less susceptible” but vaped anyway as experimental subjects who were assigned to vape. If that really were the case, then their increased incidence of smoking uptake would indeed be “inconsistent”. In reality, of course, the kids who started vaping really were inclined to use tobacco products. How do we know? They did so! These kids have demonstrated their propensity in the most definitive way, so it is insane to call them “less susceptible” because some poorer measure of propensity made a different prediction.
These authors — all 15 of them — seem oblivious to the fact that when a predictor variable fails to predict accurately, it is simply evidence that it is not really a great measure of propensity. It does not mean that underlying propensity ceases to matter. If the traps are not catching any more mice, but the bait food gets nibbled, you do not conclude that the traps are working perfectly and therefore you are observing some novel departure from the usual mice-food relationship. No, stupid, you simply still have mice and they are evading the traps! Of course, public health researchers are not usually as good at scientific inference as are exterminators.
Moreover, focus on this subgroup is obvious ex post results-driven rationalization for a politically preferred conclusion. Why did they choose this particular association with one particular propensity measure? Because it happened to be “stronger” in the results their data produced. Had this not been the case they would not have mentioned it. They would not have reported, “because the association for kids with no friends who smoke was not stronger, this is evidence against the gateway conclusion.” (This would be a stupid thing to say, but the point is that it is the counterpart of what they did say.) If the results had been “stronger” for those scoring low on some other propensity measure, they would have reported that instead.
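The cherry-picking point can be illustrated with another toy simulation (again, invented numbers, nothing from the paper): give every kid the exact same true exposure-outcome odds ratio, then slice the sample on ten meaningless binary labels, standing in for candidate propensity measures, and see how far apart the subgroup ORs drift by chance alone. Some split will always look "stronger" in one subgroup than the other.

```python
import random

random.seed(42)
n = 2000

exposure = [random.random() < 0.3 for _ in range(n)]
# Outcome depends on exposure with one fixed effect, identical for everyone
outcome = [random.random() < (0.25 if e else 0.10) for e in exposure]

def odds_ratio(idx):
    a = sum(1 for i in idx if exposure[i] and outcome[i])
    b = sum(1 for i in idx if exposure[i] and not outcome[i])
    c = sum(1 for i in idx if not exposure[i] and outcome[i])
    d = sum(1 for i in idx if not exposure[i] and not outcome[i])
    return (a * d) / max(b * c, 1)  # guard against an empty cell

# Ten meaningless random binary splits, standing in for candidate subgroup variables
gaps = []
for _ in range(10):
    label = [random.random() < 0.5 for _ in range(n)]
    or_in = odds_ratio([i for i in range(n) if label[i]])
    or_out = odds_ratio([i for i in range(n) if not label[i]])
    gaps.append(max(or_in, or_out) / min(or_in, or_out))

print(f"largest subgroup OR ratio from pure noise: {max(gaps):.2f}")
```

Even though the truth is homogeneous, the most extreme of the ten splits shows a visibly "stronger" association in one subgroup. Report only that split after seeing the results, and you have manufactured a finding.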
Continuing on this theme (quoting Clive again): “Can we stop this please? If researchers want to hint at gateway effects, then please come up with: (1) a definition; (2) a theoretical model and hypothesis that allows the effects of common liability to be isolated from a causal effect of vaping itself; (3) a method for testing the hypothesis; (4) collection of data that would be necessary to test the hypothesis; and (5) publish this before starting the study. Don’t just hint at conclusions from associations found between exposures and outcomes that hardly matter and likely result from residual confounding because the wrong predictors for different behaviours were used. Then I wouldn’t have to write exactly the same review every week.”
Pro tip for Clive: Don’t write the same review every week. Just leave the damn rock at the bottom of the hill and have a picnic on it.
Pro tip for authors of these papers: Read things, stupid. I already wrote the paper that does much of what Clive asks for, and a bunch more along the same lines. Yeah, I know, reading things where you cannot just skim the abstract is such hard work.
There is nothing unique about gateway papers, or even tobacco control papers, in terms of making up the analysis to fit the data. This is standard public health research practice. If you do not identify which associations or comparisons between associations are important — either by declaring exactly what results matter in advance or, better still, having a coherent and consistent theory of what you are studying — then there will inevitably be something to point at in support of the conclusion, be it correct or incorrect. Study error can be really convenient that way.
Finally, it happens that I found one informative result in this paper, though it was nothing these authors would want to admit is informative:
In our sample, at follow-up approximately one-third (177/585=30.3%) of those who used both e-cigarettes and cigarettes reported using e-cigarettes first (a further 191/585=32.6% reported using cigarettes first, and the remainder 217/585=37.1% could not remember which they tried first).
Think about this. We are talking about something that happened in the last couple of years, not dredging up ancient memories. This points out just how terrible survey data on teenagers is. Had the survey not offered a “do not remember” option (as would be typical), then a third of the responses would have been from subjects who knew they did not know the answer. Probably more than a third, because undoubtedly some of the subjects giving other answers did not know either.
But there is a more subtle and potentially more significant implication of this. The people doing these studies are breathlessly excited about initiation of these products. But to the kids, it is just another thing they did once during their interesting young lives. If you moved into a new neighborhood and I asked you two years later “which of these two nearby restaurants did you try first”, I would expect about a third “don’t remember” answers. To these kids, trying one of these products for the first time is roughly as memorable as one trip out to dinner. Trying something once is just not that big a deal. That insignificant randomly-timed event is more a measure of propensity than it is a cause of future events.
Oh, and it is probably worth mentioning that the point of these authors’ study was to see the effects of a typical informational anti-smoking intervention in schools. Apparently most of the association between trying vaping and later smoking goes away when that intervention occurs. What this really says about anything, however, is completely opaque because the authors do not report useful results about it or really explain much. So I will just ignore it. Clive cites that result as evidence that “these relationships are not immutable laws of nature”. If tobacco controllers would understand that basic Epid 101 fact (which is always true in epidemiology), it would be a big improvement.