How we define health outcomes impacts our research: The case of COVID-19 among smokers

Introduction

In response to the COVID-19 pandemic, researchers around the world are working tirelessly to identify patterns of COVID-19 transmission, risk factors, treatments, and more. My co-authors Carla Berg, Nandita Krishnan, Lorien C. Abroms, and I conducted a systematic review in order to learn how the body of research to date answers the question: What is the impact of tobacco use on COVID-19 outcomes? The full results of that study were recently published in the Journal of Smoking Cessation. The short answer is that tobacco use is associated with increased risk of mortality and severity in COVID-19 patients. In conducting the review, I was reminded how much the choices researchers make when designing studies – about what variables to include and how to analyze them – shape our understanding of critical health issues. In this post, I share examples from the wide range of measurements for COVID-19 outcomes and the different methods for defining a smoker used in the studies we reviewed. I then discuss how these choices made by the studies’ authors can influence the demographic groups that are included or excluded from studies and suggest how that should be considered in future research.

Methods

To start the review, we developed a list of search terms related to tobacco, COVID-19, and health outcomes, as well as a list of criteria to determine which articles would help answer our question. To go beyond existing reviews, we required that studies collected data after March 31, 2020, had a sample of 30 or more tobacco users, and confirmed COVID-19 diagnoses with lab tests. We also limited the selection to studies that conducted multivariable analysis, which helps account for certain issues, such as missing data, discussed below. We searched several databases to collect all the relevant articles, and then applied our inclusion and exclusion criteria as we reviewed their abstracts. You can read more about the process of systematic reviews here.
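As a rough illustration of how criteria like these can be applied consistently, here is a minimal Python sketch. It is not the tool we used for the review. The `Study` fields, the reading of "collected data after March 31, 2020" as "data collection ended after that date," and the four example studies are all assumptions made up for this example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Study:
    # All field names are hypothetical; real screening forms will differ.
    label: str
    data_collection_end: date
    tobacco_users: int
    lab_confirmed: bool
    multivariable: bool


CUTOFF = date(2020, 3, 31)   # data must extend past this date (assumed reading)
MIN_TOBACCO_USERS = 30       # minimum number of tobacco users in the sample


def meets_criteria(study: Study) -> bool:
    """Return True only if the study satisfies every inclusion criterion."""
    return (
        study.data_collection_end > CUTOFF
        and study.tobacco_users >= MIN_TOBACCO_USERS
        and study.lab_confirmed
        and study.multivariable
    )


# Invented studies, for illustration only.
candidates = [
    Study("Study A", date(2020, 6, 30), 85, True, True),
    Study("Study B", date(2020, 5, 15), 12, True, True),    # too few tobacco users
    Study("Study C", date(2020, 2, 28), 140, True, True),   # data end before cutoff
    Study("Study D", date(2020, 7, 1), 60, True, False),    # no multivariable analysis
]

included = [s for s in candidates if meets_criteria(s)]
print([s.label for s in included])
```

Writing the criteria down as explicit checks makes it easier to see why each article was kept or dropped.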

We included 39 studies in the final review, with the majority finding that smokers had an increased risk of mortality, hospitalization, Intensive Care Unit (ICU) admission, and mechanical ventilation.

Choosing an Outcome to Measure

The studies we reviewed took various approaches to measuring the impact of COVID-19 infection as researchers worked to create a full picture of a new disease with little precedent. Hence, one of the early decisions we needed to make in conducting this systematic review was what to include as an outcome of COVID-19, defined as what happens to a patient after they are infected with COVID-19.

Mortality

Existing reviews most often looked at mortality, and in fact 32 of the 39 studies included in our review assessed mortality. The reason for this is obvious: whether a patient survives COVID-19 or not is a relatively straightforward measurement – a yes or no answer. However, the apparent simplicity of this measurement is belied by the fact that studies must limit their timeframe (e.g., 30 days post admission (Chand et al., 2020) or 28-day mortality (Alharthy et al., 2020)) and count patients who were still hospitalized – neither recovered nor deceased – as ‘survivors.’

But measuring mortality does not capture the full picture.

Severity

Therefore, we wanted to go beyond mortality to assess what other impacts tobacco use might have on COVID-19 patients. Did tobacco use lead to more severe COVID-19?

Twenty-three of the studies included in our review assessed some form of ‘severity.’ Because studies defined severity and disease progression in various ways, comparison between them was very difficult. The most common measure was hospitalization (n = 10). Like mortality, this is a relatively straightforward measure, especially since many of the studies were retrospective cohort studies examining medical records: either a person with COVID-19 was admitted into a hospital or was not. Hospitalization represented a limited indicator of severity, however, since it told us nothing of what happened after a patient was admitted: how long did they stay? What treatment did they need? How long did symptoms persist after their release?

Other studies attempted to account for this limitation by measuring ICU admission and/or the need for mechanical ventilation. These indicators required more detailed medical records, making the study more difficult and time consuming, and possibly less representative as more cases must be excluded due to missing data. However, they do offer a fuller glimpse of disease progression.

Some studies settled on reporting “severity.” Each study defined severity on its own terms. For example, Adrish et al. (2020) defined severe illness as “radiographic evidence of pneumonia with hypoxia requiring any form of supplemental oxygen or non-invasive positive pressure ventilation” and “critical illness” as the need for invasive mechanical ventilation (p. 2). Saurabh et al. (2021), meanwhile, defined severe COVID-19 as “clinical signs of pneumonia plus respiratory rate >30 breaths/min, severe respiratory distress, or peripheral capillary oxygen saturation <90%” (p. 821). Mendy et al. (2020) defined severity simply as “admission to ICU and/or death during hospitalization” (p. 6).

Finally, some studies were able to focus on other specific outcomes of severity. One study in our review found that smokers were “more likely to have chest X-ray abnormalities 12 weeks after hospitalization,” which was associated with longer hospital stays and recovery times (Wallis et al., 2021). Another study found that smokers had an increased risk of pulmonary embolism, a factor putting them at a higher risk of mortality (Badr et al., 2021).

By including these indicators, we aimed to build a more well-rounded picture of the impact of tobacco use on COVID-19 outcomes. Studies on these outcomes were too limited to draw firm conclusions, while other outcomes such as blood clots and stroke were not sufficiently studied to include in our review, leaving critical gaps in our efforts to understand the full impacts of COVID-19.

Defining a Smoker

Choosing what constitutes a COVID-19 outcome was not the only choice researchers made in their studies. During our review, we found two problem areas in how studies defined a smoker.

The first is the narrowness of defining a tobacco user based on tobacco type. In almost all the studies (n = 38), the type of tobacco discussed was combustible smoking. Only one study focused on smokeless tobacco. We did not find any studies on vaping, despite its growing prevalence among teenagers and young adults. Much of the early research on COVID-19 focused on older populations, as they are significantly more at risk of death, so perhaps a trend in young people was not seen as crucial. Alternatively, it may indicate a bias among medical records being used in the studies – on death certificates in the United States, for example, tobacco use is a yes or no question.

This lack of nuance contributed to the second definitional difficulty: duration and frequency of smoking. Tobacco use in the studies was most commonly reported as “ever vs. never use” (n = 15) or “never/former/current use” (n = 13).

Researchers may have been working with the data they had available or doing their best to create sample sizes large enough for statistical analysis. Breaking the groups down into never, former, and current tobacco users may allow for more nuance, but tended to leave very small samples of ‘current smokers,’ who were generally far outnumbered by the former and never groups. In our review, we limited the studies included to those with 30 or more cases with a history of tobacco use – but if a study includes 100 smokers and only 1 or 2 die, statistical analysis is limited. Other studies combined all participants with a history of tobacco use into one pool, creating a larger sample size. While this may solve the problem of small samples, it may also dilute the results. Someone who has smoked for 30 years is quite different from someone who has only smoked for one – putting them together may hide the risk of negative outcomes for long-term smokers.
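To make the dilution point concrete, here is a small worked example in Python. Every number in it is invented purely for illustration and does not come from any study in our review. It assumes a hypothetical cohort in which long-term smokers have a much higher death rate than short-term smokers, and compares the risk ratio seen when the two groups are pooled as "ever smokers" with the ratio for long-term smokers alone.

```python
# Hypothetical counts: (number of patients, number of deaths).
# These figures are made up for illustration only.
never_smokers = (1000, 50)
short_term_smokers = (300, 15)
long_term_smokers = (100, 20)


def risk(group):
    """Proportion of patients in the group who died."""
    patients, deaths = group
    return deaths / patients


def pool(*groups):
    """Combine several groups into one by summing patients and deaths."""
    return (sum(g[0] for g in groups), sum(g[1] for g in groups))


ever_smokers = pool(short_term_smokers, long_term_smokers)

# Risk ratio: risk in the exposed group divided by risk in never smokers.
rr_pooled = risk(ever_smokers) / risk(never_smokers)
rr_long = risk(long_term_smokers) / risk(never_smokers)

print(f"Ever vs. never smokers:      risk ratio {rr_pooled:.2f}")
print(f"Long-term vs. never smokers: risk ratio {rr_long:.2f}")
```

In this invented cohort, pooling produces a noticeably smaller risk ratio than the one for long-term smokers alone, which is the sense in which an "ever vs. never" comparison can hide elevated risk in a subgroup.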

Only two studies analyzed differences in pack-years smoked, where pack-years refers to the number of packs smoked per day times the number of years smoked. This requires significantly more detailed data for the study, with complete medical records and possibly full medical histories. However, it is the only route to revealing dose-responses – Lowe et al. (2021) demonstrated a significant difference in mortality, hospitalization, and ICU admission between those who had smoked more than 30 pack-years and those who had smoked less. Such insight is critical for understanding the impact of tobacco use on COVID-19 and could shape cessation information and public health messaging.
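The pack-years calculation itself is simple, as this short Python sketch shows. The function names and group labels are hypothetical, and placing someone with exactly 30 pack-years in the lower group is an assumption made for this example rather than a detail taken from any of the studies.

```python
def pack_years(packs_per_day: float, years_smoked: float) -> float:
    """Pack-years = packs smoked per day multiplied by years smoked."""
    if packs_per_day < 0 or years_smoked < 0:
        raise ValueError("packs_per_day and years_smoked must be non-negative")
    return packs_per_day * years_smoked


def exposure_group(value: float, threshold: float = 30.0) -> str:
    """Split at a pack-year threshold (boundary handling is an assumption)."""
    if value > threshold:
        return "more than 30 pack-years"
    return "30 pack-years or fewer"


# One pack a day for 30 years and two packs a day for 15 years
# represent the same cumulative exposure on this scale.
print(pack_years(1, 30))
print(pack_years(2, 15))

# A pack and a half a day for 25 years falls in the higher group.
print(exposure_group(pack_years(1.5, 25)))
```

The hard part is not the arithmetic but the data: both inputs have to be recorded for each patient, which a yes or no smoking field cannot provide.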

How research choices influence inclusivity

The choices of these definitions – and what they mean for the data we record and study – impact the inclusivity of public health research.

The majority of studies in our review relied on medical records, which may be missing data or contain incomplete histories. Lowe et al. (2021) point out that patients with complete records are more likely to be wealthy and regularly access healthcare, potentially leading to an underestimate of the impact of tobacco use on COVID-19 outcomes (p. 711). The narrow definition of tobacco use excluded vaping and, therefore, younger tobacco users. Meanwhile, the wide definitions of ‘severity’ of COVID-19 create a scattershot picture, with different groups in or out depending on the study.

Moreover, indicators of severity such as hospitalization, particularly in the United States, may be severely limited by the fact that many people either delay or refuse to go to the hospital because they cannot afford it. Hospitalization also does not account for the fact that, during the pandemic, many hospitals have had to discourage people from coming as they exceeded capacity. Lower-income and rural populations with limited options for care may not have been able to gain admittance to a hospital even if they otherwise met the threshold.

To account for these limitations, our review included only studies which conducted multivariable analysis. These studies still had to exclude medical records with missing data, but they were able to account for differences in demographics including age, race, and socioeconomic status. Because of this, studies were able to determine that tobacco use was an independent risk factor. This level of analysis also allowed some studies to find compound risks. For example, Abbas et al. (2021) found that diabetics who smoke suffered significantly worse outcomes than either just diabetics or just smokers.

Implications

The wide differences in defining COVID-19 outcomes highlight the patchwork response healthcare workers and researchers have had to develop in the face of a rapidly spreading pandemic with a lack of clear guidance. The global medical community is still learning new things about COVID-19 every day, with every wave. As we learn more and consolidate our findings, it is essential we develop clear indicators for researchers in order to improve the external validity of studies. While many choices are made for expediency or based on available data, researchers must be conscious of who may be excluded from their data sets. Each choice a researcher makes is a delicate balance between budget and time constraints; maintaining high-quality, statistically significant data; and ensuring the sample is inclusive. Lastly, it is important not to lose sight of statistical analysis tools that can be used to generate more inclusive results and build a fuller picture of the impact of COVID-19.

Photo credit: iMrSquid/Getty Images
