DISCUSSION
In this cohort of individuals in the 2017/2018 and 2018/2019 influenza
seasons, we created clinically meaningful groups using k-medoids
clustering to improve the analysis of severity in a population of
patients hospitalized with influenza. Our results suggest that patients
in clusters characterized by hyperglycemia and lower oxygen saturation
at admission had a higher risk of adverse in-hospital sequelae; these
clusters are thus potential cohorts of interest for further study of
vaccine or antiviral effects.
We found glucose to be significantly different between clusters, with
one cluster having significantly higher glucose in both years. The
distribution of diabetes was also consistent across years, with
approximately 70% prevalence in the high-glucose clusters and 30%
prevalence in the non-hyperglycemic clusters. Together, these results
highlight that the use of simple dichotomous classifications for complex
conditions such as diabetes may not accurately indicate a patient’s risk
for adverse outcomes. Indeed, controlling for such complex confounding
has long been problematic within infectious disease severity research,
most recently when examining treatments and hospital outcomes related to
infection with SARS-CoV-2, leading to inconsistent
results [25-27]. This challenge is due in part to
differential measurement and management of confounding, including
analyses at the point of hospital admission, given model
limitations in the number of confounders that can be included and their
often-complex interrelationships. The use of techniques such as
k-medoids clustering to simultaneously account for multiple measures of
comorbidity and group like patients together independent of
outcomes-based analysis provides a tool to increase homogeneity within
groups and heterogeneity across groups for a more robust confounding
adjustment.
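
As a minimal illustrative sketch of the clustering idea described above (not the study's implementation), the following groups synthetic patients on two admission measures. The data, the variable names, the choice of k = 2, the Euclidean distance on standardized values, and the simple alternating update are all assumptions made for illustration; the study's actual variables, distance metric, and algorithm may differ.

```python
import numpy as np

def k_medoids(X, k, n_iter=100, seed=0):
    """Simple alternating k-medoids: assign points, then re-pick each medoid."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Pairwise Euclidean distances between all observations
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))
    # Start from k distinct observations chosen at random
    medoids = rng.choice(n, size=k, replace=False)
    for _ in range(n_iter):
        # Assign every observation to its nearest medoid
        labels = np.argmin(D[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for j in range(k):
            members = np.flatnonzero(labels == j)
            if members.size == 0:
                continue  # keep the old medoid if a cluster empties out
            # New medoid: the member with the smallest total distance to the others
            within = D[np.ix_(members, members)].sum(axis=1)
            new_medoids[j] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break  # converged
        medoids = new_medoids
    labels = np.argmin(D[:, medoids], axis=1)
    return medoids, labels

# Synthetic (hypothetical) admission data: glucose in mg/dL, oxygen saturation in %
rng = np.random.default_rng(42)
group_a = np.column_stack([rng.normal(110, 15, 60), rng.normal(96, 1.5, 60)])
group_b = np.column_stack([rng.normal(240, 30, 40), rng.normal(90, 2.0, 40)])
X_raw = np.vstack([group_a, group_b])

# Standardize so both measures contribute on a comparable scale
X = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0)

medoids, labels = k_medoids(X, k=2)
for j, m in enumerate(medoids):
    print(f"cluster {j}: n={np.sum(labels == j)}, medoid (glucose, SpO2)={X_raw[m]}")
```

Because each cluster is summarized by a real observation, the medoid can be read directly as a representative patient profile; cluster membership is assigned without reference to any outcome variable.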
More traditional dimension reduction methods such as the use of
propensity score matching have often been used to account for
differential patterns of comorbidities between groups of interest. While
propensity score matching is useful in reducing heterogeneity in the
presence of a single exposure of interest, it becomes complex in
instances where multiple treatments or exposures are being compared
simultaneously. Additionally, there is inherent reduction in sample size
when matching, limited by the number of individuals with and without the
exposure having similar propensity scores; individuals in either group
with uncommon comorbidity profiles may be overlooked and excluded from
the matching if their propensity score does not align. For example, a
2020 study by Groeneveld et al. examining the effectiveness of oseltamivir
lost 36% of oseltamivir patients and 65% of controls when matching,
reducing the sample size to 88 pairs [6]. While use of
propensity score matching has been shown to reduce
bias [28], such significant loss of data, especially in a
rare-outcomes setting, may lead to an increase in Type II error, and
thus incorrect conclusions, due to inadequate
power [29,30]. K-medoids clustering can be used to
identify subgroups that are biologically different without such
restrictions, maintaining sample size for more robust analysis of effect
modification by multiple treatment types. It should be noted that
outliers within the range of biologically normal values are of great
clinical significance, as these individuals may be at higher risk for
adverse outcomes. K-medoids clustering is robust to such outliers
because each cluster is centered on a medoid, an actual observation
from the data, rather than on a computed mean.
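
To make the contrast between a medoid and a mean concrete, here is a small hedged sketch using made-up glucose values (hypothetical numbers, not study data). It assumes a one-dimensional absolute-difference distance and shows how far each summary moves when two extreme values are added.

```python
import numpy as np

def medoid(values):
    """Return the observed value with the smallest total distance to all others."""
    values = np.asarray(values, dtype=float)
    D = np.abs(values[:, None] - values[None, :])
    return values[np.argmin(D.sum(axis=1))]

# Hypothetical glucose values (mg/dL), then the same values plus two extremes
glucose = np.array([95.0, 100.0, 105.0, 110.0, 115.0])
with_outliers = np.append(glucose, [380.0, 400.0])

mean_shift = with_outliers.mean() - glucose.mean()
medoid_shift = medoid(with_outliers) - medoid(glucose)

print(f"mean moves by {mean_shift:.1f} mg/dL")
print(f"medoid moves by {medoid_shift:.1f} mg/dL")
```

In this toy example the mean is pulled well outside the range of the typical values, while the medoid remains one of the observed values near the bulk of the data.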
This study has several strengths, most notably that the cohort was
nested within a large prospective two-center study of influenza vaccine
effectiveness across multiple seasons, allowing for a robust and diverse
analytic cohort. Both case definition and EHR data capture were
standardized across sites, reducing heterogeneity of data quality.
Additionally, the use of two hospitals within our region allowed for a
more generalizable analysis. The biggest limitation of the study is the
small sample size and small number of outcomes; however, we believe our
analysis has minimized some of the bias from this limitation.
The use of k-medoids clustering to characterize heterogeneity in
severity analysis has many direct and current applications. One of the
most immediate applications can be for evaluating the effectiveness of
new and existing antivirals for severe respiratory disease. Previous
studies of such treatments have utilized traditional methods of
covariate adjustment, which may contribute to heterogeneity of study
findings [31]. The use of this clustering method to
phenotype baseline presentation can reduce this confounding, and can be
quickly implemented for these analyses. Such a technique will be needed
as we continue to understand how new antiviral treatments affect severity,
and how vaccination impacts severity in instances of low vaccine
effectiveness.