Statistical Analysis
General descriptive statistics of all data were computed separately for
each influenza season (2017/2018 and 2018/2019) and were reported as
means with standard deviations, medians with interquartile range, or
frequency and percentage, as appropriate. The normality of data and
presence of outliers were assessed using histograms and box-and-whisker
plots. Data were clustered as above, separately within each year, using the
PAM algorithm, and the model results were reported. Patient and clinical
characteristics were compared between clusters using chi-squared or
Fisher’s exact tests for categorical variables and independent t-tests,
ANOVA, Mann-Whitney U, or Kruskal-Wallis tests for continuous variables,
as appropriate.
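
As an illustration of the clustering step, the following is a minimal Python sketch of PAM (k-medoids) on a precomputed pairwise distance matrix. It is not the code used in this analysis: the function names `total_cost` and `pam` are hypothetical, the dissimilarity measure is left to the caller because it is not specified in this section, and the swap phase accepts the first improving exchange rather than searching for the best one.

```python
from itertools import product


def total_cost(dist, medoids):
    """Sum over all points of the distance to the nearest medoid."""
    return sum(min(dist[i][m] for m in medoids) for i in range(len(dist)))


def pam(dist, k):
    """Cluster n points into k groups from an n x n distance matrix.

    Returns (medoids, labels): medoids are point indices, and labels[i]
    is the position in `medoids` of the medoid closest to point i.
    """
    n = len(dist)

    # BUILD phase: greedily add the point that lowers the total cost most.
    medoids = []
    while len(medoids) < k:
        candidates = [i for i in range(n) if i not in medoids]
        best = min(candidates, key=lambda c: total_cost(dist, medoids + [c]))
        medoids.append(best)

    # SWAP phase: exchange a medoid for a non-medoid whenever this lowers
    # the total cost, and stop once a full pass finds no improvement.
    cost = total_cost(dist, medoids)
    improved = True
    while improved:
        improved = False
        for pos, cand in product(range(k), range(n)):
            if cand in medoids:
                continue
            trial = medoids[:pos] + [cand] + medoids[pos + 1:]
            trial_cost = total_cost(dist, trial)
            if trial_cost < cost:
                medoids, cost, improved = trial, trial_cost, True

    # Assign every point to its nearest medoid.
    labels = [min(range(k), key=lambda j: dist[i][medoids[j]]) for i in range(n)]
    return medoids, labels
```

Because each medoid is an observed data point, a cluster can be summarized by a real patient record rather than by an averaged centroid.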
To determine whether different classes of early hospitalization
characteristics were associated with severe hospital sequelae, a series
of models was constructed separately for each influenza season. For
binary outcomes (ICU admission, ventilator use, and prolonged hospital
length of stay), Firth’s logistic regression models were constructed.
Generalized linear models were used for the continuous outcome of
hospital length of stay. Variables chosen a priori for model inclusion
were k-medoids cluster group, age, sex, CCI (as a continuous variable),
hospital, and influenza vaccination status. An exploratory analysis
repeated the primary analysis with outliers removed prior to clustering.
An outlier was conservatively defined as a value less than the 1st
quartile minus 1.5 times the interquartile range or greater than the 3rd
quartile plus 1.5 times the interquartile range. Removed values were then
imputed as the mean of the remaining values, stratified by age group and
hospital. To maintain comparability with the primary analysis, the same
number of clusters was used within a given influenza year.
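
The outlier rule and imputation described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the analysis code. The helper names (`tukey_fences`, `impute_outliers`) and the record fields are hypothetical. Quartiles are computed with linear interpolation (the "inclusive" method), which may differ from the SAS or R defaults actually used. The fences are computed over all records rather than within each stratum, and an outlier whose stratum has no remaining values is left unchanged.

```python
from collections import defaultdict
from statistics import mean, quantiles


def tukey_fences(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    # Assumption: quartiles use linear interpolation ("inclusive" method).
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def impute_outliers(records, value_key, strata_keys):
    """Replace outlying values with the mean of the non-outlying values
    that share the same stratum (for example, age group and hospital).

    Returns new record dicts; the input records are not modified.
    """
    # Assumption: fences are computed over all records, not per stratum.
    lower, upper = tukey_fences([r[value_key] for r in records])

    # Collect the non-outlying values within each stratum.
    kept = defaultdict(list)
    for r in records:
        if lower <= r[value_key] <= upper:
            kept[tuple(r[key] for key in strata_keys)].append(r[value_key])

    result = []
    for r in records:
        new = dict(r)
        stratum = tuple(r[key] for key in strata_keys)
        is_outlier = not (lower <= r[value_key] <= upper)
        # Assumption: with no remaining values in the stratum, the
        # outlier is left as-is.
        if is_outlier and kept[stratum]:
            new[value_key] = mean(kept[stratum])
        result.append(new)
    return result
```

For example, with values 10, 12, 100, 11, 13, and 12, the fences are 9.0 and 15.0, so only 100 is flagged and replaced by the mean of the other values in its own stratum.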
Analyses were conducted using RStudio version 1.2.5042 and SAS version 9.4
(SAS Institute, Cary, NC). Statistical significance was assessed at the
0.05 level.