Why use Bayesian inference to model wildlife disease?
A Bayesian model can be loosely defined as any model that derives its
inference from a posterior probability distribution, which is obtained
by combining a prior probability distribution with the likelihood of
any available data using Bayes’ theorem. Bayesian inference
allows the explicit modelling of both observed and unobserved quantities,
with a priori information acting as model assumptions. Consequently, a disease
ecologist may be able to combine all known ecological information
relating to a host species into a single model. This method has already
influenced our understanding of disease risks from invasive species,
the potential for disease transmission, and vulnerabilities within
livestock systems to foot-and-mouth disease. Overall, Bayesian models
are highly capable of representing multi-scale studies, making them
critical tools for disease ecologists trying to understand host-pathogen
systems.
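As a minimal illustration of how Bayes’ theorem turns a prior and data into a posterior, the sketch below updates a Beta prior on infection prevalence with binomial test results. The Beta-binomial model, the function names and all numbers are assumptions chosen for illustration; they are not taken from any study referred to here.

```python
def beta_binomial_update(prior_a, prior_b, n_infected, n_tested):
    """Return Beta posterior parameters for prevalence given binomial data.

    With a Beta(prior_a, prior_b) prior and n_infected positives out of
    n_tested animals, the posterior is Beta(prior_a + n_infected,
    prior_b + n_tested - n_infected).
    """
    if not 0 <= n_infected <= n_tested:
        raise ValueError("n_infected must lie between 0 and n_tested")
    return prior_a + n_infected, prior_b + (n_tested - n_infected)


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical example: expert opinion encoded as Beta(2, 8) (prior mean 0.2),
# followed by a small field sample in which 3 of 20 animals are infected.
post_a, post_b = beta_binomial_update(2.0, 8.0, n_infected=3, n_tested=20)
posterior_mean = beta_mean(post_a, post_b)
print(f"Posterior: Beta({post_a}, {post_b}), mean prevalence {posterior_mean:.3f}")
```

Because the Beta prior is conjugate to the binomial likelihood, the posterior is available in closed form; most realistic wildlife-disease models do not have this property and require numerical methods.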
We believe that ecologists should understand Bayesian modelling
approaches in order to fully realise and explain the causal
relationships and interacting factors that make up disease systems.
This need is reflected in the scientific community by the recent update
to the WAMBS (When to Worry and how to Avoid
the Misuse of Bayesian Statistics) checklist, originally proposed in
2017. The primary goal of statistical epidemiology is to infer the
parameters most relevant to the understanding and management of
epidemics, particularly infection prevalence, force and severity. The
contemporary task of disease ecologists is to understand and
differentiate among interactions and relationships within a complex
host-pathogen system, yet there are multiple complications. For
instance, when modelling disease systems, network complexity is known to
add to network fragility, largely due to increasingly unpredictable
ecological responses to perturbations. One such example is social
perturbation, i.e. individual dispersal in response to interference, as
observed during badger culling. A further complication is that the
causative pathogens themselves may instigate unknown and potentially
bi-directional behavioural alterations affecting transmission.
Constructing a realistic host-pathogen network that includes
fine-scale interactions such as individual behaviours remains a key
challenge to the development of a whole-system model.
The many benefits of Bayesian approaches to inference, and of the
Markov chain Monte Carlo (MCMC) algorithms usually used to implement them, are
described elsewhere. However, in the context of wildlife-disease
modelling, the benefits of adopting a Bayesian approach include but are
not limited to: direct inference about the probability of hypotheses and parameter values given the data; the ability to
introduce informative prior information when available; the flexibility
to describe a hierarchy of states, processes and their noise in a single
model; the ability to infer latent variables and parameters; the ability
to integrate across multiple sources of data and multiple statistical
processes; and the flexibility to work with a wider-than-usual range of
likelihood functions. In contrast, the costs of adopting Bayesian
methodologies include: the learning of new statistical concepts and
software; the dropping of ingrained allegiances to tests of significance
or information criteria; the computational expense of running long,
iterative chains of likelihood calculations; and the lack of consensus
on how to judge the importance of rival models. Recent advances in
computation, methodology, education, and software are already helping to
minimise these apparent costs.
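To make the idea of long, iterative chains of likelihood calculations concrete, the following sketch implements a random-walk Metropolis sampler, one of the simplest MCMC algorithms, for infection prevalence. The model (binomial likelihood with a Beta prior), the tuning values and the data are hypothetical, and the function names are our own; in practice an established software package would normally be preferred over a hand-written sampler.

```python
import math
import random


def log_posterior(p, n_infected, n_tested, prior_a, prior_b):
    """Unnormalised log posterior: binomial likelihood times Beta prior."""
    if p <= 0.0 or p >= 1.0:
        return -math.inf  # prevalence outside (0, 1) has zero posterior density
    log_lik = n_infected * math.log(p) + (n_tested - n_infected) * math.log(1.0 - p)
    log_prior = (prior_a - 1.0) * math.log(p) + (prior_b - 1.0) * math.log(1.0 - p)
    return log_lik + log_prior


def metropolis(n_iter, n_infected, n_tested, prior_a, prior_b, step_sd=0.1, seed=1):
    """Random-walk Metropolis sampler for prevalence p."""
    rng = random.Random(seed)
    p = 0.5  # arbitrary starting value inside (0, 1)
    current = log_posterior(p, n_infected, n_tested, prior_a, prior_b)
    samples = []
    for _ in range(n_iter):
        proposal = p + rng.gauss(0.0, step_sd)  # symmetric proposal
        candidate = log_posterior(proposal, n_infected, n_tested, prior_a, prior_b)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            p, current = proposal, candidate
        samples.append(p)
    return samples


# Hypothetical data: 3 infected out of 20 tested, Beta(2, 8) prior.
chain = metropolis(20000, n_infected=3, n_tested=20, prior_a=2.0, prior_b=8.0)
kept = chain[2000:]  # discard an initial burn-in period
mcmc_mean = sum(kept) / len(kept)
print(f"MCMC estimate of mean prevalence: {mcmc_mean:.3f}")
```

For this simple model the posterior is known analytically to be Beta(5, 25), so the sampler's output can be checked against the exact mean of 5/30; each iteration requires a fresh likelihood evaluation, which is the source of the computational expense noted above.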
Bayesian inference is particularly useful to disease ecologists because
field data from real-world diseased or healthy wildlife populations are
rare but can often be supplemented by expert prior knowledge. Therefore,
a Bayesian modeller has the flexibility to combine both quantitative and
qualitative data. Further, Bayesian hierarchical techniques can capture
the intricacies of level, scale and hierarchy within ecosystems by
accounting for their uncertainties or, where features are unobservable,
by including them as random effects. Consequently, the uncertainty
in latent variables such as an individual’s infection status, which is
often unknown and unobservable, can be both accounted for and inferred.
These characteristics allow Bayesian models to be used as proxies for
diagnostic tests themselves (for example, to observe changes in
expected biomarker patterns due to changes in individual disease status)
or for their associated accuracies.
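The inference of a latent infection status from an imperfect diagnostic test can be sketched with a direct application of Bayes’ theorem. The function name and the values of prevalence, sensitivity and specificity below are hypothetical and serve only to illustrate the calculation; in a full model these quantities would themselves be uncertain parameters.

```python
def posterior_infection_probability(test_positive, prevalence, sensitivity, specificity):
    """Probability that an individual is infected, given one test result.

    prevalence  : prior probability of infection
    sensitivity : P(test positive | infected)
    specificity : P(test negative | not infected)
    """
    if test_positive:
        lik_infected = sensitivity
        lik_uninfected = 1.0 - specificity
    else:
        lik_infected = 1.0 - sensitivity
        lik_uninfected = specificity
    numerator = lik_infected * prevalence
    return numerator / (numerator + lik_uninfected * (1.0 - prevalence))


# Hypothetical test characteristics and prior prevalence.
p_given_positive = posterior_infection_probability(True, 0.2, 0.8, 0.95)
p_given_negative = posterior_infection_probability(False, 0.2, 0.8, 0.95)
print(f"P(infected | positive) = {p_given_positive:.2f}")
print(f"P(infected | negative) = {p_given_negative:.2f}")
```

Under these assumed values a positive result raises the probability of infection from 0.2 to 0.8, while a negative result lowers it to 0.05, which shows how the unobserved status is inferred rather than assumed to equal the test outcome.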