Why use Bayesian inference to model wildlife disease?
A Bayesian model can be loosely defined as any model deriving its inference from a posterior probability distribution, which is acquired from a prior probability distribution and its associated likelihoods, using Bayes’ theorem and any available data. Bayesian inference allows the explicit modelling of both observed and unobserved a priori data, which act as model assumptions. Consequently, a disease ecologist may be able to combine all known ecological information relating to a host species into a single model. This method has already influenced our understanding of disease risks from invasive species, the potential for disease transmission, and vulnerabilities within livestock systems to foot and mouth disease. Overall, Bayesian models are highly capable of representing multi-scale studies, making them critical tools for disease ecologists trying to understand host-pathogen systems.
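To make the definition above concrete, the following minimal sketch applies Bayes’ theorem on a grid of candidate prevalence values, so that the posterior is simply the prior multiplied by the likelihood and then normalised. The survey counts (3 infected out of 30 sampled), the flat prior and the function names are hypothetical choices made for illustration only; they are not taken from any study cited here.

```python
import math


def grid_posterior(n_infected, n_sampled, prior_fn, n_grid=1001):
    """Approximate the posterior for prevalence on an evenly spaced grid.

    posterior(p) is proportional to prior(p) * likelihood(data | p).
    """
    grid = [i / (n_grid - 1) for i in range(n_grid)]
    unnormalised = []
    for p in grid:
        # Binomial likelihood of the observed count at prevalence p.
        likelihood = (
            math.comb(n_sampled, n_infected)
            * p ** n_infected
            * (1 - p) ** (n_sampled - n_infected)
        )
        unnormalised.append(prior_fn(p) * likelihood)
    total = sum(unnormalised)
    weights = [u / total for u in unnormalised]  # normalise so weights sum to 1
    return grid, weights


def flat_prior(p):
    """Hypothetical uninformative prior: every prevalence equally plausible."""
    return 1.0


# Hypothetical survey: 3 infected animals among 30 sampled.
grid, weights = grid_posterior(3, 30, flat_prior)
posterior_mean = sum(p * w for p, w in zip(grid, weights))
print(f"Posterior mean prevalence: {posterior_mean:.3f}")
```

Replacing `flat_prior` with a function that places more weight on plausible prevalence values is one simple way to encode prior ecological knowledge in the same calculation.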
We believe that ecologists should understand Bayesian modelling approaches, in order to fully realise and explain the causal relationships and interacting factors that make up disease systems. This need is being reflected in the scientific community by authors such as , who have recently updated the WAMBS (When to Worry and how to Avoid the Misuse of Bayesian Statistics) checklist originally proposed in 2017. The primary goal of statistical epidemiology is to infer the parameters most relevant to the understanding and management of epidemics, particularly infection prevalence, force and severity. The contemporary task of disease ecologists is to understand and differentiate among interactions and relationships within a complex host-pathogen system, yet there are multiple complications. For instance, when modelling disease systems, network complexity is known to add to network fragility, largely due to increasingly unpredictable ecological responses to perturbations. One such example is social perturbation, i.e. individual dispersal in response to interference, as observed during badger culling. A further complication is that the causative pathogens themselves may instigate unknown and potentially bi-directional behavioural alterations affecting transmission. Constructing a realistic host-pathogen network that includes fine-scale interactions such as individual behaviours remains a key challenge to the development of a whole-system model.
The many benefits of Bayesian approaches to inference, and the Markov chain Monte Carlo (MCMC) algorithms usually used to implement them, are described elsewhere. However, in the context of wildlife-disease modelling, the benefits of adopting a Bayesian approach include but are not limited to: the inference of truth given data; the ability to introduce informative prior information when available; the flexibility to describe a hierarchy of states, processes and their noise in a single model; the ability to infer latent variables and parameters; the ability to integrate across multiple sources of data and multiple statistical processes; and the flexibility to work with a wider-than-usual range of likelihood functions. In contrast, the costs of adopting Bayesian methodologies include: the learning of new statistical concepts and software; the dropping of ingrained allegiances to tests of significance or information criteria; the computational expense of running long, iterative chains of likelihood calculations; and the lack of consensus on how to judge the importance of rival models. Recent advances in computation, methodology, education, and software are already helping to minimise these apparent costs.
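As an illustration of what an iterative chain of likelihood calculations involves, the sketch below implements a basic random-walk Metropolis sampler for infection prevalence. It is a teaching example under stated assumptions rather than a recommended workflow: the Beta(2, 8) informative prior, the survey counts, the proposal step size, the burn-in length and all function names are hypothetical, and in practice dedicated software would normally be used.

```python
import math
import random


def log_posterior(p, n_infected, n_sampled, prior_a, prior_b):
    """Unnormalised log posterior: Beta(prior_a, prior_b) prior, binomial likelihood."""
    if p <= 0.0 or p >= 1.0:
        return -math.inf  # prevalence must lie strictly between 0 and 1
    log_prior = (prior_a - 1) * math.log(p) + (prior_b - 1) * math.log(1 - p)
    log_lik = n_infected * math.log(p) + (n_sampled - n_infected) * math.log(1 - p)
    return log_prior + log_lik


def metropolis(n_iter, n_infected, n_sampled, prior_a, prior_b, step=0.05, seed=1):
    """Random-walk Metropolis sampler; returns the full chain of prevalence values."""
    rng = random.Random(seed)
    current = 0.5
    current_lp = log_posterior(current, n_infected, n_sampled, prior_a, prior_b)
    chain = []
    for _ in range(n_iter):
        proposal = current + rng.gauss(0.0, step)  # symmetric proposal
        proposal_lp = log_posterior(proposal, n_infected, n_sampled, prior_a, prior_b)
        # Accept with probability min(1, posterior ratio).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain


# Hypothetical inputs: an expert prior centred near 20% prevalence, Beta(2, 8),
# combined with a survey of 3 infected animals among 30 sampled.
chain = metropolis(20000, n_infected=3, n_sampled=30, prior_a=2.0, prior_b=8.0)
kept = chain[2000:]  # discard an assumed burn-in period
mcmc_mean = sum(kept) / len(kept)
print(f"MCMC estimate of posterior mean prevalence: {mcmc_mean:.3f}")
```

For this particular prior and likelihood the posterior is known analytically to be Beta(5, 35), with mean 0.125, which gives a convenient check on the sampler. Each iteration requires a fresh likelihood evaluation, which is the source of the computational expense noted above.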
Bayesian inference is particularly useful to disease ecologists because field data from real-world diseased or healthy wildlife populations are rare but can often be supplemented by expert prior knowledge. Therefore, a Bayesian modeller has the flexibility to combine both quantitative and qualitative data. Further, Bayesian hierarchical techniques can capture the intricacies of level, scale and hierarchy within ecosystems by accounting for their uncertainties or, if they are unobservable features, by including them as random effects. Consequently, the uncertainty in latent variables such as an individual’s infection status, which is often unknown and unobservable, can be both accounted for and inferred. These characteristics allow Bayesian models to be used as proxies for diagnostic tests themselves (for example, to observe changes in expected biomarker patterns due to changes in individual disease status) or for their associated accuracies.
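The inference of a latent infection status from an imperfect diagnostic test can be sketched with a direct application of Bayes’ theorem. The example below is a simplified, single-individual illustration; the prevalence, sensitivity and specificity values and the function name are hypothetical and do not describe any particular test or host species.

```python
def prob_infected_given_test(prevalence, sensitivity, specificity, test_positive):
    """Posterior probability that an individual is infected, given one test result.

    prevalence  : prior probability of infection (e.g. from expert knowledge)
    sensitivity : P(test positive | infected)
    specificity : P(test negative | not infected)
    """
    if test_positive:
        true_pos = sensitivity * prevalence
        false_pos = (1 - specificity) * (1 - prevalence)
        return true_pos / (true_pos + false_pos)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    return false_neg / (false_neg + true_neg)


# Hypothetical values: 10% prior prevalence, 80% sensitivity, 95% specificity.
p_pos = prob_infected_given_test(0.10, 0.80, 0.95, test_positive=True)
p_neg = prob_infected_given_test(0.10, 0.80, 0.95, test_positive=False)
print(f"P(infected | positive test) = {p_pos:.3f}")
print(f"P(infected | negative test) = {p_neg:.3f}")
```

Under these assumed values a positive result raises the probability of infection well above the prior prevalence without making it certain, and a negative result lowers it without excluding infection. A full hierarchical model would additionally treat prevalence, sensitivity and specificity as uncertain quantities with their own priors.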