Statistical analyses
All statistical analyses were conducted in R v4.2.1 using the RStudio interface (Appendix 1).
A changepoint analysis was implemented with the package changepoint (Killick and Eckley, 2014) to identify the weeks at which the mean seller count of each trend changed. We used the PELT (pruned exact linear time) method with AIC values to identify multiple changepoints (Killick et al., 2012), and the AMOC (at most one change) method with the non-parametric CUSUM test statistic (penalty = 0.8) to identify the largest changepoint in each trend (Csörgő and Horváth, 1997). We used imputeTS (Moritz and Bartz-Beielstein, 2017) to impute missing time series data for Toumodi (January 2021) and Adjamé (April 2020).
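As an illustration of the idea behind a single-changepoint CUSUM analysis, the following minimal Python sketch locates the week at which the cumulative sum of deviations from the overall mean is largest in absolute value. It is not the changepoint package's implementation: the function name, the weekly counts and the omission of any penalty comparison are assumptions made for illustration only.

```python
def cusum_changepoint(counts):
    """Locate the single most likely change in mean (illustrative sketch).

    Returns (index, statistic), where index is the position of the last
    observation before the candidate change and statistic is the largest
    absolute cumulative sum of deviations from the overall mean.
    """
    n = len(counts)
    mean = sum(counts) / n
    running = 0.0
    best_index, best_stat = 0, 0.0
    # The final cumulative sum is zero by construction, so only the
    # first n - 1 positions are candidate changepoints.
    for i, value in enumerate(counts[:-1]):
        running += value - mean
        if abs(running) > best_stat:
            best_index, best_stat = i, abs(running)
    return best_index, best_stat


# Hypothetical weekly seller counts with a drop after the fifth week.
weekly_sellers = [12, 11, 13, 12, 12, 4, 5, 3, 4, 5]
index, statistic = cusum_changepoint(weekly_sellers)
print(index, statistic)
```

In this hypothetical series the candidate change falls after the fifth observation (index 4). A full analysis would additionally compare the test statistic against the chosen penalty before reporting a changepoint.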
We used a non-parametric two-sample Kolmogorov-Smirnov test (Marsaglia et al., 2003) to identify approximate significant differences between pairs of curves, in terms of both the distance between curves and curve shape. These pairs included the comparison between the restaurants (Toumodi) and all markets, for which the number of sellers was averaged across the four markets. To examine differences between pairs of curves in more detail, we calculated Fréchet distances using kmlShape (Genolini et al., 2016), which also account for the location and ordering of the points in each curve. We used a small ‘timeScale’ of 0.1 in order to measure differences in general curve shape across time (Dynamic Time Warping) and reduce the influence of time on the curve distance (Euclidean distance) (Genolini et al., 2016). Both the Kolmogorov-Smirnov test and the Fréchet distance were computed for the entire study period and, separately, for the period of governmental measures against COVID-19 only.
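The comparison between the restaurants and the averaged markets can be sketched as follows. This minimal Python example computes only the two-sample Kolmogorov-Smirnov statistic D (the largest gap between the two empirical distribution functions), not its p-value. All seller counts are hypothetical and the function name is an assumption for illustration.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D (illustrative sketch)."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    # The empirical CDFs only change at observed values, so it is enough
    # to evaluate their difference at every pooled observation.
    for x in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical weekly seller counts for four markets (one list per market).
markets = [
    [10, 8, 4, 6, 9],
    [12, 9, 5, 7, 11],
    [8, 7, 3, 5, 8],
    [10, 8, 4, 6, 8],
]
# Average the number of sellers across the four markets, week by week.
market_mean = [sum(week) / len(week) for week in zip(*markets)]

# Hypothetical weekly seller counts for the restaurants.
restaurants = [5, 4, 1, 2, 4]

d_value = ks_statistic(restaurants, market_mean)
print(market_mean, d_value)
```

Because the statistic is computed on the distributions of counts, it ignores the order of the weeks. Measures such as the Fréchet distance, which account for the ordering of points along each curve, are therefore a useful complement.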
A growth curve analysis using growthcurver (Sprouffske and Wagner, 2016) was conducted for each curve from the time governmental measures were enforced, in order to determine the relative growth rate of the markets and restaurants under a logistic growth model.
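For reference, the logistic growth curve has the form N(t) = K / (1 + ((K - N0) / N0) exp(-r t)), where K is the carrying capacity, N0 the initial count and r the growth rate. The Python sketch below evaluates this curve and recovers r from synthetic data by a simple grid search. It fixes K and N0 at hypothetical values and is far cruder than a full least-squares fit of all three parameters; the function names and all numbers are assumptions for illustration.

```python
import math


def logistic(t, k, n0, r):
    """Logistic growth: count at time t given capacity k, start n0, rate r."""
    return k / (1.0 + ((k - n0) / n0) * math.exp(-r * t))


def fit_rate(times, counts, k, n0, candidates):
    """Pick the candidate growth rate with the smallest sum of squared errors."""
    def sse(r):
        return sum((logistic(t, k, n0, r) - y) ** 2
                   for t, y in zip(times, counts))
    return min(candidates, key=sse)


# Synthetic, noise-free seller counts generated with a known rate of 0.6
# (hypothetical values: capacity of 20 sellers, 2 sellers at week 0).
weeks = list(range(10))
observed = [logistic(t, 20.0, 2.0, 0.6) for t in weeks]

# Scan growth rates from 0.01 to 2.00 in steps of 0.01.
candidates = [i / 100 for i in range(1, 201)]
r_hat = fit_rate(weeks, observed, 20.0, 2.0, candidates)
print(r_hat)
```

Comparing the fitted r between sites gives a relative measure of how quickly each site returned towards its capacity, which is the quantity of interest here.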
We used the Random Forest algorithm (Breiman, 2001), implemented in the package randomForest, as a machine learning approach to predict the likely growth of each bushmeat site after the main study period (from January 2021 to September 2021, the time of the control survey). The data were split into training and test sets at a ratio of approximately 80:20, and model robustness was assessed using the root mean square error (RMSE; better models have values closer to zero). We considered two scenarios for the evolution of the sites from January 2021 until we revisited them in September 2021: (i) knowing the initial maximum number of sellers prior to COVID-19 (predictions constrained by the maximum capacity of each site), and (ii) not knowing any values after the initial study period (no constraint on the predictions).
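The data handling around the model can be sketched in a few lines of Python. The example below shows an approximately 80:20 split, the RMSE computation and one possible reading of the capacity constraint in scenario (i), in which predictions are capped at the pre-COVID-19 maximum. The Random Forest model itself is omitted. The random shuffle, the capping rule, the function names and all values are assumptions for illustration and are not taken from the analysis code.

```python
import math
import random


def train_test_split(rows, train_fraction=0.8, seed=0):
    """Shuffle rows reproducibly and split them into training and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def rmse(observed, predicted):
    """Root mean square error; values closer to zero indicate a better fit."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def cap_predictions(predicted, max_sellers):
    """Scenario (i), as assumed here: no prediction exceeds site capacity."""
    return [min(p, max_sellers) for p in predicted]


# Hypothetical (week, seller count) observations for one site.
rows = [(week, count) for week, count in enumerate([3, 4, 6, 7, 9, 10, 10, 11, 12, 12])]
train, test = train_test_split(rows)

# Hypothetical model output, evaluated with and without the capacity cap.
observed = [9, 12, 12]
unconstrained = [8.0, 13.5, 14.0]
constrained = cap_predictions(unconstrained, max_sellers=12)
print(len(train), len(test), rmse(observed, unconstrained), rmse(observed, constrained))
```

In this hypothetical case, capping the predictions lowers the RMSE because the uncapped predictions overshoot the known maximum number of sellers.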