Introduction
Species around the globe are redistributing in response to anthropogenic climate change1–3. Range-shifting species elicit positive4 and negative5,6 ecological and societal impacts7, so there is a need to track range shifts. Tracking range shifts requires large, high-quality occurrence datasets, such as those provided by online databases like the Global Biodiversity Information Facility (GBIF)8,52. While GBIF collates occurrence data from a range of sources, the majority of data originate from scientific surveys9. The vast majority of scientific surveys occur in a species’ “natural” habitat (where the species is historically likely to be found), which may bias occurrence records from databases such as GBIF towards rural locations. However, recent studies report that urban environments are important to range-shifting species; many range shifters have been found to be human-associated, often occurring in gardens or being unintentionally transported into cities as passengers on trade vessels10,11. Urban environments may therefore be under-represented in databases such as GBIF, leaving a gap in the occurrence records for range shifters. Rapidly detecting and monitoring arrivals in human-dominated landscapes such as urbanised areas could thus reduce spatial bias in predictive models and improve understanding of the association between range-shifting species and urban habitats.
Another challenge in sourcing data on range shifts is that many resources such as GBIF have a time lag (up to 3 years) associated with the recording, verification, and aggregation of occurrence data12,13. However, the speed and magnitude of range shifts necessitate more rapid data availability14,15.
One potential solution is the implementation of community science projects, which have been shown to produce high-quality occurrence data quickly16–18. However, community science projects often require substantial resources and many willing participants18. Another potential avenue for gathering occurrence data quickly across a variety of environments is social media19. Social media users may incidentally upload georeferenced photographs of a species of interest19. Photos of a focal species are often uploaded to social media immediately, expediting the process of gathering data. Furthermore, because the majority of humans reside in urban environments, and urban environments benefit from a good internet connection, social media are likely to sample these environments well. Social media sources may therefore reveal use of urban habitat that is overlooked by traditional surveying methods targeting rural areas20.
Despite the advantages above, social media data could also be patchy and prone to a higher degree of spatial recorder bias than traditional ecological data. Heterogeneous recorder effort can cause over- and under-estimation of the suitability of particular environmental conditions in Habitat Suitability Models (HSMs). Patchiness could arise from hotspots of social media use within highly urbanised areas. In addition, users may be heavily influenced by trends, leading to periods of intense interest in a small number of species21. It is therefore particularly important to understand the role of spatial and temporal recorder effort bias in social media data. Spatial bias and the influence of trends may also vary between social media platforms, so we need to understand how recorder effort differs between platforms.
In this study, we compare the information content provided by different sources of occurrence data for a range-shifting species, the Jersey tiger moth (JTM), Euplagia quadripunctaria (formerly Callimorpha quadripunctaria). JTM is a day-flying, recognisable, abundant lepidopteran currently undergoing rapid range shifts due to climate change22. JTM is a generalist species, likely to be able to make use of urban environments23, and is also visually striking, and so may generate interest on social media platforms. We: i) model annual habitat suitability for JTM in a portion of Europe during a period of changing climate using data from GBIF; ii) assess whether occurrences of JTM from four social media data sources (Twitter, Flickr, Instagram, and iNaturalist) are found in areas that GBIF models predict to have poor habitat suitability; and iii) investigate how recorder effort affects JTM occurrence across all data sources. We predict that: i) occurrence data from social media platforms are found in areas that models based on GBIF data would predict to be of low habitat suitability; and ii) accounting for recorder effort will be particularly important for modelling species distributions using social media data.