Introduction
Species around the globe are redistributing in response to anthropogenic
climate change1–3. Range-shifting species elicit
positive4 and negative5,6 ecological
and societal impacts7, so there is a need to track
range shifts. Tracking range shifts requires large, high-quality
occurrence datasets, such as those provided by online databases like the
Global Biodiversity Information Facility (GBIF)8,52.
While GBIF collates occurrence data from a range of sources, the
majority of data originate from scientific surveys9.
The vast majority of scientific surveys occur in a species’ “natural”
habitat (where the species has historically been likely to be found), which
may bias occurrence records from databases such as GBIF towards rural
locations. However, recent studies report that urban environments are
important to range-shifting species; many range shifters have been found
to be human-associated, often occurring in gardens or unintentionally
transported into cities as passengers on trade
vessels10,11. Therefore, the possible under-representation of
urban environments in databases such as GBIF may
leave a gap in occurrence records for range shifters. Rapidly
detecting and monitoring arrivals in human-dominated landscapes such as
urbanised areas may therefore reduce spatial bias in predictive models
and clarify the association between range-shifting species and urban
habitats.
Another challenge for sourcing data on range shifts is that many
resources such as GBIF have a time lag (up to 3 years) associated with
the process of recording, verifying, and aggregating occurrence
data12,13. However, the speed and magnitude of range
shifts necessitate more rapid data availability14,15.
One potential solution could be the implementation of community science
projects, which have been shown to produce high-quality occurrence data
quickly16–18. However, community science projects
often require substantial resources and many willing
participants18. Another potential avenue to gather
occurrence data quickly within a variety of environments is via social
media19. Social media users may incidentally upload georeferenced
photographs of a species of interest19.
Such photographs are often uploaded to social media
immediately, expediting the process of gathering data. Furthermore,
because the majority of people reside in urban environments, and urban
environments benefit from good internet connectivity, social media is
likely to sample these environments. Social media sources may
therefore reveal use of urban habitat that is overlooked by traditional
survey methods targeting rural areas20.
Despite these advantages, social media data could be patchy and
prone to a higher degree of spatial recorder bias than traditional
ecological data. Heterogeneous recorder effort can cause over- and
under-estimation of the suitability of particular environmental conditions
in Habitat Suitability Models (HSMs). Patchiness could arise from
hotspots of social media use within highly urbanised areas. Users may
also be heavily influenced by trends, leading to periods of intense interest
in a small number of species21. It is therefore
particularly important to understand the role of spatial and temporal
recorder effort bias in social media data. There may also be variations
in spatial bias and the influence of trends between different social
media platforms, so we need to understand how recorder effort differs
between platforms.
In this study, we compare the information content provided by different
sources of occurrence data for a range-shifting species, the Jersey tiger
moth (JTM), Euplagia quadripunctaria (formerly Callimorpha
quadripunctaria). JTM is a day-flying, recognisable, abundant
lepidopteran currently undergoing rapid range shifts due to climate
change22. JTM is a generalist species, likely to be
able to make use of urban environments23, and is also
visually striking, therefore potentially generating interest on social
media platforms. We: i) model annual habitat suitability for JTM in a
portion of Europe during a period of changing climate using data from
GBIF; ii) assess whether occurrences of JTM from four social media data
sources (Twitter, Flickr, Instagram, and iNaturalist) are found in areas
that GBIF models predict to have poor habitat suitability; and iii)
investigate how recorder effort affects JTM occurrence across all data
sources. We predict that: i) occurrence data from social media platforms
are found in areas that models based on GBIF data would predict to be of
low habitat suitability; and ii) accounting for recorder effort will be
particularly important for the modelling of species distribution using
social media data.