Discussion
In this study, we demonstrate that occurrence records from Instagram and
iNaturalist are found in different and more urbanised locations compared
to occurrences from traditional datasets, such as GBIF. We therefore
highlight the utility of these social media platforms as additional and
complementary sources of data alongside traditional databases.
However, contrary to our predictions, occurrence data from Flickr offer
a somewhat similar outlook to that provided by GBIF records. The
environments surveyed differed notably among social media platforms:
Flickr data occurred in more rural locations than iNaturalist data,
and Instagram data occurred in the most urban areas.
As predicted, the majority of post-2009 occurrence records from GBIF for
JTM within our study region fell within more rural areas, likely due to
the majority of GBIF data originating from scientific surveys and formal
recorder efforts, which largely operate in rural zones. In contrast,
and also as predicted, Instagram largely contains data from highly urban
zones, likely because urban areas are densely populated and have good
internet connections, both of which shape where people upload data to
social media. However,
data from Flickr are more rural than data from iNaturalist and
Instagram. Flickr is tailored towards individuals with an interest in
the quality of photography, which may attract wildlife photographers
intent on capturing wildlife in relatively rural environments, rather
than in urban zones. Overall, this demonstrates the utility of social
media sites such as iNaturalist and Instagram to fill a void in the
occurrence records provided by traditionally used data sources such as
GBIF. Although we only studied one species, we think it is likely that
the urban-rural differences between databases would remain for similar
(colourful, eye-catching) species that would likely be uploaded to
social media.
Our results also highlight the importance of accounting for recorder
effort. The strong positive effect of recorder effort on GBIF
occurrences indicates that JTM is detected where recorders are searching
for it. Thus, predictions of low suitability in urban areas based on
GBIF data are not
necessarily trustworthy. Likewise, iNaturalist and Flickr data also
occur in areas where recorder effort on their platforms is high,
indicating that these data sources alone may not contain occurrences in
all areas the species is found. In contrast to iNaturalist and Flickr,
there was a much shallower relationship between the location of JTM
records from Instagram and recorder effort. This suggests that Instagram
is better at detecting a species in areas where it is not looked for by
the majority of users. Instagram’s ability to detect JTM both in areas
of lower GBIF-calculated habitat suitability and in areas of higher
urbanisation, together with its relatively shallow effect of recorder
effort, makes it an ideal complement to traditional occurrence data for
range shifters. The utility of Flickr and iNaturalist should not be
discounted, though, since both may make species records publicly
available more rapidly than GBIF.
It should be noted that recorder effort was particularly geographically
uneven for social media sources and our results could be affected by
this patchiness. It is possible that blackbird recorder effort does not
reflect JTM recorder effort, particularly within GBIF, since surveys for
different taxa (birds and insects, in this instance) are likely to
employ differing sampling techniques and attract differing audiences [38].
However, abundance data for insects with which to calculate recorder
effort are rarely available. Moreover, using the blackbird, which is well
represented in all data sources, allows recorder effort to be compared
between localities and time periods in a way that could be applied to a
wide range of taxa. There may also be a novelty bias towards range-shifting
species, causing geographical and temporal variation in recorder effort.
Given the varying, but broadly important, effect of recorder effort,
developing improved recorder effort metrics could be particularly
important to the use of social media data in biogeography and
range-shift ecology. Even if not a precise, quantitative metric of
recorder effort, the approach we developed is a useful tool for
comparison between data sources, locations, and time periods. This is
particularly important when dealing with social media data, which are
prone to temporal and spatial trends and uneven geographical use.
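As a rough illustration of a reference-species effort proxy of this kind, the following minimal Python sketch counts reference-species (for example, blackbird) records per grid cell and scales the counts to the range 0 to 1. It is not the metric used in this study: the function names, the 0.5 degree cell size, and the scaling by the busiest cell are all assumptions made for illustration only.

```python
from collections import Counter
from math import floor


def grid_cell(lat, lon, cell_deg=0.5):
    """Return the integer (row, column) index of the grid cell holding a point.

    cell_deg is a hypothetical cell size in decimal degrees.
    """
    return (floor(lat / cell_deg), floor(lon / cell_deg))


def effort_by_cell(reference_records, cell_deg=0.5):
    """Count reference-species records per cell and scale by the busiest cell.

    reference_records is an iterable of (lat, lon) pairs for a species that
    is well represented in the data source (an assumed stand-in for effort).
    """
    counts = Counter(
        grid_cell(lat, lon, cell_deg) for lat, lon in reference_records
    )
    if not counts:
        return {}
    peak = max(counts.values())
    return {cell: n / peak for cell, n in counts.items()}


def effort_at(lat, lon, effort, cell_deg=0.5):
    """Look up relative effort at a point; cells with no records score 0.0."""
    return effort.get(grid_cell(lat, lon, cell_deg), 0.0)


# Made-up coordinates, used only to show how the functions fit together.
reference = [(50.1, -1.2), (50.2, -1.3), (50.3, -1.1), (51.6, 0.1)]
effort = effort_by_cell(reference)
print(effort_at(50.4, -1.4, effort))
```

Because the score is relative to the busiest cell within one data source and period, it supports comparison between sources, locations, and time periods rather than serving as an absolute measure of effort.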
Range-shifting and invasive species have previously been found to be
human-associated, persisting in urban parks and gardens [10]. Although
the extent of this association
remains unknown, our results highlight the potential for social media
data to track and understand range-shifting species in urban zones.
Since Instagram’s focus is on photography, it could be used to track the
arrival of eye-catching or charismatic taxa in urban areas. However, a
less recognisable or visually appealing species than JTM could generate
fewer occurrences, and thus the repeatability of the use of Instagram
data across different taxa requires further investigation. In addition,
collection of ad hoc social media data may present opportunities for
researchers to assess wildlife management practices in urban and
suburban areas. Records of bug hotels, bird feeders, and mutualists from
social media could be used to identify hotspots of positive management
in cities, as well as areas that are deficient in their capacity to
support biodiversity. Furthermore, social media data could be used to
assess the persistence of endangered species in urban and suburban
areas, adding to the existing body of work on the importance of
gardens in supporting threatened or keystone taxa [39].
Our study also suggests that there may even be scope for assessing the
potential for urban spaces to propagate range shifts and invasions
further, in a similar way to forest corridors [40]. It is
clear that, if robust and repeatable methodologies can be applied,
social media data sources have high potential to provide high-quality
data at speed. Furthermore, it is likely that these methods will only
increase in importance as urbanisation rises globally [41]. It is also
noteworthy that social media platforms such as Twitter have been used to
promote uptake of the UK ladybird survey, yielding insights into the
spread of the Harlequin ladybird [42].
Scientific and policy-maker interest in community science in urban areas
is growing, given that urban environments are expanding, most people
live in urban environments, and most nature experiences are close to
home [43]. Noticing urban wildlife can improve mental and
physical wellbeing [44,45], and increasing engagement
with urban nature offers opportunities for improved ecological literacy
and nature connectedness, particularly amongst social groups that have
historically had inequitable access to nature [46,47].
Our results further reinforce recent findings that social media
platforms could be harnessed to assist in urban nature engagement and
conservation [48,49]. While our results highlight a
promising avenue for future studies and offer novel sources of data with
new information, a fundamental area for improvement is the establishment
of a rigorous and consistent methodology [19]. A source
of uncertainty for this study is that the search terms and the access
and use of APIs could not be made consistent across all social media
data sources. The process by which data are obtained would benefit
from greater consistency; the main barrier here is the expense of using
the API services supplied by Instagram and Twitter. Both services impose
recency constraints and query limits on their free-to-use
APIs, and more expansive API usage was outside the budget
of this study, costing up to £2000 per month depending on the service
at the time the study was conducted. This could be
overcome with additional studies highlighting the importance of access
to these data for scientists, thus prompting social media companies to
produce an API service that is accessible to scientists. Alternatively,
machine learning programmes such as UiPath could provide a more
affordable and consistent method to gather data from online
sources [50]. Implementing alternative methodologies
and studying different focal species is likely to increase the utility of
Twitter, which was omitted from analyses due to a low sample size, and
could permit use of other social media sources not considered here due
to data accessibility, such as TikTok or Facebook.
A further potential issue with social media use is that there is not
necessarily equal utilisation of these sources throughout all nations,
particularly in those outside of Europe and North America. We have
attempted to account for unequal usage in our study by using three
different sources of social media data, but ideally more could be
implemented. Search terms should also be considered with caution. We
have included the search terms that yielded the most occurrences of JTM.
However, there may be an English bias here, since social media users
who do not speak English will likely report potential
occurrences using non-English and colloquial terminology. Although various
common names are not always simple to incorporate (as was the case here,
with German names such as “Spanish flag” and “Russian bear”, which
yielded countless non-moth results when searched), this is certainly
worthy of consideration. Social media data sources are also driven by
trends, which may contribute to varying usefulness of different sources
over time as the popularity and novelty of range shifting species wax
and wane. Such an effect seemed to be apparent with JTM, where the
inclusion of the moth on postage stamps in the Channel Islands was
associated with an increase in GBIF and iNaturalist occurrences in 2012
and 2013 (which also illustrates that even GBIF is not resistant to
trends). Nonetheless, such trends could also be a potential advantage to
social media data sources. In theory, governments and scientists could
highlight species of interest to the public, thus generating a trend
around focal organisms that could be used to generate social media
occurrence records. Such strategies could increase the use of social
media to record biological phenomena, potentially producing large
quantities of community science data.
The results presented here support the combined use of
traditional (GBIF) and social media (particularly Instagram and
iNaturalist) data sources to generate a more complete understanding of
the habitat use of range-shifting species. Our study suggests that
traditional and social media biodiversity data can contain different,
but complementary, information regarding habitat usage of a
range-shifting species. While GBIF captures the rural range of JTM across the
study region, Instagram demonstrated that JTM also occupies highly
urbanised environments. Social media data may be particularly prone to
variation in recorder effort, and we propose a method that can account
for this. We suggest that data from social media should be added to
occurrence datasets when tracking range shifting species. Implementation
of occurrence records from social media could be particularly important
given the human-associated nature of some range shifters, which often
occupy parks and gardens in urban zones as well as rural spaces.