Introduction

Novel automation or Artificial Intelligence (AI) applications for information retrieval (AI search tools) have the potential to expedite evidence synthesis tasks, such as study identification or search strategy development.1,2 These AI search tools, including generative AI Large Language Model (LLM) chatbots, require performance testing and validation to inform implementation recommendations for Canada's Drug Agency (CDA-AMC) and other evidence synthesis producers.

There is a long history of automation in the information sciences,3 and we recognize AI as an umbrella term for technology tools that perform tasks that would ordinarily require biological brainpower to accomplish.4 Our definition of AI search tools is deliberately broad, to include successive generations of technologies for information retrieval and to recognize the potential value of older, developing, emerging, or novel tools.

In an earlier project phase, Research Information Services (RIS) at CDA-AMC evaluated 51 AI search tools using our evaluation instrument.5 In this subsequent phase, RIS evaluated the performance of 3 top-ranked tools:

Lens.org ("The Lens"): https://www.lens.org
SpiderCite: https://sr-accelerator.com/#/spidercite
Microsoft Copilot: https://www.microsoft.com/en-ca/microsoft-copilot

Our objective in testing these tools was to determine whether they should replace or supplement current RIS information retrieval methods, given the estimated contribution of eligible and unique studies and the resources needed to conduct searches and screen the results. Because these tools use different automation technologies to perform different search tasks, our investigation also aimed to compare their distinct strengths and weaknesses.