AUTHOREA
Juan Xiang

Public Documents 1
Local Large Language Model-Assisted Literature Mining for On-Surface Reaction
Juan Xiang
Yizhang Li

and 4 more

December 10, 2024
Large language models (LLMs) excel at extracting information from the literature. However, deploying LLMs requires substantial computational resources, and security concerns with online LLMs hinder their wider application. Herein, we introduce a method for extracting scientific data from unstructured text using a local large language model (local LLM), and demonstrate it on the scientific literature on on-surface reactions. By combining prompt engineering and multi-step text preprocessing, we show that the local LLM can effectively extract scientific information, achieving a recall of 91% and a precision of 70%. Moreover, despite significant differences in model parameter size, the performance of the local LLM is comparable to that of GPT-3.5 Turbo (97% recall, 79% precision) and GPT-4o (94% recall, 82% precision). The simplicity, versatility, reduced computational requirements, and enhanced privacy of the local LLM make it highly promising for data mining, with the potential to accelerate the application and development of LLMs across various fields.
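
For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how precision and recall are conventionally computed for an extraction task. It is not the authors' evaluation code: the abstract does not state their matching criteria, so exact set membership is assumed here, and the function name `precision_recall` and the example records are hypothetical.

```python
def precision_recall(extracted, reference):
    """Return (precision, recall) of extracted items against a reference set.

    Assumption: an extracted item counts as correct only if it matches a
    reference item exactly. The paper may use a different matching rule.
    """
    extracted, reference = set(extracted), set(reference)
    true_positives = len(extracted & reference)
    # Precision: fraction of extracted items that are correct.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: fraction of reference items that were found.
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical (precursor, substrate) records, for illustration only.
reference = [("A", "Au(111)"), ("B", "Ag(111)"), ("C", "Cu(111)"), ("D", "Au(111)")]
extracted = reference + [("E", "Cu(111)")]  # one spurious extraction

precision, recall = precision_recall(extracted, reference)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

In this toy case every reference record is found but one extracted record is spurious, so recall is higher than precision, the same direction as the figures reported in the abstract.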
