AUTHOREA
Jason Hargreaves
Boost Large Language Model Performance through Self-Training with Reward Guided Tree...
Jason Hargreaves, Emma Fairweather, and 2 more

July 16, 2024
Large language models have transformed natural language processing by enabling sophisticated language generation, yet improving their performance through autonomous learning remains a persistent challenge. Integrating Process Reward Guided Tree Search into the GPT-Neo architecture offers a significant advance by allowing the model to self-train and optimize its performance without extensive human intervention. The modified model showed substantial improvements across several metrics: lower perplexity, higher BLEU scores for translation accuracy, and stronger ROUGE scores for summarization quality. A dual reward system provided a comprehensive evaluation of generated text, promoting balanced gains in coherence, relevance, and lexical diversity. Extensive experiments validated the proposed methodology, with the modified GPT-Neo performing robustly across diverse NLP tasks and benchmarks. These findings underscore the potential of autonomous learning mechanisms to significantly advance the capabilities of large language models, paving the way for future innovations in natural language processing.
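The abstract does not give implementation details, but the core idea of reward-guided tree search over generation candidates can be illustrated in miniature. The sketch below is an assumption-laden toy, not the paper's method: `process_reward` stands in for a learned process reward model, and the `expand` callback stands in for sampling continuations from a language model. Candidate sequences are expanded level by level and pruned to the highest-reward branches, beam-search style.

```python
import heapq

def process_reward(sequence):
    # Stand-in for a learned process reward model (hypothetical scoring
    # rule): favor lexical diversity (unique tokens) with a small bonus
    # for length. A real reward model would score coherence/relevance.
    return len(set(sequence)) + 0.1 * len(sequence)

def reward_guided_tree_search(vocab, expand, max_depth=3, beam_width=2):
    """Grow a tree of candidate continuations, keeping only the
    beam_width highest-reward partial sequences at each depth."""
    frontier = [[]]  # root of the tree: the empty sequence
    for _ in range(max_depth):
        # Expand every surviving branch by one token.
        children = [seq + [tok] for seq in frontier for tok in expand(seq, vocab)]
        # Prune: retain only the best-scoring branches under the reward.
        frontier = heapq.nlargest(beam_width, children, key=process_reward)
    # Return the single highest-reward completed sequence.
    return max(frontier, key=process_reward)
```

With a toy vocabulary and an `expand` that offers every token at every step, the search prefers branches that avoid repetition, since the stand-in reward penalizes duplicate tokens relative to diverse ones:

```python
best = reward_guided_tree_search(["a", "b", "c"], lambda seq, v: v)
# best is a length-3 sequence of distinct tokens
```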
