Large language models (LLMs) have become increasingly useful for predicting sentiment in financial text such as social media posts, tweets, and news articles. However, most pre-trained LLMs are proprietary or too large for classification purposes. This study aims to help finance AI developers assess the suitability of different LLMs for financial sentiment analysis tasks. The study used three datasets (balanced and imbalanced) and three types of LLMs (text classification, text-to-text generation, and text generation). Fine-tuning of the chosen LLMs was performed within resource constraints, and techniques such as instruction tuning were explored to improve performance. Small LLMs designed for classification tasks recorded the best accuracy and F1-score, but showed clear signs of overfitting. Large LLMs, particularly those of the text-to-text generation type, performed more promisingly on large datasets. The study shows that small LLMs train well on small datasets, but their performance degrades as dataset diversity and size increase, especially on imbalanced datasets.
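The abstract evaluates models by both accuracy and F1-score, which matters because on imbalanced datasets the two can diverge sharply. A minimal, self-contained sketch (toy labels and a hypothetical majority-class predictor, not data from the study) illustrates why macro-averaged F1 exposes failures that accuracy hides:

```python
def accuracy(y_true, y_pred):
    # Fraction of predictions that match the gold label.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred, labels):
    # Per-class F1, averaged with equal weight per class,
    # so minority classes count as much as the majority class.
    f1s = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

# Imbalanced toy set: 8 neutral, 1 positive, 1 negative example.
y_true = ["neu"] * 8 + ["pos", "neg"]
# A degenerate classifier that always predicts the majority class.
y_pred = ["neu"] * 10

acc = accuracy(y_true, y_pred)                        # 0.8 — looks strong
f1 = macro_f1(y_true, y_pred, ["neg", "neu", "pos"])  # ~0.296 — reveals the failure
```

Here a classifier that never predicts the minority sentiment classes still scores 80% accuracy, while macro F1 drops below 0.3, which is why both metrics are reported when comparing models on imbalanced financial datasets.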