Thomas Radcliffe

Public Documents 1
Automated Prompt Engineering for Semantic Vulnerabilities in Large Language Models
Thomas Radcliffe
Emily Lockhart

and 2 more

August 12, 2024
AI-driven conversational systems have been deployed rapidly in recent years and now interact directly with users across many industries. Semantic attacks, which exploit subtle weaknesses in how these systems interpret language, are becoming more sophisticated and pose significant risks to reliability and security. This article introduces a novel approach to evaluating the susceptibility of ChatGPT-4o to such semantic manipulation through the systematic design and testing of adversarial prompts. Using a methodology that combines quantitative and qualitative analyses, the research uncovers critical weaknesses in the model's ability to maintain accuracy, coherence, and unbiased responses under adversarial conditions. The findings highlight the need for more advanced security measures and offer insights into how prompt design and model refinement can mitigate these vulnerabilities. The research provides a foundation for developing more resilient AI systems that can withstand increasingly complex linguistic threats in real-world applications.
