AUTHOREA
Dynamic Contextual Layer Pruning for Optimal Computational Resource Utilization
Nelson Pinion and 4 more

November 12, 2024
The exponential growth in the scale and complexity of language models has created significant computational challenges, motivating techniques that preserve efficiency without sacrificing performance. Dynamic Contextual Layer Pruning (DCLP) is a technique that dynamically adjusts layer activation based on input complexity, optimizing resource utilization while preserving model efficacy. Implementing DCLP within a large language model framework reduced processing time, memory usage, and energy consumption, while also improving performance metrics such as perplexity and accuracy. Comparative analyses show that DCLP outperforms traditional static pruning methods, highlighting its adaptive capabilities and potential for broad applicability across diverse natural language processing tasks. These findings underscore DCLP's promise as a transformative approach to improving the efficiency and adaptability of large language models.
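The abstract does not specify how input complexity is measured or how it maps to active layers. A minimal illustrative sketch of the general idea — not the authors' actual method — might score an input by the entropy of its next-token distribution and run only a proportional prefix of the layer stack; the function names and the linear complexity-to-depth mapping below are assumptions for illustration:

```python
import math

def input_complexity(token_probs):
    # Shannon entropy of a next-token distribution, used here as a
    # crude proxy for how "hard" the input is (assumption, not the
    # paper's stated metric).
    total = sum(token_probs)
    probs = [p / total for p in token_probs]
    return -sum(p * math.log2(p) for p in probs if p > 0)

def select_active_layers(complexity, num_layers, max_complexity):
    # Map complexity in [0, max_complexity] to a layer count,
    # always keeping at least one layer active.
    frac = min(complexity / max_complexity, 1.0)
    return max(1, round(frac * num_layers))

def forward(x, layers, complexity, max_complexity):
    # Run only the first k layers chosen by the complexity score;
    # the remaining layers are skipped (pruned) for this input.
    k = select_active_layers(complexity, len(layers), max_complexity)
    for layer in layers[:k]:
        x = layer(x)
    return x, k

# Usage: a uniform distribution over 4 tokens has entropy 2.0 bits;
# with a 12-layer stack and a 4.0-bit ceiling, half the layers run.
layers = [lambda v: v + 1] * 12
entropy = input_complexity([0.25, 0.25, 0.25, 0.25])
output, layers_used = forward(0, layers, entropy, max_complexity=4.0)
```

Real implementations would more likely use a learned gating network per layer rather than a single global score, but the sketch captures the core trade-off: simpler inputs traverse fewer layers, saving compute.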
