October 09, 2024
An Adaptive Compute Approach to Optimize Inference Efficiency in Large Language Model...
James Lesatod, Jonathan Rivera, Lucas Kowalski, et al.