AUTHOREA
Cherng-Liin Yong

Public Documents 1
Robust Optimization of Deep Learning Models using Spectral Proximal Method and Salien...
Cherng-Liin Yong, Ban-Hoe Kwan, and 3 more

January 12, 2023
Model generalization refers to a model’s ability to perform well on unseen data. In this paper, we present the Spectral Proximal (SP) method with a saliency matrix, a training technique for deep learning models that aims to improve their generalization ability. The SP method addresses two challenges that can hinder generalization: gradient confusion in deep model structures and the scarcity of training data. It uses a damping matrix to correct errors in the descent direction and a proximal operator with a saliency matrix to prevent overfitting. This improves performance on image classification (MNIST and CIFAR-10) and detection (YOLOv7) tasks and yields better generalization on unseen data. We investigated the method through experiments on a diverse range of setups, controlling for potential confounding variables. Across these experiments, the SP method outperformed the baseline method in the majority of cases.
