MelSpectroNet: Enhancing Voice Authentication Security with AI-based Siamese Model and Noise Reduction for Seamless User Experience
- Gitesh Kambli,
- Jay Oza,
- Amit Maity
Gitesh Kambli
K. J. Somaiya Institute of Technology
Jay Oza
K. J. Somaiya Institute of Technology
Corresponding Author: jayoza198@gmail.com
Amit Maity
K. J. Somaiya Institute of Technology
Abstract
Voice authentication has become critical for access control that is both secure and usable. Background noise and rising security requirements, however, remain open problems. This paper presents MelSpectroNet, a voice authentication system built on a Siamese neural network trained on over one million samples. It uses mel spectrograms for efficient feature extraction and applies noise reduction to improve reliability. The model achieves 96.62% test accuracy. Our methodology comprises audio denoising, careful spectrogram preprocessing, a tailored Siamese architecture, and rigorous training. Testing demonstrates MelSpectroNet's strong performance and its ability to generalize. However, improving longitudinal accuracy by accounting for natural voice variation over time still needs exploration. Overall, MelSpectroNet delivers highly accurate and usable voice authentication with enhanced security, balancing user convenience against stringent authentication needs. This research motivates further work to optimize such systems for diverse conditions while advancing security and inclusiveness.
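The two ideas named in the abstract, mel-scaled features and a Siamese comparison, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the HTK-style mel formula, the toy linear encoder, the hypothetical helper names (`mel_band_edges`, `embed`, `same_speaker`), the Euclidean distance, and the threshold. The abstract does not specify MelSpectroNet's actual layers, distance function, or decision rule.

```python
import math


def hz_to_mel(f_hz):
    """HTK-style mel scale (assumed variant): m = 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_band_edges(n_mels, f_min, f_max):
    """Return n_mels + 2 band edges in Hz, equally spaced on the mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_mels + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_mels + 2)]


def embed(features, weights):
    """Toy shared encoder: one linear layer followed by ReLU.

    The defining Siamese property is that the SAME weights encode both inputs.
    """
    return [max(0.0, sum(w * x for w, x in zip(row, features))) for row in weights]


def distance(a, b):
    """Euclidean distance between two embeddings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def same_speaker(feat_a, feat_b, weights, threshold):
    """Accept the pair when the embeddings are closer than a (hypothetical) threshold."""
    return distance(embed(feat_a, weights), embed(feat_b, weights)) < threshold


# Usage with made-up feature vectors and identity weights.
weights = [[1.0, 0.0], [0.0, 1.0]]
enrolled = [1.0, 2.0]
print(same_speaker(enrolled, [1.0, 2.0], weights, threshold=0.5))
print(same_speaker(enrolled, [4.0, 6.0], weights, threshold=0.5))
```

In a real system the feature vectors would be denoised mel spectrograms and the encoder a trained network; the sketch only shows how shared weights and a distance threshold combine into an accept or reject decision.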