AUTHOREA
Bidirectional Long Short-Term Memory and Recurrent Neural Network Model for Speech Recognition
Mercy Kimani, Lawrence Nderu, and 3 more

July 05, 2023
Speech-to-text is essential because it converts spoken words into text, making speech easy to store. A basic speech-recognition model has four stages: signal pre-processing, feature extraction, feature selection, and modeling. A substantial body of literature documents efforts to improve speech-recognition results; however, work remains on reducing the word error rate and improving accuracy on a continuous input stream without increasing the required bandwidth. This research evaluates recurrent neural networks, long short-term memory networks, gated recurrent units, and bidirectional long short-term memory, and further tests performance after introducing a bias to the long short-term memory. It then proposes a bidirectional long short-term memory recurrent neural network model. Experimental results demonstrate that even with a bias of one on the long short-term memory, the bidirectional long short-term memory recurrent neural network model still achieves better results, with a word error rate of 8.92%, an accuracy of 91.08%, and a mean edit distance of 0.1910 on the LibriSpeech training dataset. Future work will evaluate transformer models for reducing the word error rate and improving accuracy on a continuous input stream.
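The word error rate and mean edit distance reported above are both derived from the Levenshtein (edit) distance between a reference transcript and the recognizer's output. A minimal sketch of how these metrics are typically computed — the function names are illustrative, not the authors' evaluation code:

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (dynamic programming)."""
    m, n = len(ref), len(hyp)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i  # cost of deleting all of ref[:i]
    for j in range(n + 1):
        d[0][j] = j  # cost of inserting all of hyp[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost)  # substitution / match
    return d[m][n]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance over reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    return edit_distance(ref, hyp) / len(ref)
```

For example, `wer("the cat sat", "the cat sit")` gives one substitution over three reference words, i.e. 1/3; accuracy is then `1 - wer`, matching the paper's 8.92% / 91.08% split.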
