m5C-TNKmer: Identification of 5-Methylated Base Cytosine of
Ribonucleic Acid Using Supervised Machine Learning Techniques
Abstract
5-methylcytosine (m5C) is a widely known epigenetic
modification found in many types of RNA. Methyltransferases catalyze the formation of m5C.
m5C sites in RNA play a crucial role in many biological activities. For
many years, the biosynthesis and biological role of m5C sites in DNA
have been a focus of research. Recently, many characteristics of RNA m5C
sites have been discovered, but their study is still considered to be in its infancy.
The accurate and systematic detection and classification of m5C sites
remains a challenging task. The presence of m5C sites plays an
important role in numerous biological processes. Machine learning
techniques offer an alternative to laboratory experiments and can ease
the identification of m5C sites in Homo sapiens. This article presents a novel
computational model named m5C-TNKmer for identifying m5C sites in RNA
sequences. The model is built on the k-mer feature extraction
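technique. Four sub-datasets are derived from the primary dataset:
dinucleotide composition (DNC), trinucleotide composition (TNC),
tetranucleotide composition (tetra-NC), and pentanucleotide composition
(penta-NC). The results show that m5C-TNKmer achieved 96.15% accuracy. The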
proposed technique is a promising one that will help scientists
correctly identify RNA m5C sites and their
modifications. It provides a clue to a better understanding of genetic
function and regulatory roles.
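
To make the k-mer feature extraction step concrete, the following is a minimal sketch of how the four feature sets (DNC, TNC, tetra-NC, penta-NC) could be computed from a single RNA sequence. The abstract does not specify the exact encoding, so the normalized-frequency definition, the `ACGU` alphabet, and the names `kmer_composition`, `FEATURE_SETS`, and `encode` are assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGU"  # assumption: standard RNA alphabet

# Assumed mapping from the feature-set names in the abstract to k.
FEATURE_SETS = {"DNC": 2, "TNC": 3, "tetra-NC": 4, "penta-NC": 5}


def kmer_composition(seq, k):
    """Return normalized frequencies of all 4**k k-mers in lexicographic order.

    Windows containing characters outside ALPHABET still count toward the
    total number of windows but contribute to no feature.
    """
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    total = len(seq) - k + 1             # number of overlapping windows
    if total <= 0:
        raise ValueError("sequence is shorter than k")
    counts = Counter(seq[i:i + k] for i in range(total))
    return [counts["".join(p)] / total for p in product(ALPHABET, repeat=k)]


def encode(seq):
    """Build one feature vector per feature set for a single sequence."""
    return {name: kmer_composition(seq, k) for name, k in FEATURE_SETS.items()}


features = encode("ACGUACGUACGUACGUACGUA")
```

Under these assumptions the four vectors have lengths 16, 64, 256, and 1024, and each could be passed to a supervised classifier as a separate sub-dataset.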