Characterizing viral samples using machine learning for Raman and
absorption spectroscopy
Abstract
Machine learning methods can provide valuable information for
analyzing biological samples in the pharmaceutical industry, for example
by predicting the concentration of viral particles of interest in
biological samples. Here, we utilized both
convolutional neural networks (CNNs) and random forests (RFs) to predict
the concentration of samples containing measles, mumps, rubella, and
varicella-zoster viruses (ProQuad®) based on Raman and absorption
spectroscopy. We prepared Raman and absorption spectra datasets with
known concentration values, then used the Raman and absorption signals
individually and together to train RFs and CNNs. We demonstrated that
both RFs and CNNs can make predictions with R² values as high as 95%.
We proposed two different networks that jointly use the Raman and
absorption spectra, and our results demonstrated that concatenating
the Raman and absorption data increases the prediction accuracy compared
to using either the Raman or the absorption spectrum alone. We further
verified the advantage of the joint Raman-absorption data with
principal component analysis (PCA). Furthermore, our method can be
extended to characterize properties other than concentration, such as
the type of viral particles.
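
As a minimal illustration of two ideas mentioned above, the sketch below joins a Raman and an absorption spectrum into one feature vector and computes the R² score used to judge concentration predictions. The function names and all numbers are hypothetical placeholders, not data or code from this study, and the sketch omits the RF and CNN models themselves.

```python
# Illustrative sketch only: toy values, not measurements from the study.

def concatenate_spectra(raman, absorption):
    """Join one sample's Raman and absorption intensities into one feature vector."""
    return list(raman) + list(absorption)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical spectra for a single sample (real spectra have many more points).
raman = [0.12, 0.40, 0.33]
absorption = [0.80, 0.65]
features = concatenate_spectra(raman, absorption)

# Hypothetical known concentrations and model predictions (arbitrary units).
known = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
score = r2_score(known, predicted)
print(f"R^2 = {score:.2f}")  # prints "R^2 = 0.98"
```

In practice, the concatenated vector would be the input to a regression model, and R² would be computed on held-out samples with known concentrations.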