In this original study, we investigate the performance of machine learning algorithms on a neonatal sepsis detection task. We consider this work to be of interest to both engineers and clinicians, as it uses non-invasive, already available vital sign monitoring signals in a population of very low birth weight infants at high risk of sepsis. Vital sign variability may represent a general indicator of health and wellbeing and may help in the early detection of systemic inflammation such as sepsis. We use state-of-the-art feature extraction techniques and evaluate a wide variety of binary classification models, including a neural-network-based generative model. The models were chosen from two main families, discriminative and generative, which enables a comprehensive study of traditional and advanced binary classification algorithms. Our study reveals that advanced machine learning models are more robust to changes in the feature extraction pipeline, although linear classifiers achieve comparable performance when the feature extraction is tuned. The best-performing advanced model is a neural-network-based generative model that is a hybrid of generative and discriminative modeling. A large window length when computing the features benefits almost all algorithms, indicating the relevance of frequency-domain features for the neonatal sepsis detection task. Overall, we obtain a classification area under the receiver operating characteristic curve (AUROC) above 0.85, which makes our prediction models potentially relevant in clinical practice. Such models could enable earlier therapeutic interventions and thereby help reduce morbidity and mortality in infants.