2.1. Plackett–Burman design and BioLector growing conditions
The effect of eleven factors on bacterial growth was screened with a Plackett–Burman design (PBD) generated in Design-Expert 7.0.0 (Stat-Ease Inc.), which yielded 16 trial configurations (Supplementary Table 1). The aim was to examine the growth kinetics of the Rosetta™ 2 (DE3) modified insulin strain in the BioLector system (m2p-labs, Baesweiler, Germany). LB broth was used at two concentrations, 30 g/L or 50 g/L, and the 16-run DoE-PBD was executed in triplicate on a single plate (Fig. 1). The cells were inoculated into each well and incubated for 4 h, and the experiments were conducted under constant agitation (800 rpm) at 37°C in 48-well FlowerPlates (m2p-labs, Baesweiler, Germany) with a working volume of 1000 µL. Scattered light intensity was monitored continuously to follow growth kinetics in real time (Fig. 1). Induction with isopropyl β-D-1-thiogalactopyranoside (IPTG) was then performed in each well at the final concentration specified by the DoE-PBD parameters (0.2 mM, 0.3 mM, or 0.4 mM) and carried out overnight at 37°C. The final insulin concentration (g/L), measured with a UV/Vis spectrophotometer at 280 nm and analyzed by SDS-PAGE, was taken as the response variable and recorded in the PBD table (Supplementary Table 1; Fig. 1; Supplementary Fig. 1).
Statistical analysis of the data showed that the fitted model was statistically significant (p-value of 0.05), with a coefficient of determination (R²) of 85.37% (Table 1). The model terms MgSO4, glycerol, glucose, and LB broth concentration were identified as influential factors, each with a p-value below the conventional significance threshold of 0.05. These terms therefore had a substantial effect on the response variable.
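To make the screening layout concrete, the sketch below builds a 16-run, 11-factor two-level design in coded units (-1 for the low level, +1 for the high level). It is an illustration only: for run counts that are powers of two, a Plackett–Burman design is equivalent to a regular two-level fractional factorial, so the sketch uses a Sylvester-type Hadamard matrix. The function names are hypothetical, and the column assignment and run order produced by Design-Expert for this study (Supplementary Table 1) may differ.

```python
def sylvester_hadamard(order):
    """Return a +1/-1 Hadamard matrix of size `order` (must be a power of two)."""
    if order < 1 or order & (order - 1):
        raise ValueError("order must be a power of two")
    h = [[1]]
    while len(h) < order:
        # Doubling step: [[H, H], [H, -H]]
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def screening_design(n_runs=16, n_factors=11):
    """Return an n_runs x n_factors design matrix in coded units (+1/-1).

    Assumption: columns are taken in order after dropping the all-ones
    column; real software may pick and order columns differently.
    """
    if n_factors > n_runs - 1:
        raise ValueError("at most n_runs - 1 factors can be screened")
    h = sylvester_hadamard(n_runs)
    return [row[1:1 + n_factors] for row in h]


design = screening_design()  # 16 runs, 11 factors
```

Every column of such a design is balanced (eight runs at each level) and orthogonal to every other column, which is what allows the main effect of each factor to be estimated independently in only 16 runs.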
In addition, the F-statistic was used to measure the strength of association between each term and the response variable; higher F-values corresponded to stronger associations.
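As a rough illustration of how a per-term F-value arises in a two-level orthogonal design, the sketch below computes main effects and F-values from a coded design matrix. The helper names, the small 2³ example design, and the response values are all hypothetical and are not the study's data; the calculation assumes balanced, mutually orthogonal ±1 columns and a main-effects-only model.

```python
from itertools import product


def main_effects(design, response):
    """Main effect per column: mean(y at +1) minus mean(y at -1)."""
    n = len(response)
    k = len(design[0])
    return [
        sum(row[j] * y for row, y in zip(design, response)) / (n / 2)
        for j in range(k)
    ]


def f_values(design, response):
    """F-value per term = (term sum of squares) / (residual mean square)."""
    n = len(response)
    k = len(design[0])
    effects = main_effects(design, response)
    mean_y = sum(response) / n
    ss_total = sum((y - mean_y) ** 2 for y in response)
    # For orthogonal +1/-1 columns each term has 1 df and SS = n * effect^2 / 4
    ss_terms = [n * e ** 2 / 4 for e in effects]
    df_resid = n - 1 - k
    ms_resid = (ss_total - sum(ss_terms)) / df_resid
    return [ss / ms_resid for ss in ss_terms]


# Hypothetical 2^3 example: factor a is strong, b is inert, c is weak;
# the a*b*c term plays the role of unexplained variation.
design = list(product((-1, 1), repeat=3))
response = [10 + 3 * a + c + 0.5 * a * b * c for a, b, c in design]

effects = main_effects(design, response)  # [6.0, 0.0, 2.0]
fvals = f_values(design, response)        # [144.0, 0.0, 16.0]
```

The factor with the largest effect also has the largest F-value, which is the sense in which a higher F-value indicates a stronger association with the response.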
The half-normal plot is a graphical tool for separating active effects from noise in a screening design: the absolute standardized effects are plotted against their half-normal probabilities (Zahn et al., 1975). Effects that arise from random variation alone fall close to a straight line through the origin, whereas effects that lie well away from that line are unlikely to be due to chance. In Fig. 2, the standardized effects of MgSO4, glycerol, glucose, and LB broth concentration sit at higher half-normal percentage probabilities, and further from the line, than those of the other variables, which indicates their greater relative significance.
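The coordinates of a half-normal plot of effects can be sketched as follows: the absolute effects are ranked, and the i-th smallest of m effects is paired with the half-normal quantile at probability (i - 0.5)/m. This plotting position is one common convention and is an assumption here; Design-Expert may use a slightly different one. The function name and the effect values in the example are hypothetical.

```python
from statistics import NormalDist


def half_normal_points(effects):
    """Return (name, |effect|, half-normal quantile) tuples, smallest effect first.

    `effects` maps a factor name to its (standardized) effect.
    """
    ranked = sorted(effects.items(), key=lambda item: abs(item[1]))
    m = len(ranked)
    std_normal = NormalDist()
    points = []
    for i, (name, effect) in enumerate(ranked, start=1):
        p = (i - 0.5) / m  # plotting position in (0, 1)
        # Half-normal quantile: quantile of |Z| for standard normal Z
        q = std_normal.inv_cdf(0.5 + 0.5 * p)
        points.append((name, abs(effect), q))
    return points


# Hypothetical effects, not values from the study
example = half_normal_points({"A": 0.1, "B": -2.0, "C": 0.4, "D": -0.2})
for name, size, quantile in example:
    print(f"{name}: |effect| = {size:.2f}, half-normal quantile = {quantile:.3f}")
```

Plotting the absolute effect against the quantile, inactive factors cluster along a line through the origin while active factors appear as points far from it.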
A Pareto chart likewise displays the standardized effect of each independent variable on the response variable (Kenett, 1991). The significance of the factors is judged against two statistical thresholds, the Bonferroni limit and the t-value limit, which are 3.898 and 2.306, respectively, at a significance level (α) of 0.05. The chart shows the magnitude of each standardized effect as a bar; the four largest bars correspond to MgSO4, glycerol, glucose, and LB broth concentration, confirming that these factors have the greatest influence on the response. Consequently, these four variables were selected for optimization of the experimental design using the central composite design (CCD) approach, a widely used response surface methodology (RSM) technique (Fig. 3).
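For reference, the two thresholds on a Pareto chart of effects are usually defined as quantiles of Student's t distribution. The expressions below are the standard definitions rather than something stated in this section; ν (the residual degrees of freedom) and m (the number of effects being tested) are symbols introduced here for illustration.

```latex
t_{\text{limit}} = t_{\,1-\alpha/2,\;\nu},
\qquad
t_{\text{Bonferroni}} = t_{\,1-\alpha/(2m),\;\nu}
```

With α = 0.05, a t-value limit of 2.306 corresponds to ν = 8, which suggests eight residual degrees of freedom in the fitted model. The Bonferroni limit is always the larger of the two because it divides α across all m tested effects: a bar above it is significant even after correcting for multiple comparisons, while a bar between the two limits is significant only on an individual-test basis.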
The perturbation plot is a useful tool for comparing the effects of several factors at a given point in the design space (Bonnans and Shapiro, 2013). It shows how the response changes as a single factor is varied across its entire range while all other factors are held constant (Bonnans and Shapiro, 2013). Perturbation plots were used here to examine how the direct and interactive effects of the independent variables influence insulin yield. The perturbation plot (Fig. 4) shows the direct effects of variables C, D, E, and H on insulin yield.
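The logic of a perturbation plot can be sketched in a few lines: starting from a reference point (here the center of the design in coded units), one factor at a time is swept from -1 to +1 while the others stay fixed, and the predicted response is recorded. The model coefficients below are invented purely for illustration and are not the fitted values from this study; only the factor labels C, D, E, and H are taken from the text.

```python
def perturbation_traces(predict, factor_names, reference=None, steps=5):
    """Sweep each coded factor from -1 to +1 with the others held at `reference`.

    Returns {factor: [(level, predicted response), ...]}.
    """
    if reference is None:
        reference = {name: 0.0 for name in factor_names}  # design center
    levels = [-1 + 2 * i / (steps - 1) for i in range(steps)]
    traces = {}
    for name in factor_names:
        trace = []
        for level in levels:
            point = dict(reference)
            point[name] = level  # perturb only this factor
            trace.append((level, predict(point)))
        traces[name] = trace
    return traces


def example_model(x):
    """Hypothetical first-order model in coded units (coefficients are made up)."""
    return 1.0 + 0.30 * x["C"] + 0.20 * x["D"] - 0.10 * x["E"] + 0.15 * x["H"]


traces = perturbation_traces(example_model, ["C", "D", "E", "H"])
for factor, trace in traces.items():
    print(factor, [round(y, 3) for _, y in trace])
```

A steep trace marks a factor to which the response is sensitive at the reference point, and a flat trace marks an insensitive one. Because only one factor moves at a time, the traces show direct effects; interactions would only appear as a change in a trace's shape when the reference point is moved.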