Software Description
The DC-PPMA homepage, which shows the layout of its three modules (pipelines), is presented in Figure 2. In Module 1, ‘Creation of Custom-Database’, the protein sequences of interest are supplied in an editable text (.txt) file. The customized database is generated as output and saved as an Excel file. Specifically, the protein/peptide sequence database (or list of protein/peptide sequences) entered as input into Module 1 is converted into another database containing the m/z values of precursor ions and/or fragment ions. The role of Module 2, ‘Custom Database Search’, is to identify the peptide/protein hits for the queried m/z values within the database created by Module 1. Modules 1 and 2 are therefore interconnected to perform the peptide search and should be used together for MS/MS-based search and MS/MS data analysis. Module 3 functions independently for peptide mass fingerprinting (PMF): a ‘proteolytic peptide mass search’ is performed for the queried m/z values against the protein sequences (FASTA file) of a particular biological species. PMF is usually done with MALDI mass spectrometric data, which typically contain m/z values of singly protonated molecular ions of proteolytic peptides. However, Module 3 of DC-PPMA can also handle conventional ESI mass spectrometric data, which typically contain m/z values of multiply protonated ionic species of proteolytic peptides (depending on their length, amino acid composition and sequence). The output from Module 3 can therefore expedite ESI-MS-based PMF in addition to MALDI-MS-based PMF.
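The conversion performed by Module 1, from protein sequences to a table of precursor m/z values, can be illustrated with a minimal Python sketch. It is not the DC-PPMA implementation: the function names (`tryptic_digest`, `peptide_mz`, `build_ms_database`), the charge states considered, and the rule that trypsin does not cleave before proline are assumptions made for illustration, and missed cleavages and modifications are ignored. Only the standard monoisotopic residue masses and the relation m/z = (M + z × 1.007276)/z are taken as given.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # H2O accounting for the free N- and C-termini
PROTON = 1.007276  # mass of one proton


def tryptic_digest(sequence):
    """Cleave after K or R, except before P (assumed rule, no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide
    return peptides


def peptide_mz(peptide, charge):
    """m/z of the [M + zH]z+ ion of an unmodified peptide."""
    neutral_mass = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (neutral_mass + charge * PROTON) / charge


def build_ms_database(proteins, charges=(1, 2, 3)):
    """Return rows of (protein, peptide, charge, m/z) for every tryptic peptide."""
    rows = []
    for name, sequence in proteins.items():
        for peptide in tryptic_digest(sequence):
            for z in charges:
                rows.append((name, peptide, z, round(peptide_mz(peptide, z), 4)))
    return rows


if __name__ == "__main__":
    # Hypothetical toy sequence, used only to show the table layout.
    for row in build_ms_database({"toy_protein": "AGKRPLR"}, charges=(1, 2)):
        print(row)
```

Storing one row per peptide and charge state mirrors the requirement that a queried m/z value and its charge state are matched together; a real tool would additionally write the rows to a spreadsheet file.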
In DC-PPMA, the peptide search algorithm for MS and MS/MS data is designed so that both the m/z values and their respective charge states queried in Module 2 must match the values in the custom database obtained as output from Module 1. For an MS database search, a minimum of four queried m/z values must match the MS database created by Module 1. For an MS/MS search, a minimum of six queried m/z values must match a single (proteolytic) peptide of a protein in the custom MS/MS database created by Module 1. Peptide search can therefore be performed for both MS and MS/MS data, so that the experimentally observed m/z values can be interpreted both manually and with Module 2. The observed m/z values and their respective charge states (from either MS or MS/MS) are given as input in the form of a .txt file. Additionally, error width options are provided; the appropriate value should be chosen according to the mass resolution of the spectrometer used for data acquisition. The error width option can also help to reduce false positives in the output.
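The matching rule described above (m/z within the error width, identical charge state, and a minimum number of matched queries) can be sketched in Python as follows. This is an illustrative sketch rather than the DC-PPMA code: the function name `search_database`, the tuple layout of the database, the use of an absolute error width in m/z units, and the grouping key (a protein for MS data, a single peptide for MS/MS data) are all assumptions.

```python
from collections import defaultdict


def search_database(queries, database, tolerance, min_matches):
    """Report the groups matched by at least `min_matches` distinct queries.

    queries   : list of (mz, charge) pairs observed in the experiment
    database  : list of (group, mz, charge) entries; `group` is assumed to be
                a protein for MS data or a single peptide for MS/MS data
    tolerance : error width in m/z units (assumed to be absolute, not ppm)
    """
    matched = defaultdict(set)
    for index, (query_mz, query_charge) in enumerate(queries):
        for group, db_mz, db_charge in database:
            # Both the charge state and the m/z value must agree.
            if query_charge == db_charge and abs(query_mz - db_mz) <= tolerance:
                matched[group].add(index)
    return {
        group: len(indices)
        for group, indices in matched.items()
        if len(indices) >= min_matches
    }


if __name__ == "__main__":
    # Hypothetical fragment-ion entries for two peptides.
    database = [("peptide_A", 100.0 + i, 1) for i in range(6)]
    database += [("peptide_B", 200.0 + i, 2) for i in range(5)]
    queries = [(100.01 + i, 1) for i in range(6)] + [(200.0 + i, 2) for i in range(5)]
    # MS/MS-style search: at least six matches to one peptide are required.
    print(search_database(queries, database, tolerance=0.05, min_matches=6))
```

Counting distinct matched queries per group, rather than raw matches, prevents one observed value that falls within the error width of several database entries from inflating the score. A narrower `tolerance` reduces such coincidental matches, which is how the error width setting can lower the number of false positives.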
The performance of DC-PPMA was examined using 25 randomly chosen model protein sequences. For eight of these model proteins, experimental mass spectral data were considered under two different conditions: (i) standard trypsin digestion and (ii) trypsin digestion after arginine modification by two different reagents, 1,2-cyclohexanedione (CHD) and phenylglyoxal. First, the 25 selected protein sequences were entered into Module 1 as input in the form of a .txt file. Trypsin was chosen as the protease (enzyme), and carbamidomethylation was chosen in the modification tab of the Module 1 window. For these input parameters, MS and MS/MS databases were created and saved as Excel files (Figure 3). Subsequently, the observed m/z values from the experiments performed on an Agilent 6545 LC-MS Q-ToF were queried in Modules 2 and 3. All the matched tryptic peptides are shown in the Excel file output, which was verified by manual interpretation (Figure 4). Similarly, the custom modification option available in the Module 1 window (Figure 5) was tested by manually entering the molecular mass of CHD (112 Da), a reagent known to specifically modify arginine residues in proteins [19-21], and the respective custom databases for both MS and MS/MS were created. The CHD-modified peptides that matched the queried m/z values are shown in the output and were also confirmed manually. These results demonstrate the utility of DC-PPMA for targeted MS-based studies on proteins.
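The effect of the fixed and custom modification settings on a database entry can be illustrated with a short Python sketch. It assumes, without confirmation from the software, that the value entered for a custom modification is added once per target residue to the neutral peptide mass; the function name `modified_mz` and the example masses are hypothetical. The carbamidomethyl shift on cysteine (57.02146 Da) is the standard monoisotopic value, and 112.0 Da is the value stated above as entered for CHD.

```python
PROTON = 1.007276           # mass of one proton
CARBAMIDOMETHYL = 57.02146  # monoisotopic shift of carbamidomethylation on Cys


def modified_mz(unmodified_mass, peptide, mass_shifts, charge):
    """m/z of a peptide ion after adding per-residue mass shifts.

    unmodified_mass : neutral monoisotopic mass of the unmodified peptide
    mass_shifts     : dict mapping a residue letter to the shift added for
                      every occurrence (assumed behaviour, see lead-in)
    """
    total_shift = sum(mass_shifts.get(aa, 0.0) for aa in peptide)
    return (unmodified_mass + total_shift + charge * PROTON) / charge


if __name__ == "__main__":
    shifts = {"C": CARBAMIDOMETHYL, "R": 112.0}  # 112.0 Da as entered for CHD
    # Hypothetical peptide with a placeholder neutral mass of 100.0 Da.
    print(modified_mz(100.0, "ACR", shifts, charge=1))
    print(modified_mz(100.0, "ACR", shifts, charge=2))
```

Keeping the shifts in a dictionary keyed by residue lets a fixed modification and a user-defined modification be applied through the same code path.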