Preparation of the protein structures
The S glycoprotein sequences of SARS-CoV-2 were retrieved from the National Center for Biotechnology Information (NCBI) database. For the N protein, 1731 full-length SARS-CoV-2 sequences were downloaded from NCBI (30 April 202, txid2697049, minimum length = 29,000 bp), stored as a FASTA file for analysis, and aligned with the MAFFT tool using NC_045512.2 as the coordinate reference. (2-5,6) To select a small subset representative of the protein flexibility, we aligned the 31 clustered conformations (1,2,35), out of a total of 40 states in the NMR-derived COVID-19-associated protein structures (PDB codes 6xs6, 1xak, 2g9t, 3fqq, 2ghv, 6yb7) (1,3,4), in which Glu174 and Glu166 are present in the opened RBD-down conformation. The BioEdit v7.2.3 sequence alignment editor was used to identify the conserved binding sites and short linear peptide regions among the aligned sequences through multiple-sequence alignment (MSA) with ClustalW. The conserved alignment was generated, visually inspected, and curated with all nucleotides preserved, using GenBank NC_045512.2 as the coordinate reference against genomes such as the ball python genome. We then proceeded with short linear motifs, e.g. RSFIEDLLFNKV and KNFIDLLLAGF, identified between the Wuhan isolate and the spike protein of the shingleback nidovirus 1, for model construction, again using NC_045512.2 together with its annotated open reading frames (ORFs) and additional ORFs. We then submitted the SARS-CoV-2 protein structures as PDB-format files (PDB codes 6xs6, 1xak, 2g9t, 3fqq, 2ghv, 6yb7) (1,3-10) to the DockThor-VS online docking platform as potential therapeutic drug targets (5-11,12) for the design of our new druggable scaffold, named Roccustyrna. (13,14,15) For this purpose, we initially selected the non-structural proteins Nsp3 (PLpro domain), Nsp5, Nsp15 (endoribonuclease), and Nsp12 (RdRp) (5-16,17), and the structural proteins spike and nucleocapsid (N protein).
(9,10-21) We then clustered the opened conformation states (31 aligned out of 40 conserved states) (9,17,22) with the Conformer Cluster web server tool (6,14,16), according to the positions of residues Arg102, Glu166, and Tyr109 (5,7,8), using single linkage on the weighted sum of the centroid distances. (7,8-12,22) Finally, for each cluster, the structure nearest to the pair-group centroid was selected as the representative conformation of that group and made available in BiogenetoligandorolTM. In this article, we use an AI decision tree and neural networks to pick an optimal, quantum-walk-derived number of small active chemical features from a collection of hundreds, jointly with cumulative docking free-energy features and a ranking method that takes input toxicity values and both the network and decision-tree parameters into account. We prepared the protein structures using the Protein Preparation Wizard from BiogenetoligandorolTM (BiogenetoligandorolTM, SynthocureTM, Thessaloniki, Biogenea Pharmaceuticals Ltd-GR, 2020). (8,13,17,22) Hydrogen-bond optimization and protonation assignment were performed with the publicly available PROPKA and ProtAssign software at the reported experimental pH (2-11,17), taking into account the bound small molecule when one was present (8-12,15,19).
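The clustering step described above (single linkage on a weighted sum of residue-centroid distances, followed by picking the member nearest to each cluster centroid) can be illustrated with a minimal sketch. This is not the Conformer Cluster server's implementation: the function names, the equal weights, and the toy coordinates are all assumptions made for illustration, and real input would be residue centroids extracted from the aligned conformations.

```python
import math
from itertools import combinations

# Hypothetical equal weights for the three residues named in the text.
WEIGHTS = {"Arg102": 1.0, "Glu166": 1.0, "Tyr109": 1.0}


def conformer_distance(a, b, weights):
    """Weighted sum of distances between matching residue centroids."""
    return sum(w * math.dist(a[r], b[r]) for r, w in weights.items())


def single_linkage(conformers, weights, n_clusters):
    """Agglomerative clustering; cluster distance = closest pair of members."""
    clusters = [[i] for i in range(len(conformers))]
    while len(clusters) > n_clusters:
        best = None
        for x, y in combinations(range(len(clusters)), 2):
            d = min(
                conformer_distance(conformers[i], conformers[j], weights)
                for i in clusters[x]
                for j in clusters[y]
            )
            if best is None or d < best[0]:
                best = (d, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]  # merge y into x
        del clusters[y]                          # y > x, so x stays valid
    return clusters


def representative(cluster, conformers, weights):
    """Index of the member closest to the cluster's mean residue positions."""
    centroid = {
        r: tuple(
            sum(conformers[i][r][k] for i in cluster) / len(cluster)
            for k in range(3)
        )
        for r in weights
    }
    return min(
        cluster,
        key=lambda i: conformer_distance(conformers[i], centroid, weights),
    )


def make_conformer(x):
    """Toy conformer: three residue centroids shifted along the x axis."""
    return {
        "Arg102": (x, 0.0, 0.0),
        "Glu166": (x + 1.0, 0.0, 0.0),
        "Tyr109": (x, 1.0, 0.0),
    }


# Toy data (not from the study): two well-separated groups of conformers.
conformers = [make_conformer(x) for x in (0.0, 0.2, 0.1, 5.0, 5.2, 5.1)]
clusters = single_linkage(conformers, WEIGHTS, n_clusters=2)
representatives = [representative(c, conformers, WEIGHTS) for c in clusters]
```

In this toy set the two groups are far apart, so the merge order within each group does not affect the final partition; with real conformations the number of clusters and the residue weights would need to be chosen to match the server settings actually used.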