Preparation of the protein structures
The National Center for Biotechnology Information (NCBI) database was
used to retrieve the SARS-CoV-2 S glycoprotein. For the N protein, we
aligned the 31 clustered conformations (1,2,35) from the 1731
full-length SARS-CoV-2 sequences and stored the alignment as a FASTA
file for analysis. These conformations, with Glu174 and Glu166 present
in the opened RBD-down conformation, were taken from a total of 40
states in the NMR-derived COVID-19-associated protein structures (PDB
codes 6XS6, 1XAK, 2G9T, 3FQQ, 2GHV, 6YB7) (1,3,4) to select a small
subset representative of the protein flexibility. The sequences were
downloaded from NCBI (30 April 202, txid2697049, minimum length =
29,000 bp), with NC_045512.2 as the coordinate reference, and aligned
using the MAFFT tool. (2-5,6) The BioEdit v7.2.3 sequence
alignment editor was used to identify the conserved binding sites and
short linear peptide regions among the aligned sequences through
multiple sequence alignment (MSA) with ClustalW. The conserved alignment
was generated, visually inspected, and curated, preserving all
nucleotides and using GenBank NC_045512.2 as the coordinate reference
in genomes such as the ball python genome. The analysis then proceeded
with short linear motifs such as RSFIEDLLFNKV and KNFIDLLLAGF, as
identified between the Wuhan isolate and the spike protein of the
reptile shingleback nidovirus 1, beyond the limit of serious detection.
Model construction again used NC_045512.2 and its annotated open
reading frames (ORFs) plus additional ORFs. We then provided to the
DockThor-VS online
docking platform the SARS-CoV-2 protein structures as PDB-format files
(PDB codes 6XS6, 1XAK, 2G9T, 3FQQ, 2GHV, 6YB7) (1,3-10) as potential
therapeutic drug targets (5-11,12) for the design of our new druggable
scaffold, named Roccustyrna. (13,14,15) For this main purpose,
we initially selected the non-structural proteins Nsp3 (PLpro domain),
Nsp5, Nsp15 (endoribonuclease), and Nsp12 (RdRp) (5-16,17), and the
structural proteins spike (S) and nucleocapsid (N protein). (9,10-21)
We then clustered the opened conformation states (31 aligned
out of 40 conserved states) (9,17,22) using the Conformer Cluster web
server tool (6,14,16) according to the positions of the residues (5,7,8)
Arg102, Glu166, and Tyr109, with the weighted sum of the centroid
distances as the single linkage method. (7,8-12,22) Finally, the
structure nearest to the pair-group centroid of each cluster was
selected as the representative conformation of that group and made
available at BiogenetoligandorolTM. In this article, we effectively use an
AI decision tree and an optimum quantum walk over a small number of
chemically active features selected from a collection of hundreds,
using neural networks jointly with cumulative docking free-energy
features and a ranking method with input toxicity values, taking both
the network and decision-tree parameters into account. In this work, we
prepared the protein
structures using the Protein Preparation Wizard from the
BiogenetoligandorolTM (BiogenetoligandorolTM, SynthocureTM,
Thessaloniki, Biogenea Pharmaceuticals Ltd-GR, 2020). (8,13,17,22)
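
The clustering and representative-selection steps described above can be
illustrated with a minimal sketch. It is not the implementation of the
Conformer Cluster tool: it swaps in plain (unweighted) centroid-linkage
agglomerative clustering, assumes each conformation is summarised by a
hypothetical descriptor vector (for example, positions of the tracked
residues), and picks the member nearest each cluster centroid. The
function names, the toy data, and the target cluster count are all
assumptions made for illustration.

```python
import math


def centroid(points):
    """Mean of a list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def centroid_linkage(features, n_clusters):
    """Merge the two clusters with the closest centroids until
    n_clusters remain. Returns a list of lists of indices."""
    clusters = [[i] for i in range(len(features))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            ca = centroid([features[i] for i in clusters[a]])
            for b in range(a + 1, len(clusters)):
                cb = centroid([features[i] for i in clusters[b]])
                d = math.dist(ca, cb)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def representatives(features, clusters):
    """For each cluster, return the index of the member nearest its centroid."""
    reps = []
    for members in clusters:
        c = centroid([features[i] for i in members])
        reps.append(min(members, key=lambda i: math.dist(features[i], c)))
    return reps


# Hypothetical toy data: six conformations, two descriptors each.
toy = [(0.0, 0.0), (0.2, 0.0), (0.1, 0.0),
       (5.0, 5.0), (5.4, 5.0), (5.1, 5.0)]
groups = centroid_linkage(toy, n_clusters=2)
reps = representatives(toy, groups)
```

In this toy case the two well-separated groups are recovered and one
member of each is returned as its representative; with real conformers
the descriptors would come from the aligned structures rather than
hand-written tuples.
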
Hydrogen-bond optimization and protonation assignment were performed
using the publicly available PROPKA and ProtAssign software at the
reported experimental pH (2-11,17), taking into account the bound small
molecule where available (8-12,15,19).
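
To illustrate why the experimental pH matters for protonation
assignment, the sketch below applies the Henderson-Hasselbalch relation
to a titratable group. It is not PROPKA or ProtAssign: the pKa values
are hypothetical inputs of the kind a pKa predictor would supply, and
the function names, residue labels, and the 0.5 threshold are
assumptions made for illustration.

```python
def protonated_fraction(pka, ph):
    """Fraction of a titratable group in its protonated form at a given
    pH, from Henderson-Hasselbalch: 1 / (1 + 10**(pH - pKa))."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def assign_states(pka_by_residue, ph, threshold=0.5):
    """Label each residue 'protonated' or 'deprotonated' at the given pH."""
    return {
        residue: (
            "protonated"
            if protonated_fraction(pka, ph) >= threshold
            else "deprotonated"
        )
        for residue, pka in pka_by_residue.items()
    }


# Hypothetical pKa values, for illustration only.
example_pkas = {"Glu (example)": 4.5, "His (example)": 6.8, "Lys (example)": 10.4}
states = assign_states(example_pkas, ph=7.0)
```

A group whose pKa lies well above the pH is predominantly protonated,
and one whose pKa lies well below it is predominantly deprotonated;
groups with a pKa close to the pH are the cases where the choice of pH
changes the assigned state.
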