Figure 4: An example of GNINA docking results on the Hsp90
receptor in complex with ligand 9J0 (PDB ID: 5ZR3). The crystal
structure of the receptor is shown in purple with the ground truth
ligand in green. The AlphaFold model is shown in gray. The GNINA docked
conformation using the AlphaFold model as input is in white and the
docked conformation using the crystal structure as input is in orange.

While these results have proven valuable for testing our automated
workflow, they are not meant to be a comprehensive evaluation of these
PLC prediction tools, especially in the context of the challenges and
concepts discussed in the previous sections. The set of 363 PLCs used is
small and has three limitations: (1) 108 protein-ligand pairs have
peptide or oligosaccharide ligands, which are not ideal test cases
because most docking tools are not calibrated for these types of
ligands [31]; (2) only 104 of the remaining 255 small molecule and ion
pockets pass the relaxed validation criteria; and (3) the test set was
created using a time-based split and thus contains proteins that are
redundant both within the test set itself, indicating a biased
representation of PLC space, and with the PDBBind training set, which
suggests that prediction results are overestimated for the tools trained
on that set. Thus, it is critical to repeat this analysis on a diverse
benchmarking dataset created with both structure quality and PLC
diversity taken into account, after ensuring that the PLC prediction
tools based on machine learning or deep learning are trained on a
dataset different from the benchmark set. This will both ensure a more
reliable and comprehensive evaluation and allow more specific
pinpointing of problem cases for different tools to aid in their further
development.
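
The counts quoted above can be tallied in a few lines to make the effective size of the evaluation set explicit. This is only an illustrative sketch: the three input numbers are taken from the text, while the variable and function names are hypothetical and not part of the workflow.

```python
# Counts quoted in the text above; all names here are illustrative only.
TOTAL_COMPLEXES = 363
PEPTIDE_OR_OLIGOSACCHARIDE = 108
PASSING_RELAXED_VALIDATION = 104

# Pairs left after excluding peptide and oligosaccharide ligands.
small_molecule_or_ion = TOTAL_COMPLEXES - PEPTIDE_OR_OLIGOSACCHARIDE


def fraction(part: int, whole: int) -> float:
    """Return part / whole as a plain fraction."""
    return part / whole


summary = {
    "small_molecule_or_ion": small_molecule_or_ion,
    "passing_of_small_molecule_or_ion": fraction(
        PASSING_RELAXED_VALIDATION, small_molecule_or_ion
    ),
    "passing_of_all": fraction(PASSING_RELAXED_VALIDATION, TOTAL_COMPLEXES),
}

for name, value in summary.items():
    print(name, round(value, 3))
```

Under these counts, roughly two fifths of the small molecule and ion pockets, and well under a third of the full set, remain once the relaxed validation criteria are applied.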
For four out of the 363 complexes, the workflow failed at one or more
steps. RDKit could not generate conformers for the stapled peptide
ligand of 6q4q, which caused both DiffDock inference and the definition
of the search box required to run AutoDock Vina, SMINA, and GNINA to
fail. For the 6o0h protein-ligand pair, DiffDock failed because the
language model embeddings did not have the right length for the protein.
In addition, 6uhu and 6rtn failed to run with AutoDock Vina due to the
presence of unsupported atoms.
Furthermore, for the 6d07 receptor, P2Rank was unable to predict a
binding pocket. During the analysis of the 256 AlphaFold-modeled
receptors, P2Rank failed to predict a binding pocket for three receptors
(6d07, 6d08, and 6qlt). Complexes 6o0h and 6uhu again failed for the
reasons described above. In addition, DiffDock inference failed for
three more complexes (6cjj, 6jib, and 6jid). These failures were
automatically identified, reported, and isolated by the workflow.
Overall, we demonstrate that automated workflows can be employed for PLC
preparation, prediction and assessment.
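
To illustrate how such failures can be isolated per complex without halting a batch, the following is a minimal sketch and not the workflow's actual implementation. All names (`RunReport`, `define_search_box`, `run_workflow`), the example identifiers, and the 5 Å padding are assumptions made for illustration. The search box step stands in for a stage that fails when no ligand conformer is available.

```python
from dataclasses import dataclass, field


@dataclass
class RunReport:
    """Collects per-complex outcomes so one failure never stops the batch."""
    succeeded: list = field(default_factory=list)
    failed: dict = field(default_factory=dict)  # complex id -> (step, message)


def define_search_box(ligand_coords, padding=5.0):
    """Axis-aligned box around ligand atoms.

    The padding (in angstroms) is an assumed value, not one from the paper.
    """
    if not ligand_coords:
        # Stands in for the case where no conformer could be generated.
        raise ValueError("no ligand coordinates available")
    mins = [min(c[i] for c in ligand_coords) for i in range(3)]
    maxs = [max(c[i] for c in ligand_coords) for i in range(3)]
    center = tuple((lo + hi) / 2 for lo, hi in zip(mins, maxs))
    size = tuple((hi - lo) + 2 * padding for lo, hi in zip(mins, maxs))
    return center, size


def run_workflow(complexes, steps):
    """Run every step on every complex, recording failures instead of raising."""
    report = RunReport()
    for complex_id, data in complexes.items():
        step_name = "<none>"
        try:
            for step_name, step in steps:
                data = step(data)
        except Exception as exc:
            # Record which step failed and why, then move on to the next complex.
            report.failed[complex_id] = (step_name, str(exc))
        else:
            report.succeeded.append(complex_id)
    return report


# Hypothetical inputs: one ligand with coordinates, one without a conformer.
complexes = {
    "example_ok": [(0.0, 0.0, 0.0), (2.0, 4.0, 6.0)],
    "example_no_conformer": [],
}
report = run_workflow(complexes, [("search_box", define_search_box)])
print(report.succeeded)
print(report.failed)
```

Recording the failing step name together with the error message is what allows cases like a missing conformer to be reported separately from, say, a docking tool rejecting unsupported atoms.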