Modeling orientational features via Geometric Algebra for 3D protein
coordinates prediction
Abstract
By protein structure prediction (PSP) we refer to the prediction of the
3-dimensional (3D) folding of a protein, known as tertiary structure,
starting from its amino acid sequence, known as primary structure. The
state of the art in PSP is currently achieved by complex deep learning
pipelines that require several input features extracted from amino acid
sequences. It has been demonstrated that features that capture the
relative orientation of amino acids in space positively impact the
prediction accuracy of the 3D coordinates of atoms in the protein
backbone. In this paper, we demonstrate the relevance of Geometric
Algebra (GA) in instantiating orientational features for PSP problems.
We do so by proposing two novel GA-based metrics which contain
information on relative orientations of amino acid residues. We then
employ these metrics as additional input features to a Graph
Transformer (GT) architecture to aid the prediction of the 3D
coordinates of a protein, and compare them to classical angle-based
metrics. We show that our GA features yield results comparable to angle
maps in terms of accuracy of the predicted coordinates, despite being
constructed from less initial information about the protein backbone.
The features are also fewer and more informative, and can be
(i) closely associated with protein secondary structures and (ii) more
readily predicted compared to angle maps. We hence deduce that GA can be
employed as a tool to simplify the modeling of protein structures and
pack orientational information in a more natural and meaningful way.