Abstract
Many problems in computer vision today are solved via deep learning.
Tasks like pose estimation from images, pose estimation from point
clouds or structure from motion can all be formulated as a regression on
rotations. However, there is no unique way of parametrizing rotations
mathematically: matrices, quaternions, axis-angle representation or
Euler angles are all commonly used in the field. Some of them, however,
present intrinsic limitations, including discontinuities, gimbal lock or
antipodal symmetry. These limitations may make the learning of rotations
via neural networks a challenging problem, potentially introducing large
errors. Following recent literature, we present three case studies: a
sanity check, pose estimation from 3D point clouds and an inverse
kinematics problem. In each, we employ a full geometric algebra (GA)
description of rotations. We compare the GA formulation with a 6D
continuous representation previously presented in the literature in
terms of regression error and reconstruction accuracy. We empirically
demonstrate that parametrizing rotations as bivectors outperforms the 6D
representation. Like the 6D representation, the GA approach avoids the
discontinuities of other representations, but it also requires fewer
learned parameters and offers enhanced robustness to noise. GA
hence provides a broader framework for describing rotations in a simple
and compact way that is suitable for regression tasks via deep learning,
showing high regression accuracy and good generalizability in realistic
high-noise scenarios.
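
To make the difference in parameter count concrete, the following is a
minimal sketch, not the implementation used in this work. It maps a
3-parameter bivector and a 6-parameter vector pair to a rotation matrix.
The function names are hypothetical. The bivector convention is an
assumption: the components (b23, b31, b12) are read as the rotation axis
scaled by the rotation angle, and the rotor exponential is written in its
equivalent Rodrigues form. The 6D mapping uses Gram-Schmidt
orthogonalization of two 3D vectors.

```python
import math

def bivector_to_matrix(b23, b31, b12):
    """Map 3 bivector components to a 3x3 rotation matrix.

    Assumed convention: the bivector magnitude is the rotation angle and
    its dual vector (b23, b31, b12) points along the rotation axis.
    """
    theta = math.sqrt(b23 * b23 + b31 * b31 + b12 * b12)
    if theta < 1e-12:
        # Zero bivector: identity rotation.
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    x, y, z = b23 / theta, b31 / theta, b12 / theta
    c, s = math.cos(theta), math.sin(theta)
    t = 1.0 - c
    # Rodrigues formula: R = c*I + s*[k]_x + (1 - c)*k*k^T
    return [
        [c + t * x * x, t * x * y - s * z, t * x * z + s * y],
        [t * x * y + s * z, c + t * y * y, t * y * z - s * x],
        [t * x * z - s * y, t * y * z + s * x, c + t * z * z],
    ]

def _normalize(v):
    # Degenerate (zero-length) inputs are not handled in this sketch.
    n = math.sqrt(sum(a * a for a in v))
    return [a / n for a in v]

def six_d_to_matrix(a1, a2):
    """Map two 3D vectors (6 parameters) to a 3x3 rotation matrix."""
    b1 = _normalize(a1)
    d = sum(p * q for p, q in zip(b1, a2))
    # Remove the component of a2 along b1, then normalize.
    b2 = _normalize([q - d * p for p, q in zip(b1, a2)])
    # Third axis from the cross product b1 x b2.
    b3 = [
        b1[1] * b2[2] - b1[2] * b2[1],
        b1[2] * b2[0] - b1[0] * b2[2],
        b1[0] * b2[1] - b1[1] * b2[0],
    ]
    # b1, b2, b3 are the columns of the rotation matrix.
    return [[b1[i], b2[i], b3[i]] for i in range(3)]

# A quarter turn about the z axis, expressed in both parametrizations.
R_biv = bivector_to_matrix(0.0, 0.0, math.pi / 2)
R_6d = six_d_to_matrix([0.0, 1.0, 0.0], [-1.0, 0.0, 0.0])
```

Both calls describe the same rotation. A network regressing the bivector
form would output three values, whereas the 6D form requires six.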