Bojian Li

and 4 more

Unsupervised monocular depth estimation plays a vital role in endoscopy-based minimally invasive surgery (MIS). However, it remains challenging because the distinctive imaging characteristics of endoscopy violate the photometric consistency assumption on which conventional methods rely. Unlike recent approaches that pre-process the input images, this paper introduces a solution based on intrinsic image decomposition (IID) theory. Specifically, we propose a novel end-to-end intrinsic-based unsupervised monocular depth learning framework that comprises an intrinsic image decomposition module and a frame reconstruction module. The framework integrates IID with unsupervised monocular depth estimation, and dedicated losses are designed to provide robust supervision for network training based on this integration. Notably, rather than pre-processing the input frames, we rely on the favorable properties of the albedo map produced by IID to circumvent the challenging image characteristics. The proposed method is extensively validated on the SCARED and Hamlyn datasets, where it obtains better results than state-of-the-art techniques. In addition, its generalization ability and the effectiveness of the proposed components are validated. This method has the potential to improve the quality of 3D reconstruction in monocular endoscopy, thereby enhancing the accuracy and robustness of augmented reality navigation technology in MIS. Our code will be available at: https://github.com/bobo909/IIDSfmLearner.
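
To make the motivation concrete, the following toy sketch illustrates why an albedo map can sidestep the photometric consistency problem. It is not the proposed network. It assumes the standard intrinsic image model, in which an image is the per-pixel product of albedo and shading (I = A * S), and it assumes perfect pixel correspondence between frames and a known shading map. The function names and all numeric values are hypothetical and chosen only for illustration.

```python
# Toy illustration only (assumed model I = A * S, oracle shading, perfect
# pixel correspondence). This is not the paper's learned decomposition.

def compose(albedo, shading):
    """Form an image as the per-pixel product of albedo and shading."""
    return [a * s for a, s in zip(albedo, shading)]

def l1_error(x, y):
    """Mean absolute difference, used here as a simple photometric error."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)

# The same tissue patch seen in two frames (hypothetical values).
albedo = [0.2, 0.5, 0.8, 0.4]
shading_t = [1.0, 0.9, 0.8, 0.7]   # illumination in the target frame
shading_s = [0.6, 0.7, 1.0, 1.2]   # illumination after the light source moves

frame_t = compose(albedo, shading_t)
frame_s = compose(albedo, shading_s)

# Raw intensities differ even though the surface is identical.
raw_error = l1_error(frame_t, frame_s)

# Dividing out the (here, known) shading recovers matching albedo maps.
albedo_t = [i / s for i, s in zip(frame_t, shading_t)]
albedo_s = [i / s for i, s in zip(frame_s, shading_s)]
albedo_error = l1_error(albedo_t, albedo_s)

print(f"raw photometric error: {raw_error:.3f}")
print(f"albedo-space error:    {albedo_error:.3f}")
```

In this idealized setting the raw photometric error is clearly nonzero while the albedo-space error is numerically zero. In practice the shading is unknown, so the decomposition has to be estimated, which is the role the abstract assigns to the intrinsic decomposition module.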