HTPE-Net: Monocular 6D Pose Estimation of Transparent Objects in Hand
for Robot Manipulation
- Ran Yu
- Shoujie Li
- Haixin Yu
- Wenbo Ding

Tsinghua Shenzhen International Graduate School
Corresponding author: ding.wenbo@sz.tsinghua.edu.cn

Abstract
Transparent objects are difficult to perceive due to their unique
optical properties, and dynamic hand-object interaction further
complicates pose estimation. To address these challenges, we propose
HTPE-Net, a monocular instance-level 6D pose estimation method for
hand-held transparent objects that handles texture-less,
non-Lambertian surfaces and hand-object occlusions. HTPE-Net
integrates hand and object features through a dual-stream feature
extraction backbone and a hand-to-object feature enhancement module,
generating geometric features and hand attention maps that improve
robustness to background changes and occlusions. The network is
trained on a modified version of the TransHand-14K dataset and
outperforms state-of-the-art methods. A sim-to-real experiment further
validates the practical applicability of HTPE-Net in real-world robot
perception tasks. The proposed approach advances the accuracy and
robustness of 6D pose estimation for hand-held transparent objects,
with potential applications in robotics, human-machine interaction,
and augmented
reality.

02 Dec 2024: Submitted to Journal of Field Robotics
03 Dec 2024: Submission Checks Completed
03 Dec 2024: Assigned to Editor
03 Dec 2024: Review(s) Completed, Editorial Evaluation Pending
19 Dec 2024: Reviewer(s) Assigned
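The abstract describes a hand-to-object feature enhancement module that derives a hand attention map to reweight the object feature stream. The paper's actual architecture is not given here, so the following is only a minimal NumPy sketch under assumed details: the attention map is taken as a sigmoid over channel-pooled hand features, and fusion is a simple residual reweighting. Function and variable names are hypothetical.

```python
import numpy as np


def hand_to_object_enhancement(obj_feat, hand_feat):
    """Illustrative sketch (not the paper's implementation):
    derive a spatial hand attention map from the hand feature
    stream and use it to reweight the object feature stream.

    obj_feat, hand_feat: arrays of shape (C, H, W).
    Returns enhanced object features of shape (C, H, W).
    """
    # Hypothetical attention map: sigmoid of channel-averaged hand features,
    # giving one (1, H, W) map highlighting hand-related regions.
    pooled = hand_feat.mean(axis=0, keepdims=True)
    attn = 1.0 / (1.0 + np.exp(-pooled))
    # Residual fusion: amplify object features where hand evidence is strong.
    return obj_feat * (1.0 + attn)


# Toy usage with random "features" standing in for backbone outputs.
C, H, W = 8, 4, 4
obj = np.random.rand(C, H, W)
hand = np.random.rand(C, H, W)
out = hand_to_object_enhancement(obj, hand)
```

Since the attention values lie in (0, 1), the residual form `obj * (1 + attn)` only amplifies object features near hand regions rather than suppressing them elsewhere, one common design choice for occlusion-aware fusion.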