AUTHOREA
Ran Yu

Public Documents: 2
SimLiquid: A Simulation-Based Liquid Perception Pipeline for Robot Liquid Manipulatio...
Yan Huang, Jiawei Zhang, and 4 more

December 03, 2024
Transparent liquid volume estimation is crucial for robot manipulation tasks such as pouring, yet estimating the volume of transparent liquids remains challenging. Most existing methods rely on real-world data collection, with sensors fixed to the robot body for volume estimation; these choices limit both the pace of research and the flexibility of perception. In this paper, we present SimLiquid20k, a high-fidelity synthetic dataset for liquid volume estimation, and propose a YOLO-based multi-modal network trained entirely on synthetic data to estimate the volume of transparent liquids. Extensive experiments demonstrate that our method transfers effectively from simulation to the real world: under changes in background, viewpoint, and container, our approach achieves an average error of 5% in real-world volume estimation. In addition, we conduct two application experiments that integrate our method with ChatGPT, showcasing its potential in service robotics. The accompanying video and supplementary materials are available at https://simliquid.github.io/.
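The abstract reports a 5% average error in real-world volume estimation but does not name the exact metric. A common choice for this kind of evaluation is the mean relative error over test pours; the sketch below is an illustrative assumption, not the paper's evaluation code, and the function name `mean_relative_error` is hypothetical.

```python
def mean_relative_error(predicted_ml, true_ml):
    """Mean relative volume error across test samples.

    Assumed metric: the SimLiquid abstract reports a 5% average
    error without specifying the formula; relative error against
    the ground-truth volume is a standard reading.
    """
    assert len(predicted_ml) == len(true_ml) and len(true_ml) > 0
    return sum(abs(p - t) / t for p, t in zip(predicted_ml, true_ml)) / len(true_ml)

# Example: predicting 95 ml for a ground truth of 100 ml is a 5% error.
error = mean_relative_error([95.0], [100.0])
```

Under this reading, an "average error of 5%" means predictions deviate from the true volume by about one-twentieth of its value on average.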
HTPE-Net: Monocular 6D Pose Estimation of Transparent Objects in Hand for Robot Manip...
Ran Yu, Shoujie Li, and 3 more

December 03, 2024
Transparent objects are difficult to perceive due to their unique optical properties, and the dynamic interaction between hand and object further complicates pose estimation. To address this problem, we propose HTPE-Net, a monocular instance-level 6D pose estimation method for hand-held transparent objects that tackles the challenges posed by texture-less, non-Lambertian surfaces and hand-object occlusions. HTPE-Net integrates hand and object features through a dual-stream feature extraction backbone and a hand-to-object feature enhancement module, generating geometric features and hand attention maps to improve robustness to background changes and occlusions. The network is trained on a modified version of the TransHand-14K dataset and outperforms state-of-the-art methods. Additionally, a sim-to-real experiment validates the practical applicability of HTPE-Net in real-world robot perception tasks. The proposed approach advances the accuracy and robustness of 6D pose estimation for hand-held transparent objects, with potential applications in robotics, human-machine interaction, and augmented reality.
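The abstract compares HTPE-Net against state-of-the-art methods without naming the evaluation metric. Instance-level 6D pose estimation is commonly scored with the ADD metric (average distance between model points transformed by the predicted and ground-truth poses); the minimal sketch below illustrates that standard metric only and is not taken from the paper.

```python
import math

def add_metric(model_points, r_pred, t_pred, r_true, t_true):
    """ADD metric for 6D pose: mean Euclidean distance between object
    model points transformed by the predicted pose (r_pred, t_pred)
    and by the ground-truth pose (r_true, t_true).

    Rotations are 3x3 row-major nested lists; translations are 3-vectors.
    """
    def transform(r, t, p):
        # Apply rotation then translation: r @ p + t
        return [sum(r[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]

    total = 0.0
    for p in model_points:
        total += math.dist(transform(r_pred, t_pred, p),
                           transform(r_true, t_true, p))
    return total / len(model_points)
```

A pose whose translation is off by 1 cm with a correct rotation yields an ADD of exactly 0.01 m, since every model point is displaced by the same offset.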

