Advancements in Instance-Level Human Parsing: Integrating Visual
Saliency with Multi-Task Learning for Complex Environments
Abstract
Instance-level human parsing, critical for human-centric analysis,
involves labeling pixels of human body parts and associating them with
specific instances. Despite progress in multi-person parsing, segmenting
individuals in dense crowds remains challenging. The Visual
Saliency-Based Human Parsing (ViS-HuP) algorithm addresses this by using
visual saliency to enhance the clarity of body pixels and edge detection
to refine instance boundaries within a multi-task learning framework.
Evaluated on the Crowd Instance-level Human Parsing (CIHP) dataset [1],
ViS-HuP outperforms conventional methods, with significant accuracy
gains in crowded scenes.
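To make the multi-task setup described above concrete, the following is a minimal sketch of how parsing, edge, and saliency predictions could share a backbone and be trained with a weighted joint loss. The three-branch head, the loss weights, and all tensor shapes are illustrative assumptions for exposition, not the actual ViS-HuP architecture or its loss formulation.

    import torch
    import torch.nn as nn

    class MultiTaskHead(nn.Module):
        """Illustrative three-branch head on shared features: part parsing,
        instance-boundary (edge) map, and body-saliency map. Placeholder design,
        not the paper's."""
        def __init__(self, in_channels=256, num_part_classes=20):
            super().__init__()
            self.parse_branch = nn.Conv2d(in_channels, num_part_classes, kernel_size=1)
            self.edge_branch = nn.Conv2d(in_channels, 1, kernel_size=1)
            self.saliency_branch = nn.Conv2d(in_channels, 1, kernel_size=1)

        def forward(self, features):
            return {
                "parsing": self.parse_branch(features),      # per-pixel part logits
                "edge": self.edge_branch(features),          # boundary logits
                "saliency": self.saliency_branch(features),  # salient-body logits
            }

    def multitask_loss(outputs, targets, w_parse=1.0, w_edge=1.0, w_sal=1.0):
        """Weighted sum of the three task losses; the weights are hypothetical."""
        parse_loss = nn.functional.cross_entropy(outputs["parsing"], targets["parsing"])
        edge_loss = nn.functional.binary_cross_entropy_with_logits(
            outputs["edge"], targets["edge"])
        sal_loss = nn.functional.binary_cross_entropy_with_logits(
            outputs["saliency"], targets["saliency"])
        return w_parse * parse_loss + w_edge * edge_loss + w_sal * sal_loss

    # Example usage with random tensors (shapes are illustrative only)
    feats = torch.randn(2, 256, 64, 64)
    head = MultiTaskHead()
    outs = head(feats)
    tgts = {
        "parsing": torch.randint(0, 20, (2, 64, 64)),
        "edge": torch.rand(2, 1, 64, 64),
        "saliency": torch.rand(2, 1, 64, 64),
    }
    multitask_loss(outs, tgts).backward()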