Aerial manipulators, which combine aerial robots with robotic arms, have emerged as highly effective for a wide range of applications, especially aerial physical interaction. To detect surfaces and interact with them physically, these systems must accurately perceive the target's location and execute precise visual servoing. This paper introduces a novel system design and framework that enable safe physical contact with defined surfaces. The system consists of an aerial platform equipped with a lightweight, tendon-driven, 3-DoF robotic arm; an RGB-D camera for detecting targets and guiding the arm to the desired end-effector position; and the sensors essential for precise localization of the platform. In outdoor experiments, the system successfully performed physical inspections in industrial environments subject to light variations and windy conditions. Owing to its low inertia, the moving arm has no influence on the aerial platform, and its passive damping stabilizes the robot during contact phases. We also examine the superiority of neural-based target detection over model-based detection. By pushing the system to its limits at high altitudes and in strong winds, we emphasize the importance of a backup pilot when deploying robots that come into contact with infrastructure. To enable reproducibility of our results, the dataset for computer vision, the design of the arm, and the simulation are shared.