To localize mobile robots in three-dimensional indoor environments, which are mostly global navigation satellite system (GNSS)-denied spaces, we propose a monocular visual positioning algorithm based on deep neural networks. First, the lightweight YOLOv5 algorithm is used for target detection, and the model is deployed with the LibTorch deep learning framework to improve inference speed. Next, a multi-layer perceptron (MLP) with four inputs and two outputs is constructed to regress the robot's coordinates in the field coordinate system, which completes target localization. This method is compared with a solving algorithm based on a mathematical model to demonstrate the accuracy and superiority of the positioning algorithm based on deep neural networks. The proposed positioning and tracking system has been successfully applied in the ICRA robot competition, and the results show that the positioning error of our method is within 10 cm while maintaining good real-time performance.
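As a rough illustration of the regression step, the following is a minimal sketch of the forward pass of a four-input, two-output MLP, written in plain Python. The abstract does not say what the four inputs are, how many hidden layers there are, or which activation is used. The sketch therefore assumes that the inputs are a normalized bounding box (center x, center y, width, height) from the detector, that there are two hidden layers of 16 units, and that ReLU is the activation. The weights are random and untrained, so the output only shows the shape of the mapping and is not a meaningful position.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Create one fully connected layer: small random weights, zero biases."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer):
    """Apply y = W x + b for a single input vector x."""
    weights, biases = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(v):
    """Element-wise ReLU activation."""
    return [max(0.0, a) for a in v]


def mlp_forward(inputs, layers):
    """Run the MLP: ReLU after every layer except the last (linear output)."""
    h = list(inputs)
    for layer in layers[:-1]:
        h = relu(dense(h, layer))
    return dense(h, layers[-1])


rng = random.Random(0)
HIDDEN = 16  # assumed hidden width; not given in the abstract

# 4 inputs -> 16 -> 16 -> 2 outputs (field-frame x and y)
layers = [
    make_layer(4, HIDDEN, rng),
    make_layer(HIDDEN, HIDDEN, rng),
    make_layer(HIDDEN, 2, rng),
]

# Hypothetical normalized bounding box: (center x, center y, width, height)
box = [0.52, 0.61, 0.08, 0.12]
xy = mlp_forward(box, layers)
print("predicted field coordinates (untrained):", xy)
```

In practice such a network would be trained on pairs of detector outputs and measured field positions, and could be exported for inference alongside the detector.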