Object detection, specifically fruitlet detection, is a crucial image processing technique in agricultural automation, enabling the accurate identification of fruitlets on orchard trees within images. It is vital for early fruit load management and overall crop management, facilitating the effective deployment of automation and robotics to optimize orchard productivity and resource use. This study systematically performed an extensive evaluation of the performances of all configurations of YOLOv8, YOLOv9, YOLOv10, and YOLO11 object detection algorithms in terms of precision, recall, mean Average Precision at 50% Intersection over Union (mAP@50), and computational speeds including pre-processing, inference, and post-processing times immature green apple (or fruitlet) detection in commercial orchards. Additionally, this research performed and validated in-field counting of fruitlets using an iPhone and machine vision sensors in 4 different apple varieties (Scifresh, Scilate, Honeycrisp & Cosmic crisp). This investigation of total 22 different configurations of YOLOv8, YOLOv9, YOLOv10 and YOLO11 (5 for YOLOv8, 6 for YOLOv9, 6 for YOLOv10, and 5 for YOLO11) revealed that YOLOv9 gelan-base and YOLO11s outperforms all other configurations of YOLOv10 , YOLOv9 and YOLOv8 in terms of mAP@50 with a score of 0.935 and 0.933 respectively. In terms of precision, specifically, YOLOv9 Gelan-e achieved the highest mAP@50 of 0.935, outperforming YOLOv11s's 0.0.933, YOLOv10s's 0.924, and YOLOv8s's 0.924. In terms of recall, YOLOv9 gelan-base achieved highest value among YOLOv9 configurations (0.899), and YOLO11m performed the best among the YOLO11 configurations (0.897). In comparison for inference speeds, YOLO11n demonstrated fastest inference speeds of only 2.4 ms, while the fastest inference speed across YOLOv10, YOLOv9 and YOLOv8 were 5.5, 11.5 and 4.1 ms for YOLOv10n, YOLOv9 gelan-s and YOLOv8n respectively.