PLUM
- Submitted by: Ming Ni
- Last updated: Mon, 07/08/2024 - 15:58
- DOI: 10.21227/0h01-3x31
Abstract
Accurate target recognition in the orchard environment is key to visual perception for picking robots. Plum fruits are small, grow densely, and are often heavily occluded, so vision algorithms recognize them poorly. This paper therefore proposes an improved YOLOv5s model to detect highly occluded, densely growing plums in orchards. First, the backbone network of YOLOv5s is improved: a new Focus-Maxpool module replaces the downsampling convolution in the backbone, so that the model retains more feature information about highly occluded and small targets during downsampling, which improves detection performance on both. Second, the loss function is improved: a weighted combination of focal loss and cross-entropy loss is used as the classification loss, which reduces the interference of noise on focal loss and improves the model's ability to recognize adhering targets. Finally, several experiments were designed to evaluate the model's performance. The results show that the improved YOLOv5s model achieves better average precision than YOLOv5s, YOLOv4, Faster R-CNN, SSD, and CenterNet. Compared with the original YOLOv5s model, the improved model's average precision, recall, and accuracy improve by 2.84%, 9.53%, and 1.66%, respectively. Moreover, the improved model reaches a detection speed of 91.37 frames/s, which meets the demand for real-time detection. These results indicate that the improved detection model is accurate and robust in natural orchard environments and can provide a data reference for research on picking robots and for orchard environment monitoring.
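The weighted classification loss described in the abstract can be illustrated with a minimal sketch. The abstract does not give the weighting scheme or the focal-loss hyperparameters, so the convex mix `weight * focal + (1 - weight) * cross_entropy`, the default `weight=0.5`, and the values `gamma=2.0` and `alpha=0.25` are assumptions made for illustration only. The function names are hypothetical and do not come from the paper or from the YOLOv5 code base.

```python
import math


def binary_cross_entropy(p, y, eps=1e-7):
    """Cross-entropy for one predicted probability p and a 0/1 label y."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Binary focal loss: down-weights examples the model already gets right."""
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing factor
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def weighted_classification_loss(p, y, weight=0.5, gamma=2.0, alpha=0.25):
    """Assumed form: convex combination of focal loss and cross-entropy."""
    return (weight * focal_loss(p, y, gamma, alpha)
            + (1.0 - weight) * binary_cross_entropy(p, y))


if __name__ == "__main__":
    # A confident correct prediction costs less than a confident wrong one.
    print(weighted_classification_loss(0.9, 1))
    print(weighted_classification_loss(0.1, 1))
```

With `weight=1.0` the sketch reduces to plain focal loss, and with `weight=0.0` to plain cross-entropy. The cross-entropy term keeps a non-vanishing gradient on easy examples, which is one plausible reading of how the mix limits the influence of noisy samples on the focal term.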
The dataset for LiDong's object detection is a collection of labeled images for training computer vision models to detect plum (li) fruits in various scenes and environments. Each image has been manually annotated with the location and size of the plum fruit it contains. The dataset can be used to train object detection models that automatically identify and locate plum fruits in images, with applications in fields such as agriculture and food safety.
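The description above does not state the annotation file format. As a sketch only, assuming the labels follow the common YOLO text convention (one line per object: class index, then box center x, center y, width and height, all normalized to the image size), a label line could be converted to pixel coordinates as follows. The function name and the sample values are hypothetical.

```python
def yolo_to_pixel_box(line, img_w, img_h):
    """Convert one assumed YOLO-format label line to a pixel-space box.

    Returns (class_id, (x_min, y_min, x_max, y_max)).
    """
    parts = line.split()
    class_id = int(parts[0])
    cx, cy, w, h = (float(v) for v in parts[1:5])
    x_min = (cx - w / 2) * img_w
    y_min = (cy - h / 2) * img_h
    x_max = (cx + w / 2) * img_w
    y_max = (cy + h / 2) * img_h
    return class_id, (x_min, y_min, x_max, y_max)


if __name__ == "__main__":
    # Hypothetical label: class 0, box centered in a 640x480 image.
    print(yolo_to_pixel_box("0 0.5 0.5 0.25 0.5", 640, 480))
```

If the dataset instead ships annotations in another format, such as Pascal VOC XML or COCO JSON, the location and size fields would need a different parser.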