Standard Dataset
Computer Power Evaluation Datasets
- Submitted by: Yuhuan Huang
- Last updated: Wed, 09/11/2024 - 05:26
- DOI: 10.21227/754c-9c41
Abstract
We selected the ImageNet validation set and the Flower dataset as benchmarks for image classification; together they provide a large and diverse set of images for evaluating model performance. For object detection, we use the COCO2012 validation set and the Road Voc dataset, which are suited to assessing the accuracy and efficiency of detection models in real-world scenes. ResNet and MobileNet were chosen as the image classification models because they are widely used architectures with proven performance. YOLOv3, known for its speed and accuracy in detecting multiple objects per image, is used for object detection. Together, these datasets and models support performance validation across both domains.
Instructions
We have chosen the ImageNet validation set [1] and the Flower dataset [2] as benchmark standards for image classification. For object detection, we use the COCO2012 validation set [3] and the Road Voc dataset [4] as the basis for performance validation. ResNet and MobileNet are the models for the image classification benchmarks, and YOLOv3 is used for the object detection benchmark. Because the datasets are too large to upload, please download them from the links in the reference list.
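As a convenience, fetching the two archives that the reference list links directly could be sketched as follows. This is a minimal sketch, not part of the dataset: the `ARCHIVES` mapping, the `data` directory, and the helper names are assumptions made for illustration. ImageNet [1] and COCO [3] are linked through their project pages rather than as direct archives, so they are left out.

```python
import os
import urllib.request
from urllib.parse import urlparse

# Direct archive links copied from references [2] and [4].
ARCHIVES = {
    "Flower102": "https://paddle-imagenet-models-name.bj.bcebos.com/data/flowers102.zip",
    "Road Voc": "https://paddlemodels.bj.bcebos.com/object_detection/roadsign_voc.tar",
}


def local_path(url, data_dir="data"):
    """Return the local file path an archive URL would be saved to."""
    return os.path.join(data_dir, os.path.basename(urlparse(url).path))


def download_all(data_dir="data"):
    """Download every archive that is not already present; return the local paths."""
    os.makedirs(data_dir, exist_ok=True)
    paths = {}
    for name, url in ARCHIVES.items():
        target = local_path(url, data_dir)
        if not os.path.exists(target):  # skip archives downloaded earlier
            urllib.request.urlretrieve(url, target)
        paths[name] = target
    return paths
```

Calling `download_all()` once before benchmarking would place both archives under `data/`; unpacking them is left to the framework that consumes them.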
The benchmarking process covers four use cases: image classification inference, object detection inference, image classification training, and object detection training. The framework is primarily used to evaluate performance under MLPerf and Paddle, with the final statistics aggregated over the same benchmarks. Each use case represents a different computational task and application scenario, so that together they assess device performance comprehensively across environments.
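The four use cases and the models named above can be laid out as a small run matrix. The sketch below is illustrative only: the `UseCase` class and `enumerate_use_cases` function are hypothetical names, and the mapping of datasets to individual runs is deliberately omitted because the text does not specify it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    task: str   # "classification" or "detection"
    stage: str  # "inference" or "training"
    model: str


# Models per task, as named in the text.
MODELS = {
    "classification": ["ResNet", "MobileNet"],
    "detection": ["YOLOv3"],
}
STAGES = ["inference", "training"]


def enumerate_use_cases():
    """List every (task, stage, model) combination covered by the benchmark."""
    return [
        UseCase(task, stage, model)
        for task, models in MODELS.items()
        for stage in STAGES
        for model in models
    ]


if __name__ == "__main__":
    for case in enumerate_use_cases():
        print(f"{case.task:<15} {case.stage:<10} {case.model}")
```

Grouping by `(task, stage)` recovers the four use cases; the model dimension expands them to six runs.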
During the inference stage, the classic models ResNet, MobileNet, and YOLOv3 are used for both single-threaded and multi-threaded inference tasks. In the training stage, the same models are used so that the evaluations remain consistent and comparable. To standardize performance metrics across tasks, an intelligent computing evaluation algorithm generates a computational capacity score for each device. These scores are then aggregated and analyzed to produce the overall evaluation results.
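The single-threaded and multi-threaded inference runs described above could be timed along the following lines. This is a sketch under stated assumptions: `fake_infer` is a hypothetical stand-in for a real model forward pass, and `measure_throughput` is an illustrative helper, not code from MLPerf or Paddle.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_infer(sample):
    """Hypothetical stand-in for one forward pass of ResNet, MobileNet, or YOLOv3."""
    time.sleep(0.001)  # simulate a small, fixed amount of work
    return sample % 10


def measure_throughput(infer, samples, num_threads=1):
    """Run `infer` over `samples` with the given thread count and report samples/sec."""
    start = time.perf_counter()
    if num_threads == 1:
        outputs = [infer(s) for s in samples]
    else:
        with ThreadPoolExecutor(max_workers=num_threads) as pool:
            outputs = list(pool.map(infer, samples))
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero interval
    return {
        "threads": num_threads,
        "samples": len(outputs),
        "samples_per_sec": len(outputs) / elapsed,
    }


if __name__ == "__main__":
    data = list(range(200))
    print(measure_throughput(fake_infer, data, num_threads=1))
    print(measure_throughput(fake_infer, data, num_threads=4))
```

With a real model, whether threads help depends on whether the inference call releases the Python global interpreter lock; the stand-in here only sleeps, so it always does.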
The specific steps are as follows. In MLPerf [5], PaddleClas [6], and PaddleDetection [7], the metric-output code of each project is modified so that its output metrics are passed to the intelligent computing evaluation algorithm. The performance scores of different machines are then recorded for their CPU and GPU environments, which allows computing capabilities to be compared across tasks.
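The text does not define the intelligent computing evaluation algorithm, so the following is only a hypothetical illustration of how per-task metrics might be folded into one score. It assumes every metric is a "higher is better" throughput, normalizes each against a reference machine, and takes the geometric mean scaled to 100; the function name, metric keys, and numbers are all invented for the example and are not measured results.

```python
import math


def capacity_score(metrics, reference):
    """Geometric mean of metric/reference ratios, scaled so the reference scores 100.

    Hypothetical scoring rule; the authors' actual algorithm is not specified.
    """
    ratios = []
    for name, ref_value in reference.items():
        value = metrics[name]
        if value <= 0 or ref_value <= 0:
            raise ValueError(f"metric {name!r} must be positive")
        ratios.append(value / ref_value)
    log_mean = sum(math.log(r) for r in ratios) / len(ratios)
    return 100.0 * math.exp(log_mean)


if __name__ == "__main__":
    # Illustrative placeholder throughputs (samples/sec), not real measurements.
    reference = {"cls_infer": 100.0, "det_infer": 20.0, "cls_train": 50.0, "det_train": 10.0}
    machine_a = {"cls_infer": 200.0, "det_infer": 40.0, "cls_train": 100.0, "det_train": 20.0}
    print(round(capacity_score(machine_a, reference), 1))
```

A geometric mean is used in this sketch so that no single task dominates the score; a weighted sum or another rule could be substituted without changing how the metrics are collected.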
Reference
[1] ImageNet-1k. Stanford Vision Lab. ImageNet-1k Dataset. Retrieved from https://image-net.org/download.php
[2] Flower102. University of Oxford, Visual Geometry Group. Flower102 Dataset. Retrieved from https://paddle-imagenet-models-name.bj.bcebos.com/data/flowers102.zip
[3] COCO2012. Microsoft. COCO2012 Dataset. Retrieved from https://mscoco.org/
[4] Road Voc. PaddlePaddle. Road Voc Dataset. Retrieved from https://paddlemodels.bj.bcebos.com/object_detection/roadsign_voc.tar
[5] MLCommons. MLCommons Inference Benchmark Suite. GitHub repository. Retrieved from https://github.com/mlcommons/inference
[6] PaddlePaddle. PaddleClas: An Image Classification and Recognition Toolset. GitHub repository. Retrieved from https://github.com/PaddlePaddle/PaddleClas
[7] PaddlePaddle. PaddleDetection: Object Detection and Recognition Toolset Based on PaddlePaddle. GitHub repository. Retrieved from https://github.com/PaddlePaddle/PaddleDetection