object detection

The steel tube dataset covers attributes of steel tubes, including dimensions, material composition, manufacturing processes, and performance characteristics. It supports in-depth analysis of steel tube properties, helping researchers, engineers, and industry professionals optimize designs, ensure structural integrity, and advance materials science for steel tube applications.


This dataset, referred to as LIED (Light Interference Event Dataset), is presented in the article 'Identifying Light Interference in Event-Based Vision'. LIED covers three categories of light interference: strobe light sources, non-strobe light sources, and scattered or reflected light. To cover more realistic scenarios, the dataset includes dynamic objects and recordings made with both a static and a moving camera. LIED was recorded with a DAVIS346 sensor and provides both frames and events at a resolution of 346 × 260.


Just Recognizable Distortion (JRD) refers to the minimum distortion that notably affects the recognition performance of a machine vision model. If a distortion added to images or videos stays within this JRD threshold, the degradation in recognition performance will be unnoticeable. This property is useful for Video Coding for Machine (VCM), where it can be used to minimize the bit rate while maintaining recognition performance on compressed images.


To enable intelligent vehicles and transportation systems, vehicles and the relevant systems need to be able to sense the environment and recognize objects. To benefit from the robustness of radar for sensing, it is critical to know how to use a radar system for effective object recognition. Motivated by this, we propose in this paper a novel deep learning-aided object recognition system for radar, combining the You Only Look Once (YOLO) system with a proposed object recheck system.


In this dataset, we provide the raw analog-to-digital converter (ADC) data of a 77 GHz mmWave radar for the automotive object detection scenario. The overall dataset contains approximately 19800 frames of radar data as well as synchronized camera images and labels. The raw data of each radar frame has four dimensions: samples (fast time), chirps (slow time), transmitters, and receivers. The experimental radar was assembled from the TI AWR 1843 board, with 2 horizontal transmit antennas and 4 receive antennas.
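As an illustration of how a four-dimensional radar frame of this kind is commonly processed, the sketch below computes a range-Doppler map with NumPy. It is not the dataset authors' pipeline: the axis order follows the description above, but the frame sizes (`NUM_SAMPLES`, `NUM_CHIRPS`) and the function name `range_doppler_map` are assumptions, and the input is a synthetic frame, not data from the dataset.

```python
import numpy as np

# Hypothetical frame sizes: the description only states 2 transmitters and
# 4 receivers; the sample and chirp counts here are assumed for illustration.
NUM_SAMPLES, NUM_CHIRPS, NUM_TX, NUM_RX = 128, 64, 2, 4


def range_doppler_map(frame: np.ndarray) -> np.ndarray:
    """Return a (range, Doppler) power map from a complex frame shaped
    (samples, chirps, transmitters, receivers)."""
    # Range FFT along fast time (axis 0).
    range_fft = np.fft.fft(frame, axis=0)
    # Doppler FFT along slow time (axis 1), shifted so zero velocity is centered.
    doppler_fft = np.fft.fftshift(np.fft.fft(range_fft, axis=1), axes=1)
    # Sum power non-coherently over all transmitter/receiver channels.
    return (np.abs(doppler_fft) ** 2).sum(axis=(2, 3))


# Synthetic frame: one stationary target whose beat frequency lands in range bin 10.
n = np.arange(NUM_SAMPLES)
tone = np.exp(2j * np.pi * 10 * n / NUM_SAMPLES)
frame = np.broadcast_to(
    tone[:, None, None, None], (NUM_SAMPLES, NUM_CHIRPS, NUM_TX, NUM_RX)
)

rd = range_doppler_map(frame)
peak_range, peak_doppler = (int(i) for i in np.unravel_index(np.argmax(rd), rd.shape))
print(peak_range, peak_doppler)
```

Because the synthetic target does not change from chirp to chirp, its energy falls in the zero-Doppler bin, which sits at the center of the shifted Doppler axis.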


This dataset is a collection of 383 raw images of Indian vehicles under different illumination conditions, captured with an infrared day/night camera. The scenes resemble Indian highway toll collection plazas. The dataset will be useful for developing intelligent models for applications such as automated toll collection, number plate detection and recognition, driverless vehicles, suspicious vehicle tracking, and traffic management.


The fecal microscopic dataset is a set of fecal microscopic images for the object detection task. The data were collected at the Sixth People’s Hospital of Chengdu (Sichuan Province, China). The samples were diluted, stirred, and placed, then imaged with a microscopic imaging system. For each view of each sample, the 5 clearest images were selected using the Tenengrad definition algorithm. The dataset includes 10670 groups of views with 53350 JPG images in total. The image resolution is 1200×1600. There are 4 categories: RBCs, WBCs, Molds, and Pyocytes.
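The Tenengrad measure scores image sharpness from the squared Sobel gradient magnitude. The sketch below shows one common form of it and how the k sharpest images in a group could be selected. It is only an illustration: the authors' exact variant (for example, any gradient threshold) is not given in the description, the normalization by pixel count is an assumption, and the function names `tenengrad` and `clearest` are hypothetical.

```python
import numpy as np


def tenengrad(image: np.ndarray) -> float:
    """Mean squared Sobel gradient magnitude over interior pixels.

    Higher values indicate a sharper image. Averaging (rather than summing)
    is an assumption made here so scores do not depend on image size.
    """
    img = image.astype(np.float64)
    # Horizontal Sobel response: right column minus left column, weights 1-2-1.
    gx = (img[:-2, 2:] + 2 * img[1:-1, 2:] + img[2:, 2:]) - (
        img[:-2, :-2] + 2 * img[1:-1, :-2] + img[2:, :-2]
    )
    # Vertical Sobel response: bottom row minus top row, weights 1-2-1.
    gy = (img[2:, :-2] + 2 * img[2:, 1:-1] + img[2:, 2:]) - (
        img[:-2, :-2] + 2 * img[:-2, 1:-1] + img[:-2, 2:]
    )
    return float(np.mean(gx ** 2 + gy ** 2))


def clearest(images, k=5):
    """Return the indices of the k sharpest grayscale images, sharpest first."""
    scores = [tenengrad(im) for im in images]
    return sorted(range(len(images)), key=lambda i: scores[i], reverse=True)[:k]
```

With k=5, `clearest` would keep five images per view, which matches the ratio of 53350 images to 10670 view groups stated above.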


This dataset consists of the training and evaluation datasets for the LiDAR-based maritime environment perception presented in our journal publication "Maritime Environment Perception based on Deep Learning." Within the datasets, LiDAR raw data are processed using Deep Neural Networks (DNN). In the training dataset, we introduce the method for generating training data in Gazebo simulation. In the evaluation datasets, we provide the real-world tests conducted by two research vessels.




Surveillance video captured with a multi-intensity infrared illuminator.

GT (ground truth): bounding boxes of 'person' in channels 2, 4, and 6, following the Pascal VOC format.
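Pascal VOC annotations are XML files in which each `object` element holds a class `name` and a `bndbox` with `xmin`, `ymin`, `xmax`, and `ymax`. A minimal sketch of reading the 'person' boxes with Python's standard library follows. The sample annotation, its file name, and the function name `read_person_boxes` are made up for illustration and are not taken from the dataset.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in Pascal VOC layout (not from the dataset).
SAMPLE_XML = """<annotation>
  <filename>frame_0001.jpg</filename>
  <object>
    <name>person</name>
    <bndbox><xmin>48</xmin><ymin>30</ymin><xmax>112</xmax><ymax>210</ymax></bndbox>
  </object>
</annotation>"""


def read_person_boxes(xml_text):
    """Return a list of (xmin, ymin, xmax, ymax) tuples for 'person' objects."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        # Skip any object that is not labeled 'person'.
        if obj.findtext("name") != "person":
            continue
        bndbox = obj.find("bndbox")
        # Coordinates may be stored as integers or floats; normalize to int.
        boxes.append(
            tuple(
                int(float(bndbox.findtext(tag)))
                for tag in ("xmin", "ymin", "xmax", "ymax")
            )
        )
    return boxes


print(read_person_boxes(SAMPLE_XML))
```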


The accompanying dataset for the CVSports 2021 paper: DeepDarts: Modeling Keypoints as Objects for Automatic Scoring in Darts using a Single Camera

Paper Abstract: