Computer Vision

3D-COCO is a dataset composed of MS COCO images with 3D models aligned to each instance. It was designed to support computer vision tasks such as 3D reconstruction and image detection that can be configured with textual, 2D image, and 3D CAD model queries.

3D-COCO is an extension of the original MS-COCO dataset that provides 3D models and 2D-3D alignment annotations. We extend the existing MS-COCO dataset with 28K 3D models collected from ShapeNet and Objaverse. Using an IoU-based method, we match each MS-COCO annotation with the best-fitting 3D models to provide a 2D-3D alignment.
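As a rough illustration of IoU-based matching, the sketch below scores a set of candidate binary masks against an annotation mask and returns the best one. The description does not say what is compared, so this assumes each candidate 3D model has already been rendered to a 2D silhouette of the same size as the annotation mask. The function names `mask_iou` and `best_match` are hypothetical and are not taken from 3D-COCO.

```python
def mask_iou(mask_a, mask_b):
    """Intersection over union of two equally sized binary masks (nested lists of 0/1)."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            if pixel_a and pixel_b:
                intersection += 1
            if pixel_a or pixel_b:
                union += 1
    # Two empty masks have no overlap to measure; report 0.0 rather than divide by zero.
    return intersection / union if union else 0.0


def best_match(annotation_mask, candidate_masks):
    """Return (index, iou) of the candidate silhouette with the highest IoU.

    Assumption: candidate_masks holds one pre-rendered silhouette per 3D model.
    """
    if not candidate_masks:
        raise ValueError("candidate_masks must not be empty")
    scores = [mask_iou(annotation_mask, mask) for mask in candidate_masks]
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return best_index, scores[best_index]


# Toy 2x2 example: the second candidate matches the annotation exactly.
annotation = [[1, 1], [0, 0]]
candidates = [
    [[1, 0], [0, 0]],
    [[1, 1], [0, 0]],
    [[0, 0], [1, 1]],
]
index, score = best_match(annotation, candidates)
```

A real pipeline would probably keep several top-scoring models per annotation and use array operations instead of Python loops. The ranking logic stays the same.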

Categories:
22 Views

Augmented reality (AR) is being adopted ever more widely in tech-driven industries and sectors of society, such as medicine, gaming, flight simulation, education, interior design and modelling, entertainment, construction, tourism, repair and maintenance, public safety, agriculture, and quantum computing. However, ensuring smooth and intuitive interactions with augmented objects is challenging. It requires practical performance evaluation and optimisation models to assess and improve users' experiences as they engage with AR-enhanced devices or systems.

Categories:
30 Views

This dataset was collected from real-world recycling plants, primarily consisting of crushed glass from disassembled display devices. The dataset contains images of flat glass mixed with solid glass, colored glass, plastic film, and aluminum foil. The colored glass originated from frame areas, while the aluminum foil came from cable shielding materials. Additional objects, such as solid glass and plastic films, were sourced from other recycled materials like glass bottles and packaging.

Categories:
111 Views

      


CrackAirport features images containing unique elements such as aircraft, T-hangars, vegetation, airport markings and signs, as well as evidence of previous maintenance. The dataset was captured using a Sony ILCE-7RM4A camera mounted on a drone flying at an altitude of 100 feet AGL. The imagery was sourced from various local airports in Tennessee and includes common pavement distresses and environmental patterns typical of airport surfaces. The images were annotated and then cropped into 512x512-pixel tiles for training.
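The cropping step can be sketched as a grid of tile coordinates. The description does not say how image borders or overlap were handled, so this assumes non-overlapping tiles and discards partial tiles at the right and bottom edges. The function name `tile_boxes` and the image sizes in the example are hypothetical.

```python
def tile_boxes(width, height, tile=512):
    """List (left, top, right, bottom) boxes for non-overlapping tile x tile crops.

    Assumption: partial tiles at the right and bottom borders are dropped.
    """
    boxes = []
    for top in range(0, height - tile + 1, tile):
        for left in range(0, width - tile + 1, tile):
            boxes.append((left, top, left + tile, top + tile))
    return boxes


# A hypothetical 1300 x 600 image yields two full tiles side by side.
example_boxes = tile_boxes(1300, 600)
```

The box order (left, top, right, bottom) matches what Pillow's `Image.crop` expects, so each tuple could be passed to it directly. The same boxes would need to be applied to the annotation masks so that images and labels stay aligned.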

Categories:
530 Views

The ultrasound data comprise two sets of neck ultrasound videos from ten healthy subjects, collected at the Ultrasound Department of Longhua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine. For each subject, there are two groups of video files covering LSCM, LSSCap, RSCM, and RSSCap. The videos are in AVI format.

The MRI training data were sourced from three hospitals: Longhua Hospital, Shanghai University of Traditional Chinese Medicine; Huadong Hospital, Fudan University; and Shenzhen Traditional Chinese Medicine Hospital.

Categories:
517 Views

These are tight pedestrian masks for the thermal images present in the KAIST Multispectral pedestrian dataset, available at https://soonminhwang.github.io/rgbt-ped-detection/

Both the thermal images and the original annotations are part of the parent dataset. Using the annotation files provided by the authors, we generate binary segmentation masks for the pedestrians with the Segment Anything Model from Meta.

Categories:
191 Views

The dataset is a large-scale, diverse collection of high-resolution RGB images containing labeled wheat heads. Assembled through a collaborative effort of nine research institutes from seven countries, it encompasses a wide range of genotypes, growth stages, and pedoclimatic conditions. Its primary goal is to facilitate the development of robust and accurate wheat head detection models for applications in precision phenotyping and crop management.

Categories:
156 Views

We introduce two novel datasets for cell motility and wound healing research: the Wound Healing Assay Dataset (WHAD) and the Cell Adhesion and Motility Assay Dataset (CAMAD). WHAD comprises time-lapse phase-contrast images of wound healing assays using genetically modified MCF10A and MCF7 cells, while CAMAD includes MDA-MB-231 and RAW264.7 cells cultured on various substrates. These datasets offer diverse experimental conditions, comprehensive annotations, and high-quality imaging data, addressing gaps in existing resources.

Categories:
575 Views

Video anomaly detection (VAD) is a challenging task that aims to recognize anomalies in video frames. Existing large-scale VAD research focuses primarily on road traffic and human activity scenes. Industrial scenes often contain a variety of unpredictable anomalies, and VAD methods can play a significant role there. However, applicable datasets and methods tailored to industrial production scenarios are lacking because of privacy and security concerns.

Categories:
170 Views
