Machine Learning
The fecal microscopic dataset is a set of fecal microscopic images intended for object detection tasks. The data were collected at the Sixth People’s Hospital of Chengdu (Sichuan Province, China). The samples were diluted, stirred, and placed, then imaged with a microscopic imaging system. For each view of each sample, the five clearest images were selected using the Tenengrad definition algorithm. The dataset includes 10670 groups of views with 53350 JPG images. The image resolution is 1200×1600. There are four categories: RBCs, WBCs, molds, and pyocytes.
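The selection of the clearest images by Tenengrad score can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes grayscale input arrays, uses the common unthresholded variant of Tenengrad (mean squared Sobel gradient magnitude), and the function names `tenengrad` and `clearest_indices` are hypothetical.

```python
import numpy as np


def tenengrad(image: np.ndarray) -> float:
    """Return the mean squared Sobel gradient magnitude of a 2-D grayscale image.

    Assumption: no gradient threshold is applied; some Tenengrad variants
    only count gradients above a chosen threshold.
    """
    img = image.astype(np.float64)
    # Replicate border pixels so the gradient maps keep the input shape.
    p = np.pad(img, 1, mode="edge")
    # Horizontal Sobel response: weighted right column minus weighted left column.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) - (
        p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2]
    )
    # Vertical Sobel response: weighted bottom row minus weighted top row.
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) - (
        p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:]
    )
    return float(np.mean(gx ** 2 + gy ** 2))


def clearest_indices(images, k=5):
    """Return the indices of the k images with the highest Tenengrad score."""
    scores = [tenengrad(im) for im in images]
    order = sorted(range(len(images)), key=lambda i: scores[i], reverse=True)
    return order[:k]
```

Under these assumptions, calling `clearest_indices(frames, k=5)` on the frames captured for one view would return the positions of the five sharpest frames. Color images would need to be converted to grayscale first.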
- Categories:
Today, cameras are installed everywhere: in streets, in vehicles, and in public areas. However, the images they capture still need to be analyzed and their information extracted, particularly in autonomous vehicles and in smart applications developed to guide tourists. A large dataset of scene text images is therefore an important and challenging requirement for extracting textual information from natural images. It is the input to any computer vision system.
- Categories:
This dataset consists of the training and evaluation datasets for the LiDAR-based maritime environment perception presented in our journal publication "Maritime Environment Perception based on Deep Learning." Within the datasets, raw LiDAR data are processed using deep neural networks (DNNs). In the training dataset, we introduce the method for generating training data in Gazebo simulation. In the evaluation datasets, we provide data from real-world tests conducted with two research vessels.
- Categories:
This dataset contains a collection of videos consisting of satellite imagery augmented with 3D ship models, accompanied by the ships' corresponding AIS data. The dataset is intended for detecting dark ships, which are sea vessels acting maliciously, often while spoofing their AIS data. Multiple datasets of satellite ship imagery exist; however, this dataset has the advantage of including each ship's corresponding AIS data. The simulated ships exhibit both normal and anomalous behavior, and the anomalous behavior may be benign or malicious.
- Categories:
Several commonly used object detection datasets exist, including COCO (with multiple versions) and ImageNet, which contain large sets of annotations for 80 and 1000 objects (i.e., classes), respectively. However, very few datasets are available that cover specific objects identified by visually impaired people (VIP), such as wheel bins, trash bags, e-scooters, advertising boards, and bollards. Furthermore, annotations for these objects are not available in existing sources.
- Categories:
The greatest challenge in machine learning problems is selecting suitable techniques and resources, such as tools and datasets. Despite millions of speakers around the globe and a rich literary history of more than a thousand years, it is hard to find computational linguistic work on the Punjabi Shahmukhi script, a member of the low-resource family of languages written in context-specific Perso-Arabic scripts. The selection of the best algorithm for a machine learning problem depends heavily on the availability of a dataset for that specific task.
- Categories:
This dataset is used in the experiment of the paper "A Data Embedding Scheme for Efficient Program Behavior Modeling with Neural Networks," accepted by IEEE Transactions on Emerging Topics in Computational Intelligence (TETCI). System calls and their relevant branch sequences are contained in the tar.gz file. For a detailed description, please refer to the paper.
- Categories:
The ability to estimate the probability that a drug will receive approval in clinical trials provides natural advantages for optimizing pharmaceutical research workflows. The success rates of clinical trials have deep implications for costs and development duration, and are under pressure due to stringent regulatory approval processes. We propose a machine learning approach that can predict the outcome of a trial with reliable accuracy, using biological activities, physico-chemical properties of the compounds, target-related features, and NLP-based compound representations.
- Categories:
Tweets related to 10 different types of disasters were monitored from 28 September 2021 to 6 October 2021. A total of 67528 rows containing 16 fields were extracted using Microsoft's artificial intelligence and natural language processing services.
- Categories: