Machine Learning

Lemon Leaf Disease Dataset

The Lemon Leaf Disease Dataset (LLDD) is a high-quality image dataset designed for training and evaluating machine learning models for lemon leaf disease classification. The dataset contains 9 classes of images of healthy and diseased lemon leaves: Anthracnose, Bacterial Blight, Citrus Canker, Curl Virus, Deficiency Leaf, Dry Leaf, Healthy Leaf, Sooty Mould, and Spider Mites. This makes it suitable for tasks such as plant disease instance segmentation, detection, and image classification, and for deep learning applications in agriculture.
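For classification work, the nine class names usually need to be mapped to integer labels. The sketch below shows one way to do that in Python. The class names come from the description above, but the index order and the helper name `encode_label` are assumptions for illustration, not part of the dataset.

```python
# Class names as given in the dataset description.
# The index order is an assumption; the dataset may use a different one.
LLDD_CLASSES = [
    "Anthracnose",
    "Bacterial Blight",
    "Citrus Canker",
    "Curl Virus",
    "Deficiency Leaf",
    "Dry Leaf",
    "Healthy Leaf",
    "Sooty Mould",
    "Spider Mites",
]

# Lookup tables in both directions.
CLASS_TO_INDEX = {name: i for i, name in enumerate(LLDD_CLASSES)}
INDEX_TO_CLASS = {i: name for name, i in CLASS_TO_INDEX.items()}


def encode_label(name: str) -> int:
    """Return the integer label for a class name (hypothetical helper)."""
    return CLASS_TO_INDEX[name]
```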

Categories: Agriculture, Artificial Intelligence, Machine Learning

4 Views

AI Training Data for OCT-SLO Self Calibration and Automation

The attached image dataset, acquired with a combined OCT-SLO system, is used to train AI models that identify image features and maximize data quality by adjusting the MZI reference arm, the PMT voltage of the liquid lens, and the location of the object. Why adjustment is needed is explained below:

Categories: Artificial Intelligence, Signal Processing, Machine Learning, Image Processing, Biomedical and Health Sciences, Computer Vision

104 Views

Solar power dataset

This dataset consists of meteorological and environmental data collected in Riyadh, Saudi Arabia, over multiple years. The variables include solar radiation, temperature (both maximum and minimum in Celsius and Fahrenheit), precipitation, vapor pressure, and snow water equivalent, among others. The data spans from 2010 to the present, providing insights into solar radiation patterns, daily temperature fluctuations, and weather-related factors that can impact solar power generation. Specifically, the dataset contains the following columns:

Categories: Artificial Intelligence, Machine Learning

27 Views

SNMDat2.0

SNMDat2.0 is a comprehensive multimodal dataset, expanded from the unimodal TwiBot-20, designed for Twitter social bot detection. Specifically, we add 274,587 profile images and profile background images, 86,498 tweet images, and 49,549 tweet videos to the original 229,580 Twitter users, 227,979 follow relationships, and 33,488,192 tweet texts.

Categories: Artificial Intelligence, Machine Learning, Security, Social Sciences

11 Views

Bibliometric Scopus data for Learning Analytics

This dataset provides bibliometric information of academic publications related to learning analytics and decision sciences, sourced from Scopus. It includes metadata for a wide range of papers, including author details, titles, publication years, journal sources, and document types. Key columns in the dataset include author names, IDs, titles of publications, source titles (journals or conferences), document types, publication stage, and open access status.

Categories: Education and Learning Technologies, Machine Learning

41 Views

GeoLife Dataset

The rapid growth of spatiotemporal data makes trajectory modeling critical for extracting patterns from large-scale, dynamic mobility datasets. However, many existing methods face challenges with scalability and computational inefficiency. To address these challenges, we propose VecLSTM—a vectorized Long Short-Term Memory (LSTM) framework designed to improve both predictive accuracy and processing performance. VecLSTM introduces a novel dynamic vectorization layer that converts raw GPS trajectories into structured vector embeddings, enabling efficient storage, retrieval, and preprocessing.
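To illustrate the general idea of turning a variable-length GPS trajectory into a fixed-length vector, here is a minimal Python sketch. It is not the VecLSTM vectorization layer, whose details are not given here. It swaps in plain linear resampling, and the function name `trajectory_to_vector` and its `length` parameter are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in degrees


def trajectory_to_vector(points: List[Point], length: int = 4) -> List[float]:
    """Resample a trajectory to `length` points by linear interpolation
    over the point index, then flatten to [lat0, lon0, lat1, lon1, ...].

    Illustrative stand-in only; not the method described in the abstract.
    """
    if not points:
        raise ValueError("trajectory must contain at least one point")
    if length < 1:
        raise ValueError("length must be at least 1")
    # Degenerate cases: repeat the first point.
    if len(points) == 1 or length == 1:
        return [coord for _ in range(length) for coord in points[0]]

    last = len(points) - 1
    vector: List[float] = []
    for k in range(length):
        # Fractional position of the k-th sample along the original index.
        pos = k * last / (length - 1)
        i = min(int(pos), last - 1)
        frac = pos - i
        lat = points[i][0] + frac * (points[i + 1][0] - points[i][0])
        lon = points[i][1] + frac * (points[i + 1][1] - points[i][1])
        vector.extend([lat, lon])
    return vector
```

The output always has `2 * length` values, so trajectories of different lengths can be stacked into one array before being passed to a sequence model.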

Categories: Machine Learning

74 Views

Combined rumor and non-rumor dataset

This dataset, comprising 103,806 text entries, is a comprehensive resource for rumor detection on social media, constructed by merging benchmark collections including PHEME, LIAR Fake News, Twitter15, Twitter16, and ISOT Fake News. It features a binary classification schema (47% rumor, 53% non-rumor) and integrates original and adversarially augmented samples to enhance model robustness.
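The stated class split can be turned into approximate per-class counts, for example when choosing class weights for training. The short Python sketch below does this from the figures quoted above. Because the percentages are rounded, the counts are estimates, and the variable names are assumptions.

```python
TOTAL_ENTRIES = 103_806  # from the dataset description
RUMOR_SHARE = 0.47       # rounded percentage from the description

# Estimated counts; the true values may differ slightly due to rounding.
rumor_count = round(TOTAL_ENTRIES * RUMOR_SHARE)
counts = {
    "rumor": rumor_count,
    "non-rumor": TOTAL_ENTRIES - rumor_count,
}

# Inverse-frequency class weights, a common choice for mildly imbalanced data.
class_weights = {
    label: TOTAL_ENTRIES / (len(counts) * n) for label, n in counts.items()
}
```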

Categories: Machine Learning

29 Views

Forbes Billionaire dataset

The Forbes 2022 Billionaires List dataset contains information about the world's wealthiest individuals, including their net worth, industry, country, and key business ventures. The dataset provides structured details such as rankings, company associations, and financial status, making it useful for various NLP tasks like table-to-text generation, entity recognition, and financial analysis.

Categories: Artificial Intelligence, IEEEXtreme, Machine Learning

32 Views

Bangla Social Media Cyberbullying Dataset

Cyberbullying is a growing problem on social media. This dataset helps detect cyberbullying in Bangla by collecting comments from YouTube, Facebook, Instagram, and TikTok. The data is categorized into two types: bullying and non-bullying. It includes various abusive and harmful texts, along with normal conversations. This dataset will help researchers and developers train AI models to automatically identify cyberbullying in Bangla text. The goal is to create better tools to keep online spaces safe for Bangla-speaking users.

Categories: Machine Learning

204 Views

Course Rating

This dataset comprises a comprehensive collection of educational courses, each characterized by several key attributes: interests, title, description, category, level, past experience, and rating.

Categories: Artificial Intelligence, Machine Learning

10 Views
