Machine Learning

SNMDat2.0

SNMDat2.0 is a comprehensive multimodal dataset for Twitter social bot detection, expanded from the unimodal TwiBot-20. Specifically, we add 274,587 profile images and profile background images, 86,498 tweet images, and 49,549 tweet videos to the original 229,580 Twitter users, 227,979 follow relationships, and 33,488,192 tweet texts.

Categories: Artificial Intelligence, Machine Learning, Security, Social Sciences

5 Views

Bibliometric Scopus data for Learning Analytics

This dataset provides bibliometric information on academic publications related to learning analytics and decision sciences, sourced from Scopus. It includes metadata for a wide range of papers. Key columns include author names and IDs, publication titles, publication years, source titles (journals or conferences), document types, publication stage, and open access status.

Categories: Education and Learning Technologies, Machine Learning

32 Views

GeoLife Dataset

The rapid growth of spatiotemporal data makes trajectory modeling critical for extracting patterns from large-scale, dynamic mobility datasets. However, many existing methods face challenges with scalability and computational inefficiency. To address these challenges, we propose VecLSTM—a vectorized Long Short-Term Memory (LSTM) framework designed to improve both predictive accuracy and processing performance. VecLSTM introduces a novel dynamic vectorization layer that converts raw GPS trajectories into structured vector embeddings, enabling efficient storage, retrieval, and preprocessing.

Categories: Machine Learning

46 Views

Combined rumor and non-rumor dataset

This dataset, comprising 103,806 text entries, is a comprehensive resource for rumor detection on social media, constructed by merging benchmark collections including PHEME, LIAR Fake News, Twitter15, Twitter16, and ISOT Fake News. It features a binary classification schema (47% rumor, 53% non-rumor) and integrates original and adversarially augmented samples to enhance model robustness.

Categories: Machine Learning

20 Views

Forbes Billionaire dataset

The Forbes 2022 Billionaires List dataset contains information about the world's wealthiest individuals, including their net worth, industry, country, and key business ventures. The dataset provides structured details such as rankings, company associations, and financial status, making it useful for various NLP tasks like table-to-text generation, entity recognition, and financial analysis.

Categories: Artificial Intelligence, IEEEXtreme, Machine Learning

25 Views

Bangla Social Media Cyberbullying Dataset

Cyberbullying is a growing problem on social media. This dataset helps detect cyberbullying in Bangla by collecting comments from YouTube, Facebook, Instagram, and TikTok. The data is categorized into two types: bullying and non-bullying. It includes various abusive and harmful texts, along with normal conversations. This dataset will help researchers and developers train AI models to automatically identify cyberbullying in Bangla text. The goal is to create better tools to keep online spaces safe for Bangla-speaking users.

Categories: Machine Learning

165 Views

Course Rating

This dataset comprises a comprehensive collection of educational courses, each characterized by several key attributes: interests, title, description, category, level, past experience, and rating.

Categories: Artificial Intelligence, Machine Learning

9 Views

Feni Solar-Wind Data

This dataset contains high-resolution solar and wind measurement data collected from the Feni region, Bangladesh, spanning from 2017 to 2019. Logged at a 1-minute interval, the dataset provides a comprehensive record of atmospheric and meteorological conditions, essential for renewable energy analysis, climatological studies, and resource assessment.

Categories: Machine Learning, Power and Energy, Weather, Sensors

123 Views

Payra Wind Data

This dataset contains high-resolution wind measurement data collected from 22 channels at varying heights, providing valuable insights for wind energy assessment, atmospheric research, and meteorological studies. The dataset includes wind speed, wind direction, and environmental parameters measured at multiple altitudes ranging from 10m to 120m. Each channel records parameters such as average wind speed, standard deviation, minimum and maximum values, gust speed, and wind vane direction. Additionally, atmospheric parameters such as temperature, relative humidity, and pressure are included.

Categories: Machine Learning, Power and Energy, Electric Utility, Sensors

121 Views

wind speed data

This meteorological data is provided by the Inner Mongolia Meteorological Bureau and includes data from three stations.

Categories: Artificial Intelligence, Machine Learning

23 Views
