Data Mining
Please cite the following paper when using this dataset:
Vanessa Su and Nirmalya Thakur, “COVID-19 on YouTube: A Data-Driven Analysis of Sentiment, Toxicity, and Content Recommendations”, Proceedings of the IEEE 15th Annual Computing and Communication Workshop and Conference 2025, Las Vegas, USA, Jan 06-08, 2025 (Paper accepted for publication, Preprint: https://arxiv.org/abs/2412.17180).
To download this dataset without purchasing an IEEE Dataport subscription, please visit: https://zenodo.org/records/13896353
To download the dataset without purchasing an IEEE Dataport subscription, please visit: https://zenodo.org/records/13738598
Combinatorial interaction strategies have recently been widely adopted as black-box strategies for testing software and hardware. This paper discusses a novel adoption of a combinatorial interaction strategy to generate a sparse combinatorial data table (SCDT) for machine learning. Unlike test data generation strategies, in which the t-way tuples are synthesized into a test case, the proposed SCDT requires analyzing instances against their corresponding tuples to generate a systematic learning dataset.
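As a rough illustration of what a t-way tuple is, and not the authors' SCDT procedure, the sketch below enumerates every combination of t (parameter, value) pairs covered by a single instance. The function name `t_way_tuples` and the example parameters are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def t_way_tuples(instance, t):
    """Return all t-way tuples covered by one instance.

    `instance` maps parameter names to values (hypothetical layout);
    each tuple is a combination of t (parameter, value) pairs.
    """
    # Sort so the output order does not depend on dict insertion order.
    items = sorted(instance.items())
    return list(combinations(items, t))


# Hypothetical instance with three parameters.
instance = {"os": "linux", "browser": "firefox", "net": "wifi"}
pairs = t_way_tuples(instance, 2)
for pair in pairs:
    print(pair)
```

With three parameters and t = 2, the instance covers three pairwise tuples; analyzing instances against such tuples is the general idea the abstract alludes to.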
To access this dataset without purchasing an IEEE Dataport subscription, please visit: https://zenodo.org/doi/10.5281/zenodo.11711229
The alignment between implemented database application systems and their data specification standard description files significantly affects how accurately enterprises can estimate their data assets from data standard files. In this study, we propose an automated approach for discovering and aligning the consistent fields, greatly reducing the cost of manual evaluation. We frame the field alignment problem as an entity matching computation on two distinct graphs, constructed respectively from the database of the application systems and from its data specification standard.
Traffic data set
Rapid urban development over the past decade requires reasonable and realistic solutions for transport, building infrastructure, natural conditions, and personal satisfaction in smart cities. This paper presents and explores predictive energy consumption models based on data-mining techniques for a smart small-scale steel industry in South Korea. Energy consumption data is collected using IoT-based systems and used for prediction.
India is known for its highly disciplined foreign policies, its strategic location, and its vibrant and massive diaspora. India envisages enhancing its scope of cooperation and trade and widening its sphere of relations with the Pacific. As a result, the world is witnessing the rise of Indo-Pacific ties. Before the 1980s, the Atlantic was called the keystone of the universe, but a radical shift to the east is now reflected in the term “Indo-Pacific”.
This dataset was extracted from Twitter using keywords related to Dilma Rousseff and Aécio Neves, who were the candidates in the second round of the 2014 presidential election in Brazil. The dataset contains texts in Portuguese and the corresponding sentiment classifications produced by the techniques described in the article published in the 2018 IEEE International Conference on Data Mining Workshops - ICDMW (https://ieeexplore.ieee.org/abstract/document/8637504).