Data Mining

This dataset comprises 500,153 Instagram posts related to COVID-19, published between January 2020 and September 2024. The posts span 161 unique languages and include a total of 535,021 distinct hashtags. Following the creation of the dataset, a multilingual sentiment analysis was conducted. This analysis classified each post into one of three categories: positive, negative, or neutral. The sentiment classification is provided as an additional attribute within the dataset, offering a comprehensive overview of the emotional tone expressed across the posts.


Combinatorial interaction strategies have recently been widely used as black-box strategies for testing software and hardware. This paper discusses a novel application of a combinatorial interaction strategy to generate a sparse combinatorial data table (SCDT) for machine learning. Unlike test data generation strategies, in which t-way tuples are synthesized into a test case, the proposed SCDT requires analyzing instances against their corresponding tuples to generate a systematic learning dataset.
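As a rough illustration of what "t-way tuples" means here, the sketch below enumerates every combination of t parameter values and lists the tuples contained in a single instance. It is not the SCDT procedure from the paper. The parameter names, their values, and the helper functions `t_way_tuples` and `covered_tuples` are all hypothetical and chosen only for the example.

```python
from itertools import combinations, product


def t_way_tuples(parameters, t):
    """Enumerate every t-way tuple as a tuple of (parameter, value) pairs."""
    tuples = []
    for names in combinations(sorted(parameters), t):
        # Cartesian product of the value lists of the chosen t parameters.
        for values in product(*(parameters[n] for n in names)):
            tuples.append(tuple(zip(names, values)))
    return tuples


def covered_tuples(instance, t):
    """List the t-way tuples contained in one instance (a dict)."""
    return [
        tuple((n, instance[n]) for n in names)
        for names in combinations(sorted(instance), t)
    ]


# Hypothetical parameters, not taken from the paper.
parameters = {
    "os": ["linux", "windows"],
    "browser": ["firefox", "chrome"],
    "net": ["wifi", "lan"],
}

all_pairs = t_way_tuples(parameters, 2)
instance = {"os": "linux", "browser": "chrome", "net": "wifi"}
pairs_in_instance = covered_tuples(instance, 2)
```

With three two-valued parameters and t = 2 there are three parameter pairs with four value combinations each, and any single instance contains one tuple per parameter pair.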


The alignment between implemented database application systems and their data specification standard description files significantly affects how accurately enterprises can estimate their data assets from data standard files. In this study, we propose an automated approach for discovering and aligning consistent fields, greatly reducing the cost of manual evaluation. We frame the field alignment problem as an entity matching computation on two distinct graphs, constructed from the database of the application systems and from its data specification standard, respectively.


Traffic data set


Rapid urban development over the past decade requires reasonable and realistic solutions for transport, building infrastructure, natural conditions, and personal satisfaction in smart cities. This paper presents and explores predictive energy consumption models based on data-mining techniques for a smart small-scale steel industry in South Korea. Energy consumption data is collected using IoT-based systems and used for prediction.


India is known for its highly disciplined foreign policies, its strategic location, and its vibrant and massive diaspora. India envisages enhancing its scope of cooperation and trade and widening its sphere of relations with the Pacific. As a result, the world is witnessing the rise of Indo-Pacific ties. Before the 1980s, the Atlantic was called the keystone of the universe, but the term “Indo-Pacific” now signals a radical shift to the east.


This dataset was extracted from Twitter using keywords related to Dilma Rousseff and Aécio Neves, who were the candidates in the second round of the 2014 presidential election in Brazil. The dataset contains texts in Portuguese and the corresponding sentiment classifications produced by the techniques described in the article published in the 2018 IEEE International Conference on Data Mining Workshops - ICDMW (https://ieeexplore.ieee.org/abstract/document/8637504).

 


The depth to subsurface anomalies has been the primary interest in all applications of magnetic methods of geophysical prospecting. The depth to the subsurface geologic features of interest is more valuable than any other property in a correct subsurface geologic structural interpretation.

