Machine Learning

The "Multi-Label Extremism and Jihadism Classification Tweets Dataset" is a multilingual resource designed for multi-label classification of toxic online behavior, including extremism and jihadism. Each comment is annotated with labels indicating the presence of various traits: toxic, severe toxic, obscenity, threats, insults, identity hate, and jihadi content.
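
As a minimal sketch of what multi-label annotation means here, each comment can be represented as a binary indicator vector with one position per trait. The label identifiers and the `encode_labels` helper below are hypothetical, chosen for illustration; the dataset's actual column names may differ.

```python
# Hypothetical label identifiers derived from the traits listed above;
# the dataset's real column names are an assumption.
LABELS = (
    "toxic",
    "severe_toxic",
    "obscene",
    "threat",
    "insult",
    "identity_hate",
    "jihadi",
)


def encode_labels(active: set[str]) -> list[int]:
    """Return a 0/1 indicator vector with one entry per label in LABELS."""
    unknown = active - set(LABELS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [int(label in active) for label in LABELS]


# A comment may carry several labels at once, or none at all.
vector = encode_labels({"toxic", "jihadi"})
```

Unlike single-label classification, the vector can contain any number of ones, so a model typically predicts each position independently.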

The Protection of Children from Sexual Offences (POCSO) Act is an important piece of legislation enacted in India in 2012. It aims to safeguard children from sexual exploitation through various enforcement and legal redressal mechanisms. This dataset was scraped from eCourts India Services using a Python script built on Selenium. We mined judgements of the apex court and the high courts that mention the POCSO Act and its respective sections. The corpus covers POCSO judgements from 2012 to 2020, scraped in chronological order.

Measuring and assessing the intelligence level of children and adolescents is crucial for monitoring their developmental progress, identifying intellectual disabilities, and implementing early interventions. To date, there is no simplified digital tool specifically designed to evaluate whether intelligence is normal or abnormal at these ages.

The metal wings of aircraft and the feathered wings of nature often compete for the same airspace. At times, this leads to collisions that may damage aircraft and sometimes injure or even kill passengers, crew, and wildlife. Wildlife strikes result from various factors, including species migration, weather, time of day, phase of flight, and the region of the flight’s route. However, some questions remain to be addressed with solid observational data.

This is an optical flow dataset that covers multiple independent moving object samples under various adverse weather conditions.

This dataset contains two files, Magazine_Subscriptions.csv and Gift_Card.csv, each of which contains a column of item IDs, a column of user (reviewer) IDs, a column of user ratings, and a column of review times. The data was originally collected by Ni, Jianmo, Jiacheng Li, and Julian McAuley in “Justifying Recommendations Using Distantly-Labeled Reviews and Fine-Grained Aspects,” Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019.
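
The four-column layout described above could be read with Python's standard `csv` module, as in the rough sketch below. The column order, the absence of a header row, and the integer (Unix timestamp) review time are assumptions, and the `Rating` type, `read_ratings` helper, and sample rows are hypothetical.

```python
import csv
import io
from typing import Iterable, NamedTuple


class Rating(NamedTuple):
    item_id: str
    user_id: str
    rating: float
    review_time: int  # assumed to be an integer Unix timestamp


def read_ratings(lines: Iterable[str]) -> list[Rating]:
    """Parse rows of item id, user id, rating, review time (assumed order, no header)."""
    return [
        Rating(item, user, float(score), int(ts))
        for item, user, score, ts in csv.reader(lines)
    ]


# Made-up sample rows; a real file would be passed as open(path, newline="").
sample = io.StringIO("B0001,U1,5.0,1341100800\nB0001,U2,3.0,1341187200\n")
ratings = read_ratings(sample)
mean_rating = sum(r.rating for r in ratings) / len(ratings)
```

If the files do include a header row, `csv.DictReader` would be the more natural choice.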

The dataset was derived from an optimization task for a frequency-tunable high-power vacuum electron device (HPVED) using a simple genetic algorithm (SGA). The device topology has 44 optimization parameters and 2 tuning parameters. The two tuning parameters have 6 and 7 discrete adjustment values, respectively. The population size is set to 54, and each individual requires 42 simulations to evaluate its tunability.
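
The stated 42 simulations per individual is consistent with simulating every combination of the two tuning settings (6 × 7). The sketch below works through that arithmetic under this assumption; the per-generation total is derived here and is not given in the source.

```python
from itertools import product

# Figures taken from the description above.
TUNING_VALUES = (6, 7)   # discrete adjustment values per tuning parameter
POPULATION_SIZE = 54

# Assumption: each individual is simulated once per combination of tuning settings.
tuning_grid = list(product(*(range(n) for n in TUNING_VALUES)))
sims_per_individual = len(tuning_grid)

# Derived figure: simulations needed to evaluate one full generation.
sims_per_generation = POPULATION_SIZE * sims_per_individual
```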

Crowdfunding campaigns frequently fail to reach their funding goals, posing a significant challenge for project creators. To address this issue and empower future crowdfunding stakeholders, accurate prediction models are essential. This study evaluates the relative significance of diverse modalities (visual, audio, and text) in predicting campaign success.

This dataset contains precomputed MS-COCO and Flickr30K Faster R-CNN image features, which are all the data needed for reproducing the experiments in "Stacked Cross Attention for Image-Text Matching", our ECCV 2018 paper. We use splits produced by Andrej Karpathy.

Mulan, a SourceForge.net multi-target dataset available at www.openml.org. Despite the numerous interesting applications of multi-target regression (MTR), there are only a few publicly available datasets of this kind, perhaps because most applications are industrial, and most experimental evaluations of MTR methods are based on a limited number of datasets. For this study, considerable effort went into composing a large and diverse collection of benchmark MTR datasets.

