Computational Intelligence

Federated Learning: example dataset (FMCW 122GHz radars)

Database for FMCW THz radars (HR workspace) and sample code for federated learning

Categories:: Communications
Computational Intelligence
Sensors
Signal Processing

1496 Views

Benchmarking Symbolic Regression and Local Linear Modelling Methods for Reinforcement Learning

Reinforcement Learning (RL) agents can learn to control a nonlinear system without using a model of the system. However, having a model brings benefits, mainly in terms of a reduced number of unsuccessful trials before achieving acceptable control performance. Several modelling approaches have been used in the RL domain, such as neural networks, local linear regression, or Gaussian processes. In this article, we focus on a technique that has not been used much so far:\ symbolic regression, based on genetic programming.

Categories:: Computational Intelligence
Nonlinear signal processing

292 Views

Synthetic Event Logs for Concept Drift Detection

Real life business processes change over time, in both planned and unexpected ways. These changes over time are called concept drifts and its detection is a big challenge in process mining since the inherent complexity of the data makes difficult distinguishing between a change and an anomalous execution. The following logs were generated synthetically in order to prove the quality of different concept drift detection algorithms.

Categories:: Computational Intelligence
Other

950 Views

Deduplication Index for Big Code Datasets

Code duplicates in large code corpora have adverse effects on the evaluation and use of machine learning models that rely on them. Most existing corpora suffer from this problem to some extent. This dataset contains a "duplication" index for some of the existing corpora in Big Code research. The method for collecting this dataset is described in "The Adverse Effects of Code Duplication in Machine Learning Models of Code" by Allamanis [ArXiV, to appear in SPLASH 2019].

Categories:: Computational Intelligence
Other

676 Views

Computer Network Events

This dataset contains a sequence of network events extracted from a commercial network monitoring platform, Spectrum, by CA. These events, which are categorized by their severity, cover a wide range of events, from a link state change up to critical usages of CPU by certain devices. Regarding the layers they cover, they are focused on the physical, network and application layer. As such, the whole set gives a complete overview of the network’s general state.

Categories:: Communications
Computational Intelligence

890 Views

Data of a Social Network Analysis using OntoSNAQA and other conventional tools (UCINET, Pajek, Cytoscape and Gephi)

The compressed file contains:

Data files in spreadsheet format from three different networks (friendship, companionship and acquaintances).
Analysis files from UCINET, Pajek, Cytoscape and Gephi.

It is thus possible to corroborate the results mentioned in different studies that refer to these data.

Categories:: Computational Intelligence

449 Views

OntoSNAQA - Multi-domain ontology for social network analysis

OntoSNAQA is the name that combines Social Network Analysis (SNA), People and Questionnaires (Question and Answers - QA).

This ontology will be updated in this project of github and in the url http://www.jabenitez.com/ontologies/OntoSNAQA.owl.

It's an ontology that combines three different domains:
- People
- Questionnaires
- Social Network Analysis terms

Categories:: Computational Intelligence

253 Views

Big Data Machine Learning Benchmark on Spark

We introduce a benchmark of distributed algorithms execution over big data. The datasets are composed of metrics about the computational impact (resource usage) of eleven well-known machine learning techniques on a real computational cluster regarding system resource agnostic indicators: CPU consumption, memory usage, operating system processes load, net traffic, and I/O operations. The metrics were collected every five seconds for each algorithm on five different data volume scales, totaling 275 distinct datasets.

Categories:: Standards Research Data
Computational Intelligence

1886 Views

Saudi Dialect Twitter Corpus (SDTwittC)

SDTwittC consists of 200 authors evenly balanced by gender (100 for each). We identified the gender of the tweeters via their names and profile pictures. As potential copy-and-paste texts, both tweets and retweets are discarded in the first place. Only replies are compiled. The number of replies for each author varies from hundreds to thousands. Male authors produced 233926 replies whereas 219740 replies are generated by the female group

Categories:: Computational Intelligence

894 Views

Binary classifiers' outputs for ensemble creation

This dataset was created based on the paper 'Andras Hajdu, Gyorgy Terdik, Attila Tiba, and Henrietta Toman: A stochastic approach to handle knapsack problems in the creation of ensembles'.To summarize our experimental setup for UCI binary classification problems, we have considered base classifiers perceptron, decision tree, Levenberg-Marquardt feedforward neural network, random neural network, and discriminative restricted Boltzmann machine classifier for the 5 UCI datasets MAGIC Gamma Telescope, HIGGS, EEG EyeState, Musk (Version 2), and Spambase; datasets of large cardinalities were sele

Categories:: Computational Intelligence

299 Views

Computational Intelligence

Computational Intelligence

Pages