clustering data

The fifth-generation (5G) wireless communications system offers faster data rates, lower latency, and support for a higher number of interconnected devices. Various 5G channel models were developed to study the system's stochastic characteristics prior to its implementation. These channel models generate multipath components, which are grouped into clusters when they have similar delays and angles. The multipaths and multipath clusters serve as datasets for multipath clustering, which is used to examine the propagation properties of the 5G system. However, these datasets are prone to outliers.

100 Views

PROTEIN STRUCTURE AND SYNTHETIC MULTI-VIEW CLUSTERING DATASETS

Multi-View Clustering (MVC) datasets used in the following paper:

Evolutionary Multi-objective Clustering Over Multiple Conflicting Data Views. Authors: Mario Garza-Fabre, Julia Handl, and Adán José-García. IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION. Accepted for publication, November 2022.

This entry contains all 420 datasets used in the paper, including:

204 Views

Synthetic data 1: A network containing nine communities. Nodes inside each community are densely connected, while nodes in different communities are sparsely connected.

Synthetic data 2: A network containing nine communities. Some nodes are connected to nodes in several different communities at the same time.
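
The two network structures described above can be illustrated with a small planted-partition generator. This is only a sketch under stated assumptions: the entry does not say how the networks were generated, so the community size, the edge probabilities `p_in` and `p_out`, and the way bridge nodes are wired are all hypothetical choices, not the dataset's actual parameters.

```python
import random
from itertools import combinations


def planted_partition(n_communities=9, nodes_per_community=20,
                      p_in=0.5, p_out=0.01, n_bridges=0, seed=0):
    """Return (edges, membership, bridges) for a planted-partition graph.

    All sizes and probabilities are illustrative assumptions, not values
    taken from the dataset description.
    """
    rng = random.Random(seed)
    n = n_communities * nodes_per_community
    # Node v belongs to community v // nodes_per_community.
    membership = {v: v // nodes_per_community for v in range(n)}
    edges = set()
    for u, v in combinations(range(n), 2):
        # Dense inside a community, sparse between communities.
        p = p_in if membership[u] == membership[v] else p_out
        if rng.random() < p:
            edges.add((u, v))
    # Optional bridge nodes (in the spirit of "Synthetic data 2"): each
    # bridge node gets one edge into every community other than its own.
    bridges = rng.sample(range(n), n_bridges)
    for b in bridges:
        for c in range(n_communities):
            if c != membership[b]:
                candidates = [v for v in range(n) if membership[v] == c]
                target = rng.choice(candidates)
                edges.add((min(b, target), max(b, target)))
    return edges, membership, bridges


def count_intra_inter(edges, membership):
    """Count edges inside communities and edges between communities."""
    intra = sum(1 for u, v in edges if membership[u] == membership[v])
    return intra, len(edges) - intra


edges, membership, bridges = planted_partition(n_bridges=5)
intra, inter = count_intra_inter(edges, membership)
print("intra-community edges:", intra)
print("inter-community edges:", inter)
```

Setting `n_bridges=0` gives a network in the style of Synthetic data 1, while a positive value adds nodes that touch several communities at once, as in Synthetic data 2.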

344 Views

The dataset contains Gaussian blobs with varying numbers of samples, centers, and features. The number of samples ranges from 500 to 50,000, the number of centers from 2 to 100, and the number of features from 2 to 2048. These sets of Gaussian blobs can be used to test the scalability and effectiveness of clustering algorithms. The compressed sets contain two kinds of files: files ending with "_X.csv" hold the data points, while files ending with "_y.csv" hold the corresponding class data.
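
The "_X.csv" / "_y.csv" pairing can be sketched as follows. The example generates a small set of Gaussian blobs, writes it using the two file suffixes named above, and reads it back. Several details are assumptions rather than facts about the dataset: the file prefix `blobs_500_2_2` is hypothetical, the files are assumed to have no header row, and the "_y.csv" file is assumed to hold one integer label per line.

```python
import csv
import random
import tempfile
from pathlib import Path


def make_blobs(n_samples, n_centers, n_features, spread=1.0, seed=0):
    """Draw isotropic Gaussian blobs around randomly placed centers."""
    rng = random.Random(seed)
    centers = [[rng.uniform(-10, 10) for _ in range(n_features)]
               for _ in range(n_centers)]
    X, y = [], []
    for i in range(n_samples):
        label = i % n_centers  # spread the samples evenly over the centers
        X.append([rng.gauss(mu, spread) for mu in centers[label]])
        y.append(label)
    return X, y


def save_pair(prefix, X, y):
    """Write data points to <prefix>_X.csv and labels to <prefix>_y.csv."""
    with open(f"{prefix}_X.csv", "w", newline="") as f:
        csv.writer(f).writerows(X)
    with open(f"{prefix}_y.csv", "w", newline="") as f:
        csv.writer(f).writerows([[label] for label in y])


def load_pair(prefix):
    """Read a matching pair of files (assumes no header row)."""
    with open(f"{prefix}_X.csv", newline="") as f:
        X = [[float(value) for value in row] for row in csv.reader(f)]
    with open(f"{prefix}_y.csv", newline="") as f:
        y = [int(row[0]) for row in csv.reader(f)]
    return X, y


# Hypothetical prefix; the real file names in the archive may differ.
prefix = Path(tempfile.mkdtemp()) / "blobs_500_2_2"
X, y = make_blobs(n_samples=500, n_centers=2, n_features=2)
save_pair(prefix, X, y)
X_loaded, y_loaded = load_pair(prefix)
print(len(X_loaded), "samples,", len(X_loaded[0]), "features,",
      len(set(y_loaded)), "classes")
```

If the real files turn out to include a header row, the first row returned by `csv.reader` would need to be skipped before converting values.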

2843 Views