This dataset contains constellation diagrams for QPSK, 16QAM, 64QAM, which we used for our research paper "Fast signal quality monitoring for coherent communications enabled by CNN-based EVM estimation" on JOCN.


Computer vision in animal monitoring has become a research application in stable or confined conditions.

Detecting animals from the top view is challenging due to barn conditions.

In this dataset called ICV-TxLamb, images are proposed for the monitoring of lamb inside a barn.

This set of data is made up of two categories, the first is lamb (classifies the only lamb), the second consists of four states of the posture of lambs, these are: eating, sleeping, lying down, and normal (standing or without activity ).


Wildfires are one of the deadliest and dangerous natural disasters in the world. Wildfires burn millions of forests and they put many lives of humans and animals in danger. Predicting fire behavior can help firefighters to have better fire management and scheduling for future incidents and also it reduces the life risks for the firefighters. Recent advance in aerial images shows that they can be beneficial in wildfire studies.


The aerial pile burn detection dataset consists of different repositories. The first one is a raw video recorded using the Zenmuse X4S camera. The format of this file is MP4. The duration of the video is 966 seconds with a Frame Per Second (FPS) of 29. The size of this repository is 1.2 GB. The first video was used for the "Fire-vs-NoFire" image classification problem (training/validation dataset). The second one is a raw video recorded using the Zenmuse X4S camera. The duration of the video is 966 seconds with a Frame Per Second (FPS) of 29. The size of this repository is 503 MB. This video shows the behavior of one pile from the start of burning. The resolution of these two videos is 1280x720.

The third video is 89 seconds of heatmap footage of WhiteHot from the thermal camera. The size of this repository is 45 MB. The fourth one is 305 seconds of GreentHot heatmap with a size of 153 MB. The fifth repository is 25 mins of fusion heatmap with a size of 2.83 GB. All these three thermal videos are recorded by the FLIR Vue Pro R thermal camera with an FPS of 30 and a resolution of 640x512. The format of all these videos is MOV.

The sixth video is 17 mins long from the DJI Phantom 3 camera. This footage is used for the purpose of the "Fire-vs-NoFire" image classification problem (test dataset). The FPS is 30, the size is 32 GB, the resolution is 3840x2160, and the format is MOV.

The seventh repository is 39,375 frames that resized to 254x254 for the "Fire-vs-NoFire" image classification problem (Training/Validation dataset). The size of this repository is 1.3 GB and the format is JPEG.

The eighth repository is 8,617 frames that resized to 254x254 for the "Fire-vs-NoFire" image classification problem (Test dataset). The size of this repository is 301 MB and the format is JPEG.

The ninth repository is 2,003 fire frames with a resolution of 3480x2160 for the fire segmentation problem (Train/Val/Test dataset). The size of this repository is 5.3 GB and the format is JPEG.

The last repository is 2,003 ground truth mask frames regarding the fire segmentation problem. The resolution of each mask is 3480x2160. The size of this repository is 23.4 MB.

The preprint article of this dataset is available here:

For more information please find the Table at:

To find other projects and articles in our group:


Amidst the COVID-19 pandemic, cyberbullying has become an even more serious threat. Our work aims to investigate the viability of an automatic multiclass cyberbullying detection model that is able to classify whether a cyberbully is targeting a victim’s age, ethnicity, gender, religion, or other quality. Previous literature has not yet explored making fine-grained cyberbullying classifications of such magnitude, and existing cyberbullying datasets suffer from quite severe class imbalances.


Please cite the following paper when using this open access dataset: J. Wang, K. Fu, C.T. Lu, “SOSNet: A Graph Convolutional Network Approach to Fine-Grained Cyberbullying Detection,” Proceedings of the 2020 IEEE International Conference on Big Data (IEEE BigData 2020), December 10-13, 2020.

This is a "Dynamic Query Expansion"-balanced dataset containing .txt files with 8000 tweets for each of a fine-grained class of cyberbullying: age, ethnicity, gender, religion, other, and not cyberbullying.

Total Size: 6.33 MB


Includes some data from:

S. Agrawal and A. Awekar, “Deep learning for detecting cyberbullying across multiple social media platforms,” in European Conference on Information Retrieval. Springer, 2018, pp. 141–153.

U. Bretschneider, T. Wohner, and R. Peters, “Detecting online harassment in social networks,” in ICIS, 2014.

D. Chatzakou, I. Leontiadis, J. Blackburn, E. D. Cristofaro, G. Stringhini, A. Vakali, and N. Kourtellis, “Detecting cyberbullying and cyberaggression in social media,” ACM Transactions on the Web (TWEB), vol. 13, no. 3, pp. 1–51, 2019.

T. Davidson, D. Warmsley, M. Macy, and I. Weber, “Automated hate speech detection and the problem of offensive language,” arXiv preprint arXiv:1703.04009, 2017.

Z. Waseem and D. Hovy, “Hateful symbols or hateful people? predictive features for hate speech detection on twitter,” in Proceedings of the NAACL student research workshop, 2016, pp. 88–93.

Z. Waseem, “Are you a racist or am i seeing things? annotator influence on hate speech detection on twitter,” in Proceedings of the first workshop on NLP and computational social science, 2016, pp. 138–142.

J.-M. Xu, K.-S. Jun, X. Zhu, and A. Bellmore, “Learning from bullying traces in social media,” in Proceedings of the 2012 conference of the North American chapter of the association for computational linguistics: Human language technologies, 2012, pp. 656–666. 


The ability of detecting human postures is particularly important in several fields like ambient intelligence, surveillance, elderly care, and human-machine interaction. Most of the earlier works in this area are based on computer vision. However, mostly these works are limited in providing real time solution for the detection activities. Therefore, we are currently working toward the Internet of Things (IoT) based solution for the human posture recognition.


This dataset consists of orthorectified aerial photographs, LiDAR derived digital elevation models and segmentation maps with 10 classes, acquired through the open data program of the German state North Rhine-Westphalia ( and refined with OpenStreeMap. Please check the license information (


Dataset description

The data was mostly acquired over urban areas in North-Rhine Westphalia, Germany. Since the acquisition dates for the aerial photographs and LiDAR do not match exactly, there can be discrepancies in what they show and in which season, e.g., trees change their leaves or lose them in autumn. In our experience, these differences are not drastic but should be kept in mind.

We have included two Python scripts. creates the example image used on this website. calculates and plots the class statistics. Furthermore, we published the code to create the dataset at, which makes it easy to extend the dataset with other areas in North-Rhine Westphalia. The repository also contains a PyTorch data loader.

This multimodal dataset should be useful for a variety of tasks. Image segmentation using multiple inputs, height estimation from the aerial photographs, or semantic image synthesis.


Similar to the original source of the data (, we organize all samples by the city they were acquired over. Their filenames, e.g., 345_5668_rgb.jp2 consists of the UTM zone 32N coordinates and the datatype (RGB, DEM or seg for land cover).

File formats

All data is geocoded and can be opened using QGIS ( The aerial photographs are stored as JPEG2000 files, the land cover maps and digital elevation models both as GeoTIFFs. The accompanying scripts show how to read the data into Python.


Diabetic Retinopathy is the second largest cause of blindness in diabetic patients. Early diagnosis or screening can prevent the visual loss. Nowadays , several computer aided algorithms have been developed to detect the early signs of Diabetic Retinopathy ie., Microaneurysms. The AGAR300 dataset presented here facilitate the researchers for benchmarking MA detection algorithms using digital fundus images. Currently, we have released the first set of database which consists of 28 color fundus images, shows the signs of Microaneurysm.


The files corresponding to the work reported in paper titled " A novel automated system of discriminating Microaneurysms in fundus images”. The images  are taken from Fundus photography machine with the resolution of 2448x3264. This dataset contains Diabetic Retinopathy images and users of this dataset should cite the following article.


D. Jeba Derwin, S. Tamil Selvi, O. Jeba Singh, B. Priestly Shan,”A novel automated system of discriminating Microaneurysms in fundus images”, Biomedical Signal Processing and Control,Vol.58, 2020, pages: 101839,ISSN 1746-8094,



The dataset was collected from the following: Kaggle cardiovascular dataset available at; UCI machine learning Cleveland and Hungarian heart disease dataset available at and available from the corresponding author upon request.


Data for the study has been retrieved from a publicly available data set of a leading European P2P lending platform, Bondora ( The retrieved data is a pool of both defaulted and non-defaulted loans from the time period between 1st March 2009 and 27th January 2020. The data comprises demographic and financial information of borrowers and loan transactions. In P2P lending, loans are typically uncollateralized and lenders seek higher returns as compensation for the financial risk they take.


The dataset also consists of data preprocessing Jupyter notebook that will help in working with the data and to perform basic data pre-processing. The zip file of the dataset consists of pre-processed and raw dataset directly extracted from the Bondora website

In the attached notebook, I have used my intuition and assumption for performing data-preprocessing.