The detection of settlements without electricity challenge track (Track DSE) of the 2021 IEEE GRSS Data Fusion Contest, organized by the Image Analysis and Data Fusion Technical Committee (IADF TC) of the IEEE Geoscience and Remote Sensing Society (GRSS), Hewlett Packard Enterprise, SolarAid, and Data Science Experts, aims to promote research in automatic detection of human settlements deprived of access to electricity using multimodal and multitemporal remote sensing data.

Last Updated On: 
Thu, 12/03/2020 - 04:16
Citation Author(s): 
Colin Prieur, Hana Malha, Frederic Ciesielski, Paul Vandame, Giorgio Licciardi, Jocelyn Chanussot, Pedram Ghamisi, Ronny Hänsch, Naoto Yokoya

The generated dataset consists of 23 synthetic 4D light field videos (LFVs) with 1,204x436 pixels, 9x9 views, and 20 to 50 frames. It includes ground-truth depth values for the central view, so it can be used to train deep learning-based methods. Each scene was rendered with a "clean" pass after modifying the production file of "Sintel" with reference to the MPI Sintel dataset \cite{butler2012naturalistic}.


An oral presentation at IWAIT2021 is planned.


This dataset contains RF signals from drone remote controllers (RCs) of different makes and models. The RF signals transmitted by the drone RCs to communicate with the drones are intercepted and recorded by a passive RF surveillance system, which consists of a high-frequency oscilloscope, directional grid antenna, and low-noise power amplifier. The drones were idle during the data capture process. All the drone RCs transmit signals in the 2.4 GHz band. There are 17 drone RCs from eight different manufacturers and ~1000 RF signals per drone RC, each spanning a duration of 0.25 ms. 


The dataset contains ~1000 RF signals per remote controller (RC), stored in .mat format, from the RCs of the following drones:

  • DJI (5): Inspire 1 Pro, Matrice 100, Matrice 600*, Phantom 4 Pro*, Phantom 3 
  • Spektrum (4): DX5e, DX6e, DX6i, JR X9303
  • Futaba (1): T8FG
  • Graupner (1): MC32
  • HobbyKing (1): HK-T6A
  • FlySky (1): FS-T6
  • Turnigy (1): 9X
  • Jeti Duplex (1): DC-16.

The dataset includes two RCs for each of the drones marked with an asterisk above, making a total of 17 drone RCs. Each RF signal contains 5 million samples and spans 0.25 ms.
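The figures above imply a sampling rate and an overall dataset size that are worth checking before any processing. The following is a minimal sketch of that arithmetic, assuming exactly 5 million samples per 0.25 ms signal and exactly 1000 signals per RC (the description only says approximately 1000). All variable and function names are illustrative.

```python
# Sanity-check arithmetic for the drone RC dataset figures quoted above.
SAMPLES_PER_SIGNAL = 5_000_000   # samples in one RF signal
SIGNAL_DURATION_S = 0.25e-3      # each signal spans 0.25 ms

# Number of listed drone models per manufacturer (from the list above).
MODELS_PER_MAKE = {
    "DJI": 5, "Spektrum": 4, "Futaba": 1, "Graupner": 1,
    "HobbyKing": 1, "FlySky": 1, "Turnigy": 1, "Jeti Duplex": 1,
}
EXTRA_RCS = 2                    # second RC for Matrice 600 and Phantom 4 Pro
SIGNALS_PER_RC = 1000            # assumption: the source says "~1000"


def implied_sampling_rate_hz(n_samples: int, duration_s: float) -> float:
    """Sampling rate implied by a sample count and a capture duration."""
    return n_samples / duration_s


def total_rcs(models_per_make: dict, extra_rcs: int) -> int:
    """Total number of RCs: one per listed model plus the extra units."""
    return sum(models_per_make.values()) + extra_rcs


rate_hz = implied_sampling_rate_hz(SAMPLES_PER_SIGNAL, SIGNAL_DURATION_S)
n_rcs = total_rcs(MODELS_PER_MAKE, EXTRA_RCS)
approx_signals = n_rcs * SIGNALS_PER_RC

print(f"Implied sampling rate: {rate_hz / 1e9:.1f} GSa/s")
print(f"Manufacturers: {len(MODELS_PER_MAKE)}, RCs: {n_rcs}")
print(f"Approximate number of signals: {approx_signals}")
```

Under these assumptions the implied sampling rate is 20 GSa/s, and the counts reproduce the 17 RCs from eight manufacturers stated in the description.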

The scripts provided with the dataset define a class for creating drone RC objects. They build a database of these objects, as well as a database in table format that holds all the available information, such as make, model, raw RF signal, and sampling frequency. The scripts also include functions to visualize the data and to extract a few example features from the raw RF signal (e.g., the start point of the transient signal). Instructions for using the scripts are included at the top of each script and can also be viewed by typing help scriptName in the MATLAB command window.
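To illustrate the kind of structure described above, here is a minimal Python sketch. It is not a port of the provided MATLAB scripts: the `DroneRC` class, the `to_table` helper, and the `transient_start` function are hypothetical names, and the simple amplitude threshold used to locate the transient is an assumption standing in for whatever method the scripts actually use.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DroneRC:
    """Hypothetical record for one RC capture (not the authors' class)."""
    make: str
    model: str
    sampling_rate_hz: float
    signal: List[float]


def to_table(rcs: List[DroneRC]) -> List[Dict[str, object]]:
    """Flatten RC objects into table-like rows without the raw samples."""
    return [
        {
            "make": rc.make,
            "model": rc.model,
            "sampling_rate_hz": rc.sampling_rate_hz,
            "n_samples": len(rc.signal),
        }
        for rc in rcs
    ]


def transient_start(signal: List[float], noise_len: int,
                    factor: float = 5.0) -> Optional[int]:
    """Return the first index after the noise window whose magnitude
    exceeds `factor` times the mean noise magnitude, or None.

    Assumption: the first `noise_len` samples contain noise only.
    """
    if noise_len <= 0 or noise_len > len(signal):
        raise ValueError("noise_len must be in 1..len(signal)")
    noise_level = sum(abs(x) for x in signal[:noise_len]) / noise_len
    threshold = factor * noise_level
    for i in range(noise_len, len(signal)):
        if abs(signal[i]) > threshold:
            return i
    return None


# Toy signal: 20 low-amplitude noise samples, then a burst.
toy_signal = [0.01, -0.02, 0.01, -0.01] * 5 + [0.5, -0.6, 0.7]
rc = DroneRC(make="DJI", model="Phantom 3", sampling_rate_hz=20e9,
             signal=toy_signal)
table = to_table([rc])
start = transient_start(rc.signal, noise_len=20)
print(table[0], start)
```

A fixed threshold like this is fragile at low SNR. It is shown only to make the idea of a transient start point concrete.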

The drone RC RF dataset was used in the following papers:

  • M. Ezuma, F. Erden, C. Kumar, O. Ozdemir, and I. Guvenc, "Micro-UAV detection and classification from RF fingerprints using machine learning techniques," in Proc. IEEE Aerosp. Conf., Big Sky, MT, Mar. 2019, pp. 1-13.
  • M. Ezuma, F. Erden, C. K. Anjinappa, O. Ozdemir, and I. Guvenc, "Detection and classification of UAVs using RF fingerprints in the presence of Wi-Fi and Bluetooth interference," IEEE Open J. Commun. Soc., vol. 1, no. 1, pp. 60-79, Nov. 2019.
  • E. Ozturk, F. Erden, and I. Guvenc, "RF-based low-SNR classification of UAVs using convolutional neural networks." arXiv preprint arXiv:2009.05519, Sept. 2020.

Other details regarding the dataset and data collection and processing can be found in the above papers and attached documentation.  


Author Contributions:

  • Experiment design: O. Ozdemir and M. Ezuma
  • Data collection:  M. Ezuma
  • Scripts: F. Erden and C. K. Anjinappa
  • Documentation: F. Erden
  • Supervision, revision, and funding: I. Guvenc 



This work was supported in part by NASA through the Federal Award under Grant NNX17AJ94A.


This dataset is composed of side channel information (e.g., temperatures, voltages, utilization rates) from computing systems executing benign and malicious code. The intent of the dataset is to allow artificial intelligence tools to be applied to malware detection using side channel information.


The Retinal Fundus Multi-disease Image Dataset (RFMiD) consists of retinal fundus images covering a wide variety of pathological conditions.


Detailed instructions about this dataset are available on the challenge website:


This dataset contains constellation diagrams for QPSK, 16QAM, and 64QAM, which we used for our research paper "Fast signal quality monitoring for coherent communications enabled by CNN-based EVM estimation" in JOCN.


Computer vision has become an active research tool for monitoring animals in stables and other confined conditions.

Detecting animals from the top view is challenging due to barn conditions.

The ICV-TxLamb dataset provides images for monitoring lambs inside a barn.

The data are organized into two categories. The first is lamb, a single class that identifies only the lamb. The second consists of four lamb postures: eating, sleeping, lying down, and normal (standing or inactive).


Wildfires are among the deadliest and most dangerous natural disasters in the world. They burn millions of forests and put many human and animal lives in danger. Predicting fire behavior can help firefighters manage fires and plan for future incidents, and it also reduces the risk to firefighters' lives. Recent advances in aerial imaging show that it can benefit wildfire studies. Among the methods and technologies for capturing aerial images, Unmanned Aerial Vehicles (UAVs) and drones are well suited to collecting information about a fire.


The aerial pile fire detection dataset consists of several repositories. The first is a raw video recorded with the Zenmuse X4S camera in MP4 format. The video is 966 seconds long at 29 frames per second (FPS), and the repository size is 1.2 GB. This video was used for the "Fire-vs-NoFire" image classification problem (training/validation dataset). The second is also a raw video recorded with the Zenmuse X4S camera. The video is 966 seconds long at 29 FPS, and the repository size is 503 MB. It shows the behavior of one pile from the start of burning. Both videos have a resolution of 1280x720.

The third video is 89 seconds of WhiteHot heatmap footage from the thermal camera, with a repository size of 45 MB. The fourth is 305 seconds of GreenHot heatmap footage with a size of 153 MB. The fifth repository is 25 minutes of fusion heatmap footage with a size of 2.83 GB. All three thermal videos were recorded with the FLIR Vue Pro R thermal camera at 30 FPS and a resolution of 640x512. The format of all these videos is MOV.

The sixth video is 17 minutes long and was recorded with the DJI Phantom 3 camera. This footage is used for the "Fire-vs-NoFire" image classification problem (test dataset). The FPS is 30, the size is 32 GB, the resolution is 3840x2160, and the format is MOV.

The seventh repository contains 39,375 frames resized to 254x254 for the "Fire-vs-NoFire" image classification problem (training/validation dataset). The size of this repository is 1.3 GB and the format is JPEG.

The eighth repository contains 8,617 frames resized to 254x254 for the "Fire-vs-NoFire" image classification problem (test dataset). The size of this repository is 301 MB and the format is JPEG.
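The two frame counts above fix the proportions of the classification split. A minimal sketch of that arithmetic follows. The constant and function names are illustrative, and the source does not state how the 39,375 frames are divided between training and validation, so that division is left out.

```python
# Frame counts quoted above for the "Fire-vs-NoFire" classification problem.
TRAIN_VAL_FRAMES = 39_375   # seventh repository (training/validation)
TEST_FRAMES = 8_617         # eighth repository (test)


def split_fractions(train_val: int, test: int):
    """Return (total, train/val fraction, test fraction)."""
    total = train_val + test
    return total, train_val / total, test / total


total_frames, train_val_frac, test_frac = split_fractions(
    TRAIN_VAL_FRAMES, TEST_FRAMES
)

print(f"Total frames: {total_frames}")
print(f"Training/validation share: {train_val_frac:.1%}")
print(f"Test share: {test_frac:.1%}")
```

With these counts the test set is roughly 18% of the 47,992 classification frames.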

The ninth repository contains 2,003 fire frames with a resolution of 3480x2160 for the fire segmentation problem (Train/Val/Test dataset). The size of this repository is 5.3 GB and the format is JPEG.

The last repository contains 2,003 ground truth mask frames for the fire segmentation problem. The resolution of each mask is 3480x2160. The size of this repository is 23.4 MB.
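Because the segmentation frames and their masks ship in separate repositories, it helps to confirm that every frame has a matching mask before training. The sketch below pairs files by name. It assumes that a mask shares the filename stem of its frame, and the example filenames and the `.png` mask extension are hypothetical, since the description does not state the naming scheme or the mask format.

```python
from pathlib import PurePath
from typing import List, Tuple


def pair_by_stem(image_names: List[str],
                 mask_names: List[str]) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Pair each image with the mask that has the same filename stem.

    Returns (pairs, images_without_mask).
    Assumption: matching files differ only in directory and extension.
    """
    masks_by_stem = {PurePath(name).stem: name for name in mask_names}
    pairs = []
    missing = []
    for image in sorted(image_names):
        stem = PurePath(image).stem
        if stem in masks_by_stem:
            pairs.append((image, masks_by_stem[stem]))
        else:
            missing.append(image)
    return pairs, missing


# Hypothetical filenames, for illustration only.
images = ["images/frame_0002.jpg", "images/frame_0001.jpg",
          "images/frame_0003.jpg"]
masks = ["masks/frame_0001.png", "masks/frame_0002.png"]

pairs, missing = pair_by_stem(images, masks)
print(pairs)
print(missing)
```

For the full dataset, a complete pairing would yield 2,003 pairs and an empty list of unmatched frames.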

For more information please find the Table in


Amidst the COVID-19 pandemic, cyberbullying has become an even more serious threat. Our work aims to investigate the viability of an automatic multiclass cyberbullying detection model that is able to classify whether a cyberbully is targeting a victim’s age, ethnicity, gender, religion, or other quality. Previous literature has not yet explored making fine-grained cyberbullying classifications of such magnitude, and existing cyberbullying datasets suffer from quite severe class imbalances.


Please cite the following paper when using this open access dataset: J. Wang, K. Fu, C.T. Lu, “SOSNet: A Graph Convolutional Network Approach to Fine-Grained Cyberbullying Detection,” Proceedings of the 2020 IEEE International Conference on Big Data (IEEE BigData 2020), December 10-13, 2020.

This is a "Dynamic Query Expansion"-balanced dataset containing .txt files with 8000 tweets for each fine-grained class of cyberbullying: age, ethnicity, gender, religion, other, and not cyberbullying.
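A quick balance check is a sensible first step with a dataset laid out this way. The sketch below assumes one tweet per line in each .txt file. The label strings and helper names are illustrative rather than taken from the dataset, and `read_tweets` is shown for completeness but not called here because the actual file names are not given.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Class labels as listed in the description (spelling is illustrative).
CLASSES = ["age", "ethnicity", "gender", "religion", "other",
           "not_cyberbullying"]
TWEETS_PER_CLASS = 8000
EXPECTED_TOTAL = len(CLASSES) * TWEETS_PER_CLASS


def read_tweets(path: str) -> List[str]:
    """Read non-empty lines from a text file (assumes one tweet per line)."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def label_samples(tweets_by_class: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Flatten {label: tweets} into a list of (tweet, label) pairs."""
    return [(tweet, label)
            for label, tweets in tweets_by_class.items()
            for tweet in tweets]


def is_balanced(samples: List[Tuple[str, str]]) -> bool:
    """True if every label occurs the same number of times."""
    counts = Counter(label for _, label in samples)
    return len(set(counts.values())) <= 1


# Tiny in-memory stand-in for the real files.
toy = {label: [f"example tweet {i} about {label}" for i in range(3)]
       for label in CLASSES}
samples = label_samples(toy)

print(f"Expected total tweets in the full dataset: {EXPECTED_TOTAL}")
print(f"Toy samples: {len(samples)}, balanced: {is_balanced(samples)}")
```

With six classes of 8000 tweets each, the full dataset should contain 48,000 labeled tweets.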

Total Size: 6.33 MB


Includes some data from:

S. Agrawal and A. Awekar, “Deep learning for detecting cyberbullying across multiple social media platforms,” in European Conference on Information Retrieval. Springer, 2018, pp. 141–153.

U. Bretschneider, T. Wohner, and R. Peters, “Detecting online harassment in social networks,” in ICIS, 2014.

D. Chatzakou, I. Leontiadis, J. Blackburn, E. D. Cristofaro, G. Stringhini, A. Vakali, and N. Kourtellis, “Detecting cyberbullying and cyberaggression in social media,” ACM Transactions on the Web (TWEB), vol. 13, no. 3, pp. 1–51, 2019.

T. Davidson, D. Warmsley, M. Macy, and I. Weber, “Automated hate speech detection and the problem of offensive language,” arXiv preprint arXiv:1703.04009, 2017.

Z. Waseem and D. Hovy, “Hateful symbols or hateful people? predictive features for hate speech detection on twitter,” in Proceedings of the NAACL student research workshop, 2016, pp. 88–93.

Z. Waseem, “Are you a racist or am i seeing things? annotator influence on hate speech detection on twitter,” in Proceedings of the first workshop on NLP and computational social science, 2016, pp. 138–142.

J.-M. Xu, K.-S. Jun, X. Zhu, and A. Bellmore, “Learning from bullying traces in social media,” in Proceedings of the 2012 conference of the North American chapter of the association for computational linguistics: Human language technologies, 2012, pp. 656–666.