Machine Learning

The BirDrone dataset is compiled by aggregating images of small drones and birds sourced from various online datasets. It comprises 2970 high-resolution images (640x640 pixels), each featuring unique backdrops and lighting conditions. This dataset is designed to enhance machine learning models by simulating real-world scenarios.




In deep learning, images are widely used because of their rich information content and spatial hierarchies, which make them well suited to tasks such as object recognition and classification. Image-based malware classification is an important application of deep learning in cybersecurity. Within this context, the Classified Advanced Persistent Threat Dataset is a curated collection intended to support research and innovation in this field.


Nasal Cytology, or Rhinology, is a subfield of otolaryngology focused on the microscopic observation of samples of the nasal mucosa, with the aim of recognizing cells of different types in order to spot and diagnose ongoing pathologies. The method can claim good accuracy in diagnosing rhinitis and infections, and it is very cheap and accessible, requiring no instrument more complex than a microscope, even an optical one.


The popularity of smartphones has also popularized reading content on them. Reading on a smartphone differs considerably from reading on a desktop system, where the mouse and keyboard are the peripherals involved. Studies of how those devices are handled have made it possible to derive implicit feedback about the content being read. Comparable studies of implicit feedback on smartphones are still largely missing. Reading on a smartphone involves screen gestures such as pinch to zoom, tap, scroll, orientation change and screen capture.


QuaN is a collection of specially designed datasets for exploring the impact of noise on quantum machine learning and other applications. The presented work focuses on transforming clean datasets into noisy counterparts across diverse domains, including the MNIST handwritten digits dataset, Medical MNIST, the IRIS dataset and Mobile Health datasets. The datasets are created using noise from both classical and quantum domains.
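As a rough illustration of turning a clean sample into a noisy counterpart, the sketch below applies two generic noise models in plain Python. The description does not say which noise models QuaN uses, so both functions, their names and their parameters (`sigma`, `p_flip`, `seed`) are assumptions chosen for illustration: additive Gaussian noise stands in for classical noise, and an independent bit flip is used as a simple classical analogue of a quantum bit-flip channel.

```python
import random


def add_gaussian_noise(pixels, sigma=0.1, seed=0):
    """Return a noisy copy of `pixels` (floats in [0, 1]), clipped to [0, 1].

    Hypothetical helper: additive zero-mean Gaussian noise.
    """
    rng = random.Random(seed)
    return [min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in pixels]


def add_bit_flip_noise(bits, p_flip=0.05, seed=0):
    """Flip each bit (0 or 1) independently with probability `p_flip`.

    Hypothetical helper: a classical stand-in for a bit-flip channel.
    """
    rng = random.Random(seed)
    return [b ^ 1 if rng.random() < p_flip else b for b in bits]


# Example: a tiny "clean" sample of normalized pixel intensities.
clean = [0.0, 0.25, 0.5, 0.75, 1.0]
noisy = add_gaussian_noise(clean, sigma=0.1)

# Example: a binarized sample with every bit flipped (p_flip = 1.0).
flipped = add_bit_flip_noise([0, 1, 0], p_flip=1.0)
```

Fixing the seed keeps the clean and noisy versions reproducibly paired, which is convenient when comparing a model trained on each.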




We're excited to present a unique challenge aimed at advancing automated depression diagnosis. Traditional methods using written speech or self-reported measures often fall short in real-world scenarios. To address this, we've curated a dataset of authentic depression clinical interviews from a psychiatric hospital.


Student learning willingness is the decisive factor for achieving the final learning outcomes in curriculum teaching. On the other hand, the final learning outcomes achieved by students in the curriculum are a true reflection of student learning willingness. This paper selects 6 types of theoretical teaching method data and 4 types of student engagement behavior data used in the teaching process of the "Computer Systems" course in the Software Engineering major of Information Engineering School in the academic years 2021, 2022, and 2023 as the basic data.


With the goal of improving machine learning approaches in inverse scattering, we provide an experimental data set collected with a 2D near-field microwave imaging system. Machine learning approaches often train solely on synthetic data, and one of the reasons for this is that no experimentally-derived public data set exists. The imaging system consists of 24 antennas surrounding the imaging region, connected via a switch to a vector network analyzer. The data set contains over 1000 full scattering-parameter scans of five targets at numerous positions, measured from 3 to 5 GHz.


Predicting future events always comes with uncertainty, but traditional non-probabilistic methods cannot distinguish certain from uncertain predictions. In survival analysis, probabilistic methods applied to state-of-the-art solutions in the healthcare and biomedical field are still novel and their implications have not been fully evaluated. In this paper, we study the benefits of modeling uncertainty in deep neural networks for survival analysis with a focus on prediction and calibration performance.