Machine Learning

Computer vision in animal monitoring has become a research application in stable or confined conditions.

Detecting animals from the top view is challenging due to barn conditions.

In this dataset called ICV-TxLamb, images are proposed for the monitoring of lamb inside a barn.

This set of data is made up of two categories, the first is lamb (classifies the only lamb), the second consists of four states of the posture of lambs, these are: eating, sleeping, lying down, and normal (standing or without activity ).


Wildfires are one of the deadliest and dangerous natural disasters in the world. Wildfires burn millions of forests and they put many lives of humans and animals in danger. Predicting fire behavior can help firefighters to have better fire management and scheduling for future incidents and also it reduces the life risks for the firefighters. Recent advance in aerial images shows that they can be beneficial in wildfire studies.


Amidst the COVID-19 pandemic, cyberbullying has become an even more serious threat. Our work aims to investigate the viability of an automatic multiclass cyberbullying detection model that is able to classify whether a cyberbully is targeting a victim’s age, ethnicity, gender, religion, or other quality. Previous literature has not yet explored making fine-grained cyberbullying classifications of such magnitude, and existing cyberbullying datasets suffer from quite severe class imbalances.


The ability of detecting human postures is particularly important in several fields like ambient intelligence, surveillance, elderly care, and human-machine interaction. Most of the earlier works in this area are based on computer vision. However, mostly these works are limited in providing real time solution for the detection activities. Therefore, we are currently working toward the Internet of Things (IoT) based solution for the human posture recognition.


This dataset consists of orthorectified aerial photographs, LiDAR derived digital elevation models and segmentation maps with 10 classes, acquired through the open data program of the German state North Rhine-Westphalia ( and refined with OpenStreeMap. Please check the license information (


Diabetic Retinopathy is the second largest cause of blindness in diabetic patients. Early diagnosis or screening can prevent the visual loss. Nowadays , several computer aided algorithms have been developed to detect the early signs of Diabetic Retinopathy ie., Microaneurysms. The AGAR300 dataset presented here facilitate the researchers for benchmarking MA detection algorithms using digital fundus images. Currently, we have released the first set of database which consists of 28 color fundus images, shows the signs of Microaneurysm.


This heart disease dataset is curated by combining 3 popular heart disease datasets. The first dataset (Collected from Kaggle)  contains 70000 records with 11 independent features which makes it the largest heart disease dataset available so far for research purposes. These data were collected at the moment of medical examination and information given by the patient. Second and third datasets contain 303 and 293 intstances respectively with 13 common features. The three datasets used for its curation are:

  1. Cardio Data (Kaggle Dataset)


Data for the study has been retrieved from a publicly available data set of a leading European P2P lending platform, Bondora ( The retrieved data is a pool of both defaulted and non-defaulted loans from the time period between 1st March 2009 and 27th January 2020. The data comprises demographic and financial information of borrowers and loan transactions. In P2P lending, loans are typically uncollateralized and lenders seek higher returns as compensation for the financial risk they take.


Recently, Temporal Information Retrieval (TIR) has grabbed the major attention of the information retrieval community. TIR exploits the temporal dynamics in the information retrieval process and harnesses both textual relevance and temporal relevance to fulfill the temporal information requirements of a user Ur Rehman Khan et al., 2018. The focus time of document is an important temporal aspect which is defined as the time to which the content of the document refers Jatowt et al., 2015; Jatowt et al., 2013; Morbidoni et al., 2018, Khan et al., 2018.