Human facial data hold tremendous potential to address a variety of classification problems, including face recognition, age estimation, gender identification, emotion analysis, and race classification. However, recent privacy regulations, such as the EU General Data Protection Regulation, have restricted the ways in which human images may be collected and used for research.


The dataset explores the linguistic characteristics of Ukrainian online community members on "Lviv. Forum Ridne City" based on gender (female/male). It includes vectors of male and female profiles, along with 36 control vectors for 18 women's profiles and 18 men's profiles. The dataset includes 48 linguistic characteristics of gender in online communication. The linguistic features analyzed cover a wide range, including apologies, modal designs, emotions, profanity, references to sports and politics, and more.


Gestational diabetes is a type of high blood sugar that develops during pregnancy. It can occur at any stage of pregnancy and cause problems for both the mother and the baby, during and after birth. The risks can be reduced if the condition is detected early and managed, especially in areas where pregnant women are tested only periodically. Intelligent systems built with machine learning algorithms are reshaping many fields of our lives, including healthcare. This study proposes a combined prediction model to diagnose gestational diabetes.


Several fields of study can benefit from a large, structured, and accurate dataset of historical figures. Because such a dataset is lacking, in this paper we aim to use machine learning and text mining models to collect, predict, and cleanse online data, with a focus on age and gender. We developed a five-step method and inferred birth and death years, binary gender, and occupation from community-submitted data in all language versions of the Wikipedia project.


Although several databases of handwriting movements have been created so far, none of them has been specifically designed for studying the effect of age during ellipse drawing. Ninety subjects voluntarily participated in the database construction. Their ages ranged from 19 to 85 years: 30 participants in the range [19, 39] years, 30 in the range [40, 59], and 30 in the range [60, 85]. Twenty-six women (range 19-72 years) and sixty-four men (range 25-85 years) participated.
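
As a small illustration of the age stratification described above, the following Python sketch assigns an age to one of the three stated ranges. The names `AGE_GROUPS` and `age_group` are hypothetical and are not part of the database; only the range bounds and participant counts come from the description.

```python
# Age ranges as stated in the database description (inclusive bounds).
# AGE_GROUPS and age_group are illustrative names, not part of the database.
AGE_GROUPS = [(19, 39), (40, 59), (60, 85)]


def age_group(age):
    """Return the (low, high) range containing `age`, or None if it falls outside all ranges."""
    for low, high in AGE_GROUPS:
        if low <= age <= high:
            return (low, high)
    return None


# Consistency check on the reported counts:
# three groups of 30 participants, and 26 women plus 64 men, both total 90.
assert 3 * 30 == 26 + 64 == 90
```

For example, `age_group(45)` yields `(40, 59)`, and an age below 19 or above 85 yields `None`.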


This dataset contains Wi-Fi sensing data using Channel State Information (CSI) for respiration rate measurements in a standard 3 m x 3 m room. The Wi-Fi CSI data were collected with the Wi-Fi module on ESP32 microcontroller units using the esp32-csi-tool. The CSI data are accompanied by ground-truth respiration belt data, recorded simultaneously with the Wi-Fi measurements using the Neulog NUL-236 respiration belt logger.


Any work using this dataset should cite this paper as follows:

Nirmalya Thakur and Chia Y. Han, "Country-Specific Interests towards Fall Detection from 2004–2021: An Open Access Dataset and Research Questions", Journal of Data, Volume 6, Issue 8, pp. 1-21, 2021.



The Mother’s Significant Feature (MSF) Dataset has been designed to provide data to researchers working to improve the health of women and children. The MSF dataset records were collected from the Mumbai metropolitan region in Maharashtra, India. Women were interviewed just after childbirth between February 2018 and March 2021. The MSF dataset comprises 450 records with a total of 130 attributes covering the mother’s features, the father’s features, and health outcomes. The detailed dataset was created to understand the mother’s features spread across three phases of her reproductive age i.e.