Several fields of study can benefit from a large, structured, and accurate dataset of historical figures. Because such a dataset is lacking, in this paper we use machine learning and text mining models to collect, predict, and cleanse online data, with a focus on age and gender. We developed a five-step method and inferred birth and death years, binary gender, and occupation from community-submitted data across all language versions of the Wikipedia project.

[1] Issa Annamoradnejad, Rahimberdi Annamoradnejad, "Age dataset: A structured general-purpose dataset on life, work, and death of 1.22 million distinguished people", IEEE Dataport, 2022. [Online]. Available: http://dx.doi.org/10.21227/h1hz-wy90. Accessed: Dec. 26, 2024.