This dataset was extracted from GitHub and contains 172,919 Java source files written by 3,128 authors. It can be used for authorship attribution.

[1] Farzaneh Abazari, Enrico Branca, Norah Ridley, Natalia Stakhanova, Mila Dalla Preda, "Github Dataset for Authorship Attribution", IEEE Dataport, 2021. [Online]. Available: http://dx.doi.org/10.21227/vx0v-j232. Accessed: Dec. 08, 2024.
@data{vx0v-j232-21,
  doi = {10.21227/vx0v-j232},
  url = {http://dx.doi.org/10.21227/vx0v-j232},
  author = {Farzaneh Abazari and Enrico Branca and Norah Ridley and Natalia Stakhanova and Mila Dalla Preda},
  publisher = {IEEE Dataport},
  title = {Github Dataset for Authorship Attribution},
  year = {2021}
}