Standard Dataset
Tamil Cyberbullying Dataset
- Citation Author(s):
- Submitted by: Arul Antran Vijay S
- Last updated: Sat, 10/26/2024 - 11:34
- DOI: 10.21227/20s2-jh36
- Data Format:
- License:
Abstract
We drew on the 47,692 tweets of the Cyberbullying Classification dataset. This globally sourced dataset covers a wide range of cyberbullying examples. Although it is not drawn solely from South Asia, we translated and adapted the tweets so that the dataset is contextually and culturally relevant for the Tamil-speaking population. The tweets were divided into six classes, which cover different facets of cyberbullying as well as tweets that were not considered cyberbullying. The sample was evenly distributed across all classes, so the dataset represents each form of cyberbullying in online communication equally.
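The even distribution described above can be checked once the labels are loaded. The following is a minimal sketch, not part of the dataset's tooling: the function names, the tolerance value, and the toy labels are all assumptions made for illustration, and the actual class names and file layout are not stated in this record. If the 47,692 tweets were split exactly evenly over six classes, each class would hold roughly 7,949 tweets.

```python
from collections import Counter


def class_shares(labels):
    """Return each label's share of the total as a dict of label -> fraction."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def is_balanced(labels, tolerance=0.01):
    """Check that every class share is within `tolerance` of an even split.

    `tolerance` is a hypothetical threshold chosen for this sketch.
    """
    shares = class_shares(labels)
    if not shares:
        raise ValueError("no labels given")
    expected = 1 / len(shares)
    return all(abs(share - expected) <= tolerance for share in shares.values())


# Toy labels only; the real class names are not given in this record.
toy_labels = ["class_a", "class_b", "class_c"] * 4
print(class_shares(toy_labels))
print(is_balanced(toy_labels))
```

In practice the labels would come from whichever column of the downloaded files holds the class, and the tolerance would be set according to how strict a notion of balance is needed.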