Multilingual Cyberbullying Detection
The dataset created for this study captures instances of cyberbullying in three languages: Urdu, Roman Urdu, and English. This selection reflects the linguistic variation common in social media conversations among Urdu-speaking communities worldwide. The dataset is carefully annotated to capture the linguistic nuances of each language, and the annotation incorporates key aspects of cyberbullying, such as aggression, repetition, and intent to harm.