Cyberbullying
Cyberbullying is a growing problem on social media. This dataset supports cyberbullying detection in Bangla: it contains comments collected from YouTube, Facebook, Instagram, and TikTok, each labeled as either bullying or non-bullying. It includes a range of abusive and harmful texts alongside normal conversations. Researchers and developers can use it to train AI models that automatically identify cyberbullying in Bangla text, with the goal of building better tools to keep online spaces safe for Bangla-speaking users.

The dataset built for this study captures instances of cyberbullying in three languages: Urdu, Roman Urdu, and English. This selection mirrors the linguistic variation common in social media conversations among Urdu-speaking communities worldwide. The data is carefully annotated to reflect the linguistic nuances of these languages and to cover key aspects of cyberbullying, such as aggression, repetition, and intent to harm.