The Dravidian Spam SMS dataset contains spam and ham messages in English, Tamil, Telugu, Kannada, and Malayalam. Nearly 7,700 messages were collected by circulating a Google Form among friends and other contacts. Language experts with reading and writing skills in the corresponding language carefully labeled the messages. The dataset also includes Tamil messages written in English (Latin) script, for example, “Nee Nalama”. The ham messages are mostly ordinary messages. The spam messages include business, annoying, and unnecessary messages sent by anonymous users.


[1] Ramanujam Elangovan, Abirami A M, "Spam SMS in Dravidian Languages", IEEE Dataport, 2023. [Online]. Available: http://dx.doi.org/10.21227/dcym-pd69. Accessed: Dec. 03, 2023.
@data{dcym-pd69-23,
doi = {10.21227/dcym-pd69},
url = {http://dx.doi.org/10.21227/dcym-pd69},
author = {Ramanujam Elangovan and Abirami A M},
publisher = {IEEE Dataport},
title = {Spam SMS in Dravidian Languages},
year = {2023}
}