Multimodal fake news datasets

- Submitted by: Qiang Lu
- Last updated: Fri, 04/18/2025 - 00:38
- DOI: 10.21227/8mdk-pe09
Abstract
We evaluate the proposed FSDUF model on three publicly available social media benchmark datasets: Weibo (Jin et al., 2017), Twitter (Boididou et al., 2015), and Pheme (Zubiaga et al., 2017).
The Weibo dataset consists of original tweet texts and corresponding images collected from authoritative news sources in China. The Twitter dataset contains real and fake news samples sourced from the MediaEval Benchmarking Initiative. The Pheme dataset includes rumors and non-rumors posted on Twitter during breaking news events, covering nine different events, with each rumor annotated as real or fake. Each dataset is randomly divided into training, validation, and test sets at a ratio of 7:1:2.
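A random 7:1:2 split of this kind can be sketched as follows. This is only an illustration under stated assumptions: the function name `split_7_1_2`, the fixed `seed`, and the use of integer arithmetic for the set sizes are hypothetical choices, and the record does not say whether the original split was seeded or stratified by label.

```python
import random


def split_7_1_2(samples, seed=0):
    """Shuffle samples and split them into train/val/test at a 7:1:2 ratio.

    Hypothetical helper: the seed and the rounding rule are assumptions,
    not details taken from the dataset description.
    """
    items = list(samples)
    # A dedicated Random instance keeps the shuffle reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(items)

    n_train = len(items) * 7 // 10  # 70% for training (rounded down)
    n_val = len(items) // 10        # 10% for validation (rounded down)

    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # the remainder, roughly 20%
    return train, val, test


# Example with 100 placeholder sample IDs.
train, val, test = split_7_1_2(range(100))
print(len(train), len(val), len(test))
```

Because the test set takes whatever remains after the first two slices, every sample lands in exactly one subset even when the dataset size is not a multiple of 10.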
Weibo data: Z. Jin, J. Cao, H. Guo, Y. Zhang, and J. Luo, “Multimodal fusion with recurrent neural networks for rumor detection on microblogs,” in Proceedings of the 25th ACM International Conference on Multimedia, 2017, pp. 795–816.
Twitter data: C. Boididou, K. Andreadou, S. Papadopoulos, D. T. Dang Nguyen, G. Boato, M. Riegler, Y. Kompatsiaris et al., “Verifying multimedia use at MediaEval 2015,” in MediaEval 2015. CEUR-WS, 2015, vol. 1436.
Pheme data: A. Zubiaga, M. Liakata, and R. Procter, “Exploiting context for rumour detection in social media,” in Social Informatics: 9th International Conference, SocInfo 2017, Oxford, UK, September 13-15, 2017, Proceedings, Part I. Springer, 2017, pp. 109–123.