Real name: 
First Name: Charles
Last Name: Nicholas

Datasets & Competitions

As the Portable Document Format (PDF) has come into widespread use, it has become an increasingly common target for malware, highlighting the need for effective detection solutions. In recent years, methods based on machine learning (ML) have grown in popularity for PDF malware detection. However, the effectiveness of ML models depends closely on the quality of their training datasets. In this research, we investigated two widely used PDF malware datasets: Contagio and CIC. We found biases and representativeness issues that could affect the reliability and applicability of models built on them.
