phishing attack; artificial intelligence; deep learning
The PhishFOE Dataset is a comprehensive dataset designed for phishing URL detection using machine learning techniques. The dataset contains 101,083 URLs, with labeled features extracted from both the URL structure and HTML content of webpages. It provides insights into key characteristics that distinguish phishing websites from legitimate ones.
- Total Samples: 101,063
- Label: 0 for Legitimate, 1 for Phishing

The dataset is shared in two versions so that researchers can easily reproduce the tests and work within hardware limitations. The first version (small_dataset) is a 10% sub-sample of the full dataset. The second version (big_dataset) contains the entire data. In this study, the tested scenarios were run on small_dataset, and the most successful configuration identified in that analysis was then applied to big_dataset.
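As a rough illustration of how a 10% sub-sample like small_dataset could be drawn from big_dataset, the sketch below samples each label separately so the class balance is preserved. The description does not say how the sub-sample was actually produced, so the stratification, the `label` and `url` field names, the `stratified_subsample` helper, and the synthetic rows are all assumptions made for this example.

```python
import random
from collections import defaultdict


def stratified_subsample(rows, label_key="label", fraction=0.10, seed=0):
    """Return roughly `fraction` of `rows`, sampled separately per label.

    Hypothetical helper: the field name and the stratified strategy are
    assumptions, not the dataset authors' documented procedure.
    """
    rng = random.Random(seed)

    # Group rows by their label value (0 = legitimate, 1 = phishing).
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    # Draw the same fraction from every label group, without replacement.
    sample = []
    for label in sorted(by_label):
        group = by_label[label]
        k = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, k))

    rng.shuffle(sample)
    return sample


# Synthetic stand-in for big_dataset; these are not real PhishFOE rows.
big_dataset = [
    {"url": f"http://example.test/{i}", "label": i % 2} for i in range(1000)
]
small_dataset = stratified_subsample(big_dataset)
print(len(small_dataset))
```

Fixing the seed keeps the sub-sample reproducible, which matches the stated goal of letting others rerun the experiments on the smaller version before moving to the full data.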