Chinese cybersecurity event dataset

Citation Author(s):
Bingzhi Xu
Submitted by:
Bingzhi Xu
Last updated:
Mon, 08/12/2024 - 10:30
DOI:
10.21227/z61y-9617
Data Format:
License:

Abstract 

This paper introduces a new dataset named CSED, designed for Chinese cybersecurity event detection (ED). The dataset draws on approximately 18,000 collected news articles related to cybersecurity. Following the classification definitions of cybersecurity event types from CAISE [38], we define two event types, Attack and Vulnerability, and further subdivide them into nine sub-event types: Data Breach, Phishing, Ransom, DDoS Attack, Malware, Supply Chain, Vulnerability Impact, Vulnerability Discovery, and Vulnerability Patch. Additionally, sentences that do not contain any specific event are categorized as ‘NA’. The key to annotating cybersecurity events is identifying trigger words; carefully selected trigger words can significantly improve the efficiency of subsequent event recognition. We establish rules for the annotation process: when a sentence contains multiple events of different types, only the most representative event is annotated. This approach avoids unnecessary redundancy and keeps the dataset refined. The resulting dataset includes 2,054 event instances, 2 event types, and 9 sub-types.
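
For readers who want to work with the label set described in the abstract, the following is a minimal Python sketch of the taxonomy. It is not taken from the dataset files: the names TAXONOMY, NO_EVENT, parent_type, and all_labels are hypothetical, and the grouping of the nine sub-types under Attack and Vulnerability is an assumption inferred from the sub-type names, since the abstract does not state the mapping explicitly.

```python
from typing import List, Optional

# Event types and sub-types as named in the abstract.
# ASSUMPTION: the assignment of sub-types to each type is inferred from
# the sub-type names; the abstract does not spell out the mapping.
TAXONOMY = {
    "Attack": [
        "Data Breach",
        "Phishing",
        "Ransom",
        "DDoS Attack",
        "Malware",
        "Supply Chain",
    ],
    "Vulnerability": [
        "Vulnerability Impact",
        "Vulnerability Discovery",
        "Vulnerability Patch",
    ],
}

# Label for sentences that contain no specific event.
NO_EVENT = "NA"

# Reverse lookup: sub-type -> top-level event type.
SUBTYPE_TO_TYPE = {sub: typ for typ, subs in TAXONOMY.items() for sub in subs}


def parent_type(label: str) -> Optional[str]:
    """Return the top-level type for a sub-type label, or None for 'NA'."""
    if label == NO_EVENT:
        return None
    # Raises KeyError for labels outside the taxonomy.
    return SUBTYPE_TO_TYPE[label]


def all_labels() -> List[str]:
    """Return the nine sub-type labels followed by the 'NA' label."""
    return [sub for subs in TAXONOMY.values() for sub in subs] + [NO_EVENT]
```

Because the annotation rules keep only the most representative event per sentence, each sentence can be treated as carrying exactly one label from all_labels().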

Comments

It is designed for Chinese cybersecurity event detection.

Submitted by Bingzhi Xu on Mon, 08/12/2024 - 10:31