The following data set is modelled after the implementers’ test data in 3GPP TS 33.501 “Security architecture and procedures for 5G System” with the same terminology. The data set corresponds to SUCI (Subscription Concealed Identifier) computation in the 5G UE (User Equipment) for IMSI (International Mobile Subscriber Identity) based SUPI (Subscription Permanent Identifier) and ECIES Profile A.
The following data set is modelled after the implementers’ test data in 3GPP TS 33.501 “Security architecture and procedures for 5G System” with the same terminology. The data set corresponds to SUCI (Subscription Concealed Identifier) computation in the 5G UE (User Equipment) for IMSI (International Mobile Subscriber Identity) based SUPI (Subscription Permanent Identifier) and ECIES Profile A, the IMSI consists of MCC|MNC: '274012'.
In the 5G system, the globally unique 5G subscription permanent identifier is called SUPI as defined in 3GPP TS 23.501. For privacy reasons, the SUPI from the 5G devices should not be transferred in clear text, and is instead concealed inside the privacy preserving SUCI. Consequently, the SUPI is privacy protected over-the-air of the 5G radio network by using the SUCI. For SUCIs containing IMSI based SUPI, the UE in essence conceals the MSIN (Mobile Subscriber Identification Number) part of the IMSI. On the 5G operator-side, the SIDF (Subscription Identifier De-concealing Function) of the UDM (Unified Data Management) is responsible for de-concealment of the SUCI and resolves the SUPI from the SUCI based on the protection scheme used to generate the SUCI.
The SUCI protection scheme used in this data set is ECIES Profile A. The size of the scheme-output is a total of 256-bit public key, 64-bit MAC & 40-bit encrypted MSIN. The SUCI scheme-input MSIN is coded as hexadecimal digits using packed BCD coding where the order of digits within an octet is same as the order of MSIN. As the MSINs are odd number of digits, bits 5 to 8 of final octet is coded as ‘1111’.
# Example Python code to load data into Spark DataFrame
df = spark.read.format("csv").option("inferSchema","true").option("header","true").option("sep",",").load(“5g_suci_using_ecies_profile_a_100k.gz”)
Network traffic analysis, i.e. the umbrella of procedures for distilling information from network traffic, represents the enabler for highly-valuable profiling information, other than being the workhorse for several key network management tasks. While it is currently being revolutionized in its nature by the rising share of traffic generated by mobile and hand-held devices, existing design solutions are mainly evaluated on private traffic traces, and only a few public datasets are available, thus clearly limiting repeatability and further advances on the topic.
MIRAGE-2019 is a human-generated dataset for mobile traffic analysis with associated ground-truth, having the goal of advancing the state-of-the-art in mobile app traffic analysis. MIRAGE-2019 takes into consideration the traffic generated by more than 280 experimenters using 40 mobile apps via 3 devices.
A sampled version of the dataset (one app per category) is readily downloadable, whereas the complete version is available on request.
APP LIST reports the details on the apps contained in the two versions of the dataset.
If you are using MIRAGE-2019 human-generated dataset for scientific papers, academic lectures, project reports, or technical documents, please help us increasing its impact by citing the following reference:
Giuseppe Aceto, Domenico Ciuonzo, Antonio Montieri, Valerio Persico and Antonio Pescapè,"MIRAGE: Mobile-app Traffic Capture and Ground-truth Creation",4th IEEE International Conference on Computing, Communications and Security (ICCCS 2019), October 2019, Rome (Italy).