CRAWDAD eurecom/elasticmon5G2019
- Submitted by:
- CRAWDAD Team
- Last updated:
- Wed, 08/28/2019 - 08:00
- DOI:
- 10.15783/c7-s58c-qn61
- Collection:
- CRAWDAD
Abstract
4G and 5G RAN monitoring data collected using the ElasticMon 5G monitoring framework over FlexRAN
last modified : 2019-08-28
nickname : elasticmon5G2019
institution : eurecom
release date : 2019-08-28
date/time of measurement start : 2019-12-11
date/time of measurement end : 2019-12-11
Ten dataset files containing 4G/5G MAC, RRC and PDCP statistics and monitoring data, grouped into two versions of five datasets each: raw statistics and processed monitoring data. Raw datasets are recorded using ElasticMon v0.1, a prototype of a monitoring framework that extends the FlexRAN 5G programmable platform for Software-Defined Radio Access Networks. For details, see: http://mosaic-5g.io/flexran/
Scenario setup: Raw datasets are recorded for one eNB and a single mobile User Equipment (UE) in five different mobility scenarios, following different motion and distance patterns relative to the eNB. All raw data were recorded without Tx power amplification on the RF frontend (0 dBm transmit power), which implies a maximum coverage range of approximately 10 m. Future versions of the datasets will cover multiple-UE monitoring and an eNB with Tx power amplification.
How to use: The contributed raw datasets can be processed and used for training intelligent 4G/5G models. The processed datasets that are also contributed here follow a proposed stepwise procedure, offered as a paradigm. You are therefore advised to customize and adapt the proposed processing steps to match your own needs.
Datasets: ftp://ftp.eurecom.fr/incoming/01-RawDatasets.zip and ftp://ftp.eurecom.fr/incoming/02-PreprocessedDatasets.zip
collection environment : The 10 datasets come in two grouped versions:
01-RawDatasets: raw statistics containing MAC, RRC and PDCP metric values provided by the FlexRAN controller;
02-PreprocessedDatasets: processed monitoring data, obtained by adding timestamps and cleaning out (i) corrupt/inaccurate metric values and (ii) static values.
Each grouped version is composed of five comma-separated files:
-1- moving-away.csv: the UE moves away from the eNB to a maximum distance of 10 meters.
-2- movingcloserfarcloser.csv: the UE moves back and forth relative to the eNB, from a distance of 0 up to approximately 10 meters.
-3- stableshortdistance.csv: the UE stands still at a short distance (approx. 0-1m) from the eNB.
-4- stablemiddistance.csv: the UE stands still at a mid distance (approx. 1-5m) from the eNB.
-5- stablelongdistance.csv: the UE stands still at a long distance (approx. 5-10m) from the eNB.
In what follows, we describe our processing procedure as a paradigm. Prospective users are advised to customize this procedure to match their needs.
Step 1 - Raw datasets: Medium Access Control (MAC), Radio Resource Control (RRC) and Packet Data Convergence Protocol (PDCP) data provided by the FlexRAN controller, recorded for 1 UE in JSON format. Each JSON measurement contains more than 100 metrics. A detailed description of the measurement metrics provided by the FlexRAN controller is available here: http://mosaic-5g.io/apidocs/flexran/#api-Stats.
Step 2 - Processed datasets: Pre-processing gives a proper structure to the raw recordings and reduces the number of metrics per measurement from over 100 to 42. Pre-processing is necessary for a series of reasons:
- Adding a timestamp: Exact dates in raw measurements do not give useful information. It is necessary to add timestamps inside the recorded JSON tree of each measurement. This is needed for computing the time elapsed between consecutive measurements.
- Cleaning out static values: Omitting specific metric fields that do not change over time. Such metrics keep a constant value across measurements regardless of whether the UE is in motion. Therefore, they offer no valuable information for prediction. Note that the number of remaining 'dynamic' metrics after this step drops to 42.
- Adjusting corrupt/inaccurate metric values: Some metrics, such as 'macStats_phr', contain corrupt/inaccurate values due to integer overflow. The problem is addressed based on the type of metric and the number of consecutive corrupt/inaccurate values, by replacing evidently corrupt/inaccurate values with either (a) the median value of their neighboring rows or (b) the mean value over a period of time (e.g., past 100ms) out of a series of neighboring rows (resp., 100 rows). Option (a) was used in cases where consecutive values formed a trend that the value identified as corrupt/inaccurate did not match, while option (b) was preferred for particular types of metrics such as macStats_phr.
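The cleaning steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' actual pipeline: the function names, the neighbor radius, the `is_corrupt` predicate, the metric name `static_metric` and the toy values are all hypothetical. Only `macStats_phr` and the two replacement options (median of neighboring rows, mean over a window of up to 100 preceding rows) come from the description.

```python
from statistics import mean, median


def drop_static_columns(rows):
    """Keep only the metrics whose value changes at least once across rows."""
    if not rows:
        return []
    dynamic = [key for key in rows[0] if len({row[key] for row in rows}) > 1]
    return [{key: row[key] for key in dynamic} for row in rows]


def repair_with_neighbor_median(values, is_corrupt, radius=2):
    """Option (a): replace a corrupt value with the median of its valid neighbors."""
    repaired = list(values)
    for i, value in enumerate(values):
        if is_corrupt(value):
            lo, hi = max(0, i - radius), min(len(values), i + radius + 1)
            neighbors = [values[j] for j in range(lo, hi)
                         if j != i and not is_corrupt(values[j])]
            if neighbors:  # leave the value untouched if no valid neighbor exists
                repaired[i] = median(neighbors)
    return repaired


def repair_with_trailing_mean(values, is_corrupt, window=100):
    """Option (b): replace a corrupt value with the mean of up to `window` preceding valid values."""
    repaired = list(values)
    for i, value in enumerate(values):
        if is_corrupt(value):
            past = [x for x in repaired[max(0, i - window):i] if not is_corrupt(x)]
            if past:
                repaired[i] = mean(past)
    return repaired


# Toy rows (made-up values); "static_metric" is a hypothetical constant field.
rows = [
    {"macStats_phr": 40, "static_metric": 7},
    {"macStats_phr": 38, "static_metric": 7},
]
dynamic_rows = drop_static_columns(rows)

# Hypothetical overflow detector: treat implausibly large values as corrupt.
def looks_overflowed(value):
    return value > 1000

phr = [40, 38, 65535, 36, 34]
phr_median = repair_with_neighbor_median(phr, looks_overflowed)
phr_mean = repair_with_trailing_mean(phr, looks_overflowed)
```

Which option suits a given metric, and how a corrupt value is detected in the first place, depends on the metric; the threshold used here is only a placeholder.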
network configuration : One user using an Android v8.0 (Oreo) Nexus 6P phone connected to an eNB (carrier band 7). 5G network based on FlexRAN v2.0 and OpenAirInterface snap packages as follows: oai-ran rev. 16 openairinterface5g tag 2018.w42 (see https://snapcraft.io/oai-ran) and oai-cn rev. 26 (see https://snapcraft.io/oai-cn).
data collection methodology : (1) Collection environment: Raw datasets are collected using our prototype version of ElasticMon v0.1. ElasticMon is a novel elastic monitoring 5G framework for OAI-RAN (see https://snapcraft.io/oai-ran) and OAI-CN (see https://snapcraft.io/oai-cn) built over the FlexRAN v2.0 programmable SD-RAN platform (see: http://mosaic-5g.io/flexran/). (2) Data collection: A single mobile user engaged in different mobility scenarios, following different motion patterns by moving further away from or closer to the eNB, or by remaining at a fixed distance from the eNB. One measurement was taken every 50ms.
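Given the 50ms figure stated above, one simple sanity check on a recording is to compute the time elapsed between consecutive measurements and flag unusually long gaps. The sketch below assumes timestamps are available as numbers in milliseconds; the actual timestamp representation in the files is not described here, and the function names, tolerance and sample values are hypothetical.

```python
def elapsed_ms(timestamps_ms):
    """Return the time elapsed between each pair of consecutive measurements."""
    return [later - earlier for earlier, later in zip(timestamps_ms, timestamps_ms[1:])]


def find_gaps(timestamps_ms, nominal_ms=50, tolerance_ms=10):
    """Return the indices of measurements that arrive too long after their predecessor."""
    return [i + 1
            for i, delta in enumerate(elapsed_ms(timestamps_ms))
            if delta > nominal_ms + tolerance_ms]


# Toy timestamps (made-up values): the fourth sample arrives 149 ms after the third.
sample = [0, 50, 101, 250, 300]
deltas = elapsed_ms(sample)
late = find_gaps(sample)
```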
sanitization : The datasets contain no sensitive information that could raise privacy concerns. Therefore, the datasets are not sanitized.
Traceset
eurecom/elasticmon5G2019/rawdatasets
- files: 01-RawDatasets.zip
- description:
- measurement purpose: None of the above
eurecom/elasticmon5G2019/rawdatasets Traces
- rawdatasets:
- configuration: Raw datasets are collected using our prototype version of ElasticMon v0.1. ElasticMon is a novel elastic monitoring 5G framework for OAI-RAN (see https://snapcraft.io/oai-ran) and OAI-CN (see https://snapcraft.io/oai-cn) built over the FlexRAN v2.0 programmable SD-RAN platform (see: http://mosaic-5g.io/flexran/)
- format:
MAC, RRC and PDCP data provided by the FlexRAN controller, recorded for 1 UE in JSON format. Each JSON measurement contains more than 100 metrics. A detailed description of the measurement metrics provided by the FlexRAN controller is available here: http://mosaic-5g.io/apidocs/flexran/#api-Stats.
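Because each raw measurement is a JSON tree, a common first step before analysis is to flatten it into one flat record per measurement. The sketch below is one possible way to do that, assuming metric names are formed by joining nested keys with underscores (which would yield names like macStats_phr). The nesting layout of the sample and the key `nested` are hypothetical; the real schema is the one documented in the FlexRAN API reference linked above.

```python
import json


def flatten(node, prefix=""):
    """Flatten nested dicts and lists into one dict with underscore-joined keys."""
    flat = {}
    if isinstance(node, dict):
        for key, value in node.items():
            flat.update(flatten(value, f"{prefix}{key}_"))
    elif isinstance(node, list):
        # List elements are addressed by their position.
        for index, value in enumerate(node):
            flat.update(flatten(value, f"{prefix}{index}_"))
    else:
        # Leaf value: strip the trailing underscore from the accumulated prefix.
        flat[prefix[:-1]] = node
    return flat


# Hypothetical measurement fragment (made-up layout and values).
raw = '{"macStats": {"phr": 40, "nested": [{"x": 1}, {"x": 2}]}}'
record = flatten(json.loads(raw))
```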
eurecom/elasticmon5G2019
Processed monitoring data (derives from traceset 01-RawDatasets)
- files: 02-PreprocessedDatasets.zip
- description: URL: ftp://ftp.eurecom.fr/incoming/02-PreprocessedDatasets.zip
Pre-processing gives a proper structure to the raw recordings and reduces the number of metrics per measurement from over 100 to 42. In brief, the five comma-separated files in the second (processed) traceset are:
-1- moving-away.csv: the UE moves away from the eNB to a maximum distance of 10 meters.
-2- movingcloserfarcloser.csv: the UE moves back and forth relative to the eNB, from a distance of 0 up to approximately 10 meters.
-3- stableshortdistance.csv: the UE stands still at a short distance (approx. 0-1m) from the eNB.
-4- stablemiddistance.csv: the UE stands still at a mid distance (approx. 1-5m) from the eNB.
-5- stablelongdistance.csv: the UE stands still at a long distance (approx. 5-10m) from the eNB.
- measurement purpose: None of the above
- methodology: This traceset derives from the raw monitoring data traceset. Pre-processing gives a proper structure to the raw recordings and reduces the number of metrics per measurement from over 100 to 42. Pre-processing is necessary for a series of reasons:
Adding a timestamp: Exact dates in raw measurements do not give useful information. It is necessary to add timestamps inside the recorded JSON tree of each measurement. This is needed for computing the time elapsed between consecutive measurements.
Cleaning out static values: Omitting specific metric fields that do not change over time. Such metrics keep a constant value across measurements regardless of whether the UE is in motion. Therefore, they offer no valuable information for prediction. Note that the number of remaining 'dynamic' metrics after this step drops to 42.
Adjusting corrupt/inaccurate metric values: Some metrics, such as 'macStats_phr', contain corrupt/inaccurate values due to integer overflow. The problem is addressed based on the type of metric and the number of consecutive corrupt/inaccurate values, by replacing evidently corrupt/inaccurate values with either (a) the median value of their neighboring rows or (b) the mean value over a period of time (e.g., past 100ms) out of a series of neighboring rows (resp., 100 rows). Option (a) was used in cases where consecutive values formed a trend that the value identified as corrupt/inaccurate did not match, while option (b) was preferred for particular types of metrics such as macStats_phr.
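As a starting point for working with the five scenario files listed above, the following sketch reads them straight from a downloaded archive using only the Python standard library. It assumes the CSV files have a header row and are UTF-8 encoded, and it matches members by base name because the directory layout inside the zip is not described here. The function name `load_scenarios` is hypothetical.

```python
import csv
import io
import zipfile

# File names as listed in the traceset description.
SCENARIO_FILES = [
    "moving-away.csv",
    "movingcloserfarcloser.csv",
    "stableshortdistance.csv",
    "stablemiddistance.csv",
    "stablelongdistance.csv",
]


def load_scenarios(zip_source):
    """Return {file name: list of row dicts} for each scenario CSV found in the archive."""
    scenarios = {}
    with zipfile.ZipFile(zip_source) as archive:
        for member in archive.namelist():
            base = member.rsplit("/", 1)[-1]  # ignore any folder prefix inside the zip
            if base in SCENARIO_FILES:
                with archive.open(member) as raw:
                    # Assumption: UTF-8 text with a header row.
                    text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                    scenarios[base] = list(csv.DictReader(text))
    return scenarios
```

A call such as `load_scenarios("02-PreprocessedDatasets.zip")` would then return one list of rows per scenario, with all values as strings that still need converting to numbers.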
eurecom/elasticmon5G2019 Traces
- PreprocessedDatasets:
- configuration: Raw datasets are collected using our prototype version of ElasticMon v0.1.
- format:
MAC, RRC and PDCP data provided by the FlexRAN controller recorded for 1 UE in a JSON format
The files in this directory are a CRAWDAD dataset hosted by IEEE DataPort.
About CRAWDAD: the Community Resource for Archiving Wireless Data At Dartmouth is a data resource for the research community interested in wireless networks and mobile computing.
CRAWDAD was founded at Dartmouth College in 2004, led by Tristan Henderson, David Kotz, and Chris McDonald. CRAWDAD datasets are hosted by IEEE DataPort as of November 2022.
Note: Please use the Data in an ethical and responsible way with the aim of doing no harm to any person or entity for the benefit of society at large. Please respect the privacy of any human subjects whose wireless-network activity is captured by the Data and comply with all applicable laws, including without limitation such applicable laws pertaining to the protection of personal information, security of data, and data breaches. Please do not apply, adapt or develop algorithms for the extraction of the true identity of users and other information of a personal nature, which might constitute personally identifiable information or protected health information under any such applicable laws. Do not publish or otherwise disclose to any other person or entity any information that constitutes personally identifiable information or protected health information under any such applicable laws derived from the Data through manual or automated techniques.
Please acknowledge the source of the Data in any publications or presentations reporting use of this Data.
Citation: Berkay Koksal, Robert Schmidt, Xenofon Vasilakos, Navid Nikaein, eurecom/elasticmon5G2019, https://doi.org/10.15783/c7-s58c-qn61, Date: 20190828
Dataset Files
- 01-RawDatasets.zip (1.26 MB)
- 02-PreprocessedDatasets.zip (841.69 kB)